Institut für Übersetzen und Dolmetschen (IÜD), Universität Heidelberg


Professor Dr. Bogdan Babych, Institut für Übersetzen und Dolmetschen, Universität Heidelberg, Plöck 57a, 69117 Heidelberg, Tel +49 6221 547230, email bogdan.babych@iued.uni-heidelberg.de


Prof. Dr. Bogdan Babych

Professor of Translation Studies, Department of Translation, Communication and Technology

Institute for Translation and Interpreting, Universität Heidelberg


Development of translation technologies and applications

Research and teaching experiments

  1. cgiBLEU: online interface to the BLEU / NIST evaluation toolkit: cgiBLEU
  2. colloc4nlg: experiments on using collocations for Natural Language Generation: colloc4nlg; web; code
  3. Kyiv2020: Kyiv Translation Summer School 2020 at Igor Sikorsky Kyiv Polytechnic Institute: presentation and materials
  4. smallSMT: a prototype SMT system for Lingenio’s FlexNeuroTrans project: web interface; web file interface
  5. Ukrainian corpus and morphological tools:
  6. Processing CORD19 corpus
    • This notebook downloads the freely available CORD19 corpus and reads it into a single file.
    • The notebook is hosted in the GitHub repository of IÜD, Heidelberg University: https://github.com/iued-uni-heidelberg/cord19.
    • CORD19 (COVID-19) is an open-source corpus available from https://www.semanticscholar.org/cord19/download. Documentation is available at https://github.com/allenai/cord19. The original files are in JSON format.
    • The output file is in plain text format; by default, documents are delimited by <doc id="doc1000001"> … </doc> tags. The plain text file is intended for further processing, e.g., generating linguistic annotation with the TreeTagger or the Stanford parser for part-of-speech annotation or dependency / constituency parsing.
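
The conversion described above (many JSON files in, one plain text file with <doc> tags out) could look roughly like the following sketch. It is not the notebook's actual code. The function name `cord19_json_to_text` and its parameters `json_dir`, `out_path` and `start_id` are hypothetical, and the JSON keys (`metadata.title`, `abstract`, `body_text`, each paragraph carrying a `text` field) are an assumption about the layout of the CORD19 parse files that should be checked against the corpus documentation. The files are assumed to be already downloaded and unpacked locally.

```python
import json
from pathlib import Path


def cord19_json_to_text(json_dir, out_path, start_id=1000001):
    """Concatenate JSON parse files into one plain text file.

    Each document is wrapped in <doc id="docN"> ... </doc> tags.
    Returns the number of documents written.
    """
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        # Sort the file names so that document ids are reproducible.
        for json_file in sorted(Path(json_dir).glob("*.json")):
            with open(json_file, encoding="utf-8") as f:
                paper = json.load(f)

            # Assumed schema: "abstract" and "body_text" are lists of
            # paragraph objects, each with a "text" field.
            paragraphs = [p.get("text", "") for p in paper.get("abstract", [])]
            paragraphs += [p.get("text", "") for p in paper.get("body_text", [])]
            title = paper.get("metadata", {}).get("title", "")

            out.write(f'<doc id="doc{start_id + count}">\n')
            if title:
                out.write(title + "\n")
            for par in paragraphs:
                if par.strip():  # skip empty paragraphs
                    out.write(par.strip() + "\n")
            out.write("</doc>\n")
            count += 1
    return count
```

A call such as `cord19_json_to_text("pdf_json", "cord19.txt")` (both paths are placeholders) would write one paragraph per line, a layout that line-oriented tools such as taggers can read after tokenization.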