Institut für Übersetzen und Dolmetschen (IÜD), Universität Heidelberg

Institut für Übersetzen und Dolmetschen, Universität Heidelberg, Plöck 57a, 69117 Heidelberg. Contact Professor Dr. Bogdan Babych, email bogdan.babych@iued.uni-heidelberg.de

View the Project on GitHub bogdanbabych/courses

Institute for Translation and Interpreting, Heidelberg University

Summer Semester 2020-2021

Virtual Seminar Room

Time: 22 June at 14:00 (Kyiv time). Zoom link: https://us02web.zoom.us/j/81027581757 (password: 111)

Course materials

Thanks for attending the class!

Outcome: Google Spreadsheet with comparison results / improvement

Controlled Language specification documents and slides: https://heibox.uni-heidelberg.de/d/ef3cc0abfbf94193be11/

Many thanks to those who have uploaded their worksheets!!!

Instructions for the course
  1. All files are in the shared folder at https://heibox.uni-heidelberg.de/d/b5ce10402b7d4ca48722/ .
  2. In this collaborative project we will test Controlled Language (CL) developed by ourselves and by other people. Please save this document as *-yourInitials-yourTL.docx. At the end, please upload it to the shared project folder on the Google Drive, so that other people can test your CL.
  3. For an overview, you can watch the video from last year, starting at 1:00:00, for 7 min: https://youtu.be/rWwBqZY2F9c?t=3618 .
  4. Download and fill in the worksheet Word file: https://heibox.uni-heidelberg.de/d/6c2f29986c11488c9fd9/ .
    • To do this, click on the “download” button on the right; do not click on the file name.
    • (We aim to cover Stages 01 through 06)
    • Copy the two left columns, which contain the segment numbers and the ‘baseline’ unrestricted source text.
    • Open Google Translate ( https://translate.google.com/ ) and paste the copied text.
    • Generate the Ukrainian translation.
  5. Stage01: Evaluate the translation in the Google Form: https://forms.gle/qVRx1Vf9VCEmWa2S8
    • Approx. time: 10 min.
    • Type your name or initials, etc.
    • Choose Stage01 of the experiment (baseline Google evaluation).
    • In the “For which Target Language…” field, choose ‘uk’.
    • For each sentence, record a score (1 to 7) based on the specified criterion: suitability for post-editing.
    • When finished, click “Submit”, then click “Submit another response” to get ready for Stage02.
  6. Copy the Google Translate output and paste it into the Word worksheet, section Stage01, under the table.
  7. Stage02 of the experiment (approx. time: 30 min).
    • In Google Translate, modify each English sentence on the left using Controlled Language guidelines to improve translation quality, e.g., by making implicit information explicit, replacing ambiguous source words and passive constructions, keeping to one idea per sentence, etc.
    • The resulting sentence need not be ideal, but it should reduce post-editing time and/or effort.
    • It may not be possible to improve some sentences.
    • Copy the resulting Controlled Language sentences into the worksheet (Column 5), saving the file after each sentence.
    • If time allows, note in Column 6 after each sentence which CL rules you applied (you can use the categories listed before the table, or come up with your own categories).
  8. Repeat the evaluation in the Google Form for Stage02:
    • Approx. time: 10 min.
    • Record your name (please use the same name / initials).
    • Choose Stage02 and ‘uk’ as the Target Language.
    • Record your evaluation scores; most of them will be very high.
    • Click “Submit”, then again “Submit another response”.
    • Go back to Google Translate and copy the resulting Ukrainian output into the worksheet, in the Stage02 section after the table.
  9. Stage03 and Stage04: Repeat the baseline (without CL) and CL-pre-edited evaluations for a ‘surprise’ MT system (Bing MT: https://www.bing.com/translator ).
    • Approx. time: 10 min for each stage.
    • The aim is to find out whether the same CL that we developed for Google also improves the output of another MT system (Bing).
    • Bing has a 1,000-character limit, so you will need to do the translation/evaluation in two batches, e.g., Segments 1-10, then Segments 11-17.
    • In Stage03, paste the ‘baseline’ text from Columns 1 and 2 of the worksheet (Segments 1-10, then Segments 11-17) and evaluate the translation quality.
    • Each time, copy the Bing Ukrainian output into the Stage03 section at the end of your worksheet.
    • In Stage04 we will try to find out whether the same CL (developed for Google) still works for Bing. We expect that not all rules will work and that the quality may be lower, but some changes should still work for different MT systems.
    • So, in Stage04, DO NOT CHANGE the CL; just copy the CL that you developed in Stage02 from Columns 4 and 5 of your table.
    • Paste the text into Bing (Segments 1-10, then Segments 11-17), then record the evaluation scores.
    • Each time, copy the Bing Ukrainian output into the Stage04 section at the end of your worksheet.
  10. If time allows, in Stage05 and (optionally) Stage06 we will try to find out whether the CL works across different target languages. You will download worksheets from other students who developed the CL for translating into Italian and Polish. We will test whether it also improves translation into Ukrainian.
    • Stage05: download a worksheet with CL produced for another Target Language (Italian, Polish) from: https://heibox.uni-heidelberg.de/d/d4648849cbd24c8889ec/ .
    • Will these CLs work for improving translation into Ukrainian? Translate with Google and evaluate the MT quality in the Google Form, indicating which file you used (it or pl).
    • Each time, copy the Ukrainian MT output into the Stage05 section of your worksheet.
    • Stage06: [optional] download another worksheet for another Target Language (Polish / Italian; German and Greek worksheets are also available in a sub-folder, but they have 12 segments each) from the same link https://heibox.uni-heidelberg.de/d/d4648849cbd24c8889ec/ . Repeat the translation (Google) and evaluation, and copy the MT output into the Stage06 section.
  11. The evaluation scores are recorded in a spreadsheet: https://docs.google.com/spreadsheets/d/1jVxMpbs5t3EDR6FEUgzYRD1Ws4ByGhbAgym1rJayb9A/edit?usp=sharing . We will look at the analysis of this data in class.
  12. Stage07 and Stage08 (homework): would you be able to improve the Systran Ukrainian translation (Systran: https://translate.systran.net/?lang=en ; Stage07: baseline; Stage08: CL)? Record your scores and results; you will be able to see the results in the same Google Spreadsheet.
    • You can generate graphs for your presentation in the spreadsheet.
    • Analyse and interpret your results, using examples from your sentences.
    • Prepare a short PowerPoint presentation for your colleagues about the effectiveness of the CL for an MT-supported localisation workflow.
  13. When finished, please rename your worksheet file, using your initials or name and your target language.
  14. Upload your worksheet to HeiBox. Upload link: https://heibox.uni-heidelberg.de/u/d/dc0b51baeb1c4f668e2a/ Download link (check if your file is there): https://heibox.uni-heidelberg.de/d/ce325481bf9143bb8e6f/
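
Step 9 asks you to split the text into batches because of the character limit in Bing. If you prefer not to count characters by hand, the grouping could be sketched as below. This is only an illustration: the function name `batch_segments`, the default limit of 1000 characters and the newline separator are assumptions, and the way Bing actually counts characters may differ.

```python
def batch_segments(segments, limit=1000, sep="\n"):
    """Greedily group consecutive segments so that each batch,
    joined with `sep`, stays within `limit` characters.

    Assumption: a single segment longer than `limit` is kept
    in a batch of its own (it would still have to be split by hand).
    """
    batches, current, length = [], [], 0
    for seg in segments:
        # Cost of adding this segment: its text plus a separator
        # if the current batch already contains something.
        extra = len(seg) + (len(sep) if current else 0)
        if current and length + extra > limit:
            # The segment does not fit: close the batch, start a new one.
            batches.append(current)
            current, length = [], 0
            extra = len(seg)
        current.append(seg)
        length += extra
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Placeholder segments for illustration only (not the worksheet text).
    demo = ["x" * 400, "y" * 400, "z" * 400]
    for i, batch in enumerate(batch_segments(demo), start=1):
        print(f"Batch {i}: {len(batch)} segment(s), "
              f"{len(chr(10).join(batch))} characters")
```

Paste each batch into the MT system in turn and copy the output into the worksheet before moving on to the next batch, so the segment order is preserved.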

Once again, a shorter overview of the stages: please evaluate the MT output and, each time, save the Target Text in this worksheet as Section Stage01 … Stage06.
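
When you interpret the evaluation scores, one simple starting point is to compare the mean score of a baseline stage with the mean score of the matching CL stage. The sketch below assumes you have copied your scores out of the spreadsheet as (stage, score) pairs. The function name `mean_score_by_stage` and the numbers are hypothetical and for illustration only; they are not real results, and the actual spreadsheet layout may differ.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_stage(rows):
    """Return {stage: mean score} for an iterable of (stage, score) pairs."""
    by_stage = defaultdict(list)
    for stage, score in rows:
        by_stage[stage].append(score)
    return {stage: mean(scores) for stage, scores in by_stage.items()}


# Hypothetical scores on the 1-7 scale, for illustration only.
rows = [
    ("Stage01", 3), ("Stage01", 4), ("Stage01", 5),
    ("Stage02", 5), ("Stage02", 6), ("Stage02", 7),
]

means = mean_score_by_stage(rows)
# Positive value: the CL-pre-edited text scored higher than the baseline.
improvement = means["Stage02"] - means["Stage01"]
print(means, improvement)
```

The same comparison can be repeated for Stage03 against Stage04 and for Stage07 against Stage08. A mean difference is only a rough summary, so it is worth backing it up with examples from individual sentences.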