Apertium

Examples of minimal files where an Apertium language pair messes up (X)HTML formatting

Sometimes an Apertium language pair takes a valid HTML/XHTML source file and produces an invalid HTML/XHTML target file, regardless of translation quality. This can usually be traced to incorrect handling of superblanks in structural transfer rules. The task:

(1) Select a language pair.
(2) Install Apertium locally from the Subversion repository, install the language pair, and make sure that it works.
(3) Download a series of HTML/XHTML files for testing, and use an HTML/XHTML validator to make sure they are valid.
(4) Translate the valid files with the language pair.
(5) Check whether the translated files are also valid HTML/XHTML, and select those that are not.
(6) For each invalid file, find the first source of invalidity and study it, then strip the source file down until you have a small (valid!) source file with some text around the smallest possible example of the problematic tags. Save each such file and describe the error.
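As a rough aid for steps (5) and (6), the Python sketch below compares the sequence of opening and closing tags in a source file with the sequence in its translation and reports the first position where they differ. It is only an illustration: the function names (tag_events, first_tag_difference, is_well_formed_xhtml) are made up for this example and are not part of Apertium. The well-formedness check uses a plain XML parser, which is weaker than validation against a DTD and will reject named entities such as &nbsp; that are not declared in the file, so it does not replace a real HTML/XHTML validator.

```python
from html.parser import HTMLParser
from itertools import zip_longest
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Records every opening and closing tag in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))


def tag_events(html_text):
    """Return the list of ("open"|"close", tag) events found in html_text."""
    collector = TagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.events


def first_tag_difference(source_html, target_html):
    """Return (index, source_event, target_event) for the first mismatch,
    or None if both files contain the same tag sequence."""
    pairs = zip_longest(tag_events(source_html), tag_events(target_html))
    for index, (src, tgt) in enumerate(pairs):
        if src != tgt:
            return index, src, tgt
    return None


def is_well_formed_xhtml(xhtml_text):
    """Hypothetical helper: True if the text parses as XML.
    This checks well-formedness only, not validity."""
    try:
        ET.fromstring(xhtml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    # Invented example strings; a real run would read the source file
    # and the file produced by the language pair.
    source = "<p>A <b>big</b> house</p>"
    translated = "<p>Una casa <b>grande</p></b>"
    print(is_well_formed_xhtml(source))
    print(is_well_formed_xhtml(translated))
    print(first_tag_difference(source, translated))
```

The first mismatch points at the region of the source file worth keeping when stripping it down to a minimal example. Note that a translation can reorder tags legitimately, so a difference in tag sequence is a hint to inspect, not proof of an error.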

For further information and guidance on this task, you are encouraged to come to our IRC channel (http://wiki.apertium.org/wiki/IRC).

Task tags

  • language_data

Students who completed this task

Darkgaia

Task type

  • Quality Assurance

2015