Apertium

Examples of minimal files where Apertium-en-ca messes up (X)HTML formatting

Sometimes Apertium-en-ca takes a valid HTML/XHTML source file but delivers an invalid HTML/XHTML target file, regardless of translation quality. This can usually be blamed on incorrect handling of superblanks in structural transfer rules. The task:

(1) Select a language pair.
(2) Install Apertium locally from the Subversion repository; install apertium-en-ca; make sure that it works.
(3) Download a series of HTML/XHTML files for testing purposes. Make sure they are valid using an HTML/XHTML validator.
(4) Translate the valid files with the language pair.
(5) Check whether the translated files are also valid HTML/XHTML files; select those that aren't.
(6) For each invalid file, find the first source of non-validity and study it. Strip the source file down until you have a small (valid!) source file with some text around the minimum possible example of problematic tags. Save each such file and describe the error.
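As a rough aid for steps (5) and (6), the following Python sketch reports the first place in a file where opening and closing tags stop matching. It is not part of Apertium and does not replace a real HTML/XHTML validator: the names `TagBalanceChecker` and `first_tag_error` are made up for this illustration, and the check is stricter than HTML proper, since it assumes XHTML-style markup in which every non-void element is explicitly closed.

```python
from html.parser import HTMLParser

# HTML elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: records the first mismatched closing tag."""

    def __init__(self):
        super().__init__()
        self.stack = []          # currently open (tag, line, column) entries
        self.first_error = None  # description of the first mismatch, if any

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            line, col = self.getpos()
            self.stack.append((tag, line, col))

    def handle_endtag(self, tag):
        # Ignore void elements, and everything after the first error.
        if tag in VOID_TAGS or self.first_error is not None:
            return
        line, col = self.getpos()
        if not self.stack:
            self.first_error = f"line {line}, column {col}: unexpected </{tag}>"
        elif self.stack[-1][0] != tag:
            open_tag = self.stack[-1][0]
            self.first_error = (
                f"line {line}, column {col}: </{tag}> closes <{open_tag}>"
            )
        else:
            self.stack.pop()


def first_tag_error(markup):
    """Return a description of the first tag-balance error, or None."""
    checker = TagBalanceChecker()
    checker.feed(markup)
    checker.close()
    if checker.first_error is None and checker.stack:
        tag, line, col = checker.stack[-1]
        return f"line {line}, column {col}: <{tag}> is never closed"
    return checker.first_error
```

One possible way to use it: run `first_tag_error` on both the source file and the translated file (for example the output of `apertium -f html en-ca`, assuming that is how the files were translated). A source that returns `None` while its translation returns an error message is a candidate for stripping down in step (6), and the reported line gives a starting point for finding the problematic tags.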

For further information and guidance on this task, you are encouraged to come to our IRC channel (http://wiki.apertium.org/wiki/IRC).

Task tags

  • english
  • catalan
  • html
  • superblanks
  • format

Students who completed this task

Darkgaia

Task type

  • Code

2015