
Create a corpus of Medumba

Create a corpus of Medumba by looking for texts on the internet and copying them into a file in the following format:

# URL: http://url-to-site.com/page.html
This is sentence 1.
This is sentence 2.
¶
This is the first sentence of paragraph 2.
This is the 2nd sentence of paragraph 2.

That is, put one sentence per line and separate paragraphs with ¶. Then fix any encoding errors: make sure to use combining diacritical marks and the correct Unicode characters. The corpus should contain at least 10,000 tokens.
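The checks described above could be partly automated. The following is a minimal Python sketch, not an official tool for this task. It assumes that a "token" is any whitespace-separated string, that the `# URL:` header lines and the ¶ separator lines do not count as tokens, and that "combining diacritical marks" means Unicode NFD normalization. The function names are hypothetical.

```python
import unicodedata

MIN_TOKENS = 10_000  # threshold stated in the task description


def normalize_line(line: str) -> str:
    """Strip surrounding whitespace and decompose precomposed letters
    into base letter + combining marks (Unicode NFD)."""
    return unicodedata.normalize("NFD", line.strip())


def count_tokens(lines) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'token'),
    skipping blank lines, '# URL:' headers and paragraph separator lines."""
    total = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("# URL:") or line == "¶":
            continue
        total += len(line.split())
    return total


def check_corpus(text: str):
    """Return the NFD-normalized text, its token count, and whether
    the count reaches MIN_TOKENS."""
    lines = [normalize_line(raw) for raw in text.splitlines()]
    n_tokens = count_tokens(lines)
    return "\n".join(lines), n_tokens, n_tokens >= MIN_TOKENS
```

To use it, read the corpus file as UTF-8, pass its contents to `check_corpus`, and write the normalized text back out. Normalization only handles the diacritics; wrong characters that merely look similar to the intended ones still have to be corrected by hand.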

Task tags

  • corpus
  • medumba

Students who completed this task

Ngadou Sylvestre

Task type

  • Outreach / Research