Wikimedia

In Pywikibot's download_dump.py file, use "response.iter_content"

Pywikibot is a Python-based framework for writing bots for MediaWiki.

Thanks to work in Google Code-in, Pywikibot now has a script called download_dump.py. It downloads a Wikimedia database dump from http://dumps.wikimedia.org/, and places the dump in a predictable directory for semi-automated use by other scripts and tests.

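As @zhuyifei1999 wrote in a review comment (https://gerrit.wikimedia.org/r/#/c/399179/14/scripts/maintenance/download_dump.py@84), the script should read the download through response.iter_content instead of response.raw. It should also pass stream=True when fetching the content, so that the response body is read in chunks rather than loaded into memory all at once.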
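The requested pattern can be sketched roughly as follows. This is only an illustration that uses the requests library directly, not the actual Pywikibot patch. The function names write_chunks and download, the chunk_size default, and the way the URL and file path are passed in are assumptions made for this example.

```python
def write_chunks(response, file_path, chunk_size=8192):
    """Write a streamed response body to file_path, one chunk at a time.

    `response` is any object with a requests-style iter_content() method.
    Returns the number of bytes written. (Hypothetical helper.)
    """
    written = 0
    with open(file_path, 'wb') as out:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty chunks
                out.write(chunk)
                written += len(chunk)
    return written


def download(url, file_path, chunk_size=8192):
    """Fetch url with stream=True and save it to file_path.

    Hypothetical wrapper for illustration; the real script may fetch
    the dump through Pywikibot's own HTTP layer instead.
    """
    import requests  # imported here so write_chunks works without requests

    # stream=True defers downloading the body until it is iterated over.
    response = requests.get(url, stream=True)
    try:
        response.raise_for_status()
        return write_chunks(response, file_path, chunk_size)
    finally:
        response.close()
```

Compared with copying from response.raw, iter_content also decodes any gzip or deflate transfer encoding, so the file on disk matches the content the server intended to deliver.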

Reference: https://github.com/wikimedia/pywikibot/blob/master/pywikibot/page.py#L2686-L2691

You are expected to provide a patch in Wikimedia Gerrit. See https://www.mediawiki.org/wiki/Gerrit/Tutorial for how to set up Git and Gerrit.

Task tags

  • python

Students who completed this task

Rafid Aslam

Task type

  • Code

2017