Wikimedia

Pywikibot: Create a subpages page filter

The Pywikibot pagegenerators module selects the pages to be processed, and it also provides several filters that reduce the number of pages to be processed.

Each MediaWiki namespace may be configured so that '/' in a page title denotes a subpage, like a directory tree. Users may wish to process only the top-level pages, and not the subpages under them. See https://phabricator.wikimedia.org/T120587 for an example use case.

The MediaWiki API does not provide a generator that automatically excludes subpages, so this type of filtering needs to be performed on the client, i.e. in Pywikibot, after fetching a larger set of pages. Pywikibot's APISite.namespaces holds namespace metadata, including whether each namespace has subpages enabled.
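The client-side filtering described above could be sketched roughly as follows. This is only an illustration, not the Pywikibot implementation: pages are modelled as plain (namespace id, title) tuples instead of Page objects, and the names `is_subpage`, `filter_subpages` and `subpages_enabled` are hypothetical. In a real filter the per-namespace flag would be taken from the namespace metadata rather than from a hand-written dict.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical stand-in for a Page object:
# (namespace id, title without the namespace prefix).
PageRef = Tuple[int, str]


def is_subpage(title: str, namespace_has_subpages: bool) -> bool:
    """Return True if '/' in the title marks a subpage in this namespace."""
    # A '/' only means "subpage" when the namespace has subpages enabled.
    return namespace_has_subpages and "/" in title


def filter_subpages(
    pages: Iterable[PageRef], subpages_enabled: Dict[int, bool]
) -> Iterator[PageRef]:
    """Yield only top-level pages, dropping subpages."""
    for ns_id, title in pages:
        # Namespaces missing from the mapping are assumed to have no subpages.
        if not is_subpage(title, subpages_enabled.get(ns_id, False)):
            yield ns_id, title


# Assumed configuration for illustration: no subpages in namespace 0,
# subpages enabled in namespaces 2 and 4.
subpages_enabled = {0: False, 2: True, 4: True}
pages = [
    (0, "AC/DC"),                # kept: '/' is part of the title here
    (2, "Example"),              # kept: top-level page
    (2, "Example/sandbox"),      # dropped: subpage
    (4, "Help desk/Archive 1"),  # dropped: subpage
]
kept = list(filter_subpages(pages, subpages_enabled))
```

Writing the filter as a generator lets it be chained after any other page generator without loading the whole page set into memory first.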

Pywikibot is a Python-based framework for writing bots for MediaWiki. See https://www.mediawiki.org/wiki/Manual:Pywikibot for more information, and https://www.mediawiki.org/wiki/User:John_Vandenberg/GCI_walk-through for a short introduction to using Pywikibot.

Patches can be submitted via Gerrit (you need a MediaWiki.org account). See https://www.mediawiki.org/wiki/Manual:Pywikibot/Gerrit.

After you have successfully claimed this task on this site, please use the task in Phabricator for communication instead, so that more Pywikibot developers can be reached. General development questions can be asked on the Pywikibot mailing list at https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l and in the #pywikibot IRC channel (see https://www.mediawiki.org/wiki/MediaWiki_on_IRC ).

Task tags

  • python
  • mediawiki
  • generators
  • pywikibot
  • filters

Students who completed this task

Geoffrey Mon

Task type

  • Code

2015