GCI Wiki Study / Create a query to get statistics about Wikipedia: Biggest articles created by anonymous

As one of the many editors of the English Wikipedia, I have a number of questions about wiki statistics, and I am turning them into a formal study. You can help me by answering these questions!

Basic to intermediate SQL skills are needed to perform these tasks, depending on the indicated difficulty.

The goal is to have, for each question, an SQL query and its result in the form of a table. Finally, the data needs to be presented visually as a graph. See the linked page below for detailed instructions.

Biggest articles created by anonymous

Wikipedia contributors can create accounts so that their edits are associated with their own username. If they are not logged in, their edits are credited to their IP address, as anonymous edits.

Question: We want to get the 100 largest articles (by size) created by anonymous users.

Tip: Use the revision table and filter on an empty rev_parent_id (NULL, or 0 depending on the schema version) to find page creations. Use a "not in" SQL operator to exclude registered users by checking whether they exist in the user table. If you run into a query timeout (the query takes too long to compute), you can restrict the calculation to revisions from 2018 onward (rev_timestamp >= 20180101000000).
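The tip above can be tried out on a small scale before running anything against the real database. The following is only a sketch, not a reference answer: it builds toy versions of the page, revision and user tables in an in-memory SQLite database and runs one possible form of the query. Several things here are assumptions rather than part of the task: "size" is taken to mean page_len (the current page length), "article" is taken to mean page_namespace = 0, the creator's name is read from rev_user_text (as in the 2018-era revision table), the date filter uses rev_timestamp (the revision table's timestamp column), and the helper names build_toy_db and top_anonymous_creations are made up for illustration. The real replica runs on a different database engine, so the query may need adjusting there.

```python
import sqlite3

# Toy stand-ins for the MediaWiki tables; only the columns the query needs.
SCHEMA = """
CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_name TEXT NOT NULL);
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER,
                   page_title TEXT, page_len INTEGER);
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER,
                       rev_parent_id INTEGER, rev_user_text TEXT,
                       rev_timestamp TEXT);
"""

# One possible query following the tip (assumptions are listed in the text above).
QUERY = """
SELECT p.page_title, p.page_len, r.rev_user_text
FROM revision AS r
JOIN page AS p ON p.page_id = r.rev_page
WHERE (r.rev_parent_id IS NULL OR r.rev_parent_id = 0)   -- page creations only
  AND p.page_namespace = 0                               -- assumed: articles only
  AND r.rev_user_text NOT IN (SELECT user_name FROM user) -- not a registered user
  AND r.rev_timestamp >= '20180101000000'                -- optional 2018 cut-off
ORDER BY p.page_len DESC
LIMIT ?
"""


def build_toy_db():
    """Create an in-memory database with a handful of made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO user VALUES (1, 'Alice')")
    conn.executemany("INSERT INTO page VALUES (?, ?, ?, ?)", [
        (1, 0, "Big_anon_page", 5000),
        (2, 0, "Small_anon_page", 1200),
        (3, 0, "Registered_page", 9000),   # created by a registered user
        (4, 1, "Talk_page", 7000),         # not in the article namespace
        (5, 0, "Old_anon_page", 8000),     # created before 2018
    ])
    conn.executemany("INSERT INTO revision VALUES (?, ?, ?, ?, ?)", [
        (10, 1, 0, "192.0.2.1", "20180301120000"),
        (11, 1, 10, "Alice", "20180302120000"),  # later edit, not a creation
        (12, 2, 0, "198.51.100.7", "20180510080000"),
        (13, 3, 0, "Alice", "20180115000000"),
        (14, 4, 0, "192.0.2.1", "20180201000000"),
        (15, 5, None, "203.0.113.9", "20170601000000"),
    ])
    return conn


def top_anonymous_creations(conn, limit=100):
    """Return (title, size, creator) rows, largest first, for anonymous creations."""
    return conn.execute(QUERY, (limit,)).fetchall()


if __name__ == "__main__":
    for title, size, creator in top_anonymous_creations(build_toy_db()):
        print(title, size, creator)
```

On the toy data, only the two pages created anonymously in 2018 in the article namespace are returned. If the size at creation time is wanted instead of the current size, the ordering column would have to change accordingly.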

Difficulty: average

Task tags

  • gci wiki survey
  • statistics
  • article creation
  • sql

Task type

  • Code
  • Outreach / Research