Google Summer of Code

For the full list of issues, see our GitHub repositories /openZIM and /Kiwix

New library catalogue UI for Kiwix-serve

Skill level: MEDIUM

Kiwix-serve is the Web frontend of the offline content catalogue, and its code is part of kiwix-tools. Now that users have thousands of ZIM files to choose from, the catalogue page is very heavy and does not help users find what they need. This project is about building a better presentation and search experience so that users can quickly and easily find what they are looking for.

Improvements to be made:

  • Discuss with the team the features/approaches of the new page;
  • Design a new UI mockup with filters;
  • Prepare the new page HTML with CSS;
  • Prepare the JavaScript to dynamically create/apply filters to the page;
  • Allow for widgets.
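
As a rough illustration of the filtering step, the sketch below parses an OPDS-style Atom feed and filters its entries by language and by a title query. It is written in Python only to keep the example short; the page itself would do this in JavaScript. The sample feed, the field names and the helper names (parse_entries, filter_entries) are assumptions made for this example, not the actual Kiwix-serve catalogue format.

```python
import xml.etree.ElementTree as ET

# Atom namespace used by OPDS feeds.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical, heavily simplified feed; real catalogue entries carry more fields.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Wikipedia (English)</title><language>eng</language></entry>
  <entry><title>Wikipedia (French)</title><language>fra</language></entry>
  <entry><title>Wiktionary (English)</title><language>eng</language></entry>
</feed>"""


def parse_entries(feed_xml):
    """Turn each <entry> of the feed into a small dict."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.iter(ATOM + "entry"):
        entries.append({
            "title": entry.findtext(ATOM + "title", default=""),
            "language": entry.findtext(ATOM + "language", default=""),
        })
    return entries


def filter_entries(entries, language=None, query=None):
    """Keep entries matching the language code and the case-insensitive title query."""
    result = []
    for e in entries:
        if language and e["language"] != language:
            continue
        if query and query.lower() not in e["title"].lower():
            continue
        result.append(e)
    return result


if __name__ == "__main__":
    for e in filter_entries(parse_entries(SAMPLE_FEED), language="eng"):
        print(e["title"])
```

Each filter is optional, so the same function could back a language dropdown, a search box, or both at once.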

Skills needed:

  • Good understanding of HTTP / REST API (OPDS format)
  • Good JavaScript / jQuery
  • Good HTML/CSS UI design
  • Good understanding of git and feature-branch-based workflow.


Improve WP1 selection tool

Skill level: MEDIUM

We have a working Wikipedia article assessment solution called WP1 (see the WP1 repo). This solution is currently being extended to allow users to make selections, such as Wikimed, the medical encyclopedia, based on the WP1 assessments and other key values.

Improvements to be made:

Skills needed:

  • Minimal HTML/JavaScript understanding
  • Good understanding of HTTP / REST API (OPDS format)
  • Good Python3
  • Good understanding of git and feature-branch-based workflow
  • Notions of software architecture.

Python scraper for Khan Academy

Skill level: MEDIUM

All ZIM files we produce are made using scrapers, the best known being mwoffliner for MediaWiki websites. Other sources rely on custom scrapers written over the years, mostly in Python.
Khan Academy is a highly valuable online learning resource that we want to offer as a ZIM file.

The project goal is to create a Zimfarm-integrated scraper that produces high-quality ZIM files for various Khan Academy languages.

Improvements to be made:

  • Document existing offline packages of Khan Academy with their strengths and limits.
  • Identify a technical strategy towards a scraper for Khan Academy supporting multiple languages (English, French, Arabic, Spanish and others).
  • Develop a prototype of the scraper using the discussed strategy.
  • Integrate the scraper with Zimfarm.
  • Receive testers’ input.
  • Improve scraper for usability and maintainability.

Skills needed:

  • Familiarity with Python, Docker, GNU/Linux, openZIM scrapers and GitHub workflow.
  • Taste for maintainable code
  • Sense of UI/UX.

Improve Kiwix Android library

Skill level: MEDIUM

The Kiwix Android library is the entry point for managing the local content library as well as for downloading new content. It would benefit from a few improvements.

Improvements to be made:

  • Manage multiple flavours of the same content under one item
  • Add new filters to better find content
  • Improve/complete the JNI bindings to libkiwix
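
To make the first item more concrete, here is a minimal sketch of grouping several flavours of the same content under one item. It is written in Python for brevity, whereas the app itself would do this in Kotlin. The "name" and "flavour" keys, the sample values and the group_by_content helper are assumptions for illustration, not the app's actual data model.

```python
from collections import defaultdict


def group_by_content(books):
    """Map each content name to the sorted list of its available flavours."""
    groups = defaultdict(list)
    for book in books:
        groups[book["name"]].append(book["flavour"])
    return {name: sorted(flavours) for name, flavours in groups.items()}


if __name__ == "__main__":
    # Hypothetical catalogue entries: two contents, three files.
    books = [
        {"name": "wikipedia_en_all", "flavour": "maxi"},
        {"name": "wikipedia_en_all", "flavour": "mini"},
        {"name": "wiktionary_fr_all", "flavour": "maxi"},
    ]
    print(group_by_content(books))
```

The library screen could then show one row per content name and let the user pick a flavour inside it.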

Skills needed:

  • Good Kotlin development skills
  • Some C++ understanding
  • Good understanding of git and feature-branch-based workflow
  • Notions of software architecture
  • Good Android development understanding

Improve Kiwix fulltext / Suggestion solution

Skill level: HARD

The Kiwix fulltext/suggestion system is an efficient search engine that works with search indexes embedded in the ZIM files themselves (see the Kiwix-lib repo). Both indexing and pattern search are efficient, thanks to the quality of the Xapian library. Over the years, however, a few edge cases have kept annoying users and developers.

Improvements to be made:

  • Improve fulltext search unit testing in libzim
  • Filter out duplicates (because of redirects) in suggestions
  • Fix a few problems around stopwords
  • Re-architect the libkiwix search result retrieval (two methods)
  • Build dynamic search results/snippet retrieval in Kiwix Serve
  • Improve suggestions retrieval in Kiwix Serve
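
The duplicate-filtering item can be illustrated with a small sketch: each suggested title is resolved to its final redirect target, and a target is kept only the first time it appears. The real work would be done in C++ inside libzim/libkiwix; the Python below, the redirect map and the helper names (resolve, dedupe_suggestions) are assumptions for illustration only.

```python
def resolve(title, redirects):
    """Follow redirects until a non-redirect title is reached (guards against loops)."""
    seen = set()
    while title in redirects and title not in seen:
        seen.add(title)
        title = redirects[title]
    return title


def dedupe_suggestions(suggestions, redirects):
    """Return redirect targets in their original order, each one only once."""
    seen_targets = set()
    unique = []
    for title in suggestions:
        target = resolve(title, redirects)
        if target not in seen_targets:
            seen_targets.add(target)
            unique.append(target)
    return unique


if __name__ == "__main__":
    # Hypothetical redirect table: source title -> target title.
    redirects = {"USA": "United States", "U.S.": "United States"}
    print(dedupe_suggestions(["USA", "United States", "U.S.", "Utah"], redirects))
```

Whether the suggestion list should display the redirect title the user typed or the target title is a design choice; this sketch shows the target.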

Skills needed:

  • High C++ skills and good compilation understanding
  • Good HTTP / REST understanding
  • Minimal JavaScript/jQuery knowledge
  • Good ability to discuss technical tickets
  • Good understanding of git and feature-branch-based workflow
  • Notions of software architecture.

Want to join?

Then think hard about what you want to do, and go to the Google Summer of Code website between March 29 and April 13, 2021, to submit your proposal!

After all proposals have been reviewed, student projects will be announced on May 17, 2021.

How to apply

Please read our guide to Writing your Google Summer of Code application

Do you have questions?

Then come and join us on our Slack channel!