Similar pages for Wikipedia

Wikipedia is one of the most widely used websites globally. We built a simple extension that displays similar pages at the top of every Wikipedia page!

The Unknown Perils of Mining Wikipedia

If a machine is to learn about humans from Wikipedia, it must experience the corpus as a human sees it and ignore the overwhelming mass of robot-generated pages that no human ever reads. We provide a cleaned corpus, along with a Wikipedia recommendation API derived from it.