We have many different ways of delivering the Lateral API to clients who would like to install it in their own environment. One of those is as an Azure VHD for deployment to Azure VMs. In this post I will cover how to create a VHD that is fully compatible with Azure from an Ubuntu Cloud Image base.
A technique we use to visualise how Lateral recommendations would look and work on a website is to create a Chrome extension that inserts the recommendations at load time. This is useful because: no access is required to the website’s source files; the extension shares assets with the page, so matching styling is easy; it allows […]
Give me five is an open source Chrome extension that allows you to recommend the content you push to Lateral based on the content of the page you’re currently visiting. It’s the same code base that the NewsBot Chrome extension is built upon. The screencast shows the extension in action: You can find the source code […]
We recently had to migrate our multiple PostgreSQL databases between cloud providers. We wanted to keep downtime to an absolute minimum. This is what we did using Londiste.
Previously we’ve written about how machines can learn meaning. One of the exciting opportunities of this approach is that it also means they can learn new languages very quickly. All you need is enough text data. Wikipedia offers a great starting point and partnering with content providers enables us to quickly gather additional data. We […]
Today we are pleased to announce the release of our Article Extractor API! When recommending content it’s important to ensure you are only recommending for the relevant text of an article. We have often faced this challenge with online articles and blogs. We’d want to fetch a URL but just extract the main body of […]
For the last few months, we’ve been doing occasional work on an approximate nearest neighbours (ANN) vector search tool, written in Python. It’s still not finished and there are many rough edges, but it comes with a working DynamoDB adaptor and hence operates out-of-memory, one of our main requirements. On the downside, it isn’t as fast […]
The arXiv is a repository of over 1 million preprints in physics, mathematics and computer science. It is truly open access, and the preprints are an excellent dataset for testing out all sorts of language modelling / machine learning prototypes.
We recently wrote about what our NewsBot Chrome extension does. Today I’m going to add to that and explain how it works behind the scenes. When building this project, we approached it as if we were a user of the Lateral API to see what we could build.