Breaking documents into “chunks”, such as sections and subsections, is easy for humans but surprisingly hard for computers. In this post we explain why that is, why it’s a valuable problem to solve, and introduce our new solution.
Previously we’ve written about how machines can learn meaning. One of the exciting consequences of this approach is that machines can also learn new languages very quickly. We have recently started working on supporting new languages, and thought we would share some initial impressions here.
Computers consist of on/off switches and process meaningless symbols. How, then, can we hope that computers might understand the meaning of words, products, actions and documents? If most of us consider machine learning to be magic, it is because we don’t yet have an answer to this question. Here, I’ll provide an answer in the context of machines learning the meaning of words. But as we’ll see, the approach is the same everywhere.
Having recently released our TED talks demo, we felt another interesting application would be exploring the thoughts of a single person. No one fits that description better than Maria Popova, whose Brain Pickings site hosts an excellent collection of ideas. With topics ranging from music to philosophy, it makes a fine testbed for exploring how our technology represents thoughts.