It’s easy to believe that machine learning is hard. An arcane craft known only to a select few academics.
After all, you’re teaching machines that work in ones and zeros to reach their own conclusions about the world. You’re teaching them how to think! However, it’s not nearly as hard as the complex and formula-laden literature would have you believe.
Like all of the best frameworks we have for understanding our world (Newton’s Laws of Motion, Jobs to be Done, Supply & Demand), the best ideas and concepts in machine learning are simple. The majority of literature on machine learning, however, is riddled with complex notation, formulae and superfluous language. It puts walls up around fundamentally simple ideas.
Let’s take a practical example. Say we wanted to include a “you might also like” section at the bottom of this post. How would we go about that?
To clarify the idea, let’s look at a naive solution:
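The original code isn’t reproduced here, but as a rough illustration, a naive approach might simply rank posts by how many title words they share with the target post. Here’s a minimal Ruby sketch along those lines (the Post struct and helper names are hypothetical, not taken from the original code):

```ruby
# A hypothetical post with a title and a body.
Post = Struct.new(:title, :body)

# The distinct title words two posts have in common.
def shared_title_words(a, b)
  a.title.downcase.split & b.title.downcase.split
end

# Rank every other post by how many title words it shares with the target.
def naive_similar_posts(target, posts, n = 10)
  posts.reject { |p| p == target }
       .sort_by { |p| -shared_title_words(target, p).size }
       .first(n)
end
```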
Using this method to find posts on this blog similar to “How The Support Team Improves The Product” gives us the following top 10:
As you can see, posts about running an effective support process have little in common with cohort analysis, or debate around the merits of design. We can do better.
Let’s try a real machine learning approach. We’re going to break this into two parts:

1. Represent each post mathematically.
2. Use those representations to find clusters of similar posts.
If we can represent our posts mathematically, we can plot the posts, compare distances between posts, and identify clusters of similar posts.
Mapping each post to a mathematical representation is easy; we can do it in two steps:

1. Create a set of all the words used across all of our posts.
2. For each post, create a vector with a 1 for every word in that set the post contains, and a 0 for every word it doesn’t.
If @words equaled:
['hello', 'inside', 'intercom', 'readers', 'blog', 'post']
A post with the body “hello blog post readers” would be mapped to:
[1,0,0,1,1,1]
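Here’s a minimal sketch of that two-step mapping in Ruby, assuming each post body is a plain string (the helper names are mine, not from the original code):

```ruby
# Step 1: build the vocabulary - the set of all words used across all posts.
def build_words(bodies)
  bodies.flat_map { |body| body.downcase.split }.uniq
end

# Step 2: map a post to a binary vector over that vocabulary,
# with a 1 where the post contains the word and a 0 where it doesn't.
def to_vector(body, words)
  post_words = body.downcase.split
  words.map { |word| post_words.include?(word) ? 1 : 0 }
end

words = ['hello', 'inside', 'intercom', 'readers', 'blog', 'post']
to_vector('hello blog post readers', words)
# => [1, 0, 0, 1, 1, 1]
```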
We don’t have simple tools for plotting vectors in 6 dimensions like we do for vectors in 2 dimensions, but concepts like distance are easily extrapolated. (It’s still useful to keep the 2-dimensional picture in mind.)
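For instance, the Euclidean distance between two of our 6-dimensional vectors is computed exactly the same way it would be in 2 dimensions (a small sketch, not from the original code):

```ruby
# Euclidean distance generalises naturally to any number of dimensions.
def distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

distance([1, 0, 0, 1, 1, 1], [1, 1, 0, 0, 1, 1]) # => ~1.41
```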
Now that we have a mathematical representation of our blog posts, let’s try to find clusters of similar posts. To do this we’re going to use a crazy simple clustering algorithm called K-Means. It can be described in 5 steps:

1. Choose the number of clusters you want to find (k).
2. Pick k random points in the same space as your posts.
3. Assign each post to its closest point.
4. Re-evaluate the center of each cluster to be the average of all posts in that cluster.
5. Repeat steps 3 and 4 until the assignments stop changing.
Let’s visualize these steps. First, we choose 2 random points (i.e. k = 2) in the same space as our posts:
We assign each document to its closest point:
We re-evaluate the center of each cluster to be the average of all posts in that cluster:
That’s the end of our first iteration. Now we re-assign each post to its new closest point:
We’ve found our clusters! We know this because further iterations would not change the assignments.
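Here’s a minimal Ruby sketch of K-Means along the lines described above, assuming each post has already been mapped to a binary vector as before. It’s an illustration of the steps, not the original code:

```ruby
# Euclidean distance between two vectors (same helper as the sketch above).
def distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Plain K-Means: pick k starting centers, assign each vector to its closest
# center, move each center to the average of its members, and repeat until
# the assignments stop changing (or we hit the iteration cap).
def k_means(vectors, k, iterations = 100)
  centers = vectors.sample(k)
  assignments = []

  iterations.times do
    new_assignments = vectors.map do |v|
      (0...k).min_by { |i| distance(v, centers[i]) }
    end
    break if new_assignments == assignments
    assignments = new_assignments

    # Re-evaluate each center as the average of the vectors assigned to it.
    centers = (0...k).map do |i|
      members = vectors.select.with_index { |_, j| assignments[j] == i }
      next centers[i] if members.empty?
      members.transpose.map { |dim| dim.sum.to_f / members.size }
    end
  end

  assignments
end
```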
Here are the top 10 similar posts to “How The Support Team Improves The Product” with this method:
The results speak for themselves.
We achieved all of this with fewer than 40 lines of code and some simple algorithms that can be described in a blog post. However, you would never know how simple some of these ideas are from reading the academic literature. Here’s an excerpt from the paper that introduced K-Means (it’s hard to pinpoint the exact first introduction of the algorithm, but this was the first paper to use the term “K-Means”):
The academic literature can often be useful, if you’re willing to work through the notation. However, there are a lot of excellent alternative resources that are more practical and approachable:
Want to suggest tags in your project management app? Or assignees in your customer support tool? Or members of a group on a social network? Chances are that some simple code and an easy algorithm will get you there. So, when faced with a challenge in your product where you believe machine learning can help, don’t be discouraged.
Machine learning is easier than you might think.