When you are starting out with machine learning, you’re often taught with what’s called the iris challenge.
For those who don’t know, iris is a genus of flowering plants. Here’s a photo that I’ll use to explain the challenge.
The background is that flowers are grouped into different categories based on characteristics like size and color. In this case, an iris can be one of three types: versicolor, setosa or virginica. Each has unique characteristics.
We start with a data set to train our model, or machine learning algorithm. This means we take data for a range of accurately classified flowers, each described by a set of characteristics, and feed it to the model. This way the machine can learn how to perform the classification for us on a larger set of data.
Once ready, we can pass a seemingly limitless amount of data through our algorithm and get predictions with accuracy that can reach, and even exceed, 97%.
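To make the train-then-predict idea concrete, here’s a minimal sketch in plain Python. It is not the method behind the 97% figure. It uses a deliberately simple technique (1-nearest-neighbor: give a new flower the label of the most similar flower we have already seen), and the tiny training set, the four measurements per flower and the names `TRAINING_DATA`, `classify` and `classify_many` are all assumptions made for illustration.

```python
import math

# Illustrative training set (an assumption for this sketch): each entry is
# (sepal length, sepal width, petal length, petal width) in cm, plus the
# known, already-classified iris type.
TRAINING_DATA = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
    ((7.1, 3.0, 5.9, 2.1), "virginica"),
]


def classify(measurements, training_data=TRAINING_DATA):
    """Return the label of the training flower closest to `measurements`."""
    nearest = min(
        training_data,
        key=lambda example: math.dist(example[0], measurements),
    )
    return nearest[1]


def classify_many(flowers, training_data=TRAINING_DATA):
    """Classify a whole batch of flowers, one prediction per flower."""
    return [classify(flower, training_data) for flower in flowers]


if __name__ == "__main__":
    # A flower the model has not seen before.
    print(classify((5.0, 3.4, 1.5, 0.2)))
    # The same call scales to a large batch without any extra work from us.
    batch = [(6.0, 2.9, 4.5, 1.5)] * 10_000
    print(len(classify_many(batch)))
```

A real project would use far more labeled flowers and hold some of them back to measure accuracy, but the shape is the same: known examples go in, and predictions for new flowers come out.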
So what’s the takeaway here?
Well, if you’re an expert botanist (someone who studies plants), you may be able to classify flowers based on what you already know or can easily find out, because it’s your field of study. However, if that same expert had to classify 1,000 or 10,000 flowers, how long would that take?
I can tell you how long it would take using machine learning…seconds.
That’s right, the machine learning algorithm (once trained) can pore over a mountain of data in less time than it would take us to open the file. That’s the power of machine learning.
How cool is that?
P.S. If you want to know more about this challenge, check out this website. It’s where data scientists like me go to experiment :).