Machine Learning – Is it hard to learn?
I was recently at a friend’s place for dinner and they had a Google Home device. Google Home is a little squat device with a built-in speaker and microphone that you talk to. I was astounded at how well it worked. Just ask it to play a song and the device plays it – seemingly almost any song from any era or any genre. Ask it a question and it will give you some sort of answer – asking what the weather is works well. Google Home did not need any individual voice training, even as the night’s festivities went on! How did it work? It seemed magical.
So, the following week I purchased a Google Home device online for about 150 bucks – it seemed cheap! I plugged it in, downloaded the Google Home app on my smartphone and connected it with no hassle. Then I found out a bit more about how it works. Ah, to play any song you need a Spotify Premium account! But you can play Spotify stations for free if you do not mind the odd ad.
What does this mean? First, the Google Home device has very good contextual speech recognition – this is the magical part. Second, Google Home interfaces with Google Assistant and Spotify (in my case) via the cloud. It can also hook into Netflix (given the right interface and Netflix account level) and can even control your lights (given they are wired in with a compatible controller). Google Home has an extensive vocabulary, such as “OK Google, Play ”. There is much more.
Google Home is built on top of the Google Assistant SDK, which utilises technologies such as Google’s Cloud Speech API. These technologies enable developers to convert audio to text by using Deep Neural Networks. You can transcribe the speech of users dictating to an application’s microphone, process voice commands, transcribe audio files and handle other use cases.
From an application development perspective, we can see that given a well-defined API, software development could be straightforward – well, maybe! The magical part is the Speech Recognition. Google Home is an example of using a prebuilt and pre-trained Speech Recognition ML Model. To achieve this level of sophistication and accuracy, Google’s speech recognition has been trained on a lot of data – probably petabytes.
Google, AWS, Microsoft, IBM and others offer prebuilt ML Models to solve generalised problems, such as Speech Recognition, Natural Language Processing and Image Recognition. You can also build your own ML Models using specialised software, such as TensorFlow, to solve your specific problems, such as Weather Forecasts, Market Trends, Future House Prices and Credit Worthiness. This now becomes complicated and highly specialised! In simple terms, machine learning can be summarised as:
ML Models are trained using data to predict future values.
Training an ML Model involves providing a ‘learning algorithm’ with ‘training data’ to learn from. For example, a Neural Network can use a Gradient Descent algorithm to adjust its internal weights towards a best-fit solution. The learning algorithm finds patterns in the training data that map to the answer that you want to predict. The ‘trained ML Model’ is the artefact (application) that is created by the training process and deployed for prediction purposes. Building an ML Model requires:
Data – ML is a data-driven process, so the data must be correct, relevant and in the right form. Data may have to be normalised, transformed and derived so it can be presented to an ML Model in a form the model can be trained on. Typically, you need a lot of data.
Model Design – Model design is at the heart of ML. There are many types and variants such as Classifiers and Regressors using Deep Neural Networks. In addition, there are parameters (hyper-parameters) that define how the model learns and when to stop learning.
Training – Training an ML Model involves repeatedly presenting batches of data to the learning algorithm until it finds a best-fit answer.
Evaluation – Evaluating the ML Model is important. You may have trained your ML Model well. But will it perform well (predict accurately) on an independent set of data and in real life?
Deployment – Deploying an ML Model involves deciding how you will integrate your model into your application.
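To make these steps a little more concrete, here is a minimal sketch in plain Python. It is only an illustration, not how Google or any real product does it: the data is synthetic (I have invented a straight-line relationship, y = 3x + 2, plus some noise), the model is a single weight and bias rather than a Neural Network, and names such as make_data, train and predict are my own. It normalises the data, trains with mini-batch gradient descent, evaluates on data the model has not seen, and wraps the result in a predict function that an application could call.

```python
import random


def make_data(n, seed=0):
    """Synthetic data (an invented example): y = 3x + 2 plus a little noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3.0 * x + 2.0 + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


def train(xs, ys, epochs=200, batch_size=16, lr=0.05, seed=0):
    """Mini-batch gradient descent on mean squared error for y = w*x + b."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # present the batches in a new order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = grad_b = 0.0
            for i in batch:
                error = (w * xs[i] + b) - ys[i]
                grad_w += 2 * error * xs[i]
                grad_b += 2 * error
            # Step against the average gradient of the batch
            w -= lr * grad_w / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b


def mean_squared_error(xs, ys, w, b):
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Data: keep 20% aside so the model is evaluated on data it never trained on
xs, ys = make_data(200)
train_x, test_x = xs[:160], xs[160:]
train_y, test_y = ys[:160], ys[160:]

# Normalise using statistics from the training data only
mean = sum(train_x) / len(train_x)
std = (sum((x - mean) ** 2 for x in train_x) / len(train_x)) ** 0.5
train_xn = [(x - mean) / std for x in train_x]
test_xn = [(x - mean) / std for x in test_x]

# Training: batch_size, lr and epochs are the hyper-parameters here
w, b = train(train_xn, train_y)

# Evaluation on the held-out data
test_error = mean_squared_error(test_xn, test_y, w, b)
print("held-out mean squared error:", test_error)


# Deployment: the trained model is just w, b and the normalisation constants
def predict(x):
    return w * (x - mean) / std + b


print("prediction for x = 5:", predict(5.0))
```

Even in this toy example, most of the lines are about preparing the data and checking the result rather than the learning step itself, which is fairly typical.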
In answering the question: Is Machine Learning hard to learn?
The answer is that it depends:
1. If you are using a prebuilt and pre-trained ML Model, such as a Speech API, then it is a relatively straightforward process of integrating the API into your application.
2. If you are using a prebuilt ML Model that requires training, then you need to gather lots of the correct data, make sure it is in the right form and understand how to train, evaluate and deploy the Model.
3. If you need to develop your own ML Model, then it will take significant expertise to design, build, train, evaluate and deploy it.
Machine learning is exciting and makes us think in different ways about solutions and even what the problems are. The paradigm shift has been summarised as:
Instead of programming computers we train them.
From a practical perspective, we are at the start of this paradigm shift where there is great opportunity to integrate ML Models into software applications to do some magical things.