Hello, my name is Mrs. Holborow and welcome to Computing.

I'm so pleased you've decided to join me for the lesson today.

In today's lesson, we'll be exploring how machine learning models are trained.

What's the difference between supervised and unsupervised learning? Welcome to today's lesson from the unit, "Data science, AI and Machine Learning".

This lesson is called, "Approaches to Training Machine Learning Models", and by the end of today's lesson, you'll be able to explain the difference between supervised and unsupervised machine learning models.

Shall we make a start? We will be exploring these keywords throughout today's lesson.

Let's take a look at them together now.

Supervised learning.

Supervised learning, a form of machine learning where the model is trained using labelled data.

Unsupervised learning.

Unsupervised learning, a form of machine learning where the model is trained on an unlabelled dataset.

The model is designed to detect patterns, hidden relationships or structures within the data.

Look out for these keywords throughout today's lesson.

Today's lesson is broken down into two parts.

We'll start by describing supervised learning and then we'll move on to describe unsupervised learning.

Let's make a start by describing supervised learning.

Machine learning can be categorised into three main types, supervised learning, unsupervised learning and reinforcement learning.

Supervised learning uses examples where answers are already known.

It's good for problems such as recognising animals in photos.

Unsupervised learning finds patterns in data without being provided with the answers.

It's good for problems such as grouping similar music.

Reinforcement learning is trained using trial and error, getting rewards or penalties.

It's good for problems such as training a computer to play a game.

In this lesson, we're going to be exploring supervised and unsupervised learning.

Supervised learning is a form of machine learning where the model is trained using labelled data.

In this context, labelled data means that each example in the training dataset is paired with its correct label.

So we've got some examples here of some fruits.

So we have an orange, a banana and an apple, and each has been labelled with the name of the fruit.
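
As a rough illustration, labelled data like this could be written down in Python as a list of pairs, where each example sits next to its correct label. This is only a sketch: the name `labelled_fruit` and the colour and length values are invented for the example and are not part of the lesson.

```python
# Hypothetical labelled training data: each example is paired with its label.
# The feature values (colour, length in cm) are invented for illustration.
labelled_fruit = [
    ({"colour": "orange", "length_cm": 7}, "orange"),
    ({"colour": "yellow", "length_cm": 18}, "banana"),
    ({"colour": "red", "length_cm": 8}, "apple"),
]

# Separate the examples (features) from the answers (labels).
features = [example for example, _label in labelled_fruit]
labels = [label for _example, label in labelled_fruit]

for example, label in labelled_fruit:
    print(example, "->", label)
```

The key point is the pairing: every example in the training dataset comes with the answer attached.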

Imagine an art teacher who is teaching their students to identify paintings by a particular artist.

They do not tell the students the style or characteristics they are looking for.

They just show them a slideshow with hundreds of paintings that are labelled with the name of the artist who produced them.

The students spend time trying to identify common characteristics in the paintings that were produced by the artist in question, such as the subject matter, use of colour and brush stroke style and technique.

Once they have been trained, the students study paintings they have not seen before and predict whether they were painted by the artist.

If the training has been successful, they are likely to make accurate predictions.

This is very similar to how a model is trained in supervised learning.

The model is trained to detect patterns, relationships and features that map each piece of training data to its associated label.

Classification is a common task in supervised learning.

For example, email filters use classification to detect whether emails are spam or not.

So in this diagram, we can see that the data has been classified into two groups: not spam, shown as dots, and spam, shown as triangles.

Once trained, the model can be used to predict a label for new unprocessed data.

In the spam filter example, a newly arrived email can be labelled as spam or not spam based on the patterns detected in the training data.
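
To make the spam filter idea more concrete, here is a minimal Python sketch of a very simple word-counting classifier. It is an assumption made for illustration, not how real email filters work: the training emails, and the function names `train` and `predict`, are all made up for this example.

```python
from collections import Counter

# Hypothetical labelled training emails: (text, label).
training_emails = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting moved to monday", "not spam"),
    ("homework is due on monday", "not spam"),
    ("see you at the meeting", "not spam"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in examples:
        word_counts[label].update(text.split())
    return word_counts


def predict(word_counts, text):
    """Label new text by which label's training words it matches more often."""
    words = text.split()
    spam_score = sum(word_counts["spam"][w] for w in words)
    not_spam_score = sum(word_counts["not spam"][w] for w in words)
    return "spam" if spam_score > not_spam_score else "not spam"


model = train(training_emails)
print(predict(model, "free prize inside"))      # spam
print(predict(model, "monday meeting agenda"))  # not spam
```

Neither of the two new emails appears in the training data, but the model can still label them because it has stored word patterns rather than whole emails.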

The term supervised comes from the idea that the model's learning is being guided by the labelled data provided by a human.

Okay, time to check your understanding.

I have a true or false statement for you.

In supervised learning, the model stores the training data to label new unprocessed data.

Is this true or false? Pause the video whilst you have a think.

This is false.

A supervised learning model does not simply store the training data in its memory.

It detects patterns and relationships in the training data and stores these.

If it only stored the training data, it would not be able to accurately label new unprocessed data.
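
The difference between storing the data and storing a pattern can be sketched in a few lines of Python. This is a toy example under stated assumptions: the numbers, the labels "small" and "large", and the names `memoriser` and `pattern_model` are invented, and the "pattern" is just a midpoint between two averages.

```python
# Hypothetical labelled training data: (value, label). Values are invented.
training_data = [
    (1, "small"), (2, "small"), (3, "small"),
    (8, "large"), (9, "large"), (10, "large"),
]


def memoriser(value):
    """Only stores the training data: it can answer for values it has seen."""
    lookup = dict(training_data)
    return lookup.get(value)  # None when the value was never seen


def learn_threshold(examples):
    """Learn a simple pattern: the midpoint between the two groups' averages."""
    small = [x for x, label in examples if label == "small"]
    large = [x for x, label in examples if label == "large"]
    return (sum(small) / len(small) + sum(large) / len(large)) / 2


threshold = learn_threshold(training_data)


def pattern_model(value):
    """Uses the learned pattern, so it can also label unseen values."""
    return "large" if value > threshold else "small"


print(memoriser(7))      # None: 7 was not in the training data
print(pattern_model(7))  # large
```

The memoriser has no answer for the value 7 because it never saw it, while the model that stored a pattern can still make a prediction.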

Some benefits of supervised learning.

It can be highly accurate.

When trained with enough high-quality labelled data, models can make very reliable predictions.

It has wide applications.

Supervised learning is used in many real world applications, such as spam filtering, voice recognition and medical diagnosis.

There are some drawbacks of supervised learning though.

It's time-consuming, because for the model to be accurate, humans need to provide large amounts of labelled training data.

There is also a risk of overfitting.

The model may fit the training data too closely, picking up its quirks and mistakes rather than general patterns, and then perform badly on new unprocessed data.

Okay, time to check your understanding.

I have a question for you.

What does overfitting mean in supervised learning? Is it A, the model performs well on new unprocessed data, but poorly on the training data, B, the model ignores the training data and makes random predictions, C, the model accurately processes the training data but performs poorly with new data or D, the model cannot identify any patterns from the data? Pause the video whilst you think carefully about your answer.

Did you select C? Well done.

Remember, overfitting means that the model works well on the training data, but performs poorly with new unprocessed data.
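
Overfitting can be shown with a small Python sketch that compares accuracy on the training data with accuracy on new data. Everything here is an assumption made for illustration: the numbers are invented, two training examples are deliberately mislabelled, and the overfitting model is a simple nearest-neighbour lookup, a technique this lesson does not cover.

```python
# Hypothetical training data: (value, label). The true rule is roughly
# "values above 5 are high", but two examples (3 and 7) are mislabelled.
training_data = [
    (1, "low"), (2, "low"), (3, "high"), (4, "low"),
    (6, "high"), (7, "low"), (8, "high"), (9, "high"),
]

# New, correctly labelled data the models have never seen.
new_data = [
    (2.9, "low"), (3.2, "low"), (6.9, "high"),
    (7.2, "high"), (1.5, "low"), (8.5, "high"),
]


def nearest_neighbour(x):
    """Overfits: copies the label of the closest training example, mistakes included."""
    return min(training_data, key=lambda example: abs(example[0] - x))[1]


# A simpler model: one threshold halfway between the two groups' averages.
low = [x for x, label in training_data if label == "low"]
high = [x for x, label in training_data if label == "high"]
threshold = (sum(low) / len(low) + sum(high) / len(high)) / 2


def simple_rule(x):
    """Generalises: ignores individual quirks and uses one overall pattern."""
    return "high" if x > threshold else "low"


def accuracy(model, data):
    """Fraction of examples where the model's prediction matches the label."""
    return sum(model(x) == label for x, label in data) / len(data)


print("nearest neighbour:", accuracy(nearest_neighbour, training_data),
      accuracy(nearest_neighbour, new_data))
print("simple rule:", accuracy(simple_rule, training_data),
      accuracy(simple_rule, new_data))
```

In this toy example, the nearest-neighbour model gets every training example right but only two of the six new examples, because it has learned the two mislabelled points. The simple rule gets two training examples wrong but labels all of the new data correctly.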

Okay, we're moving on to our first task of today's lesson, task A.

Describe supervised learning using a common application of supervised learning to illustrate your answer.

Pause the video whilst you have a go at the task.

How did you get on? Did you manage to describe supervised learning in your answer? Well done.

Let's have a look at a sample answer together.

Remember, this sample answer may include an example that you haven't included in your answer, but that's absolutely fine.

Supervised learning is a form of machine learning where the model is trained on labelled data.

For example, in medical diagnosis, doctors can train a model with patient data, like age, symptoms and test results, that is labelled with whether the patient has a certain disease or not.

The model analyses the patterns in the data and then can predict for new patients whether they are likely to have the disease.

This way, supervised learning helps make useful predictions by using historic labelled data.

Remember, if you want to pause the video here and add any detail to your answer, you can do that now.

Okay, so we've described supervised learning.

Let's now move on to describe unsupervised learning.

In unsupervised learning, the model is trained on an unlabelled dataset, meaning that no target labels are provided during training.

The model is designed to detect patterns, hidden relationships or structures within the data.

A common goal of unsupervised learning is to group similar data points into clusters.

So you can see here, the data has been grouped into three different clusters.

Clustering involves grouping a set of objects in such a way that objects in the same group are more similar to each other than those in other groups.
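
As a rough sketch of how clustering can work, here is a small Python version of one well-known clustering method called k-means, which this lesson does not name. The data points, the function names and the hand-picked starting centres are all assumptions made for illustration; real implementations usually pick the starting centres at random.

```python
import math

# Hypothetical unlabelled data points (x, y). No labels are provided.
points = [
    (1, 1), (1, 2), (2, 1),
    (8, 8), (8, 9), (9, 8),
    (1, 8), (2, 9), (1, 9),
]


def closest(point, centres):
    """Return the index of the centre nearest to this point."""
    distances = [math.dist(point, centre) for centre in centres]
    return distances.index(min(distances))


def k_means(points, centres, rounds=10):
    """Repeatedly assign points to the nearest centre, then move each centre."""
    for _ in range(rounds):
        groups = [[] for _ in centres]
        for point in points:
            groups[closest(point, centres)].append(point)
        # Move each centre to the average position of its group
        # (an empty group keeps its old centre).
        centres = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else c
            for g, c in zip(groups, centres)
        ]
    return groups


# Starting centres are chosen by hand here so the result is repeatable.
clusters = k_means(points, centres=[(0, 0), (10, 10), (0, 10)])
for number, cluster in enumerate(clusters, start=1):
    print("Cluster", number, cluster)
```

Nobody told the program which group each point belongs to. It found three clusters just by measuring which points are close to each other.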

Okay, time to check your understanding.

I have a question for you.

In unsupervised learning, what does clustering mean? Is it A, grouping data points into predefined categories with labels, B, automatically grouping similar data points together without labels or C, randomly assigning data points into groups? Pause the video whilst you think carefully about your answer.

Did you select B? Well done.

In unsupervised learning, clustering means automatically grouping similar data points together without labels.

Online shopping platforms use clustering to make recommendations.

They can group together users who have similar buying patterns in order to recommend new products to them.

Anomaly detection in unsupervised learning involves identifying data points or patterns that significantly deviate from the majority of the data, without prior labelling of what constitutes an anomaly.

So you can see here on our diagram, the anomalies are data points that fall outside of our clusters.

In network security, unsupervised learning models that perform anomaly detection are used to spot unusual network traffic patterns.

Anomalies or unusual patterns may signal potential cyber threats, like intrusion or malware activities.
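
A very simple form of anomaly detection can be sketched in Python by flagging values that are far from the average. This is only one possible approach under stated assumptions: the traffic figures are invented, `find_anomalies` is a made-up name, and the cut-off of two standard deviations is an arbitrary choice for the example.

```python
import statistics

# Hypothetical network traffic: requests per minute, with no labels
# saying which readings are normal and which are unusual.
traffic = [52, 48, 50, 51, 49, 47, 53, 50, 250, 49]


def find_anomalies(values, cutoff=2.0):
    """Flag values more than `cutoff` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) > cutoff * spread]


print(find_anomalies(traffic))  # [250]
```

The reading of 250 is flagged because it deviates significantly from the rest of the data, even though nobody labelled it as an anomaly beforehand.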

Okay, let's check your understanding.

State which description matches the approach.

So the two approaches we have are supervised learning and unsupervised learning, and there are two descriptions.

The first description is, a form of machine learning where the model is trained using labelled data. The model is trained to find the patterns, relationships and features that map each piece of the training data to its associated label.

The second description is, a form of machine learning where the model is trained on an unlabelled dataset. The model must identify patterns, hidden relationships or structures within the data.

Pause the video whilst you have a go.

Did you match them correctly? Let's have a look at the answer.

So the top one, which is a form of machine learning where the model is trained using labelled data, and where the model is trained to find patterns, relationships and features that map each piece of training data to its associated label, is supervised learning.

Unsupervised learning is a form of machine learning where the model is trained on an unlabelled dataset.

The model must identify patterns, hidden relationships or structures within the data.

Okay, we're moving on to our second task of today's lesson, task B.

Describe unsupervised learning using a common application of unsupervised learning to illustrate your answer.

Pause the video whilst you have a go at the task.

How did you get on? Did you manage to describe unsupervised learning? And did you use a common application to illustrate your answer? Well done.

Let's have a look at a sample answer together.

Unsupervised learning is a form of machine learning that finds patterns in data without being given the correct answers.

For example, music apps use unsupervised learning to group songs into playlists.

The app doesn't know the right playlist for each song, but by analysing features like tempo, mood and lyrics, the app can automatically cluster similar songs together.

This way, it can suggest playlists, such as chill study music or energetic workout tracks without needing labels.

Remember, if you used a different common application to illustrate your answer, that's absolutely fine.

If you want to, you can pause the video here to add extra detail or make amendments to your answer.

Okay, we've come to the end of today's lesson, "Approaches to Training Machine Learning Models", and you've done a fantastic job, so well done.

Let's summarise what we've learned together in this lesson.

Supervised learning approaches use large amounts of data that people have labelled with relevant information.

One type of supervised learning is classification.

Machine learning developers train unsupervised learning models to organise data based on similarities.

One type of unsupervised learning is clustering.

I hope you've enjoyed today's lesson and I hope you'll join me again soon.

Bye.
