Lesson 5 of 8
  • Year 10
  • OCR

Data-driven models

I can recognise that AI systems rely on data-driven models and the importance of data quality.

These resources will be removed by end of Summer Term 2025.

These resources were created for remote use during the pandemic and are not designed for classroom teaching.

Lesson details

Key learning points

  1. Data-driven models find patterns in data to make decisions or predictions.
  2. Accurate, complete and unbiased data is essential for effective AI.
  3. Poor data leads to wrong predictions and unreliable AI results.
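The learning points above can be illustrated with a very small sketch. It assumes a 1-nearest-neighbour rule (the prediction copies the outcome of the closest training example) and an invented data set of revision hours and test outcomes; neither the data nor the `predict` function comes from the lesson itself.

```python
# Hypothetical training data: (hours of revision, outcome).
# All values are invented for illustration.
training_data = [
    (1, "fail"), (2, "fail"), (3, "fail"),
    (6, "pass"), (7, "pass"), (8, "pass"),
]


def predict(hours, data):
    """Return the outcome of the training example closest to `hours`."""
    nearest = min(data, key=lambda example: abs(example[0] - hours))
    return nearest[1]


# With accurate data, the model follows the pattern "more revision -> pass".
good_prediction = predict(7.8, training_data)

# The same data with one wrongly entered record: (8, "fail") instead of (8, "pass").
poor_data = [
    (1, "fail"), (2, "fail"), (3, "fail"),
    (6, "pass"), (7, "pass"), (8, "fail"),
]
poor_prediction = predict(7.8, poor_data)

print(good_prediction)  # pass
print(poor_prediction)  # fail
```

The rule is the same in both cases; only the data changed. A single inaccurate record is enough to flip the prediction, which is the sense in which poor data leads to unreliable results.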

Keywords

  • Bias - when something is unfair towards or against something or someone

  • Cleaning - dealing with various issues that are commonly found in raw data sets, such as missing data, duplicated records and outliers
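As a rough sketch of what cleaning can look like in practice, the example below handles the three issues named in the definition: missing data, duplicated records and outliers. The records, the `clean` function and the `max_distance` threshold are all assumptions made for illustration, and removing a record is only one of several possible ways of dealing with each issue.

```python
from statistics import median

# Hypothetical raw records (name, age in years); values are invented.
raw_records = [
    ("Aisha", 15),
    ("Ben", 14),
    ("Ben", 14),      # duplicated record
    ("Chloe", None),  # missing data
    ("Dev", 15),
    ("Ella", 150),    # outlier, probably a typing error
    ("Finn", 16),
]


def clean(records, max_distance=10):
    """Return records with missing values, duplicates and outliers removed."""
    # 1. Drop records that contain a missing value.
    complete = [record for record in records if None not in record]
    # 2. Drop duplicates, keeping the first occurrence and the original order.
    unique = list(dict.fromkeys(complete))
    # 3. Drop outliers: ages further than max_distance from the median age.
    middle = median(age for _, age in unique)
    return [(name, age) for name, age in unique
            if abs(age - middle) <= max_distance]


cleaned = clean(raw_records)
print(cleaned)
# [('Aisha', 15), ('Ben', 14), ('Dev', 15), ('Finn', 16)]
```

The median is used rather than the mean because a single extreme value such as 150 would pull the mean upwards and make the outlier harder to spot.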

Common misconception

A model can always be improved simply by providing more training data.

Although a large data set is important, more data doesn't necessarily improve output. It is important that training data is high-quality, accurate and diverse.
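One way to see why size alone is not enough is to compare how well each group is represented in two data sets. The sketch below uses invented cat-colour labels and a hypothetical `smallest_share` measure (the share of the least represented group); it is only one simple indicator of diversity, not a complete test for bias.

```python
from collections import Counter

# Hypothetical label lists for two training sets of cat pictures.
large_set = ["black"] * 980 + ["brown"] * 15 + ["white"] * 5   # 1000 pictures
small_set = ["black"] * 20 + ["brown"] * 20 + ["white"] * 20   # 60 pictures


def smallest_share(labels):
    """Return the fraction of the data set taken up by its rarest label."""
    counts = Counter(labels)
    return min(counts.values()) / len(labels)


print(len(large_set), smallest_share(large_set))  # 1000 0.005
print(len(small_set), round(smallest_share(small_set), 3))  # 60 0.333
```

The larger set contains far more pictures, yet white cats make up only 0.5% of it. A model trained on it has very few examples from which to learn what a white cat looks like, so adding more black-cat pictures would not address the gap.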


Teacher tip

This lesson explores bias in training data. You could ask pupils to test an AI content generator and judge whether its output displays any bias or preference towards particular groups or individuals.

Licence

This content is © Oak National Academy Limited (2025), licensed on Open Government Licence version 3.0 except where otherwise stated. See Oak's terms & conditions (Collection 2).

Prior knowledge starter quiz

6 Questions

Q1.
Which of the following is a potential risk associated with the introduction of AI systems?

increased outdoor activities
lower electricity bills
Correct answer: job displacement
better handwriting

Q2.
Why is privacy an important issue in the context of AI?

Correct answer: It protects personal information from misuse.
It helps computers run faster.
It increases internet speed.
It improves graphics quality.

Q3.
What is the term for when an AI system produces results that are unfair towards certain groups or individuals?

Correct answer: bias

Q4.
Arrange these steps in order to help make AI systems fairer:

1 - collect diverse data
2 - check for bias in data
3 - train the AI model
4 - review the AI model's decisions

Q5.
Why is it important for AI systems to be transparent?

to make them more expensive
to reduce the number of users
to keep the technology secret
Correct answer: to allow users to understand and question decisions

Q6.
Which statement about AI systems and access to technology is correct?

Correct answer: AI systems can actually increase inequality of access to technology.
AI systems guarantee equal access to digital tools for everyone.
Only wealthy individuals benefit from AI systems.
AI systems are free for all users worldwide.

Assessment exit quiz

6 Questions

Q1.
When choosing data to train an AI system, what should you look for?

Correct answer: data that is representative, high quality, and ethically sourced
any data you can find, regardless of quality or source
data that is biased towards one group so the model learns faster
data that is random and unverified

Q2.
A team is preparing raw data to train an AI model. They notice some missing entries, repeated rows, and strange values that don’t fit the pattern. What process should they carry out?

data mining
Correct answer: data cleaning
data visualisation
data compression

Q3.
Match the keyword to the definition.

Correct answers:

  • bias - when something is unfair towards or against something or someone

  • cleaning - dealing with various issues that are commonly found in raw data sets

  • outlier - a data point that's significantly different from others in the dataset

Q4.
Which of the following is NOT a common cause of duplicated records in a dataset?

data entry errors
combining data from multiple sources
Correct answer: storing data in a secure, encrypted format

Q5.
Data points that are significantly different from other data points in the data set are called:

Correct answer: outliers (also accepted: outlier)

Q6.
A student builds a machine learning model to recognise cats. They train it mostly with pictures of black cats and very few brown or white cats. What problem might occur?

The model will identify all cats equally well.
Correct answer: The model may fail to identify all cats correctly because of data bias.
The model may fail to identify black cats.
The model will be completely random.