Hello everyone, it's Mr. Millar here.

In this lesson, we're going to be looking at univariate and bivariate data.

So first of all, I hope that you are all doing well.

And just in case you haven't seen any of my videos on Oak before, my name is Mr. Millar, and I'm a maths teacher at a secondary school in central London.

And yeah, I'm really excited to be doing these lessons for you, particularly because I'm really interested in this topic, which is statistics.

Maybe you've already seen the lessons on the unit before on univariate data.

This unit is going to be on bivariate data, so I really hope you enjoy it.

Anyway, without further ado, let's have a look at the try this task.

So Zaki and Cala have recorded height and weight data from 10 people.

What's the same and what's different about their data sets? Write a sentence or two about what's the same and a sentence or two about what's different between these two data sets. Pause the video now for three or four minutes to have a think.

Okay, great.

Let's go through this.

So first of all, what's the same is, of course, they are both recording height and weight.

And what is also the same is that the data that they're getting is the same.

So if you have a look at all the heights that Zaki has recorded, they're the same as all the heights that Cala has recorded.

And the same with the weights.

So that's what is the same, but there is an important difference here, which is that Zaki, of course, has recorded the heights and the weights separately, whereas Cala has recorded them together.

And that is a really important difference because Zaki's got all this data, but he doesn't know, for example, the weight of someone who is 1.61 metres.

Whereas Cala does. So, for example, she knows that the fourth person in her data is both 1.72 metres tall and 73.5 kilogrammes in weight.

So that's a really important difference.

Can you think why it's more useful to have Cala's data? Well, it's more useful because if you were interested in looking at the relationship between height and weight, then you would want Cala's data, because that would allow you to look at those two things together.

And it's this relationship between two different variables that we're going to be looking at across this unit.
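
As a rough illustration of that difference, here is a minimal Python sketch. Only the fourth record (1.72 metres, 73.5 kilogrammes) comes from the lesson; every other value and all the variable names are made up for the example.

```python
# Cala's style: one record per person, height (m) and weight (kg) kept together.
# Only the fourth record (1.72 m, 73.5 kg) is from the lesson; the rest are invented.
cala = [
    (1.61, 58.0),
    (1.80, 81.2),
    (1.55, 49.5),
    (1.72, 73.5),
]

# Zaki's style: the same values, but stored as two separate, unlinked lists.
zaki_heights = sorted(h for h, _ in cala)
zaki_weights = sorted(w for _, w in cala)

# Only Cala's paired records can answer "what does the 1.61 m person weigh?"
weight_by_height = dict(cala)
print(weight_by_height[1.61])
```

Zaki's two lists hold exactly the same numbers, but nothing in them says which weight belongs to which height, so the lookup in the last two lines is only possible with the paired data.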

So let's have a look at the connect task.

Okay.

Let's have a read of this together.

If we want to see if there is a relationship between two variables, for example, a person's height and weight, we need to record both variables for each data point.

So first of all, this word variable, in case you haven't come across this before, it's a really important word in statistics.

And what it means is it's essentially something that we can record or something that we can measure.

So for example, in the previous slide, both the height and the weight were variables because we could record them or measure them.

So in this task, what I want you to think about is what you would record if you wanted to explore what the most popular sport is, what the most popular sport is for different age groups, and how much time people of different ages spend on social media.

So what variables would you want to record to explore these three things? Write a sentence on each of these things.

Pause the video now for two or three minutes to think about this slide.

Okay, great.

Let's go through this together.

So the first one, what the most popular sport is, you would want to ask people what their favourite sport is.

And that would get you some interesting results maybe.

You might find that the most popular sport is tennis or football, or whatever.

So that's what you would record.

And this is typically called univariate data because there is only one variable.

So we are only recording one thing here.

Now, the second one, what the most popular sport is for different age groups, there are two different things that we need to record here.

First of all, favourite sport.

And second of all, age.

So there are two things, two variables that we would need to record here, so this is what we call bivariate data.

And the "bi" in bivariate comes from Latin and means two.

So we are collecting data on two different variables here. What would this allow us to discover? Well, it would allow us to discover what the most popular sport is depending on how old you are.

So maybe younger people prefer football.

And if you're slightly older, you prefer cricket or something like that.

So this would give us more data, or richer data, to have a look at.

How about the last one? How much time people of different ages spend on social media? What would you record for this one? Well again, you would record age here because we're looking at different ages.

And you would also record time spent on social media.

And again, that will give you a relationship between how old you are and how much time you spend on social media, which is really useful.

So this may seem like a simple task, but this is just getting you to think about why it's sometimes important to record two different things.
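
The sport example can be sketched in a few lines of Python. This is only an illustration: the survey answers, the age groups and the variable names are all invented, not taken from the lesson slides.

```python
from collections import Counter

# Hypothetical survey answers: (age group, favourite sport). All values are made up.
answers = [
    ("11-13", "football"),
    ("11-13", "football"),
    ("11-13", "tennis"),
    ("14-16", "cricket"),
    ("14-16", "cricket"),
    ("14-16", "football"),
]

# Univariate: ignore age and count one variable only.
sport_counts = Counter(sport for _, sport in answers)
most_popular = sport_counts.most_common(1)[0][0]

# Bivariate: keep age and sport together, then count within each age group.
by_age = {}
for age_group, sport in answers:
    by_age.setdefault(age_group, Counter())[sport] += 1

favourite_by_age = {age: counts.most_common(1)[0][0] for age, counts in by_age.items()}

print(most_popular)
print(favourite_by_age)
```

With one variable you can only name the overall favourite; keeping the age group alongside each answer is what lets you name a favourite per age group.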

Let's have a look at the independent task.

Okay, so this task needs some explaining.

So it says here, describe in a sentence what each of the results of these data tables would be able to show you.

So we've got four different data tables here.

And for each one, I want you to write a sentence that would describe what the results of this table would show you.

So for example, if we have a look at the first one here.

Let's say that this table was completed.

It doesn't matter what the results are, but let's just say the data table was completed like this.

What would this show you? Well, you could see that three people take the bus, one person walks, and five people take the train.

So it would show you the most common form of transport.

So this first table is only recording one thing and it would tell us the most common form of transport.

You've got three more to do here.

What would the results of each table show you? Pause the video, write down a sentence for each of these, four or five minutes.

Pause the video now.

Okay.

So let's go through this and starting off with the one on the left-hand side, because it is the most straightforward, the journey duration in minutes and frequency.

Well again, you're only recording one thing here.

You're only recording how long the journey is.

So again, this would tell you something like the most common length of journey.

So maybe, for example, you would find that the most common journey is one that takes 11 to 15 minutes, and only one person has a journey of more than 20 minutes.

Maybe that's what this would tell you.

The other two are a little bit more tricky because there are a number of things that are being recorded here.

So let's first have a look at this one down here.

So we've got different students and we are asking them for their journey duration and their journey length.

So maybe student number one, their journey takes 10 minutes, and they're going two kilometres.

Maybe student two is going for 25 minutes, and their journey is six kilometres.

What would the results of this table show you? Well, there's actually a few things that this table could show you.

So first of all, you could work out the average journey time.

You could work out the average journey length, and the most common journey time, the most common journey length.

And so you could work out different pieces of data for both of these variables.

But there's also something that you could have a look at if you looked at these two variables together.

So if you looked at these two variables together, journey duration and journey length, you might be able to look at the relationship between them.

So for example, is it true that longer journeys take a longer time? Maybe it's true, and this table would allow you to look at that relationship.
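
Here is a small Python sketch of both uses of that table. The first two rows are the examples mentioned above (10 minutes for two kilometres, 25 minutes for six kilometres); the other rows and the variable names are made up, and the sorted-order check is just one simple way to look for a relationship, not a formal statistical test.

```python
from statistics import mean

# (duration in minutes, length in km) for each student.
# The first two rows are the lesson's examples; the others are invented.
journeys = [
    (10, 2.0),
    (25, 6.0),
    (15, 3.5),
    (30, 8.0),
    (5, 1.0),
]

durations = [d for d, _ in journeys]
lengths = [km for _, km in journeys]

# Each variable on its own (univariate summaries).
mean_duration = mean(durations)
mean_length = mean(lengths)

# Both variables together: once sorted by length, do the durations also rise?
by_length = sorted(journeys, key=lambda j: j[1])
durations_in_length_order = [d for d, _ in by_length]
longer_takes_longer = durations_in_length_order == sorted(durations_in_length_order)

print(mean_duration, mean_length, longer_takes_longer)
```

The averages only need one column each, whereas the last check needs each student's duration and length to stay paired.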

And the final one, frequency by mode of transport and journey duration.

Well, this time we're looking at four different modes of transport.

So for bus, walk, train, et cetera, how long does each take? And from the results of this table, again, you could see the most common form of transport.

You could see the most common journey duration, that kind of thing.

But it would also tell you, for each mode of transport, how long does each journey take? So if, for example, you were interested in seeing which mode of transport had the shortest journey duration, this table set up would allow you to work that out.
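
A table like this can be sketched as a count of pairs. In the Python below the responses, the duration bands and the names are all hypothetical; it only shows the idea of counting two variables together.

```python
from collections import Counter

# Hypothetical responses: (mode of transport, journey duration band in minutes).
responses = [
    ("bus", "11-15"),
    ("bus", "16-20"),
    ("bus", "11-15"),
    ("walk", "0-5"),
    ("train", "16-20"),
    ("train", "more than 20"),
]

# Count each (mode, duration band) pair: this is the two-way table.
two_way = Counter(responses)

# Adding up across the bands recovers the one-variable table for mode of transport.
mode_totals = Counter(mode for mode, _ in responses)

print(two_way[("bus", "11-15")])
print(mode_totals["bus"])
```

The pair counts hold everything the single-variable table does, plus how the durations split within each mode of transport.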

So what this is telling us is that, for these two starred tables, because we are recording more than one variable, it would allow us to look at the relationship between those two variables.

So they would tell us a lot more.

And this is going to be the focus of this unit.

Let's have a look at the final slide, the explore task.

Okay.

Antoni is interested in finding out how the sun's intensity varies over time.

He has a solar meter which can measure how strong the sun is at any given moment.

He wants to take 20 recordings.

What should he record? And what would you expect the findings to be? Okay.

So just to make sure you understand this, we are interested in the sun's intensity, and we have got 20 recordings that we could take.

So your job is to tell me what he should record.

There's probably more than one thing you should look at here.

So, just to make that clear, you can record more than one thing for each recording.

And there's more than one right answer here.

So pause the video and have a think.

Write down a couple of sentences.

What would you record? What would you expect the findings to be? Okay, great.

So let's talk through this.

And there's a number of things that you could have said, and two things that I will go over.

The first is that you could record the solar intensity at different times in the day.

So maybe you were interested in finding out at what time the sun is strongest.

But what you could do is you could record the solar intensity at, I don't know, eight o'clock in the morning, and then nine o'clock in the morning, at 10 o'clock, et cetera.

And you might find that, you know, as you get later in the day, the intensity goes up because the sun is getting higher.

And then maybe it's strongest at three o'clock, maybe something like that.

And then after three o'clock, the intensity goes down.

So in this case, you're recording two different things.

You're recording the time of day and the solar intensity.

So that is one thing that you could try.

Another thing that you could try, and maybe you've thought of this, is not taking recordings within each day. Instead, you might be interested in what time of year the sun is strongest.

So is the sun strongest in June, or July, or August, or whatever? So if you were interested in that, you might do a recording each month.

So here's May, here's June, here's July.

And you might find that, you know, as you get into summer, the solar intensity goes up.

And then maybe as you get into winter, it goes down again, something like that.

But it's really important that if you decided to do this, you would need to record the solar intensity at the same time of day.

Because if, for example, you took the intensity in May at three o'clock, but in June at eight o'clock, then that would give you unreliable results.

So it must be the same time of day.

So again, you're recording two different variables here: the month that you're recording in and the solar intensity.

And again, this would allow you to look at the relationship between the two things.
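
To show why the time of day has to be kept the same, here is a short Python sketch. Every reading, the choice of three o'clock as the fixed hour, and the names are assumptions made for illustration; they are not Antoni's data.

```python
# Hypothetical recordings: (month, hour of day, intensity reading). All values are made up.
recordings = [
    ("May", 15, 620),
    ("June", 8, 310),
    ("June", 15, 700),
    ("July", 15, 720),
]

# Keep only readings taken at the same hour so that months are compared fairly.
FIXED_HOUR = 15
comparable = [
    (month, reading)
    for month, hour, reading in recordings
    if hour == FIXED_HOUR
]

print(comparable)
```

Leaving the eight o'clock reading in would make June look weaker than May for a reason that has nothing to do with the month.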

So I hope that this has whetted your appetite for the rest of this unit.

Because looking at the relationship between two things is really at the heart of statistics.

So allowing us to understand this is going to really help us out in the future.

So anyway, thanks very much for watching today's lesson.

That is all for today.

Next time we're going to be looking at scatter graphs, so that's going to be really fun.

So thanks very much for watching.

Have a great day.

See you next time.

Bye-bye.