Hello, my name's Mr. Davidson.

I'm going to be guiding you with our learning today.

Today's lesson is called "Structuring a Branching Database" from the unit, "Organising Data Using Databases." We are going to work really hard together today, but I'm here to help and we can learn together.

By the end of today's lesson, you are going to be able to explain why it is useful for a database to be well structured.

There are going to be two keywords that we see in the lesson today.

The first is order, and that's the way that objects or people are arranged in a sequence.

The second keyword is structure, and that refers to the way that related parts are organised.

There are two learning cycles today.

We're gonna start with the first, which is comparing two branching database structures. Just to remind ourselves what we mean by a branching database, we have an example here.

In this example, we've got objects that are dinosaurs and we put them into different groups based on their attributes, which we asked about through yes, no questions.

We started with a first question, "Does it have horns?" where six dinosaurs were put into two groups of three.

By going down the yes branch with the thumbs up, for our three remaining dinosaurs, we asked a second question about their attributes, "Does it walk on four legs?" One dinosaur walks on two legs and the other two walk on four.

So we needed to ask a third question about their attributes.

And we asked, "Does it have bones along its back?" which allowed us to split the remaining two dinosaurs into their own groups.

Similarly, on the right hand side, we had three dinosaurs that did not have horns, so that followed the thumbs down, which was our no branch to that answer.

We then categorised the three dinosaurs into other groups by using another attribute, whether they can fly, where we asked the yes, no question, "Does it fly?" One of those three didn't, so it ended up in its own group. With the two remaining, we needed a further attribute and therefore a further yes, no question, "Does it have a beak?" One did and the other didn't, and that completed our branching database.

And remember, it doesn't matter what the objects are, we still follow that same process.

We start with a question at the top and we start splitting the objects down into groups.

Our aim is always to get an object into its own individual group.

And we try and split those equally by asking the right questions about the attributes of each of the objects.

Now is there anything you notice about this particular branching database? Jacob's had a look and he says it looks just like the branching databases that he's been making, and that's good.

Aisha says that, "It follows the tree structure well.

The branches are evenly spread on both sides." Again, that's a good observation and it does.

Remember, we have our branching tree diagram and we create branching databases from a similar principle.

We have our main question at the root of the tree at the top, and then we try and spread them evenly as we go further down the branches and as the branches split.

If we explore that further, we can see that there are five attributes in this particular branching database.

We asked the question, "Does it have horns?" because we wanted a group that have horns and a group that do not have horns.

We looked at a further attribute by asking the question, "Does it walk on four legs?" So there will be a group that has four legs and a group that does not have four legs.

On the same level, with another group, we asked, "Does it fly?" And the attribute that we'd be looking for on that is can fly or cannot fly.

Go back to the left hand side.

We asked, "Does it have bones along its back?" So we ended up with a group that has bones along its back and a group that does not.

Lastly, our fifth attribute was found by asking the yes, no question, "Does it have a beak?" And then we were able to identify our dinosaurs on the bottom row.

To get to those on the bottom row, we'd have to ask a maximum of three questions, which is good.

For two of the dinosaurs, we only needed to ask two questions.

Let's check that.

Think, "Does it have horns?" For the group that do have horns, then ask, "Does it walk on four legs?" Then for those that walk on four legs, let's ask our third question.

"Does it have bones along its back?" We can check that again in another way, if we follow from our first question, "Does it have horns?" We find dinosaurs that don't, so then we ask, "Does it fly?" We find the ones that do fly, and then we ask, "Does it have a beak?" At which point we find the two remaining dinosaurs at the bottom of the diagram.

And that's a really good example of a branching database.

We would say that this branching database is well structured.

It's well structured because the first question splits the objects roughly in half, so the objects are evenly spread as you move down the tree structure.
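
If we wanted to check those counts with a short program, a minimal Python sketch might look like this. It writes the dinosaur branching database as nested questions and counts how many questions lead to each dinosaur. The labels "dinosaur A" to "dinosaur F" are placeholders, because the lesson shows pictures rather than names, and the tuple layout is just one assumed way of storing the tree.

```python
# Each question is stored as (question, yes_branch, no_branch).
# A plain string is an object that has reached its own group.
# The dinosaur labels are placeholders (assumption), not names from the lesson.
DINOSAUR_TREE = (
    "Does it have horns?",
    (
        "Does it walk on four legs?",
        ("Does it have bones along its back?", "dinosaur A", "dinosaur B"),
        "dinosaur C",
    ),
    (
        "Does it fly?",
        ("Does it have a beak?", "dinosaur D", "dinosaur E"),
        "dinosaur F",
    ),
)


def questions_needed(tree, depth=0):
    """Return {object: number of questions asked to reach it}."""
    if isinstance(tree, str):
        return {tree: depth}
    _question, yes_branch, no_branch = tree
    counts = questions_needed(yes_branch, depth + 1)
    counts.update(questions_needed(no_branch, depth + 1))
    return counts


counts = questions_needed(DINOSAUR_TREE)
for name, asked in counts.items():
    print(name, "needs", asked, "questions")
print("Most questions needed:", max(counts.values()))
```

Four dinosaurs need three questions and two need only two, which matches what we counted by hand.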

Now having said that and seen a good example, what do you notice about this branching database? Jacob spotted the difference.

He's saying, "The branches are a bit one-sided." Aisha also says, "It doesn't follow the tree structure." It looks like a lopsided tree.

We are only going down one side and we are not creating equal groups.

If we put our tree image underneath that, we can see that there's part of the tree that hasn't spread out.

Let's look a bit more closely at that.

The initial question, "Does it float?" doesn't split the objects into two similarly sized groups.

By asking that as our first yes, no question, and creating a group that has that attribute and a group that doesn't have that attribute, the group that does is only one object, which is our boat.

The other objects that we have all belong to the "no, it does not float" group.

That means we then have to do more work to break that group into individual groups by asking other questions, and that carries on down through the rest of the diagram.

It doesn't really look like we're asking the right questions.

It means that even though it works, the branching database is one-sided and very long.

We therefore say that the branching database doesn't have a good structure.

There are seven attributes now in this branching database.

"Does it float?" "Does it travel on tracks?" "Is it green?" "Does it have a rotor?" "Does it fly?" "Does it have less than four wheels?" "Does it hold more than five people?" At each point checking those attributes, we are creating a separate group of one and we're not splitting them evenly.

If we need to identify a bus or a car on the bottom row, we therefore need to ask seven questions. That would take a long time and become very frustrating if that's what we're looking for, because we would need to answer all of the questions preceding it.
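
To see how much longer the one-sided tree is, here is a rough Python sketch. It compares the seven questions of the one-sided database with the fewest questions an evenly split tree could need for the same number of objects. It assumes eight objects in total (one peeled off by each question, plus the last two on the bottom row) and assumes that questions splitting each group exactly in half could be found, which the lesson does not show for these vehicles.

```python
import math

# The seven questions, in the order the lesson lists them.
QUESTIONS = [
    "Does it float?",
    "Does it travel on tracks?",
    "Is it green?",
    "Does it have a rotor?",
    "Does it fly?",
    "Does it have less than four wheels?",
    "Does it hold more than five people?",
]

# In a one-sided tree every question separates exactly one object,
# and the final question separates the last two (assumed: 8 objects).
object_count = len(QUESTIONS) + 1
worst_case_one_sided = len(QUESTIONS)

# If every question halved its group, the number of questions needed is
# the number of halvings it takes to get down to groups of one.
worst_case_balanced = math.ceil(math.log2(object_count))

print("Objects:", object_count)
print("One-sided tree, most questions:", worst_case_one_sided)
print("Evenly split tree, most questions:", worst_case_balanced)
```

Under those assumptions, an evenly split tree would need at most three questions instead of seven.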

So, let's think about that.

Is this statement true or false? You can pick any question to start your branching database.

You just need to make sure you link it to the attributes of the objects you are trying to identify.

What do you think? I agree.

That statement's false.

Why do we think it's false? Well, I think it's best to choose a question that splits the whole group roughly in half so that the branching database follows the tree structure.

Let's think of another question and another statement this time.

Is this true or false? It is important that branching databases are well structured.

What do you think? Yes, of course it's true.

It's very important that a branching database is well structured.

Why is that though? I think a branching database that is well structured allows the user to identify objects efficiently.

By efficiently we mean that we don't have to ask many questions to get to what we're looking for.

Now, let's remember that word attribute.

An attribute is a word or phrase that can be used to describe an object such as its colour, size, or price.

We've got three pencils there.

We are able to determine the difference between each of those three pencils, if we look at the colour attributes.

We have a red, a green, and a blue pencil.

Remember, attributes can be what we want them to be as long as we can compare the two objects using that same attribute.

In this case, we could look at size.

We would say the elephant is big and the mouse is small.

That size attribute is different.

Lastly, we could look at an attribute such as price, where we could actually give a value to something and work out what it is based on that attribute.

Now, I'm gonna get you to have a go at that.

You are going to look at two different branching databases.

I'm gonna want you to answer the questions about each branching database.

Then, I want you to use your answers to help you decide which branching database has the best structure, and I want you to give reasons why that structure is best.

So, you are going to look and answer the following questions for that branching database that you can see there, for the animals.

I want you to tell me how many attributes have been used in the database and how many questions you need to answer to identify an object in the bottom row.

Well done.

As you work through those questions methodically, you start to understand database structures better.

The first question was, how many attributes have been used in the database? Well, my first question asked about animals and their legs.

So that's one attribute.

We then asked, "Is it a pet?" of which some are and some aren't.

We asked, "Does it live in water?" That's a third attribute.

"Does it meow?" A fourth.

And, "Does it have fins?" Our last and fifth one.

So our answer to how many attributes have been used in the database is five.

Secondly, I asked, how many questions do you need to answer to identify an object in the bottom row? If we start with our first question, "Does it have legs?" We have a group that do have legs and a group that don't.

If we follow the group that do, we then asked a second question about an attribute, "Is it a pet?" Again, some are and some aren't.

If we follow the yes branch, we still have two animals that do have legs and are pets.

We then separated them with a third question, "Does it meow?" That identified each one in its own group.

Therefore, how many questions do you need to answer to identify an object in the bottom row? Three.

We probably need to check that on the right hand side as well because we might see our tree structure is not even.

It should be, but always check that it is.

So, "Does it have legs?" We go to the no group where we ask again, "Does it live in water?" We go to the group that adds another attribute through another question, "Does it have fins?" And again, just to check that answer, it was the same as the left hand side.

So how many questions do you need to answer to identify an object in the bottom row? In both cases, in both routes through, it is still three.

Now I want you to consider another branching database, again with the same questions.

Pause the video again and have a go at that structure.

Well done.

You're really getting the hang of this now.

How many attributes have been used in the database? We just need to count the number of questions.

There are five, because each question asks about a different attribute.

And how many questions do you need to answer to identify an object in the bottom row? Well, if we follow our bottom row path with the dog and the cat, we can count there that, oh, there are four questions.

The right hand side only has two to get to that point.

So our tree isn't quite symmetrical.

So thinking about all of that, which of these databases do you think has the best structure? Is it A or B? So what did you get for that? Jacob has said, "The first branching database is structured better.

Both databases have five attributes," that we determine from five questions, "but in the first database, we will never have to ask more than three questions to identify an object." I'd agree with Jacob.

It's better if the tree is even and we don't have to ask too many questions to get to the data that we want.
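
Jacob's comparison can be sketched in a few lines of Python. This is only an illustration under assumptions: the animals in database A are given placeholder labels, the deeper branches are assumed to hang off the yes answers, and database B's questions are written as "Q1" to "Q5" because the lesson does not read them out. The shape of B is one arrangement that fits what we counted (five questions, four to reach the cat and the dog, two on the right hand side).

```python
# Each question is (question, yes_branch, no_branch); a string is one object.
# Labels such as "animal 1" and "Q1" are placeholders (assumptions).
DATABASE_A = (
    "Does it have legs?",
    ("Is it a pet?", ("Does it meow?", "animal 1", "animal 2"), "animal 3"),
    ("Does it live in water?", ("Does it have fins?", "animal 4", "animal 5"), "animal 6"),
)

DATABASE_B = (
    "Q1",
    ("Q2", ("Q3", ("Q4", "cat", "dog"), "animal 3"), "animal 4"),
    ("Q5", "animal 5", "animal 6"),
)


def count_questions(tree):
    """Count every question in the tree (one per attribute used)."""
    if isinstance(tree, str):
        return 0
    _question, yes_branch, no_branch = tree
    return 1 + count_questions(yes_branch) + count_questions(no_branch)


def most_questions(tree):
    """Most questions anyone would have to answer to reach an object."""
    if isinstance(tree, str):
        return 0
    _question, yes_branch, no_branch = tree
    return 1 + max(most_questions(yes_branch), most_questions(no_branch))


for name, tree in [("A", DATABASE_A), ("B", DATABASE_B)]:
    print(name, "attributes:", count_questions(tree), "most questions:", most_questions(tree))
```

Both trees use five questions, but only the first keeps the longest route down to three.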

Well done.

Let's get straight onto the second part of today's lesson, which is changing the order of a branching tree diagram.

Now, remember, ordering questions is very important.

We have there six objects, six robots.

When we create a branching tree diagram or a branching database, we need to think carefully about the order of the questions.

Have a think about how we could identify these robots using a branching tree diagram or a branching database.

Remember importantly, to create a branching structure, you need to think of possible attributes to base your questions on.

We've got lots of different attributes we could consider.

We could consider colour, number of legs, claws, eyes.

We could probably think of other attributes we could write questions about.

What do you think? Once we have our attributes, we need to write questions.

When we decide on them, we start to write the questions based on what we're looking for.

Again, we have our group of robots.

If I'm considering the attribute has claws, the yes, no question that we'd have to answer would be, "Does it have claws?" We could also ask questions such as, "Does it have three eyes?" "Is it pink?" "Does it have fewer than two eyes?" "Does it have legs?" As we go through, we start sorting into yes or no answers.

So again, considering the question, "Does it have claws?" where we're looking at the attribute has claws.

We have a group that do have claws and we have a group that have no claws.

If we look at our question, "Does it have three eyes?" We have one that does and five that don't.

We're starting to see how our questions and choice of attributes are giving different numbers for each group.

Let's think about, "Is it pink?" We have one robot that is pink and five that aren't.

"Does it have fewer than two eyes?" Again, just one has fewer than two eyes.

The rest have two or three eyes.

How about, "Does it have legs?" We have a group that do and a group that don't.

Notice how this question puts an equal number in each group.

So, having seen all of those questions and seen the groups that they produce, what do you think would be a good question to start with? But I don't just want you to say which one it is.

I want you to think why.

There are our questions again, just for you to think about.

So what do we think? Which question would be a good one to start with and why? Aisha thinks she should start with "Does it have legs?" "This question is gonna split the objects into two similar-sized groups and that will give the branching database a good tree structure with branches on both sides." That seems very sensible.

Let's look at what she means by that.

If we start with, "Does it have legs?" We end up with two equal groups.

It will mean later on we have to ask fewer questions, which is great.
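
Aisha's reasoning can be written as a small Python sketch: for each possible first question, look at how uneven the yes and no groups are, and pick the most even one. The group sizes below are the ones we counted for the six robots. "Does it have claws?" is left out because the lesson doesn't say how many robots are in each of its groups, and the function name `imbalance` is just an illustrative choice.

```python
# (yes group size, no group size) for each question, as counted in the lesson.
SPLITS = {
    "Does it have three eyes?": (1, 5),
    "Is it pink?": (1, 5),
    "Does it have fewer than two eyes?": (1, 5),
    "Does it have legs?": (3, 3),
}


def imbalance(split):
    """How far the two groups are from being the same size (0 is perfect)."""
    yes_count, no_count = split
    return abs(yes_count - no_count)


# Choose the question whose groups are closest to equal.
best_question = min(SPLITS, key=lambda question: imbalance(SPLITS[question]))

for question, split in SPLITS.items():
    print(question, split, "imbalance:", imbalance(split))
print("Best first question:", best_question)
```

The same check could be repeated for each smaller group further down the tree, since a question that splits the whole set evenly may not split a later group evenly.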

We said we wanted a nice even branching tree diagram structure.

So we consider that the ordering of the questions is very important.

For the first question in a branching tree structure, it is really important that we split the objects into two equally sized groups.

If we have our robots there, we've got to consider the importance of placing the other questions in the correct order as we move down the branching tree diagram.

Each time we ask a new question, it's going to create different groups.

So don't always assume that what you got at the start will be the same number that you get further on down the diagram.

Now let's just check that word order.

What does the word order mean? Is it the way objects or people are arranged into a sequence? Is it a digital tool for identifying objects using yes or no questions? Or is it the way related parts are organised? Well done.

It's the way that objects or people are arranged in a sequence.

Order is very important in computing.

Now let's put that into practice with a diagram.

Can any of you spot the error in this branching tree diagram? Now it helps if we actually use the objects we're applying this to.

We don't have anything in the remaining groups.

Let's consider one at a time.

The first robot, does it have legs? Well, yes it does.

And so it moves on to the next question where we check the attribute, "Does it have three eyes?" In this case, we see that it does.

So it goes on to the third question about its attribute, "Does it have claws?" In this case, the robot doesn't have claws, so it ends up in a group on its own.

So far so good.

We take a second robot.

This time, does it have legs? No, it doesn't.

So then we ask our next question, "Does it have fewer than two eyes?" No, it doesn't.

So then we check, "Is it pink?" In which case it is, so again, that ends up in its own group.

This seems to be going quite well so far.

We have a third robot.

"Does it have legs?" Yes it does.

"Does it have three eyes?" No, it doesn't.

Fourth robot.

"Does it have legs?" No, it doesn't.

"Does it have fewer than two eyes?" Well, one eye is fewer than two eyes, so it ends up in its own group.

Another robot.

"Does it have legs?" No, it doesn't.

"Does it have fewer than two eyes?" No, it doesn't.

It has exactly two eyes.

"Is it pink?" No, it's not.

We get to our last robot.

We ask, "Does it have legs?" Which is yes.

"Does it have three eyes?" Well, no it doesn't.

It has two, and oh dear.

We've not thought about that correctly.

We've ended up with two robots in the same group with no follow-up question that's going to split it.

The order of our questions is not quite right.

There are two objects in one group and we don't want to have that.

Really, we can see, we don't need to ask if the robot has claws if we know it has three eyes.

It's not a useful follow-up question.

So our problem is around this point here.

A nice way to fix that is to swap those two.

Think through what will happen now these two questions are swapped.

So let's swap them around and let's take that one robot back to, "Does it have claws?" this time.

Before, we were asking, "Does it have three eyes?" first. This time we're waiting to ask that until we find the ones that have claws.

That robot will go down to the next group.

"Does it have three eyes?" That will end up as a no.

Much better.

Changing the order of our questions means that all objects can be identified in the way that we need them to be.

So let's check you've understood that.

Again, I've got another true or false statement.

I want you to decide which it is.

Only the first question matters when you are thinking about ordering the questions in your branching database.

Do you think that's true or false? Good.

It's false.

But why? I agree.

The first question is really important, but you also need to think carefully about the order of the other questions so that the branching database works properly and all the objects can be identified.

You are gonna try and put some of that into practice now.

Look at this branching tree diagram.

Can you reorder the questions so that all the objects can be identified and then explain what you did? Pause the video and have a go at that now.

Well done.

You probably had to consider the questions in order and think where the groups had to be moved.

Reordering the branching tree diagram makes more sense in the way that we've shown it here.

Aisha said that she swapped the two questions on the left and the two questions on the right.

This made the branching database work for her.

So, "Are they bald?" "Do they have black hair?" They are swapped round.

Similarly, "Do they have blonde hair?" "Do they have a hairband?" They were also swapped round.

Let's try that again with another diagram.

Now let's just check that with some examples of people.

Let's start with, "Do they have facial hair?" And this first person does have facial hair, so they move to the second question where we ask, "Do they have black hair?" They do.

So then we ask, "Are they bald?" That person is, so they end up in the yes group.

Our second person there, "Do they have facial hair?" Yes they do.

"Do they have black hair?" Yes they do.

"Are they bald?" No, they're not.

They end up in their own group as well.

Our third person there.

"Do they have facial hair?" Yes.

"Black hair?" No.

They end up in their own group as well.

Another person here.

"Do they have facial hair?" No.

"Do they have a hairband?" No.

"Do they have blonde hair?" No.

Again, they end up in their own group.

This person, "Do they have facial hair?" No.

"Do they have a hairband?" No.

"Do they have blonde hair?" Finally, we have our final person who doesn't have facial hair but does have a hairband.

They end up in their own group.

That branching tree diagram works perfectly.
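
The check we just did by hand can be sketched in Python. Each person is sent down the reordered tree, and then we look for any group that ends up holding more than one person. The people are labelled "person 1" to "person 6" in the order we checked them, and only the answers we actually gave are recorded. The "yes" for person 5's blonde hair is inferred from the fact that they end up in a different group from person 4, and the group names are placeholders.

```python
from collections import defaultdict

# (attribute asked about, yes_branch, no_branch); a string names a final group.
TREE = (
    "facial hair",
    ("black hair", ("bald", "group 1", "group 2"), "group 3"),
    ("hairband", "group 4", ("blonde hair", "group 5", "group 6")),
)

# Answers as given in the walk-through; unasked attributes are left out.
PEOPLE = {
    "person 1": {"facial hair": True, "black hair": True, "bald": True},
    "person 2": {"facial hair": True, "black hair": True, "bald": False},
    "person 3": {"facial hair": True, "black hair": False},
    "person 4": {"facial hair": False, "hairband": False, "blonde hair": False},
    "person 5": {"facial hair": False, "hairband": False, "blonde hair": True},  # inferred
    "person 6": {"facial hair": False, "hairband": True},
}


def find_group(tree, answers):
    """Follow the yes and no branches until a final group is reached."""
    while not isinstance(tree, str):
        attribute, yes_branch, no_branch = tree
        tree = yes_branch if answers[attribute] else no_branch
    return tree


groups = defaultdict(list)
for name, answers in PEOPLE.items():
    groups[find_group(TREE, answers)].append(name)

# Any group with more than one person means the questions need reordering.
shared = {group: members for group, members in groups.items() if len(members) > 1}
print(shared or "Every person is in their own group")
```

If a group did come back with two people in it, that would point to the part of the tree where the questions are in the wrong order.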

Well done.

You did really well today.

Let's just think back to what we saw.

We know it's helpful to have well-structured data that can be split equally as we move through a branching database.

We also compared two database structures to see if they gave the same results.

Lastly, we saw that the order in which questions are asked needs to be considered carefully, because the order of questions can alter the structure of a branching database and result in an uneven structure that we don't want.
