Lesson video

In progress...

Loading...

Hello! My name is Mrs. Holborow, and welcome to Computing.

I'm so pleased you've decided to join me for the lesson today.

In today's lesson, we're going to be comparing CSV and text files, and then we'll learn how to read data from a CSV file into a Python program.

Welcome to today's lesson from the unit Programming: dictionaries and data files.

This lesson is called Reading CSV files, and by the end of today's lesson, you'll be able to read and structure data in a CSV file.

Shall we make a start? We will be exploring these keywords in today's lesson.

CSV.

CSV.

Comma-separated values are a plain text data file where each value is separated by a single comma.

Tabular.

Tabula.

Display data in a grid of rows and columns.

strip().

strip().

A Python method which removes any leading or trailing white spaces or characters that you specify.

split().

split().

A Python method that is used to break a string into a list based on a specified delimiter.

Watch out for these keywords throughout today's lesson.

Today's lesson is split into three sections.

We'll start by comparing CSV and text files.

We'll then read data from a CSV file.

We'll then finish by reading a CSV file to a 2D list.

Let's make a start and compare CSV and text files.

CSV stands for comma-separated values.

CSV files can be read by a range of different spreadsheet software packages, as well as Python files.

So, here's an example of a CSV file called players.

In the file, we have Player name, Player score.

On the next line, we have Aisha, 0.

And then on the next line, we have Jun, 2.

CSV files contain data structured so that a comma separates individual items in the file and each record is on a new line of the file.

So, Player name here is an individual item.

The comma separates the individual items. And each line is a record.

So, Jun, 2 is a record.

A CSV file can be created in a text editor by changing the extension of the file to.

csv.

You can also create CSV files in spreadsheet software by exporting them as a.

csv file.

When you open a CSV file In spreadsheet software, the commas are used as a guide to display the data neatly in rows and columns.

When a CSV file is opened in spreadsheet software, it is displayed in a tabular format.

So, here, we have the CSV file, and on the right-hand side we have how this would be displayed in spreadsheet software.

A CSV file is still a text file.

Each row is stored as a string.

Modern spreadsheet software will typically recognize a number and store it as that data type.

Python will not do this automatically, though.

Any data is always read as a string and will need to be casted before it is used as any other data type.

Time to check your understanding.

What does CSV stand for? a, computer system variables, b, comma-separated variables, or c, comma-separated values? Pause the video whilst you have a think.

Did you select c? Well done.

CSV stands for comma-separated values.

Each row in a CSV file represents, a, an attribute, b, a record, or c, an item? Pause the video whilst you have a think.

That's right, b, a record.

Each row in a CSV file represents an individual record.

This table compares some of the features of.

csv and.

txt files.

Let's have a look at each feature.

Format.

In a.

txt file, the format is plain text, which can be structured or unstructured.

In a.

csv file, structured text is in a tabular form.

Feature, delimiter.

In a.

txt file, you may or may not use delimiters.

In.

csv files, commas are used as delimiters.

Data structure.

A.

txt file can contain any type of text, often unformatted.

A.

csv file is organized in rows and columns.

Let's have a look at some common use cases.

.

txt files are commonly used for notes, logs, configuration files.

.

csv files are commonly used for storing tabular data like spreadsheets or databases.

Okay, we've come to our first task of today's lesson, Task A.

For part 1, explain the difference between a text file, a.

txt file, and a.

csv file.

For part 2, give one example of a type of data that could be stored in a CSV file.

Pause the video whilst you answer the questions.

How did you get on? Did you manage to answer the questions? Well done.

Let's look at some sample answers together.

For part 1, you were asked to explain the difference between a.

txt file and a.

csv file.

A.

csv file is structured in tabular form and uses commas as delimiters.

Data in a.

csv file is organized into rows and columns.

A.

txt file can be unstructured and may not include delimiters.

For part 2, you were asked to give one example of a type of data that could be stored in a CSV file.

A CSV file could be used to store usernames and passwords.

Did you think of any other examples? So, we've compared CSV and text files.

Now, let's move on to read data from a CSV file.

A CSV file is just another text file, so you can use all of the same methods that you already know with a CSV file.

The only difference here you'll see is the file name has the extension.

csv, rather than.

txt.

Using the read() method, we'll read all of the data from the file in the same way as it did with your text file, so here, we have the line data = file.

read().

That will read the entire contents of the CSV file.

So, in this example, you can see our CSV file contains Player name, Player score, and then it has Aisha and Jun and their scores.

The modes are also the same, so you can use r for read, w for write, and a for append.

The difference with a CSV file is that you might want to perform different operations with it in comparison to a standard text file.

This means that you need to think carefully about how your data might be structured as it is read into your Python program.

Data represented in a CSV file is often in tabular form.

This means that it's stored in rows and columns.

Lucas says, "What data structure could I use for data that is stored in rows and columns?" Can you think about what data structure might be appropriate? Maybe pause the video here and have a think.

Ah, a 2D list or a list of records.

Did you have that as your answer? Well done.

Reading data from a CSV file and holding it in a list is very similar to the method used for standard text files.

Here is a CSV file with one column of data.

So, we have playernames.

csv, and we have three names, Aisha, Jun, and Laura.

When you assign the contents of the file to a variable, it's held like this.

So, here, we have data, and it holds Aisha, Jun, and Laura, but notice the new line characters after Aisha and Jun.

We don't have a new line character after Laura 'cause it's the last line in the file.

Just as with a standard text file, the new line character is included.

To assign this data to a list data structure, you need to append each line to a list.

So, we can do this using a for loop.

So, for line in file, data.

append(line).

The list, which we've called data now looks like this.

So, you can see we have each item separated by a comma, but notice we still have the new line characters here.

Each item is now in the list, but you still have an extra piece of data to deal with, the new line character.

What method could you use to remove unwanted pieces of data? Is it a, remove(), b, pop(), c strip(), or d, delete()? Pause the video here whilst you have a think.

That's right, the strip method can be used to remove unwanted pieces of data.

For each line in the file, you need to strip the new line character.

We can do this by adding the line line = line.

strip().

The strip() method defaults to removing any white space from the start and end of the string, unless you specify the characters or arguments.

You now have a list with all of the extra pieces of data removed, so we now have open square brackets for the list and then we have our three names, Aisha, Jun, and Laura.

You can now interrogate this data as you would with any other list.

Okay, we're moving on to our second task of today's lesson, and you're doing a fantastic job so far, so well done.

I'd like you to open the starter program at oak.

link/game-scores.

Note, this program already contains a CSV file called scores.

csv.

Create a program that, a, reads all the data from scores.

csv, b, appends the data from the file to a list, c, displays the highest score, and d, displays the lowest score.

Pause the video here whilst you complete the activity.

How did you get on? Did you manage to create your program? Great work.

Let's have a look at some sample code together.

So, on line 1, we're opening the file in read mode, so scores.

csv.

On line 2, we're initiating a list called data, but we're keeping it empty at this point.

On line 4, we start our for loop, so for line in file line, line = line.

strip() to remove the new line character.

And then on line 6, we're appending the data to the list, so data.

append(line).

We're then closing the file on line 8.

And on line 10, we have datalength = len(data).

Remember, the len() function returns the length of a list.

Program's then continued.

So, on line 12, we have another for loop, so for score in range (datalength), data[score] is equal to int(data[score]).

And then we're using the sort() method to sort the list, so data.

sort().

That will put all of the scores in order from lowest to highest.

We can then work out which is the highest score and which is the lowest score.

So, highest = data[datalength-1], lowest = data position 0.

On line 20 and 21, we're then printing out these results to the user.

Did you have something similar or did you code it in a slightly different way? Remember, if you need to pause the video here and make any corrections to your program to get it to work, you can do that now.

So, we've read data from a CSV file, but now, we're going to read a CSV file into a 2D list.

CSV files present data in tabular form.

In the last example, you used a CSV file with only one column of data.

You are now going to work with a CSV file that has two columns of data, so this time, we have the player name and then their score.

The players.

csv file has data stored in two columns and three rows.

If you use the same code as the previous activity with this CSV file, then the data is held in a single list, so you can see we have 'Player name, Player score', 'Aisha 0', 'Jun, 2' all in the same list.

Each row in the CSV file is held as an item in the list.

This means that each item has two pieces of data held within it, and that isn't very practical.

Note that the commas from the CSV file are part of the string here and they're not separating the items in the list.

So, actually, Player name and Player score are one item in the list.

For each piece of data to be held as an item, the data needs to be split at the comma of each line.

The split() method takes a string and splits it into a list of items. Which method takes a string and separates it into a list of items? Is it a, split(), b, strip(), or c, slice()? Pause the video whilst you have a think.

Did you select a, split()? Well done.

The words variable holds a string of words, so here, we've got our string, "here are lots of words".

The split() method splits the string where it finds a space and returns it into a list, so we now have 'here', 'are', 'lots,' 'of', 'words' as single items in the list.

The character that is passed as an argument into the method will be used to split the string.

So, in this case, we use the space character.

Changing the space to an o will result in a different list.

Here, we have 'here l', 'ts', 'f w', and then 'rds'.

Note that the letter o is also removed from the string.

How might the split() method help with the CSV file? Pause the video here whilst you have a think.

That's right, the split() method can be used to split each line when it finds a comma in the CSV file.

By splitting each line into a list, you now have a 2D list, so we have data, and notice we've got double square brackets now, 'Player name, Player score', close the first square bracket, and then we have the next list, which is 'Aisha', '0'.

And then on the next line, the next list, which is 'Jun' with '2'.

And then we have two square brackets because we're closing both lists there.

This will now allow you to perform operations on the data.

Okay, we've come to our last activity of today's lesson, and you've done a fantastic job so far, so well done.

I'd like you to open the starter program at oak.

link/display-scores.

Note, this program already contains a CSV file called players.

csv.

For part 2, create a program that performs the following, a, reads all the data from the players.

csv file, b, asks the user to enter a player name, c, displays the score for that player.

Hint, the in operator may help you with this program.

Pause the video whilst you have a go.

How did you get on? Did you manage to create your program? Well done.

Let's have a look at the sample answer.

I've split this over two slides so that we can see it a bit more clearly.

So, we start by opening the file players.

csv in read mode and we initiate an list called data, but at this point, we're going to have it as an empty list.

We then go through each line in the file with a for loop, so for line in file, line = line.

strip() to remove the new space character.

And then we're using line = line.

split() to split it with the commas.

And then we're appending that to line.

And then on line 9, we're closing the file.

Okay, so here's the program continued.

On line 11, we have a print statement which asks the user to enter the player's name, and we hold that in a variable called player on line 12.

On line 14, we're setting an empty string variable called location.

On line 16, we start our second for loop, so for item in data, if player in item, location is going to be equal to data.

index(item).

On line 20, we have score is equal to data, open square brackets, location, close square brackets, square brackets, 1.

And then finally, we print out the message to the user.

So, "The last score for {player} was," and then their score.

Did you manage to create your program successfully? If you need to, you can always pause your video here and make any corrections.

We've come to the end of today's lesson, Reading CSV files.

You've done a fantastic job to get this far, so well done.

Let's summarize what we've learned together.

Comma-separated values, CSV text files, group data together on individual lines and separate values in these groups by commas.

Each item in a CSV file is separated by a comma.

Each record is separated over a new row in the file.

The strip() method defaults to removing any white space from the start and end of a string, unless you specify the characters as arguments.

The split() method is used to break a string into a list based on a specified delimiter.

I hope you've enjoyed today's lesson and I hope you'll join me again soon.

Bye!.