Hello, my name is Ms. Powell, and welcome to Computing.
I'm so pleased that you decided to join me here today.
In today's lesson, we're going to be learning about Big Data and Data Protection.
Let's get started.
Welcome to today's lesson from the unit Online Safety.
Today's lesson is called Big Data and Data Protection.
And in today's lesson we'll explain what the term "big data" means and how data can be valuable both to legitimate organisations and to people with malicious purposes.
We'll be using these keywords throughout today's lesson.
Let's take a look at them.
The first word is big data.
Big data: this means data that is too large and complex to be dealt with by traditional processing methods and systems. The next word is ethical.
Ethical: this means morally right.
And the last word is analytics.
Analytics: this means the analysis of data or statistics.
Lesson outline: Big data and data protection.
This lesson is split into three sections.
In the first section, we'll define big data.
In the second section, we'll use data to identify patterns in behaviour, and in the third section we'll evaluate what data online is valuable and to whom.
Let's get started with our first section: Define big data.
Can you think of any examples of data? Pause the video and have a quick think.
Online data refers to information stored and transmitted through the internet, encompassing everything from personal profiles on social media to the data generated by website activity.
It is a broad term that includes both user generated content and data collected by websites such as behaviour, location, and device information.
Examples could be: purchases, location, mouse movements, search history, content you've streamed or downloaded, which adverts you've watched or clicked, demographic information, e.g. age, gender, or job, and computer characteristics, e.g. browser, battery life, and screen size.
Conversations and contact details.
Did you know companies share data about mouse movements? This can be used to track employee activity or to improve interface design.
Big data is defined as large data sets that are analysed in order to identify patterns and trends often used by organisations to better understand their customers.
Big data is characterised by the three Vs: Volume, the amount of data in the dataset, Variety, the different types of data in the dataset and Velocity, the speed at which the dataset is produced and how long it remains accurate.
I'd like you to define big data.
Large blank that are blank in order to blank patterns and blank, often used by organisations to better understand their customers.
Take a look at the words in purple: trends, data sets, identify and analysed.
Try and use them to fill in the blanks.
Pause the video to have a try.
Let's look at the answers.
Define big data.
Large data sets that are analysed in order to identify patterns and trends, often used by organisations to better understand their customers.
True or false, big data is characterised by the three Vs: Volume, variety, and velocity? Pause the video and have a quick think.
Is that true or false? The answer is true.
Why is that? Volume is the amount of data in the dataset.
Variety is the different types of data in the dataset and velocity is the speed at which the dataset is produced and how long it remains accurate.
Big data can be used for legitimate, ethical reasons such as: education: to personalise learning, health: to develop new treatments or track diseases.
However, sometimes people misuse the data and use it for unethical practices.
Have a look at some examples on the following slides.
Privacy violation: Big data sets often contain lots of personal information about their data subjects.
Discrimination: Big data is used to make the adverts you see more relevant to you.
Inaccuracies: Big data is used to streamline services and improve the decisions made by companies.
However, this only works if the data in the dataset is accurate.
Data theft: Big data sets can be valuable and contain valuable personal information.
They can therefore be the targets of theft.
Data subjects rely on the organisation managing the dataset to protect their information, but thefts do occur.
True or false: Big data can be used for ethical or unethical practices? Pause the video and have a think.
Is that true or false? The answer's true.
Why is that? Big data can offer powerful insights enabling people to make informed decisions based on the data.
However, some people choose to misuse the data and carry out practices which aren't ethical and are sometimes illegal.
I have a task here for you.
I'd like you to give it a try.
Write a definition in your own words to explain the term big data.
You could use these words in your answer: big data, data types, trends, data sets, volume, variety, velocity and analyse.
The three Vs of big data, volume, variety, and velocity.
Describe the key characteristics that define big data.
Volume refers to the massive size of the data.
Variety refers to the different types of the data, and velocity refers to the speed at which data is generated and processed.
Big data is characterised by its enormous size.
The sheer amount of data available for collection and analysis is a defining feature of big data.
Big data comes in many forms including structured data like database records, semi-structured like XML, and unstructured data like text documents, images and videos.
The speed at which data is generated and processed is also a key aspect of big data.
Modern data is often produced in real time or near real time, requiring fast processing and analysis.
Pause the video to finish the task.
I'd like to give you some feedback.
Define big data.
This is Sam's answer.
"Large data sets that can be analysed to track trends or patterns.
They can be used for lots of different reasons.
Companies can use big data to help them target specific products to their customers.
Big data is characterised by the three Vs: volume, variety, and velocity." Well done, that brings us to the end of the first section, Define big data.
Let's move on to the second section: Use data to identify patterns in behaviour.
Why do you think companies or organisations want to collect data from you? Pause the video and have a quick think.
Data is used to: one: improve or streamline services, two: target advertisements towards customers, three: optimise project management, four: identify criminal activity, five: predict voter behaviour.
Improve and streamline services.
Companies can analyse customer reviews and behaviour to determine what parts of their service are liked and what needs to change.
They can also work out when customers are most likely to use their service to ensure they can meet demand.
Target advertisements towards customers.
Companies can analyse the buying behaviour of their customers and advertise similar products to improve their adverts' effectiveness.
Optimise project management.
When planning a project, companies need to determine how long it will take and how much it will cost.
They can use big data to reduce both of these by working out the quickest and cheapest way of carrying out their project.
Identify criminal activity.
Big data sets that show customer behaviour can highlight unusual patterns such as buying unexpected items or using a card in another country.
This could be used to help identify fraudulent activity.
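The idea of spotting unusual patterns can be sketched in a few lines of Python. This is only a minimal illustration, not how any real bank works: the transactions are made up, and the rule (flag anything abroad, or anything more than three times the customer's average spend) is an assumption chosen to keep the example simple.

```python
from statistics import mean

def flag_unusual(transactions, home_country, factor=3.0):
    """Return the transactions that look unusual for this customer.

    A transaction is flagged when it happens outside the customer's
    home country, or when its amount is more than `factor` times the
    customer's average spend. Both rules are illustrative assumptions.
    """
    average = mean(t["amount"] for t in transactions)
    flagged = []
    for t in transactions:
        abroad = t["country"] != home_country
        large = t["amount"] > factor * average
        if abroad or large:
            flagged.append(t)
    return flagged

# Made-up card transactions for one customer (hypothetical data).
transactions = [
    {"id": 1, "amount": 12.50, "country": "GB"},
    {"id": 2, "amount": 8.00, "country": "GB"},
    {"id": 3, "amount": 15.00, "country": "GB"},
    {"id": 4, "amount": 300.00, "country": "GB"},
    {"id": 5, "amount": 10.00, "country": "FR"},
]

flagged_ids = [t["id"] for t in flag_unusual(transactions, "GB")]
print(flagged_ids)  # [4, 5]
```

Transaction 4 is flagged because it is much larger than usual, and transaction 5 because the card was used in another country. A flagged transaction is not proof of fraud; it is simply something worth checking.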
Predict behaviour.
Big data sets can also be used to predict what people will do next.
Datasets can be created that show how users have behaved in the past or how the people around them have behaved and can then be used to guess what will happen next.
Data analytics.
Big data sets are analysed using data analytics.
These are the techniques used to combine different types of data and extract meaning from them.
This might involve searching for common patterns in a data set or trying to characterise the behaviour of certain types of data subject.
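Searching for a common pattern can be as simple as grouping data subjects and comparing averages. The sketch below uses a small, invented set of streaming users; the field names `account` and `hours_per_day` are assumptions made for the example.

```python
from collections import defaultdict

def average_hours_by_account(users):
    """Work out the average daily viewing hours for each account type."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user in users:
        totals[user["account"]] += user["hours_per_day"]
        counts[user["account"]] += 1
    return {account: totals[account] / counts[account] for account in totals}

# Invented streaming data (hypothetical users).
users = [
    {"user": "A", "account": "premium", "hours_per_day": 4.0},
    {"user": "B", "account": "premium", "hours_per_day": 3.0},
    {"user": "C", "account": "basic", "hours_per_day": 1.5},
    {"user": "D", "account": "free", "hours_per_day": 0.5},
    {"user": "E", "account": "basic", "hours_per_day": 2.5},
]

averages = average_hours_by_account(users)
print(averages)  # {'premium': 3.5, 'basic': 2.0, 'free': 0.5}
```

In this made-up data, premium users watch the most on average. That is the kind of pattern an analyst might report back to a company.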
Data analytics.
Data analytics is a process of examining data to extract meaningful insights that can be used to make informed decisions and improve performance.
It involves collecting, cleaning, and analysing data using various tools and techniques, ultimately leading to actionable recommendations.
Data analytics might involve: Number one: machine learning.
Number two: text analysis.
Number three: predictive analysis.
Number four: optimization problems. Number five: cleaning and combining data.
Big data and machine learning are closely related, with machine learning algorithms leveraging the vast amounts of data provided by big data to learn patterns and make predictions.
Text analysis, also known as text mining, is the process of extracting meaningful insights and knowledge from large volumes of unstructured text data.
In the context of big data, text analysis plays a crucial role in identifying patterns, sentiments, and key characteristics within massive data sets, enabling businesses and researchers to make data-driven decisions.
Predictive analytics leverages big data to anticipate future outcomes by identifying patterns in historical data.
Data optimization involves tackling complex problems using large data sets to find optimal solutions.
Data cleaning and combination are crucial steps in preparing big data for analysis, ensuring accuracy, consistency, and usability of the data.
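As a rough sketch of what cleaning and combining might look like, the Python below tidies two small, invented datasets and joins them on a shared user ID. The field names (`user_id`, `favourite_genre`, `total_spent`) and the cleaning rules are assumptions for illustration only.

```python
def clean(records):
    """Drop records with no user ID, tidy the text, and remove duplicates."""
    cleaned = {}
    for record in records:
        user_id = (record.get("user_id") or "").strip()
        if not user_id:
            continue  # cannot be matched to anyone, so discard it
        # Trim spaces and lower-case every text field so values are consistent.
        tidy = {key: value.strip().lower() if isinstance(value, str) else value
                for key, value in record.items()}
        tidy["user_id"] = user_id
        cleaned[user_id] = tidy  # a later duplicate overwrites an earlier one
    return cleaned

def combine(first, second):
    """Merge two cleaned datasets, keeping only users found in both."""
    return {uid: {**first[uid], **second[uid]} for uid in first if uid in second}

# Invented data from two hypothetical sources.
streaming = [
    {"user_id": " 001 ", "favourite_genre": "Romance "},
    {"user_id": "002", "favourite_genre": "comedy"},
    {"user_id": "", "favourite_genre": "drama"},       # no ID: dropped
    {"user_id": "001", "favourite_genre": "romance"},  # duplicate of the first
]
shopping = [
    {"user_id": "001", "total_spent": 42.0},
    {"user_id": "003", "total_spent": 5.0},
]

combined = combine(clean(streaming), clean(shopping))
print(combined)
# {'001': {'user_id': '001', 'favourite_genre': 'romance', 'total_spent': 42.0}}
```

Only user 001 appears in both datasets, so only that user ends up in the combined result. Combining data like this is what lets an organisation link someone's viewing habits to their shopping habits.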
I have some text here for you and I'd like you to fill in the blanks.
Data analytics.
Big data sets are blank using data analytics.
These are the techniques used to combine different types of data and blank meaning from them.
This might involve blank for common blank in a data set or trying to characterise the behaviour of certain types of data subject.
Let's take a look at the answers.
Big data sets are analysed using data analytics.
These are the techniques used to combine different types of data and extract meaning from them.
This might involve searching for common patterns in a dataset or trying to characterise the behaviour of certain types of data subjects.
True or false? Your online reputation can be valuable to organisations.
Pause the video and have a quick think, is that true or false? The answer's true.
Why is that? Companies can use your data to encourage you to use or buy their products.
They can streamline services, target advertisements, and even change the content that is shown to you online.
I have a task here for you.
I'd like you to give it a try.
Look at the three data sets provided and answer these questions.
Number one: You are StreamNow.
What patterns do you notice in your user behaviour? Number two: You are BuyOnline.
Which type of users are spending the most money on your site? Number three: you are BestAnalytics.
What insights can you offer to StreamNow to help them improve their services? Number four: You are BestAnalytics.
What insights can you offer to BuyOnline to help them increase their number of premium customers? Pause the video to finish the task.
I'd like to give you some feedback.
You are StreamNow.
What patterns do you notice in your user behaviour? Jacob says, "I noticed that the premium users tend to use the service more than other account holders.
I also saw that people who use the service for several hours a day tend to watch romances." You are BuyOnline.
Which type of users are spending the most money on your site? Jacob says, "I noticed that premium users tend to spend more time browsing before buying something, make more purchases and spend more money on the site.
Basic users are more likely to enter the site via an advert." You are BestAnalytics.
What insights can you offer to StreamNow to help them improve their services? "I noticed that user 00017 doesn't like the current content and may need more comedy films and TV to encourage them to stay a member.
Romance content appears to be popular, so if it were only available to basic or premium members, this might encourage users to pay a subscription fee.
User 00014 could be targeted with an advert explaining that an account on StreamNow makes a great birthday gift." You are BestAnalytics.
What insights can you offer to BuyOnline to help them increase their number of premium customers? Jacob says, "I noticed if user 10104 was advertised some bowling shoes, they might buy them.
However, they haven't spent much money on the site and so the shoes advertised should be cheap.
Users 00011 and 10105 could be encouraged to create accounts, especially as user 10105 has made several purchases and 00011 has a premium account on StreamNow, and therefore may be interested in the benefits of becoming a premium member.
I also noticed that users 00011 and 00016 like romance films and user 00017 likes comedy.
They could be targeted with adverts that appealed to other customers with similar tastes.
User 00014 could be targeted with adverts for great gifts for mums." Well done.
That brings us to the end of the second section: Use data to identify patterns in behaviour.
Let's look at the third section: Evaluate what data online is valuable and to whom.
What data is valuable? Pause the video and have a quick think.
Names can be valuable.
While not always unique, names, especially when combined with other information like a location or other identifiers, can be used to pinpoint an individual.
This makes them valuable in various contexts, from privacy concerns to data-driven news.
Files are valuable data because they store, organise and allow easy access to information.
A computer file can store a wide variety of information including text, numbers, images, videos, audio, and even programmes.
This could range from a personal video to a confidential report.
Bank details naturally give access to financial information, which could potentially be misused, making them a significant target for criminals.
Passwords are valuable data because they act as the gatekeeper to sensitive information and systems. Account names are valuable data because they can be used to identify and potentially compromise an individual's online presence.
Which of these types of data are valuable to other people and why? Pause the video and have a quick think.
Data theft.
Sometimes when people see value in your data, they might want to use it to benefit themselves in some way.
How can your data be stolen? How many different ways of stealing data can you think of? Pause the video and have a quick think.
Malware.
Malware is any form of software that is designed to disrupt, damage or gain unauthorised access to a computer.
There are lots of different types of malware, including worms and viruses.
Malware can gain access to a computer in lots of different ways, such as through an infected file, by hiding in an email, or from an unsafe website.
Ransomware.
Ransomware is a particular form of malware that encrypts valuable data belonging to the victim and holds it to ransom.
To get their data back, the victim must pay a fee, often in cryptocurrency.
Hack.
A hack is when the attacker accesses the victim's data without their permission.
This could be done by guessing the victim's password or by exploiting a vulnerability in the system.
Guessing passwords.
Passwords can be guessed if they're too simple or if the attacker has the ability to try lots of different passwords until they find a correct one, in what is known as a brute force attack.
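A little arithmetic shows why simple passwords are easy to guess. The sketch below counts how many different passwords are possible for a given length and set of characters. The guessing speed used in the example is an assumption picked purely for illustration.

```python
def possible_passwords(alphabet_size, length):
    """Number of different passwords of exactly this length."""
    return alphabet_size ** length

def worst_case_seconds(alphabet_size, length, guesses_per_second):
    """Time needed to try every possible password at a given guessing speed."""
    return possible_passwords(alphabet_size, length) / guesses_per_second

four_digit_pin = possible_passwords(10, 4)   # digits 0-9 only
eight_lowercase = possible_passwords(26, 8)  # letters a-z only

print(four_digit_pin)                   # 10000
print(eight_lowercase)                  # 208827064576
print(worst_case_seconds(10, 4, 1000))  # 10.0
```

At an assumed 1,000 guesses per second, every four-digit PIN could be tried in ten seconds. Each extra character, and each extra type of character, multiplies the number of possibilities, which is why longer and more varied passwords are harder to brute force.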
Guessing passwords.
An attacker may even use malware to guess your password.
Some forms of malware allow an attacker to infect a computer and record the order in which keys on a keyboard are pressed, enabling the attacker to record the victim's password.
Phishing scams. Another method for acquiring a victim's password or other sensitive information is a phishing scam.
This involves sending a victim an email or other message disguised to look as if it comes from a trustworthy source, thus tricking the victim into giving up valuable data.
True or false: In a ransomware attack, the victim can get their data back? Pause the video and have a quick think.
Is that true or false? The answer is true.
Why is that? The victim pays a fee, often in cryptocurrency, to get their data back.
However, in many cases, even if the victim pays a fee, they still may not get the data back.
Which of these could be methods of data theft? Is it A: malware, B: fishing, C: ransomware, or D: hacking? Pause the video and have a quick think.
The answer is A, C and D, malware, ransomware, and hacking.
The answer B: fishing is incorrectly spelled.
It would be P-H-I-S-H-I-N-G.
I have a task here for you and I'd like you to give it a try.
A famous footballer has lots of personal information about them online, which comes from: social media accounts and posts, news stories and Wikipedia, football club websites.
Number two: What information online about the football player could be valuable? Number three: How could this information be used maliciously? Number four: What steps could the footballer take to reduce some of the risks around their data? Pause the video to finish the task.
I'd like to give you some feedback on that task.
This is Jacob and Jacob said, "The footballer is famous so will have lots of data about them online.
This data could include their name, account names, names of family members, places they visit often, and their hobbies and interests.
This data is valuable because it could be used legitimately by companies to improve advertising and sales, but it could also be used maliciously to commit crimes or fraudulent activities." Jacob says, "Someone could try to steal personal information from the footballer.
They could create a phishing email asking the footballer to complete a form that looks like it's from a football authority.
This might make them think it's all official.
They could make a fake profile online, contact the footballer, and use the data to try and convince the footballer that they know them well.
They could find out lots more about them.
Being in the public eye, the footballer should be wary online, have strong passwords, not click on links or attachments in emails, or share their personal information.
They should also install virus checkers on their devices to scan emails for malicious content.
If their data has been stolen, they can report this to the National Cyber Security Centre." Let's summarise Big data and data protection.
Our online reputations are valuable to organisations who collect data to streamline services, target advertisements, and even change the content that is shown to us online.
A key tool for harnessing the value of this information is big data.
Big data is the term used to describe large data sets that are analysed in order to identify patterns and trends.
Some data can be used for legitimate reasons and legal purposes.
Sometimes data can be stolen and used for malicious purposes.