Lesson video


Hello, my name is Mrs. Holborow, and welcome to Computing.

I'm so pleased you've decided to join me for the lesson today.

In today's lesson, you'll be exploring large language models (LLMs): what their purpose is, and how they can sometimes produce outputs that are biased and not trustworthy.

Welcome to today's lesson from the unit "Using data science and AI tools effectively and safely".

This lesson is called Chatbot Applications and other LLMs. And by the end of today's lesson, you'll be able to describe the purpose of LLMs and explain why the output may not be trustworthy.

Shall we make a start? We will be exploring these keywords throughout today's lesson.

Let's have a look at them now.

Language model: an AI system used to produce or complete written text based on patterns identified in training data.

Prediction: an estimate of what might happen next based on patterns found in training data.

Bias: when something is unfair towards or against something or someone.

Trust: confidence that something will work as expected and produce reliable and fair results.

Look out for these keywords throughout today's lesson.

Today's lesson is split into two parts.

We'll start by describing the purpose of LLMs, and then we'll move on to explain bias in LLMs. Let's make a start by describing the purpose of LLMs. Large language models (LLMs) are a specific type of AI model designed to generate text output by using patterns found in large amounts of training data.

Many different types of data are included in the training data.

Sofia says, "What types of data are included in the training data?" The training data includes text from things like books, websites, social media and articles.

Training data provides examples that AI systems use to find patterns and relationships in the data.

Jacob says, "LLMs must be really smart to understand the information in the training data." Do you think that's correct? Sam says, "LLMs don't understand language at all, Jacob." LLMs do not understand language like humans.

They generate text responses by predicting the most likely next word based on statistical patterns identified in training data, not the meaning.

The aim of LLMs is to predict the next word or phrase based on patterns in the user's input text.
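To make the idea of predicting the next word from patterns concrete, here is a minimal sketch in Python. It is only an illustration with a tiny, made-up training set, not how a real LLM is built: it counts which word follows which, then suggests the most common follow-on word. Real LLMs use neural networks trained on vastly more data, but the basic idea of reusing patterns rather than understanding meaning is the same.

```python
from collections import Counter, defaultdict

# A tiny, made-up training set used only to illustrate next-word prediction.
training_text = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]

# Count which word follows which word in the training sentences.
next_word_counts = defaultdict(Counter)
for sentence in training_text:
    words = sentence.split()
    for current_word, following_word in zip(words, words[1:]):
        next_word_counts[current_word][following_word] += 1

def predict_next_word(word):
    """Return the follow-on word seen most often in training, or None."""
    if word not in next_word_counts:
        return None
    return next_word_counts[word].most_common(1)[0][0]

print(predict_next_word("the"))  # 'cat' -- the most common word after 'the'
print(predict_next_word("cat"))  # 'sat' -- ties broken by first occurrence
```

Notice that the program never understands the sentences; it simply reuses counted patterns, which is the point Sam makes above.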

LLMs are designed to generate a text response to a request as realistically as if you were speaking to a human.

LLMs can be used to complete tasks such as text completion, summarising, translation from one language to another, and classification.

This makes LLMs useful in AI-based technology such as chatbot applications, translation tools, and writing assistants.

Chatbot applications are designed to respond to written or spoken human input prompts.

The generated response is a prediction of which words will answer the prompt the application has been given.

Note that the responses are based on the data that has been used to train the LLM.

Sam's got a really good question.

"What is a prompt?" Do you know the answer to this question? May you pause the video whilst you have a think.

Jacob says, "A prompt is the input text or questions humans give to LLMs." Well done, Jacob.

Prompts give instructions to the LLM as to what you want it to generate.

The quality of the prompt can influence how accurate or useful the output from the LLM is.
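As a simple, made-up illustration of prompt quality, compare the two prompts below. The second states the topic, format, length and audience, so it gives the LLM far more to go on when predicting its response.

```python
# Two invented example prompts. The vague prompt leaves the LLM guessing what
# kind of text is wanted; the specific prompt guides the prediction much more.
vague_prompt = "Tell me about volcanoes."

specific_prompt = (
    "Write three short bullet points explaining why volcanoes erupt, "
    "using language suitable for a 12-year-old reader."
)

print(vague_prompt)
print(specific_prompt)
```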

Translation tools are designed to convert text from one language to another.

They use LLMs that have identified language patterns in training data and aim to produce accurate translations between languages.

It's important to note that all AI systems, including LLMs, can make inaccurate predictions and can sometimes generate information that sounds real but isn't actually true.

Sofia's got a really good tip here.

"Always check and question the results of AI systems before making decisions based on what they output." Okay, I have a true or false question for you here.

Large language models are trained on small amounts of text data.

Is this true or false? Pause the video whilst you have a think.

Did you say false? Well done.

Large language models are trained on large amounts of text training data, not small amounts.

Another true or false statement.

LLMs can sometimes generate information that sounds real but isn't actually true.

Is this true or false? Pause the video whilst you have a think.

That's right.

It's true.

You should always check the responses that are produced by AI systems. Okay, a question for you now.

What is the main purpose of LLMs? Is it A, to create videos from animations.

B, to predict the weather from previous data.

Or C, to generate text based on patterns in training data.

Pause the video whilst you think carefully about your answer.

Did you select C? Well done.

The main purpose of LLMs is to generate text based on patterns in training data.

Okay, we are moving on to our first task of today's lesson.

Task A.

For part one, I'd like you to fill in the blanks below.

So we have a paragraph with some blanks and you have some words at the bottom to use.

LLM is short for large blank.

LLMs are trained on large amounts of blank data.

LLMs generate text based on blank found in the training data.

A user gives an LLM a blank to guide what kind of text it should produce.

It is always important to blank information produced by an LLM as they can sometimes generate information that sounds real but isn't actually blank.

And the words you've been provided are prompt, text, true, patterns, language model and fact check.

Pause the video whilst you fill in the blanks.

How did you get on? Did you manage to fill in the blanks correctly? Let's go through the answer together.

LLM is short for large language model.

LLMs are trained on large amounts of text data.

LLMs generate text based on patterns found in the training data.

A user gives an LLM a prompt to guide what kind of text it should produce.

It is always important to fact check information produced by an LLM as they can sometimes generate information that sounds real, but isn't actually true.

Okay, part two.

Alex says, "I've heard lots of my friends talk about LLMS and chatbot applications.

I'm not sure how they could be of use to me." In your own words, describe one way that LLMs or chatbot applications might be of use to Alex.

Pause the video whilst you complete the task.

Did you manage to think of a way that the LLM or chatbot applications might be of use to Alex? Well done.

Let's have a look at a sample answer together.

"One way LLMs or chatbot applications might be useful to Alex is by helping him translate phrases and sentences from one language to another while on holiday.

This would make it easier for him to communicate with local people and understand signs or menus." Did you have this as an example or did you have something else? Remember, if you need to pause the video and revisit your answer, you can do that now.

Okay.

We are now moving on to the second part of today's lesson where we're going to explain bias in LLMs. Okay, Jacob's got a prompt here, and his prompt is, "How many countries are there starting with the letter V?" The chatbot has come back with a response and that response is, "There is only one country in the world that starts with the letter V and that is Vanuatu." Hmm.

Sam says, "That doesn't sound right, Jacob, what about Venezuela?" Jacob's written another prompt.

"Are you sure about that?" The chatbot application has another response.

"I apologise for my mistake.

Currently, there are no countries in the world whose name starts with V.

However, there have been several countries in the past whose names began with V such as Venezuela, Vietnam, and the city state of Vatican City.

I hope this clarifies my previous response." Sam says, "That still doesn't sound correct, Jacob.

I'm sure they're all still countries." There is no guarantee that the data used to train LLMs is accurate or trustworthy.

LLMs are trained on large amounts of public data, which may contain mistakes, stereotypes, or outdated information.

These issues can show up in LLM responses and result in incorrect or biased information being output to users.

Sam says, "What does it mean when something is biassed?" Biassed means when something is unfair towards or against something or someone.

AI models are biased if the model favours some things and deprioritises or excludes others.

Sam's got a new prompt this time.

"List a few hobbies that my friend Jacob could try." The chatbot application response is, "Certainly, here are four hobbies that your friend Jacob could try.

One, football.

Two, video gaming.

Three, playing a musical instrument.

And four, woodworking." Do you think the chatbot application's output is biassed? Maybe pause the video whilst you have a think.

Jacob says, "I think the chatbot application might be biassed and is suggesting stereotypes of what a boy might like to do as a hobby.

I would actually like to try dancing." Sofia says, "And I've enjoyed playing football as a hobby for years." Bias in an LLM's output can show up in a number of ways.

For example, a chatbot application might assume certain jobs are more suitable for men, or that certain cultures are less intelligent.

This output occurs because the AI model has identified patterns of real world stereotypes in the training data.
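Here is a minimal sketch, using the same pattern-counting idea as earlier but with a deliberately skewed, made-up training set, showing how a model ends up repeating whatever its data over-represents.

```python
from collections import Counter

# A deliberately skewed, made-up training set, used only to show how patterns
# in data become biased suggestions. The names and hobbies are illustrative.
training_sentences = [
    "Jacob enjoys football",
    "Jacob enjoys video gaming",
    "Jacob enjoys football",
    "Sofia enjoys dancing",
]

# Count which hobby follows each name in the training sentences.
hobby_counts = {}
for sentence in training_sentences:
    name, _, hobby = sentence.split(maxsplit=2)
    hobby_counts.setdefault(name, Counter())[hobby] += 1

# A model built from these counts keeps suggesting "football" for Jacob and
# "dancing" for Sofia, simply because that is what the data over-represents.
for name, counts in hobby_counts.items():
    print(name, "->", counts.most_common(1)[0][0])
```

The sketch suggests football for Jacob only because the data says so more often, not because it knows anything about him; that is the same mechanism behind the stereotyped hobby list above.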

Note that not all bias is intentional, but it can still cause harm.

When building AI systems, it's important that developers train the models using data that is as fair and unbiased as possible.

Bias affects how people trust and use chatbot applications and other technologies based on LLMs. For example, if a chatbot application gives unfair or incorrect answers, users may stop using it due to a lack of trust.

Trust is really important.

Jacob says, "How can bias be reduced in LLMs and chatbot applications?" Removing bias can be a complex operation.

A good start can be to train AI models on diverse, high-quality and inclusive data sets.
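As a minimal sketch of what checking the training data might look like, the code below (with made-up examples and an arbitrary threshold) measures how many examples each group contributes and flags anything heavily over-represented. Real dataset audits are much more involved than this.

```python
from collections import Counter

# Made-up training examples, each labelled with the group it represents.
training_examples = [
    {"text": "Jacob enjoys football", "group": "boys"},
    {"text": "Jacob enjoys gaming", "group": "boys"},
    {"text": "Jacob enjoys woodworking", "group": "boys"},
    {"text": "Sofia enjoys dancing", "group": "girls"},
]

group_counts = Counter(example["group"] for example in training_examples)
total = sum(group_counts.values())

for group, count in group_counts.items():
    share = count / total
    print(f"{group}: {count} examples ({share:.0%})")
    if share > 0.6:  # arbitrary threshold chosen just for this sketch
        print(f"  Warning: '{group}' is over-represented; add more varied examples.")
```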

Sofia says, "Can bias be completely removed from LLMs and chatbot applications?" What do you think? Maybe pause the video and have a quick think.

Unfortunately not.

Even with effort, some forms of bias may still remain and no AI model or application is ever completely free of bias.

Ongoing monitoring and human review are needed to help AI models become as unbiased and as fair as possible.

Jacob says, "Now I understand more about bias in LLMs. I feel like I can use them more responsibly and with more care." Sam says, "Now we know some of the risks.

We can create better prompts and check the LLM output and answers." Okay, I have a question for you.

Bias in AI models can, A, be completely removed.

B, never be reduced.

Or C, be reduced but never completely removed.

Pause the video whilst you have a think about your answer.

Did you select C? Well done.

Bias in AI models can be reduced but never completely removed.

I have a true or false statement for you now.

Bias can be reduced by training AI models on diverse, high-quality and inclusive data sets.

Is this true or false? Pause the video whilst you have a think.

That's right.

It's true.

We can reduce bias in AI systems by ensuring our training data is high quality.

Okay, we are moving on to our second task of today's lesson.

Task B.

In your own words, for part one, explain what bias is and how it can appear in the output of LLMs and chatbot applications.

And then for part two, explain how bias can be reduced in LLMs and chatbot applications.

Pause the video here, whilst you have a go at the tasks.

How did you get on? Did you manage to answer the questions? Well done.

For part one you are asked to explain what bias is and how it can appear in the output of LLMs and chatbot applications.

Here's a sample answer.

"Bias is when the output of an AI system is unfair, unbalanced, or favours one group over another.

Biased output can happen when the training data contains stereotypes, missing or incomplete information, or is not representative of everyone.

For example, a chatbot application might assume certain jobs are more suitable for men or certain cultures are less intelligent than others.

Bias affects how people trust and use chatbot applications and other technology based on LLMs." For part two you were asked to explain how bias can be reduced in LLMs and chatbot applications.

Here's another sample answer.

"Reducing bias can be complex, but a good start can be to train AI models on diverse, high-quality and inclusive data sets.

Even with a lot of effort, some forms of bias may still remain and no AI model or application is ever completely free of bias.

Ongoing monitoring and human review are needed to help AI models become as unbiased and as fair as possible." Remember, if you want to pause the video here and add any detail to your answer or revisit any of the previous slides, you can do that now.

Okay, we've come to the end of today's lesson, chatbot applications and other LLMs. Let's summarise what we've learned together in this lesson.

Large language models (LLMs) are a specific type of AI model designed to generate text output by using patterns found in large amounts of training data.

Chatbot applications use LLMs and are designed to provide realistic conversations by predicting the next word or phrase.

There is no guarantee that data used to train LLMs is accurate, unbiased and trustworthy.

Bias is when the output of an AI model favours some things and deprioritises or excludes others.

You've done a fantastic job today, so well done.

And I hope you'll join me again soon.

Bye.