Building AI that’s safe for the classroom: what we have learned with Aila
30 July 2025

Emma Searle
AI Product Manager
There has been rapid development in the use of AI tools across the education sector since the launch of tools like ChatGPT in November 2022: almost 70% of teachers now use AI in their work, and one in ten secondary teachers uses it during lessons (Wespieser, 2024).
This brings huge opportunities, but it also raises concerns about the safety and age-appropriateness of AI-generated content used in classrooms.
When building Aila, our free AI lesson assistant, we made safety guardrails a core part of the design, so that teachers can trust it to help them create lesson resources that are safe, high-quality and suitable for their classrooms.
We’ve learned a huge amount about AI safety guardrails in education since launching Aila, and it was great to share what we’ve learned in our paper on safety guardrails in AI education tools at the 26th International Conference on Artificial Intelligence in Education (AIED 2025) last month.
Our paper focuses on the four-stage safety approach we have taken with Aila - here’s a quick overview:
Stage 1: Prompt engineering to create materials specifically designed for use in UK schools
We’ve built Aila to follow clear and detailed instructions (prompts) that help it generate lessons in line with both our curriculum principles and the national curriculum in England.
This helps Aila suggest lesson content that’s age-appropriate and relevant for classroom use, especially compared to general AI tools that aren’t tailored for schools.
Take, for example, a lesson on ‘drugs’ for students aged 15–16. Aila is guided to base this on the Relationships, Sex, and Health Education (RSHE) curriculum, so the lesson focuses on the effects of drug use, legal considerations and the wider impact on society, pitched at a level suitable for the age group. This technique guides Aila to produce content that is appropriate for use in our classrooms, even for sensitive topics.
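To make the idea concrete, here is a minimal sketch in TypeScript. None of this is Aila’s actual code or prompt, which is far more detailed; the interface, function and wording are purely illustrative.

```typescript
// A minimal sketch of curriculum-anchored prompt engineering.
// All names here are illustrative, not Aila's real implementation.

interface LessonRequest {
  subject: string;
  topic: string;
  keyStage: "KS1" | "KS2" | "KS3" | "KS4"; // KS4 covers ages 14-16
}

// Fixed instructions are prepended to every request, anchoring the
// model to the national curriculum in England and to age-appropriateness.
function buildSystemPrompt(req: LessonRequest): string {
  return [
    `You are a lesson-planning assistant for UK schools.`,
    `Plan a ${req.subject} lesson on "${req.topic}" for ${req.keyStage} pupils.`,
    `Follow the national curriculum in England for this key stage.`,
    `Keep all content age-appropriate for pupils at this key stage.`,
    `Frame sensitive topics the way the relevant curriculum does; for`,
    `example, treat 'drugs' as the RSHE curriculum does: effects of drug`,
    `use, the law, and the wider impact on society.`,
  ].join("\n");
}

// Example: the 'drugs' lesson for students aged 15-16 described above.
console.log(buildSystemPrompt({ subject: "RSHE", topic: "drugs", keyStage: "KS4" }));
```

The point is simply that fixed, curriculum-anchored instructions travel with every request, so the model never sees a lesson request without them.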
Stage 2: Input threat detection to identify harmful or malicious requests
Aila also uses a process called input threat detection to identify and flag users who could be trying to mislead it or use it for purposes other than its intended use - creating lessons and resources appropriate for UK schools. If potentially prohibited or malicious user requests are detected, they are blocked, and repeated issues lead to user restrictions. This layer ensures that only safe, well-intended requests reach Aila, protecting the integrity of the tool and giving teachers confidence in the tool’s security and reliability.
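As a rough illustration, an input-screening layer like the one below might sit in front of the lesson generator. The classifier here is a stub and the strike threshold is hypothetical; the real system would use a dedicated moderation model and its own policy.

```typescript
// An illustrative sketch of input threat detection: every request is
// screened before it reaches the lesson generator. Names, the stubbed
// classifier, and the threshold are all assumptions for illustration.

type Verdict = { allowed: true } | { allowed: false; reason: string };

// Stub for the threat classifier; in practice this would be a
// moderation model scoring the request against the tool's intended use.
async function classifyRequest(input: string): Promise<Verdict> {
  const onTask = /lesson|plan|resource|curriculum|pupil/i.test(input);
  return onTask
    ? { allowed: true }
    : { allowed: false, reason: "Request is outside lesson planning" };
}

const MAX_BLOCKED_REQUESTS = 3; // hypothetical limit before restricting a user
const blockedCounts = new Map<string, number>();

async function screenRequest(userId: string, input: string): Promise<void> {
  const verdict = await classifyRequest(input);
  if (!verdict.allowed) {
    const count = (blockedCounts.get(userId) ?? 0) + 1;
    blockedCounts.set(userId, count);
    if (count >= MAX_BLOCKED_REQUESTS) {
      // Repeated misuse restricts the account, not just the request.
      throw new Error("Repeated blocked requests: account restricted.");
    }
    throw new Error(`Request blocked: ${verdict.reason}`);
  }
  // Only requests that pass this gate are forwarded to Aila itself.
}
```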
Stage 3: Independent content moderation to review the content produced
The content that you create with Aila is reviewed by an independent AI agent called an Independent Asynchronous Content Moderation Agent (IACMA). This agent, unaware of your instructions to Aila, evaluates content across four categories:
- Safe - content is safe for use in classrooms.
- Content guidance - content that is appropriate but may need teacher sensitivity when being used, for example, lessons discussing discrimination.
- Highly sensitive - topics that are too sensitive, or that demand complete accuracy, for AI to reliably generate content on, such as first aid or specific laws. Very soon, Aila will not produce lessons on this content.
- Toxic - content that encourages harmful or illegal behaviour. This is blocked, and repeated attempts to produce it will result in the user's access to Aila being restricted.
This system is intentionally overcautious, prioritising safety, especially for younger pupils. If you ever think we have miscategorised content, you can let us know straight away using the feedback option during lesson creation, or contact us at any time. Find out more about our content moderation categories.
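To show the shape of this stage, here is a hedged sketch of an independent moderation call. The LlmClient interface and the prompt wording are assumptions; the important design choice, taken from the description above, is that the moderator receives only the generated lesson and not the user's instructions.

```typescript
// A sketch of independent asynchronous content moderation. The key
// constraint from the post: the moderation agent sees only the
// generated lesson, never the user's conversation with Aila.

type ModerationCategory = "safe" | "content guidance" | "highly sensitive" | "toxic";

interface ModerationResult {
  category: ModerationCategory;
  justification: string;
}

interface LlmClient {
  complete(prompt: string): Promise<string>; // hypothetical client
}

const MODERATION_PROMPT =
  `Classify the lesson content below into exactly one category:\n` +
  `"safe", "content guidance", "highly sensitive", or "toxic".\n` +
  `Reply as JSON: {"category": "...", "justification": "..."}\n\n`;

async function moderateLesson(llm: LlmClient, lessonContent: string): Promise<ModerationResult> {
  // Deliberately pass only the lesson text: withholding the user's
  // instructions keeps the moderator's judgement independent.
  const raw = await llm.complete(MODERATION_PROMPT + lessonContent);
  return JSON.parse(raw) as ModerationResult;
}
```

Because the check runs asynchronously, it does not have to block lesson generation; a "toxic" verdict blocks the content and counts towards user-level restrictions.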
Stage 4: Human-in-the-loop to provide the final check before sharing resources with pupils
The final safety layer is you, the teacher. Aila is designed to support, not replace, your expertise. You know the students in your classrooms best, and what is and isn’t appropriate for them. Teachers guide the lesson planning process, reviewing Aila’s suggestions and ensuring content aligns with their school policies and their pupils' needs. This human oversight is essential because we know that AI can make mistakes and that every class and pupil is different.
By combining these four stages, Aila offers a robust, safety-first approach to AI in education. Teachers can confidently use this tool, knowing it’s built to protect pupils while supporting high-quality, curriculum-aligned lesson planning.
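Put together, the four stages might compose like this. This is an illustrative pipeline reusing the sketches above; generateLesson stands in for the lesson-drafting call and is not a real Aila function.

```typescript
// How the four stages could fit together (illustrative only; reuses
// LessonRequest, buildSystemPrompt, screenRequest, LlmClient and
// moderateLesson from the sketches above).

declare function generateLesson(systemPrompt: string, input: string): Promise<string>;

async function createLesson(userId: string, req: LessonRequest, input: string, llm: LlmClient) {
  await screenRequest(userId, input);                   // Stage 2: gate the input
  const prompt = buildSystemPrompt(req);                // Stage 1: curriculum framing
  const lesson = await generateLesson(prompt, input);   // draft the lesson
  const moderation = await moderateLesson(llm, lesson); // Stage 3: independent review
  // Stage 4: the lesson and its moderation verdict go back to the
  // teacher, who makes the final call before anything reaches pupils.
  return { lesson, moderation };
}
```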
How effective are these safety guardrails?
Before Aila was launched, we carried out a thorough evaluation of its safety features by testing them against a collection of Oak lessons previously developed and reviewed by expert teachers using our safety criteria. We also worked with external safety specialists to conduct "red-teaming" exercises (deliberate attempts to prompt Aila into generating inappropriate content) to verify that the tool met our high safety standards.
Following the launch, we extended our testing using a sample of over 10,000 lessons created through Aila. This analysis surfaced several nuanced challenges, including how Aila handles recent events not included in its training data (like the 2024 US election or ongoing global conflicts), and how it responds to topics that fall outside the national curriculum but are still commonly taught in schools, such as special educational needs or outdoor learning.
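The pre-launch evaluation described above can be pictured as a simple replay harness: run expert-labelled lessons through the moderation agent and compare its verdicts with the expert labels. The data shape below is an assumption, not Oak's actual evaluation code, and it reuses moderateLesson from the earlier sketch.

```typescript
// A hedged sketch of evaluating the guardrails against lessons that
// expert teachers have already reviewed and labelled.

interface LabelledLesson {
  content: string;
  expertLabel: ModerationCategory; // assigned by expert teachers
}

async function evaluateGuardrails(llm: LlmClient, lessons: LabelledLesson[]) {
  let agreements = 0;
  const disagreements: { expected: ModerationCategory; got: ModerationCategory }[] = [];

  for (const lesson of lessons) {
    const result = await moderateLesson(llm, lesson.content);
    if (result.category === lesson.expertLabel) {
      agreements++;
    } else {
      disagreements.push({ expected: lesson.expertLabel, got: result.category });
    }
  }

  // Agreement rate against expert judgement; the disagreements are
  // the cases worth inspecting by hand.
  return { agreementRate: agreements / lessons.length, disagreements };
}
```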
What’s next?
Aila is still in beta, which means we are continuing to improve it. We will keep reviewing our guardrails to make sure they work as expected, and we will identify new and emerging content areas that we may want to filter from Aila. Aila currently produces only text; if we begin to integrate images and other media, we will need to extend our safety guardrails to cover these too.