SQL Queries Revision Notes

    Subject: Computer Science | Level: A-Level | Exam Board: Edexcel

    Master Edexcel A-Level SQL with this comprehensive guide, packed with examiner insights, worked examples, and multi-modal resources. Learn to construct precise, mark-scoring queries and avoid the common pitfalls that cost candidates valuable marks.

    Revision Notes & Key Concepts

    ![Header image for Edexcel A-Level SQL Queries (A6.4)](https://xnnrgnazirrqvdgfhvou.supabase.co/storage/v1/object/public/study-guide-assets/guide_056dc06f-c398-4cf7-adc9-cb79d69f4535/header_image.png) ## Overview Structured Query Language (SQL) is the universal language for managing and querying relational databases. For an Edexcel A-Level Computer Science student, proficiency in SQL is not just a desirable skill but a core requirement, frequently tested in Paper 1 and Paper 2. This topic, specification reference A6.4, requires you to move beyond simple data retrieval and master the art of constructing complex, multi-table queries that filter, sort, and aggregate data with precision. Examiners are looking for candidates who can not only write syntactically correct SQL but also interpret a given database schema to solve a specific problem. A typical exam question will present a scenario with a relational schema and ask you to write a specific query to extract information, often for 4-6 marks. This topic has strong synoptic links to 'Databases' (A6.1) and 'Relational Databases' (A6.2), as understanding database structure is fundamental to querying it effectively. ![Listen to our 10-minute podcast guide to mastering SQL for your exam.](https://xnnrgnazirrqvdgfhvou.supabase.co/storage/v1/object/public/study-guide-assets/guide_056dc06f-c398-4cf7-adc9-cb79d69f4535/sql_queries_podcast.mp3) ## Key Concepts ### Concept 1: Data Manipulation Language (DML) vs. Data Definition Language (DDL) SQL commands are broadly split into two categories. **DML (Data Manipulation Language)** is used to interact with the data *within* the tables. This is what you'll use most in exams. It includes the 'big four': - **SELECT**: Retrieves data from one or more tables. - **INSERT**: Adds a new row of data to a table. - **UPDATE**: Modifies existing data in a table. - **DELETE**: Removes rows from a table. **DDL (Data Definition Language)** is used to define and manage the structure of the database itself. Think of it as the blueprint for your data. - **CREATE TABLE**: Builds a new table, defining its columns and their data types. - **ALTER TABLE**: Modifies an existing table, for example by adding or removing a column. - **DROP TABLE**: Deletes an entire table and all its data. Examiners test your ability to distinguish between these. If a question asks you to 'add a new customer', you need an INSERT statement (DML). If it asks you to 'add a new field for customer email', you need an ALTER TABLE statement (DDL). ![DML vs DDL: Manipulating data vs. Defining structure.](https://xnnrgnazirrqvdgfhvou.supabase.co/storage/v1/object/public/study-guide-assets/guide_056dc06f-c398-4cf7-adc9-cb79d69f4535/sql_dml_ddl.png) ### Concept 2: The SELECT Statement and Filtering with WHERE The SELECT statement is the cornerstone of data retrieval. Its fundamental structure is `SELECT [columns] FROM [table]`. To retrieve all columns, you can use the asterisk wildcard (`*`), but marks are typically awarded for selecting the specific columns requested in the question. To filter the rows returned, you use the **WHERE** clause. This is where the logic of your query resides. **Example**: To find all customers named 'Smith' from a 'Customers' table. `SELECT CustomerID, FirstName, Postcode FROM Customers WHERE LastName = 'Smith';` **Crucial Examiner Point**: Notice the single quotes around 'Smith'. Any string or date value in a WHERE clause **must** be enclosed in single quotes. Forgetting this is one of the most common ways candidates lose a mark. Numeric values, however, do not use quotes (e.g., `WHERE Age > 18`). You can build complex conditions using boolean operators `AND` (both conditions must be true) and `OR` (at least one condition must be true). ### Concept 3: Joining Tables Real-world databases are relational, meaning data is split across multiple tables linked by keys. To retrieve data from multiple tables in a single query, you must use a **JOIN**. The most common type you'll need is the **INNER JOIN**. An INNER JOIN returns only the rows where the key exists in *both* tables. **Example**: You have a `Students` table (with `CourseID`) and a `Courses` table (with `CourseID` and `CourseName`). To list each student's name and the name of their course, you join them on their common `CourseID` field. `SELECT Students.Name, Courses.CourseName FROM Students INNER JOIN Courses ON Students.CourseID = Courses.CourseID;` **Crucial Examiner Point**: When a column name exists in both tables (like `CourseID` here), you **must** prefix it with the table name (e.g., `Students.CourseID`). This is called 'qualifying the column' and failure to do so when ambiguity exists will cost you a mark. ![Visualising how INNER and LEFT JOINs combine data from two tables.](https://xnnrgnazirrqvdgfhvou.supabase.co/storage/v1/object/public/study-guide-assets/guide_056dc06f-c398-4cf7-adc9-cb79d69f4535/sql_join_types.png) ### Concept 4: Aggregation with GROUP BY Sometimes you don't want individual rows; you want a summary. This is achieved using **aggregate functions** combined with the **GROUP BY** clause. Aggregate functions perform a calculation on a set of rows and return a single summary value. The five you must know are: - **COUNT()**: Counts the number of rows. - **SUM()**: Adds up the values in a column. - **AVG()**: Calculates the average of the values in a column. - **MIN()**: Finds the minimum value in a column. - **MAX()**: Finds the maximum value in a column. **Example**: To count the number of students in each course. `SELECT CourseID, COUNT(StudentID) FROM Students GROUP BY CourseID;` The `GROUP BY CourseID` clause tells SQL to bundle all the rows with the same `CourseID` together and then apply the `COUNT` function to each bundle. If you need to filter the results of an aggregate function (e.g., only show courses with more than 10 students), you use the **HAVING** clause, which works like WHERE but for groups: `HAVING COUNT(StudentID) > 10`. ## Mathematical/Scientific Relationships While SQL isn't mathematical in the sense of physics equations, the logical structure is paramount. The order of clauses is a strict rule that must be memorised. It is not flexible. **The Mandatory SQL Clause Order** (Must Memorise): 1. `SELECT` 2. `FROM` 3. `JOIN` 4. `WHERE` 5. `GROUP BY` 6. `HAVING` 7. `ORDER BY` ![Logical flow of data through an SQL query's clauses.](https://xnnrgnazirrqvdgfhvou.supabase.co/storage/v1/object/public/study-guide-assets/guide_056dc06f-c398-4cf7-adc9-cb79d69f4535/sql_clause_flowchart.png) Think of it as a filtering pipeline. Data flows from the `FROM` clause, is filtered by `WHERE`, grouped by `GROUP BY`, filtered again by `HAVING`, and finally sorted by `ORDER BY` before being output by `SELECT`. Getting this order wrong results in a syntax error and will score zero marks. ## Practical Applications SQL is one of the most commercially valuable skills in computer science. It powers: - **E-commerce sites**: Finding products, managing customer orders, and analysing sales data. - **Social Media**: Storing user profiles, posts, connections, and serving content feeds. - **Banking Systems**: Managing customer accounts, transactions, and generating statements. - **Booking Systems**: Handling flight, hotel, or event bookings and checking availability. In your A-Level project, if you are creating a dynamic web application with a database backend, you will be using SQL (or an ORM that generates SQL) extensively to manage your application's data. This is a chance to demonstrate practical application of this topic, which examiners value highly.

    Key Terms & Definitions

    Relational Database
    A database structured to recognise relations between stored items of information. Data is organised as a set of formally-described tables from which data can be accessed or reassembled in many different ways without having to reorganise the database tables.
    Primary Key
    A field in a table which uniquely identifies each record in that table. It must contain a unique value for each row of data and cannot contain null values.
    Foreign Key
    A field in a table that is a primary key in another table. It acts to provide a logical link between the two tables.
    Cardinality
    The relationship between two tables, expressed as one-to-one, one-to-many, or many-to-many.
    Aggregate Function
    A function that performs a calculation on a set of values and returns a single value. Examples include COUNT, SUM, AVG, MIN, MAX.
    Alias
    A temporary name given to a table or a column in a query. An alias is created with the `AS` keyword.

    Worked Examples

    Practice Questions

    SQL Queries

    Edexcel
    A-Level
    Computer Science

    Master Edexcel A-Level SQL with this comprehensive guide, packed with examiner insights, worked examples, and multi-modal resources. Learn to construct precise, mark-scoring queries and avoid the common pitfalls that cost candidates valuable marks.

    7
    Min Read
    3
    Examples
    5
    Questions
    6
    Key Terms
    🎙 Podcast Episode
    SQL Queries
    0:00-0:00

    Study Notes

    Header image for Edexcel A-Level SQL Queries (A6.4)

    Overview

    Structured Query Language (SQL) is the universal language for managing and querying relational databases. For an Edexcel A-Level Computer Science student, proficiency in SQL is not just a desirable skill but a core requirement, frequently tested in Paper 1 and Paper 2. This topic, specification reference A6.4, requires you to move beyond simple data retrieval and master the art of constructing complex, multi-table queries that filter, sort, and aggregate data with precision. Examiners are looking for candidates who can not only write syntactically correct SQL but also interpret a given database schema to solve a specific problem. A typical exam question will present a scenario with a relational schema and ask you to write a specific query to extract information, often for 4-6 marks. This topic has strong synoptic links to 'Databases' (A6.1) and 'Relational Databases' (A6.2), as understanding database structure is fundamental to querying it effectively.

    Listen to our 10-minute podcast guide to mastering SQL for your exam.

    Key Concepts

    Concept 1: Data Manipulation Language (DML) vs. Data Definition Language (DDL)

    SQL commands are broadly split into two categories. DML (Data Manipulation Language) is used to interact with the data within the tables. This is what you'll use most in exams. It includes the 'big four':

    • SELECT: Retrieves data from one or more tables.
    • INSERT: Adds a new row of data to a table.
    • UPDATE: Modifies existing data in a table.
    • DELETE: Removes rows from a table.

    DDL (Data Definition Language) is used to define and manage the structure of the database itself. Think of it as the blueprint for your data.

    • CREATE TABLE: Builds a new table, defining its columns and their data types.
    • ALTER TABLE: Modifies an existing table, for example by adding or removing a column.
    • DROP TABLE: Deletes an entire table and all its data.

    Examiners test your ability to distinguish between these. If a question asks you to 'add a new customer', you need an INSERT statement (DML). If it asks you to 'add a new field for customer email', you need an ALTER TABLE statement (DDL).

    DML vs DDL: Manipulating data vs. Defining structure.

    Concept 2: The SELECT Statement and Filtering with WHERE

    The SELECT statement is the cornerstone of data retrieval. Its fundamental structure is SELECT [columns] FROM [table]. To retrieve all columns, you can use the asterisk wildcard (*), but marks are typically awarded for selecting the specific columns requested in the question.

    To filter the rows returned, you use the WHERE clause. This is where the logic of your query resides.

    Example: To find all customers named 'Smith' from a 'Customers' table.
    SELECT CustomerID, FirstName, Postcode FROM Customers WHERE LastName = 'Smith';

    Crucial Examiner Point: Notice the single quotes around 'Smith'. Any string or date value in a WHERE clause must be enclosed in single quotes. Forgetting this is one of the most common ways candidates lose a mark. Numeric values, however, do not use quotes (e.g., WHERE Age > 18). You can build complex conditions using boolean operators AND (both conditions must be true) and OR (at least one condition must be true).

    Concept 3: Joining Tables

    Real-world databases are relational, meaning data is split across multiple tables linked by keys. To retrieve data from multiple tables in a single query, you must use a JOIN. The most common type you'll need is the INNER JOIN. An INNER JOIN returns only the rows where the key exists in both tables.

    Example: You have a Students table (with CourseID) and a Courses table (with CourseID and CourseName). To list each student's name and the name of their course, you join them on their common CourseID field.

    SELECT Students.Name, Courses.CourseName FROM Students INNER JOIN Courses ON Students.CourseID = Courses.CourseID;

    Crucial Examiner Point: When a column name exists in both tables (like CourseID here), you must prefix it with the table name (e.g., Students.CourseID). This is called 'qualifying the column' and failure to do so when ambiguity exists will cost you a mark.

    Visualising how INNER and LEFT JOINs combine data from two tables.

    Concept 4: Aggregation with GROUP BY

    Sometimes you don't want individual rows; you want a summary. This is achieved using aggregate functions combined with the GROUP BY clause. Aggregate functions perform a calculation on a set of rows and return a single summary value.

    The five you must know are:

    • COUNT(): Counts the number of rows.
    • SUM(): Adds up the values in a column.
    • AVG(): Calculates the average of the values in a column.
    • MIN(): Finds the minimum value in a column.
    • MAX(): Finds the maximum value in a column.

    Example: To count the number of students in each course.
    SELECT CourseID, COUNT(StudentID) FROM Students GROUP BY CourseID;

    The GROUP BY CourseID clause tells SQL to bundle all the rows with the same CourseID together and then apply the COUNT function to each bundle. If you need to filter the results of an aggregate function (e.g., only show courses with more than 10 students), you use the HAVING clause, which works like WHERE but for groups: HAVING COUNT(StudentID) > 10.

    Mathematical/Scientific Relationships

    While SQL isn't mathematical in the sense of physics equations, the logical structure is paramount. The order of clauses is a strict rule that must be memorised. It is not flexible.

    The Mandatory SQL Clause Order (Must Memorise):

    1. SELECT
    2. FROM
    3. JOIN
    4. WHERE
    5. GROUP BY
    6. HAVING
    7. ORDER BY

    Logical flow of data through an SQL query's clauses.

    Think of it as a filtering pipeline. Data flows from the FROM clause, is filtered by WHERE, grouped by GROUP BY, filtered again by HAVING, and finally sorted by ORDER BY before being output by SELECT. Getting this order wrong results in a syntax error and will score zero marks.

    Practical Applications

    SQL is one of the most commercially valuable skills in computer science. It powers:

    • E-commerce sites: Finding products, managing customer orders, and analysing sales data.
    • Social Media: Storing user profiles, posts, connections, and serving content feeds.
    • Banking Systems: Managing customer accounts, transactions, and generating statements.
    • Booking Systems: Handling flight, hotel, or event bookings and checking availability.

    In your A-Level project, if you are creating a dynamic web application with a database backend, you will be using SQL (or an ORM that generates SQL) extensively to manage your application's data. This is a chance to demonstrate practical application of this topic, which examiners value highly.

    Visual Resources

    4 diagrams and illustrations

    The correct, non-negotiable order for SQL clauses.
    The correct, non-negotiable order for SQL clauses.
    Visualising how INNER and LEFT JOINs combine data from two tables.
    Visualising how INNER and LEFT JOINs combine data from two tables.
    Logical flow of data through an SQL query's clauses.
    Logical flow of data through an SQL query's clauses.
    DML vs DDL: Manipulating data vs. Defining structure.
    DML vs DDL: Manipulating data vs. Defining structure.

    Interactive Diagrams

    2 interactive diagrams to visualise key concepts

    A flowchart illustrating the logical processing order of an SQL SELECT statement. This fixed sequence is crucial for writing valid queries.

    A concept map showing the two main branches of SQL: Data Manipulation Language (DML) for working with data, and Data Definition Language (DDL) for managing the database structure.

    Worked Examples

    3 detailed examples with solutions and examiner commentary

    Practice Questions

    Test your understanding — click to reveal model answers

    Q1

    A database has a table tblPlayers(PlayerID, PlayerName, TeamID, GoalsScored). Write an SQL query to find the names of all players who have scored more than 20 goals.

    2 marks
    foundation

    Hint: You only need to select from one table and apply a simple filter.

    Q2

    Two tables exist: Customer(CustomerID, Name) and Orders(OrderID, CustomerID, OrderDate, Amount). Write an SQL query to list the names of all customers who have not placed any orders. Assume a customer has not placed an order if they do not appear in the Orders table.

    4 marks
    challenging

    Hint: This requires an 'anti-join'. Think about which type of JOIN returns all records from one table, even if there are no matches in the other. How can you then filter for the non-matches?

    Q3

    Using the Customer and Orders tables from the previous question, write an SQL query to find the total amount spent by each customer. The output should show the customer's name and their total order amount.

    4 marks
    standard

    Hint: You need to calculate a 'total', which suggests an aggregate function. Since you need it 'for each customer', what clause must you use with the aggregate?

    Q4

    Write an SQL query to find the names of all products from a Products table where the ProductName starts with the letter 'A'.

    3 marks
    standard

    Hint: The WHERE clause supports pattern matching. What keyword and wildcard character would you use for 'starts with'?

    Q5

    Write a DDL statement to create a new table called Subjects with two columns: SubjectID as an integer that is the primary key, and SubjectName as a variable-length string that cannot be empty.

    4 marks
    standard

    Hint: This requires a CREATE TABLE statement. Remember to define each column's name, data type, and any constraints like PRIMARY KEY or NOT NULL.

    Key Terms

    Essential vocabulary to know