What is the difference between a primary key and a foreign key?

A primary key is a unique identifier for each record in a table; it must contain unique values and cannot be NULL. A foreign key is a field in one table that refers to the primary key of another table, establishing a link between the two tables. Foreign keys can have duplicate values and can be NULL, but they must match a primary key value in the referenced table (referential integrity).

How do I normalise a database to 3NF?

To normalise to 3NF, follow these steps: 1) Ensure the table is in 1NF (atomic values, no repeating groups). 2) Remove partial dependencies to achieve 2NF (all non-key attributes must depend on the whole primary key). 3) Remove transitive dependencies to achieve 3NF (non-key attributes should not depend on other non-key attributes). For example, in a table with StudentID, CourseID, InstructorName, if InstructorName depends on CourseID (not StudentID), you would split into separate tables for Students, Courses, and Instructors.

What SQL command do I use to join two tables?

The SQL command to join two tables is JOIN, typically an INNER JOIN. For example: SELECT * FROM Students INNER JOIN Enrolments ON Students.StudentID = Enrolments.StudentID; This returns rows where there is a match in both tables. You can also use LEFT JOIN, RIGHT JOIN, or FULL OUTER JOIN depending on the desired result. Always specify the join condition using ON.

Why is normalisation important?

Normalisation reduces data redundancy and improves data integrity. By organising data into separate tables and eliminating duplicate information, you avoid update anomalies (e.g., updating a student's address in multiple places), insertion anomalies (e.g., unable to add a new course without a student), and deletion anomalies (e.g., losing course information when deleting a student). It also makes the database more flexible and easier to maintain.

What are ACID properties in databases?

ACID stands for Atomicity, Consistency, Isolation, and Durability. Atomicity ensures that a transaction is all-or-nothing; if any part fails, the entire transaction is rolled back. Consistency ensures that a transaction brings the database from one valid state to another, maintaining all rules (e.g., constraints). Isolation ensures that concurrent transactions do not interfere with each other. Durability guarantees that once a transaction is committed, it remains permanent even in case of a system failure.

How do I draw an ER diagram for a school database?

First, identify entities: e.g., Student, Teacher, Course, Classroom. Define attributes for each entity (e.g., Student: StudentID, Name, DateOfBirth). Determine relationships: e.g., a Student enrols in many Courses (many-to-many), so you need a linking entity Enrolment. A Teacher teaches one or more Courses (one-to-many). Use rectangles for entities, ovals for attributes, and diamonds for relationships. Label cardinalities (1, M, N) on the lines connecting entities.

Fundamentals of databases

AQA

A-Level

Structured Query Language (SQL) is the standard language for interacting with relational databases, enabling users to retrieve, insert, update, and delete data efficiently. Mastery of commands such as SELECT, JOIN, GROUP BY, and HAVING is essential for extracting valuable insights from data and forms a core competency in database management and application development. This subtopic emphasizes practical query construction and data manipulation techniques that are fundamental to real-world database systems.

Objectives

Exam Tips

Pitfalls

Key Terms

Mark Points

Subtopics in this area

Structured Query Language (SQL)

Database design and normalisation

Transaction processing

Relational databases

Security and integrity

Topic Overview

Databases are fundamental to modern computing, enabling efficient storage, retrieval, and management of large volumes of structured data. In the AQA A-Level Computer Science specification, this topic covers the principles of database design, the relational model, and the use of Structured Query Language (SQL) to manipulate data. You will learn how to design a database schema using entity-relationship (ER) diagrams, normalise data to reduce redundancy, and implement queries using SQL. Understanding databases is crucial for developing data-driven applications and is a core component of the course, often examined through both theoretical questions and practical SQL tasks.

The relational database model organises data into tables (relations) with rows (tuples) and columns (attributes). Each table represents an entity, and relationships between entities are established through foreign keys. You will study the three normal forms (1NF, 2NF, 3NF) to ensure data integrity and eliminate anomalies. Additionally, you will learn about transactions, ACID properties (Atomicity, Consistency, Isolation, Durability), and the role of database management systems (DBMS). This knowledge is not only examinable but also directly applicable to real-world scenarios, such as designing a school's student records system or an e-commerce platform.

Mastering databases is essential for any aspiring software developer or data analyst. The topic builds on earlier concepts of data structures and algorithms, and it connects to later topics like big data and data mining. In the A-Level exam, you may be asked to interpret a scenario, draw an ER diagram, normalise a set of attributes, write SQL queries, or discuss the advantages of using a DBMS. A strong grasp of databases will help you secure high marks in both Paper 1 (practical programming) and Paper 2 (theoretical concepts).

Key Concepts

Core ideas you must understand for this topic

→Relational model: data organised into tables with rows and columns; primary keys uniquely identify each row; foreign keys link tables.
→Normalisation: process of organising data to reduce redundancy and dependency, typically up to Third Normal Form (3NF).
→SQL: standard language for querying and manipulating databases; includes SELECT, INSERT, UPDATE, DELETE, and JOIN operations.
→Entity-Relationship (ER) diagrams: graphical representation of entities, attributes, and relationships (one-to-one, one-to-many, many-to-many).
→ACID properties: ensure reliable processing of database transactions (Atomicity, Consistency, Isolation, Durability).

Learning Objectives

What you need to know and understand

Construct SQL SELECT queries to retrieve specified data from single and multiple tables using appropriate filtering and sorting.
Apply INSERT, UPDATE, and DELETE statements to modify database contents while maintaining data integrity.
Design queries using various JOIN types (e.g., INNER, LEFT, RIGHT) to integrate related data from multiple tables.
Implement GROUP BY clauses with aggregate functions to summarize data at different levels of granularity.
Differentiate between WHERE and HAVING clauses to correctly filter rows before and after grouping.
Analyze SQL query outputs to verify correctness and identify potential errors in logic or syntax.
Define the rules that a relation must satisfy to be in first, second, and third normal form
Identify partial dependencies and transitive dependencies in a given unnormalised table
Apply the steps of normalisation to convert a dataset from unnormalised form to 3NF
Evaluate the trade-offs between normalised and denormalised database designs
Describe the ACID properties (Atomicity, Consistency, Isolation, Durability) and their role in transaction processing.
Explain how atomicity is enforced using commit and rollback operations.
Analyse the impact of different isolation levels on data consistency and performance.
Evaluate trade-offs between strict isolation and system throughput in concurrent transactions.
Apply the concept of durability to design a recovery strategy after a system crash.
Assess how consistency ensures database validity across multiple simultaneous transactions.
Define the fundamental components of a relational database: tables, records, fields, primary keys, foreign keys, and relationships.
Explain the function of primary and foreign keys in enforcing data integrity and establishing relationships.
Differentiate between one-to-one, one-to-many, and many-to-many relationships using concrete examples.
Construct a simple relational schema from a given set of requirements, identifying entities, attributes, and keys.
Analyze a given database design to identify potential redundancy and integrity issues.
Explain the role of access controls in database security, including user authentication and authorization mechanisms.
Evaluate the strengths and weaknesses of different encryption methods for protecting data at rest and in transit.
Design a backup and recovery plan for a given scenario, considering full, incremental, and differential backups.
Apply integrity constraints (e.g., primary keys, foreign keys, NOT NULL, CHECK) to ensure data validity in a relational database schema.
Analyse the impact of SQL injection attacks on database security and propose mitigation strategies.

Marking Points

Key points examiners look for in your answers

Award credit for accurate SQL syntax, including correct capitalization of keywords, use of quotes for strings, and appropriate punctuation.
When using JOIN, award credit for specifying the correct ON condition linking primary and foreign keys; penalize Cartesian products.
In GROUP BY queries, require all non-aggregated columns in SELECT to appear in the GROUP BY clause; award credit for appropriate use of aggregate functions (e.g., COUNT, SUM, AVG).
For HAVING, award credit when it is correctly applied to filter groups (not individual rows), and distinguish from WHERE usage.
In data manipulation commands, award credit for WHERE clauses that accurately target specific rows; penalize unqualified DELETE/UPDATE that could cause data loss.
Credit for correctly identifying repeating groups and removing them to achieve 1NF
Award marks for correctly identifying the primary key (simple or composite) in each relation
Expect learners to demonstrate identification of partial dependencies and their resolution into separate 2NF relations
Credit for detecting transitive dependencies and correctly separating non-key dependent attributes to achieve 3NF
Marks awarded for clear, structured presentation showing each normal form stage with appropriate relation schemas
Award credit for a precise definition of each ACID property with a relevant example.
Credit should be given for explaining that atomicity prevents partial updates even during system failures.
Look for clear distinction between isolation (concurrent execution as if serial) and serializability.
In scenario-based questions, reward identification of the specific ACID property violated and its consequence.
Expect high marks for linking durability to write-ahead logging and checkpointing techniques.
Award credit for correctly identifying primary and foreign keys in a provided table schema.
Full marks require precise definitions that distinguish between a field and a record.
Credit for demonstrating how a foreign key references a primary key in another table to link data.
When evaluating relationships, look for clear justification of relationship type (e.g., cardinality).
Award credit for clearly distinguishing between authentication (verifying identity) and authorization (granting permissions).
Credit for explaining how encryption transforms data into ciphertext and the importance of key management.
Award marks for describing the purpose of each backup type (full, incremental, differential) and appropriate use cases.
Expect reference to specific integrity constraints like entity integrity (primary keys) and referential integrity (foreign keys) with examples.
Look for understanding that integrity constraints prevent inconsistent data entry, e.g., orphan records.

Examiner Tips

Expert advice for maximising your marks

💡Always test your SQL queries on sample data before finalizing; predict the expected output to verify logic.
💡Remember the standard SQL execution order: FROM → JOIN → WHERE → GROUP BY → HAVING → SELECT → ORDER BY.
💡When working with GROUP BY, double-check that every column in the SELECT list is either in the GROUP BY clause or an argument to an aggregate function.
💡Use aliases for table names in multi-table queries to improve readability and avoid ambiguity.
💡Always begin by identifying all candidate keys and clearly marking the primary key for each relation
💡Systematically check each non-key attribute against the key to verify it is fully functionally dependent before progressing to the next normal form
💡Show all working stages (UNF→1NF→2NF→3NF) with relation outlines and dependencies labelled to secure method marks even if the final outcome is imperfect
💡When defining ACID, always accompany each property with a brief, concrete example (e.g., a bank transfer for atomicity).
💡In extended responses, compare the ACID approach with BASE (Basically Available, Soft state, Eventually consistent) for deeper analysis.
💡Use precise terminology: say 'all-or-nothing' for atomicity, 'serializable' for strict isolation, and 'non-volatile storage' for durability.
💡For scenario questions, identify which property is at risk and suggest a practical mechanism to enforce it.
💡In written answers, always provide an example when defining a term, e.g., 'A primary key, such as StudentID, uniquely identifies each record in the Students table.'
💡When drawing relationships, clearly mark the primary key and foreign key in each table.
💡For many-to-many relationships, always show the intermediate link table with composite primary key.
💡Revise ER modelling notation and practice translating between textual descriptions and schema diagrams.
💡When asked to describe a security measure, always link it to the CIA triad (Confidentiality, Integrity, Availability) in your answer.
💡Use precise technical terminology: e.g., specify 'role-based access control (RBAC)' rather than just 'access levels'.
💡For questions about backup strategies, draw a timeline showing full and incremental backups and explain recovery point objective (RPO) and recovery time objective (RTO).
💡In scenarios, always apply integrity constraints to given data sets and demonstrate how they prevent anomalies.
💡If asked to evaluate, balance benefits with drawbacks; e.g., encryption adds processing overhead but protects against data breaches.
💡When writing SQL queries, always specify the columns you need instead of using SELECT * to show you understand projection and to avoid unnecessary data retrieval.
💡In ER diagrams, clearly label relationship cardinalities (1:1, 1:M, M:N) and ensure entities have appropriate attributes. Many-to-many relationships must be resolved into two one-to-many relationships using a linking table.
💡For normalisation questions, start by identifying the primary key and functional dependencies. Check for partial dependencies (2NF) and transitive dependencies (3NF). Show your working step by step to gain method marks.

Common Mistakes

Pitfalls to avoid in your exam answers

Confusing WHERE and HAVING: applying HAVING for row-level filters rather than group-level filters.
Omitting GROUP BY for non-aggregated columns when using aggregate functions, leading to logical errors.
Incorrect JOIN conditions resulting in Cartesian products or mismatches; forgetting the ON clause.
Using DELETE or UPDATE without a WHERE clause, accidentally modifying or removing all rows.
Misunderstanding the order of SQL clause execution, leading to illogical queries (e.g., referencing aliases in WHERE before they are defined).
Confusing partial dependency with transitive dependency, leading to incorrect normalisation steps
Failing to create separate tables for repeating groups, instead flattening data into a single wide table
Leaving transitive dependencies unresolved by not moving non-key attributes into new relations
Incorrectly assuming a single attribute can be a composite key without considering functional dependency rules
Confusing atomicity with isolation—assuming that atomic transactions automatically prevent interference from other transactions.
Believing consistency only refers to data correctness before the transaction, not maintaining all defined rules and constraints afterwards.
Treating durability as being solely about regular backups, rather than persistence of committed changes despite immediate failures.
Misunderstanding isolation levels by assuming the highest level is always required without considering performance implications.
Confusing the terms 'field' and 'record', often interchanging them in descriptions.
Assuming a primary key can contain null values or duplicates.
Misidentifying foreign keys as attributes that must be unique across the table.
Believing that a relational database cannot represent many-to-many relationships directly (without a junction table).
Confusing authentication with authorization, or thinking that encryption alone ensures data integrity.
Assuming that backups are only about data duplication, without considering frequency, storage location, and testing restoration.
Failing to mention that integrity constraints must be enforced at the DBMS level, not just in application logic.
Misunderstanding that encryption protects data from unauthorized viewing but not from deletion or modification (unless combined with hashing or digital signatures).
Overlooking the importance of physical security and network security in conjunction with database access controls.
Misconception: A primary key can be NULL. Correction: A primary key must be unique and not NULL; it uniquely identifies each record.
Misconception: Normalisation always improves performance. Correction: Normalisation reduces redundancy but can increase the number of joins, which may slow down queries. Denormalisation is sometimes used for performance.
Misconception: SQL is case-insensitive for everything. Correction: SQL keywords are case-insensitive, but string comparisons are case-sensitive in many DBMS (e.g., MySQL depends on collation).

Frequently Asked Questions

Common questions students ask about this topic

Before You Start

Prior knowledge that will help with this topic

•Basic understanding of data types and structures (e.g., arrays, records).
•Familiarity with Boolean logic and set theory (e.g., union, intersection) as SQL queries often involve logical conditions.
•Knowledge of file organisation and data storage concepts (e.g., sequential vs. random access) helps contextualise why databases are efficient.

Key Terminology

Essential terms to know

Data retrieval and filtering
Data modification operations
Table joins and relationships
Aggregation and grouped analysis
Conditional filtering on groups
Data redundancy elimination
Functional dependency analysis
Normal forms (1NF, 2NF, 3NF)
Update anomaly prevention
Atomicity and all-or-nothing execution
Consistency constraints and integrity
Isolation levels and concurrency
Durability through logging and recovery
Commit and rollback mechanisms
Data tables and attributes
Primary and foreign keys
Entity relationships
Referential integrity
Normalization basics
Access control mechanisms
Data encryption and confidentiality
Backup and recovery strategies
Integrity constraints enforcement
Security threats and mitigation

Ready to test yourself?

Practice questions tailored to this topic

Fundamentals of databases

Subtopics in this area

Topic Overview

Key Concepts

Learning Objectives

Marking Points

Examiner Tips

Common Mistakes

Frequently Asked Questions

Before You Start

Key Terminology

Ready to test yourself?

Related Topics in AQA A-Level Computer Science

E2E stub concept

Theory of computation

Fundamentals of computer organisation and architecture

Systematic approach to problem solving

Topic Synopsis

Key Concepts & Core Principles

Exam Tips & Revision Strategies

Common Misconceptions & Mistakes to Avoid

Examiner Marking Points

Fundamentals of databases

Subtopics in this area

Topic Overview

Key Concepts

Learning Objectives

Marking Points

Examiner Tips

Common Mistakes

Frequently Asked Questions

What is the difference between a primary key and a foreign key?

How do I normalise a database to 3NF?

What SQL command do I use to join two tables?

Why is normalisation important?

What are ACID properties in databases?

How do I draw an ER diagram for a school database?

Before You Start

Key Terminology

Ready to test yourself?

Related Topics in AQA A-Level Computer Science

E2E stub concept

Theory of computation

Fundamentals of computer organisation and architecture

Systematic approach to problem solving