Statistics Revision Notes

    Subject: Mathematics | Level: GCSE | Exam Board: AQA

    Statistics is the science of collecting, analysing, and interpreting data. It is a highly practical, mark-friendly topic that is examined at both GCSE and A-Level Mathematics.

    Key Terms & Definitions

    Continuous Data
    Data that can take any numerical value within a given range, typically obtained by measuring.
    Mutually Exclusive Events
    Events that cannot occur at the same time.
    Independent Events
    Events where the outcome of one does not affect the probability of the other.
    Interquartile Range (IQR)
    The difference between the upper quartile and the lower quartile, representing the spread of the middle 50% of the data.
    Frequency Density
    The frequency divided by the class width, used as the y-axis on a histogram.
    Outlier
    An extreme value that lies significantly outside the overall pattern of the data.

    Study Notes

    Overview

    Statistics: Making sense of data.

    Welcome to the study of Statistics. This topic is about how we make sense of the world through data. From predicting the weather to analysing medical trials, statistics allows us to find patterns, make informed decisions, and understand probability.

    In your Mathematics examinations, Statistics is a core component that tests your ability to handle data systematically. Examiners are looking for your capability to choose appropriate methods, draw accurate diagrams, and calculate measures of central tendency and spread. Crucially, they also test your ability to interpret these results in context.

    Statistics connects deeply with other areas of mathematics, particularly Algebra (when manipulating formulas) and Number (when calculating percentages and fractions). Exam questions often blend these skills, asking you to calculate a mean from a grouped frequency table and then express a resulting probability as a fraction in its simplest form.

    Key Concepts

    Concept 1: Types of Data

    Before you can analyse data, you must classify it. The type of data dictates which statistical tools and diagrams you can use.

    Classification of data types.

    Data is broadly split into qualitative (categorical) and quantitative (numerical). Quantitative data is further divided into discrete and continuous.

    Discrete data can only take specific, exact values. Think of it as data you count. For example, the number of siblings someone has, or shoe sizes. You cannot have 2.4 siblings.

    Continuous data can take any value within a range. Think of it as data you measure. For example, a person's height, weight, or the time taken to run a race. A height could be 175cm, 175.2cm, or 175.24cm, depending on the accuracy of your measuring tool.

    Example: If an exam question asks you to represent the heights of 50 plants, you must recognise this as continuous data and use a histogram or cumulative frequency graph, not a bar chart.

    Concept 2: Measures of Central Tendency (Averages)

    Examiners frequently test your understanding of the mean, median, and mode. You must not only know how to calculate them but also when to use them.

    Measures of central tendency and spread.

    • Mean: The sum of all values divided by the number of values. It uses all the data, making it a strong measure, but it is easily skewed by extreme values (outliers).
    • Median: The middle value when the data is ordered by size. It is excellent for skewed data because it is not affected by outliers.
    • Mode: The most frequently occurring value. It is the only average that can be used for qualitative data (e.g., the most popular car colour).

    Example: A dataset of house prices is £200k, £210k, £220k, £230k, and £5 million. The mean is £1,172k, which is higher than four of the five prices because it is heavily skewed by the £5m house, making it unrepresentative. The median (£220k) is a much better average to use here.
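
The house price example can be checked with a short Python sketch. It uses the standard library `statistics` module; the variable names are illustrative and not part of the original notes.

```python
from statistics import mean, median

# House prices from the example, in thousands of pounds (5000 = £5 million).
prices_k = [200, 210, 220, 230, 5000]

mean_price = mean(prices_k)      # uses every value, so the £5m house pulls it up
median_price = median(prices_k)  # middle value of the ordered list

print(f"Mean:   £{mean_price}k")
print(f"Median: £{median_price}k")
```

The mean comes out at 1172 (that is, £1,172,000), while the median stays at 220 (£220,000), which is why the median is the more representative average here.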

    Concept 3: Measures of Spread

    Averages only tell half the story. You also need to know how spread out the data is.

    • Range: The highest value minus the lowest value. Like the mean, it is vulnerable to outliers.
    • Interquartile Range (IQR): The difference between the upper quartile (Q3) and the lower quartile (Q1). This measures the spread of the middle 50% of the data, ignoring extreme values.
    • Standard Deviation (Higher Tier / A-Level): A measure of how far, on average, each data point is from the mean. A low standard deviation means the data is consistent and tightly clustered around the mean.
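
As a rough illustration of the three measures of spread, the sketch below works on a made-up list of seven test scores (the data is an assumption, chosen only so the numbers come out cleanly). It assumes the quartiles sit at the (n + 1)/4-th and 3(n + 1)/4-th values, which is what `statistics.quantiles` does with `method="exclusive"`. Your exam board's convention for small data sets may differ slightly.

```python
from statistics import pstdev, quantiles

# Hypothetical test scores, already in ascending order.
scores = [52, 56, 60, 64, 68, 72, 76]

# Range: highest value minus lowest value.
data_range = max(scores) - min(scores)

# Quartiles: with method="exclusive", Q1 is the (n + 1)/4-th value
# and Q3 is the 3(n + 1)/4-th value of the ordered data.
q1, q2, q3 = quantiles(scores, n=4, method="exclusive")
iqr = q3 - q1

# Population standard deviation: the square root of the mean
# squared distance of each value from the mean.
sd = pstdev(scores)

print(data_range, q1, q3, iqr, sd)
```

For this data the range is 24, the quartiles are 56 and 72 (so the IQR is 16), and the standard deviation is 8.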

    Concept 4: Probability

    Probability measures the likelihood of an event occurring, on a scale from 0 (impossible) to 1 (certain).

    Probability scale and event combinations.

    Key rules to remember:

    • Complement Rule: P(Not A) = 1 - P(A). The probability of an event not happening is 1 minus the probability of it happening.
    • Mutually Exclusive Events: Events that cannot happen at the same time. P(A or B) = P(A) + P(B).
    • Independent Events: The outcome of one event does not affect the probability of the other. P(A and B) = P(A) × P(B).

    Example: The probability of rolling a 6 on a fair die is 1/6. The probability of rolling a 6 twice in a row (independent events) is 1/6 × 1/6 = 1/36.
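
The three rules can be tried out with exact fractions in Python. This is only a sketch: the "1 or 2" event is an extra illustration that is not in the notes, and the final lines cross-check the double-six answer by listing all 36 equally likely outcomes of two rolls.

```python
from fractions import Fraction
from itertools import product

p_six = Fraction(1, 6)

# Complement rule: P(not 6) = 1 - P(6)
p_not_six = 1 - p_six

# Mutually exclusive events: one roll cannot show both a 1 and a 2,
# so P(1 or 2) = P(1) + P(2)
p_one_or_two = Fraction(1, 6) + Fraction(1, 6)

# Independent events: the first roll does not affect the second,
# so P(6 and 6) = P(6) x P(6)
p_double_six = p_six * p_six

# Cross-check by counting outcomes of two rolls.
outcomes = list(product(range(1, 7), repeat=2))
double_sixes = [o for o in outcomes if o == (6, 6)]
p_double_six_counted = Fraction(len(double_sixes), len(outcomes))

print(p_not_six, p_one_or_two, p_double_six, p_double_six_counted)
```

Using `Fraction` keeps each answer as a fraction in its simplest form, which is how exam answers are usually expected.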

    Mathematical/Scientific Relationships

    • Mean (Ungrouped Data): $\bar{x} = \frac{\sum x}{n}$ (Sum of values divided by number of values)
    • Mean (Grouped Data): $\bar{x} = \frac{\sum fx}{\sum f}$ (Sum of frequency × midpoint, divided by total frequency)
    • Frequency Density (Histograms): $\text{Frequency Density} = \frac{\text{Frequency}}{\text{Class Width}}$
    • Interquartile Range: $\text{IQR} = Q_3 - Q_1$
    • Probability of A or B (Mutually Exclusive): $P(A \cup B) = P(A) + P(B)$
    • Probability of A and B (Independent): $P(A \cap B) = P(A) \times P(B)$
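
To see the grouped mean and frequency density formulas in action, here is a minimal sketch using a made-up grouped frequency table (the class boundaries and frequencies are assumptions for illustration only). Because the midpoint stands in for every value in a class, the result is an estimate of the mean rather than the exact mean.

```python
# Hypothetical grouped frequency table: (lower bound, upper bound, frequency)
table = [(0, 10, 4), (10, 20, 6), (20, 40, 10)]

# Grouped mean: sum of (frequency x midpoint) divided by total frequency.
total_f = sum(f for _, _, f in table)
total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in table)
estimated_mean = total_fx / total_f

# Frequency density: frequency divided by class width, one value per class.
frequency_densities = [f / (hi - lo) for lo, hi, f in table]

print(estimated_mean)
print(frequency_densities)
```

Here the midpoints are 5, 15 and 30, so the sum of fx is 20 + 90 + 300 = 410 and the estimated mean is 410 ÷ 20 = 20.5. The last class is twice as wide as the others, so its frequency density (0.5) is lower than that of the middle class (0.6) even though its frequency is higher.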

    Practical Applications

    Statistics is used extensively in the real world. Actuaries use probability to calculate insurance premiums. Medical researchers use clinical trials and statistical significance to determine if a new drug is effective. Quality control engineers use standard deviation to ensure manufacturing processes are consistent. Understanding statistics makes you a more critical consumer of news and data in daily life.

    Practice Questions

    Q1

    A group of 20 students took a maths test. The median score was 65 and the interquartile range was 12. A second group of 20 students took the same test. Their median score was 72 and their interquartile range was 8. Compare the performance of the two groups. (2 marks)

    2 marks
    standard

    Hint: You need to make two distinct comparison statements: one about the average and one about the spread. Relate it back to the context of the test.

    Q2

    A biased coin is flipped twice. The probability of getting a Head on any flip is 0.7. Calculate the probability of getting exactly one Head and one Tail. (3 marks)

    3 marks
    standard

    Hint: There are two ways to get exactly one Head and one Tail: (Head, Tail) OR (Tail, Head). Calculate the probability for each pathway and add them together.
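
Once you have attempted the question, one way to check your working is the sketch below. It follows the two-pathway method from the hint and assumes the two flips are independent; the variable names are illustrative.

```python
from fractions import Fraction

p_head = Fraction(7, 10)
p_tail = 1 - p_head  # complement rule

# Two pathways give exactly one Head and one Tail.
p_head_then_tail = p_head * p_tail
p_tail_then_head = p_tail * p_head

# The pathways are mutually exclusive, so add them.
p_exactly_one_head = p_head_then_tail + p_tail_then_head

print(p_exactly_one_head, float(p_exactly_one_head))
```

Each pathway has probability 0.21, giving 0.42 in total.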

    Q3

    The cumulative frequency graph shows the weights of 80 apples. The median weight is 120g, the lower quartile is 105g, and the upper quartile is 135g. The lightest apple is 90g and the heaviest is 160g. Draw a box plot to represent this data. (3 marks)

    3 marks
    standard

    Hint: A box plot requires 5 key values: minimum, lower quartile, median, upper quartile, and maximum.

    Q4

    A factory produces lightbulbs. The probability that a lightbulb is defective is 0.02. A sample of 500 lightbulbs is tested. Calculate an estimate for the number of defective lightbulbs in the sample. (2 marks)

    2 marks
    foundation

    Hint: Multiply the probability of a single event by the number of trials.
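
The method in the hint can be sketched in a couple of lines. This is an expected-value estimate, not a guarantee of what any one sample will contain.

```python
from fractions import Fraction

p_defective = Fraction(2, 100)  # 0.02 written as an exact fraction
sample_size = 500

# Expected number = probability of a single event x number of trials
expected_defective = p_defective * sample_size

print(expected_defective)
```

The estimate is 10 defective lightbulbs.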

    Q5

    Explain why the median is a more appropriate average than the mean for a dataset of salaries in a company where the CEO earns £2,000,000 and the other 50 employees earn around £30,000. (1 mark)

    1 mark
    foundation

    Hint: Think about how extreme values affect different types of averages.
