Topic Synopsis
This topic covers the processing, representation, and analysis of data within the statistical enquiry cycle. It includes the use of various diagrams, statistical measures of central tendency and dispersion, correlation, time series, and estimation techniques to interpret data sets and draw valid conclusions.
Key Concepts & Core Principles
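- Measures of central tendency: mean (sum of values divided by number of values), median (middle value when the data are ordered), mode (most frequent value). For grouped data, estimate the mean by multiplying each class midpoint by its frequency, summing the results, and dividing by the total frequency.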
- Measures of spread: range (max - min), interquartile range (IQR = Q3 - Q1), and standard deviation (measure of dispersion around the mean). Know how to calculate these from raw data and frequency tables.
- Data representation: histograms (for continuous data; the area of each bar is proportional to frequency, so the vertical axis shows frequency density), box plots (show the median, quartiles, lowest and highest values, and any outliers), cumulative frequency graphs (for estimating the median and quartiles), and stem-and-leaf diagrams (retain the original data values).
- Outliers: values that are more than 1.5 × IQR above Q3 or more than 1.5 × IQR below Q1. Understand how to identify and handle outliers (e.g., investigate the cause, then exclude only if justified).
- Comparing distributions: use back-to-back stem-and-leaf diagrams or parallel box plots to compare two data sets. Comment on typical values (median) and spread (IQR or range).
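The measures listed above can be checked with a short script. This is a minimal sketch, not a prescribed method: the function names `summarise` and `grouped_mean` are illustrative, the quartiles use Python's "inclusive" interpolation (which may differ slightly from the rule taught on your course), and the standard deviation uses the divisor n.

```python
import statistics


def summarise(values):
    """Return the summary statistics and any outliers for raw data."""
    ordered = sorted(values)
    # Assumption: "inclusive" quartiles; other quartile rules can give
    # slightly different values for small data sets.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(ordered),
        "median": q2,
        "mode": statistics.mode(ordered),
        "range": ordered[-1] - ordered[0],
        "iqr": iqr,
        # pstdev divides by n (population form of the standard deviation)
        "std_dev": statistics.pstdev(ordered),
        "outliers": [v for v in ordered if v < lower_fence or v > upper_fence],
    }


def grouped_mean(classes):
    """Estimate the mean from (lower, upper, frequency) classes using midpoints."""
    total_frequency = sum(f for _, _, f in classes)
    weighted_sum = sum((lower + upper) / 2 * f for lower, upper, f in classes)
    return weighted_sum / total_frequency


summary = summarise([2, 4, 4, 5, 7, 9, 30])
print(summary["median"], summary["iqr"], summary["outliers"])
print(grouped_mean([(0, 10, 2), (10, 20, 6), (20, 30, 2)]))
```

With the made-up data above, Q1 = 4 and Q3 = 8, so the outlier boundaries are 4 - 6 = -2 and 8 + 6 = 14, and the value 30 is flagged as an outlier.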
Exam Tips & Revision Strategies
- Always check the axis labels and scales on diagrams to avoid misinterpretation
- Ensure the correct formula is used for standard deviation and frequency density
- When comparing data sets, always use both a measure of central tendency and a measure of dispersion
- State assumptions clearly when using models like the binomial or normal distribution
- Use the context of the problem to justify your choice of statistical measure or diagram
- Remember that correlation does not imply causation
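For reference, the two formulas mentioned in the tips above are usually written as follows. The notation is an assumption (x for a data value, n for the number of values, x̄ for the mean), and some specifications use a divisor of n - 1 for sample data, so check the formula sheet you are working from.

```latex
\[
\bar{x} = \frac{\sum x}{n}, \qquad
\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}
       = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}
\]
\[
\text{frequency density} = \frac{\text{frequency}}{\text{class width}}
\]
```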
Common Misconceptions & Mistakes to Avoid
- Confusing independent and dependent variables on scatter diagrams
- Inappropriate pairing of measures of central tendency and dispersion (e.g., mean with IQR; pair the mean with standard deviation and the median with IQR)
- Misinterpreting correlation as causation
- Errors in constructing histograms, particularly with unequal class widths
- Incorrectly identifying or handling outliers
- Misuse of technology for data representation
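Histogram errors with unequal class widths usually come from plotting frequency instead of frequency density. The sketch below, using a hypothetical helper `frequency_densities` and made-up classes, shows the calculation and confirms that bar area (width × height) recovers the frequency.

```python
def frequency_densities(classes):
    """Return the bar height for each (lower, upper, frequency) class."""
    return [frequency / (upper - lower) for lower, upper, frequency in classes]


# Made-up classes with widths 10, 20 and 5
classes = [(0, 10, 5), (10, 30, 10), (30, 35, 5)]
heights = frequency_densities(classes)

for (lower, upper, frequency), height in zip(classes, heights):
    area = (upper - lower) * height  # bar area should equal the frequency
    print(f"{lower}-{upper}: height {height}, area {area}, frequency {frequency}")
```

Here the first two classes have different frequencies (5 and 10) but the same bar height, because the second class is twice as wide.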
Examiner Marking Points
- Correct use of statistical terminology and notation
- Accurate construction and interpretation of diagrams and visualisations
- Correct calculation of summary statistics including mean, median, mode, range, IQR, and standard deviation
- Appropriate selection of statistical measures and representations based on data type and context
- Correct identification and handling of outliers
- Accurate interpretation of correlation and regression lines
- Correct application of probability laws and distributions
- Justification of statistical methods and conclusions within the enquiry cycle
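To practise interpreting correlation and regression lines, it can help to compute them for a small data set. This is a sketch under stated assumptions: it uses the product-moment correlation coefficient and the least-squares line of y on x, which may not be the exact methods required by your specification, and the function names and data are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def regression_line(xs, ys):
    """Return (gradient, intercept) of the least-squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    # The least-squares line passes through the mean point (mean_x, mean_y)
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]  # made-up data lying exactly on y = 2x + 1
print(pearson_r(xs, ys))
print(regression_line(xs, ys))
```

A coefficient close to 1 or -1 indicates a strong linear relationship, but it says nothing about whether one variable causes the other.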