How do I convert a negative denary number to binary using two's complement?

First, write the positive version of the number in binary (e.g., +5 = 0101 in 4 bits). Then invert all bits (so 0101 becomes 1010). Finally, add 1 to the result: 1010 + 1 = 1011. That's −5 in two's complement. The most significant bit (leftmost) is 1, indicating a negative number. To convert back, invert and add 1 again.
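The three steps above can be sketched in Python. This is a minimal illustration rather than a standard routine: the function names `to_twos_complement` and `from_twos_complement` are made up for this example, and the bit width is assumed to be passed in explicitly.

```python
def to_twos_complement(value: int, bits: int) -> str:
    """Return the two's complement bit pattern of value in the given width."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    if value >= 0:
        return format(value, f"0{bits}b")
    positive = format(-value, f"0{bits}b")                  # step 1: positive version
    inverted = "".join("1" if b == "0" else "0" for b in positive)  # step 2: invert bits
    result = (int(inverted, 2) + 1) % (1 << bits)           # step 3: add 1
    return format(result, f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Interpret a bit string as a two's complement integer."""
    value = int(pattern, 2)
    if pattern[0] == "1":            # MSB set, so the number is negative
        value -= 1 << len(pattern)
    return value


print(to_twos_complement(-5, 4))     # 1011
print(from_twos_complement("1011"))  # -5
```

The range check matters: with 4 bits only −8 to +7 can be stored, so a value such as +8 is rejected rather than silently wrapped.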

What is the difference between lossy and lossless compression?

Lossless compression reduces file size without losing any data, so the original can be perfectly reconstructed. Examples include ZIP files and PNG images. Lossy compression permanently removes some data, usually by discarding less important information, resulting in a smaller file but lower quality. Examples include JPEG images and MP3 audio. Lossy is used when a perfect copy isn't needed, like streaming music.
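To see what "perfectly reconstructed" means in practice, here is a small sketch of run-length encoding, one of the lossless methods covered in the data compression subtopic. The function names and the (count, symbol) output format are assumptions chosen for this example; exam questions may use a different format.

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[int, str]]:
    """Replace each run of identical symbols with a (count, symbol) pair."""
    return [(len(list(run)), symbol) for symbol, run in groupby(text)]


def rle_decode(pairs: list[tuple[int, str]]) -> str:
    """Rebuild the original text by repeating each symbol count times."""
    return "".join(symbol * count for count, symbol in pairs)


original = "AAAABBBCCD"
encoded = rle_encode(original)
print(encoded)                          # [(4, 'A'), (3, 'B'), (2, 'C'), (1, 'D')]
print(rle_decode(encoded) == original)  # True: nothing was lost
```

A lossy method would fail the final check, because the discarded data cannot be recovered from the compressed form.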

Why do computers use binary instead of decimal?

Computers use binary because it's simple and reliable for electronic circuits. Transistors have two states: on (1) and off (0), making binary easy to implement. Decimal would require circuits that can distinguish ten different voltage levels, which is harder to make reliable and more power‑hungry. Binary also aligns with Boolean logic, which is the foundation of all digital computation.

How do I normalise a floating‑point number?

Normalisation ensures the mantissa is in the form 0.1xxx (positive) or 1.0xxx (negative) in two's complement. For a positive mantissa, shift left until the first bit after the sign is 1, decreasing the exponent by 1 for each shift. For a negative mantissa, shift left until the first bit after the sign is 0, again decreasing the exponent. If the mantissa is already normalised, leave it. Example: 0.00101 × 2^3 becomes 0.101 × 2^1 after shifting left twice.
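The shifting rule can be sketched as follows. This is an illustration under stated assumptions: the mantissa is held as a bit string whose first bit is the sign bit with the binary point immediately after it, the exponent is kept as an ordinary Python integer instead of a two's complement bit pattern, and the names `normalise` and `value` are invented for the example.

```python
def normalise(mantissa: str, exponent: int) -> tuple[str, int]:
    """Shift left until the first two bits differ (01... or 10...)."""
    if "1" not in mantissa:
        return mantissa, exponent        # zero has no normalised form
    while mantissa[0] == mantissa[1]:
        mantissa = mantissa[1:] + "0"    # shift left, filling with 0 on the right
        exponent -= 1                    # compensate so the value is unchanged
    return mantissa, exponent


def value(mantissa: str, exponent: int) -> float:
    """Denary value of a two's complement mantissa times 2**exponent."""
    raw = int(mantissa, 2)
    if mantissa[0] == "1":
        raw -= 1 << len(mantissa)
    return raw / (1 << (len(mantissa) - 1)) * 2 ** exponent


print(normalise("000101", 3))   # ('010100', 1), i.e. 0.00101 x 2^3 -> 0.101 x 2^1
print(value("000101", 3), value("010100", 1))   # 1.25 1.25
```

Checking the value before and after is a useful habit in exam working too: normalisation changes the bit pattern but never the number being represented.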

What is the Nyquist theorem and why is it important for sound?

The Nyquist theorem states that to accurately capture a sound wave, the sample rate must be at least twice the highest frequency present in the signal. For example, human hearing goes up to about 20 kHz, so CD quality uses 44.1 kHz (just over double). If you sample below the Nyquist rate, you get aliasing—distortion where high frequencies appear as lower false frequencies. That's why sample rate matters for faithful digital audio.
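A short sketch can make aliasing concrete. It assumes a single pure tone and uses the standard folding rule for sampled signals; the function names are hypothetical and frequencies are given in hertz.

```python
def nyquist_rate(max_frequency_hz: int) -> int:
    """Minimum sample rate needed to capture frequencies up to the maximum."""
    return 2 * max_frequency_hz


def alias_frequency(signal_hz: int, sample_rate_hz: int) -> int:
    """Frequency a pure tone appears to have once it has been sampled."""
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(nyquist_rate(20_000))             # 40000
print(alias_frequency(10_000, 44_100))  # 10000: below half the sample rate, unchanged
print(alias_frequency(30_000, 44_100))  # 14100: a false lower frequency
```

The 30 kHz tone is above half of 44.1 kHz, so it cannot be captured faithfully and shows up as 14.1 kHz instead.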

How do I calculate the file size of a bitmap image?

File size (in bits) = width in pixels × height in pixels × colour depth (bits per pixel). For example, a 1920×1080 image with 24‑bit colour (true colour) gives 1920 × 1080 × 24 = 49,766,400 bits. Divide by 8 to get bytes: 6,220,800 bytes ≈ 6.22 MB. Remember that this is the raw size; actual file formats (like JPEG) use compression to reduce it.
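The same calculation can be written as a small function, which is handy for checking practice answers. The function name is made up for this sketch, and it deliberately ignores headers and other metadata.

```python
def bitmap_size_bits(width: int, height: int, colour_depth: int) -> int:
    """Raw bitmap size in bits: pixels multiplied by bits per pixel."""
    return width * height * colour_depth


bits = bitmap_size_bits(1920, 1080, 24)
size_bytes = bits // 8

print(bits)                     # 49766400
print(size_bytes)               # 6220800
print(size_bytes / 1_000_000)   # 6.2208 (MB, using 1 MB = 1,000,000 bytes)
```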

Fundamentals of data representation

AQA

A-Level

Encryption is a fundamental technique for securing data confidentiality and integrity in digital systems. This topic covers the principles and mechanisms of symmetric encryption, where a single shared key is used for both encryption and decryption (e.g., AES), and asymmetric encryption, which employs a public-private key pair (e.g., RSA). Understanding the mathematical underpinnings, practical applications, and security implications of these algorithms is essential for designing secure communication protocols and protecting sensitive information.

Subtopics in this area

Encryption

Number systems

Floating point numbers

Representation of images

Representation of sound

Character encoding

Data compression

Bitwise manipulation and masks

Binary arithmetic and representation

Topic Overview

Fundamentals of data representation is the bedrock of computer science, exploring how all digital information—numbers, text, images, sound, and instructions—is stored and manipulated using binary. You'll learn why computers use base‑2, how to convert between binary, denary, and hexadecimal, and how these systems underpin everything from memory addresses to colour codes. This topic is essential because it explains the 'language' of computers, enabling you to understand data storage limits, compression, and error detection.

Beyond simple conversions, you'll dive into how different data types are represented: signed and unsigned integers (including two's complement for negatives), fixed‑point and floating‑point real numbers (with mantissa and exponent), and character sets like ASCII and Unicode. You'll also explore how bitmap images are encoded using pixels and colour depth, and how sound is digitised through sampling, bit depth, and sample rate. These concepts directly link to file sizes, compression algorithms (lossy vs lossless), and the trade‑offs between quality and storage.

Mastering this topic is crucial for A‑Level success because it appears in multiple contexts: from calculating file sizes in the theory exam to understanding how data is stored in databases and transmitted over networks. It also builds a foundation for more advanced topics like encryption, data structures, and machine architecture. By the end, you should be able to confidently perform binary arithmetic, explain how real numbers are approximated, and justify the choice of representation for a given scenario.

Key Concepts

Core ideas you must understand for this topic

→Binary, denary, and hexadecimal conversions: Be able to convert between all three bases fluently, including using hex as a shorthand for binary (e.g., 0x1A = 00011010).
→Two's complement for signed integers: Understand that the most significant bit (MSB) is the sign bit, and how to negate a number (invert all bits and add 1).
→Floating‑point representation: Know the structure of mantissa and exponent (both in two's complement), and how to normalise a number to maximise precision.
→Character encoding: Distinguish between ASCII (7‑bit, 128 characters) and Unicode (e.g., UTF‑8, UTF‑16), and why Unicode is needed for global text.
→Bitmap images and sound: Understand how resolution, colour depth, and sample rate affect file size, and the difference between lossy and lossless compression.
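As an illustration of the character encoding point above, the sketch below counts how many bytes a string needs in UTF‑8 and UTF‑16. The helper name `encoded_sizes` is an assumption for this example; `utf-16-le` is used so that the byte-order mark Python would otherwise add is not counted.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Count code points and the bytes needed in UTF-8 and UTF-16."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_bytes": len(text.encode("utf-16-le")),  # no byte-order mark
    }


for sample in ["A", "é", "€", "😀", "hello"]:
    print(sample, encoded_sizes(sample))
```

Plain English text takes one byte per character in UTF‑8 but two in UTF‑16, while a character outside the Basic Multilingual Plane, such as an emoji, takes four bytes in both.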

Learning Objectives

What you need to know and understand

Explain the operational principles of symmetric and asymmetric encryption.
Compare symmetric and asymmetric encryption in terms of speed, key length, and security.
Analyse how hybrid encryption combines the strengths of both approaches for efficient secure communication.
Evaluate the importance of key size and computational complexity in resisting brute force and cryptographic attacks.
Apply encryption concepts to justify the use of digital signatures and certificates in secure web browsing.
Convert accurately between binary, decimal, hexadecimal, and octal number systems.
Explain why hexadecimal is used in computing as a more human-readable representation of binary.
Apply grouping techniques to efficiently convert between binary and hexadecimal or octal.
Compare the storage efficiency of different number bases in digital systems.
Explain the need for floating point representation in computer systems.
Apply the IEEE 754 standard to convert denary numbers into binary floating point format.
Analyse the effects of limited mantissa bits on calculation precision.
Evaluate the trade-offs between range and precision in floating point representation.
Describe the structure of a bitmap image in terms of pixels and colour depth.
Explain how vector graphics represent shapes using mathematical primitives.
Calculate bitmap file sizes given image dimensions and colour depth.
Compare the scalability and file size implications of bitmap and vector representations.
Evaluate the suitability of bitmap versus vector formats for different scenarios.
Represent sound as digital samples (sampling rate, bit depth).
Explain the relationship between sampling rate, frequency range, and sound quality using the Nyquist theorem.
Calculate the file size of an uncompressed monophonic or stereo sound file.
Evaluate the impact of bit depth on dynamic range and signal-to-noise ratio.
Describe the steps involved in ADC for sound, including sampling and quantisation.
Define a character set and explain the need for standardised encoding in digital communication.
Describe the 7-bit ASCII encoding scheme and its range of representable characters.
Identify the limitations of ASCII for internationalisation and multilingual support.
Explain the concept of Unicode as a universal character set with over a million code points.
Distinguish between code points, code units, and encoding forms in Unicode.
Compare UTF-8 and UTF-16 encoding mechanisms, focusing on byte usage and ASCII compatibility.
Analyse scenarios where UTF-8 or UTF-16 is more appropriate based on data characteristics.
Evaluate the impact of character encoding choices on data size, processing speed, and cross-platform compatibility.
Explain the difference between lossless and lossy compression, including their respective advantages and disadvantages.
Apply run-length encoding to compress and decompress a given sequence of symbols.
Construct a Huffman tree from a frequency table and derive the corresponding Huffman codes for each symbol.
Calculate and interpret compression ratios for different compression techniques.
Evaluate the suitability of lossy compression for different types of media, considering factors such as perceptibility and data fidelity.
Apply bitwise AND, OR, XOR, and NOT operations to binary sequences.
Use logical and arithmetic shift operations to manipulate binary data.
Construct masks to isolate, set, clear, or toggle specific bits within a byte or word.
Evaluate the efficiency of bitwise techniques over conventional arithmetic for specific tasks (e.g., multiplication/division by powers of two).
Debug bit-level code to identify errors in mask application or operator precedence.
Represent given positive and negative decimal integers in both sign-magnitude and two's complement binary forms.
Perform binary addition and subtraction using two's complement arithmetic, including the handling of carry and overflow.
Analyse the range of representable numbers for a given number of bits in sign-magnitude versus two's complement formats.
Evaluate the suitability of sign-magnitude and two's complement representations for efficient arithmetic in computer systems.
Identify and correct common misconceptions regarding negative zero and overflow in signed binary arithmetic.
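For the Huffman objective in the list above, the following is one possible sketch of building the tree from a frequency table and reading off the codes. It is not the only valid approach: the function name is invented, the left branch is labelled 0 and the right branch 1 by assumption, and ties are broken by insertion order, so a hand-drawn tree may give different but equally valid codes.

```python
import heapq
from itertools import count


def huffman_codes(frequencies: dict[str, int]) -> dict[str, str]:
    """Build a Huffman tree bottom-up and return a code for each symbol."""
    tie = count()  # tie-breaker so heapq never has to compare tree nodes
    heap = [(freq, next(tie), symbol) for symbol, freq in frequencies.items()]
    heapq.heapify(heap)
    if not heap:
        return {}
    if len(heap) == 1:
        return {heap[0][2]: "0"}   # a single symbol still needs one bit
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # lowest frequency
        f2, _, right = heapq.heappop(heap)   # second lowest frequency
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))

    codes: dict[str, str] = {}

    def walk(node, prefix: str) -> None:
        if isinstance(node, tuple):          # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: a symbol
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


table = {"A": 5, "B": 2, "C": 1, "D": 1}
codes = huffman_codes(table)
print({symbol: len(code) for symbol, code in codes.items()})
print(sum(table[s] * len(codes[s]) for s in table))   # 15 bits in total
```

With this table the most frequent symbol gets a 1-bit code and the two rarest get 3-bit codes, so the 9 symbols need 15 bits instead of the 18 a fixed 2-bit code would use.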

Marking Points

Key points examiners look for in your answers

Accurately describe AES as a symmetric block cipher with key sizes (e.g., 128, 192, 256 bits) and its use in WPA2, SSL/TLS.
Explain that RSA is an asymmetric algorithm based on the difficulty of factoring the product of two large primes, used for key exchange and digital signatures.
Identify that symmetric encryption uses the same key for encryption and decryption, requiring a secure key distribution method.
Demonstrate understanding that asymmetric encryption uses mathematically related key pairs, solving the key distribution problem but being slower.
Correctly reference real-world protocols such as HTTPS using a combination of asymmetric key exchange and symmetric session keys.
Award credit for correctly applying conversion algorithms even if final answer is incorrect due to an arithmetic slip.
Look for clear method showing grouping of bits for hex and octal conversions.
Check understanding of place values and significance of base (e.g., correct powers of base).
Credit demonstration of the inverse conversion process to verify results.
Award credit for correctly identifying the sign bit appropriate to the number.
Award credit for accurate conversion of the mantissa to a normalised form.
Award credit for applying the correct exponent bias and storing the exponent in biased form.
Award credit for recognising and handling edge cases such as zero, denormalised numbers, infinity, and NaN.
Award credit for correctly stating that a bitmap stores colour values for each pixel in a grid.
Award credit for identifying that vector graphics use instructions like draw circle at (x,y) radius r.
Award credit for applying the formula: file size (bits) = width × height × colour depth, with clear substitution.
Award credit for explaining that vector images scale without loss of quality due to mathematical descriptions.
Award credit for noting that metadata (e.g., header, palette) contributes additional overhead to file size.
Award credit for contrasting typical use cases (e.g., photographs as bitmaps, logos as vectors).
Award credit for demonstrating that sampling rate must be at least twice the maximum frequency to avoid aliasing.
Credit accurate file size calculations with correct unit conversions (bits to bytes to kilobytes).
Expect learners to identify that higher bit depth reduces quantisation error but increases file size.
Reward mention that stereo doubles the file size compared to mono for the same duration.
Award credit for recognising that ASCII uses 7 bits per character, allowing 128 possible codes.
Award credit for correctly stating that the first 128 Unicode code points match the ASCII character set.
Award credit for explaining that UTF-8 is a variable-length encoding using 1 to 4 bytes per character.
Award credit for detailing that UTF-16 uses either 2 or 4 bytes per character via surrogate pairs.
Award credit for identifying UTF-8's backward compatibility with ASCII and its efficiency for English text.
Award credit for discussing real-world uses, such as UTF-8 dominance on the web and UTF-16 in Windows internals.
Award credit for correctly applying RLE, including handling of consecutive symbols and output format.
Marks are awarded for accurate construction of a Huffman tree: selecting the two lowest frequencies at each step and building the tree bottom-up.
Credit for generating valid prefix codes from the Huffman tree, with no code being a prefix of another.
In questions comparing compression methods, credit is given for considering the nature of the data (e.g., text, image, sound) and the intended use.
Award marks for correct calculation of compression ratio, expressed as original size divided by compressed size, and for commenting on the result.
Award credit for correctly applying a mask using AND to clear bits, with concise justification.
Expect demonstration of understanding that XOR can toggle bits when a mask is applied.
Credit for recognising that left shift multiplies by 2^n and right shift divides by 2^n (with truncation for unsigned).
Assess correct ordering of operations when multiple bitwise operators are used (e.g., using brackets).
Award marks for correctly applying the two's complement conversion method (invert bits and add one) to obtain the binary representation of a negative number.
Credit should be given for clearly showing working steps in binary addition/subtraction, particularly the handling of a final carry-out bit.
Examiners expect candidates to identify and explain overflow, e.g., when the sum of two positive numbers produces a negative result in two's complement.
Candidates should be able to state that sign-magnitude representation suffers from negative zero and requires separate addition/subtraction logic.
Marks are awarded for correctly identifying the range of numbers that can be represented with a given number of bits in both formats.
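The masking and shifting points in the list above can be tried out with a few one-line helpers. This is a sketch on 8-bit example values; the helper names are made up, and only Python's built-in bitwise operators are used.

```python
def isolate_bits(value: int, mask: int) -> int:
    return value & mask      # AND keeps only the bits where the mask is 1


def clear_bits(value: int, mask: int) -> int:
    return value & ~mask     # AND with the inverted mask forces those bits to 0


def set_bits(value: int, mask: int) -> int:
    return value | mask      # OR forces the masked bits to 1


def toggle_bits(value: int, mask: int) -> int:
    return value ^ mask      # XOR flips the masked bits


value, mask = 0b10110110, 0b00001111
for operation in (isolate_bits, clear_bits, set_bits, toggle_bits):
    print(operation.__name__, format(operation(value, mask), "08b"))

print(5 << 3)    # 40: left shift by 3 multiplies by 2^3
print(13 >> 2)   # 3: right shift by 2 divides by 2^2 and truncates
```

Printing the pattern before and after each operation mirrors the working examiners expect to see in written answers.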

Examiner Tips

Expert advice for maximising your marks

💡When asked to describe algorithms, provide specific details like key lengths, block sizes, and typical use cases, not just generic terms.
💡Use comparisons to highlight trade-offs: draw a table if appropriate in written answers to show speed, key length, and use cases.
💡Relate theoretical concepts to practical scenarios, e.g., how WhatsApp or browsers use encryption, to demonstrate applied understanding.
💡In evaluation questions, always consider both advantages and limitations, mentioning aspects like performance, key management, and forward secrecy.
💡When converting between systems, always show working clearly to gain method marks.
💡Double-check hex and octal conversions by grouping binary digits from the right and anchoring to 4-bit or 3-bit groups.
💡Practice timed conversions under exam conditions to improve speed and accuracy.
💡Use decimal as an intermediate step for verifying binary-to-hex or octal-to-hex conversions.
💡Always begin by identifying the sign bit before proceeding with conversion.
💡Memorise the exponent bias values for single (127) and double (1023) precision.
💡Clearly show intermediate steps including normalisation and bias application in your working.
💡Check for special representations such as all-zeros or all-ones in exponent fields to avoid common pitfalls.
💡Always present step-by-step working for file size calculations, showing unit conversions clearly.
💡Use a comparison table to structure answers contrasting bitmap and vector properties.
💡Remember that vector images are resolution-independent but may have larger file sizes for photorealistic detail.
💡Quote common bit depths (e.g., 1, 8, 24) and relate them to colour capabilities.
💡For higher marks, discuss the role of headers and metadata in actual file formats like BMP or SVG.
💡In calculation questions, always convert duration to seconds and show step-by-step unit conversions.
💡Use the acronym 'SARA' (Sample rate, Amplitude resolution, Rate of transmission) to remember key factors.
💡Relate sound theory to practical examples, like CD quality (44.1 kHz, 16-bit, stereo).
💡When comparing encoding schemes, structure your answer around bit/byte usage, compatibility, and typical applications.
💡Use precise terminology: distinguish between 'character', 'code point', and 'code unit' in your responses.
💡Practice calculating the number of bytes required for a given string in both UTF-8 and UTF-16 to solidify understanding.
💡Remember that exam questions often ask for justification of encoding choice—link to bandwidth, storage, or legacy system constraints.
💡Draw diagrams of how bits are arranged in UTF-8 multi-byte sequences to avoid common counting errors.
💡When constructing a Huffman tree, always list frequencies in increasing order initially, and show merging steps clearly, labeling edges with 0 and 1 conventionally.
💡For RLE, agree on a format (e.g., count, symbol) and be prepared to decompress an encoded string back to original; practice both directions.
💡In evaluation questions, refer to specific compression methods by name and link to data types: for example, RLE for simple graphics, Huffman for text, JPEG for photographs.
💡Remember that compression ratio = size of original data / size of compressed data; a higher ratio indicates better compression.
💡Always check that your Huffman codes are prefix-free by ensuring no code is the start of another; use the tree to decode a sample string to verify.
💡Practice converting between binary, hex, and decimal to quickly verify bitwise results.
💡Always test edge cases: empty masks, all bits set, shift amounts exceeding bit width.
💡Use parentheses to clarify precedence when combining multiple bitwise operations.
💡For written answers, show the binary pattern before and after the operation to earn full marks.
💡Always show your working when converting to/from two's complement; marks are often given for method even if final answer is wrong.
💡Know the process for performing subtraction by converting the subtrahend to its two's complement and adding.
💡When checking for overflow, consider the signs of the operands and the result, not just the carry-out.
💡Memorise the ranges: for n bits, sign-magnitude: -(2^(n-1)-1) to +(2^(n-1)-1); two's complement: -2^(n-1) to +(2^(n-1)-1).
💡Practice with different bit lengths (4-bit, 8-bit, 16-bit) as exam questions may vary.
💡Always show your working for conversions, especially when using two's complement or floating‑point. Examiners award marks for correct method even if the final answer is slightly off. Write each step clearly.
💡For floating‑point normalisation, remember that the mantissa must start with 0.1 for positive numbers and 1.0 for negative numbers (in two's complement). If it doesn't, shift left until it does, adjusting the exponent accordingly.
💡When calculating file sizes, use the formula: File size = width × height × colour depth (for images) or sample rate × bit depth × duration × channels (for sound). Both formulas give a size in bits, so divide by 8 to convert to bytes, and be careful with units (e.g., MB vs MiB).
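The sound formula from the last tip can be checked with a short function. The name `sound_size_bits` is an assumption for this sketch, and the example uses the CD-quality figures quoted earlier in these tips (44.1 kHz, 16-bit, stereo) for a one-minute recording.

```python
def sound_size_bits(sample_rate_hz: int, bit_depth: int,
                    duration_s: int, channels: int) -> int:
    """Uncompressed sound size in bits."""
    return sample_rate_hz * bit_depth * duration_s * channels


bits = sound_size_bits(44_100, 16, 60, 2)
print(bits)        # 84672000 bits
print(bits // 8)   # 10584000 bytes, roughly 10.6 MB

# Each factor scales the size linearly: stereo is exactly twice mono.
print(bits == 2 * sound_size_bits(44_100, 16, 60, 1))   # True
```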

Common Mistakes

Pitfalls to avoid in your exam answers

Confusing which key is used for encryption/decryption in asymmetric schemes (e.g., thinking the private key encrypts in public-key encryption).
Assuming asymmetric encryption is always more secure than symmetric, and ignoring quantum computing threats and algorithm vulnerabilities.
Failing to distinguish between encryption and hashing, or believing that encryption alone ensures data integrity.
Assuming that key lengths are directly comparable between symmetric and asymmetric encryption (e.g., judging 128-bit AES against 1024-bit RSA by key length alone).
Misunderstanding of the role of place values leading to incorrect grouping for hex/oct (e.g., grouping from the left instead of right).
Confusing binary-coded decimal (BCD) with standard binary representation.
Misapplying hex digits: knowing that A is 10, B is 11, etc., but using those values incorrectly when converting back to decimal.
Omitting leading zeros when converting to binary, leading to incomplete byte representation.
Omitting the exponent bias when converting to stored format.
Failing to normalise the mantissa correctly, leaving leading zeros.
Confusing the bit allocation between single and double precision fields.
Assuming floating point arithmetic is exact, neglecting rounding errors.
Confusing colour depth (bits per pixel) with the number of colours, leading to incorrect file size calculations.
Assuming vector graphics are always smaller in file size regardless of image complexity.
Neglecting to include metadata overhead when estimating file sizes.
Thinking that resizing a bitmap image increases its resolution without loss of quality.
Failing to understand that vector graphics store descriptions of shapes rather than a pixel-by-pixel map.
Confusing the roles of sampling rate (temporal resolution) and bit depth (amplitude resolution).
Failing to convert bits to bytes when calculating file size, leading to answers 8 times too large.
Believing that increasing sampling rate beyond the Nyquist criterion yields indefinitely better quality.
Confusing the terms 'character set' and 'character encoding', treating them as synonymous.
Assuming Unicode always uses 2 bytes per character, overlooking variable-length schemes.
Believing that UTF-8 is less efficient than UTF-16 for all languages, ignoring its optimisation for Latin scripts.
Assuming that one byte in UTF-8 always corresponds to one character; outside the ASCII range a character takes two to four bytes.
Failing to recognise that the highest bit in ASCII is always 0, leading to confusion with extended ASCII.
Forgetting that RLE can increase size if applied to data with few runs, as each symbol may require a count even for single occurrences.
In Huffman coding, incorrectly merging nodes by not consistently taking the two smallest weight trees, leading to non-optimal codes.
Assuming that Huffman codes are unique; different but equally valid trees can be produced if there are ties in frequencies.
Believing that lossy compression always results in noticeable quality degradation; modern codecs are designed to be perceptually transparent at moderate compression ratios.
Omitting the storage overhead of the code table or dictionary when comparing compressed file size, especially for small files.
Confusing the bitwise NOT (~) with the logical NOT (!) in programming contexts.
Forgetting that right shift on signed integers may preserve the sign bit (arithmetic shift) in some languages.
Misapplying masks, for example using OR when AND is needed to clear bits, leading to unintended modifications.
Overlooking operator precedence, causing expressions like a & b == c to be evaluated incorrectly.
Confusing the sign bit in sign-magnitude with the most significant bit in two's complement, leading to incorrect conversion.
Forgetting to add 1 after flipping bits when forming two's complement.
Misinterpreting the carry-out bit as overflow instead of discarding it when the result is within range.
Assuming that two negative numbers can be added without overflow if they are small.
Carrying out two's complement subtraction incorrectly, for example adding the subtrahend without first negating it (invert the bits and add 1).
Misconception: 'Hexadecimal is a different type of number, not just a different way to write binary.' Correction: Hexadecimal is simply base‑16; it's a compact representation of binary groups of 4 bits. There's no 'hexadecimal arithmetic' separate from binary.
Misconception: 'Two's complement is just flipping the bits.' Correction: Flipping bits gives the one's complement; two's complement requires adding 1 after flipping. For example, −5 in 4‑bit two's complement is 1011, not 1010.
Misconception: 'A higher sample rate always means better sound quality.' Correction: While higher sample rate captures higher frequencies (Nyquist theorem), quality also depends on bit depth and the recording environment. Doubling the sample rate doubles the uncompressed file size, so it's a trade‑off.
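Several of the mistakes listed above involve mixing up carry-out and overflow in two's complement addition. The sketch below adds two equal-length bit strings and reports both flags separately; the function name and the returned tuple layout are assumptions made for this example.

```python
def add_twos_complement(a: str, b: str) -> tuple[str, bool, bool]:
    """Add two equal-length two's complement bit strings.

    Returns (result bits, carry-out flag, overflow flag).
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same number of bits")
    bits = len(a)
    total = int(a, 2) + int(b, 2)
    carry_out = total >= (1 << bits)                   # a bit fell off the left end
    result = format(total % (1 << bits), f"0{bits}b")  # discard the carry-out
    # Overflow: operands share a sign but the result has the opposite sign.
    overflow = a[0] == b[0] and result[0] != a[0]
    return result, carry_out, overflow


print(add_twos_complement("0101", "0100"))  # ('1001', False, True)   5 + 4 overflows
print(add_twos_complement("1011", "1101"))  # ('1000', True, False)   -5 + -3 = -8
print(add_twos_complement("1000", "1111"))  # ('0111', True, True)    -8 + -1 overflows
```

The second case has a carry-out but no overflow, and the first has overflow without any carry-out, which is why the signs of the operands and result are the reliable test.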

Frequently Asked Questions

Common questions students ask about this topic

Before You Start

Prior knowledge that will help with this topic

•Basic understanding of place value in denary (base‑10) to grasp binary place value.
•Familiarity with powers of 2 (up to 2^10 or more) for quick conversions and file size calculations.
•Simple arithmetic skills (addition, subtraction) to handle binary addition and two's complement.

Key Terminology

Essential terms to know

Symmetric ciphers (e.g., AES)
Asymmetric ciphers (e.g., RSA)
Key exchange and distribution
Applications in modern security
Binary representation and arithmetic
Hexadecimal and octal as shorthand
Efficiency in digital systems
Place value and base significance
Normalisation and mantissa
Exponent biasing
Precision vs range trade-offs
Special values (zero, infinity, NaN)
Single vs double precision
Pixel grid and colour depth
Mathematical primitive encoding
Resolution and scalability
File size determination
Metadata and headers
Sampling rate and the Nyquist criterion
Bit depth and quantisation error
File size calculation for uncompressed audio
Analogue-to-digital conversion (ADC)
Character sets vs. character encoding
ASCII structure and limitations
Unicode design goals and code points
Variable-width encoding schemes
UTF-8 and UTF-16 comparison
Encoding selection in real-world applications
Lossless vs lossy compression
Run-length encoding (RLE)
Huffman coding and prefix codes
Compression ratio and efficiency
Lossy compression principles
Applications and trade-offs
Bitwise logical operations
Shift operations
Mask construction and usage
Low-level data manipulation
Efficiency of bitwise techniques
Signed integer encoding
Two's complement arithmetic
Overflow detection
Binary addition/subtraction
Range and zero representation

Related Topics in AQA A-Level Computer Science

Theory of computation

Fundamentals of computer organisation and architecture

Systematic approach to problem solving
