How do you convert between megabytes and gigabytes?

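It depends on which prefixes are meant. With the binary multiples used in the Edexcel specification, 1 gibibyte (GiB) = 1024 mebibytes (MiB), so divide by 1024 to go from MiB to GiB and multiply by 1024 to go back. For example, 2048 MiB ÷ 1024 = 2 GiB, and 3 GiB × 1024 = 3072 MiB. Strictly, the decimal prefixes mean 1 GB = 1000 MB, but many textbooks and operating systems write "MB" and "GB" when they mean the binary units, so check which convention a question uses.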
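The conversion is a single division or multiplication by the step between neighbouring units. Here is a minimal Python sketch, assuming the factor of 1024 used throughout this topic; the function names are made up for illustration.

```python
STEP = 1024  # factor between neighbouring binary multiples (assumed convention)


def smaller_to_larger(value, step=STEP):
    """Convert to the next unit up, e.g. mebibytes to gibibytes."""
    return value / step


def larger_to_smaller(value, step=STEP):
    """Convert to the next unit down, e.g. gibibytes to mebibytes."""
    return value * step


print(smaller_to_larger(2048))  # 2.0
print(larger_to_smaller(3))     # 3072
```

Passing `step=1000` gives the decimal version of the same calculation.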

What is run-length encoding and how does it work?

Run-length encoding (RLE) is a simple lossless compression method. It works by replacing sequences of repeated characters with a count followed by the character. For example, the string 'AAABBBCC' becomes '3A3B2C'. This is effective for data with many runs, like simple images or text with repeated spaces.
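The count-then-character scheme described above can be sketched in a few lines of Python. This is an illustrative version only: the function names are hypothetical, and the decoder assumes the original text contains no digits.

```python
import re
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of identical characters with '<count><character>'."""
    return "".join(f"{len(list(run))}{char}" for char, run in groupby(text))


def rle_decode(encoded: str) -> str:
    """Reverse rle_encode; assumes the original text had no digit characters."""
    pairs = re.findall(r"(\d+)(\D)", encoded)
    return "".join(char * int(count) for count, char in pairs)


print(rle_encode("AAABBBCC"))  # 3A3B2C
print(rle_decode("3A3B2C"))    # AAABBBCC
print(rle_encode("ABC"))       # 1A1B1C
```

The last line shows the weakness of RLE: with no repeated runs, the "compressed" output is longer than the input.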

Why can't you compress a compressed file further?

Compression works by removing redundancy and patterns. Once a file is compressed, most patterns are already encoded efficiently. Trying to compress it again usually yields little or no reduction, and sometimes the file can even get slightly larger due to overhead from the compression algorithm.
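You can see this effect with a short experiment. The sketch below uses Python's standard `zlib` module on some made-up repetitive text. The word list and sizes are arbitrary choices for illustration, and the exact byte counts will vary.

```python
import random
import zlib

# Build some repetitive sample text (arbitrary words, fixed seed for repeatability).
rng = random.Random(0)
words = ["data", "storage", "compression", "binary", "file", "image", "sound", "pixel"]
text = " ".join(rng.choice(words) for _ in range(5000)).encode("ascii")

once = zlib.compress(text)   # first pass removes most of the redundancy
twice = zlib.compress(once)  # second pass has very little left to remove

print("original:        ", len(text), "bytes")
print("compressed once: ", len(once), "bytes")
print("compressed twice:", len(twice), "bytes")

# Both passes are lossless, so decompressing twice restores the original.
assert zlib.decompress(zlib.decompress(twice)) == text
```

The first pass should shrink the text considerably, while the second pass typically changes the size very little.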

What is the difference between a bit and a byte?

A bit is the smallest unit of data, representing a single binary value (0 or 1). A byte is a group of 8 bits, which can represent 256 different values (e.g., a single character like 'A'). Bytes are the standard unit for measuring file sizes, while bits are often used for data transfer speeds (e.g., Mbps).
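A few lines of Python make the bit/byte relationship concrete. The download-time helper is a hypothetical example and assumes that 1 Mbps means 1,000,000 bits per second.

```python
BITS_PER_BYTE = 8

# One byte has 8 bits, so it can hold 2^8 different values.
print(2 ** BITS_PER_BYTE)       # 256

# The character 'A' stored as one byte (ASCII code 65).
print(format(ord("A"), "08b"))  # 01000001


def download_seconds(file_size_bytes, speed_bits_per_second):
    """Time to transfer a file: convert bytes to bits, then divide by the speed."""
    return file_size_bytes * BITS_PER_BYTE / speed_bits_per_second


# A 5,000,000-byte file over a 40 Mbps connection (assumed 40,000,000 bits/s).
print(download_seconds(5_000_000, 40_000_000))  # 1.0
```

Forgetting the factor of 8 between file sizes (bytes) and transfer speeds (bits per second) is an easy way to be out by almost an order of magnitude.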

How does compression affect image quality?

Lossy compression (like JPEG) reduces file size by discarding fine details that the human eye might not notice. The more you compress, the more quality degrades, leading to artifacts like blurring or blockiness. Lossless compression (like PNG) keeps every pixel exactly the same, so quality is unchanged, but file sizes are usually larger than lossy versions.

Data storage and compression

What is the difference between lossy and lossless compression?

Lossy compression reduces file size by permanently removing some data, which can lower quality. It's used for images (JPEG) and audio (MP3) where a small loss is acceptable. Lossless compression reduces size without losing any data, so the original can be perfectly restored. It's used for text files (ZIP) and images where quality must be preserved (PNG).
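The difference shows up clearly in a toy example. The sketch below round-trips some made-up sample values losslessly with Python's `zlib`, then applies a deliberately crude lossy step that keeps only the top 4 bits of each value. Real lossy formats such as JPEG and MP3 use far more sophisticated methods, so treat this only as an illustration of "data is gone for good".

```python
import zlib

# Eight made-up 8-bit sample values (e.g. pixel brightness levels).
samples = bytes([12, 13, 15, 200, 201, 203, 90, 91])

# Lossless: decompressing gives back exactly the original bytes.
restored = zlib.decompress(zlib.compress(samples))
print(restored == samples)  # True

# Toy lossy step: keep only the top 4 bits of each value (half the bits),
# then scale back up. The discarded low bits cannot be recovered.
quantised = [value >> 4 for value in samples]
approximation = bytes(q << 4 for q in quantised)

print(list(approximation))       # [0, 0, 0, 192, 192, 192, 80, 80]
print(approximation == samples)  # False
```

The approximation keeps the rough shape of the data (dark, bright, medium) but the fine differences between neighbouring values have been permanently lost.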

EDEXCEL

GCSE

This topic covers the measurement of data storage using binary multiples and the necessity of data compression. Students learn to calculate file sizes and data capacity requirements, as well as the differences between lossless and lossy compression methods.

Topic Overview

Data storage and compression is a fundamental topic in computer science that explores how digital data is stored, measured, and reduced in size. You'll learn about binary representation, units of data (bit, byte, kibibyte, etc.), and how files like images, sound, and text are encoded. Understanding compression is crucial because it affects everything from loading web pages to streaming video. This topic connects to networks, data representation, and even ethical issues around data usage.

In the Edexcel GCSE, you need to know the difference between lossy and lossless compression, and when each is appropriate. Lossy compression permanently removes data to reduce file size, which can degrade quality — think JPEG images or MP3 audio. Lossless compression, like PNG or ZIP, reduces size without losing any original data, making it essential for text files or software. You'll also encounter run-length encoding (RLE) and dictionary-based methods as specific techniques.

Why does this matter? In a world generating zettabytes of data every year, efficient storage and transmission are critical. Compression saves bandwidth, speeds up downloads, and reduces costs. Understanding these concepts also prepares you for more advanced topics like encryption and error checking. Master this, and you'll see how computers balance quality, size, and performance.

Key Concepts

Core ideas you must understand for this topic

→Units of data: bit (0 or 1), nibble (4 bits), byte (8 bits), kibibyte (1024 bytes), mebibyte, gibibyte, tebibyte, and how they relate to each other (each step from the byte upward is a factor of 1024).
→Binary representation: how numbers, text (ASCII/Unicode), images (bitmap pixels), and sound (sampling) are stored as binary digits.
→Lossy vs lossless compression: lossy reduces file size by discarding data permanently (e.g., JPEG, MP3); lossless reduces size without losing data (e.g., PNG, ZIP).
→Run-length encoding (RLE): a simple lossless method that replaces repeated consecutive data with a count and the value (e.g., 'AAAABBB' becomes '4A3B').
→Dictionary-based compression: uses a dictionary of frequently occurring patterns to replace them with shorter codes (e.g., LZW used in GIF).
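To make the dictionary-based idea concrete, here is a minimal Python sketch of an LZW-style encoder. It is a simplified teaching version, not the exact variant used in GIF (which adds variable code widths and special control codes), and it assumes every input character has a code below 256.

```python
def lzw_encode(text: str) -> list[int]:
    """Encode text as a list of dictionary codes (simplified LZW sketch)."""
    # Start with one entry per single character (assumes 8-bit characters).
    dictionary = {chr(code): code for code in range(256)}
    next_code = 256
    current = ""
    codes = []
    for char in text:
        candidate = current + char
        if candidate in dictionary:
            # Keep extending the match while the pattern is already known.
            current = candidate
        else:
            # Output the longest known pattern, then learn the new one.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = char
    if current:
        codes.append(dictionary[current])
    return codes


print(lzw_encode("ABABABA"))  # [65, 66, 256, 258]
```

Seven characters become four codes: 65 and 66 are 'A' and 'B', while 256 and 258 stand for the learned patterns 'AB' and 'ABA'. The more often a pattern repeats, the more the dictionary pays off.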

What You Need to Demonstrate

Key skills and knowledge for this topic

Correct use of binary multiples (bit, nibble, byte, kibibyte, mebibyte, gibibyte, tebibyte)
Accurate construction of expressions to calculate file sizes
Accurate calculation of data capacity requirements
Correct identification of the need for data compression
Distinction between lossless and lossy compression methods
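File size expressions of this kind can be sketched in Python as follows. The formulas are the standard uncompressed estimates (they ignore file headers and metadata), and the function names and example figures are made up for illustration.

```python
BITS_PER_BYTE = 8
KIB = 1024  # bytes in one kibibyte


def image_size_bits(width_px, height_px, colour_depth_bits):
    """Uncompressed bitmap size: width x height x bits per pixel."""
    return width_px * height_px * colour_depth_bits


def sound_size_bits(sample_rate_hz, bit_depth, duration_s, channels=1):
    """Uncompressed sound size: samples per second x bits per sample x seconds x channels."""
    return sample_rate_hz * bit_depth * duration_s * channels


# A 1024 x 768 image with 24-bit colour.
image_bits = image_size_bits(1024, 768, 24)
print(image_bits)                        # 18874368 bits
print(image_bits / BITS_PER_BYTE / KIB)  # 2304.0 KiB

# 10 seconds of stereo sound at 44,100 Hz with 16-bit samples.
sound_bits = sound_size_bits(44_100, 16, 10, channels=2)
print(sound_bits)                        # 14112000 bits
```

Writing the expression out in full before evaluating it, as the functions do, mirrors the working that calculation questions expect you to show.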

Marking Points

Key points examiners look for in your answers

Correct use of binary multiples (bit, nibble, byte, kibibyte, mebibyte, gibibyte, tebibyte)
Accurate construction of expressions to calculate file sizes
Accurate calculation of data capacity requirements
Correct identification of the need for data compression
Distinction between lossless and lossy compression methods

Examiner Tips

Expert advice for maximising your marks

💡Ensure you are familiar with the binary multiples hierarchy (bit, nibble, byte, kibibyte, mebibyte, gibibyte, tebibyte)
💡Show all working for calculations as marks are often awarded for the method, not just the final answer
💡Be prepared to explain why compression is necessary in specific scenarios, such as web transmission or storage limitations
💡Practice identifying whether a scenario requires lossless or lossy compression
💡When comparing lossy and lossless, always mention a specific example (e.g., JPEG vs PNG) and explain why one is chosen over the other — this shows deeper understanding.
💡For RLE questions, carefully count consecutive identical characters. A common mistake is missing a run or miswriting the output format (e.g., writing '32' or 'A3B2' instead of '3A2B'). The count comes first, followed by the character it applies to.
💡Know the units of data in order: bit, nibble, byte, kibibyte, mebibyte, gibibyte, tebibyte. Questions often ask you to convert between them, so practice multiplying/dividing by 1024.

Common Mistakes

Pitfalls to avoid in your exam answers

Confusing binary prefixes (kibibyte) with decimal prefixes (kilobyte)
Incorrectly applying units when calculating file sizes
Failing to show working in calculation-based questions
Misunderstanding the trade-off between file size and quality in lossy compression
Misconception: 'Lossy compression always reduces quality so much it's unusable.' Correction: Lossy compression can be tuned to balance size and quality; for example, a high-quality JPEG may look identical to the original to the human eye.
Misconception: 'Lossless compression can make any file smaller.' Correction: Lossless compression works best on data with patterns (e.g., text, simple images). Random data or already-compressed files may not shrink and can even grow slightly.
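Misconception: 'Kilobyte and kibibyte mean the same thing.' Correction: A kibibyte (KiB) is 1024 bytes (2^10). A kilobyte (kB) is strictly 1000 bytes, which is the meaning used in contexts like hard drive marketing, although many sources still use 'kilobyte' loosely for 1024 bytes. The Edexcel specification uses binary multiples, so in GCSE stick to 1024 unless told otherwise.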

Frequently Asked Questions

Common questions students ask about this topic

Before You Start

Prior knowledge that will help with this topic

•Binary numbers: understanding how to convert between binary and denary, and how binary represents data.
•Basic understanding of files and file types (e.g., .txt, .jpg, .mp3) — what they are used for.
•Simple arithmetic: multiplying and dividing by powers of 2 (especially 1024).

Likely Command Words

How questions on this topic are typically asked

Calculate

Describe

Explain

Identify
