JPEG AND MPEG STANDARDS

Abdou Youssef

  1. Motivation for Standards

  2. Image/Video Compression Standards (Outline)

  3. The JPEG "Toolkit"

  4. The Baseline JPEG Algorithm

  5. The Quantization Matrices

  6. Coding of the DC Residuals

  7. Coding of the AC Terms (The AC Huffman Table)

  8. Examples

  9. Example Huffman Table for Lena

  10. Coding the Example Block

  11. Decoding

  12. Extended JPEG

  13. Performance of JPEG

  14. MPEG (1 & 2): Basic Concepts

  15. Modes of MPEG Compression

  16. Types of Frames in MPEG

  17. Interframe Compression of P and B Blocks

  18. Flowchart of MPEG Compression

  19. Motion-Estimation and Prediction

  20. Motion-Estimation and Prediction for P Frames

  21. Motion-Estimation and Prediction for B Frames

  22. MPEG2

  23. References

  24. Links to other Standards


  1. Motivation for Standards


  2. Image/Video Compression Standards (Outline)


  3. The JPEG "Toolkit"


  4. The Baseline JPEG Algorithm

    1. It operates on 8×8 blocks of the input image

    2. Mean-normalization (subtract 128 from each pixel)

    3. Transform: DCT-transform each block

    4. Quantization
      • An 8×8 quantization matrix Q is user-provided
      • Each block is divided by Q (point by point)
      • The terms are then rounded to their nearest integers

      • Remark: Up to 4 quantization matrices per image are allowed (for example, one for luminance and one for each of the three color components)


    5. Entropy-coding of the DC coefficients (the top left coefficient of each quantized block) using DPCM+Huffman
      • Huffman-encode the DC residuals derived from the difference between each DC and the DC of the preceding block


    6. Entropy-coding of the AC (i.e., non-DC) coefficients
      • Zigzag-order the quantized coefficients of each block
      • For each nonzero coefficient, record both the number of zeros (called run) separating it from the preceding nonzero coefficient in the zigzag sequence, and its value (called level)
      • Huffman-code the [run,level] pairs using a single Huffman table for all the AC terms of the image
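
    The steps above, up to the point where entropy coding begins, can be sketched as follows. This is a minimal illustration in plain Python, not a reference implementation: the function names (dct_2d, quantize, zigzag_indices, run_level_pairs, forward_block) are hypothetical, the DCT is the orthonormal DCT-II written out directly, and the flat quantization matrix used in the usage note is an assumption chosen only to keep the arithmetic easy to follow.

```python
import math

N = 8  # block size used by baseline JPEG


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of lists."""
    def c(u):
        return math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * acc
    return out


def quantize(coeffs, Q):
    """Divide point by point by Q and round.

    Note: Python's round() sends exact halves to the nearest even integer.
    """
    return [[round(coeffs[i][j] / Q[i][j]) for j in range(N)] for i in range(N)]


def zigzag_indices(n=N):
    """(row, col) positions in zigzag order: walk the anti-diagonals,
    alternating direction from one diagonal to the next."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))


def run_level_pairs(quantized):
    """Return the DC term and the [run, level] pairs of the AC terms.

    Trailing zeros produce no pair; they are what the end-of-block
    symbol stands for.
    """
    zz = [quantized[i][j] for i, j in zigzag_indices()]
    pairs, run = [], 0
    for value in zz[1:]:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return zz[0], pairs


def forward_block(pixels, Q):
    """Steps 2-4 and the run/level part of step 6 for one 8x8 block."""
    shifted = [[p - 128 for p in row] for row in pixels]
    return run_level_pairs(quantize(dct_2d(shifted), Q))
```

    For a constant block of value 136 and a flat matrix Q with every entry equal to 16, forward_block returns a DC term of 4 and an empty list of [run,level] pairs, since all AC coefficients quantize to zero.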


  5. The Quantization Matrices


  6. Coding of the DC Residuals

    1. The DC residuals are in the range [-2047,2047]
    2. Thus, the magnitude of each residual is between 0 and $2047 = 2^{11}-1$, inclusive.
    3. Divide this range into 12 subranges, or categories, where category $k$ ranges from $2^{k-1}$ to $2^{k}-1$ inclusive. (Note that category 0 has only the integer 0).
    4. Let r be a nonzero DC residual and let k be its category. Clearly, $|r| = 2^{k-1} + t$, where $0 \le t \le 2^{k-1}-1$. In particular, t can be represented in binary using k-1 bits.
    5. Therefore, r can be uniquely represented by s, k, and t, where s is the sign of r, and k and t are as above.

    6. Develop a Huffman code for the 12 categories, where every codeword is at most 16 bits long
      • Exercise: How would you find the probabilities of the 12 categories, for a given input image?
      • Exercise: How would you modify the Huffman algorithm to produce a Huffman tree of height at most 16? Or generally a height of at most some given value m?

    7. Encode each DC residual r as a binary string hsm where
      • h is the Huffman codeword of the residual's category k
      • s= sign of the residual; s=0 if the residual is negative, 1 if positive
      • m= the (k-1)-bit binary representation of t.
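
    The hsm encoding of a DC residual can be sketched as below. The names category, encode_dc_residual and DC_CODE are hypothetical, and DC_CODE is only an illustrative prefix-free code for the 12 categories, not one derived from any image. The notes do not say what follows h when the residual is 0; this sketch assumes that category 0 is sent as the Huffman codeword alone.

```python
def category(magnitude):
    """Category k with 2^(k-1) <= magnitude <= 2^k - 1 (category 0 holds only 0).

    For a non-negative integer this is exactly its bit length.
    """
    return magnitude.bit_length()


# Illustrative prefix-free code for categories 0..11 (an assumption,
# standing in for a Huffman code built from real category frequencies).
DC_CODE = {
    0: "00", 1: "010", 2: "011", 3: "100", 4: "101", 5: "110",
    6: "1110", 7: "11110", 8: "111110", 9: "1111110",
    10: "11111110", 11: "111111110",
}


def encode_dc_residual(r, code=DC_CODE):
    """Encode a DC residual r as the bit string h + s + m."""
    if abs(r) > 2047:
        raise ValueError("DC residual out of range [-2047, 2047]")
    k = category(abs(r))
    h = code[k]
    if k == 0:
        return h  # assumption: no sign or offset bits for a zero residual
    s = "1" if r > 0 else "0"
    t = abs(r) - (1 << (k - 1))           # offset inside category k
    m = format(t, "b").zfill(k - 1) if k > 1 else ""
    return h + s + m
```

    For example, r = 5 falls in category 3 with t = 1, so it is encoded as "100" + "1" + "01"; r = -5 differs only in the sign bit.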


  7. Coding of the AC Terms (The AC Huffman Table)

    1. All non-zero AC terms are of magnitude $\le 2^{10}-1$
    2. Let x be a non-zero AC term, and let d be the length of the zero run between x and the previous nonzero AC term.
    3. $1 \le |x| \le 2^{10}-1$
    4. Divide the range $1 \,..\, 2^{10}-1$ into 10 categories, where category $k$ is the range $2^{k-1} \,..\, 2^{k}-1$ inclusive.
    5. Represent x by its sign s and its magnitude |x|. s = 0 if x<0, 1 otherwise.
    6. For whatever value of |x|, there is a unique k such that $2^{k-1} \le |x| \le 2^{k}-1$. k is the category of x
    7. $|x| = 2^{k-1} + t$, where $0 \le t \le 2^{k-1}-1$. In particular, t can be represented in binary using k-1 bits.
    8. Therefore, x can be uniquely represented by s, k, and t.
    9. Thus, (d,x) can be represented by (d,k,s,t), where s is one bit and t is k-1 bits.
    10. The runlength d is between 0 and 62.
    11. d = 15p + r, where r=0,1,2,...,14.
    12. r can be represented with 4 bits $r_3r_2r_1r_0$, different from 1111.
    13. p can be represented with p copies of the byte 11110000: $11110000_1\ 11110000_2\ \ldots\ 11110000_p$
    14. This implies that d is $11110000_1\ 11110000_2\ \ldots\ 11110000_p\ r_3r_2r_1r_0$
    15. the category (or level) k, being between 1 and 10, can be represented with 4 bits $k_3k_2k_1k_0$.
    16. Therefore, the (d,k) in the (d,k,s,t) representation of (d,x), is represented as
      $11110000_1\ 11110000_2\ \ldots\ 11110000_p\ r_3r_2r_1r_0k_3k_2k_1k_0$
    17. This representation of (d,k) can be viewed as a sequence of p+1 bytes.
    18. The last byte represents 15*10= 150 legitimate values
    19. Add to those the byte 11110000 and the end-of-block (EOB) symbol to signal the end of the nonzero AC terms in a block.
    20. This results in 152 different symbols.

    21. Build a Huffman table for those 152 symbols, where every codeword is at most 16 bits long

    22. JPEG encodes each quantized AC term (d,x)=(d,k,s,t) as hsm where
      • h is the Huffman codeword of (d,k)
      • s= sign of the term; s=0 if negative, 1 if positive
      • m= the (k-1)-bit binary representation of t.
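
    The mapping from a pair (d,x) to the symbols that get Huffman-coded can be sketched as follows. The function name ac_symbols and the constant ZERO_RUN_BYTE are hypothetical. The sketch follows the decomposition d = 15p + r used in these notes; the published JPEG standard defines its zero-run symbol for a run of 16 zeros, so a conforming encoder would differ from this illustration on that point.

```python
ZERO_RUN_BYTE = 0b11110000  # the byte 11110000 from the notes


def ac_symbols(d, x):
    """Split (d, x) into the byte symbols for (d, k), plus s, t and k.

    Returns (symbols, s, t, k) where symbols is a list of p + 1 bytes:
    p copies of 11110000 followed by the byte r3r2r1r0 k3k2k1k0.
    """
    if not 0 <= d <= 62:
        raise ValueError("run length must be between 0 and 62")
    if x == 0 or abs(x) > 1023:
        raise ValueError("x must be nonzero with |x| <= 2**10 - 1")
    p, r = divmod(d, 15)                  # d = 15p + r, r in 0..14
    k = abs(x).bit_length()               # category: 2^(k-1) <= |x| <= 2^k - 1
    s = 1 if x > 0 else 0                 # sign bit
    t = abs(x) - (1 << (k - 1))           # offset inside category k
    symbols = [ZERO_RUN_BYTE] * p + [(r << 4) | k]
    return symbols, s, t, k


# The 150 legitimate values of the last byte: 15 runs times 10 categories.
LAST_BYTES = {(r << 4) | k for r in range(15) for k in range(1, 11)}
```

    For instance, ac_symbols(17, -1) gives the two bytes 11110000 and 00100001 (p = 1, r = 2, k = 1), with s = 0 and t = 0. Adding the byte 11110000 and the EOB symbol to LAST_BYTES gives the 152 symbols of the AC Huffman table.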




  8. Examples


  9. Example Huffman Table for Lena

    d = Length of Zero Run    Category k    Codelength    Codeword
    0                         1             2             00
    0                         2             2             01
    0                         3             3             100
    0                         4             4             1011
    0                         5             5             11010
    0                         6             6             111000
    0                         7             7             1111000
    .                         .             .             .
    1                         1             4             1100
    1                         2             6             111001
    1                         3             7             1111001
    1                         4             9             111110110
    .                         .             .             .
    2                         1             5             11011
    2                         2             8             11111000
    .                         .             .             .
    3                         1             6             111010
    3                         2             9             111110111
    .                         .             .             .
    4                         1             6             111011
    5                         1             7             1111010
    6                         1             7             1111011
    7                         1             8             11111001
    8                         1             8             11111010
    9                         1             9             111111000
    10                        1             9             111111001
    11                        1             9             111111010
    .                         .             .             .
    .                         .             .             .
    End of Block (EOB)                      4             1010
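
    As an illustration of how the table is used, the sketch below encodes a short list of (d,x) pairs as hsm strings followed by the EOB codeword. Only the rows listed in the table are included, so pairs that fall in the elided rows raise a KeyError rather than being guessed. The names AC_TABLE, EOB and encode_ac are hypothetical, and the input pairs in the usage note are made up for the example.

```python
# (d, k) -> codeword, copied from the rows shown in the table.
AC_TABLE = {
    (0, 1): "00", (0, 2): "01", (0, 3): "100", (0, 4): "1011",
    (0, 5): "11010", (0, 6): "111000", (0, 7): "1111000",
    (1, 1): "1100", (1, 2): "111001", (1, 3): "1111001", (1, 4): "111110110",
    (2, 1): "11011", (2, 2): "11111000",
    (3, 1): "111010", (3, 2): "111110111",
    (4, 1): "111011", (5, 1): "1111010", (6, 1): "1111011",
    (7, 1): "11111001", (8, 1): "11111010", (9, 1): "111111000",
    (10, 1): "111111001", (11, 1): "111111010",
}
EOB = "1010"


def encode_ac(pairs):
    """Encode (d, x) pairs as h + s + m each, then append the EOB codeword."""
    pieces = []
    for d, x in pairs:
        k = abs(x).bit_length()               # category of x
        h = AC_TABLE[(d, k)]                  # KeyError for elided rows
        s = "1" if x > 0 else "0"             # sign bit
        t = abs(x) - (1 << (k - 1))           # offset inside category k
        m = format(t, "b").zfill(k - 1) if k > 1 else ""
        pieces.append(h + s + m)
    pieces.append(EOB)
    return "".join(pieces)
```

    With the pairs (0,5), (1,-1) and (2,3), the pieces are "100"+"1"+"01", "1100"+"0", and "11111000"+"1"+"1", followed by "1010" for EOB.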


  10. Coding the Example Block


  11. Decoding

    1. Entropy-decode the bitstream back to the quantized blocks

    2. Dequantize: multiply each block coefficient by the corresponding coefficient of the quantization matrix

    3. Apply the inverse DCT transform on each block

    4. Denormalize: add 128 to each reconstructed pixel value
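
    Steps 2 to 4 can be sketched as below, taking a block that has already been entropy-decoded (step 1 is not covered here). The names idct_2d and decode_block are hypothetical, the inverse transform is the orthonormal inverse DCT written out directly, and the final rounding and clamping to the range 0..255 is an assumption made for 8-bit samples.

```python
import math

N = 8  # block size


def idct_2d(coeffs):
    """Orthonormal 2-D inverse DCT of an N x N block of coefficients."""
    def c(u):
        return math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            acc = 0.0
            for u in range(N):
                for v in range(N):
                    acc += (c(u) * c(v) * coeffs[u][v]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[x][y] = acc
    return out


def decode_block(quantized, Q):
    """Dequantize, inverse-transform, and add 128 back to one 8x8 block."""
    dequantized = [[quantized[i][j] * Q[i][j] for j in range(N)]
                   for i in range(N)]
    pixels = idct_2d(dequantized)
    # Assumption: 8-bit output, so round and clamp to 0..255.
    return [[min(255, max(0, round(p + 128))) for p in row] for row in pixels]
```

    A block whose only nonzero quantized coefficient is a DC term of 4, dequantized with a flat matrix of 16s, reconstructs to a constant block of value 136.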



  12. Extended JPEG

  13. Performance of JPEG


  14. MPEG (1 & 2): Basic Concepts


  15. Modes of MPEG Compression


  16. Types of Frames in MPEG


  17. Interframe Compression of P and B Blocks


  18. Flowchart of MPEG Compression




  19. Motion-Estimation and Prediction


  20. Motion-Estimation and Prediction for P Frames


  21. Motion-Estimation and Prediction for B Frames


  22. MPEG2


  23. References

    1. W. B. Pennebaker and J. L. Mitchell, JPEG: Still Image Data Compression Standard, Van Nostrand Reinhold, New York, 1993.
    2. ISO/IEC 11172-2: Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s, Part 2: Video (MPEG-1)
    3. ISO/IEC 13818-2: Generic coding of moving pictures and associated audio information, Part 2: Video (MPEG-2)


  24. Links to other Standards
