  Multi-frame Motion-Compensated Prediction

 

Long-Term Memory Motion-Compensated Prediction

Long-term memory motion-compensated prediction extends the spatial displacement vector used in block-based hybrid video coding by a variable time delay, permitting the use of more frames than just the previously decoded one for motion-compensated prediction. The long-term memory covers several seconds of decoded frames at encoder and decoder. In most cases, the use of multiple frames for motion compensation provides significantly improved prediction gain. The variable time delay has to be transmitted as side information, which requires additional bit-rate that may become prohibitive when the long-term memory grows too large. Therefore, we control the bit-rate of the motion information by employing rate-constrained motion estimation. Simulation results are obtained by integrating long-term memory prediction into an H.263 codec. Reconstruction PSNR improvements of up to 1.5 dB for the Foreman sequence and 0.9 dB for the Mother-Daughter sequence are demonstrated in comparison to the TMN-10 H.263 coder. These PSNR improvements correspond to bit-rate savings of up to 23 % and 17 %, respectively. Mathematical inequalities are used to speed up motion estimation while achieving the full prediction gain. Long-term memory prediction can also be beneficially applied to transmission over error-prone channels. We present a framework that incorporates an estimated error into rate-constrained motion estimation and mode decision. Experimental results with a Rayleigh fading channel show that long-term memory prediction significantly outperforms the single-frame prediction H.263-based anchor. When a feedback channel is available, the decoder can inform the encoder about successful or unsuccessful transmission events by sending positive (ACK) or negative (NACK) acknowledgments. This information is used to update the error estimates at the encoder. Similar concepts, such as the ACK and NACK modes known from the H.263 standard, are unified into a general framework providing superior transmission performance.
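The rate-constrained search over displacement and time delay described above can be illustrated with a minimal sketch. It is not the codec's implementation: the SAD distortion measure, the integer-pel full search, the `vector_bits` rate model and all function names are assumptions made for illustration only.

```python
import math

def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def vector_bits(dx, dy, delay):
    """Hypothetical rate model (an assumption, not the H.263 tables):
    larger components cost more bits, zero costs one bit."""
    def bits(v):
        return 1 + 2 * math.floor(math.log2(abs(v) + 1))
    return bits(dx) + bits(dy) + bits(delay)

def search(cur, memory, bx, by, n, search_range, lam):
    """Full search over (dx, dy, delay) minimising J = SAD + lam * R.
    memory[0] is the most recently decoded frame, memory[1] the one
    before it, and so on. Returns (cost, dx, dy, delay)."""
    h, w = len(cur), len(cur[0])
    best = None
    for delay, ref in enumerate(memory):
        for dy in range(-search_range, search_range + 1):
            for dx in range(-search_range, search_range + 1):
                # Skip candidates that would read outside the reference frame.
                if by + dy < 0 or by + dy + n > h:
                    continue
                if bx + dx < 0 or bx + dx + n > w:
                    continue
                cost = (sad(cur, ref, bx, by, dx, dy, n)
                        + lam * vector_bits(dx, dy, delay))
                if best is None or cost < best[0]:
                    best = (cost, dx, dy, delay)
    return best

# Toy data: the block content only reappears two frames back, shifted by one pixel.
cur = [[(7 * x + 13 * y) % 31 for x in range(8)] for y in range(8)]
shifted = [[cur[y][x - 1] if x > 0 else 0 for x in range(8)] for y in range(8)]
memory = [[[0] * 8 for _ in range(8)], [[5] * 8 for _ in range(8)], shifted]

free = search(cur, memory, 2, 2, 4, 1, lam=0)      # rate is ignored
tight = search(cur, memory, 2, 2, 4, 1, lam=1000)  # rate dominates the cost
```

With `lam=0` the search picks the older frame that matches exactly; with a large `lam` it falls back to the cheapest vector, which is the trade-off that keeps the side information for the time delay under control.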

 

 

Long-Term Memory Prediction

 

The long-term memory prediction scheme has been proposed to ITU-T/SG16/Q15. Various submissions have been made to that group, which decided at its February 1999 meeting in Monterey, CA, USA, to adopt the feature. The name Enhanced Reference Picture Selection has been coined for the scheme, since there already exists an Annex named Reference Picture Selection that uses several frames for increased error resilience. Enhanced Reference Picture Selection is planned to be included as an integral part of ITU-T Recommendation H.263 as Annex U. You can download the latest version of draft Annex U here: Annex U.

 

Affine Multi-Frame Motion-Compensated Prediction

Multi-frame affine prediction extends motion compensation from the previous frame to several past decoded frames and warped versions thereof. Affine motion parameters describe the warping. In contrast to translational motion compensation, the affine motion parameters must be assigned to large image segments to obtain a rate-distortion efficient motion representation. These large image segments usually cannot be chosen so as to partition the image uniformly. Hence, encoding proceeds in four steps: (i) estimation of several affine motion parameter sets between the current and previous frames, (ii) generation of the multi-frame buffer consisting of past decoded frames and affine warped frames, (iii) multi-frame block-based hybrid video encoding, and (iv) determination of the efficient number of motion models using Lagrangian optimization techniques. A significant improvement in coding efficiency can be observed when comparing the multi-frame affine motion coder to the TMN-10 coder, the rate-distortion optimized test model of the H.263 standard. At a fixed quality of 34 dB PSNR, the proposed coder achieves a 24 % bit-rate reduction over a set of 8 different test sequences. The bit-rate savings within this set of test sequences vary from 35 % to 15 %, which correspond to PSNR gains of 3 dB and 0.8 dB, respectively. It is shown that both concepts, affine motion and long-term memory prediction, contribute to the overall gain.
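Step (ii) above, building the multi-frame buffer from past decoded frames and affine warped frames, can be sketched as follows. This is only an illustration: the six-parameter ordering, the nearest-neighbour sampling with edge clamping, and the names `affine_warp` and `build_buffer` are assumptions, not the coder's actual interpolation or buffer syntax.

```python
def affine_warp(frame, params):
    """Warp `frame` (a list of rows) with the affine model
        x_src = a1 * x + a2 * y + a3
        y_src = a4 * x + a5 * y + a6
    using nearest-neighbour sampling; source positions are clamped to the frame."""
    a1, a2, a3, a4, a5, a6 = params
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            xs = int(round(a1 * x + a2 * y + a3))
            ys = int(round(a4 * x + a5 * y + a6))
            xs = min(max(xs, 0), w - 1)
            ys = min(max(ys, 0), h - 1)
            row.append(frame[ys][xs])
        out.append(row)
    return out

def build_buffer(decoded_frames, model_sets):
    """Multi-frame buffer: the past decoded frames, followed by one warped
    frame per affine parameter set. Each entry of `model_sets` is a pair
    (index of the decoded frame to warp, six affine parameters)."""
    buffer = list(decoded_frames)
    for index, params in model_sets:
        buffer.append(affine_warp(decoded_frames[index], params))
    return buffer

frame = [[0, 1, 2], [3, 4, 5]]
identity = (1, 0, 0, 0, 1, 0)
shift_left = (1, 0, 1, 0, 1, 0)   # each pixel is fetched from one column to the right
buffer = build_buffer([frame], [(0, shift_left)])
```

The block-based encoder of step (iii) can then treat every entry of `buffer` as an ordinary reference frame, so the warped frames are addressed in the same way as the time delay in long-term memory prediction.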

 

 

Block diagram of the proposed combined affine and long-term memory motion compensator.

 

 

Multi-Hypothesis Motion-Compensated Prediction

Multi-hypothesis prediction extends motion compensation with one prediction signal to the linear superposition of several motion-compensated prediction signals, resulting in increased coding efficiency. The multiple hypotheses in this work are blocks in past decoded frames. These blocks are referenced by individual motion vectors and picture reference parameters, thereby incorporating long-term memory motion-compensated prediction. We employ at most two hypotheses, similar to B-frames; however, both are obtained from the past. Because the motion vectors require an increased rate, rate-constrained coder control is used. On this basis, we demonstrate the efficiency of multi-hypothesis prediction in combination with variable block size and long-term memory, and present bit-rate savings of up to 32 %. It turns out that the use of multiple reference frames enhances the efficiency of multi-hypothesis prediction.
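The linear superposition of two hypotheses can be illustrated with a small sketch. The equal weights, the squared-error measure and the toy block values are assumptions chosen so that the effect is easy to see; they are not taken from the experiments.

```python
def superpose(hypotheses, weights=None):
    """Linear superposition of equally sized blocks (lists of rows).
    Without explicit weights, all hypotheses are averaged."""
    k = len(hypotheses)
    if weights is None:
        weights = [1.0 / k] * k
    rows, cols = len(hypotheses[0]), len(hypotheses[0][0])
    return [[sum(w * h[r][c] for w, h in zip(weights, hypotheses))
             for c in range(cols)]
            for r in range(rows)]

def ssd(a, b):
    """Sum of squared differences between two blocks."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Toy 2 x 2 block and two hypotheses whose errors have opposite signs.
target = [[10, 20], [30, 40]]
hyp_a = [[12, 18], [33, 37]]
hyp_b = [[8, 22], [27, 43]]

combined = superpose([hyp_a, hyp_b])
errors = (ssd(hyp_a, target), ssd(hyp_b, target), ssd(combined, target))
```

In this constructed case the errors of the two hypotheses cancel completely, so the combined prediction is exact; with real video the cancellation is only partial, and the extra motion vector has to be paid for in rate.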

 

 

Multi-Hypothesis Long-Term Memory Prediction

 

 

Rate-Constrained Video Compression Using a 3-D Head Model

We show that traditional waveform coding and 3-D model-based coding are not competing alternatives but should be combined to support and complement each other. Both approaches are combined such that the generality of waveform coding and the efficiency of 3-D model-based coding are available where needed. The combination is achieved by providing the block-based video coder with a second reference frame for prediction, which is synthesized by the model-based coder. The model-based coder uses a parameterized 3-D head model specifying the shape and color of a person. We therefore restrict our investigations to typical videotelephony scenarios that show head-and-shoulder scenes. Motion and deformation of the 3-D head model constitute facial expressions, which are represented by facial animation parameters (FAPs) based on the MPEG-4 standard. An intensity gradient-based approach that exploits the 3-D model information is used to estimate the FAPs as well as illumination parameters that describe changes of the brightness in the scene. Model failures and objects that are not known at the decoder are handled by standard block-based motion-compensated prediction, which is not restricted to a particular scene content but results in lower coding efficiency. A Lagrangian approach is employed to determine the most efficient prediction for each block from either the synthesized model frame or the previous decoded frame. Experiments on five video sequences show that bit-rate savings of about 35 % are achieved at equal average PSNR when comparing the model-aided codec to TMN-10, the state-of-the-art test model of the H.263 standard. This corresponds to a gain of 2-3 dB in PSNR when encoding at the same average bit-rate.
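The per-block Lagrangian choice between the synthesized model frame and the previous decoded frame can be sketched as below. The function name, the dictionary layout and the distortion and rate figures are hypothetical and serve only to show how the multiplier shifts the decision.

```python
def choose_reference(candidates, lam):
    """Return the candidate name with the smallest Lagrangian cost
    J = D + lam * R, where `candidates` maps a name to (distortion, rate_bits)."""
    return min(candidates,
               key=lambda name: candidates[name][0] + lam * candidates[name][1])

# Made-up figures for one block: the model frame needs few bits but leaves a
# larger residual error, the previous frame predicts better but needs more bits.
block = {
    "model_frame": (120, 4),
    "previous_frame": (90, 20),
}

low_lambda_choice = choose_reference(block, lam=1)    # costs 124 vs. 110
high_lambda_choice = choose_reference(block, lam=5)   # costs 140 vs. 190
```

A small multiplier favours the lower-distortion candidate, while a large multiplier favours the cheaper one, so the same rule applies across the whole range of bit-rates.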

 

 

 

 

Below, frame 150 of the Akiyo sequence is depicted. The frame is coded using the TMN-10 and the MBBC at the same bit-rate, left image: TMN-10 (31.08 dB PSNR, 720 bits), right image: MBBC (33.19 dB PSNR, 725 bits).

 

 

You can also take a look at the Quicktime movie of the sequence (5.6 Mb). My colleague, Peter Eisert, has also generated a web page about the project (Peter's page.)

 

 
Publications

 

    2001
  • Thomas Wiegand and Bernd Girod:
    Multi-Frame Motion-Compensated Prediction for Video Transmission,
    Kluwer Academic Publishers, September 2001.

Publications in Journals
Publications in Conference Proceedings
    2001
  • Detlev Marpe, Thomas Wiegand, and Hans L. Cycon:
    Wavelet-Based Video Compression Using Long-Term Memory Motion-Compensated Prediction and Context-Based Adaptive Arithmetic Coding,
    Second International Conference on Wavelet Analysis and Its Applications (ICWAA), Hong Kong, China, December 2001.
    (Download as pdf)

  • Markus Flierl, Thomas Wiegand, and Bernd Girod:
    Multihypothesis Pictures for H.26L,
    IEEE International Conference on Image Processing (ICIP'01), Thessaloniki, Greece, September 2001.
    (Download as pdf)

    2000
  • Markus Flierl, Thomas Wiegand, and Bernd Girod:
    Rate-Constrained Multi-Hypothesis Motion-Compensated Prediction for Video Coding,
    IEEE International Conference on Image Processing (ICIP'00), Vancouver, Canada, September 2000.
    (Download as pdf)

  • Markus Flierl, Thomas Wiegand, and Bernd Girod:
    A Video Codec Incorporating Block-Based Multi-Hypothesis Motion-Compensated Prediction,
    IS&T/SPIE Symposium on Visual Communications and Image Processing (VCIP'00), Perth, Australia, June 2000.
    (Download as pdf)

    1999
  • Eckehard Steinbach, Thomas Wiegand, and Bernd Girod:
    Using Multiple Global Motion Models for Improved Block-Based Video Coding,
    IEEE International Conference on Image Processing (ICIP'99), Kobe, Japan, Vol. 1, pp. 56-60, October 1999.
    (Download as pdf)

  • Thomas Wiegand, Eckehard Steinbach, and Bernd Girod:
    Long-Term Memory Prediction Using Affine Motion Compensation,
    IEEE International Conference on Image Processing (ICIP'99), Kobe, Japan, Vol. 1, pp. 51-54, October 1999.
    (Download as pdf)

    1998
  • Thomas Wiegand, Markus Flierl, and Bernd Girod:
    Entropy-Constrained Linear Vector Prediction for Motion-Compensated Video Coding,
    International Symposium on Information Theory (ISIT 1998), Boston, USA, p. 409, October 1998.
    (Download as pdf)

  • Thomas Wiegand, Bo Lincoln, and Bernd Girod:
    Fast Search for Long-Term Memory Motion-Compensated Prediction,
    IEEE International Conference on Image Processing (ICIP'98), Chicago, USA, Vol. 3, pp. 619-622, October 1998.
    (Download as pdf)

  • Bernd Girod, Thomas Wiegand, Eckehard Steinbach, Markus Flierl, and Xiaozheng Zhang:
    High-Order Motion Compensation for Low Bit-Rate Video,
    European Signal Processing Conference (EUSIPCO'98), Island of Rhodes, Greece, pp. 253-256, September 1998, Invited Paper.
    (Download as pdf)

  • Markus Flierl, Thomas Wiegand, and Bernd Girod:
    A Locally Optimal Design Algorithm for Block-Based Multi-Hypothesis Motion-Compensated Prediction,
    Data Compression Conference (DCC 1998), Snowbird, USA, pp. 239-248, March 1998.
    (Download as pdf)

  • Thomas Wiegand, Eckehard Steinbach, Axel Stensrud, and Bernd Girod:
    Multiple Reference Picture Video Coding Using Polynomial Motion Models,
    IS&T/SPIE Symposium on Visual Communications and Image Processing (VCIP'98), San Jose, USA, pp. 134-145, February 1998, Best Student Paper Award.


    1997
  • Thomas Wiegand, Xiaozheng Zhang, and Bernd Girod:
    Motion-Compensating Long-Term Memory Prediction,
    IEEE International Conference on Image Processing (ICIP'97), Santa Barbara, CA, USA, Vol. 2, pp. 53-56, October 1997.
    (Download as pdf)

  • Thomas Wiegand, Xiaozheng Zhang, and Bernd Girod:
    Block-Based Hybrid Video Coding Using Motion-Compensated Long-Term Memory Prediction,
    Picture Coding Symposium (PCS'97), Berlin, Germany, pp. 153-158, September 1997.
    (Download as pdf)

Contributions to Standardization
    2002
  • Thomas Wiegand and Karsten Sühring:
    Number of Allowed Reference Frames,
    Joint Video Team (JVT), Awaji Island, Japan, JVT-F081, December 2002.

  • Thomas Wiegand:
    Multi-Picture Handling,
    Joint Video Team (JVT), Klagenfurt, Austria, JVT-D018, July 2002.

    2000
  • Thomas Wiegand:
    Modification of Annex U for Enhanced Error Resilience,
    VCEG meeting - ITU-T SG16/Q.6, Osaka, Japan, ITU-T/SG16/Q15-J-33, May 2000, Proposal for H.263++.

  • Thomas Wiegand:
    Proposed Editorial Changes to Annex U,
    VCEG meeting - ITU-T SG16/Q.6, Osaka, Japan, ITU-T/SG16/Q15-J-49, May 2000, Proposal for H.263++.

  • Thomas Wiegand:
    Interoperability of Annex U with other Annexes,
    VCEG meeting - ITU-T SG16/Q.6, Osaka, Japan, ITU-T/SG16/Q15-J-50, May 2000, Proposal for H.263++.

  • Thomas Wiegand:
    Test Model Description for an Annex U Encoder,
    VCEG meeting - ITU-T SG16/Q.6, Osaka, Japan, ITU-T/SG16/Q15-J-51, May 2000, Proposal for H.263++.

  • Thomas Wiegand and Mike Nilsson:
    Annex U including Picture Numbers,
    VCEG meeting - ITU-T SG16/Q.6, Osaka, Japan, ITU-T/SG16/Q15-J-66, May 2000, Proposal for H.263++.


    1999
  • Thomas Wiegand and Eckehard Steinbach:
    Core Experiment Description on Affine Multiframe Motion Compensation,
    VCEG meeting - ITU-T SG16/Q.6, Red Bank, NJ, USA, ITU-T/SG16/Q15-I-42, October 1999, Proposal for H.263++.

  • Thomas Wiegand and Eckehard Steinbach:
    Results of Core Experiment on Affine Multiframe Motion Compensation,
    VCEG meeting - ITU-T SG16/Q.6, Red Bank, NJ, USA, ITU-T/SG16/Q15-I-43, October 1999, Proposal for H.263++.

  • Thomas Wiegand:
    Proposed Changes to Draft Annex U: Enhanced Reference Picture Selection,
    VCEG meeting - ITU-T SG16/Q.6, Red Bank, NJ, USA, ITU-T/SG16/Q15-I-44, October 1999, Proposal for H.263++.

  • Thomas Wiegand:
    Multiframe Buffering Syntax for H.26L,
    VCEG meeting - ITU-T SG16/Q.6, Red Bank, NJ, USA, ITU-T/SG16/Q15-I-45, October 1999, Proposal for H.263++.

  • Thomas Wiegand and Eckehard Steinbach:
    Description of Affine Core Experiment,
    VCEG meeting - ITU-T SG16/Q.6, Berlin, Germany, ITU-T/SG16/Q15-H-34, August 1999, Proposal for H.263++.

  • Thomas Wiegand, Eckehard Steinbach, Bernd Girod, and Barry D. Andrews:
    Further Work on Affine Multiframe Motion Compensation,
    VCEG meeting - ITU-T SG16/Q.6, Berlin, Germany, ITU-T/SG16/Q15-H-21, August 1999, Proposal for H.263++.

  • Thomas Wiegand:
    Proposed Draft for Annex U on Enhanced Reference Picture Selection,
    VCEG meeting - ITU-T SG16/Q.6, Berlin, Germany, ITU-T/SG16/Q15-H-30, August 1999, Proposal for H.263++.

  • Thomas Wiegand, Niko Färber, and Bernd Girod:
    Error Resilient Transmission Using Long-Term Memory Motion-Compensated Prediction,
    VCEG meeting - ITU-T SG16/Q.6, Berlin, Germany, ITU-T/SG16/Q15-H-24, August 1999, Proposal for H.263++.

  • Thomas Wiegand, Bernd Girod, and Barry D. Andrews:
    Core Experiment Description for Enhanced Reference Picture Selection,
    VCEG meeting - ITU-T SG16/Q.6, Monterey, CA, USA, ITU-T/SG16/Q15-G-19, February 1999, Proposal for H.263++.

  • Thomas Wiegand, Bernd Girod, and Barry D. Andrews:
    Results for Core Experiment on Enhanced Reference Picture Selection,
    VCEG meeting - ITU-T SG16/Q.6, Monterey, CA, USA, ITU-T/SG16/Q15-G-20, February 1999, Proposal for H.263++.

  • Thomas Wiegand, Eckehard Steinbach, Bernd Girod, and Barry D. Andrews:
    Video Coding Using Long-Term Memory and Affine Motion Compensation,
    VCEG meeting - ITU-T SG16/Q.6, Monterey, CA, USA, ITU-T/SG16/Q15-G-21, February 1999, Proposal for H.263++ and H.26L.


    1998
  • Thomas Wiegand, Niko Färber, Bernd Girod, and Barry D. Andrews:
    Proposed Draft for Annex on Enhanced Reference Picture Selection,
    VCEG meeting - ITU-T SG16/Q.6, Seoul, Korea, ITU-T/SG16/Q15-F-32r1, November 1998, Proposal for H.263++.

  • Thomas Wiegand, Eckehard Steinbach, Bernd Girod, and Barry D. Andrews:
    Video Coding Using Long-Term Memory and Affine Motion Compensation,
    VCEG meeting - ITU-T SG16/Q.6, Seoul, Korea, ITU-T/SG16/Q15-F-33, November 1998, Proposal for H.263++ and H.26L.

  • Thomas Wiegand, Niko Färber, Bernd Girod, and Barry D. Andrews:
    Results of Core Experiment: Long-Term Memory Motion-Compensated Prediction,
    VCEG meeting - ITU-T SG16/Q.6, Whistler, Canada, ITU-T/SG16/Q15-E-25, July 1998.

  • Thomas Wiegand, Niko Färber, Bernd Girod, and Barry D. Andrews:
    Long-Term Memory Motion-Compensated Prediction for Surveillance Applications,
    VCEG meeting - ITU-T SG16/Q.6, Whistler, Canada, ITU-T/SG16/Q15-E-44, July 1998.

  • Thomas Wiegand and Barry D. Andrews:
    Core Experiment Description for Enhanced Reference Picture Selection,
    VCEG meeting - ITU-T SG16/Q.6, Whistler, Canada, ITU-T/SG16/Q15-E-52, July 1998.

  • Thomas Wiegand, Xiaozheng Zhang, Bernd Girod, and Barry D. Andrews:
    Long-Term Memory Motion-Compensated Prediction,
    VCEG meeting - ITU-T SG16/Q.6, Tampere, Finland, ITU-T/SG16/Q15-D-54, April 1998.

  • Thomas Wiegand, Xiaozheng Zhang, Bernd Girod, and Barry D. Andrews:
    Fast Search for Long-Term Memory Motion-Compensated Prediction,
    VCEG meeting - ITU-T SG16/Q.6, Tampere, Finland, ITU-T/SG16/Q15-D-55, April 1998.


    1997
  • Thomas Wiegand, Xiaozheng Zhang, Bernd Girod, and Barry D. Andrews:
    Long-Term Memory Motion-Compensated Prediction,
    VCEG meeting - ITU-T SG16/Q.6, Eibsee, Germany, ITU-T/SG16/Q15-C-11, December 1997.