How video compression works 4/5 - Encoder secrets and motion compensation

by digipine posted Nov 02, 2017

Encoder secrets and motion compensation
 
The simplest and most thorough way to perform motion estimation is to evaluate every possible 16x16 region in the search area and select the best match. Typically, a "sum of absolute differences" (SAD) or "sum of squared differences" (SSD) computation is used to determine how closely a candidate 16x16 region matches a macroblock. The SAD or SSD is often computed for the luminance plane only, but can also include the chrominance planes. This approach, however, can be overly demanding on processors: exhaustively searching an area of 48x24 pixels requires over 8 billion arithmetic operations per second at VGA (640x480) video resolution and a frame rate of 30 frames per second.
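The arithmetic behind that figure: a 640x480 frame holds 1200 macroblocks, a 48x24 search area offers 33x9 = 297 candidate positions per macroblock, and each SAD over 256 pixels costs roughly 3 operations per pixel (subtract, absolute value, accumulate), so 1200 × 297 × 256 × 3 × 30 ≈ 8.2 billion operations per second. A minimal sketch of the exhaustive SAD search, using toy frames and 4x4 blocks instead of real 16x16 macroblocks purely for brevity:

```python
def sad(cur, ref, cx, cy, rx, ry, bs):
    """Sum of absolute differences between the bs x bs block of `cur`
    anchored at (cx, cy) and the bs x bs block of `ref` at (rx, ry)."""
    return sum(abs(cur[cy + y][cx + x] - ref[ry + y][rx + x])
               for y in range(bs) for x in range(bs))

def full_search(cur, ref, bx, by, radius, bs=4):
    """Exhaustive search: evaluate every candidate offset within
    +/- `radius` pixels and return (dx, dy, sad) of the best match."""
    h, w = len(ref), len(ref[0])
    best = (0, 0, sad(cur, ref, bx, by, bx, by, bs))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = bx + dx, by + dy
            if 0 <= rx <= w - bs and 0 <= ry <= h - bs:
                s = sad(cur, ref, bx, by, rx, ry, bs)
                if s < best[2]:
                    best = (dx, dy, s)
    return best
```

For a current frame whose content is a pure translation of the reference, the search finds the offset that drives the SAD to zero; the cost is one SAD per position in the search window, which is exactly what makes the exhaustive approach so expensive.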
Because of this high computational load, practical implementations of motion estimation do not use an exhaustive search. Instead, motion estimation algorithms use various methods to select a limited number of promising candidate motion vectors (roughly 10 to 100 vectors in most cases) and evaluate only the 16x16 regions corresponding to these candidate vectors. One approach is to select the candidate motion vectors in several stages. For example, five initial candidate vectors may be selected and evaluated. The results are used to eliminate unlikely portions of the search area and zero in on the most promising portion of the search area. Five new vectors are selected and the process is repeated. After a few such stages, the best motion vector found so far is selected.
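Such a staged search can be sketched as follows. This is a simplified three-step-style search; the block size, the step schedule, and the five-candidate cross pattern are illustrative choices, not taken from any particular encoder:

```python
def sad(cur, ref, cx, cy, rx, ry, bs):
    # Sum of absolute differences between two bs x bs blocks.
    return sum(abs(cur[cy + y][cx + x] - ref[ry + y][rx + x])
               for y in range(bs) for x in range(bs))

def staged_search(cur, ref, bx, by, step=4, bs=4):
    """Coarse-to-fine search: at each stage evaluate the current best
    offset plus four neighbours `step` pixels away, keep the winner,
    halve the step, and repeat -- a handful of SADs per stage instead
    of one SAD per position in the search area."""
    h, w = len(ref), len(ref[0])
    dx = dy = 0
    while step >= 1:
        best = None
        for cdx, cdy in [(dx, dy), (dx - step, dy), (dx + step, dy),
                         (dx, dy - step), (dx, dy + step)]:
            rx, ry = bx + cdx, by + cdy
            if 0 <= rx <= w - bs and 0 <= ry <= h - bs:
                s = sad(cur, ref, bx, by, rx, ry, bs)
                if best is None or s < best[0]:
                    best = (s, cdx, cdy)
        _, dx, dy = best
        step //= 2
    return dx, dy
```

With steps of 4, 2, and 1 this evaluates at most 15 candidates, versus 49 for an exhaustive radius-3 search; the tradeoff is that a staged search can settle on a local minimum rather than the true best vector.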

Another approach analyzes the motion vectors previously selected for surrounding macro blocks in the current and previous frames in an effort to predict the motion in the current macro block. A handful of candidate motion vectors are selected based on this analysis, and only these vectors are evaluated.
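A minimal sketch of neighbour-based candidate selection. The component-wise median of neighbouring blocks' vectors is a common predictor in standard codecs; the exact neighbour set and the inclusion of the zero vector here are illustrative:

```python
def predicted_candidates(neighbour_mvs):
    """Build a small candidate list from neighbouring blocks' motion
    vectors: the zero vector, each neighbour's own vector, and their
    component-wise median. Only these few candidates are then scored
    with SAD, instead of scanning the whole search area."""
    candidates = {(0, 0)}
    candidates.update(neighbour_mvs)
    if neighbour_mvs:
        xs = sorted(mv[0] for mv in neighbour_mvs)
        ys = sorted(mv[1] for mv in neighbour_mvs)
        candidates.add((xs[len(xs) // 2], ys[len(ys) // 2]))
    return sorted(candidates)
```

When neighbouring blocks move coherently (as they do across most of a typical scene), one of these few candidates is usually very close to the true motion, so little search effort is wasted.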

By selecting a small number of candidate vectors instead of scanning the search area exhaustively, the computational demand of motion estimation can be reduced considerably, sometimes by over two orders of magnitude. But there is a tradeoff between processing load and image quality or compression efficiency: in general, searching a larger number of candidate motion vectors allows the encoder to find a block in the reference frame that better matches each block in the current frame, thus reducing the prediction error. The lower the prediction error, the fewer bits are needed to encode the image. So increasing the number of candidate vectors allows a reduction in compressed bit rate, at the cost of performing more SAD (or SSD) computations. Alternatively, increasing the number of candidate vectors while holding the compressed bit rate constant allows the prediction error to be encoded with higher precision, improving image quality.

Some codecs (including H.264) allow a 16x16 macroblock to be subdivided into smaller blocks (e.g., various combinations of 8x8, 4x8, 8x4, and 4x4 blocks) to lower the prediction error. Each of these smaller blocks can have its own motion vector. The motion estimation search for such a scheme begins by finding a good position for the entire 16x16 block. If the match is close enough, there's no need to subdivide further. But if the match is poor, then the algorithm starts at the best position found so far, and further subdivides the original block into 8x8 blocks. For each 8x8 block, the algorithm searches for the best position near the position selected by the 16x16 search. Depending on how quickly a good match is found, the algorithm can continue the process using smaller blocks of 8x4, 4x8, etc.
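The subdivision logic can be sketched as follows. This is a toy version: 4x4 blocks splitting once into 2x2, a plain SAD threshold as the "close enough" test, and a one-pixel refinement around the parent's vector are all illustrative simplifications (real encoders such as H.264 use rate-distortion criteria and more partition shapes):

```python
def sad(cur, ref, cx, cy, rx, ry, bs):
    # Sum of absolute differences between two bs x bs blocks.
    return sum(abs(cur[cy + y][cx + x] - ref[ry + y][rx + x])
               for y in range(bs) for x in range(bs))

def local_search(cur, ref, bx, by, bs, radius, centre=(0, 0)):
    """Best (sad, dx, dy) within +/- `radius` pixels of `centre`."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(centre[1] - radius, centre[1] + radius + 1):
        for dx in range(centre[0] - radius, centre[0] + radius + 1):
            rx, ry = bx + dx, by + dy
            if 0 <= rx <= w - bs and 0 <= ry <= h - bs:
                s = sad(cur, ref, bx, by, rx, ry, bs)
                if best is None or s < best[0]:
                    best = (s, dx, dy)
    return best

def estimate_block(cur, ref, bx, by, bs, radius, threshold):
    """Search the whole block first; only if the best match is poor
    (SAD above `threshold`) split into four half-size sub-blocks and
    refine each one around the parent's best vector."""
    s, dx, dy = local_search(cur, ref, bx, by, bs, radius)
    if s <= threshold or bs <= 2:
        return [((bx, by, bs), (dx, dy))]
    half = bs // 2
    result = []
    for oy in (0, half):
        for ox in (0, half):
            _, sdx, sdy = local_search(cur, ref, bx + ox, by + oy,
                                       half, 1, (dx, dy))
            result.append(((bx + ox, by + oy, half), (sdx, sdy)))
    return result
```

A well-matched block is returned whole with a single vector; a poorly matched one comes back as four sub-blocks, each with its own vector.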

Even with drastic reductions in the number of candidate motion vectors, motion estimation is the most computationally demanding task in video compression applications. The inclusion of motion estimation makes video encoding much more computationally demanding than decoding. Motion estimation can require as much as 80% of the processor cycles spent in the video encoder. Therefore, many processors targeting multimedia applications provide a specialized instruction to accelerate SAD computations or a dedicated SAD coprocessor to offload this task from the CPU. For example, ARM's ARM11 core provides an instruction to accelerate SAD computation, and some members of Texas Instruments' TMS320C55x family of DSPs provide an SAD coprocessor.

Note that in order to perform motion estimation, the encoder must keep one or two reference frames in memory in addition to the current frame. The required frame buffers are very often larger than the available on-chip memory, requiring additional memory chips in many applications. Keeping reference frames in off-chip memory results in very high external memory bandwidth in the encoder, although large on-chip caches can help reduce the required bandwidth considerably.

Encoder Secrets
In addition to the two approaches described above, many other methods for selecting appropriate candidate motion vectors exist, including a wide variety of proprietary solutions. Most video compression standards specify only the format of the compressed video bit stream and the decoding steps and leave the encoding process undefined so that encoders can employ a variety of approaches to motion estimation. The approach to motion estimation is the largest differentiator among video encoder implementations that comply with a common standard. The choice of motion estimation technique significantly affects computational requirements and video quality; therefore, details of the approach to motion estimation in commercially available encoders are often closely guarded trade secrets.

Motion Compensation
In the video decoder, motion compensation uses the motion vectors encoded in the compressed bit stream to predict the pixels in each macroblock. If the horizontal and vertical components of the motion vector are both integer values, the predicted macroblock is simply a copy of the 16x16-pixel region of the reference frame indicated by the motion vector. If either component has a non-integer value, interpolation is used to estimate the image at non-integer pixel locations. Next, the prediction error is decoded and added to the predicted macroblock in order to reconstruct the actual macroblock pixels. As mentioned earlier, the 16x16 macroblock may be subdivided into smaller sections with independent motion vectors.
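A hedged sketch of the copy-or-interpolate step, assuming motion vectors in half-pixel units with bilinear averaging (as in MPEG-style half-pel compensation); the 4x4 block size and the rounding choice are illustrative:

```python
def motion_compensate(ref, bx, by, mvx, mvy, bs=4):
    """Predict a bs x bs block anchored at (bx, by) from `ref`.
    The motion vector (mvx, mvy) is in half-pixel units: the integer
    part selects the region, a half-pel remainder triggers bilinear
    interpolation between neighbouring integer pixels."""
    ix, fx = mvx // 2, mvx % 2
    iy, fy = mvy // 2, mvy % 2
    pred = [[0] * bs for _ in range(bs)]
    for y in range(bs):
        for x in range(bs):
            ry, rx = by + iy + y, bx + ix + x
            a = ref[ry][rx]            # the four integer-position
            b = ref[ry][rx + fx]       # samples surrounding the
            c = ref[ry + fy][rx]       # (possibly fractional)
            d = ref[ry + fy][rx + fx]  # reference position
            pred[y][x] = (a + b + c + d + 2) // 4  # rounded average
    return pred

def reconstruct(pred, residual):
    """Add the decoded prediction error to the predicted block."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, residual)]
```

With an integer vector the four samples coincide and the result is a plain copy; with a half-pel component the average implements the interpolation. Either way the cost is one small copy or average per macroblock, far cheaper than the many SADs of motion estimation.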

Compared to motion estimation, motion compensation is much less computationally demanding. While motion estimation must perform SAD or SSD computation on a number of 16-pixel by 16-pixel regions in an attempt to find the best match for each macro block, motion compensation simply copies or interpolates one such region for each macro block. Still, motion compensation can consume as much as 40% of the processor cycles in a video decoder, though this number varies greatly depending on the content of a video sequence, the video compression standard, and the decoder implementation. For example, the motion compensation workload can comprise as little as 5% of the processor cycles spent in the decoder for a frame that makes little use of interpolation.

Like motion estimation, motion compensation requires the video decoder to keep one or two reference frames in memory, often requiring external memory chips for this purpose. However, motion compensation makes fewer accesses to reference frame buffers than does motion estimation. Therefore, memory bandwidth requirements are less stringent for motion compensation compared to motion estimation, although high memory bandwidth is still desirable for best processor performance.
