How video compression works 4/5 - Encoder secrets and motion compensation

by digipine posted Nov 02, 2017

Encoder secrets and motion compensation
 
The simplest and most thorough way to perform motion estimation is to evaluate every possible 16x16 region in the search area and select the best match. Typically, a "sum of absolute differences" (SAD) or "sum of squared differences" (SSD) computation is used to determine how closely a candidate 16x16 region matches a macroblock. The SAD or SSD is often computed for the luminance plane only, but can also include the chrominance planes. This approach, however, can be overly demanding on processors: exhaustively searching an area of 48x24 pixels requires over 8 billion arithmetic operations per second at VGA (640x480) video resolution and a frame rate of 30 frames per second.
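As a concrete sketch of the exhaustive approach, the following illustrative Python (names and the search radius are our own, not from any standard; NumPy is assumed) computes the SAD for every candidate position in a window and keeps the best:

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equal-sized luma blocks."""
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def full_search(ref, cur, mb_y, mb_x, radius=8, n=16):
    """Exhaustively evaluate every n x n candidate within +/-radius pixels
    of the macroblock at (mb_y, mb_x); return (best_vector, best_sad)."""
    block = cur[mb_y:mb_y+n, mb_x:mb_x+n]
    h, w = ref.shape
    best, best_cost = (0, 0), float('inf')
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = mb_y + dy, mb_x + dx
            # skip candidates that fall outside the reference frame
            if 0 <= y and y + n <= h and 0 <= x and x + n <= w:
                cost = sad(block, ref[y:y+n, x:x+n])
                if cost < best_cost:
                    best, best_cost = (dy, dx), cost
    return best, best_cost
```

Even this small +/-8 window evaluates 289 candidate positions per macroblock, each a 256-pixel SAD, which is why exhaustive search is rarely practical.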
Because of this high computational load, practical implementations of motion estimation do not use an exhaustive search. Instead, motion estimation algorithms use various methods to select a limited number of promising candidate motion vectors (roughly 10 to 100 vectors in most cases) and evaluate only the 16x16 regions corresponding to these candidate vectors. One approach is to select the candidate motion vectors in several stages. For example, five initial candidate vectors may be selected and evaluated. The results are used to eliminate unlikely portions of the search area and zero in on the most promising portion of the search area. Five new vectors are selected and the process is repeated. After a few such stages, the best motion vector found so far is selected.
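As a toy illustration of such a multi-stage strategy (a hypothetical sketch in the spirit of classic logarithmic searches, not any standard's algorithm), the code below evaluates five candidates per stage, recentres on the cheapest, and halves the step size:

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equal-sized luma blocks."""
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def staged_search(ref, cur, mb_y, mb_x, step=4, n=16):
    """Evaluate the current centre plus four candidates one step away,
    recentre on the cheapest, halve the step, and repeat until step < 1."""
    block = cur[mb_y:mb_y+n, mb_x:mb_x+n]
    h, w = ref.shape

    def cost(dy, dx):
        y, x = mb_y + dy, mb_x + dx
        if 0 <= y and y + n <= h and 0 <= x and x + n <= w:
            return sad(block, ref[y:y+n, x:x+n])
        return float('inf')  # candidate falls outside the reference frame

    cy, cx = 0, 0  # start from the zero motion vector
    while step >= 1:
        cands = [(cy, cx), (cy - step, cx), (cy + step, cx),
                 (cy, cx - step), (cy, cx + step)]
        cy, cx = min(cands, key=lambda v: cost(*v))
        step //= 2
    return cy, cx
```

Three stages of five candidates evaluate at most 15 positions instead of hundreds, at the risk of settling on a local minimum.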

Another approach analyzes the motion vectors previously selected for surrounding macro blocks in the current and previous frames in an effort to predict the motion in the current macro block. A handful of candidate motion vectors are selected based on this analysis, and only these vectors are evaluated.
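A tiny illustration of this idea: given the vectors already chosen for the left, top, and top-right neighbours, a predictor can seed the candidate list. The component-wise median is the motion-vector predictor H.264 uses; the helper below is otherwise hypothetical:

```python
def predicted_candidates(left, top, top_right):
    """Build a short candidate list from neighbouring macroblocks' motion
    vectors: their component-wise median first, then the raw neighbours
    and the zero vector, with duplicates removed."""
    neighbours = [left, top, top_right]
    median = (sorted(v[0] for v in neighbours)[1],
              sorted(v[1] for v in neighbours)[1])
    cands = [median] + neighbours + [(0, 0)]
    seen, out = set(), []
    for v in cands:          # drop duplicates while preserving order
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out
```

Only these few vectors are then evaluated with SAD, rather than an entire search window.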

By selecting a small number of candidate vectors instead of scanning the search area exhaustively, the computational demand of motion estimation can be reduced considerably, sometimes by more than two orders of magnitude. But there is a tradeoff between processing load and image quality or compression efficiency: in general, searching a larger number of candidate motion vectors allows the encoder to find a block in the reference frame that better matches each block in the current frame, reducing the prediction error. The lower the prediction error, the fewer bits are needed to encode the image. So increasing the number of candidate vectors allows a reduction in compressed bit rate, at the cost of performing more SAD (or SSD) computations. Alternatively, increasing the number of candidate vectors while holding the compressed bit rate constant allows the prediction error to be encoded with higher precision, improving image quality.

Some codecs (including H.264) allow a 16x16 macroblock to be subdivided into smaller blocks (in H.264, 16x8, 8x16, or 8x8 partitions, with each 8x8 partition optionally split further into 8x4, 4x8, or 4x4 sub-blocks) to lower the prediction error. Each of these smaller blocks can have its own motion vector. The motion estimation search for such a scheme begins by finding a good position for the entire 16x16 block. If the match is close enough, there is no need to subdivide further. But if the match is poor, the algorithm starts at the best position found so far and subdivides the original block into 8x8 blocks. For each 8x8 block, the algorithm searches for the best position near the one selected by the 16x16 search. Depending on how quickly a good match is found, the algorithm can continue the process with still smaller blocks of 8x4, 4x8, and so on.
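A sketch of this coarse-to-fine partitioning follows; the cost threshold and the small +/-2 refinement radius are arbitrary illustration values, not taken from any standard:

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equal-sized luma blocks."""
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def refine(ref, cur, y, x, n, radius=2):
    """Best vector for the n x n block at (y, x) within +/-radius of zero."""
    block = cur[y:y+n, x:x+n]
    h, w = ref.shape
    best, best_cost = (0, 0), float('inf')
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if 0 <= y + dy and y + dy + n <= h and 0 <= x + dx and x + dx + n <= w:
                c = sad(block, ref[y+dy:y+dy+n, x+dx:x+dx+n])
                if c < best_cost:
                    best, best_cost = (dy, dx), c
    return best, best_cost

def partition_search(ref, cur, y, x, threshold=2000):
    """Search the whole 16x16 macroblock first; only if the residual is
    still large, split into four 8x8 blocks with independent vectors."""
    vec, cost = refine(ref, cur, y, x, 16)
    if cost <= threshold:
        return [("16x16", vec, cost)]
    parts = []
    for oy in (0, 8):
        for ox in (0, 8):
            v, c = refine(ref, cur, y + oy, x + ox, 8)
            parts.append((f"8x8@({oy},{ox})", v, c))
    return parts
```

Subdividing only on a poor match keeps both the search cost and the motion-vector bit cost low for blocks that a single vector already predicts well.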

Even with drastic reductions in the number of candidate motion vectors, motion estimation is the most computationally demanding task in video compression applications. The inclusion of motion estimation makes video encoding much more computationally demanding than decoding. Motion estimation can require as much as 80% of the processor cycles spent in the video encoder. Therefore, many processors targeting multimedia applications provide a specialized instruction to accelerate SAD computations or a dedicated SAD coprocessor to offload this task from the CPU. For example, ARM's ARM11 core provides an instruction to accelerate SAD computation, and some members of Texas Instruments' TMS320C55x family of DSPs provide an SAD coprocessor.

Note that in order to perform motion estimation, the encoder must keep one or two reference frames in memory in addition to the current frame. The required frame buffers are very often larger than the available on-chip memory, requiring additional memory chips in many applications. Keeping reference frames in off-chip memory results in very high external memory bandwidth in the encoder, although large on-chip caches can help reduce the required bandwidth considerably.
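The arithmetic behind that statement, worked through for VGA video in 4:2:0 format with 8-bit samples (a representative example, not a figure from the text):

```python
# Frame-buffer footprint for one VGA frame in 4:2:0 format: the full-size
# luma plane plus two quarter-size chroma planes is 1.5 bytes per pixel.
width, height = 640, 480
bytes_per_frame = width * height * 3 // 2      # 460,800 bytes per frame
two_reference_frames = 2 * bytes_per_frame     # 921,600 bytes
print(bytes_per_frame, two_reference_frames)   # prints 460800 921600
```

Nearly a megabyte for two reference frames, before counting the current frame, comfortably exceeds the on-chip SRAM of most embedded processors.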

Secret Encoders
In addition to the two approaches described above, many other methods for selecting appropriate candidate motion vectors exist, including a wide variety of proprietary solutions. Most video compression standards specify only the format of the compressed video bit stream and the decoding steps and leave the encoding process undefined so that encoders can employ a variety of approaches to motion estimation. The approach to motion estimation is the largest differentiator among video encoder implementations that comply with a common standard. The choice of motion estimation technique significantly affects computational requirements and video quality; therefore, details of the approach to motion estimation in commercially available encoders are often closely guarded trade secrets.

Motion Compensation
In the video decoder, motion compensation uses the motion vectors encoded in the compressed bit stream to predict the pixels of each macroblock. If the horizontal and vertical components of the motion vector are both integer values, the predicted macroblock is simply a copy of the corresponding 16-pixel by 16-pixel region of the reference frame, offset by the motion vector. If either component has a non-integer value, interpolation is used to estimate the image at non-integer pixel locations. Next, the prediction error is decoded and added to the predicted macroblock to reconstruct the actual macroblock pixels. As mentioned earlier, the 16x16 macroblock may be subdivided into smaller sections with independent motion vectors.
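A minimal sketch of the prediction step, using bilinear interpolation for fractional vector components (real codecs such as H.264 use longer filters, e.g. a six-tap filter for half-pel luma positions; bilinear is used here for brevity, and the block is assumed to lie in the frame interior):

```python
import numpy as np

def motion_compensate(ref, mb_y, mb_x, mv_y, mv_x, n=16):
    """Predict an n x n macroblock from the reference frame. Integer
    vectors reduce to a plain copy; fractional components blend the
    four surrounding pixels bilinearly."""
    iy, ix = int(np.floor(mv_y)), int(np.floor(mv_x))
    fy, fx = mv_y - iy, mv_x - ix          # fractional parts in [0, 1)
    y, x = mb_y + iy, mb_x + ix
    patch = ref[y:y+n+1, x:x+n+1].astype(np.float64)
    a = patch[:n, :n]        # top-left neighbours
    b = patch[:n, 1:n+1]     # top-right
    c = patch[1:n+1, :n]     # bottom-left
    d = patch[1:n+1, 1:n+1]  # bottom-right
    pred = a*(1-fy)*(1-fx) + b*(1-fy)*fx + c*fy*(1-fx) + d*fy*fx
    return np.rint(pred).astype(np.uint8)
```

With integer components the fractional weights collapse to zero and the function degenerates to a straight copy, exactly as the text describes.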

Compared to motion estimation, motion compensation is much less computationally demanding. While motion estimation must perform SAD or SSD computation on a number of 16-pixel by 16-pixel regions in an attempt to find the best match for each macro block, motion compensation simply copies or interpolates one such region for each macro block. Still, motion compensation can consume as much as 40% of the processor cycles in a video decoder, though this number varies greatly depending on the content of a video sequence, the video compression standard, and the decoder implementation. For example, the motion compensation workload can comprise as little as 5% of the processor cycles spent in the decoder for a frame that makes little use of interpolation.

Like motion estimation, motion compensation requires the video decoder to keep one or two reference frames in memory, often requiring external memory chips for this purpose. However, motion compensation makes fewer accesses to reference frame buffers than does motion estimation. Therefore, memory bandwidth requirements are less stringent for motion compensation compared to motion estimation, although high memory bandwidth is still desirable for best processor performance.
