Jack Erickson, MathWorks
Low-latency video processing applications rely on FPGA and ASIC hardware to process large amounts of incoming pixel data. But high-resolution formats such as 4K and 8K, as well as high-frame-rate video, contain too many pixels per second to process serially. Digital hardware allows for parallelism, but many algorithms, such as filters and edge detection, operate on windows of contiguous pixels, which makes efficient parallel processing challenging.
Vision HDL Toolbox™ natively supports multiple-pixel-per-clock processing. Its Frame-to-Pixels and Pixels-to-Frame gateway blocks offer a simple setting to switch the design’s inputs and outputs from one pixel at a time to 4 or 8 pixels in parallel. Supported algorithms, such as the Image Filter and Edge Detector blocks shown in this example, automatically update their architectures based on the specified level of parallelism. They simulate this behavior with the correct latency, and with HDL Coder™ they generate synthesizable RTL that shares resources among the overlapping neighborhood-processing windows. As a result, resource usage scales sub-linearly with the number of pixels per clock.
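To see why resource usage scales sub-linearly, consider a 3×3 neighborhood filter processing P pixels per clock. The P kernel windows overlap, so a single combined window of 3×(3 + P − 1) pixels covers all of them. The sketch below is a conceptual illustration, not Vision HDL Toolbox code; the helper names are chosen for clarity.

```python
# Conceptual illustration (hypothetical helpers, not Toolbox API): how
# overlapping kernel windows share pixel storage at P pixels per clock.

def shared_window_width(kernel_w: int, pixels_per_clock: int) -> int:
    """Width of the combined window covering P overlapping kernels."""
    return kernel_w + pixels_per_clock - 1

def window_pixels(kernel_h: int, kernel_w: int, pixels_per_clock: int) -> int:
    """Pixels that must be available per clock for P parallel outputs."""
    return kernel_h * shared_window_width(kernel_w, pixels_per_clock)

for p in (1, 4, 8):
    naive = 3 * 3 * p                 # p independent 3x3 windows
    shared = window_pixels(3, 3, p)   # one combined 3x(3+p-1) window
    print(f"{p} px/clock: naive={naive}, shared={shared}")
```

At 4 pixels per clock the shared window holds 18 pixels rather than the 36 that four independent 3×3 windows would require, and at 8 pixels per clock it holds 30 rather than 72.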
To develop custom algorithms that take advantage of this multi-pixel-per-clock capability, use the Line Buffer block as shown in this video.
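The role the Line Buffer block plays can be sketched behaviorally: it stores the most recent kernel-height video lines and, on each clock, presents the combined window that feeds all of the parallel kernel outputs. The Python below is a simplified behavioral model with hypothetical names and zero padding at the right edge, not the block's actual implementation or boundary handling.

```python
from collections import deque

def line_buffer_windows(frame, kernel_h=3, kernel_w=3, pixels_per_clock=4):
    """Behavioral sketch (hypothetical, not the Line Buffer block itself):
    buffer kernel_h lines, then each 'clock' emit the combined
    kernel_h x (kernel_w + P - 1) window for P parallel outputs."""
    pad = kernel_w - 1                     # zero-pad the right edge
    w = kernel_w + pixels_per_clock - 1    # combined window width
    rows = deque(maxlen=kernel_h)
    windows = []
    for line in frame:
        rows.append(list(line) + [0] * pad)
        if len(rows) < kernel_h:
            continue                       # still filling the line buffer
        for x in range(0, len(line), pixels_per_clock):
            windows.append([r[x:x + w] for r in rows])
    return windows

# 4-line, 8-pixel-wide frame at 4 pixels/clock: each buffered line pair
# yields two combined 3x6 windows.
frame = [[i * 8 + j for j in range(8)] for i in range(4)]
wins = line_buffer_windows(frame)
```

In hardware, each buffered line maps to on-chip RAM, and only the combined window's pixels are registered each clock, which is what lets a custom multi-pixel algorithm reuse the same storage across its overlapping neighborhoods.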