FPGA Acceleration for Feature-Based Processing Applications

Dhiraj Raut
Accelerated Image Processing using FPGAs.
3 min read · May 1, 2021


Feature-based vision applications depend on efficiently extracting and analyzing features from images, where both throughput and latency matter.
The paper summarized here describes the implementation of an algorithm that combines a distributed feature detector (D-HCD) with a rotation-invariant feature descriptor (R-HOG). Based on an algorithmic comparison with other feature detectors and descriptors, the authors show that this combination has the lowest error rate for 3D aerial scene matching.

In the paper, the authors propose an embedded algorithm for feature-based processing that is optimized for multi-modal sensor alignment (e.g. LWIR to visible), 3D aerial scene matching, and robust object tracking. The FPGA implementation (D-HCD, R-HOG, matching) combines the distributed feature detector (D-HCD) and the rotation-invariant feature descriptor (R-HOG), followed by exhaustive-search matching, to better process aerial imagery.

Feature-based analysis using the proposed distributed feature detector and rotation-invariant descriptors for onboard UAV processing

Key contributions of the paper:
• A feature-based vision algorithm with sparse-feature and rotation-invariance characteristics (D-HCD and R-HOG).
• An FPGA implementation on the Zynq platform, with a service-based API for flexibility in using the hardware accelerators: Harris Corner Detector (HCD), Histogram of Oriented Gradients (HOG), and full-search symmetric descriptor matching.
• A small embedded sensor-processor board with integrated sensors and cameras for onboard processing on aerial drones.
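The full-search symmetric descriptor matching named in the list above can be illustrated with a short software sketch. This is not the paper's hardware design. It assumes that "symmetric" means a mutual best match (feature i in image A picks j in image B, and j picks i back) and that descriptors are compared by squared Euclidean distance. The function names are hypothetical.

```python
def squared_distance(a, b):
    # Assumed metric: squared Euclidean distance between two descriptor vectors.
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest(query, candidates):
    # Full (exhaustive) search: compare the query against every candidate.
    return min(range(len(candidates)),
               key=lambda j: squared_distance(query, candidates[j]))


def symmetric_full_search_match(desc_a, desc_b):
    """Return index pairs (i, j) that are mutual nearest neighbours."""
    if not desc_a or not desc_b:
        return []
    best_ab = [nearest(d, desc_b) for d in desc_a]  # A -> B
    best_ba = [nearest(d, desc_a) for d in desc_b]  # B -> A
    # Keep a pair only when both directions agree (assumed meaning of "symmetric").
    return [(i, j) for i, j in enumerate(best_ab) if best_ba[j] == i]


if __name__ == "__main__":
    a = [[0, 0], [10, 10], [5, 5]]
    b = [[9, 9], [0, 1]]
    print(symmetric_full_search_match(a, b))
```

The cost grows with the product of the two feature counts, which is one reason this step is a natural candidate for hardware acceleration.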

The authors consider a number of feature detectors and descriptors. Feature detectors (e.g. HCD, STAR, FAST, SIFT, SURF, ORB, BRISK, MSER, GFTT) are designed to select localized salient features from 2D images. Feature descriptors (e.g. HOG, BRIEF, FREAK) describe the selected features as representative feature vectors. D-HCD first finds all features with a strength above a threshold, then divides the image into tiles (e.g. 10x10) and selects the best N features in each tile, so the total is at most N x (number of tiles), which is kept below the maximum number of features. The best N features in each tile may not be the most salient features in the entire image, but they are well distributed and strong enough to represent the spatial information of the scene.
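The tile-based selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The function name, the (x, y, strength) tuple layout, and the default parameter values are assumptions made for the example.

```python
from collections import defaultdict


def select_distributed_features(features, image_w, image_h,
                                tiles_x=10, tiles_y=10,
                                n_per_tile=2, threshold=0.0):
    """Keep the strongest n_per_tile features in each tile.

    features: list of (x, y, strength) tuples (hypothetical layout).
    """
    tile_w = image_w / tiles_x
    tile_h = image_h / tiles_y
    buckets = defaultdict(list)
    for x, y, strength in features:
        if strength <= threshold:
            continue  # step 1: drop features that are not above the threshold
        # step 2: assign the feature to a tile (clamp points on the far edge)
        tx = min(int(x / tile_w), tiles_x - 1)
        ty = min(int(y / tile_h), tiles_y - 1)
        buckets[(tx, ty)].append((x, y, strength))
    selected = []
    for key in sorted(buckets):
        # step 3: keep the best N features of this tile
        strongest = sorted(buckets[key], key=lambda f: f[2], reverse=True)
        selected.extend(strongest[:n_per_tile])
    return selected


if __name__ == "__main__":
    feats = [(5, 5, 0.9), (6, 6, 0.8), (7, 7, 0.7), (95, 95, 0.2), (50, 50, 0.05)]
    print(select_distributed_features(feats, 100, 100, 10, 10, 2, 0.1))
```

In the example, the weak corner at (95, 95) survives because it is the only candidate in its tile, while the stronger corner at (7, 7) is dropped because its tile is already full. That trade-off is what gives the detector its spatial spread.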

Feature Processing Pipeline. Shared memory facilitates acceleration between software and hardware components.

The diagram shows the sequence of processing steps and the use of shared memory between the FPGA and ARM processing. The video input is processed in-line for image enhancement, filtering (or multi-resolution image generation), and HCD. The HCD also computes an angle/magnitude image that is used by the feature descriptor (HOG). The HOG and MATCH functions each perform a sequence of accelerations driven by a list of features (key points) to process. These two types of acceleration require different data buffering/caching mechanisms to keep computation in the FPGA efficient. In this implementation, the ARM is also used for the HCD tile sorting and match-list generation.
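To make the angle/magnitude image concrete, here is a small software sketch of how such an image can be derived from pixel gradients. It assumes simple central differences and a grayscale image stored as a list of rows. The source does not specify the gradient filter used in the hardware, so treat the function and its details as illustrative only.

```python
import math


def gradient_magnitude_angle(image):
    """Return (magnitude, angle) images for a 2D grayscale image.

    Border pixels are left at zero. Angles are in radians, in (-pi, pi].
    Central differences are an assumption, not the paper's filter.
    """
    h = len(image)
    w = len(image[0])
    magnitude = [[0.0] * w for _ in range(h)]
    angle = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0  # horizontal gradient
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0  # vertical gradient
            magnitude[y][x] = math.hypot(gx, gy)
            angle[y][x] = math.atan2(gy, gx)
    return magnitude, angle


if __name__ == "__main__":
    ramp = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]  # brightness rises left to right
    mag, ang = gradient_magnitude_angle(ramp)
    print(mag[1][1], ang[1][1])
```

A HOG-style descriptor then accumulates these per-pixel angles into orientation bins weighted by magnitude, which is why computing the image once in the HCD stage and sharing it avoids repeated work.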

The FPGA-implemented algorithm, which uses distributed sampling of feature points and rotation-invariant descriptors, compared favorably against a set of common feature detector/descriptor algorithms for registering a large sequence of aerial imagery. The hardware implementation achieves a 15x speedup and a 5x reduction in latency over a quad-core CPU.

References:

Gooitzen van der Wal, David Zhang, Indu Kandaswamy, James Markowitz, Kevin Kaighn, Joe Zhang, Sek Chai, SRI International, Princeton, NJ
