Design of a high performance and efficient architecture for real-time embedded video processing systems

 

 Real-time systems

Feature demo

 

Video enhancement of color video stream (uniform darkness)

·         Xilinx's multimedia board, Virtex II XC2V2000 FPGA

·         Xilinx's Integrated Software Environment (ISE)

·         VHDL language

·         Systolic pipelined architecture

·         Utilization of quadrant symmetry property in 2D convolution

·         Efficient design of logarithm module

·         Utilization of fast Zero-Bus-Turnaround (ZBT) RAM

·         MicroBlaze soft processor


 

Video enhancement of color video stream (non-uniform darkness)

·         Xilinx's multimedia board, Virtex II XC2V2000 FPGA

·         Xilinx's Integrated Software Environment (ISE)

·         VHDL language

·         Systolic pipelined architecture

·         Utilization of quadrant symmetry property in 2D convolution

·         Efficient design of logarithm module

·         Utilization of fast Zero-Bus-Turnaround (ZBT) RAM

·         MicroBlaze soft processor


 

Efficient design for skin segmentation module as a part of face recognition system

·         Altera's DE2 Board, Cyclone II FPGA

·         Altera's Quartus II development software

·         Verilog language

·         Parallel pipelined architecture

·         Efficient color space transformation

·         Utilization of fast Zero-Bus-Turnaround (ZBT) RAM

·         Nios II soft processor

 


 


 

RESEARCH PROJECTS

1.      Low power design methodologies for video processing systems (2006-2007): Research focuses on a neighborhood-dependency methodology for reducing dynamic power consumption. The approach exploits repeated or non-significant data values in a video stream to stop unnecessary switching activity in computational and logic blocks. Image-based processing techniques are designed and implemented to exploit the characteristics of the pixels in the surrounding neighborhood, giving more efficient computation with reduced dynamic power consumption.

2.      Real-time skin segmentation (2006-2007): Implementation of a standard skin segmentation method based on piecewise and linear functions on an Altera Cyclone II device. The implementation employs estimation methods for fast computation of logarithm and division operations. The parallel, pipelined prototype processes the input video stream on the fly, with an output rate equal to the input rate.

3.      FPGA-based real-time face detection (2007): A modular approach to the design of an FPGA-based face detection system for real-time video processing applications is developed and implemented. The modular approach exploits the inherent parallelism in the application to increase system throughput.

4.      Driver’s assistant for visibility improvement (2004-2006): Design and development of a prototype system for the Driver’s Assistant for Visibility Improvement (DAVI). Research focuses on optimizing non-linear image enhancement for hardware implementation, developing an estimation technique for fast computation of logarithms, and efficiently integrating hardware modules to perform real-time processing.

5.      FPGA-based real-time video enhancement (2004-2006): Design and implementation of various nonlinear image enhancement algorithms on a Xilinx Virtex II FPGA. The designs use estimation methods for computing logarithm and division operations. A systolic design for 2D convolution exploits the quadrant symmetry of the Gaussian kernel to reduce redundant computations. The pipelined system is capable of producing an output rate equal to the input rate.

6.      Multi-sensor video fusion (2006): Video streams from a color CCD camera and a long-wave thermal camera are fused to obtain an optimum scene for surveillance and security applications. The procedure includes video enhancement of the color stream, video registration, and video fusion based on the wavelet transform.

7.      Real-time 2D convolution (2004-2005): Design and implementation of quadrant symmetry 2D convolution modules for real-time spatial filtering in video applications. Folding techniques are used to reduce the number of multiplications and additions in the module by as much as 75%.

8.      Real-time face recognition (2004): Design and implementation of a parallel architecture for a face recognition system based on the modular Principal Component Analysis (modular PCA) technique. Each processing element computes a weight vector from a face image region and pre-computed eigenvectors. The processing element is itself parallelized: each path combines one eigenvector with the face image region to compute one element of the weight vector. The architecture is able to recognize a face image from a database of 1000 face images in 11 ms.

9.      Real-time distortion correction (2001-2003): Design and implementation of a real-time system capable of correcting barrel distortion in wide-angle cameras. Research focuses on incorporating the CORDIC algorithm into the non-linear expansion of the distorted image. The fully pipelined system is capable of processing 120 megapixels per second.

10.  Neural network based pattern association (2002): Design and implementation of a systolic array for a block-based Hopfield neural network for efficient pattern association. The design is based on recasting the energy equation of the Hopfield neural network in a systolic (or modular) form.
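
The folding idea in project 7 can be illustrated with a small software model. This is a minimal sketch in Python, assuming a kernel whose coefficients satisfy h(i, j) = h(-i, j) = h(i, -j) = h(-i, -j). The function names are hypothetical and the code is not the hardware design described above. It only shows how adding the symmetric pixels first lets one multiplication serve up to four kernel positions.

```python
def expand_quadrant(quadrant):
    """Build the full (2K+1) x (2K+1) kernel from its (K+1) x (K+1) quadrant."""
    k = len(quadrant) - 1
    size = 2 * k + 1
    return [[quadrant[abs(r - k)][abs(c - k)] for c in range(size)]
            for r in range(size)]


def conv_direct(window, kernel):
    """Plain sum of products; returns (value, number of multiplications)."""
    total, mults = 0, 0
    for r in range(len(kernel)):
        for c in range(len(kernel)):
            total += window[r][c] * kernel[r][c]
            mults += 1
    return total, mults


def conv_folded(window, quadrant):
    """Folded version: add the symmetric pixels, then multiply once per coefficient."""
    k = len(quadrant) - 1
    total, mults = 0, 0
    for i in range(k + 1):
        for j in range(k + 1):
            # Sets collapse duplicates on the center row/column (i == 0 or j == 0).
            rows = {k - i, k + i}
            cols = {k - j, k + j}
            folded = sum(window[r][c] for r in rows for c in cols)
            total += folded * quadrant[i][j]
            mults += 1
    return total, mults


# Illustrative 5x5 example (values are arbitrary test data).
quadrant = [[4, 2, 1],
            [2, 1, 0],
            [1, 0, 0]]
window = [[r * 5 + c for c in range(5)] for r in range(5)]
direct_value, direct_mults = conv_direct(window, expand_quadrant(quadrant))
folded_value, folded_mults = conv_folded(window, quadrant)
```

For a (2K+1) x (2K+1) kernel the multiplication count drops from (2K+1)^2 to (K+1)^2, so a 5x5 kernel needs 9 multiplications instead of 25, and the saving approaches 75% as K grows.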

 

 

HARDWARE DESIGN CAPABILITIES
 

Low Power Design Approach

·        Neighborhood dependency considerations for low-power design

·        Switching activity control (logic level) in low-power digital design

·        Data dependency considerations in low power design of discrete cosine transform architecture

·        Data dependency considerations in low power design of 2D convolution architecture for video processing systems
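
The switching-activity idea behind this list can be modeled roughly in software. The sketch below is an assumption-laden illustration in Python, not the circuit-level technique itself. A hypothetical `process_stream` function re-evaluates its compute block only when the incoming pixel differs from the previous one, and counts how often the block is activated.

```python
def process_stream(pixels, compute):
    """Model of input gating: reuse the previous result when the input repeats.

    Returns (outputs, activations), where activations counts how many times
    the compute block was actually evaluated.
    """
    unset = object()          # sentinel so that any pixel value is allowed
    last_in = unset
    last_out = None
    outputs = []
    activations = 0
    for p in pixels:
        if last_in is unset or p != last_in:
            last_out = compute(p)   # input changed: the block must switch
            last_in = p
            activations += 1
        outputs.append(last_out)    # repeated input: held result is reused
    return outputs, activations


# Illustrative stream with runs of repeated values (arbitrary test data).
stream = [10, 10, 10, 12, 12, 0, 0, 0, 0, 7]
results, activations = process_stream(stream, lambda p: p * p)
```

In this example the block is evaluated 4 times for 10 pixels. In hardware the comparable saving would come from holding the operands of a block steady so that its internal nodes do not toggle.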

 

Design Modules

·        Design of log and anti-log computational modules

·        Quadrant symmetry design approach for 2-D convolution module

·        Design of an efficient multiplier-less architecture for multi-dimensional convolution

·        Hardware module for normalized cross-correlation computation

·        Multilane architecture for modular-PCA implementation

·        Design of custom logic for video interface modules

·        Design of CORDIC based trigonometric functional modules

·        Unidirectional CORDIC modules for asynchronous applications

·        VLSI architecture for pre-computation of rotation bits in unidirectional Flat-CORDIC

·        VLSI efficient discrete time cellular neural network processor
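
The estimation method used in the log and anti-log modules is not specified here. One well-known hardware-friendly option is Mitchell's approximation, which takes the position of the leading one bit as the integer part of log2(x) and uses the remaining bits directly as the fraction. The following Python sketch models that technique in fixed point. `FRAC_BITS` and the function names are assumptions made for illustration.

```python
FRAC_BITS = 8  # assumed number of fractional bits in the fixed-point result


def log2_mitchell(x):
    """Approximate log2(x) for a positive integer x, scaled by 2**FRAC_BITS."""
    if x <= 0:
        raise ValueError("x must be a positive integer")
    k = x.bit_length() - 1                 # position of the leading one bit
    mantissa = x - (1 << k)                # bits below the leading one
    frac = (mantissa << FRAC_BITS) >> k    # mantissa / 2**k as a fraction
    return (k << FRAC_BITS) | frac


def antilog2_mitchell(y):
    """Approximate 2**(y / 2**FRAC_BITS) for a non-negative fixed-point y."""
    k = y >> FRAC_BITS
    frac = y & ((1 << FRAC_BITS) - 1)
    # 2**(k + f) is approximated by 2**k * (1 + f).
    return (((1 << FRAC_BITS) + frac) << k) >> FRAC_BITS
```

With these settings `log2_mitchell(12)` returns 896, which is 3.5 in fixed point against a true value of about 3.585. Only a leading-one detector, shifts, and an addition are needed, which is why approximations of this kind suit FPGA pipelines.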

 

Application Specific Architectures

·        Systolic implementation of feedforward neural networks for pattern recognition

·        Parallelo-pipelined design approach for nonlinear enhancement of color images

·        Pipelined architecture for distortion correction in wide-angle camera images

·        High performance architecture for multi-sensor image fusion

·        High storage capacity architecture for pattern recognition using an array of Hopfield neural networks

·        Hardware implementation of Fuzzy-ART based image compression

·        Vector processor based architecture for gradient and normal computation in real-time volume rendering

·        A parallel VLSI architecture for real-time segmentation of images with complex background environment

·        A generalized cellular neural network architecture for high storage capacity pattern recognition

·        A multilevel architecture for FPGA based implementation of feed-forward neural network for pattern recognition

·        A modular architecture for a recurrent neural network for character recognition

·        Systolic array implementation of block based Hopfield neural network for pattern association

·        System level design of real time face recognition architecture based on composite PCA

·        A fully pipelined architecture for barrel-distortion correction based on back mapping and linear interpolation

·        A flexible and efficient hardware architecture for real time face recognition based on eigenface approach

·        A real-time parallel system for video skin segmentation

·        A modular approach for a face detection system for real-time applications
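
Back mapping with linear interpolation, as named in the barrel-distortion entry above, can be sketched as follows. This Python model assumes a one-parameter radial model, r_d = r_u (1 + k r_u^2), which is a common textbook choice and not necessarily the one used in the listed architecture. `bilinear` and `correct_barrel` are hypothetical names.

```python
def bilinear(img, x, y):
    """Sample img (a list of rows) at real-valued (x, y), clamping to the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def correct_barrel(img, k):
    """For each output pixel, look up its source position in the distorted image.

    k is the assumed radial coefficient; k < 0 models barrel distortion and
    k = 0 leaves the image unchanged.
    """
    h, w = len(img), len(img[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    out = []
    for v in range(h):
        row = []
        for u in range(w):
            dx, dy = u - cx, v - cy
            scale = 1.0 + k * (dx * dx + dy * dy)   # r_d / r_u under the assumed model
            row.append(bilinear(img, cx + dx * scale, cy + dy * scale))
        out.append(row)
    return out
```

Iterating over output pixels rather than input pixels means every output location receives exactly one value, so the corrected image has no holes. This is the usual reason for choosing back mapping in a streaming pipeline.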