Research

Research Topics

[Vijayan K. Asari]

Wide Area Surveillance

Visibility Improvement in Wide Area Motion Imagery (WAMI): Shadow Illumination and Single Image Super Resolution

Image Registration and Moving Object Detection in Wide Area Motion Imagery (WAMI)

Tracking of Vehicles and Pedestrians in Wide Area Motion Imagery (WAMI)

Object Detection and Classification on Wide Area Motion Imagery (WAMI)

Scene Analysis and Understanding

3D Reconstruction from Single Moving Camera

Video Stabilization

Building Change Detection in Satellite Imagery

Pipeline Right-of-Way Monitoring

Perception Beyond Visible Spectrum

LiDAR Data Analysis

Hyperspectral Data Processing

Explosive Detection

SAR Image Analysis

Biometrics for Human Identification

Multi-View Face Detection

Pose and Lighting Invariant Real-Time Face Recognition

Face Sketch Recognition

Iris Recognition

Human Activity Recognition

Phase Space for Face Pose Estimation

Human Action Recognition

Automatic 3D Facial Expression Recognition

Face Tracking

Video Pre-Processing

Nonlinear Techniques for Image/Video Enhancement

Single Image Super Resolution

Visibility Improvement of Hazy/Foggy Images

Phase Congruency based Technique for Removal of Rain from Videos

Brain Machine Interface and Emotion Recognition

Emotion Recognition by Spatiotemporal Analysis of EEG Signals

EEG: Localization of Spatial Disorientation (Source Localization)

EEG Signal Analysis for Brain-Machine Interface

Eye Artifact Removal in EEG Data

Robotics and Vision Guided Navigation

Hexacopter

RAIDER: Vision Guided Navigation

7DOF Robotic Arm Control

Segway: Vision Guided Navigation

Machine Learning Approach for Object Region Segmentation

Human Activity Segmentation

Touchscreen Segmentation

Neural Networks and Learning Algorithms

Recurrent network as a nonlinear line attractor for pattern association

A new learning algorithm for pattern association in a recurrent neural network is developed. Unlike the conventional model, in which memory is stored at an attractive fixed point at a discrete location in state space, the new learning algorithm represents memory as a line of attraction. The region of convergence around the line of attraction is defined by the statistical characteristics of the training data. The performance of the learning algorithm is compared with a Bayesian model in experiments on skin color segmentation. (For more information VIPS)
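
As a hedged illustration of the idea, a line of attraction can be sketched in a few lines of NumPy: the principal direction of the training data defines the line, and recall attracts a state onto it. This is a simplified stand-in for the recurrent learning rule described above, not the published algorithm:

```python
import numpy as np

def learn_line_attractor(X):
    """Fit a 1-D line of attraction to training vectors X (n_samples, dim)
    via the mean and the principal component -- a simplified stand-in
    for the recurrent learning rule."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[0]  # point on the line, unit direction

def recall(x, mu, d, steps=10):
    """Iteratively attract state x toward the learned line."""
    for _ in range(steps):
        x = mu + d * np.dot(d, x - mu)  # project onto the line
    return x
```

Any initial state converges onto the line; in a fuller model, the convergence region would be bounded by the training statistics.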

Modular network architecture and distance based training algorithm for pattern association

A two-dimensional modular architecture for the Hopfield neural network that improves storage capacity and reduces structural complexity is developed. The new approach divides a 2-D network into sub-modules, with each module functioning independently as a sub-network in conjunction with the inputs from neighboring modules. A divide-and-conquer approach is utilized, which solves a complex computational task by dividing it into simpler subtasks and then combining their individual solutions. The performance of this technique is evaluated using various character patterns. It is observed that the network exhibits faster convergence and successfully reproduces learned patterns from noisy data. ...more...

Multiple-valued neural network for multi-valued pattern recognition

The binary model of artificial neurons does not fully capture the complexity of biological neurons, since real neurons handle continuous data. However, analog neurons implemented in an integrated chip require high-precision resistors and are easily affected by electrical noise. Because of the problems associated with both binary and analog neurons, research on multilevel neural networks for modeling biological neurons has attracted great attention. A technique for training multiple-valued neural networks, based on the back-propagation learning algorithm and employing a multilevel threshold function, is developed. The optimum threshold width of the multilevel function and the range of the learning parameter to be chosen for convergence are the important parameters to be derived. Trials performed on a benchmark problem demonstrate the convergence of the network within the specified range of parameters. ...more...
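
One way to picture a multilevel threshold function is as a differentiable staircase, a sum of shifted sigmoids, so the unit settles near one of several discrete output values yet remains trainable by back-propagation. The sketch below uses illustrative values for the threshold width and gain, not the derived optima:

```python
import numpy as np

def multilevel_sigmoid(x, levels=4, width=1.0, gain=10.0):
    """Differentiable multilevel threshold: a sum of shifted sigmoids
    with plateaus near levels/(levels-1) spaced output values. `width`
    sets the spacing of the thresholds, `gain` their steepness; both
    are illustrative choices, not the optimum values from the text."""
    steps = sum(1.0 / (1.0 + np.exp(-gain * (x - k * width)))
                for k in range(1, levels))
    return steps / (levels - 1)  # outputs plateau near 0, 1/3, 2/3, 1
```

With a large gain the plateaus approach the hard multilevel threshold, while a moderate gain keeps gradients usable during training.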

Neural network based reconstruction of images from partial and noisy data

The consequence of reducing the impact of synaptic weights from neurons farther away from the neuron under consideration, on a modular two-dimensional Hopfield network using the Hebbian learning rule, is examined for image processing applications. A generalized modular architecture is developed by defining each module as a group of neighboring neurons, with all modules communicating with each other. A spatially decaying distance factor is introduced into the Hebbian rule to reduce the effect of neurons from farther modules. A biologically inspired visual perception concept is adopted for defining the variation of the distance factor. The performance of the new technique is evaluated in several experiments on character images, and it is observed that the new method increases the learning ability and convergence rate of the network. The nature of the distance factor allows the removal of synaptic weights far from a particular neuron, which reduces the complexity of the network in both software and hardware implementations. ...more...
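
The spatially decaying distance factor can be sketched as a Gaussian attenuation of the Hebbian outer-product weights of a small grid-structured network. The Gaussian is a simplified illustration; the visual-perception-inspired factor in the actual work may differ:

```python
import numpy as np

def hebbian_distance_weights(patterns, coords, sigma=2.0):
    """Hebbian outer-product learning with a spatially decaying
    distance factor: connections between neurons far apart on the
    2-D grid are attenuated toward zero (and could be pruned).
    patterns: (n_patterns, n_neurons) of +/-1; coords: (n_neurons, 2)."""
    W = patterns.T @ patterns / patterns.shape[0]
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    W *= np.exp(-(dist ** 2) / (2 * sigma ** 2))  # Gaussian distance factor
    np.fill_diagonal(W, 0.0)                      # no self-connections
    return W

def recall(W, x, steps=5):
    """Standard Hopfield-style recall by iterated thresholding."""
    for _ in range(steps):
        x = np.sign(W @ x)
        x[x == 0] = 1
    return x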

Adaptive technique for image compression using a self-organizing neural network

A novel image compression technique employing the self-organized clustering capability of the Fuzzy-ART neural network and 2-D run-length encoding is developed. Initially, the image is divided into smaller blocks, and the vectors representing the pixels in the blocks are applied to the Fuzzy-ART neural network for classification. The image is then represented by block codes, consisting of the sequence of class indices, and a codebook, consisting of each class index and its respective grey levels. Further compression is achieved by 2-D run-length encoding, which exploits repetitions of the class index in the block codes in the x and y directions. By controlling the vigilance parameter of Fuzzy-ART, reasonable compression of the image can be obtained without sacrificing image quality. Experimental results show that the new method can be used in image communication systems where a large compression ratio is required. With the introduction of a new class of Fuzzy-ART network, namely Force Class Fuzzy-ART, hardware implementation of the image compression module is made feasible. This network constrains the maximum number of output classes by forcing new vectors into the closest existing category. ...more...
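
The block-code and run-length stages can be sketched as follows, with a simple nearest-vector quantizer standing in for the Fuzzy-ART classifier; the codebook here is hand-picked, not learned:

```python
import numpy as np

def block_codes(img, codebook, bs=2):
    """Quantize bs x bs blocks of img to the nearest codebook vector
    (a stand-in for Fuzzy-ART clustering); returns a 2-D grid of
    class indices."""
    h, w = img.shape
    rows = []
    for r in range(0, h, bs):
        row = []
        for c in range(0, w, bs):
            v = img[r:r + bs, c:c + bs].ravel()
            row.append(int(np.argmin(((codebook - v) ** 2).sum(axis=1))))
        rows.append(row)
    return rows

def rle_encode(row):
    """Run-length encode one row of class indices as (index, run) pairs."""
    out = []
    for v in row:
        if out and out[-1][0] == v:
            out[-1][1] += 1
        else:
            out.append([v, 1])
    return [tuple(p) for p in out]

def rle_decode(pairs):
    """Invert rle_encode."""
    return [v for v, n in pairs for _ in range(n)]
```

The 2-D variant described above would apply the same run-length idea along both the x and y directions of the block-code grid.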

Image Enhancement, Color Constancy and Image Fusion

Visibility improvement in video sequences captured in non-uniform and extremely low lighting environment

Image enhancement is a preprocessing technique to improve visibility and highlight the features and details of an image for pattern recognition under varying lighting conditions. Three nonlinear image enhancement algorithms, based on the sensory attributes of the human visual cortex, are developed: LDNE (luma-dependent nonlinear enhancement), INDANE (integrated neighborhood dependent algorithm for nonlinear enhancement) and AINDANE (adaptive INDANE, which adapts to statistical features of the image such as histogram distribution, standard deviation, and frequency domain information). It is observed that the LDNE, INDANE and AINDANE algorithms are highly effective for preprocessing images, so that object regions in an image captured under extremely low or non-uniform lighting conditions are brightened and made more distinct. (For more information VIPS)
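
As a rough sketch of the adaptive idea (not the published LDNE/INDANE/AINDANE transfer functions), a gamma curve whose strength is chosen from the image's global mean brightens dark images while leaving well-exposed ones nearly untouched:

```python
import numpy as np

def adaptive_enhance(img):
    """Brighten dark regions with an intensity transfer whose strength
    adapts to the image's global statistics. This is a simplified
    illustration: the adaptation rule (mean-driven gamma) and its
    constants are assumptions, not the published algorithms.
    img: float array with values in [0, 1]."""
    gamma = np.clip(0.4 + img.mean(), 0.4, 1.0)  # darker image -> stronger lift
    return img ** gamma
```

The real algorithms additionally use neighborhood-dependent contrast restoration, which this global curve omits.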

Unsupervised color constancy for face recognition

Illumination invariant face recognition is achieved by mapping human skin colors in the face region, captured under different illuminations, to a canonical illumination and then performing face recognition on the transformed face region. Color constancy is obtained by measuring the mapping of colors in the same skin patches under one lighting condition to the colors under the canonical illumination. The mapping is done using color flows. A color flow is a vector field in color space in which each vector starts at one color and ends at another. Since a vector field can be considered an element of a vector space, the ensemble of color flows can be studied using principal component analysis. The color constancy scheme was tested in a laboratory environment, where the canonical light was fluorescent light; the other lighting conditions were obtained at various intervals and with various combinations of fluorescent lights and outdoor lighting. Once the skin color flows have been learned from the different lighting conditions to the canonical lighting condition, the skin colors in a new image can be flowed to the canonical skin colors. The average intensity of the image under a lighting condition is used to determine which skin color flow should be applied to the new image. This is performed on patches of the image to deal with mixtures of lighting conditions. (For more information VIPS)
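
A minimal sketch of the flow-selection step, with each condition's flow reduced to its mean displacement in color space; the PCA treatment of the flow ensemble is omitted for brevity:

```python
import numpy as np

def learn_flows(conditions):
    """conditions: {name: (src_colors, canon_colors)} with matching
    (n, 3) arrays of colors observed under that lighting and under the
    canonical lighting. Stores each condition's mean image intensity
    and its mean color flow (canonical - source). Reducing a flow to
    its mean vector is a simplification of the vector-field model."""
    model = {}
    for name, (src, canon) in conditions.items():
        model[name] = (src.mean(), (canon - src).mean(axis=0))
    return model

def to_canonical(colors, model):
    """Pick the flow whose recorded training intensity best matches
    this image's mean intensity, then flow its colors toward the
    canonical illumination."""
    name = min(model, key=lambda n: abs(model[n][0] - colors.mean()))
    return colors + model[name][1]
```

Applying this per image patch, rather than globally, is what handles mixtures of lighting conditions.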

Driver’s assistant for visibility improvement

Poor visibility on the road has been the cause of many car accidents in the United States, and a single accident can involve several vehicles. In a low visibility environment caused by snow, fog, rain or darkness, it may be too late to respond by the time drivers identify what is ahead. Despite the many safety features available in automobiles, drivers face a great risk of accidents at night. For instance, an experienced driver generally looks far ahead (100 to 500 meters) to anticipate situations on the road; at night, however, the “look ahead” distance is limited to approximately 80 meters. Drivers therefore need a device that can “see through” the darkness, rain or fog present while driving at night. Such a device can be designed around a specific and proven image enhancement technique that brightens, de-rains or de-fogs images captured in such low visibility environments. The enhanced images can then be used to assist motorists in operating their vehicles under those low visibility conditions. A driver who can detect road hazards, pedestrians, and animals well beyond the visible range of the vehicle headlights gains valuable reaction time. It is envisaged that this system would provide a clear and complete view and would eliminate even the critical “blind spot” during driving. (For more information VIPS)

Multi-sensor imaging and image fusion

Thermal and infrared imaging cameras can see objects in total darkness. An infrared imaging camera installed on a head-mount could be used to see objects at night at a long distance. The images captured by the infrared imaging camera are in gray-scale mode. By registering the infrared images with the color images captured by a forward-looking wide-viewing-angle video camera, objects could be seen more clearly at night. Three additional video cameras, looking backwards and to the left and right, provide images from the rear and both sides of the person, helping him/her to see objects clearly on all sides. A four-frame display panel, with a main front panel for the frontal image, one top panel for the rear view, and two side panels for the left and right side views, displays these four images. (For more information VIPS)

Face Detection, Recognition, Tracking and Face Expression and Emotion Analysis

Face location and tracking system in complex background environment

Face detection is the first step in human face recognition and video surveillance. Fast and automatic detection of faces in video sequences is an important task in security applications, where the number, location, size and orientation of human faces may vary from frame to frame. Face detection and tracking algorithms for color images, robust to varying lighting conditions and complex background environments, are developed. The goal of this research is to locate human faces in images of natural environments with unconstrained lighting, varying face poses, and varying face geometries and skin colors. The new method detects skin regions over the entire image, and then searches for face candidates based on their spatial arrangements. Eye, mouth, and face boundary maps constructed during the search process verify each face candidate. A set of normalized parameters representing statistical and geometrical features extracted from the segmented regions classifies them as face or non-face regions based on a new classification strategy. It is envisaged that the new method leads to successful detection of faces over a wide range of facial variations in color, position, scale, rotation, pose, and expression. ...more...

Pose and lighting invariant real-time face recognition system

An efficient face recognition algorithm, based on a modular PCA approach, with an improved recognition rate for large variations in pose, lighting direction and facial expression is developed. In this technique the face images are divided into smaller sub-images and the PCA approach is applied to each of these sub-images. The training phase begins by extracting the eigenvectors corresponding to the largest eigenvalues of a covariance matrix constructed from the training image database. These eigenvectors are used to create a generalized face feature vector, which can classify incoming face images successfully. Experimental results demonstrate that the modular PCA method has a higher recognition rate than the traditional PCA method in tests conducted on the UMIST and Yale databases. ...more...
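
The modular PCA idea, a separate eigenspace per sub-image position with the projections concatenated into one feature vector, can be sketched as follows. This is a simplified illustration on random data, not the published implementation:

```python
import numpy as np

def fit_modular_pca(images, grid=2, k=2):
    """Divide each image into grid x grid sub-images and learn a
    separate k-component PCA basis per sub-image position.
    images: (n, h, w) array of training faces."""
    n, h, w = images.shape
    bh, bw = h // grid, w // grid
    bases = {}
    for i in range(grid):
        for j in range(grid):
            X = images[:, i * bh:(i + 1) * bh, j * bw:(j + 1) * bw].reshape(n, -1)
            mu = X.mean(axis=0)
            _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
            bases[(i, j)] = (mu, vt[:k])  # mean and top-k eigenvectors
    return bases, (bh, bw)

def project(image, bases, block):
    """Concatenate the per-sub-image PCA projections into one feature."""
    bh, bw = block
    feats = []
    for (i, j), (mu, v) in sorted(bases.items()):
        x = image[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw].ravel()
        feats.append(v @ (x - mu))
    return np.concatenate(feats)
```

Recognition then reduces to a nearest-neighbor search in the concatenated feature space, as with standard eigenfaces.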

Human face detection, tracking and recognition in video sequences

This project deals with a human surveillance system with integrated face understanding technologies to recognize personal identities in real time. The focus is on automated visual identification of face image sequences detected and tracked by a closed-loop mechanism using conventional surveillance cameras. The system performs face detection and tracking robustly in real time under a wide range of lighting and scale variations by combining motion and facial appearance models. The faces in a scene are detected using a machine learning algorithm, and the detected faces are tracked continuously, which aids detection in subsequent frames. The surveillance system then performs real-time face recognition that is invariant to large changes in lighting, facial expression, and pose. (For more information VIPS)

Pose and illumination invariant 3D face recognition

The need for an accurate, easily trainable recognition system has become increasingly pressing. Current systems are fairly accurate under constrained scenarios, but extrinsic imaging parameters such as pose, illumination, and facial expression still cause much difficulty for correct recognition. The clearest trend noted by many vendors in the field of face recognition is the emergence of 3D technology. The goal of any face recognition algorithm is to separate the characteristics of a face, which are determined by the intrinsic shape and texture of the facial surface, from the random conditions of image generation. This can be achieved by using 3D face modeling techniques. A deformable 3D face modeling approach, able to produce accurate and realistic 3D shape models from 2D images for robust face recognition, is considered for further research and for raising the performance of the existing recognition system toward 100% accuracy. The face recognition system under development is invariant to both illumination and pose. (For more information VIPS)

Network enabled feature search for person identification from databases in servers distributed around the world

Existing face recognition techniques, which typically require a large set of training images of each individual face to create a feature database, are not feasible for many applications where multiple training images in various poses and illuminations are unavailable. In fact, the most typical scenario requires robust recognition when only one or a very small number of training images are available, acquired under significantly different lighting and pose conditions. Motivated by human visual perception, which remains robust despite these difficulties, a feature-based face recognition system, largely independent of pose and lighting, is being explored and developed. A multidimensional feature matrix representing multiple views of the individual will be created using face regions detected from images captured by three surveillance cameras placed at orthogonal locations. A similarity search between this feature matrix and feature vectors in a distributed database will identify the vector that represents a face in a particular pose or illumination. To accommodate worldwide searches across databases distributed on different servers in a variety of organizations, and across offices within an organization, new network routing techniques will be developed and integrated with the search. A typical scenario is the verification of individuals at the immigration gate of an international airport. By searching all connected servers around the world with the feature matrix generated from the three surveillance camera images, the details of an individual arriving on an international flight could appear on the immigration officer’s computer before the individual reaches the officer’s desk. If the individual has been convicted anywhere in the world, those details could be made available to the officer’s computer by the networked feature search technique.
If no data appears on the computer in a particular situation, it could be treated as a case where the individual is in disguise, and he/she could be checked more intensively by another group of experts. (For more information IDHS)

Face expression analysis for automated mind-reading

The face expression analysis and classification system classifies facial expressions into six classes: sadness, happiness, anger, disgust, surprise, and neutral. The mind-reading system recognizes a number of head displays, or affective cues, namely head nod, head shake, head tilt left and head tilt right. An appropriate combination of these affective cues with the facial expressions makes it possible to determine the mental state. The mental states determined by the new system are: agreeing, which encompasses the states that communicate agreeing, granting consent or reaching a mutual understanding about something; concentrating, an absorbed, concentrating and vigilant state of mind; disagreeing, which communicates that one has a differing opinion on or is disputing something, and includes contradictory, disapproving, discouraging, and disinclined; interested, indicating that one’s attention is directed to an object or class of objects; thinking, which communicates that one is reasoning about or reflecting on some object, including brooding, choosing, fantasizing, judging, thinking and thoughtful; and unsure, which communicates a lack of confidence about something and is associated with a feeling of not knowing. (For more information VIPS)

Brain signal analysis for identification of emotional states of mind

In many homeland security situations, it would be helpful to have objective information about a person’s emotional state to ascertain the person’s thoughts and intentions. It would be beneficial if the intention of an individual to hijack an airplane or carry out a terrorist attack could be identified, without the person’s knowledge, before that individual enters the operation spot. This leads to the concept of a system that would obtain information pertinent to a person’s emotional state and process it to identify the signal characteristics of a terroristic mindset. (For more information VIPS)

Embedded System Design

Design of embedded systems for real-time applications

Despite the remarkable advancement in the processing speed of conventional computers, their processing power does not satisfy the demand for high throughput rates in data-intensive computer vision applications. On the other hand, application specific integrated circuits (ASICs) can be designed to solve the processing speed concerns of any particular computer vision application; however, these ASIC devices are not flexible enough to support a variety of computer vision algorithms. In this project, we attempt to design a massively parallel architecture that combines the high processing speed of an ASIC device with the flexibility of a general purpose processor. The parallel architecture can achieve much higher processing rates than conventional computers because it exploits the inherent parallelism – the ability to carry out many operations simultaneously – in computer vision applications. A set of special SIMD instructions that take advantage of this inherent parallelism will be developed to support the massively parallel architecture. This special instruction set provides the flexibility to perform various image processing algorithms. The designed architecture is expected to process high-resolution images (typically 2 Mpixels) in real time (at least 30 frames per second). Some of the applications to be supported by the designed architecture are developed within the laboratory, including image enhancement, skin extraction, face detection, face tracking and face recognition. (For more information VIPS)

Multilevel digital architecture for neural network based pattern recognition

The design and development of a digital implementation of a multilevel feed-forward neural network architecture for face recognition, based on statistical features representing Eigenfaces, is presented. The architecture is divided into three parts: feature extractor, classifier and identifier. The Eigenface extractor architecture is developed with an efficient design strategy in which all M weight values corresponding to the Eigenfaces are generated simultaneously from the M images representing the eigenvectors and the test input image. The multilayer neural network classifier is trained using the error back-propagation algorithm. A novel multilevel digital architecture is developed for the implementation of the multilayer feed-forward neural network for categorizing the input vectors into specific output classes. At the output of the back-propagation network, a maximization network performs the final classification of the multilevel outputs from the neural network. The data interpretation concepts adopted in the system design led to an efficient design methodology, which eliminated the complex computations needed to implement a multilayer perceptron using the sigmoid function. ...more...

Dynamic rescheduling of switching activity in video coding systems

Future power demands of wireless video applications are driving the need for innovations in power optimization at both the algorithmic and architectural levels. One of the most power-consuming modules in such applications is video coding. Architectures for MPEG-4, the current standard for video coding, require low power compliance. In video data, neighboring pixels frequently have the same magnitude, and preventing repetitive computations on these pixel values through intelligent data reuse is an effective power optimization technique. Furthermore, a large amount of the data in video produces insignificant results; these computations can also be prevented by blocking the clock signals to the processing modules. Implementing these design concepts in digital systems reduces switching activity, the most important factor in power consumption. ...more...

Real-time correction of barrel distortion in wide-angle camera images

Images captured with typical wide-angle camera lenses show spatial distortion, which necessitates spatial warping for subsequent analysis. In this research, an efficient architecture for an embedded system for the real-time correction of barrel distortion in wide-angle camera images is proposed. The spatial warping procedure follows a methodology based on least-squares estimation to correct the nonlinear distortion in the images. A mathematical model of polynomial mapping is used to map the images from the distorted image space onto the warped image space. The model parameters include the expansion polynomial coefficients, the distortion center and the corrected center. The spatial warping model is applied to several gastrointestinal images. A very high speed pipelined architecture for the real-time correction of barrel distortion in wide-angle camera images is being developed in the VLSI Systems Laboratory. The CORDIC based hardware design is suitable for an 8-bit input image of size up to 2056x2056 pixels, is pipelined to operate at a clock frequency of 33 MHz, and produces the corrected image at a rate of 30 frames per second. The VLSI system will facilitate the use of dedicated hardware that could be mounted along with the camera unit. ...more...
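
The radial polynomial mapping at the heart of the correction can be sketched as below. The coefficients here are illustrative placeholders; the actual system estimates them by least squares from calibration data:

```python
import numpy as np

def undistort_points(pts, center, coeffs):
    """Correct barrel distortion with a radial polynomial mapping:
    each point is moved along its ray from the distortion center by
    r_corrected = r * (a0 + a1*r + a2*r^2 + ...).
    pts: (n, 2) array; center: (2,) distortion center;
    coeffs: polynomial coefficients [a0, a1, ...] (illustrative,
    not least-squares estimates)."""
    d = pts - center
    r = np.linalg.norm(d, axis=1, keepdims=True)
    scale = sum(a * r ** i for i, a in enumerate(coeffs))
    return center + d * scale
```

A full image warp would apply the inverse mapping per output pixel with interpolation; the hardware pipeline evaluates the same polynomial with CORDIC-based arithmetic.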

Unidirectional CORDIC

Unidirectional CORDIC algorithm for computation of trigonometric and transcendental functions

A novel technique for the pre-computation of rotation bits for unidirectional CORDIC is developed in the VLSI Systems Laboratory. The unidirectional CORDIC algorithm differs from conventional CORDIC in the direction of rotation: in the conventional algorithm, the angle is rotated in both clockwise and counterclockwise directions, whereas in the unidirectional CORDIC algorithm rotation is only counterclockwise. Some iterations therefore perform no rotation, reducing the number of rotations required to compute a value. Hence the unidirectional CORDIC approach offers significant improvements in speed and power savings. A new technique is developed to pre-compute the rotation bits for any given angle. Experimental results obtained with computations of trigonometric and hyperbolic functions using the pre-computed bits show accuracy on the order of 10^-8. ...more...
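
The pre-computable rotation bits can be illustrated with a floating-point sketch: each micro-rotation is either applied (counterclockwise) or skipped, and the scale factor is accumulated only over applied rotations. Real hardware would use fixed-point shift-add operations and the pre-computed bit sequence instead; the greedy bit selection below is an assumption for illustration, not the laboratory's pre-computation technique:

```python
import math

def unidirectional_cordic(angle, n=30):
    """Compute (cos(angle), sin(angle)) for angle in [0, ~1.74] rad
    using only counterclockwise micro-rotations of atan(2^-i). The
    decision at each stage depends only on the target angle, so the
    rotation bits can be pre-computed."""
    x, y, acc, k = 1.0, 0.0, 0.0, 1.0
    for i in range(n):
        step = math.atan(2.0 ** -i)
        if acc + step <= angle + 2.0 ** -(n - 1):  # rotation bit = 1
            x, y = x - y * 2.0 ** -i, y + x * 2.0 ** -i
            acc += step
            k *= math.sqrt(1.0 + 4.0 ** -i)  # unnormalized rotation grows length
    return x / k, y / k
```

Because skipped iterations do not rotate, the scale factor k varies with the input angle, which is why it is tracked per run here rather than folded into a fixed constant as in conventional CORDIC.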

Unidirectional CORDIC for asynchronous design in DSP systems ...more...

Pre-computation of rotation digits for unidirectional Flat-CORDIC ...more...

Multiple-valued Logic

Optimization techniques for the design of multiple-valued PLAs

An optimization technique for the design of two types of multiple-valued PLAs is developed. In a type-I PLA, the multiple-valued function is realized directly, whereas in a type-II PLA, output encoding is used to encode the binary output of the PLA. In both types, multiple-function literal circuits are used for minimization. It is observed that the new technique leads to a considerably smaller PLA than earlier techniques. ...more...

Specific Research Projects

Research Groups