Research

Radar Waveform Recognition With Time-Frequency Analysis and Deep Convolutional Network

Due to the heavily occupied radio frequency spectrum, automotive radar-based sensing systems urgently need counter-jamming techniques, which are usually deployed in electronic warfare systems. An essential countermeasure is the use of low probability of intercept (LPI) radar waveforms, whose LPI properties keep radar signals hidden from the intercept receivers commonly deployed alongside radar warning receivers to identify emitters and detect threats. This in turn requires advanced intercept receivers to implement automatic waveform recognition so that LPI signals can be reliably identified. Most existing works suffer from two drawbacks: (i) conventional methods that extract handcrafted low-level features to train traditional classifiers cannot achieve a high recognition rate, and (ii) deep learning (DL) frameworks with conventional convolutional neural network (CNN) architectures hamper the intensive exploitation of intrinsic waveform characteristics. To cope with these issues, we propose an efficient DL-based method to automatically recognize various LPI radar waveforms. Concretely, we leverage the Choi-Williams distribution (CWD) to obtain the time-frequency representation of radar signals. The research introduces LPI-Net, a deep CNN with multiple cascaded processing modules that learn highly discriminative features at multi-scale representations, wherein feature collection is coupled with skip connections to enrich feature diversity and prevent the vanishing-gradient issue.
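As an illustration of the time-frequency analysis step, the following is a minimal pure-Python sketch of a discrete Choi-Williams distribution in its time-lag form. The function name, the per-lag kernel normalization, and the fixed lag window are illustrative choices for readability, not the paper's exact implementation:

```python
import cmath
import math

def choi_williams(x, sigma=1.0, max_lag=4):
    """Simplified discrete Choi-Williams distribution (time-lag form).

    x: list of complex samples; returns W[n][k], the magnitude of the
    distribution at time index n and frequency bin k (N bins).
    """
    N = len(x)
    W = [[0.0] * N for _ in range(N)]
    for n in range(N):
        for k in range(N):
            acc = 0.0 + 0.0j
            for tau in range(-max_lag, max_lag + 1):
                if tau == 0:
                    # zero-lag term: instantaneous power, no smoothing needed
                    r = x[n] * x[n].conjugate()
                else:
                    # the CWD exponential kernel smooths the local
                    # autocorrelation over neighboring time instants mu
                    num, den = 0.0 + 0.0j, 0.0
                    for mu in range(-max_lag, max_lag + 1):
                        a, b = n + mu + tau, n + mu - tau
                        if 0 <= a < N and 0 <= b < N:
                            w = math.exp(-sigma * mu * mu / (4.0 * tau * tau))
                            num += w * x[a] * x[b].conjugate()
                            den += w
                    r = num / den if den > 0.0 else 0.0
                acc += r * cmath.exp(-4j * math.pi * k * tau / N)
            W[n][k] = abs(acc)
    return W
```

For a single complex tone at integer bin f0, the row at an interior time index peaks at k = f0 (and at its alias k = f0 + N/2, a consequence of the doubled frequency axis of the lag product). The resulting N-by-N magnitude map is the kind of two-dimensional representation that can be fed to a CNN such as LPI-Net.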

[paper-WCL2021] [paper-COMML2021]

Figure: LPI radar waveform recognition system model.

Figure: LPI-Net for radar waveform recognition: (a) the overall network architecture and (b) the structure of a processing module.

Figure: Overall network architecture of MCNet involving multiple convolutional blocks connected via skip connections. 

Figure: Description of convolutional blocks developed in the MCNet. (a) the pre-block with dual convolutional flows of asymmetric filters 3×1 and 1×3; (b) the convolutional M-block with triple flows of convolutional filters 3×1, 1×3, and 1×1; and (c) the dimension-reduced version of M-block with an additional max-pooling layer. 

Radio Spectrum Improvement with Automatic Modulation Classification

Nowadays, with the emergence of multiple standards and innovative technologies for wireless communication, in which a radio signal must be encoded by a pre-defined modulation scheme according to the specification of the transmission channel, spectrum analysis through signal classification and modulation recognition models plays a vital role in massive wireless communication systems such as fifth generation (5G). Fundamentally, modulation recognition is the task of classifying the modulation type of a received radio signal, treated as a multi-class decision problem. With the increasing number of advanced modulation algorithms, high recognition accuracy under noise and multipath fading conditions is of great importance. By means of artificial intelligence (AI) algorithms, automatic modulation classification (AMC) is considered a potential solution for efficient spectrum management. A cost-efficient and well-performing ConvNet, namely MCNet, is studied for robust modulation classification against channel impairments, which promisingly enhances the quality of 5G services. Regarding the network architecture, MCNet involves multiple convolutional blocks connected via skip connections.
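As a toy illustration of modulation recognition as a multi-class decision problem, the sketch below separates noise-free BPSK from QPSK symbols using a single handcrafted feature, the second-power moment |E[s^2]| of the constellation. The threshold and the noise-free setup are illustrative; this shallow-feature baseline is exactly the kind of approach that learned models such as MCNet aim to outperform under channel impairments:

```python
import cmath
import math
import random

def classify_bpsk_qpsk(symbols):
    """Decide BPSK vs. QPSK from the second-power moment |E[s^2]|:
    BPSK symbols (+/-1) square to +1, so the moment stays near 1;
    QPSK symbols square to +/-j, so the moment averages toward 0."""
    m2 = sum(s * s for s in symbols) / len(symbols)
    return "BPSK" if abs(m2) > 0.5 else "QPSK"

# illustrative noise-free symbol streams
random.seed(0)
bpsk = [complex(random.choice([-1, 1]), 0) for _ in range(200)]
qpsk = [cmath.exp(1j * (math.pi / 4 + random.randrange(4) * math.pi / 2))
        for _ in range(200)]
```

Under noise and fading, such a fixed-threshold moment test degrades quickly, which motivates data-driven classifiers operating on richer signal representations.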

[paper-Survey2021] [paper-COMML2020] [paper-WCNC2020] [paper-ICCE2020] [paper-TVT2021] [paper-GLOBECOM2020] [code-github]

Figure: Schematic overview of our proposed method for 3-D action recognition. Given a skeleton sequence (a), extracted pose features (b-1) are transformed into a color image I (b-2) by an encoding technique called PoF2I. The action image I is refined by a mechanism of adding or eliminating randomly selected skeleton frames. Numerous action images J (b-3) are generated from I to augment the training set. These action images J are finally fed into deep CNNs (c) for action recognition.

Some remarkable points derived from this research:

3D Skeleton-based Action Recognition with Deep Learning

Given a skeleton sequence, an action can be recognized based on the temporal movement of human posture and the spatial relation of body joints. Most conventional action recognition systems have exploited handcrafted features (for example, histograms of joint location, relative joint location, and spatiotemporal joint motion) to learn recognition models with the help of classifiers; nevertheless, they are unable to competently model long-term action patterns because such features carry only shallow discriminative information. An efficient three-dimensional (3-D) action recognition approach is presented to learn skeleton information by deep convolutional neural networks, in which a novel encoding technique, namely pose feature to image (PoF2I), is introduced to transform the skeleton information into an image-based representation.
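The encoding idea can be sketched as follows: each coordinate axis is normalized over the whole sequence and mapped to one color channel, so a skeleton sequence becomes an image whose rows index joints and whose columns index frames. This is a minimal pure-Python sketch of that idea; the published PoF2I encoding may differ in its exact normalization and joint ordering:

```python
def pof2i(frames):
    """Encode a skeleton sequence into an RGB image.

    frames: list of frames, each a list of (x, y, z) joint coordinates.
    Returns image[j][t] = (r, g, b), joints as rows and time as columns,
    with each axis min-max normalized over the sequence to 0..255.
    """
    def channel(axis):
        vals = [joint[axis] for frame in frames for joint in frame]
        lo, hi = min(vals), max(vals)
        return lo, (hi - lo) or 1.0  # avoid division by zero

    (x0, xs), (y0, ys), (z0, zs) = channel(0), channel(1), channel(2)
    image = []
    for j in range(len(frames[0])):
        row = []
        for frame in frames:
            x, y, z = frame[j]
            row.append((round(255 * (x - x0) / xs),
                        round(255 * (y - y0) / ys),
                        round(255 * (z - z0) / zs)))
        image.append(row)
    return image
```

Once a sequence is an image, standard 2-D CNN backbones and image-level augmentation (such as the random frame addition/elimination described above) apply directly.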

[paper-TII2020] [paper-INS2020] [paper-INDIN2019]

[paper-IMCOM2020] [paper-SAS2019]

Background Extraction Algorithm for Foreground Detection

The appearance of objects in a video sequence can be detected using many techniques, which are generally classified into frame difference, optical flow, and background subtraction. Despite the widespread use of background subtraction techniques, this class of approaches presents some crucial drawbacks, including high computational cost and limited accuracy of the background estimation. The background is fundamentally defined as the reference frame, in which pixel intensities appear with the highest probability. We present an efficient background subtraction method based on a novel background estimation algorithm, Neighbor-based Intensity Correction (NIC), to enhance foreground detection accuracy in dynamic scenes. The idea is to compare two intensity patterns, generated from the background and the current frame, in order to correct the pixel intensity in the current background.

[paper-TCSVT2017] [paper-AVSS2017] [paper-Access2019]

Figure: Overview of the foreground detection method using the NIC based on the background subtraction scheme. 

As the core technique, the intensity-modification rule compares the standard deviations calculated from neighboring pixels to decide whether the current pixel belongs to the background or to the current frame. By adjusting the window size, the algorithm is flexible enough to detect multiple objects moving at different speeds in a scene.
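The decision rule can be sketched as follows; note that the window size, the threshold, and the exact update policy here are illustrative guesses at the idea rather than the published NIC rule:

```python
import math

def nic_update(background, frame, i, j, win=1, tau=10.0):
    """Sketch of a neighbor-based intensity correction step: compare
    the spread of intensities in the (2*win+1)-sized neighborhood of
    pixel (i, j) in the background and in the current frame, then
    decide whether the pixel should refresh the background model."""
    def local_std(img):
        vals = [img[r][c]
                for r in range(max(0, i - win), min(len(img), i + win + 1))
                for c in range(max(0, j - win), min(len(img[0]), j + win + 1))]
        mean = sum(vals) / len(vals)
        return math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))

    sb, sf = local_std(background), local_std(frame)
    # If the two neighborhood patterns are similarly smooth, treat the
    # pixel as background and correct the background model with it.
    if abs(sb - sf) < tau:
        background[i][j] = frame[i][j]
    return background
```

A larger `win` averages over slower scene changes, which is the knob referred to above for handling objects moving at different speeds.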

Figure: The workflow of a proposed interaction recognition method using spatio-temporal relation features and topic model. 

We go beyond the problem of recognizing video-based human interactive activities by studying a novel approach that permits a deep understanding of complex person-person activities based on knowledge derived from human pose analysis. The research has three technical highlights:
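As a small illustration of the kind of spatial relation feature such a pose-based pipeline can build on, the sketch below computes all pairwise inter-person joint distances for one frame; the feature set of the published method is richer and also captures temporal relations, so this is only a hypothetical simplified descriptor:

```python
import math

def relation_features(pose_a, pose_b):
    """Pairwise Euclidean distances between every joint of person A and
    every joint of person B in one frame. Concatenated over a sequence,
    such distances describe how two interacting people move relative to
    each other (approaching, gathering, walking together, ...)."""
    return [math.dist(pa, pb) for pa in pose_a for pb in pose_b]
```

For poses with J joints each, this yields J*J distances per frame; quantized versions of such descriptors can serve as the "words" consumed by a topic model, as in the workflow figure above.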

Interactive Human Activity Recognition

In recent decades, human activity recognition has been an active research area in computer vision and artificial intelligence due to its wide range of potential applications, such as indoor-outdoor surveillance, human-robot interaction, and human-computer interaction. Although it has received attention from the scientific community, an effective method for recognizing activities in real environments remains a challenge because of variations in appearance, mutual occlusion, and object interactions. Most existing approaches have concentrated on low-level features, known as local spatial-temporal features, instead of the human body representation, known as the skeleton, due to limitations in pose estimation performance. Group action, generally performed by visually separable people with complicated interactions, such as walking together, approaching, and gathering, has been investigated using human-based features and tracking information for detection and recognition.

[paper-INS2016] [paper-INS2018]

Figure: The processing flow of embedding process [paper-INS2018].

Digital Content Authentication - Image Watermarking

Millions of multimedia contents are generated daily and distributed among diverse social networks, websites, and applications, fostered by the rapid growth of mobile devices and the Internet. Particularly noticeable is the current pace of creation and sharing of digital images, which are ubiquitously captured to record and show diverse aspects of our personal and social life. This poses important challenges in terms of transmission, storage, and especially the usage of these data, in which copyright protection plays a crucial role. Unprotected images can be accessed, downloaded, and reused by others illegitimately. Consequently, personal images might be exploited for commercial or other purposes by third parties without legally obtaining the user's consent. To avoid such situations, efficient and robust techniques are especially required for digital image copyright protection and authentication. This research develops an efficient digital image watermarking model based on a coefficient quantization technique that intelligently encodes the owner's information into each color channel to improve the imperceptibility and robustness of the hidden information.
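A common way to realize coefficient quantization for watermarking is quantization index modulation (QIM): each watermark bit selects one of two interleaved quantizer lattices, and the carrier coefficient is snapped to the nearest point of the selected lattice. The sketch below shows a generic QIM scheme with an illustrative step size; it is not the exact quantization rule of the published model:

```python
def embed_bit(coeff, bit, step=8.0):
    """Snap a (e.g., transform-domain) coefficient onto the lattice
    assigned to the bit: multiples of `step` for bit 0, the same
    lattice shifted by `step / 2` for bit 1."""
    offset = bit * step / 2.0
    return step * round((coeff - offset) / step) + offset

def extract_bit(coeff, step=8.0):
    """Recover the bit as the index of the nearer lattice."""
    d0 = abs(coeff - embed_bit(coeff, 0, step))
    d1 = abs(coeff - embed_bit(coeff, 1, step))
    return 0 if d0 <= d1 else 1
```

The step size trades imperceptibility for robustness: any distortion of the coefficient smaller than `step / 4` leaves the extracted bit unchanged, while a larger step perturbs the host image more visibly.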

[paper-ESWA2016] [paper-INS2018]

Figure: Proposed watermarking model flowchart with watermark embedding and extraction processes [paper-ESWA2016].