In recent years, real-time radio-frequency (RF) signal classification has become essential across diverse applications such as radar-based human activity recognition (HAR), RF waveform classification, and wireless communications. Traditionally, RF classification relies on a two-step approach involving computationally expensive transformations (like spectrogram generation), followed by classification using deep neural networks (DNNs). However, these methods face challenges such as high computational latency, limited interpretability, and inefficiency in processing complex-valued RF signals directly.
To overcome these limitations, we introduce PLFNets, a novel deep learning framework that uses parameterized learnable filters (PLFs) to classify complex-valued raw IQ radar data directly.
Parameterized Learnable Filters (PLFs):
Developed a generalized and flexible filter-learning framework utilizing structured filter functions (see the sketch after this list), including:
Sinc
Gaussian
Gammatone
Ricker Wavelets
Each filter function is parameterized by physically meaningful attributes such as center frequency and bandwidth, facilitating interpretability.
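One way to picture a PLF layer is as a 1D convolution whose kernels are generated from a handful of learnable, named parameters rather than learned tap by tap. The following minimal sketch assumes a sinc-based band-pass variant with learnable center frequency and bandwidth; the module name `SincPLF` and all defaults are illustrative, not the paper's reference implementation.

```python
# Minimal sketch of a sinc-based parameterized learnable filter (PLF) layer.
# Hypothetical module and parameter names; not the authors' reference code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SincPLF(nn.Module):
    def __init__(self, n_filters=32, kernel_size=129):
        super().__init__()
        self.kernel_size = kernel_size
        # Learnable, physically meaningful parameters (normalized frequencies).
        self.center = nn.Parameter(torch.rand(n_filters) * 0.4 + 0.05)
        self.bandwidth = nn.Parameter(torch.rand(n_filters) * 0.05 + 0.01)
        t = torch.arange(kernel_size) - (kernel_size - 1) / 2
        self.register_buffer("t", t)
        self.register_buffer("window", torch.hamming_window(kernel_size, periodic=False))

    def forward(self, x):                      # x: (batch, 1, time), real-valued here
        fc = self.center.clamp(0.0, 0.5)       # keep frequencies in a valid range
        bw = self.bandwidth.clamp(1e-3, 0.5)
        low = (fc - bw / 2).clamp(min=0.0)
        high = (fc + bw / 2).clamp(max=0.5)
        t = self.t.unsqueeze(0)                # (1, kernel_size)

        # Band-pass impulse response = difference of two low-pass sinc kernels.
        def lowpass(cutoff):
            return 2 * cutoff.unsqueeze(1) * torch.sinc(2 * cutoff.unsqueeze(1) * t)

        kernels = (lowpass(high) - lowpass(low)) * self.window   # (n_filters, kernel_size)
        kernels = kernels.unsqueeze(1)                           # (n_filters, 1, kernel_size)
        return F.conv1d(x, kernels, padding=self.kernel_size // 2)
```

Swapping in a Gaussian, Gammatone, or Ricker prototype only changes how the kernels are generated from the same kind of named parameters.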
Complex-Valued Neural Network Architecture:
Introduced a fully complex-valued CNN structure that directly processes raw IQ radar signals.
Integrated PLFs as the initial layer, significantly reducing computational complexity and latency by learning spectral features directly from the raw signal.
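A common way to realize complex-valued convolution over raw IQ samples is to carry the in-phase and quadrature parts as two real tensors and combine two real convolutions according to the complex multiplication rule. The sketch below shows that general pattern; it is an assumption about the layer structure, not the exact PLFNets layer.

```python
# Sketch of a complex-valued 1D convolution over raw IQ data.
# (a + jb) * (w_r + j w_i) = (a*w_r - b*w_i) + j(a*w_i + b*w_r)
import torch
import torch.nn as nn

class ComplexConv1d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, **kw):
        super().__init__()
        self.conv_r = nn.Conv1d(in_ch, out_ch, kernel_size, **kw)
        self.conv_i = nn.Conv1d(in_ch, out_ch, kernel_size, **kw)

    def forward(self, x_re, x_im):
        y_re = self.conv_r(x_re) - self.conv_i(x_im)
        y_im = self.conv_i(x_re) + self.conv_r(x_im)
        return y_re, y_im

# Raw IQ: I and Q parts of one chirp/pulse stream, batch of 8.
i_part = torch.randn(8, 1, 1024)
q_part = torch.randn(8, 1, 1024)
layer = ComplexConv1d(1, 16, kernel_size=65, padding=32)
y_re, y_im = layer(i_part, q_part)   # complex feature maps, each (8, 16, 1024)
```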
Enhanced Interpretability:
Learned filter parameters provide clear physical interpretations (e.g., identifying critical frequency bands), enabling better understanding and trust in the model’s decisions.
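Assuming a layer like the hypothetical `SincPLF` sketched above, inspecting the learned bands after training reduces to reading two parameter vectors; the snippet below uses an untrained instance only to show the access pattern, and the sampling rate is an arbitrary placeholder.

```python
# Rank PLF filters by bandwidth and print their pass-bands in physical units.
# Assumes the SincPLF sketch above is in scope; values here are illustrative.
import torch

sample_rate_hz = 2000.0                  # placeholder slow-time sampling rate
plf = SincPLF(n_filters=32, kernel_size=129)
with torch.no_grad():
    fc = plf.center.clamp(0.0, 0.5) * sample_rate_hz
    bw = plf.bandwidth.clamp(1e-3, 0.5) * sample_rate_hz
    top = torch.argsort(bw)[:4]          # four narrowest (most selective) filters
    for k in top.tolist():
        print(f"filter {k}: center {fc[k]:.1f} Hz, bandwidth {bw[k]:.1f} Hz")
```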
Datasets:
Evaluated extensively using both synthetic RF modulation waveforms (LFM, Barker, Frank, Costas, Rectangular) and a real-world American Sign Language (ASL) radar dataset.
Accuracy and Efficiency:
Achieved an average accuracy improvement of 47% over conventional 1D CNNs on raw RF data.
Delivered approximately 7% improvement over CNNs using real-valued learnable filters.
Matched or exceeded the accuracy of computationally intensive 2D CNN models applied to micro-Doppler spectrograms, while reducing computational latency by around 75%.
Computational Advantages:
Demonstrated significantly lower training and inference times, making PLFNets highly suitable for real-time and near real-time RF sensing applications.
The proposed PLFNets framework sets a new standard for computational efficiency, interpretability, and accuracy in radar-based applications. It is especially valuable for:
Radar-based Human Activity Recognition (HAR)
RF waveform modulation recognition
Real-time RF sensor deployments in smart environments
Edge computing platforms for immediate classification tasks
Further research aims to extend the PLFNets framework to:
Real-time embedded system implementations.
Complex multi-signal scenarios.
Broader application domains, including joint radar-communication systems and advanced wireless communications.
Through PLFNets, the integration of deep learning and radar signal processing reaches a new level, offering an innovative pathway towards efficient, real-time, and interpretable RF signal analysis.
The PLFNet Architecture
The PLF Block
Top four filters projected on top of a Micro-Doppler Spectrogram
This project showcases a novel deep learning framework designed to reconstruct high-resolution time-frequency representations (micro-Doppler signatures) from radar signals. These signatures are crucial for applications like human activity recognition (HAR). Here's a breakdown of the key ideas:
Motivation and Challenges:
Traditional methods, such as the Short-Time Fourier Transform (STFT), require a trade-off between time and frequency resolution. They are also highly sensitive to noise and demand extensive parameter tuning, which can limit their effectiveness in practical, real-world scenarios.
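The trade-off is easy to reproduce: a short analysis window gives fine time but coarse frequency resolution, and a long window the reverse. A small illustration using `scipy.signal.stft` with arbitrary parameter values:

```python
# Illustrating the STFT time-frequency resolution trade-off.
import numpy as np
from scipy.signal import stft

fs = 1000.0
t = np.arange(0, 2.0, 1 / fs)
# A tone that hops from 100 Hz to 110 Hz at t = 1 s.
x = np.where(t < 1.0, np.cos(2 * np.pi * 100 * t), np.cos(2 * np.pi * 110 * t))

for nperseg in (64, 512):                       # short vs. long analysis window
    f, tt, Z = stft(x, fs=fs, nperseg=nperseg)
    print(f"window {nperseg}: df = {f[1] - f[0]:.2f} Hz, dt = {tt[1] - tt[0]:.3f} s")
# Short window: coarse frequency bins, fine time steps; long window: the reverse.
```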
The HRSpecNet Approach:
HRSpecNet tackles these challenges with a novel, three-stage deep neural network architecture that directly converts 1D complex radar signals into high-quality 2D time-frequency representations:
Autoencoder Module:
This stage denoises the input signal. By learning a compact representation, it suppresses interference and improves overall signal quality.
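A minimal sketch of such a denoising stage, assuming the complex signal is carried as two real channels (all layer sizes are illustrative):

```python
# Sketch of a 1D convolutional autoencoder used as a denoising front end.
# I/Q carried as two real channels; widths and depths are illustrative.
import torch
import torch.nn as nn

class DenoisingAE(nn.Module):
    def __init__(self, ch=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(ch, 16, 9, stride=2, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, 9, stride=2, padding=4), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(32, 16, 9, stride=2, padding=4, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(16, ch, 9, stride=2, padding=4, output_padding=1),
        )

    def forward(self, x):              # x: (batch, 2, time)
        return self.decoder(self.encoder(x))

x = torch.randn(4, 2, 1024)            # noisy raw I/Q
x_hat = DenoisingAE()(x)               # denoised estimate, same shape
```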
Convolutional STFT Module:
Instead of using a fixed window length like the traditional STFT, this module uses learnable convolutional filters. It creates multiple intermediate frequency domain representations, adapting to various signal characteristics.
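One way to sketch this module is as several 1D convolution banks with different receptive fields, each producing a candidate time-frequency view; the kernel lengths, bin count, and hop below are assumptions, not the published configuration.

```python
# Sketch of a "convolutional STFT": several 1D conv banks with different
# receptive fields, each producing a candidate time-frequency representation.
import torch
import torch.nn as nn

class ConvSTFT(nn.Module):
    def __init__(self, in_ch=2, n_bins=128, kernel_sizes=(64, 128, 256), hop=16):
        super().__init__()
        self.banks = nn.ModuleList([
            nn.Conv1d(in_ch, n_bins, k, stride=hop, padding=k // 2)
            for k in kernel_sizes
        ])

    def forward(self, x):                        # x: (batch, 2, time)
        maps = [bank(x) for bank in self.banks]  # each: (batch, n_bins, frames_k)
        frames = min(m.shape[-1] for m in maps)
        maps = [m[..., :frames] for m in maps]   # align frame counts across banks
        return torch.stack(maps, dim=1)          # (batch, views, n_bins, frames)

tf_views = ConvSTFT()(torch.randn(4, 2, 1024))
print(tf_views.shape)                            # torch.Size([4, 3, 128, 65])
```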
U-Net Module:
Finally, the U-Net fuses the features from the previous block to produce a high-resolution spectrogram. Its design preserves subtle details, such as crossings of frequency components, even in noisy conditions.
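The fusion step can be pictured as a small encoder-decoder with a skip connection that takes the stacked intermediate views and returns a single high-resolution map. A deliberately tiny, self-contained sketch (channel counts and input shape are assumptions):

```python
# Tiny sketch of the U-Net-style fusion stage: several time-frequency "views"
# in, one fused high-resolution spectrogram out. All sizes are illustrative.
import torch
import torch.nn as nn

class TinyUNetFusion(nn.Module):
    def __init__(self, n_views=3, out_ch=1):
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv2d(n_views, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.up = nn.Sequential(nn.ConvTranspose2d(32, 32, 2, stride=2), nn.ReLU())
        self.head = nn.Conv2d(32 + n_views, out_ch, 3, padding=1)

    def forward(self, views):                           # (batch, n_views, bins, frames)
        y = self.up(self.down(views))
        y = y[..., :views.shape[-2], :views.shape[-1]]  # crop back to input size
        return self.head(torch.cat([y, views], dim=1))  # skip connection keeps detail

views = torch.randn(2, 3, 128, 65)                      # e.g. output of a conv-STFT stage
spec = TinyUNetFusion()(views)                          # (2, 1, 128, 65)
```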
Performance and Validation:
The HRSpecNet framework was evaluated on both synthetic signals and real-world data (including a challenging dataset of American Sign Language gestures). The results demonstrated:
Improved Resolution and Noise Robustness:
HRSpecNet produces clearer, sparser, and more accurate time-frequency representations compared to traditional STFT and other deep learning approaches.
Enhanced Classification Accuracy:
When the generated micro-Doppler signatures were used for classifying human activities, HRSpecNet achieved higher accuracy, improving performance by over 3% relative to state-of-the-art methods.
Computational Efficiency:
Despite its advanced capabilities, HRSpecNet maintains high computational efficiency, making it a practical choice for real-time applications.
Impact:
By effectively addressing the limitations of conventional techniques, HRSpecNet opens up new possibilities in radar-based HAR applications—from improved safety in automotive systems to enhanced performance in healthcare monitoring and security systems. In summary, HRSpecNet represents a significant step forward in leveraging deep learning for radar signal processing, providing robust, high-resolution micro-Doppler signatures that enable more accurate and efficient human activity recognition.
The Complete Proposed Architecture
Flow-diagram of the proposed architecture
HRFreqNet is a deep neural network designed to overcome the limitations of FFT-based frequency estimation for radar range profiling, addressing issues such as the Rayleigh resolution limit and high sidelobe levels. It employs an auto-encoder block that enhances SNR by denoising 1D complex time-domain signals, a frequency estimation block that learns frequency transformations to generate pseudo frequency representations, and a 1D-UNET block that reconstructs high-resolution frequency representations. This integrated approach yields enhanced resolution, improved estimation accuracy, and effective noise suppression, producing range profiles that are accurate and sparse with low sidelobe levels, as demonstrated on both synthetic and real-world radar data.
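A structural sketch of that pipeline is given below, with a plain linear layer standing in for the learned frequency transform and a small convolutional block standing in for the 1D-UNET; every size and layer choice is an assumption, intended only to show how the three blocks connect.

```python
# Structural sketch of the HRFreqNet-style pipeline: denoise -> learned
# frequency transform -> 1D refinement. Placeholder layers, not the paper's design.
import torch
import torch.nn as nn

class HRFreqNetSketch(nn.Module):
    def __init__(self, n_samples=512, n_bins=1024):
        super().__init__()
        self.denoise = nn.Sequential(               # auto-encoder block (collapsed)
            nn.Conv1d(2, 16, 9, padding=4), nn.ReLU(),
            nn.Conv1d(16, 2, 9, padding=4),
        )
        self.freq_transform = nn.Linear(2 * n_samples, n_bins)  # pseudo frequency rep.
        self.refine = nn.Sequential(                # stand-in for the 1D-UNET block
            nn.Conv1d(1, 32, 9, padding=4), nn.ReLU(),
            nn.Conv1d(32, 1, 9, padding=4), nn.ReLU(),           # sparse, non-negative
        )

    def forward(self, iq):                          # iq: (batch, 2, n_samples)
        z = self.denoise(iq)
        pseudo = self.freq_transform(z.flatten(1))  # (batch, n_bins)
        return self.refine(pseudo.unsqueeze(1)).squeeze(1)       # high-res range profile

profile = HRFreqNetSketch()(torch.randn(4, 2, 512))  # (4, 1024)
```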
The Complete Proposed Architecture
Traditional RF classification systems typically assume the presence of only a single target within the field-of-view, making multi-target recognition a challenging task. This is largely due to conventional radar signal processing techniques, which result in the overlapping of micro-Doppler signatures from multiple targets. Our project addresses this limitation through the development of an innovative angular subspace projection technique.
This approach generates multiple radar data cubes conditioned on angle (RDC-ω), enabling the separation of signals in the raw radar data. As a result, deep neural networks (DNNs) can be effectively employed to process raw RF data—or any other data representation—in multi-target scenarios. In situations where targets are in close proximity and classical techniques struggle to differentiate between them, our method enhances the relative signal-to-noise ratio (SNR) between targets. This produces multi-view spectrograms that substantially improve classification accuracy when used as input for a specially designed multi-view DNN.
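At a high level, the projection resembles spatial filtering: for each angle ω, the receive channels of the radar data cube are combined with a steering vector so that returns from that angular sector dominate the resulting cube. The numpy sketch below illustrates that pattern under a uniform-linear-array, half-wavelength-spacing assumption; it approximates the idea rather than reproducing the paper's exact projection.

```python
# Hedged sketch of angle-conditioned radar data cubes (RDC-w) obtained by
# combining the receive channels of a ULA with a steering vector per angle.
import numpy as np

def angle_conditioned_cubes(rdc, angles_deg, d_over_lambda=0.5):
    """rdc: complex array (n_rx, n_chirps, n_samples); returns one cube per angle."""
    n_rx = rdc.shape[0]
    cubes = []
    for theta in np.deg2rad(angles_deg):
        # Steering vector for a uniform linear array with half-wavelength spacing.
        a = np.exp(-2j * np.pi * d_over_lambda * np.arange(n_rx) * np.sin(theta))
        w = a / n_rx                                    # conventional beamformer weights
        # Weighted sum over RX channels -> cube "conditioned" on this angle.
        cubes.append(np.tensordot(np.conj(w), rdc, axes=(0, 0)))
    return cubes                                        # list of (n_chirps, n_samples)

rdc = np.random.randn(8, 128, 256) + 1j * np.random.randn(8, 128, 256)
rdc_w = angle_conditioned_cubes(rdc, angles_deg=[-30, 0, 30])
```

In a multi-target scene, running the same downstream classifier on each angle-conditioned cube effectively turns one multi-target problem into several near single-target ones, which is what allows a DNN trained on single-target data to be reused.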
Our experimental results demonstrate that, for a nine-class human activity recognition problem in a scenario involving three individuals, we achieved an accuracy of 97.8% using a DNN trained on single-target data. Additionally, the proposed approach has shown significant performance boosts in challenging close-proximity scenarios, such as sign language recognition and side-by-side activities.
This project represents a major step forward in RF-based multi-target recognition, with promising implications for applications in surveillance, human activity monitoring, and advanced human-machine interaction systems.
Proposed end-to-end framework of the angular projection method for a classification application
Radar-based technologies are now ubiquitous in both civilian and military applications, ranging from navigation to electronic warfare, yet their widespread adoption has led to challenges like spectrum congestion and interference between radar and telecommunications signals, making fast and accurate waveform recognition essential. Traditional methods that transform raw RF data into alternative feature spaces, such as the time-frequency domain, often suffer from low feature representation fidelity and high computational costs. To address these issues, we propose a filter-based deep learning framework that learns directly from raw RF data by incorporating parameterized filters with learnable cutoff frequencies, which enable the extraction of high-level features with clear physical interpretations. Initial validation on synthetic RF waveform receptions, using both Sinc and Gabor filters, demonstrates state-of-the-art performance with an overall accuracy of 97.4%—all achieved without any extensive data preprocessing.
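As an illustration of the second filter family, a Gabor-style kernel (a Gaussian-windowed sinusoid) can likewise be generated from a learnable center frequency and width; the sketch below is hypothetical and mirrors the sinc example given earlier.

```python
# Sketch of a Gabor-style parameterized filter bank with learnable center
# frequency and window width; illustrative, not the paper's implementation.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaborPLF(nn.Module):
    def __init__(self, n_filters=32, kernel_size=129):
        super().__init__()
        self.kernel_size = kernel_size
        self.center = nn.Parameter(torch.rand(n_filters) * 0.4 + 0.05)  # cycles/sample
        self.sigma = nn.Parameter(torch.full((n_filters,), 10.0))       # window width
        t = torch.arange(kernel_size) - (kernel_size - 1) / 2
        self.register_buffer("t", t)

    def kernels(self):
        t = self.t.unsqueeze(0)                                         # (1, K)
        envelope = torch.exp(-0.5 * (t / self.sigma.unsqueeze(1)) ** 2)
        carrier = torch.cos(2 * math.pi * self.center.unsqueeze(1) * t)
        return (envelope * carrier).unsqueeze(1)                        # (n_filters, 1, K)

    def forward(self, x):                                               # x: (batch, 1, time)
        return F.conv1d(x, self.kernels(), padding=self.kernel_size // 2)
```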
This study fills a research gap by directly comparing RF sensors for real-time human motion recognition in applications like human-computer interfaces and smart environments, using datasets collected from a 77 GHz mmWave FMCW TI Radar and a Raspberry Pi 3B+ equipped with Nexmon firmware for Wi-Fi. Spectrograms are generated from both systems under the same scenarios, and the performance is evaluated in a 7-class human activity recognition (HAR) task. The results show that radar outperforms Wi-Fi by 32.7% in accuracy, underscoring the superiority of radar technology in HAR while also highlighting Wi-Fi’s potential for indoor activity monitoring. Additionally, both the datasets and the associated code have been made publicly accessible to support further research in this emerging field.
Autonomous vehicles rely on advanced driver-assisted systems (ADAS) equipped with sensors such as radar, lidar, and cameras to understand and navigate their surroundings. These systems help address various driving scenarios—from collision avoidance and lane changes to navigating intersections and managing sudden speed variations. However, one significant barrier to full automotive autonomy is the challenge of navigating unstructured environments where traditional traffic signals (like traffic lights) are absent or non-operational.
In these scenarios, human intervention is often required to direct traffic through hand signals or gestures. Interpreting such human body language and gestures presents a formidable challenge for autonomous systems. To tackle this, our project introduces a novel dataset that captures traffic signaling motions using an integrated suite of sensors: millimeter-wave (mmWave) radar, cameras, lidar, and a motion-capture system. The dataset reflects real-world conditions based on the U.S. traffic system.
Preliminary experiments employing radar micro-Doppler (µ-D) signature analysis in conjunction with basic convolutional neural networks (CNNs) have achieved approximately 92% accuracy in classifying traffic signaling motions. These results underscore the potential of deep learning to enhance the interpretation of human gestures, thereby improving the responsiveness and safety of ADAS in complex, real-world environments.
Block Diagrams for CNN architectures
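For reference, a basic CNN over µ-D spectrograms can be as small as a few convolutional blocks followed by a linear classifier; the sketch below is a generic example with an assumed input size and class count, not the architecture shown in the block diagrams.

```python
# Generic sketch of a small CNN classifier over micro-Doppler spectrograms.
# Input size (1 x 128 x 128) and class count are illustrative assumptions.
import torch
import torch.nn as nn

class MicroDopplerCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, spec):                       # spec: (batch, 1, freq, time)
        return self.classifier(self.features(spec).flatten(1))

logits = MicroDopplerCNN(n_classes=10)(torch.randn(8, 1, 128, 128))  # (8, 10)
```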