We are looking into improving gaze estimation on smartphones using their front camera. We are designing dynamic, personalised calibration techniques that adjust the prediction algorithm as environmental conditions and holding postures change. We are also interested in designing different types of gaze interfaces and investigating their usability on smartphones to support real-time applications.
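As a minimal sketch of what a lightweight, personalised re-calibration step could look like, the snippet below fits an affine correction on top of a generic gaze estimator's raw predictions whenever a few fresh calibration points are collected (for example, after a change of holding posture). The function names and the affine form are illustrative assumptions, not the actual technique we deploy.

```python
# Sketch only: re-fit a small affine correction from a handful of on-screen
# calibration taps, then apply it to subsequent raw gaze predictions.
import numpy as np

def fit_affine_correction(raw_points, true_points):
    """Least-squares affine map from raw gaze predictions to calibration targets.

    raw_points, true_points: (N, 2) arrays of screen coordinates collected
    during a brief calibration phase (e.g. the user taps N on-screen dots).
    """
    X = np.hstack([raw_points, np.ones((len(raw_points), 1))])  # add bias term
    W, *_ = np.linalg.lstsq(X, true_points, rcond=None)         # (3, 2) affine parameters
    return W

def apply_correction(raw_point, W):
    """Correct a single raw gaze prediction with the fitted affine map."""
    x = np.append(raw_point, 1.0)
    return x @ W

# Example: re-calibrate with five tapped dots, then correct a new prediction.
raw = np.array([[0.12, 0.20], [0.80, 0.22], [0.50, 0.55], [0.15, 0.85], [0.83, 0.88]])
true = np.array([[0.10, 0.18], [0.82, 0.20], [0.50, 0.52], [0.12, 0.88], [0.85, 0.90]])
W = fit_affine_correction(raw, true)
print(apply_correction(np.array([0.45, 0.60]), W))
```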
Sensor-based human activity recognition (HAR) is having a significant impact on a wide range of applications in smart cities, smart homes, and personal healthcare. Wide deployment of HAR systems, however, often faces the challenge of annotation scarcity. To tackle this problem, we have proposed several unsupervised domain adaptation techniques in which activity knowledge from a well-annotated domain is transferred to a new, unlabelled domain.
ContrasGAN uses bi-directional generative adversarial networks for heterogeneous feature transfer and contrastive learning to capture distinctive features between classes. We evaluate ContrasGAN on three commonly used HAR datasets under conditions of cross-body, cross-user, and cross-sensor transfer learning. Experimental results show that ContrasGAN outperforms a number of state-of-the-art techniques on all these tasks, with relatively low computational cost.
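The class-level contrastive objective mentioned above can be illustrated with a standard supervised contrastive loss: embeddings of samples from the same activity class are pulled together and those of different classes are pushed apart. This is a generic sketch of such a loss, not ContrasGAN's exact formulation, and the GAN components are omitted.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss.

    features: (N, D) embeddings; labels: (N,) activity class ids.
    """
    z = F.normalize(features, dim=1)
    sim = z @ z.t() / temperature                          # (N, N) scaled cosine similarities
    n = z.size(0)
    mask_self = torch.eye(n, dtype=torch.bool, device=z.device)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~mask_self   # same-class pairs
    sim = sim.masked_fill(mask_self, float('-inf'))        # exclude self-similarity
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(mask_self, 0.0)        # avoid -inf * 0 on the diagonal
    pos_counts = pos.sum(1)
    valid = pos_counts > 0                                  # anchors with at least one positive
    loss = -(log_prob * pos.float()).sum(1)[valid] / pos_counts[valid]
    return loss.mean()

# Example: 8 embeddings from 3 activity classes.
feats = torch.randn(8, 32)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 0, 1])
print(supervised_contrastive_loss(feats, labels))
```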
Shift-GAN integrates a bidirectional GAN with kernel mean matching in a novel way to learn intrinsic, robust feature transfer between highly heterogeneous domains. It outperforms ten state-of-the-art domain adaptation techniques across a large number of human activity recognition tasks.
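Kernel mean matching re-weights source samples so that their mean in a kernel feature space matches that of the target domain. Below is a simplified sketch (the function names are ours): it solves an unconstrained version of the problem and clips the weights, whereas standard KMM solves a box-constrained quadratic programme.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kmm_weights(Xs, Xt, gamma=1.0, reg=1e-3):
    """Estimate per-sample weights for source data Xs so that the weighted
    source feature mean matches the target mean of Xt in an RBF kernel space.
    Simplified sketch: unconstrained solve plus clipping, not the full QP."""
    ns, nt = len(Xs), len(Xt)
    K = rbf_kernel(Xs, Xs, gamma)                                # (ns, ns)
    kappa = rbf_kernel(Xs, Xt, gamma).sum(axis=1) * (ns / nt)    # (ns,)
    beta = np.linalg.solve(K + reg * np.eye(ns), kappa)
    return np.clip(beta, 0.0, None)

# Example: weight 50 source samples towards a shifted target distribution.
Xs = np.random.randn(50, 6)
Xt = np.random.randn(40, 6) + 0.5
print(kmm_weights(Xs, Xt)[:5])
```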
Assume that we have two users living in two residential settings equipped with different sensors, and that each user's data has been annotated with a different set of activities. XLearn aims to combine their sensor data and activity annotations so that all of these activities can be recognised for both users.
UDAR combines knowledge- and data-driven techniques to achieve coarse- and fine-grained feature alignment.
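One common, data-driven way to realise fine-grained feature alignment is to minimise a distribution discrepancy such as Maximum Mean Discrepancy (MMD) between source and target features; the sketch below is a generic illustration of that idea, not necessarily UDAR's exact formulation.

```python
import torch

def mmd(source_feats, target_feats, gamma=1.0):
    """RBF-kernel MMD between source and target feature batches; minimising it
    during training pulls the two feature distributions together."""
    def k(a, b):
        d2 = torch.cdist(a, b).pow(2)
        return torch.exp(-gamma * d2)
    return (k(source_feats, source_feats).mean()
            + k(target_feats, target_feats).mean()
            - 2 * k(source_feats, target_feats).mean())

# Example: discrepancy between two batches of 128-d features.
print(mmd(torch.randn(32, 128), torch.randn(32, 128) + 0.3))
```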
We study continual learning: continuously learning new tasks from new data while preserving the knowledge learned from previous tasks.
HAR-GAN is a continual learning technique for HAR. It requires neither prior knowledge of what the new activity classes might be nor storage of historical data: instead, it leverages a GAN to generate sensor data for the previously learned activities. We have evaluated HAR-GAN on four third-party, public datasets collected with binary sensors and accelerometers. Our extensive empirical results demonstrate the effectiveness of HAR-GAN in continual activity recognition and shed light on future challenges.
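The core idea of generative replay can be sketched as follows: a previously trained conditional generator synthesises sensor windows for the already-learned classes, and these are mixed with the new task's data when the classifier is updated. The `generator` and `classifier` callables and the parameter names here are placeholders; HAR-GAN's actual architecture and losses are described in the paper.

```python
import torch

def build_replay_batch(generator, old_class_ids, samples_per_class=32, latent_dim=64):
    """Synthesise sensor windows for already-learned activity classes with a
    previously trained conditional generator, so no raw historical data is stored."""
    zs, ys = [], []
    for c in old_class_ids:
        zs.append(torch.randn(samples_per_class, latent_dim))
        ys.append(torch.full((samples_per_class,), c, dtype=torch.long))
    z, y = torch.cat(zs), torch.cat(ys)
    with torch.no_grad():
        x_replay = generator(z, y)          # synthetic sensor data for old classes
    return x_replay, y

def continual_step(classifier, generator, new_x, new_y, old_class_ids, optimiser, loss_fn):
    """One update on the new task: mix replayed old-class data with the new
    task's data, then train the classifier on the combined batch."""
    x_rep, y_rep = build_replay_batch(generator, old_class_ids)
    x = torch.cat([new_x, x_rep])
    y = torch.cat([new_y, y_rep])
    optimiser.zero_grad()
    loss = loss_fn(classifier(x), y)
    loss.backward()
    optimiser.step()
    return loss.item()
```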
Emotion understanding represents a core aspect of human communication. Our social behaviours are closely linked to expressing our emotions and understanding others’ emotional and mental states through social signals. We are exploring different bio-inspired models for multisensory integration in emotion recognition; that is, different ways of integrating visual and audio signals for predicting human emotions.
We have proposed three multisensory integration models based on different pathways of multisensory integration in the brain: integration by convergence, early cross-modal enhancement, and integration through neural synchrony. The proposed models are designed and implemented using third-generation neural networks, namely Spiking Neural Networks (SNNs).
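As a toy illustration of the convergence pathway, the snippet below simulates a single leaky integrate-and-fire neuron that receives both audio and visual spike trains; when the combined evidence pushes its membrane potential over threshold, it fires. The parameters and simplified neuron dynamics are illustrative assumptions, not the implemented models.

```python
import numpy as np

def lif_convergence(audio_spikes, visual_spikes, w_a=0.6, w_v=0.6,
                    tau=20.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """Leaky integrate-and-fire neuron receiving converging audio and visual
    spike trains (arrays of 0/1 per time step); returns its output spike train."""
    v, out = 0.0, []
    for a, s in zip(audio_spikes, visual_spikes):
        v += dt * (-v / tau) + w_a * a + w_v * s   # leak plus weighted input spikes
        if v >= v_thresh:
            out.append(1)
            v = v_reset
        else:
            out.append(0)
    return np.array(out)

# Example: the neuron tends to fire when audio and visual spikes coincide.
audio = np.random.binomial(1, 0.2, size=200)
visual = np.random.binomial(1, 0.2, size=200)
print(lif_convergence(audio, visual).sum(), "output spikes")
```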
Synch-Graph is a novel bio-inspired approach based on neural synchrony in audio-visual multisensory integration in the brain. We model multisensory interaction using spiking neural networks (SNN) and explore the use of Graph Convolutional Networks (GCN) to represent and learn neural synchrony patterns.
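A minimal example of the graph-convolution step is shown below: nodes could correspond to spiking units, edge weights to a pairwise synchrony measure between their spike trains, and a Kipf-and-Welling-style GCN layer then produces node embeddings. This is a generic sketch, not Synch-Graph's exact architecture.

```python
import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    """One graph-convolution layer: symmetrically normalised adjacency times
    node features, followed by a linear map and a ReLU."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, A, X):
        A_hat = A + torch.eye(A.size(0))              # add self-loops
        d = A_hat.sum(1)
        D_inv_sqrt = torch.diag(d.pow(-0.5))
        return torch.relu(self.lin(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X))

# Toy synchrony graph: 8 nodes, symmetric edge weights, 16-d node features.
A = torch.rand(8, 8)
A = (A + A.t()) / 2
X = torch.rand(8, 16)
layer = SimpleGCNLayer(16, 32)
H = layer(A, X)                                       # (8, 32) node embeddings
print(H.shape)
```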
Audio sensing can contribute to daily activity recognition by detecting the use of appliances such as coffee machines or microwaves. It also helps to identify environmental context by detecting ambient sound. We are designing algorithms to classify multiple sound sources and to learn temporal patterns in acoustic signals such as animal calls.
We explore different approaches to multi-sound classification and propose a stacked classifier based on recent advances in deep learning. We evaluate the proposed approach in a comprehensive set of experiments on both sound-effect and real-world datasets. The results demonstrate that our approach can robustly identify each sound category among mixed acoustic signals, without any prior knowledge of the number and signature of sounds in the mixture.
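One way to classify several co-occurring sounds without knowing how many are present is to treat the task as multi-label classification, with an independent sigmoid output per sound category and a detection threshold. The toy network below illustrates that idea only; the stacked classifier proposed in our work is more elaborate.

```python
import torch
import torch.nn as nn

class MultiLabelSoundNet(nn.Module):
    """Toy multi-label classifier over log-mel spectrograms: one sigmoid per
    sound category, so several categories can be active in one mixed recording."""
    def __init__(self, n_classes, n_mels=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):                     # x: (batch, 1, n_mels, time)
        h = self.conv(x).flatten(1)
        return torch.sigmoid(self.head(h))    # independent per-class probabilities

model = MultiLabelSoundNet(n_classes=10)
probs = model(torch.randn(4, 1, 64, 128))     # (4, 10) per-category probabilities
present = probs > 0.5                         # categories detected in each clip
print(present)
```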
A convolutional recurrent neural network (CRNN) is proposed to learn temporal correlations between gibbon call syllables.
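A generic CRNN of the kind described can be sketched as follows: convolutional layers extract local spectro-temporal features from a spectrogram, and a recurrent layer then models correlations across time, for example between successive call syllables. The layer sizes below are illustrative assumptions, not the configuration used in our study.

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    """Convolutional front-end over a spectrogram followed by a GRU that models
    temporal correlations between successive frames (e.g. call syllables)."""
    def __init__(self, n_mels=64, n_classes=2, hidden=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
        )
        self.gru = nn.GRU(32 * (n_mels // 4), hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):                       # x: (batch, 1, n_mels, time)
        h = self.conv(x)                        # (batch, 32, n_mels // 4, time)
        h = h.permute(0, 3, 1, 2).flatten(2)    # (batch, time, 32 * n_mels // 4)
        out, _ = self.gru(h)
        return self.fc(out[:, -1])              # per-clip class scores

model = CRNN()
scores = model(torch.randn(2, 1, 64, 100))      # (2, 2) class scores
print(scores.shape)
```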