Research

Gary has broad interests in developing and applying machine learning, graphics, visualization, vision and optimization techniques to multi-dimensional data analysis. These data include, but are not limited to, collections of 2D images, 3D face image sequences, motion capture data, 3D shape segmentations, 3D models with rigid / non-rigid deformation, and 4D geometry sequences.

He and his research team received the best VAST paper award in 2016 and the ICCM distinguished paper award in 2019.

Research Topics

Attention Modelling, Salient Object Detection, Semantic Segmentation

Point Cloud Processing, Completion, Compression, Saliency

Geometry Processing, Shape Analysis, Correspondence, Registration, Part Matching

Visual Analytics and Data Intelligence

Facial Dynamics, Temporal Analysis

Information Retrieval, Indexing

Bioinformatics and Applications

Current Highlights

Keneni W. Tesema, Lyndon Hill, Mark Jones, Muneeb I. Ahmad, Gary K.L. Tam, "Point Cloud Completion: A Survey", IEEE Transactions on Visualization and Computer Graphics, 1-20, 2024. [Paper DOI] [IEEE Gold Open Access] [Post-Print]

Point cloud completion is the task of producing a complete 3D shape given an input of a partial point cloud. It has become a vital process in 3D computer graphics, vision and applications such as autonomous driving, robotics, and augmented reality. These applications often rely on the presence of a complete 3D representation of the environment. Over the past few years, many completion algorithms have been proposed and a substantial amount of research has been carried out. However, there are not many in-depth surveys that summarise the research progress in such a way that allows users to make an informed choice of what algorithms to employ given the type of data they have, the end result they want, the challenges they may face and the possible strategies they could use. In this study, we present a comprehensive survey and classification of papers on point cloud completion until August 2023 based on the strategies, techniques, inputs, outputs, and network architectures...

Avishek Siris, Jianbo Jiao, Gary K.L. Tam, Xianghua Xie and Rynson WH Lau, "Inferring Attention Shift for Salient Instance Ranking",  IJCV, 132:964–986, 2024. [Paper DOI] [Springer Open Access] [Post-Print] [Project Page (incl. all codes and data)]

The human visual system has limited capacity in simultaneously processing multiple visual inputs. Consequently, humans rely on shifting their attention from one location to another. When viewing an image of complex scenes, psychology studies and behavioural observations show that humans prioritise and sequentially shift attention among multiple visual stimuli. In this paper, we propose to predict the saliency rank of multiple objects by inferring human attention shift. We first construct a new large-scale salient object ranking dataset, with the saliency rank of objects defined by the order that an observer attends to these objects via attention shift. We then propose a new deep learning-based model to...

Zhaoyi Jiang, Guoliang Wang, Gary K.L. Tam, Chao Song, Bailin Yang, Frederick W.B. Li, "An End-to-end Dynamic Point Cloud Geometry Compression in Latent Space", Journal of Displays, 80:102528, 2023. [Paper DOI] [Post-Print]

Dynamic point clouds are widely used for 3D data representation in various applications such as immersive and mixed reality, robotics and autonomous driving. However, their irregularity and large scale make efficient compression and transmission a challenge. Existing methods require high bitrates to encode point clouds since temporal correlation is not well considered. This paper proposes an end-to-end dynamic point cloud compression network that operates in latent space, resulting in more accurate motion estimation and more effective motion compensation. Specifically, a multi-scale motion estimation network is introduced to obtain accurate motion vectors. Motion information computed at a coarser level is upsampled and warped to the finer level based on cost volume analysis for motion compensation. Additionally, a residual compression network is designed to mitigate the effects of noise and inaccurate predictions by encoding latent residuals, resulting in smaller conditional entropy and better results...

Zhaoyi Jiang, Luyun Ding, Gary K.L. Tam, Chao Song, Frederick W.B. Li, Bailin Yang, "C2SPoint: A classification-to-saliency network for point cloud saliency detection", Journal of Computers & Graphics, 115:274-284, 2023. [Paper DOI] [Post-Print]

Point cloud saliency detection is an important technique that supports downstream tasks in 3D graphics and vision, like 3D model simplification, compression, reconstruction and viewpoint selection. Existing approaches often rely on hand-crafted features and are only applicable to specific datasets. In this paper, we propose a novel weakly supervised classification network, called C2SPoint, which directly performs saliency detection on the point clouds. Unlike previous methods that require per-point saliency annotations, C2SPoint only requires category labels of the point clouds during training. The network consists of two branches: a Classification branch and a Saliency branch...

David George, Xianghua Xie, Yu-Kun Lai, Gary K.L. Tam, "A Deep Learning Driven Active Framework for Segmentation of Large 3D Shape Collections", Journal of Computer-Aided Design, 144:103179, 2022. [Project Page] [Paper DOI] [Code/Data DOI]

High-level shape understanding and technique evaluation on large repositories of 3D shapes often benefit from additional information known about the shapes. One example of such information is the semantic segmentation of a shape into functional or meaningful parts. Generating accurate segmentations with meaningful segment boundaries is, however, a costly process, typically requiring large amounts of user time to achieve high-quality results. In this paper we propose an active learning framework for large dataset segmentation, which iteratively provides the user with new predictions by training new models based on already segmented shapes. Our proposed pipeline consists of three components...

Avishek Siris, Jianbo Jiao, Gary K.L. Tam, Xianghua Xie and Rynson WH Lau, "Scene Context-Aware Salient Object Detection", ICCV 2021. [Project Page (incl. all codes and data)]

Salient object detection identifies objects in an image that grab visual attention. Although contextual features are considered in recent literature, they often fail in real-world complex scenarios. We observe that this is mainly due to two issues: First, most existing datasets consist of simple foregrounds and backgrounds that hardly represent real-life scenarios. Second, current methods only learn contextual features of salient objects, which are insufficient to model high-level semantics for saliency reasoning in complex scenes. To address these problems...

Avishek Siris, Jianbo Jiao, Gary K.L. Tam, Xianghua Xie and Rynson WH Lau, "Inferring Attention Shift Ranks of Objects for Image Saliency", CVPR 2020. [Project Page (incl. all codes and data)]

Psychology studies and behavioural observation show that humans shift their attention from one location to another when viewing an image of a complex scene. This is due to the limited capacity of the human visual system in simultaneously processing multiple visual inputs. The sequential shifting of attention on objects in a non-task oriented viewing can be seen as a form of saliency ranking. Although there are methods proposed for predicting saliency rank, they are not able to model this human attention shift well, ...

Taiwei Wang, David George, Yu-Kun Lai, Xianghua Xie and Gary K.L. Tam, "Consistent Segment-wise Matching with Multi-layer Graphs",  Journal of Computer Aided Geometric Design, 70:31-45, 2019.

[PDF(PostPrint)/DOI] [Codes/Data] [PPT]

Segment-wise matching is an important research problem that supports higher-level understanding of shapes in geometry processing. Many existing segment-wise matching techniques assume perfect input segmentation, and would suffer from imperfect or over-segmented input. To handle this shortcoming, we propose multi-layer graphs (MLGs) to represent possible arrangements of partially merged segments of input shapes. We then adapt the diffusion pruning technique on the MLGs to find consistent segment-wise matching...

Gary K.L. Tam, Vivek Kothari and M. Chen, "An Analysis of Machine- and Human-Analytics in Classification", IEEE Transactions on Visualization and Computer Graphics, 23(1):71-80, 2017.

[Best paper award in VAST section] [IEEE Computing Now: Snapshot of Current Trends in Visualization (Honourably mentioned in Feb 2017 issue)] [PDF(PostPrint) / URL / DOI] [Fast Forward Video (Music and Voice Over)]

In this work, we present a study that traces the technical and cognitive processes in two visual analytics applications to a common theoretic model of soft knowledge that may be added into a visual analytics process for constructing a decision-tree model. Both case studies involved the development of classification models based on the “bag of features” approach...

Gary K.L. Tam, Ralph Martin, P.L. Rosin and Yu-Kun Lai, "An Efficient Approach to Correspondences between Multiple Non-Rigid Parts",  Computer Graphics Forum, 33(5):137-146, 2014.

[ICCM distinguished paper award (若琳奖) / ICCM 2019]

[PDF(PostPrint)/DOI]  [Demo/Codes]

Identifying multiple deformable parts on meshes and establishing dense correspondences between them are tasks of fundamental importance to computer graphics, with applications to e.g. geometric edit propagation and texture transfer. Much research has considered establishing correspondences between non-rigid surfaces, but little work can both identify similar multiple deformable parts and handle partial shape correspondences. This paper addresses two related problems, treating them as a whole ...

Gary K.L. Tam, Ralph Martin, P.L. Rosin and Yu-Kun Lai, "Diffusion Pruning for Rapidly and Robustly Selecting Global Correspondences using Local Isometry",  ACM Transactions on Graphics, 33(1):4, 2014.  [PDF(PostPrint)/Cover/DOI] [Demo/Code] [Presented at SIGGRAPH 2014]

Finding correspondences between two surfaces is a fundamental operation in various applications in computer graphics and related fields. Candidate correspondences can be found by matching local signatures, but as they only consider local geometry, many are globally inconsistent. We provide a novel algorithm to prune a set of candidate correspondences to those most likely to be globally consistent. Our approach can handle ...

H. Fang, N.M. Parthaláin, A.J Aubrey, Gary KL Tam, R. Borgo, P.L. Rosin, P.W. Grant, D. Marshall, M. Chen, "Facial expression recognition in dynamic sequences: An integrated approach", Pattern Recognition, 47(3):1271–1281, 2014. [PDF(PostPrint)/DOI]

Automatic facial expression analysis aims to analyse human facial expressions and classify them into discrete categories. Methods based on existing work are reliant on extracting information from video sequences and employ either some form of subjective thresholding of dynamic information or attempt to identify the particular individual frames in which the expected behaviour occurs...


Gary K.L. Tam, Zhi-Quan Cheng, Yu-Kun Lai, Frank Langbein, Yonghuai Liu, D. Marshall, Ralph Martin, Xianfang Sun, P.L. Rosin, "Registration of 3D Point Clouds and Meshes: A Survey From Rigid to Non-Rigid", IEEE Transactions on Visualization and Computer Graphics, 19(7):1199-1217, 2013. [PDF(PostPrint)/DOI]

3D surface registration transforms multiple 3D datasets into the same coordinate system so as to align overlapping components of these sets. Recent surveys have covered different aspects of either rigid or non-rigid registration, but seldom discuss them as a whole. Our study serves two purposes:...