Zhenxing Niu

Professor, Xidian University, Email: zhenxingniu@gmail.com

Research Interests:

Computer Vision: Image Generation/Editing, Image Understanding, Low-level Vision, Image/Multimodal Retrieval

AI Security: Adversarial Attack/Defense, Backdoor Attack/Defense


Biography:

Dr. Zhenxing Niu received his Ph.D. degree from Xidian University in July 2012. He worked with Dr. Gang Hua at Microsoft Research Asia from 2011 to 2012, and visited Prof. Qi Tian (FIEEE) at the University of Texas at San Antonio from 2013 to 2014. From 2017 to 2021, he was with Alibaba DAMO Academy, where he did research on computer vision and machine learning. From 2019 to 2020, he worked abroad at the Israel Lab@DAMO Academy in Tel Aviv. From 2020 to 2021, he was a Tech Leader at the Vision Lab@DAMO Academy, in charge of algorithm development for visual content generation.

His current research interests include computer vision, deep learning, and adversarial attack/defense. He has published over 50 papers in venues including TPAMI, TIP, CVPR, ICCV, ECCV, AAAI, and IJCAI. He has served as a TPC member of CVPR 2015-2022 and ICCV 2017-2022, among others, and as an Area Chair (AC) of CVPR 2022.


Recent Research:

(1) AI Security:

[1] Jailbreaking Attack against Multimodal Large Language Model, arXiv 2024.

This work addresses the security of multimodal LLMs as well as LLMs.

Code: https://github.com/abc03570128/Jailbreaking-Attack-against-Multimodal-Large-Language-Model.git 

[2] Towards Unified Robustness Against Both Backdoor and Adversarial Attacks, TPAMI 2024.

A novel Progressive Unified Defense (PUD) algorithm is proposed to defend against backdoor and adversarial attacks simultaneously.

Code: https://github.com/John-niu-07/PUD

[3] Adversarial Attack and Defense in Deep Ranking, TPAMI 2024.

Anti-Collapse Triplet (ACT) is proposed to defend against attacks on deep ranking. In particular, for each sample triplet, the positive and negative samples are pulled close to each other via adversarial attack, while the model learns to separate them. This leads to a significant performance improvement over the EST defense.

[4] Progressive Backdoor Erasing via connecting Backdoor and Adversarial Attacks, CVPR 2023.

This is the first work to find the underlying connection between adversarial and backdoor attacks. 

Code: https://github.com/John-niu-07/BPE 

[5] Adversarial ranking attack and defense, ECCV 2020.

This is the first work on adversarial attacks against image retrieval. It allows an attacker to manipulate the rank of an image at will.

Code: https://github.com/John-niu-07/advrank

[6] Practical relative order attack in deep ranking, ICCV 2021.

This work covertly alters the relative order among a selected set of candidates according to an attacker-specified permutation, with limited interference to other unrelated candidates.

(2) AI-Generated Content (AIGC):

[1] Semantic-shape adaptive feature modulation for semantic image synthesis, CVPR 2022.

A fine-grained part-level semantic layout benefits the generation of object details, and it can be roughly inferred from an object’s shape, i.e., an object’s shape implies its part-level layout.

Code: https://github.com/John-niu-07/SAFM

[2] Structure first detail next: Image inpainting with pyramid generator, ICME 2023.

We suggest adopting a ‘structure first detail next’ workflow for image inpainting. We propose building a Pyramid Generator by stacking several sub-generators, where the lower-layer sub-generators focus on restoring image structures while the higher-layer sub-generators emphasize image details.


(3) Multimodal Retrieval/Image Captioning:

[1] Hierarchical Multimodal LSTM for Dense Visual-Semantic Embedding, ICCV 2017. Google Citations: 150+

We address the problem of dense visual-semantic embedding that maps not only full sentences and whole images but also phrases within sentences and salient regions within images into a multimodal embedding space.

[2] Ladder Loss for Coherent Visual-Semantic Embedding, AAAI 2020.

Previous methods normally treat the relevance between queries and candidates in a bipolar way – relevant or irrelevant, and all “irrelevant” candidates are uniformly pushed away from the query by an equal margin in the embedding space. This work introduces a continuous variable to model the relevance degree, where candidates with higher relevance degrees are mapped closer to the query than those with lower relevance degrees.

(4) Face Age Estimation:

[1] Ordinal regression with multiple output cnn for age estimation, CVPR 2016. Google Citations: 600+

This is the first work to address ordinal regression problems using deep learning methods. A new age dataset is released to the community for age estimation.

Datasets: https://github.com/John-niu-07/tarball  

https://github.com/John-niu-07/tarball-lite

Code: https://github.com/fqhank/Ordinal-Regression-for-Age-Estimation

https://github.com/xjtulyc/Ordinal_Regression_with_Multiple_Output_CNN_for_Age_Estimation