Interests: Computer Vision, Natural Language Processing, Automatic Speech Recognition
Google Experience: Google Lens on-device and server-side computer vision, end-to-end features, and ranking models. Cloud AI multimodal LLM quality.
Currently working on Gemini multimodal multilingual evaluation and post-training.
Prior to joining Google, I was a research assistant in the VLSI Research Group, working on using computer vision to solve neuroscience problems with Professor Mark Horowitz. I worked closely with the Luo Lab and was co-advised by Professor Liqun Luo.
Ignite - Certificate Program in Innovation and Entrepreneurship, Stanford Graduate School of Business
Ph.D. - Electrical Engineering, Stanford University
M.S. - Electrical Engineering, Stanford University
B.S. - Electrical Engineering, Mathematics Minor, Summa Cum Laude, High Distinction, University of Minnesota, Twin Cities
Member: Association for Computing Machinery, Society for Neuroscience
Reviewer: CVPR Computer Vision for AR/VR Workshop, MICCAI, ICIAP, Artificial Intelligence Review
Program committee: AAAI, Med-NeurIPS
In charge of multimodal multilingual quality.
[manuscript in preparation]
Led a team of four female engineers to win the AGI House Gemini 1.0 Hackathon, where Sergey Brin gave a talk.
Ran a translation channel that added subtitles in multiple languages to K-pop idols' live streams and shows. Created a website for fans to view selected YouTube videos with the subtitles. The service is currently on hiatus.
Worked on general trust and safety evaluation and filtering for the Gemini 1.0 launch on Vertex. Assembled a dataset for multimodal/multilingual self-identification evaluation for the Gemini 1.5 launch.
Form recognizer for documents with fixed layouts [Patent filed, awaiting USPTO review]
Initiated, designed, and implemented a feature that uses historical user click data combined with image similarity to rank Google Lens image search results more effectively, and generated preliminary results. Ran as a live experiment in Q1 2022.
Tech lead of the "Top match" (now "high confidence clusters") feature of Google Lens. The feature differentiates and highlights high confidence image results from all retrieved similar images results for user image queries. Directly and primarily responsible for developing the new end-to-end feature. Created and drove the feature road map. Designed and implemented the algorithm that determines high confidence image answer for image queries. Collaborated with 4 different Google teams and across 2 time zones to deliver the feature from scratch in < 3 months. Resulted in 3pp E2E quality improvement and powers about 350M queries per month.
Tech lead of the image similarity ranking model at Google Lens. Responsible for bringing together the visual intelligence of multiple server-side vision models to rank image results from all Lens verticals effectively. Aligned requirements from 4 Lens teams, unified the image similarity definition across all Lens verticals, and collected ground-truth data for model training. Led engineers from 3 Google teams on training and deploying a unified image similarity scoring model in the Lens backend with neutral latency, better result quality, and a new SW+AI architecture. This model scores and ranks retrieved images for all Lens traffic.
Winner of a Silver Perfy Award at Google for capacity management.
Invented and implemented an end-to-end computer vision cascade that enables trust-and-safety-compliant results for queries containing people or faces, by showing results only from safe, non-people-sensitive regions of the query image.
Point of contact for on-device visual intelligence for Lens on Photos. Responsible for bringing the visual intelligence of server-side vision models to mobile despite stringent compute and power constraints. Built and improved the core on-device computer vision cascade behind a feature's suggested actions: privacy-preserving, compact, and built entirely from distilled on-device models. Collaborated across 3 different organizations to bring the feature from scratch to live experiment in < 4 months. Conceived and implemented a new SW+AI architecture to enable end-to-end user privacy without sacrificing quality and scale.
Histological brain slices are widely used in neuroscience to study the anatomical organization of neural circuits. Systematic and accurate comparisons of anatomical data from multiple brains, especially from different studies, can benefit tremendously from registering histological slices onto a common reference atlas. Most existing methods rely on an initial reconstruction of the volume before registering it to a reference atlas. Because these slices are prone to distortions during the sectioning process and often sectioned with non-standard angles, reconstruction is challenging and often inaccurate. Here we describe a framework that maps each slice to its corresponding plane in the Allen Mouse Brain Atlas (2015) to build a plane-wise mapping and then perform 2D nonrigid registration to build a pixel-wise mapping. We use the L2 norm of the histogram of oriented gradients of two patches as the similarity metric for both steps, and a Markov random field formulation that incorporates tissue coherency to compute the nonrigid registration. To fix significantly distorted regions that are misshaped or much smaller than the control grids, we train a context-aggregation network to segment and warp them to their corresponding regions with thin plate spline. We have shown that our method generates results comparable to an expert neuroscientist and is significantly better than reconstruction-first approaches.
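For illustration, a minimal sketch of the patch-similarity metric used in both registration steps above: the L2 norm of the difference between the histograms of oriented gradients (HOG) of two patches. It uses scikit-image's hog; the function name and HOG parameters are illustrative choices, not the settings from the paper.

```python
# L2 distance between HOG descriptors of two grayscale patches,
# the similarity metric described in the abstract above.
import numpy as np
from skimage.feature import hog

def hog_distance(patch_a: np.ndarray, patch_b: np.ndarray) -> float:
    """Lower is more similar; patches must have identical shapes."""
    assert patch_a.shape == patch_b.shape
    # HOG parameters here are common defaults, not the paper's settings.
    params = dict(orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    return float(np.linalg.norm(hog(patch_a, **params) - hog(patch_b, **params)))
```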
Research Projects
The dorsal raphe (DR) constitutes a major serotonergic input to the forebrain and modulates diverse functions and brain states, including mood, anxiety, and sensory and motor functions. Most functional studies to date have treated DR serotonin neurons as a single population. Using viral-genetic methods, we found that subcortical- and cortical-projecting serotonin neurons have distinct cell-body distributions within the DR and differentially co-express a vesicular glutamate transporter. Further, amygdala- and frontal-cortex-projecting DR serotonin neurons have largely complementary whole-brain collateralization patterns, receive biased inputs from presynaptic partners, and exhibit opposite responses to aversive stimuli. Gain- and loss-of-function experiments suggest that amygdala-projecting DR serotonin neurons promote anxiety-like behavior, whereas frontal-cortex-projecting neurons promote active coping in the face of challenge. These results provide compelling evidence that the DR serotonin system contains parallel sub-systems that differ in input and output connectivity, physiological response properties, and behavioral functions.
[paper]
We adopted the fully convolutional network architecture for this segmentation problem. We trained a model to segment an experimental histological image into the main brain regions (grey matter: cerebrum, brainstem, and cerebellum; fiber tracts; and ventricular systems) plus background, achieving 96.1% accuracy on test reference slices and 92.1% accuracy on test experimental datasets. The network is trained mainly on reference images because of the limited availability of segmented experimental data.
[project report]
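As a sketch of the architecture family (not the trained model itself), here is a toy fully convolutional network in PyTorch that maps a one-channel histological image to per-pixel logits over the four classes above (grey matter, fiber tracts, ventricular systems, background); the layer sizes are illustrative.

```python
# Toy fully convolutional segmentation network: every layer is convolutional,
# so per-pixel prediction works at any input size divisible by 4.
import torch
import torch.nn as nn

class TinyFCN(nn.Module):
    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 1/2 resolution
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 1/4 resolution
        )
        self.classifier = nn.Conv2d(64, num_classes, 1)  # 1x1 conv -> class logits

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.classifier(self.encoder(x))  # (N, num_classes, H/4, W/4)
        # Upsample back to input resolution for per-pixel prediction.
        return nn.functional.interpolate(
            logits, size=x.shape[-2:], mode="bilinear", align_corners=False
        )
```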
The Android app we developed takes video frames of a human face from the camera as input and outputs a fusion image of extracted facial features and contours together with a motion distribution map. The motion distribution map is generated from a micro-expression heat map with color added, where the brightness of the color is scaled by the magnitude of motion in each area of the face. The client, an Android device, obtains the initial locations of the eyes and mouth. Covariance-based image registration is used on the server side to generate the motion distribution of facial features. The fusion image generated from this information is then sent back to the client for display. From this fusion image users can read micro changes in facial features and thus interpret human emotions. Since more than just facial key points are extracted, we expect full utilization of our data to yield precise interpretation, provided a robust scoring system for the motions of different facial features and contours.
[report]
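A minimal sketch of the motion-to-brightness rendering described above, where an overlay color's brightness scales with the motion magnitude in each region of the face; the function and its defaults are ours, not from the app.

```python
# Render per-pixel motion magnitudes as a color map whose brightness
# scales with the amount of motion (brighter = more motion).
import numpy as np

def motion_heatmap(motion: np.ndarray, base_color=(255, 0, 0)) -> np.ndarray:
    """motion: (H, W) nonnegative magnitudes -> (H, W, 3) uint8 overlay."""
    scale = motion / (motion.max() + 1e-9)  # normalize to [0, 1]
    return (scale[..., None] * np.asarray(base_color, dtype=float)).astype(np.uint8)
```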
We implemented a music recommender system based on users' listening history and social network. We used collaborative filtering with both user-based and item-based strategies. For user-based collaborative filtering, we measured user similarity with both the binary information and the actual play counts in their listening history. Our methods significantly improved recommendation accuracy. Furthermore, we modified the user-based collaborative filtering algorithm into a method that combines users' listening history and social relationships for music recommendation.
[report]
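A toy version of the user-based strategy on a binary listening-history matrix, using cosine similarity between users; names and parameters are illustrative, not from the report.

```python
# User-based collaborative filtering: find the k users most similar to `user`
# (cosine similarity over binary play vectors), then recommend the songs they
# played, weighted by similarity, excluding songs `user` has already heard.
import numpy as np

def recommend(plays: np.ndarray, user: int, k: int = 5, top_n: int = 3) -> np.ndarray:
    """plays: (num_users, num_songs) binary listening-history matrix."""
    norms = np.linalg.norm(plays, axis=1) + 1e-9
    sims = plays @ plays[user] / (norms * norms[user])  # cosine similarity
    sims[user] = -1.0                                   # exclude the user themself
    neighbors = np.argsort(sims)[-k:]                   # k nearest users
    scores = sims[neighbors] @ plays[neighbors]         # similarity-weighted votes
    scores[plays[user] > 0] = -1.0                      # drop already-heard songs
    return np.argsort(scores)[::-1][:top_n]             # song indices to recommend
```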
Mapping histological brain images to the Allen Mouse Brain Atlas (PDF)
Stanford University, PhD thesis
Stanford Imaging Symposium 9/17/2018.
Stanford Center for Image Systems Engineering (SCIEN) Industry Affiliates Meeting 2018.
Biomedical Computation at Stanford Symposium 4/4/2016.
Stanford Bio-X IIP Symposium 2/17/2016.
Center for Biomedical Imaging at Stanford Symposium 4/29/2015.
Invited Talks:
Neuroscience Conference 2018
CSIT 2021
AnalytiX2021
Friends of Music Applied Music Scholarship, Stanford University
Stanford Graduate Fellowship, Stanford University
Albert George Oswald Prize, University of Minnesota
KSP & Kumar Scholarship, University of Minnesota
I love scuba diving. I was involved with the Guzheng Community at Stanford when I wasn't that busy with research, and have performed in the Stanford Chinese New Year Spring Gala at the Memorial Auditorium and Stanford Guzheng Ensemble Concert.
I enjoy watching basketball games and am certified as a secondary basketball referee in China.