I am currently the CTO of SF Express business group, an express delivery company based in Shenzhen, China.
Previously I was a software engineer at Google, working on web crawling and indexing for Google Search and later on the social infrastructure that powers YouTube/Blogger comments, G+, Hangouts, Photos, etc.
I received my Ph.D. from UC Davis in June 2010. My advisor was Dr. Felix Wu. I also got a lot of advice from Dr. Norm Matloff, Hao Chen, Raju Pandey, and Michael Gertz.
Fun fact: my Erdős number, if you know what it means, is 4, through the path Paul Erdős - Ron Graham - Tom Leighton - Fred Chong - me.
Education
Ph.D., Computer Science, University of California, Davis. Advisor: Dr. Felix Wu. (Received in Jun. 2010)
M.S., Computer Science, University of California, Davis. (Received in Sep. 2008)
M.S., Electronic Engineering, Tsinghua University, Beijing, China. Advisor: Dr. Xing Li. (Received in Jan. 2005)
B.S., Electronic Engineering, Tsinghua University, Beijing, China. (Received in Jul. 2002)
Research
My research focuses on Web related issues, particularly search engines and online social networks.
During my Ph.D. study at UC Davis, most of our research projects were under the large umbrella of Davis Social Links. I looked at the following issues:
Online social network crawling and estimation
Message propagation and social influence measurements
Online privacy and trust management
My past research projects at Tsinghua University are listed below. Most of them were done as part of Compass Search.
Link Analysis:
Accelerated PageRank Algorithm
Distributed PageRank Computation
Web Crawler:
High performance crawling: I developed a distributed web crawler as a senior.
Crawling policy: a) server politeness. b) web page quality. c) crawling coverage.
Webpage Classification: Our Compass team took first place in the Chinese Web Page Categorization Competition, April 2003.
IPv6 Web Evolution and Performance Analysis: I monitored the evolution of more than 1,000 IPv6 Web sites from 2001 to 2005. Here is our IPv6 search engine.
During my graduate study at Tsinghua University, I interned at Microsoft Research Asia several times and worked on large-scale duplicate document detection.
Publications
Journal Papers:
Shaozhi Ye and Felix Wu. Measuring message propagation and social influence on Twitter.com. International Journal of Communication Networks and Distributed Systems, Vol.11, No.1, pp 59-76, 2013.
Shaozhi Ye and Felix Wu. Estimating the size of online social networks. International Journal of Social Computing and Cyber-Physical Systems, Vol.1, No.2, pp 160-179, 2011.
Jedidiah R. Crandall, John Brevik, Shaozhi Ye, Gary Wassermann, Daniela A.S. de Oliveira, Zhendong Su, S. Felix Wu, and Frederic T. Chong. Putting Trojans on the Horns of a Dilemma: Redundancy for Information Theft Detection. Transactions on Computational Science, Vol.5430, pp 244-262, 2009.
Shaozhi Ye, Ji-Rong Wen, and Wei-Ying Ma. A Systematic Study on Parameter Correlations in Large Scale Duplicate Document Detection. Knowledge and Information Systems, Vol.14, No.2, pp 217-232, Feb. 2008.
Ming Jia, Jiangtao Wen, Shaozhi Ye, and Xing Li. Error Restricted Fast MAP Decoding of VLC. IEEE Communications Letters, Vol.9, No.10, pp 909-911, Oct. 2005.
Shaozhi Ye, Hui Liu, Yue Li, Hui Huang, and Xing Li. Development of IPv6 Networks Viewed from the Angle of Search Engine. Zhongxing Telecom Technology, Vol.40, pp 1-3, 2002. (in Chinese)
Hui Liu, Shaozhi Ye, Hui Huang, and Xing Li. IPv6 Networks Analysis based on Search Engine. Telecommunications Science, No.3, pp 43-45, 2002. (in Chinese)
Conference and Workshop Papers:
The acceptance rates of some conferences and workshops are given (papers accepted/papers submitted).
Ruaylong Lee, Roozbeh Nia, Jason Hsu, Karl N. Levitt, Jeff Rowe, S. Felix Wu, and Shaozhi Ye. Design and Implementation of FAITH, an Experimental System to Intercept and Manipulate Online Social Informatics. In Proceedings of the 2011 International Conference on Advances in Social Network Analysis and Mining (ASONAM'11), pp 195-202, 2011.
Haifeng Zhao, Shaozhi Ye, Prantik Bhattacharyya, Jeff Rowe, Ken Gribble, and Felix Wu. SocialWiki: Bring Order to Wiki Systems with Social Context. In Proceedings of the 2nd International Conference on Social Informatics (SocInfo'10), pp 232-248, 2010.
Shaozhi Ye and Felix Wu. Measuring Message Propagation and Social Influence on Twitter.com. In Proceedings of the 2nd International Conference on Social Informatics (SocInfo'10), pp 216-231, 2010. [slides]
Shaozhi Ye and Felix Wu. Estimating the size of online social networks. In Proceedings of the 2nd IEEE International Conference on Social Computing (SocialCom'10), pp 169-176, 2010. (13%) [slides]
Shaozhi Ye, Juan Lang, and Felix Wu. Crawling Online Social Graphs. In Proceedings of the 12th International Asia-Pacific Web Conference (APWeb'10), pp 236-242, 2010. (33%=41/124) [slides]
Thomas Tran, Kelcey Chan, Shaozhi Ye, Prantik Bhattacharyya, Ankush Garg, Xiaoming Lu, and S. Felix Wu. Design and Implementation of Davis Social Links OSN Kernel. In Proceedings of the Workshop on Social Networks, Applications, and Systems (SNAS'09), pp 527-540, August 2009.
Shaozhi Ye, Felix Wu, Raju Pandey, and Hao Chen. Noise Injection for Search Privacy Protection. In Proceedings of 2009 IEEE International Conference on Privacy, Security, Risk and Trust (Passat'09), pp 1-8, 2009. (15%=16/115) [slides]
Daniela Oliveira, Jedidiah Crandall, Gary Wassermann, Shaozhi Ye, Felix Wu, Zhendong Su, and Frederic Chong. Bezoar: Automated Virtual Machine-based Full-System Recovery from Control-Flow Hijacking Attacks. In Proceedings of 2008 IEEE/IFIP Network Operations and Management Symposium (NOMS'08), pp 121-128, 2008. (30%=64/233)
Ming Jia, Shaozhi Ye, Xing Li, and Julie Dickerson. Web Site Recommendation Using HTTP Traffic. In Proceedings of the 7th IEEE International Conference on Data Mining (ICDM'07), pp 535-540, 2007. (20%=101/526)
Lerone Banks, Shaozhi Ye, Yue Huang, and S. Felix Wu. Davis Social Links: Integrating Social Networks with Internet Routing. In Proceedings of ACM SIGCOMM Workshop on Large-Scale Attack Defense (LSAD'07), pp 121-128, 2007.
Shaozhi Ye, Ji-Rong Wen, and Wei-Ying Ma. A Systematic Study of Parameter Correlations in Large Scale Duplicate Document Detection. In Proceedings of the 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'06), Lecture Notes in Artificial Intelligence (LNAI), vol: 3918, pp 275-284, 2006. (15%=67/501) [slides] (Best Student Paper Nomination)
Liang Chen, Shaozhi Ye, and Xing Li. Template Detection for Large Scale Search Engines. In Proceedings of the 21st Annual ACM Symposium on Applied Computing (SAC'06), pp 1094-1098, April 2006. (30%=16/55)
Yangbo Zhu, Shaozhi Ye, and Xing Li. Distributed PageRank Computation Based on Aggregation-Disaggregation Methods. In Proceedings of the 14th ACM International Conference on Information and Knowledge Management (CIKM'05), pp 578-585, 2005. (20%=76/425)
Yi Wang, Shaozhi Ye, and Xing Li. Understanding Current IPv6 Performance: A Measurement Study. In Proceedings of the 10th IEEE Symposium on Computers and Communications (ISCC'05), pp 71-76, 2005. (40%=147/400)
Jingfang Xu, Shaozhi Ye, and Xing Li. Query based Chinese Phrase Extraction for Site Search. In Proceedings of the fifth international conference on Web Information Systems Engineering (WISE'04), Lecture Notes in Computer Science (LNCS) vol:3306, pp 125-134, 2004. (25%)
Shaozhi Ye, Guohan Lu, and Xing Li. Workload-Aware Web Crawling and Server Workload Detection. In Proceedings of the second Asia-Pacific Advanced Network Research Workshop, pp 263-269, Jul 2004.
Shaozhi Ye, Ruihua Song, Ji-Rong Wen, and Wei-Ying Ma. A Query-Dependent Duplicate Detection Approach for Large Scale Search Engines. In Proceedings of the sixth Asia Pacific Web Conference (APWeb'04), Lecture Notes in Computer Science (LNCS), vol:3007, pp 48-58, 2004.
Ji-Rong Wen, Ruihua Song, Deng Cai, Kaihua Zhu, Shipeng Yu, Shaozhi Ye, and Wei-Ying Ma. Microsoft Research Asia at the Web Track of TREC 2003. In Proceedings of the 12th Text Retrieval Conference (TREC 2003), pp 408-417, Nov, 2003.
Yue Li, Hui Liu, Gang Zhu, Shaozhi Ye, and Xing Li. Analysis of IPv6 over Search Engine. In Proceedings of the fifth Joint AEARU Workshop on Web Technology and Computer Science. Oct 2003.
Hui Liu, Ran Peng, Shaozhi Ye, and Xing Li. An Efficient Centroid Based Chinese Web Page Classifier. In Proceedings of the first Asia-Pacific Advanced Network Research Workshop, pp 9-14, 2003.
Technical Reports:
Shaozhi Ye. Online Social Network Measurements and Search Privacy Protection. CSE-2010-15, University of California, Davis. (Ph.D. Dissertation)
Shaozhi Ye, Felix Wu, Raju Pandey, and Hao Chen. Noise Injection for Search Privacy Protection. CSE-2008-10, University of California, Davis.
David Waetjen, Joshua Viers, Allan Hollander, Shaozhi Ye, and James Quinn. Data Management Strategies Report. In Cosumnes Research Group: Final Report, Chapter 6: Data Management, Jun. 2006.
Note: The materials are presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders.
Awards and Honors
8th place in the Topic Distillation task and 5th place in the Named/homepage Finding task in the Web Track, TREC 2003. With Microsoft Research Asia.
First place in the Chinese Web Page Categorization Competition, held in conjunction with the first Chinese Symposium on Search Engine and Web Mining, organized and sponsored by China Computer Federation, Beijing, P.R. China, April 2003. With Compass group.
Teaching Assistant
ECS 110 Data Structures and Programming, Fall 2005, Winter 2006, Spring 2006.
Internship
Jun. 2009 - Sep. 2009: Filesystem power loss tests, Google.
Jun. 2008 - Sep. 2008: Filesystem benchmark tools, Google.
Jun. 2006 - Sep. 2006: Distributed filesystem benchmark, IBM Almaden Research Center.
Mar. 2005 - Jun. 2005: Academic publication retrieval, Web Search and Mining Group, Microsoft Research Asia.
Jul. 2003 - Nov. 2003: Large scale duplicate document detection, Web Search and Mining Group, Microsoft Research Asia.
Jul. 2001 - Aug. 2001: Autopage SDK (a J2EE Web development toolkit), Kaipu Internet Information Co. Ltd., China.
Reviewing Activities
The 2009 IEEE International Conference on Social Computing (SocialCom'09)
The 27th Conference on Computer Communications (INFOCOM'08)
The 32nd International Conference on Very Large Data Bases (VLDB'06)
The 2006 IEEE International Conference on Communications (ICC'06)
The third Chinese Symposium on Search Engine and Web Mining (SEWM'05)
The 2004 IEEE/WIC/ACM International Conference on Web Intelligence (WI'04)
The Joint Conference of 10th Asia-Pacific Conference on Communications and fifth International Symposium on Multi-Dimensional Mobile Communications (APCC/MDMC'04)
"The greatest challenge to any thinker is stating the problem in a way that will allow a solution." -- Bertrand Russell