


Email: longchen AT gmail DOT com

  • 10+ years of experience in software development through all lifecycle phases.
  • 5+ years of experience in multi-threaded/parallel programming on multi-core processors and GPUs.
  • Broad knowledge of computer architectures, algorithms, operating systems, multimedia systems, and computer networks.
  • Recognized problem-solving skills and analytical thinking.
  • Quick learner; reliable; excellent verbal and written communication skills; team player.

Parallel computing, Programming multi-core/many-core architectures, GPGPU, Parallel algorithms.


Staff Engineer                      Qualcomm Incorporated, San Diego, CA                            10/2010 – 12/2015        
  • Worked on next-generation mobile graphics processor architectures.
Research Assistant                  University of Delaware, Newark, DE                                 09/2005 – 10/2010        
  • Conducted research on parallel algorithm design, performance analysis, and program optimization for multi-core architectures, including NVIDIA GPUs and the IBM Cyclops-64 160-core architecture.
  • Authored ten publications on parallelizing and optimizing scientific applications/kernels.
Ph.D. Intern                            Pacific Northwest National Laboratory, Richland, WA         06/2009 – 12/2009
  • Proposed and developed novel dynamic load balancing techniques for GPU-enabled systems.
  • Improved the performance of a GPU molecular dynamics application by 20%.                                                
Software Engineer (intern)      ET International Inc, Newark, DE                                     06/2008 – 08/2008  & 06/2007 – 08/2007
  • Developed system software (simulator, kernel libraries, etc.) for the Cyclops-64 architecture.
  • Designed and performed tests for the Cyclops-64 system software.
  • The Cyclops-64 system software is being used by IBM and government agencies.
System Developer                   e-Cop Pte Ltd, Singapore                                                   01/2004 – 01/2005       
  • Designed and implemented functional modules for an enterprise security management system, which provides a unified platform for monitoring and managing security equipment.
  • This system provides services to numerous governments and enterprises in multiple Asian countries.
Research Scholar                     National University of Singapore, Singapore                        04/2001 – 04/2003      
  • Proposed and designed efficient media retrieval strategies for distributed systems.
  • Reduced the client's waiting time and buffer requirement of a Java-based VoD system by up to 90%.
Software Engineer (intern)      Datang Telecom Corp, Xi'an, P. R. China                            12/1999 – 12/2000     
  • Designed and implemented the gateway control module for an ITU-T H.323 VoIP system.
  • Developed signalling testing tools that were used by the entire team.
  • The VoIP system has been put into commercial operation.
Programming Languages    :            C/C++, OpenGL, Java, Assembly, Pascal, Python, Perl
Parallel Programming         :            Pthreads, MPI, OpenMP, CUDA
Platforms                          :            Linux, Windows
Development Tools           :            GCC, MS Visual Studio, Borland Delphi, Borland JBuilder

Ph.D. in Computer Engineering, 2010                                  University of Delaware, Newark, DE, U.S.A.
Dissertation: Exploring Novel Many-Core Architectures for Scientific Computing
Advisor: Dr. Guang R. Gao (Fellow, ACM & IEEE)

Master of Engineering in Computer Engineering, 2004        National University of Singapore, Singapore
Bachelor of Engineering, 1998                                              Northwestern Polytechnical University, Xi’an, P. R. China

  1. Long Chen, Oreste Villa, Guang R. Gao, “Exploring Fine-Grained Task-based Execution on Multi-GPU Systems,” in Proc. of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), Austin, Texas, September 26 - 30, 2011.
  2. Oreste Villa, Long Chen, Sriram Krishnamoorthy, “High Performance Molecular Dynamics Simulation on Single and Multi-GPU Systems,” in Proc. of the IEEE International Symposium on Circuits and Systems (ISCAS 2010), Paris, France, May 30 - June 2, 2010.
  3. Long Chen, Oreste Villa, Sriram Krishnamoorthy, Guang R. Gao, “Dynamic Load Balancing on Single- and Multi-GPU Systems,” in Proc. of the IEEE International Parallel & Distributed Processing Symposium (IPDPS 2010), Atlanta, Georgia, April 19 - 23, 2010. PDF
  4. Long Chen, Guang R. Gao, “Performance Analysis of Cooley-Tukey FFT Algorithms for a Many-core Architecture,” in Proc. of the High Performance Computing Symposium (HPC 2010), Orlando, Florida, April 12 - 15, 2010. PDF
  5. Long Chen, “Programming Many-core Architectures: A Case Study of Optimizing the Fast Fourier Transform on Cyclops-64,” VDM Verlag, 2008.
  6. Liping Xue, Long Chen, Ziang Hu, Guang R. Gao, “Performance Tuning of the Fast Fourier Transform on a Multi-core Architecture,” in Proc. of the 1st Workshop on Programmability Issues for Multi-Core Computers (MULTIPROG 2008), Goteborg, Sweden, January 27, 2008. PDF
  7. Liping Xue, Long Chen, Ziang Hu, Guang R. Gao, “Performance Tuning of Fast Fourier Transform on a Multi-core Architecture,” the 20th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2007), Urbana, Illinois, October 11 - 13, 2007 (poster).
  8. Long Chen, Ziang Hu, Junmin Lin, Guang R. Gao, “Optimizing the Fast Fourier Transform on a Multi-core Architecture,” in Proc. of the IEEE International Parallel & Distributed Processing Symposium (IPDPS 2007), Long Beach, California, March 26 - 30, 2007. PDF
  9. Haiping Wu, Long Chen, Joseph Manzano, Guang R. Gao, “A User-Friendly Methodology for Automatic Exploration of Compiler Options,” in Proc. of the 2006 International Conference on Programming Languages and Compilers, Las Vegas, USA, June 26 - 29, 2006. PDF
  10. Haiping Wu, Eunjung Park, Long Chen, Juan del Cuvillo, Guang R. Gao, “User-Friendly Methodology for Automatic Exploration of Compiler Options: A Case Study on the Intel XScale Microarchitecture,” in Proc. of the 2006 International Conference on Programming Languages and Compilers, Las Vegas, USA, June 26 - 29, 2006. PDF
  11. Long Chen and Bharadwaj Veeravalli, “Multiple Servers Movie Retrieval Strategies for Distributed Multimedia Applications: A Play-while-retrieve Approach,” IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, Volume 36, Issue 4, Pages: 786-803, July 2006. PDF
  12. Bharadwaj Veeravalli, Long Chen, Hun Yen Kwoon, Goh Kar Whee, See Ying Lai, Lim Peng Hian, Ho Chin Chow, “Design, Analysis, and Implementation of an Agent Driven Pull-Based Distributed Video-on-Demand System,” Multimedia Tools and Applications, Volume 28, Number 1, Pages: 89-118, February 2006. PDF
IPDPS Student Travel Assistance Grant, the IEEE Computer Society Technical Committee on Parallel Processing, 2010.
Research Scholarship, University of Delaware, 2005 - 2010.
Research Scholarship, National University of Singapore, 2001 - 2003.

        Member, ACM        
        Member, IEEE, IEEE Computer Society.

  • ACM International Conference on Computing Frontiers (CF'12)
  • International Conference on Scientific Computing (CSC'11)
  • International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'11)
  • International Conference on Computer Graphics and Virtual Reality (CGVR'11)

  • Science of Computer Programming
  • Computers and Electrical Engineering
  • International Journal of High Performance Systems Architecture
  • Journal of Supercomputing
  • Multimedia Tools and Applications
  • International Journal of Computers and Applications