Blog

Hackathon: May 2018

by Nick Giangreco

Hello Everyone! We had a small yet successful hackathon here at CUMC!

This time around, after advertising via email and putting up posters all over campus, we had a dental student and a CS undergraduate student join us for this month's hackathon.


While David, the dental student, was studying away for his renal class exam, Sarah, the CS undergraduate student, outlined and strategized how to create a programmable robot!

Sarah, like many people, has a million projects that she wants to complete but struggles to choose which ones to pursue, especially with the job market in mind. Today, we worked together to figure out what she is most interested in and devised a plan to develop a robot that is controlled through an iOS application on a phone. Here is Sarah with a possible robot design:

Also, Nick worked on developing an EDA notebook for his Kaggle team.


That's it, folks! It was a rainy day today, so that might have deterred people. There was also the royal wedding, which is apparently of great interest to people. I'm not very royal, so I'm OK with people watching the wedding, haha. Anyway, see you next time!

Cheers!

Hackathon: April 2018

by Nick Giangreco


Hello everyone! We had another successful hackathon here at CUMC!

We had scientists working on different projects such as:

  • Using R packages and methods, such as Seurat and powerTCR, with their RNA-Seq data,
  • Working through tutorials, such as Word2Vec embedding, with their clinical data,
  • Developing an online platform for facilitating study groups and identifying those that fit the students' interests,
  • Creating a Web framework for visualizing and interpreting drug safety data, and
  • Establishing more logistics for this club to make it better serve the CUMC community.

We decided to change the group's name to CUMC Data Science Club instead of CUMC Data Science Group. We also drafted a semi-Mission Statement:

The purpose of the CUMC Data Science Club is to provide a supportive and conducive environment to learn and do biomedical data science in the CUMC community.

We are also committed to the following:

The club serves the graduate student community during their PhD career, but the club is open to any CUMC scientist at any point in time.

We brainstormed other events we would like to have in addition to the monthly Hackathons, which have been enormously useful and productive for students. We will have Data Science Lunch Hours, which will be guided workshops, lecture-style presentations, or data science show-and-tells, with lunch provided. These will be driven by student interest: we'll email a survey before each term to determine which topics interest students. From the survey results, we will put out recommended topics that other students may like to present on and invite scientists to present on a topic of interest.

Also, we have BIG NEWS! We are now "officially" recognized as a student club by the office of graduate affairs. Hence, we are trying to ramp up our club mission, student interest, and upcoming events.

We will reach out to gradtalk, the CUMC graduate student listserv, once classes are almost over.

Also, we want to emphasize that this club is open to master's students, postdoctoral fellows, staff scientists, as well as any other student on the medical campus.

We are looking forward to the next hackathon and more events in the future!


P.S. Check out the photos below of our group post-Hackathon and taco bar-ing!

Hackathon: March 2018

by Nick Giangreco


Hello Everyone! This is our first post using this Google Site! We decided to transition here from my static webpage because we wanted a site that the CUMC community can maintain for itself. This site allows for uploading the useful and reliable resources most used by the community. Because the site is editable in real time, the most up-to-date resources can be added and viewed to best serve the community, rather than everyone searching the Web aimlessly.


Anyway, the reason for this post is to detail our most recent Hackathon in March 2018. This is the group's 3rd Hackathon, where scientists spanning different departments, skill levels, and data science goals come together. The community space the hackathon fosters gives scientists working on computational projects a place to give feedback, share ideas, and ask questions.


Today, scientists came to work on personal projects and research projects from their lab.


One scientist performed a drug-drug similarity analysis on Medicare Part D claims data to find drugs commonly prescribed together. Others worked on analyzing RNA-Seq data generated by their labs, a common data analysis task at CUMC. A notable, ongoing project for one scientist has been to optimize the light exposure over a region of soil in his garden at home, which requires calculating definite integrals and applying optimization methods.


We also worked more on advancing the CUMC Data Science Group's website, as well as brainstorming the group's goals, mission, and fit within the CUMC community. We have come a long way over the past several months in defining what the group is and the role it should have at CUMC, and we are getting close to convergence!


Hackathon Project Summary (Katie Shakman):

During the hackathon, I worked on a project to assess the similarity of drugs from publicly available claims data. The claims dataset was reorganized so that each drug is a row. First, I used dimensionality reduction by Principal Components Analysis to represent each drug as a point in 3-PC-space, and then I applied K-means clustering (figure below).

Each colored dot represents a drug; each gray dot is the center of a cluster.

The interpretation is that we can see groups of drugs that are nearby in this PC-space and use that to try to infer which drugs are most related in terms of their usage.
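
For anyone who wants to try something similar, here is a minimal sketch of the two steps described above: projecting a drug-by-feature matrix onto its first three principal components, then running K-means on the projected points. This is not the code from the hackathon. The data is randomly generated stand-in data, the matrix sizes and the choice of 3 clusters are assumptions made for illustration, and the helper names (`pca_project`, `kmeans`) are hypothetical. It uses only NumPy; a library such as scikit-learn would be the more usual choice in practice.

```python
import numpy as np


def pca_project(X, n_components=3):
    """Project the rows of X onto the top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every center, shape (n_points, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its points (keep it if empty).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Stand-in data (assumption): 60 "drugs" as rows, 8 made-up usage features
# as columns, drawn around 3 group means so there is structure to find.
rng = np.random.default_rng(42)
n_drugs, n_features, n_groups = 60, 8, 3
group_means = rng.normal(scale=5.0, size=(n_groups, n_features))
X = np.vstack([
    mean + rng.normal(size=(n_drugs // n_groups, n_features))
    for mean in group_means
])

coords = pca_project(X, n_components=3)   # each drug as a point in 3-PC-space
labels, centers = kmeans(coords, k=3)     # cluster label per drug, cluster centers

print(coords.shape, centers.shape)
print(np.bincount(labels, minlength=3))   # number of drugs in each cluster
```

With real claims data, it would probably be worth scaling the columns before the PCA step and trying several values of `k`, since both choices affect which drugs end up grouped together.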