DS & CS Capstone Symposium

Program 2019

10.30 (room 101) - Opening Speech by our Dean: Prof. Keith W. Ross

Session 1

10.45 – 11.45

Track 1 (room 101)

Computer Music

AutoDuet: Melody-Accompaniment Cooperation through Multi-Agent Learning - Zijie Lu (poster) (report)

In this project, we explore the problem of interaction between computer music systems, specifically two-way expressive accompaniment. Expressive performance interaction between intelligent systems is challenging, especially when it incorporates abstract reaction mechanisms such as musicianship. We propose a novel recurrent residual neural network with an LSTM layer to model the musicianship of each performer in the duet and to synthesize the performance in an online fashion. Our architecture has been shown to outperform traditional automatic accompaniment algorithms and is relatively robust in collaboration.


Automatic Synchronization of Videos with Background Music - Ruoming Gong (poster) (report)

Producing a high-quality music video is a time-consuming task that requires professional knowledge and great patience. In this project, we built video synthesis software that automatically synchronizes music and videos to generate a music video. We implemented efficient feature extraction methods to extract feature series from music and videos, and we use a multi-layer DTW algorithm to match the music feature series with the video feature series. Our video synthesis software gives users a high degree of freedom and produces enjoyable music videos.
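
The DTW matching step mentioned above can be illustrated with a minimal sketch. This is plain single-layer dynamic time warping on one-dimensional feature series, not the project's multi-layer variant; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D feature series.

    Illustrative only: the project's multi-layer DTW is not reproduced here.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

A series and a time-stretched copy of it align at zero cost, which is the property that lets music and video features of different lengths be matched.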


Generating Drum Tracks For Classical Jazz Music - Michael Dinku (poster) (report)

In this paper, we explore the problem of generating drum tracks for classical jazz music. Generating drum tracks is challenging because it requires taking into account the long-term time dependency of the notes as well as the overall quality of the drum track produced. We introduce and implement a constrained LSTM architecture for textual drum track generation. Unlike previous work in this direction, we show that the constrained LSTM architecture gives the generated drum tracks a human feel and groove. Finally, we discuss and analyze the results and suggest directions in which this project can be taken in the future.


Performance Beat Tracking By Neural Networks - Yixing Guan (poster) (report)

The task of performance beat tracking has remained an open research problem in the field of music information retrieval for many years. So far, existing solutions are largely based on the periodicity feature of a performance piece. In this project, we outline three frameworks under which end-to-end models can be trained without exploiting the periodicity feature at all. Experimental results show that the trained Bi-LSTM and CNN-CRF models perform relatively well on datasets without any explicit periodicity features, while the Asymmetrical Transformer model is also an acceptable solution if we take training speed into account.

Track 2 (room 102)

Computer Vision & Image Processing

3D Point Cloud Registration Algorithms for the NYUSH IMA Telewindow Project - Zhanghao Chen & Xincheng Huang (poster) (report)

The goal of this project is to develop a set of algorithms that geometrically align, or “register”, the 3D point clouds captured by the depth cameras of “Telewindow” into one 3D image. Current registration algorithms usually adopt a pipeline architecture with multiple stages. Plenty of variants are available for each stage, and there is no standardized solution suited to all applications. We build an experimental registration pipeline and investigate different combinations of the algorithms from each stage to find the optimal pipeline under the “Telewindow” project setting. From our experiments, we identify the most promising algorithm combinations, which provide an accurate registration result given clean point clouds of a user’s figure.


TeleWindow Rendering - Kevin Li & Sarah Wardles (poster) (report)

Over the years, computer graphics technology has progressed rapidly from simple 3D modeling to more complex, detailed recreations of models that aim to mirror life. At the core of computer graphics seems to be the desire to produce photorealistic recreations of real objects, the hardest of which to recreate are the human face and body. In practice, a rendering of the human face that the viewer perceives as realistic is very difficult to achieve. Humans are much more sensitive to renderings of their own reflections than to those of computer-generated objects, meaning the threshold for a human to detect that a rendered model of a human is in fact a computer rendering is very low. The TeleWindow project’s goal is to create a realistic experience of looking into a mirror when the user is in fact looking at a computer-generated rendering of themselves. When a user sits in front of the TeleWindow screen, it will be as if they are looking at themselves in a mirror. We, the rendering team, aimed to maximize the aspects of rendering a human face that contribute to this feeling of looking into a mirror. Through a comparative study of point cloud manipulation and experimentation, we arrived at a point cloud implementation that takes advantage of alpha manipulation, dynamic point sizing, and point density decimation. This implementation was found to be more than 1.5 times more realistic than the Intel RealSense camera’s default rendering method.


Zero-Shot Facial Expression Recognition - Zijia Lu (poster) (report)

The zero-shot facial expression recognition task aims to build a classification model that can correctly classify images from emotion classes it has never seen before. This project improves classification accuracy by proposing new forms of class definitions, uncovers the limitations of conventional distance metrics, and shows a possible direction for improvement.


Optimized OCR for Maritime Compliance using Deep Learning - Jarred van de Voort (poster) (report)

Early approaches in the field of Optical Character Recognition (OCR) have demonstrated success in converting scanned documents into accurate digital representations. However, they quickly become obsolete when faced with the harder task of recognizing documents captured with digital cameras, which inherently contain high variance in perspective, lighting, orientation, and background noise. In this paper, I propose a novel approach to this problem by connecting several deep learning architectures that detect documents within a scene, localize text by identifying areas of interest, and finally identify characters from the localized text. The proposed Connectionist Augmented Proposal Text Network (CAPTN) model outperforms state-of-the-art OCR engines by generalizing better to artifacts present in images captured by digital cameras.

Lunch Break

Session 2

12.30 – 13.30

Track 1 (room 101)

Social Networks & Graphs

Fake News Detection on Social Networks - Yiyang Sun, Haidong Xu (poster) (report)

Fake news has been a problem since ancient times. However, with the help of online social media, fake news can now spread more easily and quickly. Meanwhile, the authors of fake news have also adapted: the methods reporters use to report real news are now applied to creating fake news, which makes it hard for people to differentiate fake news from real news. Therefore, we tried different machine learning algorithms and designed a text CNN model to detect fake political news on social media. By using a word frequency dictionary, the model outperformed traditional detection systems in terms of both accuracy and running time.


Synced: Music Recommendation System Based on Social Network Analysis - Robert Prast (poster) (report)

Social networks’ increasing role in implicitly affecting users’ decisions has disrupted industries such as e-commerce and online photo sharing. The same paradigms that influence social circles in apps like Instagram and Facebook for text and visual multimedia can be applied to audio for a more intuitive and natural music recommendation algorithm. By balancing users’ inherent trust in their social circles with a mathematically driven process for finding similarity between songs, one can dynamically recommend music that is in tune with a specific user’s taste in a collaborative, socially driven approach.


Node Embedding Generation with the Edge Adjacency Matrix - Yu Gai (poster) (report)

We study the problem of node embedding generation for sparse networks, which is interesting because of the statistical challenge created by network sparsity. We propose a solution based on non-backtracking random walks, which overcomes the statistical challenge by leveraging the edge adjacency matrix. We provide both empirical and theoretical justifications for our solution.
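
As a rough illustration of the edge adjacency matrix behind non-backtracking walks, the sketch below builds it for a small undirected graph. It shows only the standard construction; the embedding method itself is not reproduced, and the function name `non_backtracking_matrix` is an assumption made for illustration.

```python
def non_backtracking_matrix(edges):
    """Build the edge adjacency (non-backtracking) matrix of an undirected graph.

    Each undirected edge {u, v} yields two directed edges (u, v) and (v, u).
    B[i][j] is 1 when directed edge i = (u, v) can be followed by
    directed edge j = (v, w) with w != u, i.e. without turning straight back.
    """
    directed = []
    for u, v in edges:
        directed.append((u, v))
        directed.append((v, u))
    size = len(directed)
    B = [[0] * size for _ in range(size)]
    for i, (u, v) in enumerate(directed):
        for j, (x, w) in enumerate(directed):
            if x == v and w != u:
                B[i][j] = 1
    return directed, B
```

On a triangle, every directed edge has exactly one non-backtracking continuation, so each row of the matrix sums to 1.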


Towards Differentiable Graph Pooling with Structural and Feature Information - Qi Huang (poster) (report)

Graph pooling is a graph analytic method in the field of graph representation learning. Starting from an original graph and signal pair, the goal is to construct a smaller graph with transformed signals that retain as much of the desired information as possible. Traditional methods fall short of utilizing feature information in signal-rich graphs, while more recent learning-based analytic methods discard most structural information. In this project, we propose a graph-neural-network-based pooling network that computes the node assignment from node feature information and refines it with graph structural information via proximal gradient descent. With graph classification as the downstream task, our proposed method achieves favorable performance against the baseline on both real and synthetic datasets.

Track 2 (room 102)

Digital Humanities

Analyzing the Downed Aircrews in Occupied France during WW2 and the People Who Assisted Them - Matthew Couch (poster) (report)

In World War 2, downed Allied pilots had to travel across enemy-occupied France to return to England. After their return, each pilot had to take part in a mandatory interview. This project attempts to analyze the pilots’ sentiment in their return interviews. The lexicon-based SentiWordNet showed overwhelmingly positive sentiment in the document set, and IBM’s Watson Tone Analyzer returned a large number of tentative and joyful sentiments. While the results in general show a positive sentiment from the pilots, more investigation is needed into what exactly they are positive about. Furthermore, better handling of the historical text is required to ensure proper analysis.


Plastic All Around Us - Tisa Segovic (poster) (report) (video)

The primary objective of this capstone project is to create an educational web application about plastic pollution supported by interactive tools. The web application as a whole focuses on plastic pollution within maritime areas of the United States. The emphasis is on the intersection between education and interactivity, since the human ability to connect with nature is primarily based on perceived sensory experiences. The approach is therefore to create map-based, graph-based, and quiz-based tools for users to interact with and learn from. The data used in these tools is dynamically collected from federal and local government websites. The project gives an insight into the process of collecting, preparing, and visualizing that data. Furthermore, it analyses the interactivity achieved within each of the interactive tools created and presents the project's conclusions.


Parallelization of Monopoly® Simulations - Ricardo Chacon, Lu Lu (poster) (report)

Socio-economic disparity is a persistent problem that governments and experts try to tackle. Because its effects on society as a whole are not immediately discernible, we use the game of Monopoly® as a basis for modeling the economy, aiming to address the lack of statistical data supporting ways to tackle the issue. By running thousands of simulations under different rulesets, we extract data illustrating that some rules of the economic game, often treated as inviolable laws of economics, are in fact determined by society and could be altered. Our approach successfully handles a limited number of variable rulesets and runs thousands of simulations that output statistically meaningful data.


Social Network Analysis of China-related Environmental Literature - Yiyun Fan (poster) (report)

This project aims to provide tools to facilitate social network analysis (SNA) of China-related environmental literature and its researcher community. China-related environmental study is now a popular field with a rapidly expanding literature; however, previous analysis of the literature is mainly qualitative, leading to an interest in quantitative social network analysis, which requires access to large amounts of formatted literature information. To reach this objective, the project takes the following steps: (1) build a crawler that scrapes article and author information from Web of Science; (2) set up a database for data storage; (3) design an interface for data retrieval and social network analysis; (4) perform social network analysis on a subset of China-related environmental literature. The project has successfully scraped and stored 62578 article records, 376488 author records, 2754111 citation records and 262232 keyword records in total, built an interface for data retrieval that allows specification of keyword, range of years, and minimum number of citations, and performed SNA on a subset of China-related environmental literature.

Coffee Break

Session 3

14.00 – 15.00

Track 1 (room 101)

Data Analysis for Real-World Applications

Adversarial Training for Personality Classification - Virgil Tataru (poster) (report)

Psychology research has shown personality to be an effective predictor of people’s behavior. Automatic personality classification from text would provide many benefits to the medical industry. In the machine learning literature, this problem has proven to be much more difficult than other classification tasks. Moreover, the best results are often achieved by models that are fed both user-generated text and hand-engineered features. In this paper, we show that an LSTM-based model employing Virtual Adversarial Training outperforms other deep learning methods for text-based personality classification, even when trained only on user-generated text.


Human Vocal Sentiment Analysis - Andrew Huang, Puwei Bao (poster) (report)

In this paper, we combine conventional vocal feature extraction techniques (MFCC, STFT) with deep learning approaches such as CNNs and with context-level analysis of the accompanying textual data to improve emotion-level classification. We explore models that have not previously been tested to gauge the difference in performance and accuracy. We apply hyperparameter sweeps and data augmentation to improve performance. Finally, we examine whether a real-time approach is feasible and can be readily integrated into existing systems.


LipONet: Lipreading using Neural Ordinary Differential Equation - Abdullah Mobeen, Muddassar Sharif, Shikhar Sakhuja (poster) (report)

LipNet, the most recent deep learning approach to lip reading, is currently the most effective lip-reading technique. However, training the model takes a very long time and consumes an enormous amount of resources. Our research consists of several experiments to replace the Recurrent Neural Network (RNN) and Spatiotemporal Convolutional Neural Network (STCNN) with a Neural Ordinary Differential Equation (ODE) network. Our experiments showed that ODE Net can not only successfully replace STCNN but also save a lot of time and resources by eliminating the manual process of finding an optimal number of layers and by decreasing the memory cost associated with training neural networks.


Machine Learning Applied to Failure Detection - Yiqin Qiu, Sihan Peng (poster) (report)

Time-out based failure detection algorithms rely on accurate estimation of the heartbeat message arrival time. However, frequent process failures and network fluctuations make the estimation process non-trivial. Meanwhile, machine learning algorithms are the state-of-the-art algorithms in the field of time series estimation. Therefore, we apply various machine learning techniques, including ARIMA, CNN, and RNN, to failure detection algorithms. We also extend the RNN failure detectors to cooperate by sharing their hidden states. We assess the algorithms through simulation based on real network transmission data. Our experimental results indicate that some machine learning based algorithms improve the quality of detection; however, the computational cost of those algorithms prohibits their application with today’s technology.
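
For context, the time-out idea described above can be sketched with a simple moving-average baseline. This is not one of the project's ARIMA, CNN, or RNN detectors; the class name `HeartbeatTimeoutEstimator`, the window size, and the safety margin are all assumptions chosen for illustration.

```python
from collections import deque


class HeartbeatTimeoutEstimator:
    """Baseline time-out failure detector (illustrative, hypothetical names).

    Predicts the next heartbeat arrival as the last arrival plus the mean
    inter-arrival gap over a sliding window, then adds a safety margin.
    """

    def __init__(self, window=5, margin=0.5):
        self.arrivals = deque(maxlen=window)  # most recent arrival times
        self.margin = margin                  # assumed safety margin (seconds)

    def record(self, t):
        """Store the arrival time of a received heartbeat."""
        self.arrivals.append(t)

    def deadline(self):
        """Return the time after which the process is suspected, or None."""
        if len(self.arrivals) < 2:
            return None
        times = list(self.arrivals)
        mean_gap = (times[-1] - times[0]) / (len(times) - 1)
        return times[-1] + mean_gap + self.margin

    def suspect(self, now):
        """True if no heartbeat has arrived by the estimated deadline."""
        d = self.deadline()
        return d is not None and now > d
```

A learned estimator would replace the mean-gap prediction in `deadline` with a model's forecast of the next arrival time; the surrounding time-out logic stays the same.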

Track 2 (room 102)

Software Engineering for Real-World Applications

Optimal Satellite Constellation Topologies for Quantum Information Transfer - Kalkidan Fikadu (poster) (report)

In 2016, the Chinese Academy of Sciences launched the first quantum satellite, called Micius. The satellite has been used to test entanglement distribution over a long distance and quantum teleportation of a particle from a station on the ground to the Micius satellite. In this project, we adopt an Iridium-like satellite constellation for quantum information transfer and analyze the performance of the network by computing the shortest Manhattan distance between any two satellites connected by successive quantum channels. Moreover, we identify potential areas of improvement to the Iridium topology based on enhancements suggested in other publications.
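
The Manhattan-distance measure mentioned above can be sketched on a simplified grid model of the constellation. The model below is an assumption, not the project's: satellites are indexed as (plane, slot), links within a plane form a ring, links between neighboring planes connect equal slots, and there is no wrap-around between the first and last plane. The default of 11 satellites per plane is likewise an assumed, Iridium-like value.

```python
def hop_distance(sat_a, sat_b, sats_per_plane=11):
    """Manhattan hop count between two satellites on an assumed grid model.

    Each satellite is a (plane, slot) pair. In-plane links wrap around as a
    ring; cross-plane links do not wrap (no link across the seam).
    """
    plane_a, slot_a = sat_a
    plane_b, slot_b = sat_b
    # Shortest way around the in-plane ring.
    in_plane = abs(slot_a - slot_b)
    in_plane = min(in_plane, sats_per_plane - in_plane)
    # Planes are treated as a line, so the plane difference is used directly.
    cross_plane = abs(plane_a - plane_b)
    return in_plane + cross_plane
```

For example, under these assumptions `hop_distance((0, 0), (5, 10))` needs one in-plane hop around the ring plus five cross-plane hops.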


Simulating Quantum Physics with Quantum Computing - Sean Coneys, Kevin Orellana (poster) (report)

Quantum computing promises great advances in computational speed by harnessing the power of quantum superposition and entanglement. One of the most promising applications of quantum computing is the simulation of quantum physics, which is known to be a computationally difficult problem. On classical computers, calculating the quantum state of an N-particle system requires resources that scale exponentially with N, making the problem intractable at the level of 30 or 40 particles. With quantum computers, however, one can calculate various properties of the system, such as the time dynamics, with an exponential speedup. This project focuses on quantum simulations of the transverse Ising model.


Automated Database Management for Thyssenkrupp - Bosen Yang, Laura Lehoczki (poster) (report)

Thyssenkrupp used to save their production line data in Excel and rely on manual extraction of data for analysis in the desktop software Minitab. Our software automates data conversion and database management for the plant, as well as correlation analysis of the production line data, while integrating with their existing systems and practices. Although our program eliminates the biggest bottleneck in day-to-day data aggregation and analysis and is scalable to other machines in the plant, it is only the first small step on Thyssenkrupp’s way to becoming a smart factory. Source code for this project is available at https://github.com/laural21/capstone.


Abstract Operating System - Skye Im, Feng Chiu (poster) (report)

Operating systems are a crucial part of any computer science education, but they are also one of the most difficult subjects to teach because of the high complexity of the concepts involved, as well as an operating system’s position as a bridge between abstract computations and hardware implementation. Existing work focuses on teaching operating systems as used in the real world, such as by modifying existing kernels or developing new ones that are capable of running on hardware. We present the Abstract Operating System (AOS), a simulation-based, high-level tool for teaching operating systems courses that runs in user space, as well as a browser-based visual interface for interacting with the AOS. We demonstrate that despite abstracting away the lower-level interactions of an OS, AOS allows students to better learn the organizing principles and concepts of such systems.