Besides Mathematics, I am interested in Statistics and Computer Science. Here are some of the projects I have done.
STATISTICS PROJECTS:
Multiple Linear Regression Analysis:
On Microsoft Corporation Stock: We use multiple linear regression to build a model that traders could use to predict future stock returns from the history of daily closing prices and volumes. The data (MSFT.xls) contains daily closing prices of Microsoft Corporation from March 20, 2012 to April 2, 2013 (259 observations), downloaded from Yahoo Finance. Since daily returns are in general far from normally distributed, we use log returns instead. We examine the association between the log return of the most recent closing price and the log returns of earlier ones.
Download the documentation and SAS code here
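As a minimal sketch of the log-return transformation mentioned above, the Java snippet below computes r_t = ln(P_t / P_{t-1}) from a series of closing prices. The class name, method name, and sample prices are hypothetical and are not taken from the project's SAS code.

```java
import java.util.Arrays;

final class LogReturns {
    /** Returns r[t-1] = ln(closes[t] / closes[t-1]) for t = 1 .. n-1. */
    static double[] logReturns(double[] closes) {
        if (closes.length < 2) {
            throw new IllegalArgumentException("need at least two closing prices");
        }
        double[] r = new double[closes.length - 1];
        for (int t = 1; t < closes.length; t++) {
            r[t - 1] = Math.log(closes[t] / closes[t - 1]);
        }
        return r;
    }

    public static void main(String[] args) {
        // Made-up prices for illustration only, not the MSFT data.
        double[] closes = {100.0, 110.0, 99.0};
        System.out.println(Arrays.toString(logReturns(closes)));
    }
}
```

Log returns are additive over time: the sum of the daily log returns equals the log return over the whole period, which is one reason they are convenient in regression models.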
On U.S. Energy Market: Energy companies have a significant impact on the U.S. economy. To evaluate the impact of oil and natural gas companies on the United States stock market, we chose to study the rate at which money is exchanged from one transaction to another (i.e., velocity) for the S&P 500. We developed a multiple linear regression model to predict the following day’s velocity of the S&P 500 based on a large sample of oil and gas companies during a 5-year period (13 April 2009 to 4 April 2014). We hypothesized that our model would provide a reasonable estimate of the next day’s S&P 500 velocity. This model could be used to predict the subsequent day’s velocity based on a subset of the best predictors in the model.
Download the documentation and R code here
Time Series Analysis:
On the Annual Number of Earthquakes: The two most important variables affecting earthquake damage are the intensity of ground shaking caused by the quake and the quality of the structures in the region. People cannot stop earthquakes from happening. They can, however, significantly mitigate the effects by identifying hazards, building safer structures, and learning about earthquake safety. One reason scientists record the annual number of earthquakes is to better understand their patterns and improve earthquake prediction. Good prediction can help reduce the damage earthquakes cause. Time series analysis can be useful in identifying such patterns and producing a model for forecasting.
Download the documentation and SAS code here (cover page, main text, appx A, B, C)
On the U.S. Milk Production: America’s dairy industry is an important contributor to our nation’s overall economy. There are dairy farms spread across all 50 states and Puerto Rico. Dairy is the number one agricultural business in California, Wisconsin, New York, Pennsylvania, Idaho, Michigan, New Mexico, Vermont, Arizona, Utah, and New Hampshire. Dairies create a ripple effect on the agricultural economy and the economic well-being of rural America. When a dairy farmer sells a dollar of milk, it generates economic activity of $3, and every $1 million of U.S. milk sales generates 17 jobs. The U.S. dairy industry is estimated at $140 billion in economic output, $29 billion in household earnings, and more than 900,000 jobs. Milk does not stay on the farm: where milk goes, jobs follow. Our dairies create jobs for people who grow and ship feed for our cows, as well as for veterinarians, insurance agents, accountants, bankers, and others. Dairy farmers purchase machinery, trucks, fuel, and more from local companies, which generates jobs and income. After milk leaves our farms, it travels by truck to a processor, where people make cheese, ice cream, butter, yogurt, and other dairy products. Truckers, packaging manufacturers, and food marketers complete the cycle by transporting and marketing the dairy products everyone loves. This means jobs in the transportation, distribution, and retail grocery industries. It is therefore valuable that milk production figures have been recorded over the years for study and forecasting. Good milk production forecasts allow planners and policy makers in the dairy industry and related industries to estimate future supply and demand more accurately and to formulate appropriate strategies for profit maximization. The objective of this project is to use SARIMA, a time series modeling technique, to fit a model based on historical data and forecast milk production levels.
Download the documentation and SAS code here (cover page, main text, appx A, B, C)
Intervention Analysis on the U.S. Air Miles: Pre & Post 9/11: Few inventions have changed how people live and experience the world as much as the invention of the airplane. Over time, air travel has become so commonplace that it would be hard to imagine life without it. The airline industry has certainly progressed, but its history includes some downturns. A decline in yields and fares was foreseen by early 2001, but it was not until September 11th that the real hurt began. The terrorist attacks on September 11, 2001 shook the United States in a profound way, deeply upsetting the national perception of safety within U.S. borders. No industry or sector of the economy felt the impacts of these events more than the airline industry. Both the immediate reaction to the attacks and the long-term repercussions have negatively affected the industry. Since passenger demand has a big impact on the U.S. airline industry, it is very important to analyse the monthly airline passenger miles in the U.S. Good air mile forecasting can help planners and policy makers estimate future demand accurately and then formulate appropriate strategies for profit maximization. The objective of this project is to use intervention analysis, a time series modeling technique, to fit a model based on historical data and forecast air mile levels.
Download the documentation and SAS code here (cover page, summary, main text, appx A, B, C)
Time Series Regression on the Demand for Beef vs Price: We consider a SAS dataset called “BeefPrice”, which consists of quarterly retail prices and per capita consumption of beef. The data cover the first quarter of 1977 through the third quarter of 1999. In the dataset, the variable P is the price of beef and Q is the quantity of beef demanded. The dataset also provides Log(P) (= lP) and Log(Q) (= lQ). The objective is to find the best time series regression model for Log(Q).
Download the documentation and SAS code here (cover page, summary, main text, appx)
Multivariate Analysis:
COMPUTER SCIENCE PROJECTS:
JAVA Projects:
Word Ladder: This problem asks for the shortest distance from a word to every other word, so it is natural to represent words as vertices in an undirected weighted graph and use Dijkstra's algorithm to compute the distances. This program was written in Java using Eclipse.
Download source files here, and documentation here
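The idea above can be sketched as follows. This is not the project's source: it assumes that two words are adjacent when they have the same length and differ in exactly one letter, and that every edge has weight 1 (with unit weights Dijkstra's algorithm gives the same distances as breadth-first search). The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class WordLadderSketch {
    /** Queue entry: a word and the tentative distance it was queued with. */
    private static final class Item {
        final String word;
        final int dist;
        Item(String word, int dist) { this.word = word; this.dist = dist; }
    }

    /** Assumed adjacency rule: same length, exactly one differing position. */
    static boolean adjacent(String a, String b) {
        if (a.length() != b.length()) return false;
        int diff = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) diff++;
        }
        return diff == 1;
    }

    /** Dijkstra from source; unreachable words are absent from the result. */
    static Map<String, Integer> distances(List<String> words, String source) {
        Map<String, Integer> dist = new HashMap<>();
        PriorityQueue<Item> pq =
            new PriorityQueue<>((x, y) -> Integer.compare(x.dist, y.dist));
        dist.put(source, 0);
        pq.add(new Item(source, 0));
        while (!pq.isEmpty()) {
            Item cur = pq.poll();
            if (cur.dist > dist.get(cur.word)) continue; // stale queue entry
            for (String next : words) {
                if (!adjacent(cur.word, next)) continue;
                int candidate = cur.dist + 1; // assumed unit edge weight
                if (candidate < dist.getOrDefault(next, Integer.MAX_VALUE)) {
                    dist.put(next, candidate);
                    pq.add(new Item(next, candidate));
                }
            }
        }
        return dist;
    }
}
```

Scanning the whole word list for neighbours keeps the sketch short but costs time quadratic in the number of words; a real dictionary would call for a precomputed adjacency list.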
Coin Removal: Suppose two people play a game in which they set up a line of coins consisting of pennies, nickels, dimes, and quarters. Also, suppose that they each contribute half the value of the coins initially and shuffle them to create a random starting line of coins. One person chooses the leftmost or rightmost coin and puts it in his/her pocket. The other then chooses a coin from one of the ends of the remaining line. The two opponents take turns removing a coin in this manner until no coins are left. The player who has the larger sum of money wins. In this project, we introduce and analyze three different algorithms, and then implement them to compare how they compete against one another. These algorithms are called Simple Greedy (SG), Dynamic Programming (DP) and Extra Greedy (EG).
Download source files here, and documentation here
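For illustration, here is a sketch of the textbook dynamic-programming solution to this kind of take-from-the-ends game. It is not necessarily the DP variant implemented in the project; the class and method names are hypothetical. The table entry margin[i][j] holds the best achievable difference (own total minus opponent's total) for the player about to move when coins i..j remain.

```java
final class CoinRemovalSketch {
    /**
     * Best margin (mover's total minus opponent's total) the first player
     * can guarantee when both players play optimally. Values are in cents.
     */
    static int bestMargin(int[] coins) {
        int n = coins.length;
        if (n == 0) return 0;
        int[][] margin = new int[n][n];
        for (int i = 0; i < n; i++) {
            margin[i][i] = coins[i]; // one coin left: the mover takes it
        }
        for (int len = 2; len <= n; len++) {
            for (int i = 0; i + len - 1 < n; i++) {
                int j = i + len - 1;
                // Take an end coin; the opponent then plays optimally on
                // the remaining line, so their margin is subtracted.
                int takeLeft = coins[i] - margin[i + 1][j];
                int takeRight = coins[j] - margin[i][j - 1];
                margin[i][j] = Math.max(takeLeft, takeRight);
            }
        }
        return margin[0][n - 1];
    }
}
```

For the line 5, 25, 10, 1 the first player's best opening move is the penny on the right: taking the nickel would expose the quarter to the opponent.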
Clue Game: This is a murder mystery game for three to six players, devised by Anthony E. Pratt from Birmingham, England, and currently published by the American game and toy company Hasbro. The object of the game is to determine who murdered the game's victim, where the crime took place, and which weapon was used. Each player assumes the role of one of the six suspects, and attempts to deduce the correct answer by strategically moving around a game board representing the rooms of a mansion and collecting clues about the circumstances of the murder from the other players. This is a Java GUI programming project.
Download source files here
Encipher/Decipher Algorithms: The project implements the Enigma, JeffersonWheel, Rot13, Solitaire, Vigenere, and Bifid ciphers, which can encipher and decipher user input. A simple GUI is provided for this program. This program was written in Java using Eclipse.
Download source files here
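As an illustration of one of the listed ciphers, below is a minimal Vigenere sketch. It is not the project's source, and it makes these assumptions: only ASCII letters are shifted, case is preserved, other characters pass through unchanged, and the key advances only on letters. The class and method names are hypothetical.

```java
import java.util.Locale;

final class VigenereSketch {
    /** Enciphers (encipher = true) or deciphers (false) text with key. */
    static String apply(String text, String key, boolean encipher) {
        if (!key.matches("[A-Za-z]+")) {
            throw new IllegalArgumentException("key must be ASCII letters");
        }
        String k = key.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder(text.length());
        int used = 0; // number of key letters consumed so far
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean upper = c >= 'A' && c <= 'Z';
            boolean lower = c >= 'a' && c <= 'z';
            if (!upper && !lower) {
                out.append(c); // leave non-letters untouched
                continue;
            }
            int shift = k.charAt(used % k.length()) - 'A';
            if (!encipher) {
                shift = 26 - shift; // inverse shift modulo 26
            }
            char base = upper ? 'A' : 'a';
            out.append((char) (base + (c - base + shift) % 26));
            used++;
        }
        return out.toString();
    }
}
```

With the one-letter key "N" every letter is shifted by 13, so this sketch also reproduces Rot13, for which enciphering and deciphering are the same operation.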
C++ Projects:
Poker Game: This program sets up a poker game for the user to play. It shuffles the card deck and lets the user place wagers. This program was written in Microsoft Visual C++ 2005 for Windows.
Download source files here
Website Tracking: The program reads an input text file, supplied by the user, containing information about a series of hits to various websites. It then stores this information and provides the user with a menu for looking up information about a website of his/her choosing. It can also print a bar chart for each input link. This program was written in Microsoft Visual C++ 2005 for Windows.
Download source files here
File Reader: The program displays files page by page, where a page is a given number of lines. The user can use a command line of the form "HW3 -i file1 [file2[ file3[ ...]]]" where i is a positive integer giving the number of lines per page, and file1, file2, ... are the files to display. The user can also run the program directly from HW3.com, in which case the user is prompted step by step. This program was written in Microsoft Visual C++ 2005 for Windows.
Download source files here
C# Projects:
Chinese Chess Game: This program allows users to input and replay a Chinese chess game. The board and pieces are drawn by the program itself, so no image files are needed. The colors of the pieces can be changed as desired. This program is useful for people who wish to study chess playing strategies. This program was written in Microsoft Visual Studio .NET.
Download source files here
Internet Programming Projects:
Networking Projects:
Relational Database Projects: