This script streams user data from a public API into a SQL Server database. It uses Python's requests library to fetch the data, extracts the relevant user fields from the JSON response, and organizes them into a structured format. It then ensures the target database table exists and inserts the formatted records. The script runs continuously, fetching and inserting new data every few seconds, which makes it suitable for real-time data streaming applications. It demonstrates API data retrieval, data processing, and database operations working together in a single pipeline.
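The fetch, format, ensure-table, insert loop described above could be sketched roughly as follows. This is a minimal illustration, not the script itself: the field names (name.first, name.last, email), the users table, and the helper names are all hypothetical, and Python's built-in sqlite3 stands in for SQL Server so the sketch runs without a database driver. In the real script, fetch would presumably wrap a requests call and the connection would come from a SQL Server driver.

```python
import sqlite3
import time
from typing import Callable, Optional


def format_user(raw: dict) -> dict:
    """Flatten one nested user record into a flat row (hypothetical field names)."""
    name = raw.get("name", {})
    return {
        "first_name": name.get("first", ""),
        "last_name": name.get("last", ""),
        "email": raw.get("email", ""),
    }


def ensure_table(conn: sqlite3.Connection) -> None:
    """Create the target table if it is missing.

    Assumption: SQLite syntax. SQL Server would need a different
    existence check (for example one based on OBJECT_ID).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users "
        "(first_name TEXT, last_name TEXT, email TEXT)"
    )


def insert_user(conn: sqlite3.Connection, row: dict) -> None:
    """Insert one formatted row using a parameterized query."""
    conn.execute(
        "INSERT INTO users (first_name, last_name, email) VALUES (?, ?, ?)",
        (row["first_name"], row["last_name"], row["email"]),
    )
    conn.commit()


def stream(
    fetch: Callable[[], dict],
    conn: sqlite3.Connection,
    interval_s: float = 5.0,
    max_iterations: Optional[int] = None,
) -> int:
    """Fetch, format, and insert repeatedly; returns the number of rows inserted.

    With max_iterations=None the loop runs until interrupted.
    """
    ensure_table(conn)
    count = 0
    while max_iterations is None or count < max_iterations:
        insert_user(conn, format_user(fetch()))
        count += 1
        # Skip the pause after the final iteration of a bounded run.
        if max_iterations is None or count < max_iterations:
            time.sleep(interval_s)
    return count
```

Passing fetch as a callable keeps the network call separate from the formatting and database logic, so the loop can be exercised with canned data, for example stream(lambda: sample_record, sqlite3.connect(":memory:"), interval_s=0, max_iterations=2).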
Kaggle - Time series sales forecasting
Forecasts aren’t just for meteorologists. Governments forecast economic growth. Scientists attempt to predict the future population. And businesses forecast product demand—a common task of professional data scientists. Forecasts are especially relevant to brick-and-mortar grocery stores, which must dance delicately with how much inventory to buy. Predict a little over, and grocers are stuck with overstocked, perishable goods. Guess a little under, and popular items quickly sell out, leading to lost revenue and upset customers. More accurate forecasting, thanks to machine learning, could help ensure retailers please customers by having just enough of the right products at the right time.
Specifically, I'll build a model that more accurately predicts the unit sales for thousands of items sold at different Favorita stores. I'll practice my data analysis and machine learning skills with an approachable training dataset of dates, store and item information, promotions, and unit sales.
Data report
Below is a Power BI report that visualizes the grocery retailer's data and highlights insights from it. The interactive report is available on request.
Machine Learning and Prediction model (In progress)
American Express - Default Prediction
Credit default prediction is central to managing risk in a consumer lending business. Credit default prediction allows lenders to optimize lending decisions, which leads to a better customer experience and sound business economics. Current models exist to help manage risk. But it's possible to create better models that can outperform those currently in use.
American Express is a globally integrated payments company. The largest payment card issuer in the world, they provide customers with access to products, insights, and experiences that enrich lives and build business success.
The directive was to apply machine learning skills to predict credit default. Specifically, the task was to leverage an industrial-scale dataset to build a machine learning model that challenges the current model in production. Training, validation, and testing datasets include time-series behavioral data and anonymized customer profile information.
Titanic - Machine Learning from Disaster
The sinking of the Titanic is one of the most infamous shipwrecks in history.
On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough lifeboats for everyone onboard, resulting in the death of 1502 out of 2224 passengers and crew.
While there was some element of luck involved in surviving, it seems some groups of people were more likely to survive than others.
In this challenge, we build a predictive model that answers the question "What sorts of people were more likely to survive?" using passenger data (e.g., name, age, gender, socio-economic class).
EY - Working World Data Challenge
Overview
This challenge is to predict the occurrence of a single species of frog for a single location using a single data source at a coarse spatial resolution.
The output will be a species distribution model of one species of frog. Species distribution models are among the most widely used ecological tools and a cornerstone of environmental regulation and conservation in many countries worldwide.
Why frogs? Frogs are an indicator species. This means they are a go-to for scientists wanting to find out more about the environmental health of a particular ecosystem.
Because they have permeable skin, frogs are very sensitive to pollutants, and because they can live both on land and in water, they are a good indicator of the health of these two different environments.
Frogs are poorly served by existing species distribution models. They have very localized distributions, more restricted than suggested by a potentially suitable habitat, and therefore existing models struggle to represent their range accurately.
Because frogs are indicators of ecological health and proxies for biological diversity, their disappearance is of great concern. Where frogs occur, we see healthy, thriving, resilient ecosystems. Where frogs have disappeared, we see ecosystems in poor health. All the 2030 Sustainable Development Goals (SDGs) are underpinned by healthy ecosystems. This means we won’t reach our goals if we don’t prevent and reverse the loss of healthy ecosystems.