Tools Used: ArcGIS Pro, QGIS
Project Type: Spatial Analysis & Geohazard Mapping
Project Overview
This project analyzed the spatial pattern of earthquake occurrences in Indonesia in ArcGIS Pro, using a 20-year dataset covering 2000 to 2020. Because Indonesia lies along the Pacific Ring of Fire, I integrated seismic data from multiple sources to evaluate the distribution, intensity, and clustering of earthquakes across the archipelago. The analysis used three spatial statistics, Average Nearest Neighbor (ANN), Kernel Density Estimation (KDE), and Hot Spot Analysis (Getis-Ord Gi*), to map seismic activity, identify high-risk zones, and assess their statistical significance in support of disaster risk reduction and urban planning.
Project Highlights
Collected and filtered extensive earthquake data (magnitude, depth, and location) spanning 2000-2020 across Indonesia’s tectonic boundaries
Applied Average Nearest Neighbor (ANN) analysis, which returned a Z-score below -2.58, indicating statistically significant clustering (p < 0.01)
Utilized Kernel Density Estimation (KDE) with varying fishnet grid sizes (e.g., 1.5x1.5 km to 0.5x0.5 km) to generate a continuous density surface, distinguishing high, moderate, and low seismic density areas
Conducted Hot Spot Analysis (Getis-Ord Gi*) to identify hot spots that are statistically significant at the 95% confidence level, focusing on regions like Sumatra and Java
Visualized results with thematic maps, including cold spot and hotspot delineations, to support targeted mitigation strategies
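The ANN statistic behind the Z-score reported above can be sketched in plain Python. This is a minimal illustration of the standard formula, not the ArcGIS Pro tool itself; the point coordinates and the 100 x 100 study area are hypothetical.

```python
import math

def average_nearest_neighbor(points, area):
    """Return (ratio, z_score) for the Average Nearest Neighbor statistic."""
    n = len(points)
    # Observed mean distance from each point to its nearest neighbour.
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    observed = sum(nearest) / n
    # Expected mean distance and standard error under complete spatial randomness.
    expected = 0.5 / math.sqrt(n / area)
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed / expected, (observed - expected) / std_error

# Hypothetical epicentres: two tight groups inside a 100 x 100 study area.
clustered = (
    [(10 + dx, 10 + dy) for dx in range(3) for dy in range(3)]
    + [(80 + dx, 80 + dy) for dx in range(3) for dy in range(3)]
)
ratio, z = average_nearest_neighbor(clustered, area=100 * 100)
```

A ratio below 1 with a strongly negative Z-score indicates clustering; the result is sensitive to the study area value, so that input deserves care.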
Skills & Concepts Demonstrated
Proficiency in spatial pattern analysis and geostatistical methods for earthquake hazard assessment
Deep understanding of tectonic influences and statistical tools like ANN, KDE, and Getis-Ord Gi*
Advanced application of GIS for seismic data processing, interpolation, and visualization
Interpretation of spatial statistics and Z-scores for identifying earthquake-prone areas and informing disaster preparedness policies
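As a rough illustration of the KDE step, the sketch below evaluates a density surface on a regular grid. It assumes a quartic kernel and a fixed bandwidth; the event coordinates, cell size, and bandwidth are hypothetical and are not the settings used in the project.

```python
import math

def kde_surface(points, x_min, y_min, cell_size, n_cols, n_rows, bandwidth):
    """Evaluate a quartic-kernel density at the centre of every grid cell."""
    grid = []
    for row in range(n_rows):
        cy = y_min + (row + 0.5) * cell_size
        values = []
        for col in range(n_cols):
            cx = x_min + (col + 0.5) * cell_size
            total = 0.0
            for px, py in points:
                u = math.hypot(cx - px, cy - py) / bandwidth
                if u < 1.0:  # points beyond the bandwidth contribute nothing
                    total += (3.0 / math.pi) * (1.0 - u * u) ** 2
            values.append(total / bandwidth ** 2)
        grid.append(values)
    return grid

# Hypothetical events: three close together and one isolated.
events = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (4.5, 4.5)]
surface = kde_surface(events, 0.0, 0.0, cell_size=0.5,
                      n_cols=10, n_rows=10, bandwidth=1.5)
```

Shrinking `cell_size` refines the surface without changing the underlying estimate, while changing `bandwidth` changes how smooth the surface is.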
Tools Used: ArcGIS Pro, Spatial Statistics Tools, Kernel Density Estimation
Project Type: Crime Mapping & Spatial Pattern Analysis
This project involved analyzing the spatial distribution of crime incidents in Chicago using point-based crime data. Through multiple spatial analysis techniques, the project aimed to identify crime hotspots, central tendencies, and spatial dispersion patterns to support urban safety insights.
Crime Mapping: Mapped geocoded crime incidents to visualize the overall spatial pattern.
Mean Center & Standard Distance: Calculated the mean center to locate the average position of crimes, and standard distance to understand the spread of incidents around the center.
Kernel Density Estimation (KDE): Generated a heatmap to highlight areas with high concentrations of criminal activity.
Quadrant Analysis: Divided the study area into quadrants to evaluate the spatial concentration of crimes within each zone.
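The mean center and standard distance described above reduce to two short formulas. The sketch below uses hypothetical projected coordinates and no weights; it is not the ArcGIS Pro implementation.

```python
import math

def mean_center(points):
    """Average x and average y of all incident locations."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    """Root-mean-square distance of the incidents from their mean center."""
    cx, cy = mean_center(points)
    n = len(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

# Hypothetical incident coordinates in projected units.
incidents = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
center = mean_center(incidents)
spread = standard_distance(incidents)
```

A circle of radius `spread` drawn around `center` gives a quick visual summary of how dispersed the incidents are.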
Spatial pattern analysis using point data
Application of central tendency and dispersion in GIS
Creation of density surfaces using KDE
Visual and statistical evaluation of crime clustering
Tools Used: ArcGIS Pro, ArcPy
Project Type: Spatial Analysis & Coverage Mapping
Project Overview
This project analyzed fire station coverage across the Greater Toronto Area (GTA) using GIS to assess service accessibility. I created buffer zones around fire stations to identify areas with and without coverage, integrating data on green spaces and bikeways to evaluate response efficiency and gaps in service.
Project Highlights
Generated 1-mile buffer zones around fire stations to map coverage areas
Identified no-service zones by overlaying buffers with GTA boundaries
Incorporated green spaces and bikeway data to assess response route options
Visualized coverage gaps using thematic mapping for urban planning insights
Skills & Concepts Demonstrated
Proficiency in buffer analysis and spatial overlay techniques
Understanding of service area mapping and gap identification
Integration of multiple spatial datasets for comprehensive analysis
Effective use of GIS for emergency response planning
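The coverage test behind a 1-mile buffer can be approximated as a distance check. The sketch below uses great-circle distance on latitude/longitude pairs, which is an assumption (the ArcGIS Pro buffers would normally be built in a projected coordinate system), and the station and address coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0
MILE_KM = 1.609344

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_covered(location, stations, radius_km=MILE_KM):
    """True if the location falls inside any station's buffer."""
    return any(haversine_km(*location, *s) <= radius_km for s in stations)

# Hypothetical station and address coordinates (lat, lon).
stations = [(43.6532, -79.3832), (43.7000, -79.4000)]
near = (43.6600, -79.3850)
far = (43.8500, -79.0500)
```

Straight-line buffers overstate real accessibility because they ignore the road network, which is worth keeping in mind when reading the coverage gaps.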
Tools Used: ArcGIS Pro
Project Type: Environmental Analysis & Spatial Modeling
Project Overview
This project utilized a Space-Time Cube (STC) in ArcGIS Pro to analyze spatiotemporal patterns of fine particulate matter (PM2.5) across Ethiopia, focusing on its impact on air quality and public health. I constructed an STC using satellite-based PM2.5 data from 1998 to 2019 to visualize and assess pollution trends, incorporating the Equal Earth projection for accurate global representation.
Project Highlights
Built a Space-Time Cube to aggregate spatial and temporal PM2.5 data
Applied Equal Earth projection for precise area representation in global analysis
Identified hot and cold spots using Getis-Ord Gi* and Mann-Kendall trend tests
Analyzed limitations like data sparsity and edge effects with mitigation strategies
Skills & Concepts Demonstrated
Proficiency in spatiotemporal data analysis with STC
Understanding of map projections and their impact on data accuracy
Application of statistical tools for pattern detection
Problem-solving for data resolution and computational challenges
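The Mann-Kendall trend test applied to each location's time series can be sketched as follows. This is a minimal version without a tie correction, and the annual PM2.5 values are hypothetical.

```python
import math

def mann_kendall(series):
    """Return (S, z) for the Mann-Kendall trend test (no tie correction)."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # +1, 0, or -1 per pair
    variance = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(variance)
    elif s < 0:
        z = (s + 1) / math.sqrt(variance)
    else:
        z = 0.0
    return s, z

# Hypothetical annual mean PM2.5 values for one bin of the cube.
rising = [20.1, 20.8, 21.5, 21.9, 23.0, 23.4, 24.2, 25.1]
s, z = mann_kendall(rising)
```

A positive z above 1.96 suggests an increasing trend at the 95% confidence level; a negative z below -1.96 suggests a decreasing one.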
This project visualized surface temperature distribution across Africa using two interpolation techniques: Inverse Distance Weighted (IDW) Smooth and Modified Kriging. Temperature point data were interpolated into continuous raster surfaces to highlight spatial trends, particularly extreme heat zones in North Africa and cooler climates in the south. The project compared the visual output and spatial accuracy of the two methods to assess their suitability for large-scale environmental mapping.
Generated temperature surfaces using IDW and Kriging
Compared spatial smoothness and predictive reliability of each method
Applied Equal Earth projection for accurate global representation
Used classified color ramps to enhance interpretability of temperature zones
Spatial interpolation and environmental modeling
Geostatistical vs. deterministic technique comparison
Effective symbology and map design in ArcGIS Pro
Analyzing spatial temperature patterns across diverse climates
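The deterministic half of that comparison, IDW, is simple enough to sketch directly. The station coordinates, temperatures, and power value below are hypothetical, and the sketch omits the search-neighbourhood and smoothing options offered by the ArcGIS Pro tools.

```python
import math

def idw(samples, x, y, power=2.0):
    """Inverse Distance Weighted estimate at (x, y) from (sx, sy, value) samples."""
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exact interpolator: honour the measured value
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Hypothetical temperature stations: (x, y, temperature in degrees C).
stations = [(0.0, 0.0, 40.0), (10.0, 0.0, 30.0), (0.0, 10.0, 20.0)]
estimate = idw(stations, 5.0, 0.0)
```

Because every estimate is a weighted average of the inputs, IDW can never predict beyond the observed minimum and maximum. Kriging differs in that it also models spatial autocorrelation and reports prediction uncertainty.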
This project analyzed population distribution across Texas counties using a raster-based dataset as the primary source. A population density map was generated and further examined through hot spot analysis (Getis-Ord Gi*) and spatial clustering to detect statistically significant patterns. The analysis helped identify urban population hubs, low-density rural regions, and spatial inequalities. A detailed analysis report was generated within ArcGIS Pro to document findings.
Converted raster population data to county-level density estimates
Created a choropleth map of population density across Texas counties
Applied Getis-Ord Gi* to detect hot and cold spots of population density
Performed cluster analysis to highlight concentrated population zones
Produced a full spatial analysis report summarizing methods and results
Raster data processing and spatial summarization
Spatial statistics and hot spot detection in ArcGIS Pro
Demographic visualization using choropleth maps
Report writing and interpretation of geospatial trends
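The Getis-Ord Gi* statistic used for the hot spot step can be sketched as a z-score per feature. The six "counties", their densities, and the binary neighbour weights below are hypothetical; ArcGIS Pro also offers other ways to conceptualize spatial relationships.

```python
import math

def getis_ord_gi_star(values, weights, i):
    """Gi* z-score for feature i; weights[i][i] is included (the 'star')."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    w = weights[i]
    sum_w = sum(w)
    sum_w2 = sum(wij * wij for wij in w)
    numerator = sum(wij * v for wij, v in zip(w, values)) - mean * sum_w
    denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
    return numerator / denominator

# Hypothetical densities for six counties arranged in a row.
densities = [900.0, 850.0, 800.0, 20.0, 15.0, 10.0]
# Each county neighbours itself and the counties directly beside it.
weights = [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(6)] for i in range(6)]
z_scores = [getis_ord_gi_star(densities, weights, i) for i in range(6)]
```

A feature is a hot spot only when it and its neighbours are high together, which is why a single high-density county surrounded by low values would not score as significant.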
Tools Used: ArcGIS Pro
Project Type: Spatial Analysis & Transportation Planning
Project Overview
This project assessed bus accessibility and identified areas needing future expansion in the Chattanooga region in Tennessee using ArcGIS Pro. I analyzed population density, poverty levels, and vehicle access data to map current bus coverage and highlight underserved areas, such as Signal Mountain, Red Bank, and East Ridge, for improved transport planning.
Project Highlights
Mapped bus service coverage using buffer zones around existing routes
Analyzed population density and poverty data to prioritize expansion areas
Identified regions with no vehicle access to guide transit infrastructure development
Visualized traffic flow patterns to optimize route efficiency in Chattanooga
Skills & Concepts Demonstrated
Proficiency in spatial analysis for transportation planning
Understanding of demographic data integration with GIS
Application of buffer and overlay techniques for accessibility mapping
Interpretation of traffic and accessibility patterns for urban development
This project analyzed the spatial distribution of dissolved oxygen (DO) levels in Chesapeake Bay for the years 2014 and 2015 to assess water quality conditions. Using interpolated environmental data, maps were generated to visualize critical low-oxygen zones, often associated with eutrophication and stress in aquatic ecosystems. The maps help identify temporal shifts in hypoxic areas that can affect fish habitats and biodiversity.
Visualized spatial variation in DO levels using continuous raster interpolation
Identified hypoxic zones (DO < 5 mg/L) across two consecutive years
Mapped environmental change to support aquatic health assessment
Applied a clear red–green color ramp to indicate low to high oxygen levels
Environmental data visualization in ArcGIS Pro
Raster interpolation for water quality metrics
Thematic map design for time-based comparison
Understanding of ecological indicators and their spatial trends
Tools Used: ArcGIS Pro, Spatial Analyst, Global Moran’s I, Geoprocessing Tools
Project Type: Spatial Statistical Analysis & Cartographic Modeling
This project focused on exploring spatial autocorrelation patterns in U.S. county- and state-level GDP data by industry using ArcGIS Pro. The goal was to analyze whether geographic patterns of economic activity—specifically in the Agriculture and Finance sectors—are clustered, dispersed, or random across space, and to evaluate the influence of scale and spatial relationships.
Prepared and cleaned GDP datasets by sector and geography (BEA data).
Analyzed MAUP by comparing GDP percentages at the county and state levels.
Created choropleth maps to visualize agricultural GDP distribution.
Identified spatial neighbors using contiguity and distance methods.
Calculated Global Moran’s I to assess spatial clustering.
Interpreted statistical results to understand spatial autocorrelation.
Implementation of spatial statistical techniques in GIS
Understanding and addressing ecological fallacy and MAUP
Creation and interpretation of spatial weight matrices
Development of cartographic products for economic data communication
Application of inferential spatial statistics to real-world economic datasets
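Global Moran's I itself is a compact formula. The sketch below computes it for six hypothetical regions in a row with binary contiguity weights; the GDP shares are made up for illustration and the significance test is omitted.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)

# Six regions in a row; neighbours share an edge (no self-weights).
weights = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(6)] for i in range(6)]
clustered = [10.0, 9.0, 8.0, 2.0, 1.0, 0.0]       # similar values sit together
alternating = [10.0, 0.0, 10.0, 0.0, 10.0, 0.0]   # neighbours always differ
i_clustered = morans_i(clustered, weights)
i_alternating = morans_i(alternating, weights)
```

Positive values indicate clustering and negative values indicate dispersion. Under the null hypothesis of no spatial autocorrelation the expected value is -1/(n - 1), which is close to zero for large n.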
Tools Used: ArcGIS Pro, Intersect Tool, Erase Tool, Attribute Table Analysis
Project Type: Temporal Change Detection & Population Analysis
This project focused on analyzing territorial and demographic changes in selected districts of Uganda between 2006 and 2020. Using geospatial tools, I compared changes in district boundaries and assessed their impact on population distribution. The goal was to understand how administrative shifts affect spatial analysis outcomes and data interpretation.
Boundary Change Detection: Identified areas that were retained, newly added, or excluded during boundary updates using the Intersect and Erase tools.
Population Analysis: Calculated population totals for unchanged, added, and removed areas using attribute table queries and summaries.
Thematic Mapping: Created maps highlighting different land areas and visualizing population distribution changes over time.
MAUP Discussion: Reflected on the Modifiable Areal Unit Problem (MAUP) and its impact on spatial and bivariate statistical analysis.
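The Intersect and Erase logic can be illustrated with set operations if each district is approximated as a set of grid cells. That rasterized simplification, the cell IDs, and the populations below are all hypothetical; the project itself worked on polygon features.

```python
def compare_boundaries(old_cells, new_cells, population):
    """Sum population over retained, added, and removed areas."""
    parts = {
        "retained": old_cells & new_cells,  # Intersect(old, new)
        "added": new_cells - old_cells,     # Erase(new, old)
        "removed": old_cells - new_cells,   # Erase(old, new)
    }
    return {name: sum(population[c] for c in cells)
            for name, cells in parts.items()}

# Hypothetical district footprints as (row, col) grid cells.
old_district = {(0, 0), (0, 1), (1, 0), (1, 1)}
new_district = {(0, 1), (1, 1), (0, 2), (1, 2)}
population = {(0, 0): 100, (0, 1): 250, (1, 0): 50,
              (1, 1): 300, (0, 2): 75, (1, 2): 25}
totals = compare_boundaries(old_district, new_district, population)
```

The three parts are mutually exclusive, so retained plus removed reconstructs the old district and retained plus added reconstructs the new one.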
Significant changes in district boundaries were observed, especially in Kitgum and surrounding areas.
The population in newly added areas in 2020 was 5,870, while unchanged areas retained a population of 242,256.
564 people were located in areas removed after 2006.
Spatial aggregation impacts were discussed in the context of MAUP, emphasizing the importance of reporting at multiple scales.
Temporal GIS analysis using intersect and erase tools
Attribute data manipulation and summary statistics
Population distribution mapping and change detection
Understanding of scale sensitivity in spatial data (MAUP)
Tools Used: ArcGIS Pro, Raster Calculator, Spatial Analyst, Zonal Statistics
Project Type: Raster Analysis & Hazard Mapping
This project focused on raster-based analysis of hurricane storm surge and its potential human impact in Galveston, Texas, a region vulnerable to coastal flooding. I used raster datasets to model storm surge extents and estimate the population potentially affected by a Category 4 hurricane.
The project was divided into two major components:
Part-A: Hurricane Storm Surge Modeling
Processed Digital Elevation Model (DEM) data to delineate storm surge impact zones.
Used raster reclassification to simulate water levels for various storm categories (Tropical Storm through Category 5).
Created a detailed storm surge map to visualize flood-prone areas in Galveston.
Part-B: Estimating Affected Population
Integrated census block-level population data with the surge raster layer.
Applied zonal statistics to estimate the number of people within impacted areas.
Produced a population vulnerability map showing people affected by a Category 4 hurricane — totaling 172,714 individuals in the modeled area.
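The two parts above amount to a threshold on elevation followed by a masked sum. The sketch below is a simple "bathtub" approximation that ignores hydraulic connectivity; the elevation grid, population grid, and 4.0 m surge height are hypothetical and are not the values used in the project.

```python
def flooded_mask(dem, surge_height_m):
    """Reclassify a DEM: True where elevation is at or below the surge height."""
    return [[elev <= surge_height_m for elev in row] for row in dem]

def exposed_population(mask, population):
    """Zonal sum of population over the flooded cells."""
    return sum(
        pop
        for mask_row, pop_row in zip(mask, population)
        for flooded, pop in zip(mask_row, pop_row)
        if flooded
    )

# Hypothetical 3 x 3 rasters: elevation in metres and people per cell.
dem = [[0.5, 1.0, 3.0],
       [1.5, 4.0, 6.0],
       [5.5, 7.0, 9.0]]
population = [[120, 80, 60],
              [200, 150, 40],
              [90, 30, 10]]
SURGE_HEIGHT_M = 4.0  # assumed surge height, for illustration only
mask = flooded_mask(dem, SURGE_HEIGHT_M)
affected = exposed_population(mask, population)
```

Repeating the calculation with a different surge height per storm category gives one exposure estimate per category.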
Raster modeling for hazard mapping and simulation
Use of raster calculator and conditional overlay techniques
Population exposure analysis using zonal statistics
Cartographic design for hazard and risk communication
Application of GIS for disaster preparedness and urban planning
Tools Used: ArcGIS Pro, Spatial Analysis, Table Joins, Cartographic Design
Project Type: Lab-Based GIS Final Project
This project involved creating a comprehensive spatial analysis and map portfolio to support the proposal of a Buffalo Commons—a concept focused on restoring native prairies and bison habitats across the Great Plains. Using multiple GIS techniques, I identified the most suitable counties for inclusion in the Buffalo Commons based on demographic, agricultural, and environmental factors.
Created custom feature classes and managed geospatial data within a geodatabase.
Performed attribute table joins to combine datasets from population, farmland, federal land ownership, and proximity to Indian lands.
Conducted a multi-criteria suitability analysis to locate optimal regions for rewilding efforts.
Designed a series of thematic maps, including:
Population Density Map
Population Change Map
Farmland Value Map
Federal Lands Map
Adjacent to Indian Lands Map
Final Suitability Map
Final Buffalo Commons Proposal Map
Developed a professional report summarizing the methodology, results, and cartographic outputs.
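One common way to run a multi-criteria suitability analysis is to score each county by how many criteria it meets. The sketch below assumes that approach; the field names and every threshold are hypothetical placeholders, not the criteria used in the project.

```python
def suitability_score(county):
    """Count how many (hypothetical) Buffalo Commons criteria a county meets."""
    criteria = [
        county["pop_density"] < 4.0,        # sparse population
        county["pop_change_pct"] < 0.0,     # population decline
        county["farmland_value"] < 500.0,   # low farmland value
        county["federal_land_pct"] > 10.0,  # substantial federal ownership
        county["adjacent_indian_lands"],    # borders Indian lands
    ]
    return sum(criteria)

# Hypothetical county records.
county_a = {"pop_density": 2.1, "pop_change_pct": -8.0, "farmland_value": 350.0,
            "federal_land_pct": 15.0, "adjacent_indian_lands": True}
county_b = {"pop_density": 45.0, "pop_change_pct": 3.0, "farmland_value": 1200.0,
            "federal_land_pct": 2.0, "adjacent_indian_lands": False}
```

Mapping the score as a graduated color produces a suitability map, and a cutoff on the score selects the counties for the final proposal.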
What I Learned
This project sharpened my understanding of geodatabase design, spatial reasoning, and visual storytelling with maps. It also strengthened my ability to manage and interpret large spatial datasets to support real-world land management and conservation goals.
Tools Used: ArcGIS Pro
Project Type: GIS Disaster Mapping & Analysis
Project Overview
This project analyzed tornado damage in Anderson County, Texas, using GIS to map the extent and severity of destruction. I classified damage levels (minor to extensive) and overlaid tornado strength (F1-F5) to visualize the impact across the region.
Project Highlights
Mapped tornado damage zones with qualitative classification
Integrated tornado strength data (F1-F5) with damage severity
Used color gradients to represent damage levels clearly
Designed a map with scale and north arrow for spatial context
Skills & Concepts Demonstrated
Proficiency in disaster-related GIS mapping
Application of qualitative data classification
Visualization of spatial damage patterns
Effective use of symbology and legends
Tools Used: ArcGIS Pro, Online GIS Data Sources, Geoprocessing Tools
Project Type: Lab-Based Mapping Project
In this project, I created a detailed and professional-quality map of Caprock Canyons State Park in Texas. The goal was to integrate data from multiple sources, perform geospatial processing, and apply effective cartographic techniques to produce an informative and visually appealing map.
Prepared lab data by organizing and structuring a GIS workspace.
Downloaded and integrated external GIS datasets (e.g., elevation, land cover, hydrography) from authoritative sources such as USGS and state databases.
Used geoprocessing tools (clipping, merging, projecting) to refine and align datasets with the Caprock Canyons region.
Applied standard topographic map symbols using USGS references to ensure professional symbology.
Referenced and utilized GIS data source guides to understand best practices for reliable and appropriate data usage.
Developed a composite map highlighting terrain, trails, water features, park boundaries, and elevation contours.
This project enhanced my abilities in data acquisition, data cleaning, and map design. I learned how to evaluate GIS data sources, perform core geoprocessing operations, and apply cartographic conventions for real-world geographic communication.
Tools Used: ArcGIS Pro
Project Type: GIS Data Analysis & Visualization
Project Overview
This lab explored qualitative and quantitative data classification techniques using GIS. I analyzed spatial data for the United States, including state-level qualitative classification, regional qualitative patterns, and a quantitative classification of Florida’s population. The results were visualized through thematic maps.
Project Highlights
Classified U.S. states qualitatively based on individual characteristics
Mapped regional qualitative patterns across U.S. regions
Performed quantitative classification of Florida’s population data
Utilized ColorBrewer for effective color schemes
Skills & Concepts Demonstrated
Mastery of qualitative and quantitative classification methods
Application of thematic mapping techniques
Use of color schemes for data visualization
Analysis of spatial patterns and distributions
Tools Used: ArcGIS Pro
Project Type: GIS Fundamentals – Topographic Mapping
Project Overview
This project focused on creating a topographic map of Wyoming using contour lines to represent elevation changes. I utilized ArcGIS Pro to process elevation data and design a map with clear contour intervals, scale bars, and coordinate systems to illustrate the region’s terrain features.
Project Highlights
Generated contour lines from digital elevation models (DEMs)
Applied the Albers Equal Area projection for accurate area representation
Designed a map layout with multiple scale options (kilometers, miles, feet)
Labeled key geographic features for context and readability
Skills & Concepts Demonstrated
Proficiency in topographic data processing and visualization
Understanding of map projections and coordinate systems
Effective use of scale and legend design
Interpretation of elevation data for terrain analysis
Tools Used: ArcGIS Pro
Project Type: Cartographic Portfolio & Projection Analysis
This project focused on understanding how different map projections affect the representation of spatial data. I created a series of maps using various projection systems to explore their impact on the shape, size, direction, and area of geographic features. The final output is a comparative map projection portfolio highlighting both global and regional distortions.
Applied a variety of map projections to the same base data for visual comparison
Analyzed how different projections introduce distortion in spatial characteristics
Designed a clean, consistent cartographic layout for each map to support comparative analysis
Explored projections suited for specific purposes such as navigation, global display, and regional analysis
Mercator Projection – Cylindrical projection used for navigation
Miller Cylindrical – Modified Mercator with reduced polar distortion
Mollweide, Robinson, Fuller, Goode’s Homolosine – Pseudocylindrical and interrupted projections focused on global display
Polar Stereographic – Accurate for polar regions
Albers Equal Area (USA Contiguous) – Suitable for statistical mapping across the U.S.
North America Continental Projection – Designed for continental-scale accuracy
In-depth knowledge of coordinate systems and projection types
Cartographic principles applied to map design and layout
Visual communication of spatial distortion
Evaluation of projection suitability based on geographic scope and purpose
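The area distortion that the Mercator map makes visible can be quantified with the spherical Mercator formulas. The sketch below assumes a spherical Earth of radius 6371 km and is for illustration only; it is not the ellipsoidal projection used by ArcGIS Pro.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical approximation

def mercator(lat_deg, lon_deg, radius=EARTH_RADIUS_KM):
    """Forward spherical Mercator projection; returns (x, y) in kilometres."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * lon
    y = radius * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y

def mercator_area_scale(lat_deg):
    """Factor by which Mercator inflates areas at a given latitude."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2

equator_scale = mercator_area_scale(0.0)
scale_at_60 = mercator_area_scale(60.0)
```

Areas at 60 degrees latitude are drawn about four times too large relative to the equator, which is why equal-area projections such as Albers are preferred for statistical mapping.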
Tools & Packages Used: R, AmesHousing, tidyverse, jtools, easystats, GGally, performance
This project involved building and evaluating multiple linear regression models to analyze and predict housing prices in Ames, Iowa. Using real estate data, the analysis focused on identifying key structural and locational features that explain variation in sale prices, with attention to data transformation, outlier handling, and model diagnostics.
Data Cleaning & Feature Engineering:
Selected key variables (e.g., Gr_Liv_Area, Total_Bsmt_SF, Garage_Cars, number of bathrooms).
Engineered new variables like total bathrooms and home age.
Filtered dataset to focus on residential density zoning and removed extreme outliers.
Exploratory Data Analysis (EDA):
Used ggpairs() to explore relationships between numeric variables.
Visualized sale price distribution and neighborhood-level differences using histograms and boxplots.
Model Building & Transformation:
Built an initial multiple linear regression model (lm) using sale price as the outcome.
Diagnosed multicollinearity and model assumptions using check_model().
Applied a log-transformation to sale price to correct skewness and improve model fit.
Exponentiated the coefficients of the log-transformed model so they can be read as multiplicative (percentage) effects on sale price.
Multiple Linear Regression
Log Transformation
Outlier Detection & Removal
Feature Engineering
Diagnostic Checking (residuals, normality, multicollinearity)
The model showed that above-ground living area, garage capacity, and bathroom count significantly affect home prices.
Log-transforming the sale price improved normality and homoscedasticity of residuals.
Residential zoning type plays a role in price variation, suggesting potential spatial influence.
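Exponentiating a coefficient from a log(sale price) model converts it to a percentage effect. The sketch below shows the arithmetic in Python rather than R, and the coefficient value is hypothetical, not one estimated in this project.

```python
import math

def percent_effect(coefficient):
    """Percent change in the outcome per one-unit increase in a predictor,
    for a model fitted to log(outcome)."""
    return (math.exp(coefficient) - 1.0) * 100.0

# Hypothetical coefficient for one additional garage space.
garage_coefficient = 0.10
garage_effect_pct = percent_effect(garage_coefficient)
```

For small coefficients the percentage is close to 100 times the coefficient, but the gap widens as the coefficient grows, so exponentiating is the safer habit.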
Tools & Packages Used: R, tidytext, tidyverse, sentimentr, DT
This project involved performing text mining and sentiment analysis on user-generated reviews collected from Niche.com about Lubbock, Texas. The goal was to extract meaningful patterns, frequently used phrases, and emotional tone from open-ended text responses using tidy text principles.
Tokenization & Cleaning:
Transformed raw review data into a tidy format using unnest_tokens()
Removed standard English stop words as well as custom ones like “lubbock” and “texas”
Identified and counted frequently occurring words
N-gram Analysis:
Extracted trigrams (3-word phrases) to uncover common themes and repeated expressions
Visualized the most frequent trigrams using horizontal bar charts
Sentiment Scoring:
Applied the sentimentr package to calculate sentiment scores for each review line
Computed average emotion scores per entry and visualized the sentiment distribution
Created an interactive summary table showing text excerpts and their emotional tone
Text Preprocessing (Tokenization, Stop Word Removal)
Frequency Analysis & Visualization
N-gram Modeling (Trigram)
Sentiment Scoring and Aggregation
Interactive Output with DT::datatable()
The most common descriptive terms highlight both positive and negative aspects of local life.
Frequent trigrams offer insight into repeated experiences or community narratives.
Sentiment scores indicate a diverse emotional range across reviewers, which could inform city branding or planning efforts.
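The trigram counting step can be sketched in plain Python. This is an analogue of the idea, not the tidytext code used in the project, and the two review sentences are hypothetical.

```python
import re
from collections import Counter

def trigrams(text):
    """Split text into lowercase words and return every run of three words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]

# Hypothetical review lines.
reviews = [
    "The cost of living is low.",
    "Low cost of living and friendly people.",
]
counts = Counter(t for review in reviews for t in trigrams(review))
top_trigram, top_count = counts.most_common(1)[0]
```

Sorting the counts in descending order gives the ranking that a horizontal bar chart of frequent trigrams would display.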
Tools & Packages Used: R, tidyverse, MASS, GGally, ggcorrplot, scales, AmesHousing
This project was designed to explore the concepts of correlation and regression using simulated data and real-world housing data from Ames, Iowa. The objective was to visualize linear associations between numeric variables and understand their relationships through correlation matrices, scatterplots, and regression models. The project served as a hands-on application of statistical and data visualization techniques covered in the Quantitative Methods with R course.
1. Simulated Correlation Visualization:
Created a custom function Assoc() to simulate bivariate normal data with a specified correlation using mvrnorm() from the MASS package.
Visualized how the strength and direction of correlation (from -1 to +1) affect the scatterplot distribution.
Included error handling for invalid correlation values.
2. Ames Housing Data Exploration:
Loaded and filtered the Ames, Iowa housing dataset using the AmesHousing package.
Selected key variables: Sale_Price, Lot_Area, Gr_Liv_Area, Full_Bath, Half_Bath, Fireplaces, and Garage_Cars.
Engineered a new variable BathRooms by combining full and half bathrooms for simplified analysis.
3. Correlation & Pairwise Relationship Analysis:
Computed the correlation matrix and visualized it using ggcorrplot with labeled correlation coefficients.
Used ggpairs() from the GGally package to explore pairwise relationships through scatterplots and density plots.
4. Linear Regression Visualization:
Plotted Gr_Liv_Area vs. Sale_Price to visually inspect the relationship between home size and selling price.
Overlaid a linear regression line using geom_smooth(method = "lm") to highlight the predictive trend.
Applied scales::dollar_format() to format sale prices as currency for better readability.
Bivariate Simulation and Visualization
Custom Function Creation in R
Data Cleaning and Feature Engineering
Correlation Matrix and Pairwise Plotting
Linear Regression Modeling
Data Visualization using ggplot2
Simulated data helped intuitively demonstrate how varying levels of correlation influence the spread of data points.
The Ames Housing dataset provided practical experience in feature selection, data transformation, and statistical exploration.
A clear positive linear relationship was observed between home living area and sale price, confirming domain expectations.
Visualization tools like ggcorrplot and ggpairs() are powerful for quickly diagnosing relationships among numeric variables.
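The idea behind the simulation function can be sketched without MASS: a second variable built as a weighted mix of the first and independent noise has the requested correlation. The Python below is an illustrative counterpart, not the R Assoc() function, and the name `simulate_correlated` is hypothetical.

```python
import math
import random

def simulate_correlated(n, rho, seed=None):
    """Draw n (x, y) pairs of standard normals with correlation rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must be between -1 and 1")
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        pairs.append((x, rho * x + math.sqrt(1.0 - rho * rho) * noise))
    return pairs

def pearson_r(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

sample = simulate_correlated(5000, 0.8, seed=42)
observed_r = pearson_r(sample)
```

The sample correlation will not equal the target exactly; it fluctuates around it and tightens as n grows.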
R for data processing and visualization
tidyverse (dplyr, ggplot2) for data wrangling and plotting
readr for importing CSV files
This project analyzes how in-state tuition costs for U.S. four-year colleges relate to their graduates’ early-career and mid-career salaries. Using two datasets (tuition_cost and salary_potential), I merged, cleaned, and visualized the data to identify trends and highlight specific institutions such as Rice University and Texas Tech University.
Data Acquisition & Merging
Imported tuition_cost.csv and salary_potential.csv.
Merged datasets by college name to create a unified dataset.
Data Cleaning & Transformation
Filtered to include only 4-year degree programs.
Converted state and type to factors for categorical analysis.
Created binary indicators to highlight specific universities of interest.
Exploratory Data Analysis (EDA)
Summary statistics to understand tuition distribution and salary ranges.
Created separate visualizations for early-career and mid-career pay comparisons.
Visualization
Scatter plots showing the relationship between tuition cost and salary potential.
Differentiated school type (Public/Private) using shape aesthetics.
Used color highlights for featured institutions:
Rice University (green highlight)
Texas Tech University (maroon highlight)
Data Merging using merge()
Categorical Encoding using factors and binary flags
Custom ggplot2 aesthetics (color-coding, shape differentiation)
Comparative Analysis between early-career and mid-career earnings
Private institutions tend to have higher tuition and often higher salary potential.
Rice University stands out with high early- and mid-career salaries relative to its tuition cost.
Texas Tech University demonstrates competitive salary potential despite lower tuition, making it a strong ROI candidate.
Visualizing specific institutions provides context for policy and personal decision-making in higher education choices.
Tools & Packages Used: tidyverse, tmap, tmaptools, sf, forcats, wordcloud, lsr, tidycensus
This project explores recent earthquake activity across California, Nevada, and Alaska using spatial data and statistical tools in R. It aims to visualize earthquake magnitudes, identify high-risk regions, and compare seismic intensities across locations using both descriptive and inferential statistics.
1. Data Cleaning & Transformation
Extracted key variables such as time, magnitude, and location.
Parsed date values and standardized location names (e.g., converting "CA" to "California").
Removed records with non-positive magnitudes and ambiguous location labels.
2. Exploratory Visualizations
Plotted a histogram of magnitudes to understand distribution.
Created a line chart to visualize average daily magnitude trends over 30 days.
Ranked U.S. states with high average magnitudes and displayed them using a horizontal bar chart.
Generated a word cloud to visualize frequency of earthquakes by location.
3. Regional Analysis (California, Nevada, Alaska)
Compared earthquake intensity using:
Jitter plots for visual spread.
Bubble maps with tmap for spatial patterns.
Bar plots for state-wise intensity ranking.
4. Statistical Tests
Conducted t-tests to compare mean magnitudes between California & Nevada, and California & Alaska.
Calculated Cohen’s d to measure effect size and determine the practical significance of differences.
California showed the most frequent and intense earthquake activity in the dataset.
Statistically significant differences were found between California and other regions.
Visual tools like bubble maps and jitter plots provided meaningful insights into regional seismic variability.
Word clouds were effective for quickly identifying hotspot regions.
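Cohen's d complements the t-test by expressing the mean difference in standard deviation units. The sketch below uses the pooled-standard-deviation form in Python rather than R, and the magnitude values are hypothetical.

```python
import math
import statistics

def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation of two samples."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)

# Hypothetical magnitudes for two states.
state_a = [2.0, 4.0, 6.0]
state_b = [1.0, 3.0, 5.0]
d = cohens_d(state_a, state_b)
```

By the usual rule of thumb, values near 0.2, 0.5, and 0.8 are read as small, medium, and large effects. With large samples a difference can be statistically significant while d remains small.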
General Social Survey (GSS) 2018 – SPSS format file (GSS2018.sav), focusing on survey responses about gun ownership, gender, and region.
Dataset link: NORC GSS Data Explorer
This project investigates patterns of gun ownership in the United States by gender and Census division using the 2018 General Social Survey. Statistical and spatial techniques were applied to examine whether gun ownership varies significantly across geographic and demographic groups, and to visualize the distribution of ownership rates across the country.
1. Data Cleaning & Preparation
Imported GSS data from an SPSS .sav file using haven.
Selected relevant variables (SEX, REGION, OWNGUN) and removed missing values.
Filtered to include only valid responses for gun ownership (Yes/No).
Created labeled factors for gender and Census divisions.
2. Exploratory Visualizations
Used Shadow plots to compare gun ownership by gender and by Census division.
Computed percentage ownership per division and visualized results on a choropleth map using tmap.
Generated a summary table showing proportional differences between divisions.
3. Statistical Testing
Conducted a Chi-squared test of independence to determine if gun ownership rates differ significantly between Census divisions.
Analyzed standardized residuals to identify divisions with unusually high or low ownership rates.
Calculated association statistics (Cramer’s V) to assess effect size.
Gun ownership rates vary noticeably across U.S. Census divisions, with some regions showing significantly higher prevalence.
Gender differences in ownership were visible, but regional variation played a stronger role.
Chi-squared analysis confirmed a statistically significant association between Census division and gun ownership.
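The chi-squared statistic and Cramér's V can both be computed from a contingency table of counts. The sketch below is in Python rather than R, and the three-division table is hypothetical, not GSS data.

```python
import math

def chi_squared(table):
    """Pearson chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

def cramers_v(table):
    """Cramer's V effect size derived from the chi-squared statistic."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi_squared(table) / (n * (k - 1)))

# Hypothetical counts: rows are divisions, columns are (owns gun, does not).
table = [[30, 70],
         [50, 50],
         [20, 80]]
chi2 = chi_squared(table)
v = cramers_v(table)
```

The statistic is compared against a chi-squared distribution with (rows - 1) x (columns - 1) degrees of freedom, while V rescales it to a 0 to 1 range that does not grow with sample size.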
R (statistical computing), tidyverse (ggplot2, dplyr) for data manipulation and plotting, Base R (t.test, summary) for statistical analysis
This project demonstrates how to simulate datasets and compare two groups using statistical analysis and visualization in R. The workflow involves generating synthetic data for a control group and a treatment group with adjustable mean differences, then applying statistical tests and plotting the results.
Data Simulation
Used rnorm() to generate random normal values.
Defined a custom function generateSimulationData() to create control and treatment groups with adjustable mean shifts.
Exploratory Data Analysis (EDA)
Computed summary statistics for each dataset.
Visualized distributions with density plots and boxplots using qplot() from the ggplot2 package.
Statistical Testing
Applied independent two-sample t-tests (t.test()) to compare group means.
Ran analyses for two scenarios:
No mean difference (mean shift = 0)
Significant mean difference (mean shift = 2.3)
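The simulate-then-test workflow can be sketched in Python as follows. This is an illustrative translation, not the project's R code: the sample size (1,000), unit standard deviation, and seed are assumptions, and `welch_t` computes the Welch statistic that R's `t.test()` uses by default (here as treatment minus control).

```python
import math
import random
import statistics

def generate_simulation_data(n, mean_shift, seed):
    """Control ~ N(0, 1); treatment ~ N(mean_shift, 1)."""
    rng = random.Random(seed)
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treatment = [rng.gauss(mean_shift, 1.0) for _ in range(n)]
    return control, treatment

def welch_t(a, b):
    """Welch two-sample t statistic (does not assume equal variances)."""
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    std_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(b) - statistics.mean(a)) / std_error

for shift in (0.0, 2.3):
    control, treatment = generate_simulation_data(n=1000, mean_shift=shift, seed=42)
    print(f"mean shift = {shift}: t = {welch_t(control, treatment):.2f}")
```

With a shift of 0 the statistic should stay near zero, while a shift of 2.3 should give a very large value.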
Visualization Parameters
Used transparency (alpha = 0.5) for overlapping density plots.
Applied smoothing (adjust = 1.5) for density curves.
Limited x-axis to a range of -4 to 6 for consistent comparison.
Mean Shift = 0 → Groups show overlapping distributions; t-test indicates no significant difference.
Mean Shift = 2.3 → Groups are clearly separated; t-test reveals a statistically significant difference.
Tools & Packages Used: R, readr, ggplot2, DescTools, scales
This project focuses on analyzing how long abandoned vehicles have been reported parked, using the dataset Abandoned_Vehicles_Map.csv. The goal was to examine descriptive statistics for the entire dataset and multiple random samples of different sizes, to explore how sample size impacts the accuracy and stability of statistical estimates.
1. Data Import & Cleaning
Imported raw CSV data using read_csv().
Selected the column "How Many Days Has the Vehicle Been Reported as Parked?".
Removed NA values.
Filtered the dataset to keep only records between 1 and 365 days.
2. Descriptive Statistics
Used DescTools::Desc() to calculate key statistics including mean, median, standard deviation, range, skewness, and kurtosis.
Conducted this analysis for both the full dataset and random samples of sizes 5, 10, 25, 100, 500, 1,000, 3,000, and 7,500.
3. Sampling Process
Applied set.seed() to ensure reproducibility.
Used sample() to create random subsets from the cleaned dataset.
Compared sample statistics with population values to observe changes as the sample size increased.
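A rough Python sketch of the sampling comparison is shown below. The population is synthetic (an exponential-like stand-in for the days-parked column, clipped to 1 to 365), so the numbers are illustrative only. The seed value and the population size of 20,000 are assumptions.

```python
import random
import statistics

random.seed(123)  # fixed seed so the samples are reproducible

# Synthetic stand-in for the cleaned "days parked" column (1 to 365 days)
population = [
    min(365, max(1, round(random.expovariate(1 / 30)))) for _ in range(20000)
]
population_mean = statistics.mean(population)

sample_sizes = [5, 10, 25, 100, 500, 1000, 3000, 7500]
errors = {}
for n in sample_sizes:
    sample = random.sample(population, n)  # sampling without replacement
    errors[n] = abs(statistics.mean(sample) - population_mean)
    print(f"n = {n:>5}: |sample mean - population mean| = {errors[n]:.2f}")
```

Re-running with different seeds shows the small-sample errors changing widely while the large-sample errors stay close to zero.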
Small samples (n = 5, 10) produced unstable and inconsistent results.
Medium to large samples (n ≥ 500) closely approximated the full population statistics.
Results illustrated the law of large numbers, showing that larger sample sizes improve the accuracy of population parameter estimates.
Tools & Packages Used: R, tidyverse, usdata, plotly, highcharter, MazamaSpatialUtils
Project Overview
This project investigates demographic changes at the county level across Texas between 2000 and 2017. Utilizing publicly available county population data, the analysis aimed to visualize population growth trends and geographic patterns through both static and interactive visualizations. The goal was to provide insights into which counties experienced significant population increases or declines, supporting urban planning and policy discussions.
Key Steps and Analysis
Data Preparation:
Loaded and examined the county dataset from the usdata package.
Filtered the data to Texas counties, selected population figures for 2000 and 2017, and calculated the population change.
Exploratory Visualization:
Created scatterplots comparing 2000 and 2017 county populations using base R and ggplot2.
Applied logarithmic transformations to better visualize the wide range of population sizes.
Added linear regression lines to identify overall trends in population growth.
Interactive Visualizations:
Developed interactive scatterplots using plotly for dynamic data exploration by users.
Built an interactive county-level choropleth map of Texas using highcharter, visualizing percentage population change with a color gradient to easily identify growth or decline regions.
Geospatial Data Integration:
Merged population data with spatial county codes from the MazamaSpatialUtils package to accurately join data for mapping.
Techniques Applied
Data filtering and transformation with tidyverse
Log-scale plotting for better distribution understanding
Interactive visualization with plotly and highcharter
Choropleth mapping to convey geographic variation
Linear regression trend analysis
Key Takeaways
Most Texas counties exhibited population growth from 2000 to 2017, with some regions growing significantly faster than others.
Log transformation revealed clearer patterns by reducing skew caused by highly populous urban counties.
Interactive maps and plots enhance understanding of spatial and temporal trends, supporting data-driven decision-making in regional planning.
Tools & Packages Used: R, tidyverse, tidycensus, tigris, sf, forcats, ggplot2
Project Overview
This project aims to analyze and visualize key socioeconomic indicators at the census tract level within Lubbock County, Texas, using 2019 5-year American Community Survey (ACS) data. The focus variables include median home value, median age, and median household income. By integrating spatial geometry with demographic data, the study provides detailed geographic insights into income distribution, housing market values, and age demographics across the county’s neighborhoods.
Key Steps and Analysis
Data Acquisition and Preparation:
Retrieved ACS 5-year estimates for census tracts in Lubbock County for variables related to median home value, median age, and median household income.
Used the tidycensus package to pull both attribute data and tract geometries for spatial mapping.
Cleaned and transformed the variable names for clarity using factor recoding.
Data Transformation:
Dropped spatial geometry for summary statistics computations.
Reshaped data from long to wide format to facilitate comparative analysis of variables.
Calculated mean values for median age, household income, and home value across all tracts.
Exploratory Data Visualization:
Created histograms to explore the distribution of the three variables, showing variation across census tracts.
Developed thematic maps using ggplot2 and sf to visualize spatial patterns of median age, median household income, and median home value.
Applied color gradients via scale_fill_distiller with distinct palettes to intuitively represent the spatial variations in each socioeconomic indicator.
Techniques Applied
Accessing and integrating spatial ACS data using tidycensus and sf.
Data cleaning and factor recoding for better variable handling.
Pivoting data between long and wide formats using tidyverse tools.
Summary statistics for central tendency measures.
Geospatial visualization of demographic data through choropleth maps.
Faceted histograms for variable distribution analysis.
Key Takeaways
Median home values, household incomes, and median age show clear spatial disparities within Lubbock County census tracts.
The maps reveal neighborhoods with higher incomes and home values, helping identify potentially affluent areas.
Age distribution maps highlight demographic patterns that can inform urban planning, social services, and economic development efforts.
The combination of spatial and statistical analysis offers a comprehensive understanding of community characteristics at a fine geographic scale.
Tools & Packages Used: R, ggplot2, lubridate, leaflet, scales, plotly, readr
Project Overview
This project analyzes abandoned vehicle cases in Chicago using 311 Service Request data. The analysis focuses on understanding patterns in vehicle abandonment, such as duration parked, most common wards, seasonal trends, and geographic distribution. The dataset includes completion dates, vehicle makes, location details, and the number of days vehicles were reported as parked before removal.
Key Steps and Analysis
Data Import and Cleaning
Imported the dataset Abandoned_Vehicles_Map.csv containing Chicago 311 abandoned vehicle records.
Selected relevant columns: completion date, vehicle make/model, days parked, ZIP code, ward, and coordinates.
Removed incomplete cases and filtered out unrealistic durations (0 days or more than 365 days).
Converted categorical fields (Ward, Make, ZIP) into factors and standardized date formats.
Sampling for Visualization
Created a 1% random sample of the dataset to speed up visualization while keeping the overall distribution broadly representative.
Exploratory Analysis
Generated histograms and density plots of days parked to visualize abandonment duration distribution.
Identified wards with the highest frequency of abandoned cars (Wards 45, 36, and 11) and analyzed them separately using boxplots and jitter plots.
Examined monthly trends by grouping cases by the month the vehicle was reported.
Visualization
Histogram & Density Plot: Showed the percentage distribution of cars based on the number of days abandoned.
Boxplots: Compared abandonment durations across most-affected wards.
Monthly Boxplots: Used Plotly for interactive visualization of abandonment patterns by month.
Leaflet Map: Mapped the geographic distribution of abandoned cars in the sample, with clickable markers showing vehicle make.
Techniques Applied
Data cleaning, factor conversion, and date parsing in R.
Random sampling for performance optimization.
Statistical visualization with ggplot2 and interactive graphics with plotly.
Spatial mapping using leaflet.
Seasonality and ward-level comparative analysis.
Key Takeaways
The most affected wards for abandoned cars were Ward 45, Ward 36, and Ward 11.
Many vehicles were abandoned for less than 40 days, but outliers show some stayed parked for nearly a year.
Certain months show higher abandonment activity, suggesting possible seasonal influences.
Mapping abandoned cars helps visualize clustering patterns and potential hotspots for city intervention.
Tools & Packages Used: R, tidyverse, mapview, sf, ggthemes, readr, lubridate
Data Source
IMSR_Incident_Locations_View: Final Occurrence — National Interagency Fire Center (NIFC) Incident Management Situation Reports (IMSR)
Project Overview
This project explores wildfire incidents across the United States using IMSR Final Occurrence data. The analysis identifies temporal trends, state-level patterns, and the scale of major fire events, while visualizing the geographic distribution of burned areas. Interactive mapping tools were used to enhance spatial insights into fire occurrence and severity.
Key Steps and Analysis
Data Import and Cleaning
Imported the wildfire dataset IMSR_Incident_Locations_View: Final_Occurrence.csv.
Selected relevant columns: longitude, latitude, fire size (acres), initial report date, and incident ID.
Removed records with missing values and standardized column formats.
Extracted the date component from initial_imsr_date and converted it to Date format.
Created month as a factor to track seasonal trends.
Derived a state variable from the first two characters of incident_id.
Temporal Analysis
Plotted a bar chart of wildfire frequency by month to identify peak fire seasons.
Used theme_minimal() for a clean visual presentation.
Major Fire Identification
Filtered fires larger than 10,000 acres to determine which states experienced significant fire events.
Highlighted the largest recorded fire per state where the size exceeded 30,000 acres.
Visualized these results using a bar chart with theme_economist() for clarity.
Distribution Analysis
Created a histogram of fire sizes (log-transformed x-axis) to better display the distribution of large and small events.
Applied theme_calc() for enhanced readability.
Geospatial Visualization
Converted the cleaned dataset into a spatial object (sf) using latitude and longitude.
Mapped wildfire incidents interactively with mapview, where point size and color reflected the area burned.
Allowed zoom and click interactions for detailed location-level exploration.
Techniques Applied
Data cleaning and transformation in R.
Date parsing and feature engineering (month, state).
Filtering and grouping for state-level largest fire identification.
Statistical visualization with ggplot2 and themed plots (theme_minimal, theme_economist, theme_calc).
Interactive mapping using mapview and spatial objects with sf.
Key Takeaways
Wildfires show clear seasonal trends, with certain months having significantly higher occurrence rates.
Some states experienced extremely large fires exceeding 300,000 acres, indicating critical wildfire-prone zones.
The majority of fire events were small, but a small percentage of large-scale fires account for a disproportionate amount of burned land.
Interactive spatial mapping provides powerful visual context for understanding wildfire clustering and intensity.
Tools Used: Landsat 8 (OLI/TIRS), ENVI, ArcGIS Pro
Project Type: Remote Sensing & Spectral Index Analysis
Project Overview
This project analyzed vegetation health, water conditions, and urban development in southern Bangladesh using Landsat 8 multispectral imagery. By applying NDVI, NDWI, and NDBI, I identified spatial patterns of vegetation density, water turbidity, and built-up surfaces such as Dhaka’s urban core.
Project Highlights
Processed Landsat 8 imagery, which provides 11 spectral bands with 30 m resolution in the multispectral bands.
Created true color and false-color composites (NIR and SWIR enhanced) to distinguish vegetation, soil, and water features.
Extracted spectral profiles for vegetation, urban areas, and water bodies to compare reflectance behavior across wavelengths.
Calculated NDVI, NDWI, and NDBI to map vegetation vigor, water presence, and urban density.
Produced a combined RGB (NDBI : NDVI : NDWI) composite to highlight clear water, sediment-laden water, bare soil, dense vegetation, and built-up regions.
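The three indices are all normalized differences of two bands, which the sketch below makes explicit for a single pixel. The source does not say which NDWI variant was used, so the McFeeters form (Green and NIR) is an assumption here, and the reflectance values are hypothetical. For Landsat 8, Green, Red, NIR, and SWIR1 correspond to bands 3, 4, 5, and 6.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0

def spectral_indices(green, red, nir, swir1):
    """Return NDVI, NDWI (McFeeters form), and NDBI for one pixel."""
    return {
        "NDVI": normalized_difference(nir, red),    # vegetation vigor
        "NDWI": normalized_difference(green, nir),  # open water
        "NDBI": normalized_difference(swir1, nir),  # built-up surfaces
    }

# Hypothetical surface reflectance values (not taken from the project scene)
vegetation = spectral_indices(green=0.08, red=0.05, nir=0.45, swir1=0.20)
water = spectral_indices(green=0.06, red=0.04, nir=0.02, swir1=0.01)
print(vegetation)
print(water)
```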
Skills & Concepts Demonstrated
Spectral index modeling (NDVI, NDWI, NDBI)
Interpretation of reflectance curves across visible, NIR, and SWIR bands
False-color compositing for land-cover differentiation
Remote sensing workflow from preprocessing to visualization
Tools Used: ENVI, Landsat 7 & Landsat 8
Project Type: Remote Sensing | Post-Classification Change Detection
Project Overview
This project analyzed land-use and land-cover change across the Dallas-Fort Worth region by comparing classified Landsat images from 2000 (Landsat 7) and 2023 (Landsat 8). Using supervised Maximum Likelihood Classification and a post-classification comparison, the study quantified urban expansion and identified major landscape transitions over 23 years.
Project Highlights
Created independent land-cover classifications for 2000 and 2023 using four classes: High-Density Urban, Low-Density Urban, Non-Urban, and Water.
Collected 20-40 ROI samples per class (≥5,000 pixels each) to train Maximum Likelihood models.
Detected and quantified changes using ENVI’s Change Detection Statistics Matrix.
Identified that approximately 34.1% of Non-Urban pixels transitioned to Urban areas, indicating significant urban growth.
Generated a thematic change map highlighting spatial locations where Non-Urban areas converted to High- or Low-Density Urban development.
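Post-classification comparison comes down to cross-tabulating the class of each pixel in the two dates. The sketch below shows that idea on a tiny hypothetical grid. It is not ENVI's implementation, and the function names and class codes are invented for the example.

```python
from collections import Counter

def change_matrix(before, after):
    """Count pixels for every (class in `before`, class in `after`) pair."""
    pairs = Counter()
    for row_before, row_after in zip(before, after):
        pairs.update(zip(row_before, row_after))
    return pairs

def percent_transition(pairs, from_class, to_classes):
    """Share of `from_class` pixels that ended up in any of `to_classes`."""
    total = sum(count for (src, _), count in pairs.items() if src == from_class)
    moved = sum(pairs[(from_class, dst)] for dst in to_classes)
    return 100.0 * moved / total if total else 0.0

# Tiny hypothetical classified grids: N = Non-Urban, L = Low-Density Urban,
# H = High-Density Urban, W = Water
before = ["NNNW", "NNLW", "NLHH"]
after = ["NLLW", "NLLW", "HHHH"]

pairs = change_matrix(before, after)
non_urban_to_urban = percent_transition(pairs, "N", ["L", "H"])
print(f"Non-Urban to Urban: {non_urban_to_urban:.1f}%")
```

The same percentage logic underlies a statement such as the Non-Urban to Urban share reported above, although the real matrix is built from full classified scenes.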
Skills & Concepts Demonstrated
Supervised image classification (Maximum Likelihood)
ROI sampling and multispectral interpretation
Automated post-classification change detection
Analysis of spectral and thematic transitions over time
Visualization and interpretation of urban expansion patterns
Tools Used: ENVI, Landsat 8
Project Type: Remote Sensing | Supervised Image Classification
Project Overview
This project involved performing supervised land-use/land-cover classification on a Landsat 8 image of Lubbock, Texas (August 16, 2023). Using training samples and two classification algorithms—Minimum Distance and Maximum Likelihood—I evaluated how different methods distinguish key land-cover types such as cropland, shrubland, water, urban areas, clouds, and shadows.
Project Highlights
Collected ROIs for seven classes: Urban, Active Cropland, Fallow Cropland, Shrubland, Water, Cloud, and Shadow.
Ensured each class met pixel count requirements (≥2,000 per class; ≥1,200 for water).
Generated and compared two supervised classifications: Minimum Distance and Maximum Likelihood.
Observed that Maximum Likelihood performed better due to its consideration of class variance and covariance.
Identified common misclassifications, such as urban vs. bare soil and water vs. shadow, due to overlapping spectral signatures.
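The effect of class variance can be seen in a deliberately simplified, single-band sketch. The real classifiers work on multiband means and covariance matrices. The class statistics and pixel value below are hypothetical and chosen so that the two rules disagree.

```python
from statistics import NormalDist

# Hypothetical single-band training statistics (mean, standard deviation)
classes = {
    "Water": NormalDist(mu=0.20, sigma=0.02),            # spectrally tight
    "Fallow Cropland": NormalDist(mu=0.40, sigma=0.15),  # spectrally variable
}

def minimum_distance(value):
    """Assign the class whose mean is closest to the pixel value."""
    return min(classes, key=lambda name: abs(value - classes[name].mean))

def maximum_likelihood(value):
    """Assign the class with the highest Gaussian likelihood (equal priors)."""
    return max(classes, key=lambda name: classes[name].pdf(value))

pixel = 0.28
print("Minimum Distance:  ", minimum_distance(pixel))
print("Maximum Likelihood:", maximum_likelihood(pixel))
```

The pixel lies closer to the Water mean, so Minimum Distance picks Water. Maximum Likelihood picks Fallow Cropland, because the pixel is four standard deviations from the tight Water class but under one from the variable cropland class.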
Skills & Concepts Demonstrated
ROI sampling and spectral signature interpretation
Supervised classification (Minimum Distance & Maximum Likelihood)
Comparison of classifier performance
Understanding spectral confusion between land-cover types
Strategies for improving classification accuracy, including additional training samples, spectral indices, and advanced classifiers
Tools Used: Landsat 8 (TIRS), ENVI
Project Type: Thermal Remote Sensing & Brightness Temperature Analysis
Project Overview
This project used Landsat 8 thermal bands to analyze wildfire conditions in New South Wales, Australia during the 2019 fire season. By converting thermal DN values to Top-of-Atmosphere Radiance and Brightness Temperature, the study mapped temperature variation across burned areas, forests, water bodies, rangeland, urban surfaces, and active fire locations.
Project Highlights
Stacked and organized eight Landsat 8 spectral bands, creating linked true-color, SWIR false-color, and thermal (TIRS1) views.
Identified active fire zones, where SWIR2 appeared extremely bright because thermal emission from a source near 1,000 K peaks at about 2.9 µm (Wien’s Law), close to the SWIR2 band.
Converted TIRS1 DN → TOA Radiance → Brightness Temperature using metadata constants (ML, AL, K1, K2).
Extracted temperatures for multiple land covers, showing clear contrasts: Forest (~294 K), Burned forest (~307 K), Rangeland (~315 K), Urban (~307 K), Water (~293 K), Active fire (~368 K)
Demonstrated how emissivity differences influence observed thermal patterns between vegetation, burned areas, built environments, and water.
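The DN to radiance to brightness temperature chain can be sketched in a few lines of Python. The constants below are the values that typically appear for Band 10 (TIRS1) in a Landsat 8 MTL file and are an assumption here, so they should be checked against the metadata of the actual scene. The DN value is hypothetical.

```python
import math

# Landsat 8 Band 10 constants as they typically appear in the MTL file;
# confirm against the metadata of the actual scene before use.
ML = 3.3420e-04   # RADIANCE_MULT_BAND_10
AL = 0.1          # RADIANCE_ADD_BAND_10
K1 = 774.8853     # K1_CONSTANT_BAND_10
K2 = 1321.0789    # K2_CONSTANT_BAND_10

def dn_to_radiance(dn):
    """TOA spectral radiance in W/(m^2 sr um): L = ML * DN + AL."""
    return ML * dn + AL

def radiance_to_brightness_temp(radiance):
    """At-sensor brightness temperature in kelvin: T = K2 / ln(K1 / L + 1)."""
    return K2 / math.log(K1 / radiance + 1.0)

def wien_peak_wavelength_um(temperature_k):
    """Wavelength of peak blackbody emission in micrometres (Wien's law)."""
    return 2897.8 / temperature_k

dn = 25000  # hypothetical digital number
radiance = dn_to_radiance(dn)
temperature = radiance_to_brightness_temp(radiance)
print(f"L = {radiance:.3f} W/(m^2 sr um), T = {temperature:.1f} K")
print(f"Peak emission at 1000 K: {wien_peak_wavelength_um(1000):.2f} um")
```

Brightness temperature assumes an emissivity of 1, which is why emissivity differences between surfaces affect the values listed above.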
Skills & Concepts Demonstrated
Thermal infrared physics (emitted radiation, Planck & Wien principles)
TOA radiance and brightness temperature calculations
SWIR–TIR wildfire detection
Interpretation of land-cover temperature variability and emissivity effects
Tools Used: USGS EarthExplorer, ENVI
Project Type: Remote Sensing | Data Acquisition & False-Color Visualization
Project Overview
This project focused on learning how to search, download, and prepare Landsat Collection 2 Level-2 Surface Reflectance imagery using USGS EarthExplorer. After obtaining the complete Landsat 8 product bundle, I created an NIR false-color composite to visualize vegetation health and land-cover patterns.
Project Highlights
Searched for Landsat 8/9 Level-2 Surface Reflectance imagery using EarthExplorer’s location, date range, and dataset filters.
Downloaded and extracted a full Landsat product bundle containing 22 files, including metadata and individual spectral bands.
Loaded bands in ENVI and generated a NIR false-color composite (R: NIR, G: Red, B: Green) to highlight vegetation.
Used the USGS Bulk Download Application to practice large-scale automated downloads for multiple scenes and surface reflectance bands.
Verified successful order processing through a confirmation email containing order details and dataset metadata.
Skills & Concepts Demonstrated
Remote sensing data search & acquisition workflow
Understanding Landsat data structure and file organization
NIR false-color composite creation for vegetation analysis
Use of USGS Bulk Download Application for large datasets
Tools Used: Landsat 9, ENVI
Project Type: Remote Sensing | Spectral Analysis & Water Quality Assessment
Project Overview
This project analyzed pre- and post-storm spectral signatures in the Derna region of Libya following Storm Daniel (September 2023). Using Landsat 9 Surface Reflectance imagery, I extracted and compared spectral profiles for vegetation and coastal water to quantify environmental changes caused by flooding and sediment transport.
Project Highlights
Extracted a spectral profile for a healthy crop pixel, showing classic vegetation behavior: low visible reflectance due to chlorophyll absorption, a strong NIR peak from internal leaf scattering, and reduced SWIR reflectance from water absorption.
Created linked ENVI views to compare water spectral profiles from before (Sept 2) and after (Sept 18) Storm Daniel.
Observed a major increase in visible-band reflectance after the storm, caused by suspended sediment and organic matter introduced by flooding.
Analyzed two different September 18 water locations, showing how sediment concentration varies spatially.
Identified NIR and SWIR differences related to turbidity, surface roughness, and particle load.
Skills & Concepts Demonstrated
Spectral profile extraction for vegetation and water features
Understanding visible, NIR, and SWIR responses to chlorophyll, water absorption, and turbidity
Environmental change detection using multispectral data
Interpretation of sediment-rich vs. clear-water spectral signatures
Tools Used: Landsat 8 Surface Reflectance, ENVI
Project Type: Remote Sensing | Vegetation Index Modeling
Project Overview
This project used Landsat 8 Surface Reflectance imagery to analyze vegetation conditions in Lubbock, Texas using two vegetation indices: Simple Ratio (SR) and Normalized Difference Vegetation Index (NDVI). The goal was to compare how each index behaves numerically and how effectively they distinguish vegetation from other land-cover types.
Project Highlights
Created SR and NDVI layers from Landsat 8 multispectral data and compared their statistical ranges.
Identified expected vegetation behavior: cropland showed high SR and high NDVI, shrubland showed moderate values, and water returned negative NDVI.
Demonstrated the numeric differences between indices: SR values can exceed 5, while NDVI is normalized between −1 and 1.
Ran a short NDVI time-series showing clear seasonal patterns, with vegetation greening during summer months and declining in cooler seasons.
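The numeric relationship between the two indices can be checked directly: NDVI is a rescaling of SR, since NDVI = (SR − 1) / (SR + 1). The sketch below uses hypothetical reflectance pairs for three cover types rather than values from the Lubbock scene.

```python
def simple_ratio(nir, red):
    """SR = NIR / Red (unbounded above)."""
    return nir / red

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), bounded between -1 and 1."""
    return (nir - red) / (nir + red)

# Hypothetical reflectances for three cover types: (nir, red)
samples = {
    "cropland": (0.50, 0.05),
    "shrubland": (0.30, 0.15),
    "water": (0.02, 0.04),
}
for name, (nir, red) in samples.items():
    sr = simple_ratio(nir, red)
    # Same information, different scale: NDVI = (SR - 1) / (SR + 1)
    from_sr = (sr - 1) / (sr + 1)
    print(f"{name}: SR = {sr:.2f}, NDVI = {ndvi(nir, red):.2f}, NDVI from SR = {from_sr:.2f}")
```

Because one index is a monotonic function of the other, they rank pixels identically. The practical difference is the bounded, symmetric scale of NDVI.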
Skills & Concepts Demonstrated
Vegetation index computation and interpretation
Comparison of normalized vs. unbounded spectral ratios
Linking index values to land-cover characteristics
Seasonal vegetation analysis using NDVI time-series
Tools Used: ENVI, NAIP Aerial Imagery, Landsat 8
Project Type: Remote Sensing | Image Resolution & Sensor Characteristics
Overview
This project examined how digital remote sensing images are acquired and stored by comparing high-resolution NAIP imagery with multispectral Landsat 8 data. The analysis focused on brightness values, radiometric resolution, and how spatial resolution affects the ability to identify ground features.
Key Highlights
Identified low, medium, and high brightness values in the NIR band and estimated an 8-bit radiometric resolution.
Compared true-color and false-color composites to distinguish grass, trees, water, and synthetic turf.
Evaluated how NAIP’s fine spatial resolution enables feature recognition, while Landsat’s coarse resolution limits detail.
Summarized major differences between NAIP and Landsat 8 in spatial, spectral, temporal, and radiometric characteristics.
Skills & Concepts Demonstrated
Understanding spatial, spectral, temporal, and radiometric resolutions
Interpreting brightness values and NIR reflectance
Building and comparing true-color and false-color composites
Feature identification at different spatial resolutions
Reading and interpreting metadata for satellite imagery
Tools Used: Python, ArcPy, ArcGIS Pro
Project Type: Geospatial Programming | Workflow Automation
Project Overview
This project automated a data preparation workflow for transportation feature classes stored in a geodatabase. Using ArcPy, the script dynamically lists and describes feature classes, checks their coordinate systems, and conditionally copies or projects each dataset into a new geodatabase and feature dataset so all layers share a common projected coordinate system.
Project Highlights
Dynamically listed feature classes using ListFeatureClasses().
Read spatial reference properties and WKID values using arcpy.da.Describe().
Programmatically created a new file geodatabase and feature dataset.
Applied conditional logic:
If the WKID matched the target coordinate system, the feature class was copied.
Otherwise, the feature class was projected to the target coordinate system.
Built outputs using robust path handling (os.path.join) and enabled overwrite for repeatable runs.
Skills & Concepts Demonstrated
ArcPy automation, geodatabase schema creation, spatial reference validation, conditional geoprocessing (copy vs. project), and reproducible GIS workflow scripting.
Project Type: Environmental Data Analysis | Geospatial Visualization
Tools Used: Python, Jupyter Notebook, Pandas, Matplotlib
Study Area: Dallas-Fort Worth, Texas
This project analyzed spatial and temporal patterns of fine particulate matter (PM2.5) concentrations across the Dallas–Fort Worth metropolitan area using Python-based data analysis. The workflow focused on data cleaning, statistical exploration, and visualization to assess air quality variability and identify periods of elevated particulate pollution in an urban environment.
Key Highlights
Processed and cleaned PM2.5 monitoring data using Pandas
Analyzed temporal trends to identify periods of elevated particulate concentration
Visualized PM2.5 distributions and time-series patterns with Matplotlib
Interpreted results in relation to urban air quality and environmental conditions
Air quality and environmental data analysis
Python data processing and time-series analysis
Data visualization with Matplotlib
Interpretation of environmental monitoring data
Reproducible analysis using Jupyter Notebook
Description
Urban road network and travel corridors within the GeoLife GPS dataset study region.
Project Description
This project analyzed human mobility patterns using GPS point data from the GeoLife dataset. Average Nearest Neighbor (ANN) analysis was applied to evaluate whether movement points were clustered, random, or dispersed. Distance-based visualization was also used to examine how spatial influence changes with different neighborhood parameters.
Key Tasks
Mapped GPS trajectory points representing human movement
Applied Average Nearest Neighbor (ANN) analysis to assess spatial clustering
Interpreted observed vs. expected distances and statistical significance
Visualized distance-based surfaces using different parameter values
Identified dense activity zones and recurring travel routes
Key Findings
Observed mean distance was far smaller than expected under randomness
ANN ratio indicated strong clustering
Extremely low p-value confirmed statistical significance
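The quantities behind these findings can be sketched in Python using the standard Average Nearest Neighbor formulas (expected distance 0.5 / sqrt(n / A) and standard error 0.26136 / sqrt(n² / A)). The points below are synthetic clusters, not GeoLife data, and the study area is passed in explicitly as an assumption.

```python
import math
import random

def average_nearest_neighbor(points, area):
    """Return (observed mean distance, expected mean distance, ratio, z-score)."""
    n = len(points)
    nearest = []
    for i, (x1, y1) in enumerate(points):
        nearest.append(min(
            math.hypot(x1 - x2, y1 - y2)
            for j, (x2, y2) in enumerate(points) if i != j
        ))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)           # mean distance under randomness
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed, expected, observed / expected, (observed - expected) / std_error

# Hypothetical points: two tight clusters inside a 100 x 100 study area
rng = random.Random(0)
points = [
    (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
    for cx, cy in [(20, 20), (80, 80)]
    for _ in range(50)
]

observed, expected, ratio, z = average_nearest_neighbor(points, area=100 * 100)
print(f"observed = {observed:.2f}, expected = {expected:.2f}, ratio = {ratio:.2f}, z = {z:.1f}")
```

A ratio below 1 with a z-score below −2.58 indicates clustering at the 99% confidence level. The result is sensitive to the area value, so the study area should be chosen deliberately.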
Tools Used
ArcGIS Pro, Spatial Statistics (ANN)
Tools Used: Python, ArcPy, ArcGIS Pro
Project Type: Geospatial Programming | Raster Analysis Automation
Study Area: Lubbock, Texas, USA
Project Overview
This project applied Python-based raster processing techniques to standardize spatial resolution, reclassify raster values, and perform map algebra operations for spatial analysis in the Lubbock, Texas region. The workflow automated resampling, conditional evaluation, and raster transformation to ensure consistent and reproducible spatial analysis.
Project Highlights
Resampled raster datasets to a common spatial resolution using appropriate interpolation methods for continuous and categorical data
Reclassified raster values into meaningful classes while avoiding overlapping ranges
Applied map algebra expressions to generate new raster layers from arithmetic and logical operations
Used conditional evaluation to assign output values based on raster thresholds
Automated raster processing tasks using ArcPy for efficient and repeatable workflows
Skills & Concepts Demonstrated
Raster resampling and data-type awareness
Reclassification and remap strategies
Map algebra and conditional raster operations
ArcPy-based geospatial automation
Reproducible spatial analysis workflows
Study Area
Southeastern South Asia, including southern Bangladesh, eastern India, and western Myanmar along the Bay of Bengal.
Data Used
Global Human Settlement Layer (GHSL) population GeoTIFFs for 1975 and 2030 (500 m resolution).
Project Overview
This project used NumPy and ArcPy to process raster-based population data and analyze population change over time. GHSL population rasters were converted to NumPy arrays, resampled using different methods, and compared to evaluate how resampling techniques influence percent change estimates.
Key Tasks
Converted GeoTIFF population rasters to NumPy arrays
Examined raster projection, resolution, and value distributions
Applied Nearest Neighbor and Bilinear resampling
Calculated percent population change using raster map algebra
Compared and reclassified change results to identify growth patterns
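The percent-change step can be illustrated cell by cell in plain Python. This is a minimal sketch on hypothetical 2 x 3 grids, not the project's NumPy/ArcPy code. How to treat cells with zero starting population is an assumption (here they are marked as no-data).

```python
def percent_change(old, new, nodata=None):
    """Cell-by-cell percent change between two equally sized grids.

    Cells with zero population in the earlier grid are set to `nodata`
    because percent change is undefined there.
    """
    result = []
    for old_row, new_row in zip(old, new):
        result.append([
            (n - o) / o * 100.0 if o != 0 else nodata
            for o, n in zip(old_row, new_row)
        ])
    return result

# Hypothetical 2 x 3 population grids (people per cell)
pop_1975 = [[100.0, 0.0, 50.0], [200.0, 80.0, 10.0]]
pop_2030 = [[150.0, 20.0, 50.0], [100.0, 120.0, 40.0]]

change = percent_change(pop_1975, pop_2030)
for row in change:
    print(row)
```

Because the formula divides by the earlier value, any resampling method that alters small cell values (for example, bilinear smoothing) can change the percent-change result noticeably.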
Skills Demonstrated
NumPy-based raster analysis, ArcPy automation, resampling effects, map algebra, raster reclassification, and reproducible geospatial workflows.
Tools Used: ArcGIS Pro, Python, ArcPy
Project Type: Geospatial Programming | Machine Learning Regression
Study Area: California, USA
Project Overview
This project applied forest-based regression models to predict median housing values across California, comparing non-spatial and spatial approaches using socioeconomic and proximity-based variables.
Key Tasks
Prepared the California Housing Dataset and created spatial point features
Built a non-spatial forest-based regression model (Validation R² = 0.62)
Developed a spatial regression model incorporating distance to amenities (Validation R² = 0.66)
Evaluated model performance and mapped predicted housing values statewide
Key Variables
Median income as the strongest predictor of housing value
Housing age reflecting development patterns
Population with moderate influence
Ocean proximity indicating a coastal price premium
Skills Demonstrated
Forest-based regression, spatial vs. non-spatial modeling, ArcPy automation, feature engineering with proximity variables, model evaluation, and cartographic visualization.
Study Area
Flood-prone regions represented in the COLFloodZones shapefile.
Project Description
This project used ArcPy to explore flood zone spatial data and automate the creation of multiple buffer zones for proximity and risk analysis. The workflow combined dataset inspection, spatial metadata extraction, and scripted buffer generation to efficiently analyze spatial relationships around flood-prone areas.
Key Tasks
Examined shapefile properties using ArcPy, including dataset name, data type, spatial reference, extent, and attribute fields
Programmatically accessed and reviewed spatial metadata for quality and consistency checks
Generated a series of buffer distances using Python’s range() function
Created multiple concentric buffer zones by automating the buffer tool with a for loop in ArcGIS Pro
Visualized and verified buffer outputs to support flood risk and proximity assessment
Tools Used
Python, ArcPy, ArcGIS Pro
Study Area
Lubbock, Texas area (point locations provided in WGS 84 coordinates).
Project Description
This project used Python and Matplotlib to visualize the spatial distribution of point locations and analyze how house values influence geographic centrality. I mapped house locations, calculated the mean center, summarized coordinate ranges, and created density plots to show clustering patterns.
Key Tasks
Plotted house locations from latitude/longitude point lists
Calculated the mean center and mapped a bounding box (min/max lat/long)
Visualized point distribution using histograms (lat and lon)
Created density surfaces using hexbin and 2D histogram heatmaps
Interpreted spatial clustering patterns and how values can shift the center (weighted mean concept)
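The mean center, weighted mean center, and bounding box can be computed with a few lines of Python. The coordinates and house values below are hypothetical. Treating longitude and latitude as planar coordinates is an approximation that is reasonable only over a small area such as a single city.

```python
def mean_center(points, weights=None):
    """Return the (lon, lat) mean center, optionally weighted."""
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    lon = sum(w * p[0] for p, w in zip(points, weights)) / total
    lat = sum(w * p[1] for p, w in zip(points, weights)) / total
    return lon, lat

def bounding_box(points):
    """Return (min_lon, min_lat, max_lon, max_lat)."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    return min(lons), min(lats), max(lons), max(lats)

# Hypothetical house locations (lon, lat) and values in dollars
houses = [(-101.90, 33.55), (-101.85, 33.60), (-101.80, 33.50), (-101.95, 33.65)]
values = [150_000, 300_000, 100_000, 450_000]

print("Mean center:         ", mean_center(houses))
print("Weighted mean center:", mean_center(houses, values))
print("Bounding box:        ", bounding_box(houses))
```

In this example the weighted center moves toward the north-west, where the most expensive house sits, which is the shift the weighted mean concept describes.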
Tools Used
Python, Matplotlib, NumPy
Project Type: Geoprocessing Automation | ModelBuilder Workflow Design
Tools Used: ArcGIS Pro, ModelBuilder
Study Area: Marlborough Region, New Zealand
Project Description
This project developed a reusable ModelBuilder geoprocessing tool to automate proximity-based analysis of invasive grass species near human contact locations. The model streamlines buffering and spatial summarization to support biosecurity and environmental risk assessment.
Key Tasks
Built a ModelBuilder workflow combining Pairwise Buffer and Summarize Within tools
Parameterized inputs for human contact locations, invasive species, buffer distance, and output units
Automated buffer creation and area-based summarization of invasive species within proximity zones
Configured model environments and outputs for reproducible analysis
Executed and validated the model as a custom geoprocessing tool
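For readers who prefer a script to a model diagram, the same two-step chain can be approximated in ArcPy. This is a sketch of an equivalent workflow, not an export of the actual model: the function name, parameter names, default distance, and default area unit are assumptions. `arcpy` is imported inside the function so the small helper can be run without ArcGIS Pro.

```python
def linear_unit(distance, unit="Meters"):
    """Format a buffer distance as a geoprocessing linear-unit string."""
    return f"{distance} {unit}"


def summarize_invasive_near_contacts(contact_fc, invasive_fc, out_buffer_fc,
                                     out_summary_fc, distance=500,
                                     unit="Meters", area_unit="HECTARES"):
    """Buffer contact locations, then summarize invasive species area within them.

    Requires an ArcGIS Pro environment. All paths are supplied by the caller.
    """
    import arcpy  # lazy import: only available where ArcGIS Pro is installed

    # Step 1: proximity zones around the human contact locations
    arcpy.analysis.PairwiseBuffer(contact_fc, out_buffer_fc,
                                  linear_unit(distance, unit))

    # Step 2: total invasive species area falling inside each buffer
    arcpy.analysis.SummarizeWithin(
        in_polygons=out_buffer_fc,
        in_sum_features=invasive_fc,
        out_feature_class=out_summary_fc,
        keep_all_polygons="KEEP_ALL",
        sum_shape="ADD_SHAPE_SUM",
        shape_unit=area_unit,
    )
    return out_summary_fc
```

The function arguments mirror the model parameters listed above (contact locations, invasive species, buffer distance, and output units).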
Skills Demonstrated
ModelBuilder automation, geoprocessing workflow design, parameterization, proximity analysis, spatial summarization, and reproducible GIS modeling.
Tools Used: Python, ArcPy, ArcGIS Pro
Project Type: Geospatial Automation & Raster Analysis
Project Overview
This project involved developing a Python script using ArcPy to automate the calculation of the Normalized Difference Vegetation Index (NDVI) from red and near-infrared (NIR) raster bands. The script is designed to function as an ArcGIS Pro tool, allowing users to compute NDVI efficiently through a parameterized interface.
Project Highlights
Built a reusable Python tool that accepts Red band, NIR band, and output path as user-defined parameters.
Implemented the NDVI formula, NDVI = (NIR − Red) / (NIR + Red), using ArcPy raster operations.
Integrated error handling and status messages for robust execution within ArcGIS Pro.
Automated raster processing to reduce manual GIS workflow steps and improve reproducibility.
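A script tool along these lines might look like the sketch below. It is an illustration under assumptions, not the project's script: the function names and messages are invented, and the Spatial Analyst extension is assumed to be available. `arcpy` is imported inside `compute_ndvi` so the single-pixel helper, which shows the same arithmetic on plain numbers, can be run anywhere.

```python
def ndvi_pixel(nir, red):
    """NDVI for one pair of band values; None where NIR + Red is zero."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def compute_ndvi(red_path, nir_path, out_path):
    """Compute NDVI from two raster bands and save it (requires ArcGIS Pro)."""
    import arcpy  # lazy import: only available where ArcGIS Pro is installed
    from arcpy.sa import Float, Raster

    arcpy.CheckOutExtension("Spatial")
    try:
        # Convert to float first so integer bands do not truncate the ratio
        red = Float(Raster(red_path))
        nir = Float(Raster(nir_path))
        ndvi = (nir - red) / (nir + red)
        ndvi.save(out_path)
        arcpy.AddMessage(f"NDVI raster saved to {out_path}")
    except arcpy.ExecuteError:
        # Report geoprocessing errors in the tool's message window
        arcpy.AddError(arcpy.GetMessages(2))
        raise
    finally:
        arcpy.CheckInExtension("Spatial")
```

In a script tool, the three arguments would typically come from `arcpy.GetParameterAsText(0)`, `(1)`, and `(2)`, matching the Red band, NIR band, and output path parameters.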
Skills & Concepts Demonstrated
ArcPy raster processing and automation
Python scripting for GIS tools
Geospatial workflow optimization
Integration of Python scripts into ArcGIS Pro toolboxes
Project Type: Climate Data Analysis | Geospatial Visualization
Tools Used: Python, ArcGIS Pro, ERA5 Reanalysis Data
Study Area: Global
Project Description
This project analyzed global wind direction variability from 1985 to 2024 using ERA5 reanalysis data. Circular statistics and geospatial processing were applied to quantify seasonal and decadal wind direction changes and identify regions experiencing significant directional shifts.
Key Tasks
Processed ERA5 10-m wind direction data into seasonal and decadal raster datasets
Applied circular statistics to compute mean wind direction and angular change
Mapped directional shifts as angular differences normalized to the range ±180°
Visualized global patterns with compass-based color ramps and contour outputs
Interpreted results in relation to atmospheric circulation and climate dynamics
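Directions need circular rather than ordinary arithmetic: the plain average of 350° and 10° is 180°, while the circular mean is 0° (north). The sketch below shows the two calculations named above on plain numbers. The function names are illustrative, and the project's raster implementation may differ.

```python
import math


def circular_mean(directions_deg):
    """Mean direction in degrees, computed from summed sine and cosine components."""
    sin_sum = sum(math.sin(math.radians(d)) for d in directions_deg)
    cos_sum = sum(math.cos(math.radians(d)) for d in directions_deg)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


def angular_difference(start_deg, end_deg):
    """Signed change from start to end, wrapped into the range [-180, 180)."""
    return (end_deg - start_deg + 180.0) % 360.0 - 180.0


# Hypothetical mean directions for two decades (illustration only)
early = circular_mean([350, 10, 355, 5])
late = circular_mean([20, 30, 25, 15])
print("Directional shift (degrees):", angular_difference(early, late))
```

Wrapping the difference keeps a change from 350° to 10° at +20° instead of -340°, so shifts across north are not exaggerated.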
Key Contributions
First global, multi-decadal assessment of wind direction variability using ERA5
Identified consistent directional shifts in mid-latitude and subtropical regions
Demonstrated the effectiveness of circular statistics for directional climate analysis