In this event, you will identify a societal issue related to the annual topic and create a detailed portfolio using data collected from various sources. You will also create a digital display about your findings. If you advance to the semifinals, you will create a digital display and synopsis of a data set provided in the on-site challenge. This is a partner event, so you can do it with a friend and split the work!
Submissions for this event are online. If you make it to the semifinals, you will have to report for the on-site challenge. Find the event rubric here: Event Rubrics & Forms. Past portfolios for this event are available here: Past Portfolios.
2025 - 2026 Theme
Identify and use a "Tourism"-related open-source data set for analysis and research. In the scientific poster, cite the source of the data, including the URL/domain and file format.
Understand basic analysis techniques and create clear charts or graphs to highlight important patterns and results. Python libraries like pandas and matplotlib are very helpful for more advanced data manipulation, analysis, and visualization. Tableau can be used to create interactive and visually appealing dashboards and data displays.
Learn how to gather relevant data and clean it by correcting errors and handling missing information so that it is ready for analysis. You can use Google Sheets and/or Excel for this part.
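If you would rather script the cleaning step than do it by hand in a spreadsheet, the idea can be sketched in Python. This is a minimal sketch using only the standard library, and it assumes a made-up CSV with the columns `destination`, `year`, and `visitors`; the place names and numbers are hypothetical, not real tourism data. It trims stray spaces, drops rows where the visitor count is missing or not a number, and removes exact duplicates.

```python
import csv
import io

# Hypothetical raw data for illustration only (not a real data set).
RAW_CSV = """destination,year,visitors
Springfield,2023,1200
Springfield,2023,1200
Rivertown,2023,
 lakeside ,2023,850
Hill Valley,2023,n/a
"""


def clean_rows(raw_text):
    """Return cleaned rows: tidy names, no missing counts, no duplicates."""
    reader = csv.DictReader(io.StringIO(raw_text))
    seen = set()
    cleaned = []
    for row in reader:
        # Fix formatting: remove extra spaces and standardize capitalization.
        destination = row["destination"].strip().title()
        visitors = row["visitors"].strip()
        # Skip rows whose visitor count is blank or not a whole number.
        if not visitors.isdigit():
            continue
        record = (destination, int(row["year"]), int(visitors))
        # Skip exact duplicates of a row that was already kept.
        if record in seen:
            continue
        seen.add(record)
        cleaned.append(
            {"destination": record[0], "year": record[1], "visitors": record[2]}
        )
    return cleaned


cleaned = clean_rows(RAW_CSV)
for row in cleaned:
    print(row)
```

Whatever tool you use, note in your portfolio which rows you removed and why, since the judges need to follow your methods. In pandas, `DataFrame.dropna()` and `DataFrame.drop_duplicates()` cover similar ground.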
You should know how to explain your analysis and results both in writing and out loud. Judges need to understand not just what you found, but how you found it and why it matters.
The rubric also evaluates the design and visual appeal of your presentation. It is helpful to know how to create attractive and clear presentations. If you are not sure how to do this, don't worry. There are many tutorials available on YouTube that can help you learn quickly.
Title page (team identification number, event title, year, state, and conference location)
Table of contents
Introduction and data overview
Data dictionary
Purpose or the importance of the chosen issue
Methods used to obtain the data
Results including data analysis and graphs
Conclusions
Next steps
Digital Scientific Poster
References
Appendix
Citation of all ideas, fonts, and images
Student copyright checklist
Consent and release forms
Dataset: A collection of data, often organized in rows and columns.
Observation: A single data point or record in a dataset.
Data Mining: The process of discovering patterns in large datasets.
Mean: The average value of a set of numbers.
Median: The middle value in an ordered dataset.
Mode: The most frequently occurring value in a dataset.
Correlation: A measure of how two variables are related to each other.
Outlier: A data point that is significantly different from others.
Trend: A pattern or general direction in the data over time.
Model: A simplified representation used to analyze data and make predictions.
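Several of the terms above can be computed in a few lines of Python. The sketch below uses the standard `statistics` module on two short, made-up lists (`visitors` and `hotel_bookings` are hypothetical values, not real data). The correlation is the Pearson coefficient written out by hand, and the outlier check uses the common "1.5 times the interquartile range" rule, which is one convention among several rather than the only definition of an outlier.

```python
import math
import statistics

# Hypothetical monthly values for illustration only (in thousands).
visitors = [12, 15, 15, 18, 22, 90]
hotel_bookings = [5, 8, 9, 14, 18, 21]

mean_v = statistics.mean(visitors)      # average value
median_v = statistics.median(visitors)  # middle value -> 16.5
mode_v = statistics.mode(visitors)      # most frequent value -> 15


def pearson(xs, ys):
    """Pearson correlation: +1 means the variables rise together, -1 opposite."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


r = pearson(visitors, hotel_bookings)

# Outliers by the 1.5 * IQR rule: values far outside the middle half of the data.
q1, _, q3 = statistics.quantiles(visitors, n=4)
iqr = q3 - q1
outliers = [v for v in visitors if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print("mean:", round(mean_v, 2))
print("median:", median_v)
print("mode:", mode_v)
print("correlation:", round(r, 2))
print("outliers:", outliers)
```

Notice how the single large value pulls the mean well above the median. That gap is a useful sign that you should check for outliers before you report an average.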
U.S. Census Bureau (https://data.census.gov)
Offers demographic, geographic, housing, and economic data about the U.S. population.
Our World in Data (https://ourworldindata.org)
Well-organized, research-based datasets on global issues like climate, health, food, and technology.
UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/index.php)
A classic collection of structured datasets often used in machine learning and data science.
Statista (https://www.statista.com)
Offers graphs and statistics from surveys, industries, and studies (some content requires a subscription).
CDC Data and Statistics (https://www.cdc.gov/datastatistics)
Health-related datasets, disease tracking, and public health statistics.
U.S. Government Open Data (data.gov)
A large collection of public datasets provided by U.S. government agencies, covering topics like health, environment, and education.
UN Data (https://data.un.org)
International statistics on population, economy, environment, and more from the United Nations.
Kaggle (kaggle.com)
A popular platform with many free datasets on a wide range of topics. It also offers tutorials and competitions to practice data science skills.
Google Dataset Search (datasetsearch.research.google.com)
A search engine specifically for finding datasets from across the web.
World Bank Open Data (data.worldbank.org)
Provides global development data and statistics useful for economic, social, and environmental projects.
Quandl (quandl.com)
Offers financial, economic, and alternative datasets, some free and some paid.
Awesome Public Datasets (github.com/awesomedata/awesome-public-datasets)
A GitHub repository listing hundreds of free datasets on many subjects.
Microsoft Excel or Google Sheets
These spreadsheet programs are great for organizing data in tables, performing calculations, sorting, filtering, and creating simple charts. They are beginner-friendly and useful for cleaning data by removing duplicates or fixing errors.
Tableau
Tableau is a data visualization software that lets you build interactive dashboards. It helps you present data insights visually and dynamically, making it easier for judges or audiences to explore the information through charts, maps, and filters without needing to code.
Google Slides, PowerPoint, or Canva
These tools help you create multimedia presentations to showcase your project. They offer templates, design features, and ways to embed images, charts, or videos, so your presentation looks polished and professional.
Google Docs or Microsoft Word
These word processing programs are useful for writing your portfolio, documenting your methods, explaining your analysis, and summarizing your findings. They help organize your project in a clear, readable format.
Python (with pandas and matplotlib libraries)
Python is a popular programming language for data science. The pandas library helps you manipulate and analyze data efficiently, while matplotlib allows you to create detailed graphs and visualizations. Python is powerful for handling large datasets and performing advanced analysis.
Jupyter Notebook
This is an interactive environment that lets you combine live code, visualizations, and explanatory text in one document. It’s especially useful if you use Python because you can show your entire data analysis process step-by-step.
OpenRefine
OpenRefine is a free tool specifically designed for cleaning and transforming messy data. It helps detect inconsistencies, fix formatting, and restructure data to make it easier to analyze.
YouTube Channels and Tutorials
Channels like “Data School,” “freeCodeCamp,” and “Corey Schafer” offer beginner-friendly lessons on data science tools like Python and Excel.
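For teams that choose the Python route listed above, a typical workflow is to load the data with pandas, summarize it, and draw a chart with matplotlib. The sketch below is one possible way to do that, not a required method. The column names (`region`, `year`, `visitors`) and all values are hypothetical, and it assumes both libraries are already installed. It draws the chart into memory so the script runs without opening a window.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render off-screen so no display window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data for illustration only (visitors in thousands).
df = pd.DataFrame(
    {
        "region": ["North", "South", "North", "South", "East", "East"],
        "year": [2022, 2022, 2023, 2023, 2022, 2023],
        "visitors": [120, 200, 150, 210, 90, 130],
    }
)

# Add up visitors per region and sort from largest to smallest.
totals = df.groupby("region")["visitors"].sum().sort_values(ascending=False)

# Draw a labeled bar chart; a clear title and axis labels help the judges.
fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(totals.index, totals.values)
ax.set_title("Visitors by region (hypothetical data)")
ax.set_xlabel("Region")
ax.set_ylabel("Visitors (thousands)")
fig.tight_layout()

# Save the chart as PNG bytes in memory.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)

print(totals)
```

To keep the image for your poster, pass a file name of your choice to `fig.savefig` instead of the buffer. In a Jupyter Notebook you can skip the `matplotlib.use("Agg")` line and the chart will usually appear below the cell.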