CensusConnect: Democratizing Data Access from the U.S. Census
This application bridges the gap between complex data and everyday users. This blog post demonstrates the user design process I followed to develop CensusConnect.
General Topic
My focus is on democratizing access to vital demographic and socioeconomic data from the American Community Survey (ACS). Currently, this crucial public information is locked behind complex technical barriers, variable codes, and unintuitive interfaces that make it challenging for many stakeholders to access and use effectively. This creates an equity issue in which only those with specific technical expertise can leverage a public resource. Researchers who have had the benefit of taking the right classes, having the right mentors, and being able to afford the time to make mistakes get a massive head start when working with this data.
Target Populations
Community organizations and nonprofits who need demographic data for grant writing and program planning but lack data science expertise
Policy makers and government officials at local levels who need accessible ways to understand their communities
Academic researchers and students who struggle with the current data access tools
Personal Motivation
My motivation stems from personal experience trying to access and work with ACS data for community research projects. I discovered that while this incredibly valuable data is technically "public," it remains practically inaccessible to many who could benefit from it. There are also few safeguards to ensure that retrieved data is consistent, which can leave researchers working with inaccurate data.
Current Issues
Having to understand complex variable codes that change across years
Struggling with massive data files with unclear column names
No built-in accessibility features for users with disabilities
Requiring significant technical expertise just to begin basic analysis
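To make the first two issues concrete, here is a minimal sketch of relabeling raw ACS column codes with readable names. The two variable codes are real ACS detailed-table estimates, but the `FRIENDLY_LABELS` lookup, the `relabel` helper, and the sample row are hypothetical illustrations, not part of CensusConnect or any Census library.

```python
# Hypothetical lookup from ACS variable codes to readable labels.
FRIENDLY_LABELS = {
    "B01003_001E": "Total population",
    "B19013_001E": "Median household income",
}


def relabel(row):
    """Return a copy of one data row with known codes swapped for labels.

    Codes that are not in the lookup are kept unchanged.
    """
    return {FRIENDLY_LABELS.get(code, code): value for code, value in row.items()}


# Made-up row for illustration only; the value is not real Census data.
raw_row = {"NAME": "Example State", "B01003_001E": "12345"}
print(relabel(raw_row))
```

A lookup like this would need one table per survey year wherever codes differ between years.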
From our assignment description:
Anytime you have a design idea that requires action or input from a variety of people, you need to consider all the stakeholders.
An app that prompts a person to exercise more only has 1 person that matters: the target users.
But an app that encourages someone to visit a doctor, or participate in a mentoring activity, or reach out for help actually has several. The direct stakeholder(s) are the "target user." Everyone else who cares or is involved in some way are "indirect stakeholders."
Your design must consider the costs and benefits to all direct and indirect stakeholders in order for your design to be successful.
Read: Article (Interaction Design Foundation) - "Mapping Stakeholders." How design teams identify direct and indirect stakeholders. https://www.interaction-design.org/literature/article/map-the-stakeholders
Tool: Stakeholder Analysis - once you know who your direct and indirect stakeholders are, consider the power dynamic. What issues or concerns do you need to be sensitive about? This interactive tool helps you determine who you need to keep happy. https://coast.noaa.gov/data/digitalcoast/pdf/stakeholder-analysis-worksheet.pdf
Competitive Analysis: ACS Data Access Tools and Solutions
1. Comprehensive Solution List
Official Tools
1. Census Data API
- Source: https://www.census.gov/data/developers/data-sets/acs-5year.html
- Features: Direct data access, comprehensive coverage
- Limitations: Requires programming knowledge, complex documentation
2. American FactFinder (Legacy)
- Source: https://www.census.gov/newsroom/press-releases/2020/american-factfinder-retiring.html
- Source: https://www.census.gov/programs-surveys/acs/
- Features: Web interface for data queries
- Note: Discontinued but important to analyze for historical context
3. data.census.gov
- Source: https://data.census.gov/
- Features: Current official interface, advanced filtering
- Limitations: Complex interface, limited visualization options, cannot select multiple years at once
- Note: This link leads to the tables view, which is the Census Bureau's best attempt at an accessible way to retrieve data
Third-Party Tools
4. IPUMS
- Source: https://usa.ipums.org/usa/
- Features: Harmonized data across years, detailed documentation
- Limitations: Requires account, focused on individual-level data
5. Social Explorer
- Source: https://www.socialexplorer.com/
- Features: Excellent visualization, user-friendly interface
- Limitations: Paid subscription, limited API access
6. Census Reporter
- Source: https://censusreporter.org/
- Features: User-friendly profiles, good documentation
- Limitations: Limited to recent years, basic visualizations
7. R Packages
- tidycensus: https://walker-data.com/tidycensus/
- censusapi: https://www.hrecht.com/censusapi/
- Features: Programmatic access, data cleaning tools
- Limitations: Requires R knowledge
8. Python Libraries
- cenpy: https://github.com/cenpy-devs/cenpy
- census: https://github.com/datamade/census
- Features: Python API wrapper, data processing tools
- Limitations: Technical expertise required
Social Explorer
Census Reporter
IPUMS
Semi-structured interviews are the most appropriate primary research method for this project because they give insight into the challenges and needs of different stakeholders working with ACS data. Through one-on-one conversations, I can learn about the specific technical and accessibility barriers that users face when working with census data, in more depth than my own experience provides.
My first batch of interviews targeted more advanced researchers, to identify priorities that would make this process easier for anyone at any expertise level.
I then conducted two follow-up interviews with users who have little to no experience working with data, walking them through the current U.S. Census data retrieval process. Below are the results.
Navigational Challenges (User 1 & User 2):
Observations:
Users without prior experience struggled to navigate the ACS website, often failing to intuitively locate required data. For example, User 1 had difficulty identifying that the Georgia data was already displayed, while User 2 noted frustrations with switching between tables and changing year settings.
The disconnect between user expectations and the website’s organization of data caused delays and confusion, especially when searching for state-specific or demographic-specific data.
For example, both users faced challenges identifying specific counties or states from broader lists or geographic entities.
Recommendations:
Introduce a guided navigation structure, such as a wizard or step-by-step prompts, to help users locate the correct geographic entity and data type.
Add persistent geographic context indicators (e.g., a breadcrumb trail showing selected state or county) to clarify what data is currently being displayed.
Provide direct links to commonly requested data, such as poverty rates by state, counties, or age groups, to reduce time spent navigating.
Search Functionality Issues (User 1):
Observations:
The search functionality did not direct users effectively to the desired information. For example, when User 1 searched for Utah counties, the results were broad and required additional manual filtering.
The presence of multiple similar options under the same keyword (e.g., poverty sub-genres) overwhelmed the user and increased the likelihood of selecting irrelevant data.
Recommendations:
Simplify the search result categorization by grouping related results under clear labels or categories (e.g., "Poverty Rates by Age" vs. "Poverty Rates by County").
Introduce advanced search filters (e.g., state, year, demographic) to narrow results and provide direct access to relevant tables.
Offer pre-configured search templates for common queries (e.g., "18+ male population by county") to eliminate repetitive navigation.
User Interface (UI) Complexity (User 1 & User 2):
Observations:
The UI lacks intuitive features for filtering and switching between data views. User 1 struggled with isolating rows (e.g., removing female and total categories) and noted that the process was tedious and error-prone.
User 2 expressed frustration when changing year settings, which inadvertently led to exiting the current table and restarting the search.
Recommendations:
Add a dynamic filtering tool that allows users to quickly isolate variables (e.g., male population, 18+ age group) without scrolling or manually deselecting columns.
Implement freeze panes for table headers and columns to improve navigation when working with large datasets.
Provide an "Undo" feature to correct accidental clicks or selections without restarting the process.
Challenges for New Researchers (User 2):
Observations:
A new researcher (User 2) highlighted the steep learning curve of the ACS website, with confusion about switching tables and identifying age brackets (e.g., the issue with "15–19" age groups when searching for "18+").
User 2 explicitly noted dissatisfaction with the lack of flexibility in the website’s design, stating: "This website sucks."
Recommendations:
Develop personas tailored for new researchers, including guided walkthroughs, a simplified interface, and a clear explanation of key terms (e.g., age brackets, survey types).
Include on-screen tutorials or tooltips that explain how to filter, change years, and navigate between tables.
Create a sandbox mode where users can practice searches or data manipulation without affecting live queries.
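The age-bracket confusion User 2 ran into ("15–19" when searching for "18+") can be detected automatically. The sketch below is one possible approach, not an existing Census feature: the `classify_brackets` function and the bracket list are hypothetical, and brackets are modeled as `(low, high)` pairs with `None` for an open-ended upper bound.

```python
def classify_brackets(brackets, minimum_age):
    """Sort (low, high) age brackets by how they relate to a minimum age.

    Returns three lists: brackets fully at or above the minimum age,
    brackets fully below it, and brackets that straddle it and so cannot
    be counted cleanly either way. high=None means "and over".
    """
    included, excluded, straddling = [], [], []
    for low, high in brackets:
        if low >= minimum_age:
            included.append((low, high))
        elif high is not None and high < minimum_age:
            excluded.append((low, high))
        else:
            straddling.append((low, high))
    return included, excluded, straddling


# Illustrative brackets only; real tables define their own groupings.
brackets = [(10, 14), (15, 19), (20, 24), (85, None)]
included, excluded, straddling = classify_brackets(brackets, 18)
print(straddling)  # [(15, 19)]
```

A tooltip could use the straddling list to warn a new researcher that the chosen table cannot answer an "18+" question exactly.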
Automation and Data Extraction Issues (User 1):
Observations:
User 1 struggled to isolate specific rows and columns (e.g., male estimates for 18+ in Utah counties) and had to resort to external tools (e.g., AI or coding) for guidance.
Downloading the entire table and manually filtering data was inefficient and error-prone.
Recommendations:
Introduce a custom export tool that allows users to select specific variables (e.g., “men, 18+, by county”) and download filtered results directly as a CSV file.
Provide a row/column selector tool within the UI for users to filter data before downloading it.
Offer API-free queries through an integrated tool that lets non-technical users specify variables and download results.
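As a rough illustration of the custom export recommendation, the following sketch filters rows and columns before writing a CSV, using only Python's standard library. The `export_filtered` function, the column names, and the sample rows are all hypothetical; the estimates are made-up numbers, not Census data.

```python
import csv
import io


def export_filtered(rows, columns, keep_row, out):
    """Write only the chosen columns of the rows that pass keep_row as CSV."""
    # extrasaction="ignore" drops any keys that are not in `columns`.
    writer = csv.DictWriter(out, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        if keep_row(row):
            writer.writerow(row)


# Made-up rows for illustration only.
rows = [
    {"county": "County A", "sex": "Male", "age_group": "18+", "estimate": 100},
    {"county": "County A", "sex": "Female", "age_group": "18+", "estimate": 110},
    {"county": "County B", "sex": "Male", "age_group": "Under 18", "estimate": 40},
]

buffer = io.StringIO()
export_filtered(
    rows,
    columns=["county", "estimate"],
    keep_row=lambda r: r["sex"] == "Male" and r["age_group"] == "18+",
    out=buffer,
)
print(buffer.getvalue())
```

Filtering before the download is what spares the user from opening the full table and deleting rows by hand.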
Key Themes Across User 1 & User 2:
Complex Navigation:
Both users experienced confusion with finding specific data, particularly when dealing with geographic entities or switching between data views.
The system lacked clear context indicators or efficient navigation paths.
Insufficient Guidance:
New users felt overwhelmed by the site’s structure and terminology, indicating a need for step-by-step guides, tooltips, or walkthroughs.
Inefficient Data Handling:
Both users struggled with isolating specific data points, relying on cumbersome manual processes or external tools for assistance.
Frustrating UI and Workflow Disruptions:
Tasks like switching year settings or filtering rows disrupted the workflow and caused unnecessary friction.
Overall Recommendations for Improvement:
Enhanced Navigation and Search Features:
Introduce intuitive filtering, advanced search options, and direct links to commonly requested data.
Add persistent context indicators and dynamic search result grouping.
Streamlined Data Manipulation:
Build tools to isolate and download specific rows/columns directly from the UI.
Offer customizable exports and pre-configured templates for repetitive queries.
User-Centered Design:
Tailor the interface for both new and experienced researchers with guided tutorials, tooltips, and an improved table layout.
Simplify complex workflows, such as switching between years or isolating variables.
Automation Support:
Enable API-free queries and automation-friendly tools to assist users in extracting precise datasets efficiently.
My professor sent me the following link to guide my thinking while creating my personas.
https://engineering.atspotify.com/2024/09/are-you-a-dalia-how-we-created-data-science-personas-for-spotifys-analytics-platform/
Spotify identifies the following personas for their data science researchers:
Given that my application is geared to new researchers, I have extended this list of personas to include two more.
My thinking here was that I wanted an extremely simple user interface: one where users could select the data they need, generate the API call, and display the data along with details about it that would otherwise go unnoticed or live in complex technical documentation spread throughout the Census website.
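A minimal sketch of the "generate the API call" step might look like the following. It assumes the public Census Data API URL pattern for the ACS 5-year dataset and the variable `B01003_001E` (total population estimate); the `build_acs_url` function and its parameters are hypothetical names, and the code only builds the URL without sending a request.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.census.gov/data"


def build_acs_url(year, dataset, variables, geography):
    """Build a Census Data API query URL from a user's selections.

    `dataset` is a path such as "acs/acs5"; `geography` uses the API's
    "for" syntax, e.g. "state:*" for all states.
    """
    params = {"get": ",".join(["NAME", *variables]), "for": geography}
    # Keep commas, colons, and asterisks readable instead of percent-encoding.
    query = urlencode(params, safe=",:*")
    return f"{BASE_URL}/{year}/{dataset}?{query}"


# Total population for all states, ACS 5-year estimates, 2022.
url = build_acs_url(2022, "acs/acs5", ["B01003_001E"], "state:*")
print(url)
```

The interface would fill in these arguments from the user's menu choices, so the user never has to see the variable code unless they ask for it.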
After consulting with peers and after the creation of my personas, I started to get more serious about the design.
The basic idea remained the same: users would be able to request data from a simple user interface. I later added features such as saving searches to projects.
Lessons Learned
A more fully fleshed-out prototype would lead to a better peer analysis! I currently only have functionality for the important parts of the cognitive walkthrough.
I need more icons so that users can explore the website more.
Undo and redo are not well supported yet. Everything has the potential to be saved, but I need a back button.
My Save Search button is clearly a priority and needs to be moved. I will move it before project submission.
Major thank you to Ariana and Micho for this Analysis <3
Performing a cognitive walkthrough, I laid out the following three tasks, each one building on the last:
Requesting the total population of all states from the ACS 5-year estimates for the year 2022.
Saving that search to the Food Insecurity Project.
Viewing the Food Insecurity Project, and returning to that search from the Food Insecurity Project page.
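The three tasks above imply a simple data model: a search that can be stored in a project and reopened later. The sketch below is one way that model could look. The `SavedSearch` and `Project` classes are hypothetical and do not describe how the prototype is actually built.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SavedSearch:
    """The selections needed to rerun a data request."""
    variable: str
    geography: str
    survey: str
    year: int


@dataclass
class Project:
    """A named collection of saved searches."""
    name: str
    searches: list = field(default_factory=list)

    def save(self, search):
        # Skip duplicates so saving the same search twice is harmless.
        if search not in self.searches:
            self.searches.append(search)


# Task 1: the request. Task 2: save it. Task 3: return to it from the project.
search = SavedSearch("Total population", "All states", "ACS 5-year estimate", 2022)
project = Project("Food Insecurity Project")
project.save(search)
reopened = project.searches[0]
print(reopened)
```

Storing the selections rather than the downloaded data keeps a saved search small and lets it be rerun later.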