(Update 10/11 - Tableau dashboard is now downloadable!)
Helping prioritize time and resources is a tangible way data can bring value to your community's day-to-day work. Data can help identify clients who may have fallen through the cracks and illustrate progress towards housing your "long-stayers".
This exercise will walk you through how to use Tableau's DATEDIFF function to calculate length of time between two dates. From there we'll create a scatter plot to ID high-priority clients. We'll finish by adapting the dashboard to your own data. This activity will take about 30-60 minutes.
This prioritization matrix is modeled on Charles Temple and Cody Spencer's work on Jacksonville CoC's client dashboard - as spotlighted by the BfZ blog and Malcom Gladwell's Revisionist History Podcast!
You are free to download the embedded dashboard to see the final results but I highly encourage you to start from scratch!
Download the sample client data here and open the file in excel to see what we're working with.
For purposes of this exercise, the critical columns are Client ID, Date of Identification, Exit Date, and VI-SPDAT. The other columns are optional - feel free to introduce these other dimensions into your analyses to see if you can find any interesting patterns.
Note that most of our clients don't have an exit date. There'll also be other peculiarities within the data such as multiple dates of identification for a single client and so forth. It'll be up to you to interpret and figure out how to navigate those bumps!
When you adapt your data to this template, either make sure your column headers remain the same or make sure to adapt your calculations and visualizations in Tableau to your column headers.
Once you get a feel for the data, open Tableau and connect to the excel file as your data source. Before we dive into the calculation, let's set up our data in a table format to make sure we're going down the right path.
Drag Client ID onto rows, then drag Date of Identification next to it on rows. Because Date of Identification is a date, Tableau will default to displaying it as YEAR . Right-click the pill, select "Exact Date" about 3/4 of the way down the menu. Tableau will then display the data as a Measure - to change it to a Dimension, right-click again and select "Discrete" near the bottom of the menu.
Date of Identification should now appear as a date in table format next to Client ID.
Drag Exit Date to the right of Date of Identifcation and repeat the above process to get a column for Exit Date. There'll be a lot of nulls, just like in the original excel file.
For all of the clients who don't have an exit date, we need to set an end-date for our length of time calculations. This data is a snapshot from September 1st, 2020 so we will pretend that this report was pulled for that date. For clients that have an exit date, we want to use their listed Exit Date as their end date and every client still in the system will have their end date as 9/1/2020.
To create an end date for each client, we need to create a conditional statement.
Conditional statements work similarly in Tableau as they do in excel and various programing languages. The general structure to conditional statements in Tableau is:
IF A logical statement that results in either a TRUE or FALSE
THEN Instructions on what to return if the results are TRUE
ELSE Instructions on what to return if the results are FALSE
END
You can use ELSEIF to chain together IF statements. Getting comfortable with using IF, THEN, ELSE, and ELSEIF (always capping your statements with END!) will open endless possibilities for your analyses.
In this case we want to check to see if a given client has an [Exit Date] to use as their end date, otherwise we want to use the date of the snapshot, 9/1/2020. To check if a field is empty/null, we can use ISNULL(). The resulting conditional statement looks like this:
IF ISNULL([Exit Date])
THEN #9/1/2020#
ELSE [Exit Date]
END
If you are using your own live data, use the TODAY() function instead of hard-coding #9/1/2020# in your conditional statement to always return today's date.
In plain english, this is saying "If Exit Date is null, print 9/1/2020, otherwise print Exit Date."
Drag End Date to the right of Exit Date and repeat the same reformatting process to get a column for End Date. You should see that clients that don't have an exit date have 9/1/2020 as their end date and clients with an exit date simply have their exit date.
Now we have everything we need to create our length of time calculation.
Create another calculated field named Days Homeless. To create our LOT calc, we'll use the DATEDIFF() function. DATEDIFF has three required parameters, each separated by a comma:
"date_part": Identifies the unit of time you want to count by. Can be 'day', 'week', 'month', etc.
start_date: The beginning of your date range.
end_date: The end of your date range.
[start_of_week]: Optional element for non-traditional week start days.
Since we are measuring days homeless, use 'day' (with the quote marks!) for date_part. Then use [Date of Identification] as the start_date and [End Date] for the end_date. [End Date] will use the results of the conditional statement we set up earlier.
If you don't have any errors, hit OK. Then double-click your newly created Days Homeless measure or drop it onto the Text tile on the marks pane. Each client should now have their number of days homeless based on their Date of Identification and their End Date.
For fun, try dropping Days Homeless onto the columns shelf and see what happens!
Now that we the components for our prioritization matrix, it's time to create a scatter plot. Scatter plots are best for visualizing the relationship between two continuous measures - in this case we'll use our Days Homeless calculation and the VI-SPDAT measure.
Create a new sheet and double click VI-SPDAT then Days Homeless to have Tableau drop them in columns and rows. What do you notice about this visualization?
Right now, there's only one circle showing up. Additionally, the circle is placed at around 9,000 on the VI-SDAPT axis and 450,000 on the Days Homeless axis.
First, we need to change the level of detail of the visualization to the individual client level. The single mark currently represents the total Days Homeless and VI-SPDAT of every client summed together.
Drag Client ID onto the details pane. This changes the level of detail of the visualization to represent every client.
You may notice that some clients are showing they have VI-SPDAT scores in the 20's and 30's. This could be because they were assessed on two different occasions - since our VI-SPDAT measure is set to SUM(), multiple records are added together. There are a number of ways you can resolve this:
Right click VI-SPDAT, change SUM to MAX to display the highest single score within each client.
Change SUM to AVG to average a client's scores together.
Used a [FIXED] equation to identify the most recent score.
In this case, we will change SUM(VI-SPDAT) to MAX. Similarly, SUM(Days Homeless) will be influenced by multiple records. Again in this case, we will change SUM to MAX.
You have a priority matrix! To wrap up, let's make this viz more useful for analysis.
Now that the bones of the matrix are set, there are a number ways to help your end user get the most out of the data. Thinking about potential audiences, end users are most likely going to be interested in outlier clients and any disparate outcomes between demographic groups.
Below are a list of modifications made to go from here to a more polished end product. Feel free to deviate from the path if an idea strikes!
Add Exit Destination Category as a filter, right-clicked to Show Filter, and filtered to only show Null. We're first interested in clients who are still in the system but the quick filter allows for more exploration of outcomes.
Add Race to the Color tile on the Marks pane. It's helpful to introduce more contrast between the marks by clicking the Color tile. By default, the two most common race categories end up having very similar colors.
Change the opacity of the marks by clicking the Color tile to better see overlap and density. Somewhere around 60%-70% works for this instance.
Change the marks to filled dots by clicking the Shape tile.
Increase the size of the axes text by right-clicking each axis, format, font to 12pt.
Fill the entire space of the dashboard by clicking the Standard drop-down in the icon menu bar at the top of the screen and selecting Entire View.
Rename the title to something more descriptive by double-clicking the sheet title.
You now have an interactive client priority matrix! End users can hover over each dot to identify long-stay clients or clients with high VI-SPDAT scores and prioritize efforts to find resolutions. Think about how you would introduce this visualization to your teammates and talk through how they can draw insights from the data.
It's time to bring the your length of time calculation skills to your community! The core of this priority matrix only requires four columns of data (or equivalents) that should be included in every by-name list:
Client ID
Date of Identification
Exit Date
VI-SPDAT
You can either create a new excel file based off the template or directly replace the data in the original excel file with your own. The rest of the columns of data in the template are optional and can be overwritten with your own demographic or programmatic fields.
If you created a new data source with your community data, try going through the exercise again with your own data. It'll be easier the second time!
If you overwrote the template, right click the Client Snapshot data source in the data source pane and click Refresh. Any new fields you introduce that aren't a part of the example data will break the filter and coloring of the final viz but those can be easily replaced with your own fields.
What kinds of insights can you draw from your own data?
If you're interested in a few extra challenges in the original example data, make sure you have a clean connection to the original file and try to make visualizations that inform these questions:
On average, are there disparities in LOT to exit across racial lines?
What would a Days Homeless x Age priority matrix look like?
What does the distribution of clients across decades look like? Is there a difference in this distribution across veteran status?
Download the workbook at the top of the page to see some examples of solutions. There's no one right answer!