Table Operations

Is it possible to aggregate information by using Tableau Prep?

  • The aggregate step is for aggregating the results based on functions of sum, average, minimum, or maximum or etc.

  • Also, it's possible to apply these aggregating functions based on grouping of some fields.

  • For example, in this training, we want to calculate the moving averages of different counties in the South Carolina for cases and deaths. As a result, county field should be a grouping variable.

By double clicking on the county field it would be added to the Grouped Fields.

For calculating the moving averages based on a weekly time window, double click on weekly field and add it to the Grouped Fields. Now, we need to add cases and deaths into the Aggregated Fields for calculating the averages by double clicking on them:

Now, change the function of SUM for both cases and deaths fields into AVG:

We need to repeat this procedure by creating new aggregate steps for biweekly and tendays fields. As a result, finally we should have this flow diagram for weekly, biweekly, and ten days moving averages calculated based on Aggregate step.

How to merge different steps

Join and union steps are used for merging two different dataset or different steps in Tableau Prep. The Join step would join two different dataset or steps by matching some fields and also based on certain conditions and furthermore having more option for joining. On the other hand, Union step simply put steps or different dataset on top of each other without any condition or option for merging.

Union step

In order to add Union step click on the plus sign and choose Union step on Aggregate 1 field.

After successful adding of a Union step drag and drop Aggregate 2 and Aggregate 3 steps into the Union step and make sure it is added. Finally, after successful merging of these three steps, you should see this diagram.

Furthermore, we see that because the weekly, biweekly, and ten days moving averages might overlap on some dates but not on many other dates, there are some NULL values on the final union of these three steps.

As a result, this strategy to unify these three aggregated steps by using Union step might not be a good idea. So, please remove the Union step before going to the next section for Join step.

Join step

The first difference between Join and Union steps is that Join step only accepts two steps or dataset as inputs. As a result, by adding a new Join step and adding the Aggregate 2 on it, you should see this diagram:

The first option for Join step is to specify the Join type which could be:

  1. leftOnly: the left only option only takes data from the first join member which would be Aggregate 1 in this example.

  2. left: the left option takes data from the first join member and the overlap between Aggregate 1 and Aggregate 2.

  3. rightOnly: the right only option only takes data from the second join member which would be Aggregate 2 in this example.

  4. right: the left option takes data from the second join member and the overlap between Aggregate 1 and Aggregate 2.

  5. inner: only takes data from overlap of first and second join members.

  6. full: takes all the data from first and second join members.

In fact, these relationships are visualized here as:

Now, if we want to find the data that would overlap between weekly and biweekly dataset, we could put a condition where weekly = biweekly and find the overlap between Aggregate 1 and Aggregate 2.

Saving a cleaned dataset for visualization in Tableau Desktop

Output step is the final step in the flow to write the cleaned and modified data into a file that could be opened directly in Tableau Desktop. As a result, we want to create three Output step to write Aggregate 1, Aggregate 2, and Aggregate 3 into comma separated value (CSV) files for visualization of moving averages into the Tableau Desktop.

Output step has some options to save the output into a file or publish it or even put it into a database. Also, you can specify the name, location, type (Tableau Hyper or CSV), or creating a new table or appending to the existing one:

Make sure in order to save the output you hit the Run Flow button to write the file.

The final packaged flow is shared here for reproducibility: https://bit.ly/3jwdgcj and https://bit.ly/33d2bHY

Also, the final outputs for weekly, biweekly, and ten days window moving averages are shared here: