Data analytics and visualization

The first dataset we will use for data analytics and visualization is a COVID-19 dataset for the state of South Carolina. It contains the latitude and longitude of each South Carolina county, as well as the number of cases and the number of deaths in each county. To initialize a visualization dashboard, click on My Workspace:

Then click on the data-sc.csv file to initialize the visualization dashboard:

Finally, click on the data-sc.csv file again:

The empty visualization dashboard should look like this:

The first visualization mode used here is called ArcGIS Maps for Power BI, which lets us overlay the data on a map so that geographic patterns are easier to see. To choose this visualization mode, click on Map in the Visualizations section:

To visualize the COVID-19 dataset on the South Carolina map at the county level, first set the location by dragging and dropping the latitude and longitude fields as shown:

The other information available in this dataset is the number of cases and the number of deaths in each county. We will visualize the number of deaths by adding it to Size:

Furthermore, we will represent the number of cases as the color on the map. Open the Format panel, click on Data colors, and then click on fx:

In the panel that opens, choose the Cases variable in the Based on field option, choose a yellow color for Minimum and a red color for Maximum to represent the lowest and highest number of cases, and finally click on OK to confirm the changes:
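
As a rough illustration of what a Minimum-to-Maximum color scale does, the sketch below linearly interpolates between yellow and red according to the number of cases. This is only an assumption about how such a gradient can be computed, not Power BI's actual implementation; the county names, the case counts, and the helper `lerp_color` are hypothetical.

```python
# Sketch of a two-color gradient scale; not Power BI's actual implementation.

def lerp_color(value, vmin, vmax, low_rgb, high_rgb):
    """Linearly interpolate between two RGB colors for a value in [vmin, vmax]."""
    if vmax == vmin:
        return low_rgb  # all values equal: fall back to the Minimum color
    t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))  # clamp to the valid range
    return tuple(round(lo + t * (hi - lo)) for lo, hi in zip(low_rgb, high_rgb))

YELLOW = (255, 255, 0)  # color chosen for Minimum
RED = (255, 0, 0)       # color chosen for Maximum

# Hypothetical case counts per county, for illustration only.
cases = {"County A": 120, "County B": 560, "County C": 1000}

vmin, vmax = min(cases.values()), max(cases.values())
colors = {name: lerp_color(n, vmin, vmax, YELLOW, RED) for name, n in cases.items()}

for name, rgb in colors.items():
    print(name, rgb)
```

The county with the fewest cases gets pure yellow, the county with the most gets pure red, and every other county falls on an orange shade in between.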

As a result, with the number of cases mapped to color, you should see this visualization:

To change the map style, click on Map Styles and choose, for example, the dark map theme:

The final visualization dashboard for the COVID-19 dataset in South Carolina should look like this:

The COVID-19 dataset across all counties in the US, provided in the data.csv file, can be visualized in the same manner as described above, and you should be able to create this visualization:

The next visualization mode described here is the scatter plot, based on a breast cancer dataset. This dataset contains several parameters measured for breast cells, such as their size, radius, roundness, etc., as well as whether each cell is cancerous (called malignant) or not cancerous (called benign).

In this section, we use a scatter plot to visualize two variables in this dataset: the mean area and the worst concave points of the cells. First choose the Scatter chart, then put mean area on the X-axis and worst concave points on the Y-axis, and put Class into the legend. Make sure you choose the Don't summarize option to prevent Power BI from aggregating the data automatically:
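
To see why Don't summarize matters, the sketch below contrasts plotting every row with plotting summarized values. It assumes that summarizing collapses all rows sharing a legend value into a single point (here using a sum); the sample rows are hypothetical and only mimic the fields named above.

```python
from collections import defaultdict

# Hypothetical rows: (Class, mean area, worst concave points).
rows = [
    ("benign", 450.0, 0.07),
    ("benign", 520.0, 0.09),
    ("malignant", 1100.0, 0.24),
    ("malignant", 980.0, 0.20),
]

# Without summarizing: every row becomes its own point on the chart.
raw_points = [(area, concave) for _, area, concave in rows]

# With summarizing (here: sum): rows sharing a class collapse into one point.
totals = defaultdict(lambda: [0.0, 0.0])
for label, area, concave in rows:
    totals[label][0] += area
    totals[label][1] += concave
summed_points = {label: tuple(values) for label, values in totals.items()}

print("points without summarizing:", len(raw_points))
print("points with summarizing:", len(summed_points))
```

With summarizing, the individual cells disappear from the chart, which defeats the purpose of a scatter plot of per-cell measurements.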

Another option that can help customize the visualization is slicing the dataset. For example, suppose we want to limit the scatter plot to only the points whose mean area value is less than 1000:
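
Slicing amounts to filtering rows on a condition before they are plotted. A minimal sketch, using hypothetical sample rows whose keys mirror the field names above:

```python
# Hypothetical sample rows, for illustration only.
rows = [
    {"Class": "benign", "mean area": 450.0, "worst concave points": 0.07},
    {"Class": "malignant", "mean area": 1100.0, "worst concave points": 0.24},
    {"Class": "malignant", "mean area": 980.0, "worst concave points": 0.20},
    {"Class": "benign", "mean area": 1000.0, "worst concave points": 0.11},
]

MAX_MEAN_AREA = 1000  # threshold from the text

# Keep only rows whose mean area is strictly below the threshold.
sliced = [row for row in rows if row["mean area"] < MAX_MEAN_AREA]

for row in sliced:
    print(row["Class"], row["mean area"], row["worst concave points"])
```

Note that a row with a mean area of exactly 1000 is excluded, because the condition is "less than" rather than "less than or equal to".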

The final visualization of this scatter plot should look like this:

The final visualization modes described here are Table and Matrix. Tables are useful for extracting statistical information from the dataset. A Table can show the minimum, maximum, average, sum, etc. of variables or classes, as shown:

The statistic that is displayed (average, minimum, maximum, etc.) can then be changed as follows:
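
The statistics mentioned above can be sketched in a few lines of code. The example computes the minimum, maximum, average, and sum of mean area for each class; the values are hypothetical and serve only to show what each statistic reports.

```python
from statistics import mean

# Hypothetical (Class, mean area) pairs, for illustration only.
rows = [
    ("benign", 400.0),
    ("benign", 600.0),
    ("malignant", 900.0),
    ("malignant", 1300.0),
]

# Group the mean area values by class.
by_class = {}
for label, area in rows:
    by_class.setdefault(label, []).append(area)

# Compute one summary per class.
summary = {
    label: {
        "min": min(values),
        "max": max(values),
        "average": mean(values),
        "sum": sum(values),
    }
    for label, values in by_class.items()
}

for label, stats in summary.items():
    print(label, stats)
```

Switching the statistic in the dashboard corresponds to picking a different key (`"min"`, `"max"`, `"average"`, or `"sum"`) from each class summary.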

The Matrix visualization is useful for viewing the dataset as a matrix based on different variables and classes, similar to the scatter plot but in numerical form. The Matrix is accessed as shown below and can be used to show mean area versus worst concave points across the different classes:
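
One way to picture such a matrix is as a pivot: one row per class, one column per measurement, and a summarized value in each cell. The sketch below assumes the average as the summary and uses hypothetical sample rows; the layout produced in the dashboard may differ.

```python
from statistics import mean

# Hypothetical sample rows, for illustration only.
rows = [
    {"Class": "benign", "mean area": 400.0, "worst concave points": 0.05},
    {"Class": "benign", "mean area": 600.0, "worst concave points": 0.15},
    {"Class": "malignant", "mean area": 900.0, "worst concave points": 0.25},
    {"Class": "malignant", "mean area": 1300.0, "worst concave points": 0.25},
]

fields = ["mean area", "worst concave points"]
classes = sorted({row["Class"] for row in rows})

# Build the matrix: classes as rows, measurements as columns, averages as cells.
matrix = {
    label: {
        field: mean(row[field] for row in rows if row["Class"] == label)
        for field in fields
    }
    for label in classes
}

# Print the matrix as a simple text grid.
print(f"{'Class':<12}" + "".join(f"{field:>24}" for field in fields))
for label in classes:
    print(f"{label:<12}" + "".join(f"{matrix[label][field]:>24.3f}" for field in fields))
```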
