Introduction to Gephi software

Introduction to Gephi Software Video:

Gephi software:

  • The main Gephi window consists of four panels:

  1. Rendering window

  2. Layout manager

  3. Appearance configurations

  4. Filter configurations

Loading dataset into the Gephi software:

  • To load a dataset or spreadsheet into Gephi, go to Data Laboratory ----> Import Spreadsheet.

  • The first dataset that we will visualize is the COVID-19 dataset for the state of South Carolina (the data-sc.csv file downloaded in the previous section).

  • In the "CSV file to import" field, enter the path to the data-sc.csv file.

  • Set Separator to Comma, because the file is in comma-separated values (CSV) format.

  • Set the "Import as" option to Node table.

  • Click Next, and in the next window change the data types of Latitude and Longitude to Double.

  • Change the data types of Cases and Deaths to Integer.

  • Finally, click OK and confirm the loading of the dataset into Gephi.

  • The resulting graph has no edges. It contains only 46 nodes, one for each of the 46 counties in South Carolina.
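
The import settings above can also be pictured outside Gephi. The following is a minimal Python sketch, not part of Gephi itself, of what the wizard does with the file: read comma-separated rows, cast Latitude and Longitude to floating-point numbers (Double) and Cases and Deaths to integers, and produce a node table with no edges. The column names are assumed from the attribute names used in this tutorial, and the two sample rows are made-up placeholders, not real counts.

```python
import csv
import io

# Placeholder rows for illustration only; the real data-sc.csv has 46 rows,
# one per county. Column names are assumptions based on this tutorial.
SAMPLE_CSV = """county,Latitude,Longitude,Cases,Deaths
County A,34.2,-82.5,10,0
County B,33.5,-81.6,25,1
"""

# Target type for each column, mirroring the data types chosen in the wizard
# (Double -> float, Integer -> int).
COLUMN_TYPES = {"Latitude": float, "Longitude": float, "Cases": int, "Deaths": int}


def load_node_table(text):
    """Parse comma-separated text into a list of node dicts with typed values."""
    reader = csv.DictReader(io.StringIO(text), delimiter=",")
    nodes = []
    for row in reader:
        node = dict(row)
        for column, cast in COLUMN_TYPES.items():
            node[column] = cast(row[column])
        nodes.append(node)
    return nodes


nodes = load_node_table(SAMPLE_CSV)
edges = []  # importing a node table creates nodes only, no edges
print(len(nodes), "nodes,", len(edges), "edges")
```

To try this on the real file, the text could be read with open("data-sc.csv") instead of using SAMPLE_CSV, provided the column names match.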

Layouts to arrange nodes:

  • Layout configurations help arrange nodes and edges according to specific criteria.

  • The first layout used here is Geo Layout, which arranges the nodes representing the counties of South Carolina by their geographical latitude and longitude.

  • Scale in Geo Layout sets how far apart the nodes are placed after rearrangement. Smaller values produce smaller distances between nodes, while larger values expand the graph.

  • Set Latitude and Longitude to the corresponding columns available in the dataset.

  • Make sure the Projection configuration is set to Mercator.
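
To make the roles of Scale and the Mercator projection more concrete, here is a small Python sketch of the standard spherical Mercator formula. It is an illustration under stated assumptions and not necessarily the exact computation Gephi performs; the function name and the scale parameter are hypothetical and only mimic the Scale setting described above.

```python
import math


def mercator(latitude_deg, longitude_deg, scale=1000.0):
    """Project latitude/longitude (degrees) to x/y with spherical Mercator.

    `scale` is a hypothetical stand-in for the Scale setting: it multiplies
    both coordinates, so larger values spread the nodes further apart.
    """
    lat = math.radians(latitude_deg)
    lon = math.radians(longitude_deg)
    x = scale * lon
    y = scale * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


# Two placeholder points roughly within South Carolina's extent.
a_small = mercator(34.2, -82.5, scale=1000.0)
b_small = mercator(33.5, -81.6, scale=1000.0)
a_large = mercator(34.2, -82.5, scale=2000.0)
b_large = mercator(33.5, -81.6, scale=2000.0)

# Doubling the scale doubles the distance between the projected points.
print(math.dist(a_small, b_small), math.dist(a_large, b_large))
```

The projection keeps east-west positions proportional to longitude and stretches north-south positions with latitude, which is why the node arrangement resembles a map of the counties.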

Appearance configuration to visualize the Data:

  • To visualize the data on the configured graph, we can change the size or color of the nodes based on the information available in the dataset.

  • This dataset contains two distinct attributes, the number of COVID-19 cases and the number of deaths in each county of South Carolina, and we can visualize both of them by using the appearance configurations.

  • In the Appearance window, click Nodes and then click the color palette.

  • Then click Ranking and choose cases as the attribute that determines the color of the nodes.

  • Now click Size, the option next to the color palette, and then Ranking, in order to change the size of the nodes based on the number of deaths in each county.
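
The idea behind a ranking can be sketched in a few lines of Python: each node's attribute value is mapped from the attribute's own range onto a visual range, such as a minimum and maximum node size or two endpoint colors. The sketch below assumes a plain linear interpolation, and the function names are hypothetical; Gephi's actual ranking may use a different interpolation.

```python
def rank_to_range(value, value_min, value_max, out_min, out_max):
    """Linearly map value from [value_min, value_max] to [out_min, out_max]."""
    if value_max == value_min:
        return out_min  # all values are equal, so there is nothing to rank
    fraction = (value - value_min) / (value_max - value_min)
    return out_min + fraction * (out_max - out_min)


def rank_color(value, value_min, value_max, low_rgb, high_rgb):
    """Interpolate each RGB channel between two endpoint colors."""
    return tuple(
        round(rank_to_range(value, value_min, value_max, low, high))
        for low, high in zip(low_rgb, high_rgb)
    )


# Placeholder values: deaths drive node size, cases drive node color.
deaths = [0, 1, 4]
sizes = [rank_to_range(d, min(deaths), max(deaths), 10, 50) for d in deaths]

cases = [10, 25, 100]
white, red = (255, 255, 255), (255, 0, 0)
colors = [rank_color(c, min(cases), max(cases), white, red) for c in cases]

print(sizes)
print(colors)
```

With this mapping, the county with the fewest deaths gets the minimum size, the county with the most gets the maximum size, and the others fall in between.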

Labeling the nodes to show more information:

  • Labels can add more information to the graph, for example the name of each county in the network above with its number of cases shown next to it.

  • To choose which attributes of the graph are used for labeling, click Attributes:

  • Now instead of Label, choose county and cases as labels:

  • To display the labels, click Show Node Labels:

  • Now, to change the size of the labels in a meaningful manner, go back to the Appearance configuration, click Label Size, then click Ranking and choose cases. Set the min and max sizes of the labels to 0.25 and 0.5.

Change background color:

  • To change the background color, expand the options in the rendering window and choose Background color:

The final rendering of the network, which shows the number of COVID-19 cases and deaths in the state of South Carolina, is shown below: