Conclusion

Results:

Mellitus is a practical template for accurately modeling a town, its demographics, and its diabetes rate. We identified factors that contribute to diabetes and implemented them in a computational method. Mellitus is especially useful for analyzing the output of DANN 2.0. When DANN is run with a traditional Python interpreter, it outputs a single diabetes rate prediction per run. In Mellitus, DANN can be used not only to predict a diabetes rate but also to identify which input variables contribute to the prediction and how strongly.

County        Prediction    Correct
Roosevelt     7.8%          7.4%
Socorro       15.9%         14.7%
Los Alamos    6.8%          5%
De Baca       16.4%         23.3%

Table 2 - Accuracy testing of DANN 3.1

From the techniques we used, we conclude that the variables with a significant impact on diabetes rates in New Mexico counties are, from least significant to most: percentage of American Indian and Alaska Native residents, poverty, education, commute time, and health insurance. This ranking was determined using the plot in Mellitus 5.1, which visualizes the virtual diabetes rate. Each variable’s slider was adjusted to a proportional half-way value: a percentage variable was set to fifty percent, and a numerical variable was set to the midpoint of its slider range. This was done for one variable at a time, while the other variables were set to zero. Of course, this is not a realistic way of simulating a real-world town, but it is sufficient for determining how much each variable affects the diabetes rate. The change in the virtual diabetes rate could then be read from the plot. A larger change in the slope of the line indicates a larger effect on diabetes rates. Figure 10 shows the results of this experiment. A surprising aspect of this test is the negative correlation between mean commute time and diabetes rates.
The scatter plot shown previously suggested a positive correlation. It is unclear why DANN reacted to the scenario this way, but it could be due to a shortage of training data or the lack of the ReLU activation function. As stated earlier, the commute time value in DANN 2.0 has to be divided by 100 so that it works with the sigmoid function. However, since the commute time training data is also divided by 100, this should not interfere with our results.
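The one-variable-at-a-time test described above can be sketched in Python. This is only an illustration under stated assumptions, not the DANN code: `toy_predict` is a hypothetical single sigmoid neuron, the weights are arbitrary placeholders, the variable names are invented, and the 0 to 60 minute commute slider range is assumed. Its numbers say nothing about DANN's actual behavior; it only shows the shape of the procedure, including the division by 100 before the sigmoid.

```python
import math

def sigmoid(x):
    """Logistic function, which maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# ASSUMPTION: arbitrary placeholder weights, NOT values learned by DANN.
PLACEHOLDER_WEIGHTS = {
    "native_pct": 0.2,
    "poverty_pct": 0.4,
    "education_pct": -0.6,
    "commute_minutes": -0.8,
    "insured_pct": -1.0,
}

# Every input is divided by 100 before it reaches the sigmoid,
# mirroring the commute-time scaling mentioned in the text.
SCALE = 100.0

def toy_predict(inputs):
    """Hypothetical stand-in for the network: one sigmoid neuron."""
    z = sum(w * inputs[name] / SCALE for name, w in PLACEHOLDER_WEIGHTS.items())
    return sigmoid(z)

def one_at_a_time(predict, halfway):
    """Set one variable to its half-way value, all others to zero,
    and record how far the prediction moves from the all-zero baseline."""
    baseline = predict({name: 0.0 for name in halfway})
    effects = {}
    for name, value in halfway.items():
        inputs = {other: 0.0 for other in halfway}
        inputs[name] = value
        effects[name] = predict(inputs) - baseline
    return effects

# Half-way slider values: 50 for percentages; 30 for commute time,
# ASSUMING a 0 to 60 minute slider range.
HALFWAY = {
    "native_pct": 50.0,
    "poverty_pct": 50.0,
    "education_pct": 50.0,
    "commute_minutes": 30.0,
    "insured_pct": 50.0,
}

effects = one_at_a_time(toy_predict, HALFWAY)
# Rank from least to most influential by the size of the change.
ranked = sorted(effects, key=lambda name: abs(effects[name]))

for name in ranked:
    print(f"{name}: {effects[name]:+.4f}")
```

With a trained model in place of `toy_predict`, the sign of each entry would show the direction of the effect and its magnitude the relative strength, which is the same information read from the slope in the plot.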

Netlogo Execution Shell Bug:

Due to a bug in Netlogo 6.1.1, Mellitus is unable to support DANN 3.1. In contrast to 2.0, 3.1 requires external packages such as Keras and TensorFlow. Anaconda is a popular data science platform that contains all of the packages that Mellitus requires, so we decided to use an Anaconda virtual environment with the Netlogo py extension. The appropriate path to the environment was entered in the Python tab in Netlogo, but with the Anaconda virtual environment in use, the program gives an error message that states “Extension exception: This is a bug. Please report”. If the path to the Python executable is left empty, Netlogo attempts to find an appropriate executable on its own. In that case, Netlogo shows various bug messages, and the packages that DANN 3.1 requires still cannot be used. In an attempt to resolve this issue, Mellitus was ported to several versions of Netlogo other than 6.1.1 to see whether the program would react differently. In every version, the program showed an error message stating, in effect, that a bug exists in Netlogo.
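One way to narrow down a problem like this is to confirm, outside Netlogo, that the interpreter in the Anaconda environment can actually locate the packages DANN 3.1 needs. The sketch below is a hypothetical diagnostic, not part of Mellitus or DANN, and it does not fix the bug. It uses only the Python standard library, and the helper names `missing_packages` and `report` are our own. It would be run directly with the environment's Python executable.

```python
import importlib.util
import sys

# Import names of the external packages DANN 3.1 is described as needing.
REQUIRED = ("tensorflow", "keras")

def missing_packages(names):
    """Return the names that this interpreter cannot locate.
    find_spec only looks a package up; it does not import it."""
    return [name for name in names if importlib.util.find_spec(name) is None]

def report(names=REQUIRED):
    """Build a short report: which interpreter ran, and which packages it found."""
    lines = [f"interpreter: {sys.executable}"]
    missing = missing_packages(names)
    for name in names:
        status = "MISSING" if name in missing else "found"
        lines.append(f"{name}: {status}")
    return lines

if __name__ == "__main__":
    print("\n".join(report()))
```

If both packages are reported as found when the script is run directly, the environment itself is probably intact, which would point toward the way the extension launches Python rather than toward the installation.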

Suggestions for Further Implementation:

In the future, more variables should be implemented in Mellitus. DANN could output a more accurate diabetes rate if the data set were expanded to include all 33 New Mexico counties. Currently, DANN evaluates 20 counties, fewer than we would like. This concept could also be applied outside New Mexico. Web scraping could be used to collect large amounts of data on subjects such as fast-food or recreational facilities, and it can be used on a global scale. We would also like to resolve the execution shell bug in Netlogo and add DANN 3.1 support to Mellitus. If we could gain access to a large database that includes the needed data on a national or global scale, DANN could be trained far more thoroughly and might even see mainstream use in biomedical and government administrative fields.

If Mellitus is correct, ethnicity has less effect on diabetes than other factors that can be controlled. When changes are made that alter demographic variables in a town, people can be educated about the domino effect that comes with them. Many factors that benefit New Mexican society in general also help lower diabetes rates, including more access to education and affordable health insurance. The first step in expanding our project has already been taken by making Texas versions of DANN.