Our initial analyses focused on Saara’s temperature and CGM data, as well as CGM data from our male volunteer. Figure 1a shows Saara’s smoothed temperature data, collected by her Oura Ring, and Figure 1b shows the corresponding Lomb-Scargle periodogram. The periodogram recovers a ~33-day cycle in the temperature data, which matches the pattern visible in the smoothed data.
Figure 1a: Saara's Smoothed Temperature
Figure 1b: Saara's Temperature Lomb-Scargle
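To make the periodogram step concrete, here is a minimal sketch of a Lomb-Scargle calculation on unevenly sampled daily data. It is not our analysis code: the function name `lomb_scargle`, the synthetic 33-day signal, the noise level, and the variance normalization are all assumptions chosen for illustration. In practice a library routine such as SciPy's `scipy.signal.lombscargle` or Astropy's `LombScargle` would likely be used instead.

```python
import math
import random


def lomb_scargle(times, values, periods):
    """Variance-normalized Lomb-Scargle power at each trial period (illustrative)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    centered = [v - mean for v in values]
    powers = []
    for period in periods:
        w = 2 * math.pi / period
        # The offset tau makes the sine and cosine terms orthogonal
        # for this particular set of sample times.
        s2 = sum(math.sin(2 * w * t) for t in times)
        c2 = sum(math.cos(2 * w * t) for t in times)
        tau = math.atan2(s2, c2) / (2 * w)
        cos_terms = [math.cos(w * (t - tau)) for t in times]
        sin_terms = [math.sin(w * (t - tau)) for t in times]
        yc = sum(y * c for y, c in zip(centered, cos_terms))
        ys = sum(y * s for y, s in zip(centered, sin_terms))
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)
        powers.append((yc * yc / cc + ys * ys / ss) / (2 * var))
    return powers


# Hypothetical data: a 33-day rhythm sampled daily, with one day per week missing.
rng = random.Random(0)
days = [d for d in range(198) if d % 7 != 3]
temps = [0.3 * math.sin(2 * math.pi * d / 33) + rng.gauss(0, 0.1) for d in days]

periods = [10 + 0.5 * k for k in range(101)]  # trial periods from 10 to 60 days
powers = lomb_scargle(days, temps, periods)
best_period = periods[powers.index(max(powers))]
print(best_period, max(powers))
```

Because the method fits sinusoids directly at the observed sample times, gaps in wear time do not require interpolation, which is the main reason it suits wearable and CGM data.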
Figures 2a and 2b show Lomb-Scargle results for Saara’s and the male volunteer’s p5 glucose data, respectively. Figures 2c and 2d show Lomb-Scargle results for Saara’s and the male volunteer’s median glucose data, respectively. Saara’s glucose metrics show a clear peak around the 32 day mark, close to the ~33 day temperature result, that is not seen in the male data. All of this leads us to correctly classify Saara as a cyclic individual and the male volunteer as an acyclic individual.
Figure 2a: Saara's p5 Glucose
Figure 2b: Male's p5 Glucose
Figure 2c: Saara's Median Glucose
Figure 2d: Male's Median Glucose
After analyzing Saara’s personal CGM data, we used the trends we noticed to inform our feature selection for the Dexcom dataset. Results for the Lomb-Scargle periodogram analysis can be seen in Figures 3a and 3b below, where each point represents a single individual. Females are plotted in blue and males are plotted in red. The x-axis is the periodicity in days of the strongest signal detected by the algorithm. The y-axis is the power of the strongest signal. If menstrual cycle rhythms are present, we would expect to see higher-power signals in the 22-43 day range for females in their 30s than for females in their 60s or for males.
Based on visual inspection, women in their 30s tend to have higher-power (> 7.5) signals in the 22-43 day range, as seen by the large number of blue dots in the boxed cyclic range. When we separate the plots by age, the high-power females are present only in the 30s age bracket, while men and women in their 60s have similar, lower-power Lomb-Scargle results.
Figure 3a: Lomb-Scargle Scatterplot, 30s
Figure 3b: Lomb-Scargle Scatterplot, 60s
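The scatterplot coordinates can be illustrated with a short sketch: for each individual, take the period and power of the strongest periodogram peak, then check whether that point falls in the boxed cyclic range. The 22-43 day range and the 7.5 power threshold come from the text above; the helper names and the example periodogram values are hypothetical.

```python
CYCLIC_RANGE_DAYS = (22.0, 43.0)  # boxed cyclic range from the text
HIGH_POWER = 7.5                  # visual power threshold quoted in the text


def strongest_peak(periods, powers):
    """Return (period, power) of the strongest periodogram signal."""
    idx = max(range(len(powers)), key=powers.__getitem__)
    return periods[idx], powers[idx]


def in_cyclic_box(period, power):
    """True if the strongest signal is high-power and inside the cyclic range."""
    low, high = CYCLIC_RANGE_DAYS
    return low <= period <= high and power > HIGH_POWER


# Hypothetical periodograms for two individuals (not real study data).
trial_periods = [7, 14, 21, 28, 35, 42, 49]
person_a = [1.2, 2.0, 3.1, 9.4, 4.0, 2.2, 1.0]  # peak at 28 days
person_b = [3.5, 2.1, 1.8, 1.5, 1.1, 0.9, 0.7]  # peak at 7 days

peak_a = strongest_peak(trial_periods, person_a)
peak_b = strongest_peak(trial_periods, person_b)
print(peak_a, in_cyclic_box(*peak_a))
print(peak_b, in_cyclic_box(*peak_b))
```

Each individual then contributes one point to the scatterplot, with the peak period on the x-axis and the peak power on the y-axis.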
Once we confirmed that the high-power cyclic signals were mainly present in diabetic individuals in their 30s, we visualized the data using kernel density estimates (KDEs), shown in Figures 4a and 4b. These give a better idea of where individuals are most concentrated in the Lomb-Scargle distribution. The darkest areas of each plot, for both males and females in their 30s, are in the lower left corner, indicating that most people tended to have short, low-power cycles. However, females in their 30s show a peak in the boxed cyclic range for all six metrics that is not as clear for the males: there is more dark blue in this region for the females, in contrast with the large amount of white space for the males. This suggests that the glucose signals for the young females may be affected by menstrual cyclicity, in contrast with the acyclic males.
Figure 4a: Lomb-Scargle KDE, Females in 30s
Figure 4b: Lomb-Scargle KDE, Males in 30s
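As a rough illustration of what the KDE plots encode, the sketch below evaluates a two-dimensional Gaussian kernel density over (period, power) points. It assumes a product Gaussian kernel with Scott's-rule bandwidths, which may differ from the settings of the plotting library used for the figures; the points themselves are made up for the example.

```python
import math
import statistics


def scott_bandwidths(points):
    """Per-axis bandwidths from Scott's rule for 2-D data: n^(-1/6) * std."""
    n = len(points)
    factor = n ** (-1 / 6)
    sx = statistics.stdev([p[0] for p in points])
    sy = statistics.stdev([p[1] for p in points])
    return factor * sx, factor * sy


def kde_2d(points, query, bandwidths):
    """Density estimate at `query` using a product of Gaussian kernels."""
    hx, hy = bandwidths
    qx, qy = query
    total = 0.0
    for x, y in points:
        zx = (qx - x) / hx
        zy = (qy - y) / hy
        total += math.exp(-0.5 * (zx * zx + zy * zy))
    return total / (len(points) * 2 * math.pi * hx * hy)


# Hypothetical (peak period in days, peak power) pairs: a short, low-power
# cluster plus a smaller cluster inside the 22-43 day cyclic range.
points = [(6, 2.5), (7, 3.0), (8, 2.0), (9, 3.5), (10, 2.8), (11, 3.2),
          (28, 8.5), (30, 9.0), (31, 9.5), (33, 8.8)]

bw = scott_bandwidths(points)
dens_lower_left = kde_2d(points, (8.5, 2.8), bw)
dens_cyclic_box = kde_2d(points, (30.0, 9.0), bw)
dens_empty = kde_2d(points, (50.0, 9.0), bw)
print(dens_lower_left, dens_cyclic_box, dens_empty)
```

Darker regions in the figures correspond to larger density values, so a secondary peak inside the boxed range shows up as a second dark patch away from the lower left corner.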
The last piece of our analysis was comparing the T1D and T2D individuals. Once again, since we already determined that “cyclic range” signals were more prevalent in the younger age bracket, we focused specifically on individuals in their 30s (Figures 5a-5b). Interestingly, though we can see a peak in the boxed region for most of the glucose metrics, the T1D distributions are much more compact. This may indicate that individuals with T1D experience less glucose variability than individuals with T2D, dampening the power of signals related to menstrual cyclicity.
Figure 5a: Lomb-Scargle KDE, T1D in 30s
Figure 5b: Lomb-Scargle KDE, T2D in 30s
Table 1: Chi-Square Test P-Values
To quantify the significance of the rhythms detected in the younger females, we conducted chi-square tests between the expected proportions (10000/2000 = 5x younger females vs. older females, 12000/4000 = 3x females vs. males, and equal T1D vs. T2D) and the observed values. Results are shown in Table 1, with significant values in bold (p < 0.05).
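For readers unfamiliar with the test, the following sketch shows a two-category chi-square goodness-of-fit calculation against an expected ratio such as the 5:1 younger-to-older proportion above. The observed counts are invented for illustration and are not our results, and the function names are hypothetical. With two categories there is one degree of freedom, so the p-value can be written with the complementary error function.

```python
import math


def chi_square_two_groups(observed, expected_ratio):
    """Goodness-of-fit statistic for two observed counts vs. an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts of individuals with a high-power signal in the cyclic
# range: 90 younger females vs. 10 older females, against the 5:1 expectation.
observed = [90, 10]
chi2 = chi_square_two_groups(observed, expected_ratio=(5, 1))
p = p_value_df1(chi2)
print(round(chi2, 3), round(p, 4))
print("significant at 0.05:", p < 0.05)
```

The same two functions apply to the 3:1 females vs. males comparison and the 1:1 T1D vs. T2D comparison by changing `expected_ratio`.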
Leader: Saara Kriplani