CONCLUSION

In summary, we successfully trained a machine learning model that predicts a variant's final determination:

CF-causing variants
Non CF-causing variants
Variants of unknown significance
Variants of varying clinical consequences

based on two numerical variables:

number of alleles in CFTR2
allele frequency in CFTR2

with 93.10% accuracy.
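
To make the setup above concrete, the following is a minimal sketch of a four-class classifier that uses only the two numerical variables and reports accuracy as the fraction of correct predictions. It is not our actual pipeline: the nearest-neighbor rule is a stand-in technique chosen for brevity, the function names (fit_min_max, scale, predict, accuracy) are hypothetical, and every data row is invented for illustration rather than taken from CFTR2.

```python
from collections import Counter
from math import dist

# Invented toy rows, NOT CFTR2 data: (number of alleles, allele frequency).
TRAIN_X = [(1200, 0.0085), (950, 0.0067), (30, 0.0002), (45, 0.0003),
           (3, 0.00002), (5, 0.00004), (150, 0.0011), (180, 0.0013)]
TRAIN_Y = ["CF-causing", "CF-causing",
           "Non CF-causing", "Non CF-causing",
           "Variants of unknown significance",
           "Variants of unknown significance",
           "Variants of varying clinical consequences",
           "Variants of varying clinical consequences"]


def fit_min_max(rows):
    """Return per-feature (min, max) so both features share a 0-1 scale."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def scale(row, bounds):
    """Min-max scale one row; a constant feature maps to 0.0."""
    return tuple((v - lo) / (hi - lo) if hi > lo else 0.0
                 for v, (lo, hi) in zip(row, bounds))


def predict(row, train_x, train_y, bounds, k=1):
    """Label a row by majority vote among its k nearest training rows."""
    target = scale(row, bounds)
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: dist(target, scale(pair[0], bounds)))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical held-out rows, also invented for illustration.
bounds = fit_min_max(TRAIN_X)
test_x = [(1100, 0.0078), (4, 0.00003), (160, 0.0012)]
test_y = ["CF-causing",
          "Variants of unknown significance",
          "Variants of varying clinical consequences"]
predictions = [predict(row, TRAIN_X, TRAIN_Y, bounds) for row in test_x]
print(f"accuracy = {accuracy(test_y, predictions):.2%}")
```

Scaling both variables to a common range matters here because allele counts are several orders of magnitude larger than allele frequencies and would otherwise dominate the distance calculation.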

This is a promising start, but there is ample opportunity to train the model further and expand it. For example, CF has other biological and clinical markers, such as the sweat test, pulmonary function tests, and pancreatic function. In clinical settings, sweat tests are performed on newborns to measure sweat chloride concentration, which helps predict a CF diagnosis [1]. Pulmonary function tests give insight into a patient's respiratory function through readouts such as functional residual capacity (FRC), vital capacity (VC), slow vital capacity (SVC), expiratory reserve volume (ERV), and residual volume (RV) [2]. Blood can also be screened for factors such as immunoreactive trypsinogen (IRT) [3]. Adding these markers to the training and test data may make the model more versatile, allowing it to predict a patient's diagnosis from information beyond genetic testing. Additionally, future work should use larger data sets that include non-CF patients.

Figure 7: Potential avenues for improving upon our current machine learning model.
