GlassBERTa - Language Modelling as an Unsupervised Pre-Training for Glass Alloys
RELEASE NOTE
-Reshinth Adithyan, TS Aditya, Roakesh, Jothikrishna, Kalaiselvan Baskaran
Alloy Property Prediction is a task in the sub-field of alloy materials science (metallurgy) to which machine learning has been applied rigorously. It is usually modelled as a supervised task: an alloy composition is given to the model, which predicts a desired property. Tasks such as Alloy Property Prediction and Alloy Synthesis can additionally benefit from an unsupervised pre-training stage.
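For context, the conventional supervised setup could be sketched as below. This is our own illustration with placeholder compositions and property values, not the released code or data: each composition is encoded as a fixed-length vector of constituent percentages and passed to an off-the-shelf regressor that predicts a property.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical feature layout: one column per oxide, value = mole percentage.
oxides = ["SiO2", "Na2O", "CaO", "Al2O3"]
X = np.array([
    [72.0, 14.0, 14.0, 0.0],   # illustrative soda-lime-like composition
    [60.0, 10.0, 15.0, 15.0],  # illustrative alumina-bearing composition
])
y = np.array([2.50, 2.55])     # placeholder property values, not real measurements

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(model.predict(np.array([[70.0, 15.0, 15.0, 0.0]])))
```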
We describe the idea of pre-training on alloy compositions with a language-modelling style approach. We specifically show that the random masking originally proposed for masked language modelling is not well suited to modelling alloys.
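To make the masking argument concrete, here is a small illustrative sketch; it is not one of the strategies actually used for GlassBERTa, whose details are in the pre-print. With token-level random masking, a constituent's percentage can be hidden while the constituent itself stays visible (or vice versa), whereas a composition-aware scheme masks the two together.

```python
import random

# Hypothetical flat representation of a glass composition:
# alternating constituent symbols and their mole percentages.
composition = ["SiO2", "72.0", "Na2O", "14.0", "CaO", "14.0"]

def random_masking(tokens, mask_prob=0.15, mask_token="[MASK]"):
    """Standard MLM-style masking: each token is masked independently,
    so a percentage can be hidden while its constituent stays visible
    (or vice versa), which makes the prediction task nearly trivial."""
    return [mask_token if random.random() < mask_prob else t for t in tokens]

def pairwise_masking(tokens, mask_prob=0.3, mask_token="[MASK]"):
    """Composition-aware masking (illustrative only): a constituent and
    its percentage are masked together, so the model must reconstruct a
    whole component of the alloy from the rest of the composition."""
    out = list(tokens)
    for i in range(0, len(tokens) - 1, 2):
        if random.random() < mask_prob:
            out[i] = mask_token
            out[i + 1] = mask_token
    return out

print(random_masking(composition))
print(pairwise_masking(composition))
```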
We further propose two masking strategies and use them to pre-train GlassBERTa on 300K glass alloy compositions with a Masked Language Modelling objective, then fine-tune it to predict the properties of an alloy composition. The results suggest that pre-training improves the model's performance on the downstream task and that this is a promising direction for further research.
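A minimal sketch of such a two-stage pipeline with the HuggingFace transformers library is shown below. The tokenizer path, model size, and masking step are placeholders, and the actual GlassBERTa configuration and training code may differ.

```python
from transformers import (
    RobertaConfig,
    RobertaTokenizerFast,
    RobertaForMaskedLM,
    RobertaForSequenceClassification,
)

# Hypothetical tokenizer trained on composition strings (placeholder path).
tokenizer = RobertaTokenizerFast.from_pretrained("./glass-tokenizer")

# Stage 1: pre-train with the Masked Language Modelling objective.
mlm_config = RobertaConfig(vocab_size=tokenizer.vocab_size, num_hidden_layers=6)
mlm_model = RobertaForMaskedLM(mlm_config)
batch = tokenizer(["SiO2 72.0 Na2O 14.0 CaO 14.0"], return_tensors="pt")
labels = batch["input_ids"].clone()
# ... apply a composition-aware masking strategy to batch["input_ids"] here ...
loss = mlm_model(**batch, labels=labels).loss

# Stage 2: fine-tune the pre-trained encoder to regress an alloy property.
reg_config = RobertaConfig(
    vocab_size=tokenizer.vocab_size, num_hidden_layers=6,
    num_labels=1, problem_type="regression",
)
reg_model = RobertaForSequenceClassification(reg_config)
# Copy the pre-trained encoder weights into the regression model.
reg_model.roberta.load_state_dict(mlm_model.roberta.state_dict(), strict=False)
```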
We will release the pre-print, the dataset, and the code soon. The pre-trained model has been released via the HuggingFace Model Hub.
GlassBERTa Model Hub
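Loading the released checkpoint should follow the usual transformers pattern; the model id below is a placeholder, so substitute the actual id listed on the GlassBERTa Model Hub page.

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "<org>/GlassBERTa"  # placeholder id; use the one from the Model Hub page
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Example: ask the pre-trained model to fill in a masked percentage.
text = f"SiO2 72.0 Na2O {tokenizer.mask_token} CaO 14.0"
inputs = tokenizer(text, return_tensors="pt")
predicted_ids = model(**inputs).logits.argmax(dim=-1)
print(tokenizer.decode(predicted_ids[0]))
```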
FOOTNOTE:
We would like to thank Bipin Krishnan for his extended support in making the datasets for this work available.