Introduction
Over the past month I have been writing the computer program that will generate the artificial-intelligence architectures. Since the program is a prerequisite to the data, I do not yet have any concrete data collection to show; over the next two weeks, I plan for this to change.
Progress So Far
Basically the entire month of January, save about the first week, has been spent building out the main computer program. Most of the process has been laid out for Neural Architecture Optimization (NAO), where the input AI architectures, along with randomly generated ones, will be systematically tested for loss on a chosen dataset; in this case, the image-classification datasets CIFAR-10 and CIFAR-100. The other half of the program, where I will inject NeuralPower and Paleo, two algorithms that predict an AI model's energy usage, into the predictor stage of the NAO flow, has not been laid out yet (more on this later).
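The search half of the flow can be sketched in Rust, the language I'm porting everything to. This is a minimal sketch, not the actual program: `Architecture` and `evaluate` are hypothetical stand-ins for the real generated networks and the real train/test step on CIFAR-10/100.

```rust
// Hedged sketch of the search loop described above: candidate
// architectures (here just vectors of layer widths) are scored and
// ranked, and the best one is kept. `evaluate` is a hypothetical
// proxy score; the real program trains the model and measures
// test accuracy on the dataset.

#[derive(Debug, Clone)]
struct Architecture {
    layer_widths: Vec<usize>,
}

// Hypothetical stand-in for training/testing an architecture.
fn evaluate(arch: &Architecture) -> f64 {
    let params: usize = arch.layer_widths.iter().sum();
    1.0 - 1.0 / (1.0 + params as f64 / 256.0)
}

// Score every candidate and return the highest-accuracy one.
fn search(candidates: Vec<Architecture>) -> Architecture {
    let mut scored: Vec<(f64, Architecture)> = candidates
        .into_iter()
        .map(|a| (evaluate(&a), a))
        .collect();
    // Sort descending by score and keep the best.
    scored.sort_by(|x, y| y.0.partial_cmp(&x.0).unwrap());
    scored.remove(0).1
}

fn main() {
    let candidates = vec![
        Architecture { layer_widths: vec![32, 32] },
        Architecture { layer_widths: vec![64, 128, 64] },
        Architecture { layer_widths: vec![16] },
    ];
    let best = search(candidates);
    println!("best architecture: {:?}", best.layer_widths);
}
```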
Challenges
Yeah, I would say there have been some challenges. First off, the source code for both Neural Architecture Optimization and NeuralPower/Paleo was linked in the research papers. The plan was to use it to help streamline the process of implementing the program, as I could theoretically just trace the process laid out in the research papers through the code and go from there. Boy, was I wrong here.
The source code for the research papers is written in a programming language called Python, the name inspired by the comedy group Monty Python. Python is the de facto language for anything data science related, including artificial intelligence and deep learning. I wanted to port all this code over to a language called Rust, as that is the language I'm most familiar with, and I didn't plan on learning Python just for this research project. The problems start here.
I initially planned on hand-porting all of the source code, but slowly began to realize that would not be possible within the time frame, due to the sheer amount of code in both NAO and NeuralPower. After about a week of writing, I decided to look around for better ways to port the research papers' code over.
I stumbled upon a transpiler named py2many that converts Python code to other programming languages, Rust being one of them. This tool single-handedly saved me weeks' worth of work by programmatically translating the Python code into Rust.
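To give a feel for what the transpiler does, here is a tiny hand-made example of the kind of Python-to-Rust translation involved. The Python original (in the comments) is my own, and the Rust below is a hand-written equivalent, not py2many's literal output:

```rust
// Python original (illustrative, my own example):
//
//   def mean_loss(losses):
//       return sum(losses) / len(losses)
//
// A Rust equivalent of the same function:
fn mean_loss(losses: &[f64]) -> f64 {
    losses.iter().sum::<f64>() / losses.len() as f64
}

fn main() {
    let losses = [0.25, 0.5, 0.75];
    println!("{}", mean_loss(&losses)); // prints 0.5
}
```

A transpiler automates exactly this sort of mechanical rewriting, which is why it saved so much time compared with doing it by hand.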
The second big challenge came when attempting to find the same or similar libraries to the ones used in the Python version of the code. Within Python, there are two main libraries for data science: TensorFlow and PyTorch. While there exists a TensorFlow binding for Rust, it does not have all the features of the one used by the NAO and NeuralPower code. Some of this is due to the major version differences between the TensorFlow in NAO and the one currently usable from Rust, and some is due to features not present in the Rust binding of TensorFlow. I was able to find a PyTorch-inspired library in Rust, Burn; there are still large differences between PyTorch and Burn, but those can be alleviated.
Now, the biggest challenge will be the grind to finish the program, mainly the Neural Architecture Optimization scaffolding and getting NeuralPower/Paleo injected into the performance predictor.
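For that remaining predictor work, the rough shape I have in mind looks like the sketch below. It is hedged throughout: the layer features and coefficients are made-up placeholders, standing in for the per-layer polynomial models that NeuralPower fits by regression and then sums into a network-level estimate.

```rust
// Hedged sketch of a NeuralPower-style layer-wise model plugged into
// the performance predictor. Features and coefficients here are
// hypothetical placeholders, not values from the papers.

struct Layer {
    macs: f64,      // multiply-accumulate operations in this layer
    mem_reads: f64, // memory accesses in this layer
}

// Per-layer estimate as a (degree-1) polynomial over layer features.
fn layer_power(layer: &Layer, coeffs: &[f64; 3]) -> f64 {
    coeffs[0] + coeffs[1] * layer.macs + coeffs[2] * layer.mem_reads
}

// Network-level estimate: sum the per-layer predictions.
fn network_energy(layers: &[Layer], coeffs: &[f64; 3]) -> f64 {
    layers.iter().map(|l| layer_power(l, coeffs)).sum()
}

fn main() {
    let layers = vec![
        Layer { macs: 1.0e6, mem_reads: 2.0e5 },
        Layer { macs: 5.0e5, mem_reads: 1.0e5 },
    ];
    // Hypothetical regression coefficients.
    let coeffs = [0.5, 1.0e-6, 2.0e-6];
    println!("estimated energy score: {}", network_energy(&layers, &coeffs));
}
```

The point of injecting something like this into the predictor stage is that candidate architectures can then be ranked on energy as well as accuracy.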
Data Collected
Sigh. Because of the nature of this research project, the entire computer program must be written before any data can be collected, and that has not happened yet (more on this in the Challenges and Reflection sections). Thus, I have no data yet.
left: the whole workstation for the research. Since the entire project is just a computer program, I only need a computer to work.
right: part of the main Neural Architecture Optimization train_search function. Here, we cycle through the given number of epochs (runs), training and testing the generated architectures on a given dataset, trying to find the architectures with the highest accuracy.
right, continued: some of this code may get refactored as I fully port the source code from PyTorch to Burn.
Reflection
A lot of work has been done, and there's a lot of work left to do. I am not quite happy yet with the position that the research project is in, but I am quite happy with all I have done so far.
My biggest regret would be not beginning the draft of the program over winter break; I instead focused on completing college applications. If I had begun sooner, I would most likely be a lot further along. Then again, I do not think I would have sought out a transpiler, or used the PyTorch version of NAO and NeuralPower.
The only other big regret would be greatly underestimating the scope of the two research papers off which I am basing this research project. I had initially thought that it would be fairly trivial to port the source code: a lot of work, sure, but nothing out of the ordinary. I was wrong; I do not like porting, using, or understanding data scientists' code, and the far larger size of the source code did not help either.