Research runs in cycles. These may be PhDs or grants, but eventually each block of research comes to an end. When it does, there are many things we can consider outputs: papers, qualifications, outreach and software. Although software is listed as an output, it is more often abandoned. Code is seldom reused and rarely maintained, because code written for one purpose, normally by a single author, is often not usable by others. It can be hard to port to a different computer or operating system. It can be hard to tell how to use its functions or algorithms. It can even be incorrect.
These problems are challenging to correct after authors leave a project. It is much easier to create reusable code in the first place: code with clear definitions and descriptions, stored accessibly and tested for distribution and use. This is achievable, and there are many tools and methods that help development towards these goals. By improving our software output we make the next cycle easier, as software can be reused rather than re-implemented.
In this tutorial we will look at general methods for structuring, documenting, storing and testing code. We will introduce general and Python-specific tools for supporting developers.
Methods:
Versioning and Version Control
Documenting in Code
Test-driven development
General Tools:
Git
GitHub
Sphinx
Python tools:
IDEs
unittest
Poetry
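As a small illustration of how documenting in code and test-driven development fit together, the sketch below pairs a documented function with a unittest test case. It is only an assumed example, not material from the tutorial itself: the function mean_absolute_deviation, the test class, the run_tests helper and the NumPy-style docstring layout are all hypothetical choices made for illustration.

```python
import unittest


def mean_absolute_deviation(values):
    """Return the mean absolute deviation of a sequence of numbers.

    (Hypothetical example function; the docstring layout is an assumption.)

    Parameters
    ----------
    values : sequence of float
        Non-empty sequence of numeric samples.

    Returns
    -------
    float
        Average absolute distance of the samples from their mean.

    Raises
    ------
    ValueError
        If ``values`` is empty.
    """
    if len(values) == 0:
        raise ValueError("values must not be empty")
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


class TestMeanAbsoluteDeviation(unittest.TestCase):
    """In test-driven development these tests would be written first."""

    def test_constant_values_have_zero_deviation(self):
        self.assertEqual(mean_absolute_deviation([3.0, 3.0, 3.0]), 0.0)

    def test_known_result(self):
        # Mean is 2.0; absolute deviations are 1, 0, 1; their average is 2/3.
        self.assertAlmostEqual(
            mean_absolute_deviation([1.0, 2.0, 3.0]), 2.0 / 3.0
        )

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            mean_absolute_deviation([])


def run_tests():
    """Run the test case above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(TestMeanAbsoluteDeviation)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Saved as a module, the tests could also be discovered and run with python -m unittest. The docstring states what the function expects and returns, and the tests record that behaviour in a form a later maintainer can re-run.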
We will work through examples, but attendees are encouraged to bring their own projects to work on. The final part of the session (20-30 minutes) will be put aside for questions and to help people with their own code.
Further assistance will be available through an online clinic after the conference to help people implement these methods in their own code. Resources including slides, a website and a manual will also be available.
Several parts of the tutorial work best if you attempt them yourself, or at least follow along on your own machine. For this you will need the following programs, packages and files:
The methods and tools presented in this work are designed to be generally applicable to research software at many scales. Not everything here is needed for a single analysis script, but most of it will be useful for any software with more than one class or module that might need to be reused, or that will continue to be used for several months.