At some point, someone is going to want to extend this codebase, so explaining my architectural choices and programming philosophy becomes important. I have decades of software development experience, but other than a semester of Pascal programming as a plebe at West Point, I don't have any formal developer training.
I started learning procedural languages in the 1980s, and then learned a bit about object-oriented programming in the 1990s and 2000s. I use classes essentially as data structures with methods that either mutate the data they hold or return a transformation of it. However, I was never one for abstract data types or the class design patterns that were very popular.
I don't like the "everything must be an object" approach to software development. But the "everything must be a function" approach is likewise an idiotic way to write software.
I take a somewhat mathematical approach to programming. Classes are useful when it is necessary to describe sets of things upon which you want to perform operations. Otherwise, I find that functions are more useful. However, I tend to collect functions into large classes composed of other classes, where I encapsulate the functions as methods. This makes the code easier for the end user to use, although it does complicate development for developers who are new to the code base.
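As a rough illustration of this split between classes and functions, here is a minimal sketch. The names `Cell` and `strain_to_scale` are hypothetical and are not part of pypospack; the class is a data structure with one method that mutates its data and one that returns a transformation of it, while the conversion that needs no state stays a plain function.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical orthorhombic simulation cell (edge lengths in angstroms)."""

    a: float
    b: float
    c: float

    def scale(self, factor: float) -> None:
        """Mutate the data in place: scale every edge by the same factor."""
        self.a *= factor
        self.b *= factor
        self.c *= factor

    def volume(self) -> float:
        """Return a transformation of the data: the cell volume."""
        return self.a * self.b * self.c


def strain_to_scale(volumetric_strain: float) -> float:
    """Plain function: convert a volumetric strain into a linear scale factor."""
    return (1.0 + volumetric_strain) ** (1.0 / 3.0)


cell = Cell(2.0, 3.0, 4.0)
initial_volume = cell.volume()
cell.scale(strain_to_scale(0.03))  # a 3% volumetric expansion
```

After the call to `scale`, `cell.volume()` is the initial volume multiplied by 1.03, up to floating-point rounding.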
Results are more important than architecture. This software package was really developed in an ad hoc manner: first developing scripts that worked, then encapsulating those scripts into functions, then developing a data model, and finally writing classes. This is perhaps the correct way to develop software for scientific applications. Getting software up and running to produce results is the most important thing to do first. The correct architecture can be determined afterward and reached through refactoring.
Ad hoc programming leads to unmanageable code. However, when unguided, ad hoc development produces code that is unusable for developing and documenting automated workflows. Thoughtful documentation of the algorithms and formulas used is absolutely essential to writing scientific code. Refactoring often for code reuse increases the manageability of a code base.
Develop code documentation as you develop code. This includes the mathematics and science of what you are doing along with appropriate references.
Mathematics is the language of science. Because mathematics is the language of science, it makes sense to formalize the mathematics in a clear context and to generalize it to the point where developing the software becomes straightforward. The mathematics behind the code, including references, should be rigorous and fully documented.
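One way to keep the mathematics next to the code is to put the formula, its symbols, and its references in the docstring. The sketch below uses the standard Lennard-Jones 12-6 pair potential purely as an example; the function name and docstring layout are assumptions, not a pypospack convention, and the reference entry is left as a placeholder to be filled in with the source actually used.

```python
def lennard_jones(r: float, epsilon: float, sigma: float) -> float:
    r"""Lennard-Jones 12-6 pair potential (illustrative example).

    .. math::

        V(r) = 4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12}
               - \left(\frac{\sigma}{r}\right)^{6}\right]

    Args:
        r: interatomic separation, in the same length unit as ``sigma``.
        epsilon: depth of the potential well (sets the energy unit).
        sigma: separation at which the potential is zero.

    Returns:
        The pair energy, in the same unit as ``epsilon``.

    Notes:
        The minimum is at r = 2^(1/6) * sigma, where V = -epsilon.

    References:
        (placeholder) cite the source the formula was taken from.
    """
    if r <= 0.0:
        raise ValueError("separation r must be positive")
    sr6 = (sigma / r) ** 6  # (sigma / r)^6, reused for the r^-12 term
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


r_min = 2.0 ** (1.0 / 6.0)  # location of the minimum for sigma = 1
well_depth = lennard_jones(r_min, epsilon=1.0, sigma=1.0)
```

Recording the location and depth of the minimum in the docstring gives a quick analytical check that the implementation matches the documented formula.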
Process Diagrams. Documentation is a somewhat difficult question because it often has to be done at several levels: API-level documentation of objects isn't enough, since the purpose of objects is often to hide implementation details from the end user. For maintainability and extensibility, however, architectural diagrams become very important for understanding object-oriented code within the larger framework. Link For Process Documentation Standards
My vision for pypospack is to use it as a workflow tool for computational materials science. However, this toolkit is not optimized for high-throughput simulations or a combinatorial approach. Tools such as pymatgen and the MP interfaces are better suited to that.
Manually create the simulation template for what you want to do, and run it.
Automate writing the input file so that it creates the simulation template you want. If you are developing on a class, subclass it and inherit its methods rather than changing the method you want to alter. Let me merge the code change. Commit it to examples.
Automate reading the output file to get just the information you need.
Automate using the results of the output file to calculate the properties you are interested in.
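The four steps above can be sketched end to end as follows. Everything here is hypothetical: the input and output file formats, the class and function names, and the numbers are placeholders for illustration and do not correspond to pypospack's API or to any real simulation code. The simulation run itself is replaced by a hard-coded output string.

```python
from string import Template

# Step 1: a hand-written template for the simulation input (made-up format).
INPUT_TEMPLATE = Template("structure $structure\nlattice_constant $a0\n")


class InputWriter:
    """Step 2: automate filling in the template (hypothetical class)."""

    def __init__(self, structure: str, a0: float) -> None:
        self.structure = structure
        self.a0 = a0

    def render(self) -> str:
        return INPUT_TEMPLATE.substitute(structure=self.structure, a0=self.a0)


class StrainedInputWriter(InputWriter):
    """Extend by subclassing and inheriting render() instead of editing it."""

    def __init__(self, structure: str, a0: float, strain: float) -> None:
        super().__init__(structure, a0 * (1.0 + strain))


def read_total_energy(output_text: str) -> float:
    """Step 3: read only the value needed from the output."""
    for line in output_text.splitlines():
        if line.startswith("total_energy"):
            return float(line.split()[1])
    raise ValueError("no total_energy line found")


def cohesive_energy(total_energy: float, n_atoms: int, e_atom: float) -> float:
    """Step 4: E_coh = E_total / N - E_atom (one common sign convention)."""
    return total_energy / n_atoms - e_atom


input_text = StrainedInputWriter("fcc", 4.0, strain=0.01).render()
fake_output = "n_atoms 4\ntotal_energy -17.76\n"  # placeholder, not real data
e_coh = cohesive_energy(read_total_energy(fake_output), n_atoms=4, e_atom=0.0)
```

Keeping each step as its own small unit means a new calculation type usually needs only a new subclass or a new parsing function, and the original, merged code stays untouched.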