Beta Version of the new guesser package and API
02 August 2022
After testing the basic concepts of the new guess_TopologyAttributes API and making sure it initially works fine, it's now time to finish the beta version of our new API and guesser package.
Clean the Parsers:
The first step is to remove calls to guesser methods from all modules. These calls occur mainly inside the parsers, for guessing masses and atom types. Before starting this cleanup, I had to make sure that removing the guessing from the parsers and moving it to the universe would not break any current default behavior. This was straightforward for most parsers: they first guess atom types from names and then guess masses from the guessed atom types, which is the same behavior the universe gets from the BaseGuesser. For FHIAIMSParser, XYZParser, and Mol2Parser, masses are guessed from names (which are the same as the element attribute for these parsers). This can be handled inside the guess_masses method of the DefaultGuesser by checking whether the element attribute exists and, if so, using it for mass guessing. The other special case is TXYZParser: inside this parser, masses are guessed from atom names and not atom types, so if we guessed masses by checking only for elements and atom types, the default behavior would break. To solve this without damaging the default behavior of the other parsers, we can pass a special keyword argument (through **kwargs) to the DefaultGuesser indicating that we are working with TXYZParser data and therefore need to guess masses from names (so even the DefaultGuesser has different contexts!).
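The priority order described above can be sketched in a few lines. This is only an illustration, not the actual DefaultGuesser code: the function name pick_mass_source and the mass_from_names flag are hypothetical stand-ins for whatever keyword ends up being passed through **kwargs.

```python
def pick_mass_source(names, types=None, elements=None, mass_from_names=False):
    """Return (label, values) for the attribute masses should be guessed from.

    Hypothetical helper; `mass_from_names` stands in for the special
    keyword argument that marks TXYZParser-style data.
    """
    if mass_from_names:
        # Context flag set: names win, even if types or elements exist.
        return "names", names
    if elements is not None:
        # Elements are the most reliable source when they are available.
        return "elements", elements
    if types is not None:
        # Usual default: masses come from the (possibly guessed) atom types.
        return "types", types
    # Nothing else is available, so fall back to the atom names.
    return "names", names
```

For example, pick_mass_source(["CA"], types=["C"]) selects the types, while the same call with mass_from_names=True selects the names instead.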
Bond Guessing:
Bond guessing is treated in a special way inside the universe: the universe constructor already has a guess_bonds parameter, so we need to resolve the conflict that may arise between this parameter and the to_guess one.
Case one: guess_bonds=True and to_guess contains the ‘bonds’ value. Action: bonds are guessed.
Case two: guess_bonds=False and to_guess contains the ‘bonds’ value. Action: bonds are guessed.
The to_guess parameter has higher priority in this case. We can't treat guess_bonds=False as an explicit order not to guess bonds; it is more of a default value that we can ignore here.
Case three: guess_bonds=True and to_guess doesn't contain the ‘bonds’ value. Action: bonds are guessed.
In this case, guess_bonds is considered an explicit order to guess bonds, and it is important to keep it that way so as not to break current behavior.
Case four: guess_bonds=False and to_guess doesn't contain the ‘bonds’ value. Action: bonds are not guessed.
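Taken together, the four cases reduce to a logical OR of the two inputs. A minimal sketch of that rule follows, using a hypothetical helper name (should_guess_bonds) that is not part of the universe API:

```python
def should_guess_bonds(guess_bonds=False, to_guess=()):
    """Decide whether bonds get guessed, following the four cases above.

    Hypothetical helper: bonds are guessed when either guess_bonds is
    truthy or 'bonds' is listed in to_guess.
    """
    return bool(guess_bonds) or "bonds" in to_guess


# The four cases, in order:
print(should_guess_bonds(True, ["bonds"]))    # case one
print(should_guess_bonds(False, ["bonds"]))   # case two
print(should_guess_bonds(True, ["masses"]))   # case three
print(should_guess_bonds(False, ["masses"]))  # case four
```

Only the last call returns False, which matches the table of cases.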
Another issue with bond guessing is that it happens inside the AtomGroup, not directly inside the universe, so we need to pass the context to the AtomGroup and handle the guessing process there. In addition, the guess_bonds method takes a vdwradii parameter, which is not part of the universe, so we need to pass it to the guesser class through **kwargs.
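To illustrate the idea of forwarding vdwradii through **kwargs, here is a self-contained toy guesser. Everything in it is an assumption made for the sketch: the class name SketchGuesser, the radii values, the 0.55 fudge factor, and the simple distance criterion are illustrative and should not be read as the real guesser class.

```python
import math


class SketchGuesser:
    """Toy guesser that keeps extra keyword arguments as its context."""

    # Illustrative van der Waals radii in angstroms (assumed values).
    DEFAULT_RADII = {"H": 1.1, "C": 1.7, "O": 1.52}

    def __init__(self, **kwargs):
        # Options such as vdwradii arrive here instead of via the universe.
        self._kwargs = kwargs

    def guess_bonds(self, types, coords, fudge_factor=0.55):
        # User-supplied radii override the defaults for matching types.
        radii = dict(self.DEFAULT_RADII)
        radii.update(self._kwargs.get("vdwradii") or {})
        bonds = []
        for i in range(len(types)):
            for j in range(i + 1, len(types)):
                # Two atoms are bonded if they are closer than a scaled
                # sum of their radii.
                cutoff = fudge_factor * (radii[types[i]] + radii[types[j]])
                if math.dist(coords[i], coords[j]) < cutoff:
                    bonds.append((i, j))
        return bonds


water_types = ["O", "H", "H"]
water_coords = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
default_bonds = SketchGuesser().guess_bonds(water_types, water_coords)
custom_bonds = SketchGuesser(vdwradii={"H": 1.5}).guess_bonds(water_types, water_coords)
```

With the default radii only the two O-H pairs fall under the cutoff; enlarging the hydrogen radius through vdwradii makes the H-H pair count as bonded too, which shows that the keyword argument really reaches the guessing step.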
At this point, we have a good first version of our guess_TopologyAttributes API and the BaseGuesser class. The next step is to make unit tests for every new functionality and start documenting it. I believe that working on improving the API and the BaseGuesser will continue till the last day of my GSoC period, as developing more new guesser classes will definitely reveal new improvements and functionalities that need to be added. So, let's wait and see!
Next step:
Test and document guess_TopologyAttributes and the DefaultGuesser so they can be merged.