The output PDB file contains the designed structure from the inference run. There should be one PDB file for each design, and their names depend on inference.output_prefix and inference.design_startnum. Remember that the output from RFdiffusion is ONLY THE BACKBONE STRUCTURE: any designed pieces of the backbone are composed entirely of glycine residues. Other tools, such as ProteinMPNN, can then perform sequence design based on that structure.
The .trb file stores metadata about the specific inference run, including the specific contig used along with the full contig used by RFdiffusion. The file also contains information about how residues in any given inputs map to the residues in the output structure. There should be one for each requested design, named the same as the corresponding PDB file, just with a different file extension.
The details about mapping can be found in the following variables:
○ con_ref_pdb_idx and con_hal_pdb_idx: These arrays contain the indices of the residues from the input PDB file (con_ref_pdb_idx) and their corresponding locations in the output PDB (con_hal_pdb_idx). These will only contain information about chains where inpainting took place.
○ con_ref_idx0/con_hal_idx0: Similar to the previous variables, but they are 0-indexed and do not contain chain information. They are most useful for assessing alignment.
○ inpaint_seq: This array contains any residues that were masked during inference.
If you would like to view the contents of this file, you can extract the information using the Python pickle module.
Example script:
import pickle

# Load the metadata dictionary stored in the .trb file
with open('path/to/your/outputs/output_file.trb', 'rb') as file:
    data = pickle.load(file)
print(data)
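The residue mapping stored in con_ref_pdb_idx and con_hal_pdb_idx can also be read out pairwise. The following is a minimal sketch, assuming both entries are equal-length sequences of (chain, residue number) pairs in matching order. The helper name residue_mapping and the toy dictionary are hypothetical and only stand in for a real .trb file.

```python
import pickle
import tempfile
from pathlib import Path


def residue_mapping(trb_path):
    """Return {input (chain, resnum): output (chain, resnum)} from a .trb file.

    Assumes entry i of con_ref_pdb_idx corresponds to entry i of
    con_hal_pdb_idx and that each entry is a (chain, residue number) pair.
    """
    with open(trb_path, "rb") as handle:
        data = pickle.load(handle)
    ref = [tuple(r) for r in data["con_ref_pdb_idx"]]
    hal = [tuple(h) for h in data["con_hal_pdb_idx"]]
    if len(ref) != len(hal):
        raise ValueError("con_ref_pdb_idx and con_hal_pdb_idx differ in length")
    return dict(zip(ref, hal))


# Toy stand-in for a real .trb file (values are made up for illustration).
toy = {
    "con_ref_pdb_idx": [("A", 10), ("A", 11), ("A", 12)],
    "con_hal_pdb_idx": [("A", 25), ("A", 26), ("A", 27)],
}
with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "toy.trb"
    with open(toy_path, "wb") as handle:
        pickle.dump(toy, handle)
    mapping = residue_mapping(toy_path)

for (ref_chain, ref_num), (hal_chain, hal_num) in mapping.items():
    print(f"input {ref_chain}{ref_num} -> output {hal_chain}{hal_num}")
```

If a real .trb file stores NumPy arrays or other third-party objects, those packages need to be installed for pickle.load to succeed.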
Trajectory files are automatically placed in a traj folder within the directory where your output PDB and .trb files are saved. These files can be visualized in PyMOL as multi-step PDBs, but note that they are ordered in reverse! The first PDB is for the t=1 (last) prediction made by RFdiffusion during inference. This is due to how the generative process (generating the backbone structures from model data) is discussed in the literature: it is seen as the reverse of the noising process. See the original RFdiffusion Nature paper, specifically Figure 1a,b, for more information. There will be two trajectory files for each designed backbone: one labeled pX0, which stores what the model predicted at each timestep, and one labeled Xt-1, which stores the structure that was fed into the model at each timestep.
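Because the trajectory frames are stored in reverse, it can be convenient to reorder them before viewing so that playback runs from the noisiest frame to the final one. The following is a minimal sketch, assuming each frame is wrapped in MODEL ... ENDMDL records as in a standard multi-model PDB file. The function name reverse_models and the toy text are hypothetical, and the sketch drops any lines outside a model.

```python
def reverse_models(pdb_text):
    """Return multi-model PDB text with the order of the models reversed.

    Assumes every frame is wrapped in MODEL ... ENDMDL records. Lines
    outside any model (headers, END) are dropped in this sketch.
    """
    models, current = [], None
    for line in pdb_text.splitlines():
        if line.startswith("MODEL"):
            current = []                      # start collecting a new frame
        elif line.startswith("ENDMDL"):
            if current is not None:
                models.append(current)        # frame is complete
            current = None
        elif current is not None:
            current.append(line)              # line inside the current frame

    out = []
    for number, body in enumerate(reversed(models), start=1):
        out.append(f"MODEL     {number:>4}")  # renumber frames from 1
        out.extend(body)
        out.append("ENDMDL")
    out.append("END")
    return "\n".join(out) + "\n"


# Toy two-frame trajectory; the REMARK lines stand in for ATOM records.
toy_traj = (
    "MODEL        1\nREMARK frame written first (t=1)\nENDMDL\n"
    "MODEL        2\nREMARK frame written second\nENDMDL\n"
)
reordered = reverse_models(toy_traj)
print(reordered)
```

To apply this to a real trajectory, read the file's text, pass it to reverse_models, and write the result to a new file rather than overwriting the original.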