core
pdbeccdutils.core.ccd_reader
A set of methods for reading in data and creating internal representation of molecules. The basic use can be as easy as this:
from pdbeccdutils.core import ccd_reader
ccdutils_component = ccd_reader.read_pdb_cif_file('/path/to/cif/ATP.cif').component
rdkit_mol = ccdutils_component.mol
- class pdbeccdutils.core.ccd_reader.CCDReaderResult(warnings: List[str], errors: List[str], component: Component, sanitized: bool)
NamedTuple for the result of reading an individual PDB chemical component definition (CCD).
- component
internal representation of the CCD read-in.
- Type:
- errors
A list of any errors found while reading the CCD. If no errors are found, the list is empty.
- warnings
A list of any warnings found while reading the CCD. If no warnings are found, the list is empty.
- sanitized
Whether or not the molecule was sanitized
- Type:
- component: Component
Alias for field number 2
- sanitized: bool
Alias for field number 3
- pdbeccdutils.core.ccd_reader.read_pdb_cif_file(path_to_cif: str, sanitize: bool = True) CCDReaderResult
Read in single wwPDB CCD CIF component and create its internal representation.
- Parameters:
- Raises:
ValueError – if file does not exist
- Returns:
Results of the parsing altogether with the internal representation of the component.
- Return type:
CCDReaderResult
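A minimal sketch of how a reader result might be handled: read one component, print the warnings, and stop on errors. The helper name `load_component` and its injectable `reader` argument are illustrative assumptions, not part of pdbeccdutils. Treating every reported error as fatal is also a choice made for this example only.

```python
def load_component(path_to_cif, reader=None, sanitize=True):
    """Read one CCD CIF file and return its Component.

    `reader` is a hypothetical injection point (handy for testing); when it
    is None, the documented pdbeccdutils reader is imported and used.
    """
    if reader is None:
        # Imported lazily so the sketch can be loaded without pdbeccdutils.
        from pdbeccdutils.core import ccd_reader
        reader = ccd_reader.read_pdb_cif_file

    result = reader(path_to_cif, sanitize=sanitize)

    # CCDReaderResult exposes warnings, errors, component and sanitized.
    for warning in result.warnings:
        print(f"warning: {warning}")
    if result.errors:
        raise RuntimeError(f"{path_to_cif}: " + "; ".join(result.errors))
    return result.component
```

With pdbeccdutils installed, `load_component('/path/to/cif/ATP.cif').mol` would then give the RDKit molecule, as in the basic example at the top of this module.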
- pdbeccdutils.core.ccd_reader.read_pdb_components_file(path_to_cif: str, sanitize: bool = True, include: list[str] = []) Dict[str, CCDReaderResult]
Process multiple compounds stored in the wwPDB CCD components.cif file.
- Parameters:
path_to_cif (str) – Path to the components.cif file with multiple ligands in it.
sanitize (bool) – Whether or not the components should be sanitized. Defaults to True.
include (list[str]) – List of CCDs to be parsed. By default it is empty and all the CCDs are parsed. If a list of CCDs is provided, only those are parsed.
- Raises:
ValueError – if the file does not exist.
- Returns:
Internal representation of all the components in the components.cif file.
- Return type:
Dict[str, CCDReaderResult]
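As a sketch of working with the returned dictionary, the helper below separates cleanly parsed components from those that reported errors. Both function names are hypothetical; only `read_pdb_components_file` and its `include` parameter come from the description above.

```python
def split_by_errors(results):
    """Split a {ccd_id: CCDReaderResult} mapping into clean and failed entries."""
    clean, failed = {}, {}
    for ccd_id, result in results.items():
        if result.errors:
            failed[ccd_id] = list(result.errors)
        else:
            clean[ccd_id] = result.component
    return clean, failed


def read_selected(path_to_cif, ccd_ids):
    """Parse only the listed CCD ids from a components.cif file."""
    # Imported lazily so the sketch can be loaded without pdbeccdutils.
    from pdbeccdutils.core import ccd_reader

    results = ccd_reader.read_pdb_components_file(path_to_cif, include=list(ccd_ids))
    return split_by_errors(results)
```

Restricting the parse with `include` is likely to matter for the full components.cif file, which holds a very large number of entries.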
pdbeccdutils.core.component
- class pdbeccdutils.core.component.Component(mol: Mol, ccd_cif_block: Block, properties: CCDProperties | None = None, descriptors: List[Descriptor] | None = None)
Wrapper for the rdkit.Chem.Mol object enabling some of its functionality and handling possible erroneous situations.
- Returns:
instance object
- Return type:
- property atoms_ids: Tuple[Any, ...]
Supplies a list of the atom_ids obtained from _chem_comp_atom.atom_id, see:
http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx.dic/Categories/chem_comp_atom.html
The order will reflect the order in the input PDB-CCD.
The atom_id is also known as ‘atom_name’; standard amino acids have the main chain atom names ‘N CA C O’.
- compute_2d(manager: DepictionManager, remove_hs: bool = True) DepictionResult
Compute 2d depiction of the component using DepictionManager instance.
- Parameters:
manager (DepictionManager) – Instance of the ligand depiction class.
remove_hs (bool, optional) – Defaults to True. Remove hydrogens prior to depiction.
- Returns:
Object with the details about depiction process.
- Return type:
- compute_3d(version='v3') bool
Generate 3D coordinates using the ETKDG method. The ETKDG version can be specified.
- property descriptors: List[Descriptor]
Supply the _pdbx_chem_comp_descriptor category for the PDB-CCD. Obtained from PDB-CCD’s _pdbx_chem_comp_descriptor.
- Returns:
List of descriptors for a given entry.
- Return type:
- export_2d_annotation(file_name: str, wedge_bonds: bool = True) None
Generates 2D depiction in JSON format with annotation of bonds and atoms to be redrawn in the interactions component.
- Parameters:
file_name (str) – Path to the file
- export_2d_svg(file_name: str, width: int = 500, names: bool = False, wedge_bonds: bool = True, atom_highlight: Dict[Any, Tuple] | None = None, bond_highlight: Dict[Tuple, Tuple] | None = None)
Save 2D depiction of the component as an SVG file. Component id is generated in case the image cannot be drawn.
- Parameters:
file_name (str) – path to store 2d depiction
width (int, optional) – Defaults to 500. Width of a frame in pixels.
names (bool, optional) – Defaults to False. Whether or not to include atom names in depiction. If atom name is not set, element symbol is used instead.
wedge_bonds (bool, optional) – Defaults to True. Whether or not the molecule should be depicted with bond wedging.
atom_highlight (dict of tuple of float, optional) – Defaults to None. Atom names to be highlighted along with colors in RGB, e.g. {'CA': (0.5, 0.5, 0.5)} or {0: (0.5, 0.5, 0.5)}.
bond_highlight (dict of tuple of float, optional) – Defaults to None. Bonds to be highlighted along with colors in RGB, e.g. {('CA', 'CB'): (0.5, 0.5, 0.5)} or {(0, 1): (0.5, 0.5, 0.5)}.
- Raises:
CCDUtilsError – If bond or atom does not exist.
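The following sketch shows one way the depiction calls could be combined: compute the 2D layout, then export an SVG with some atoms highlighted. The helper name `depict_with_highlight` and the default colour are assumptions made for illustration. It also assumes that `compute_2d` should be called before `export_2d_svg`.

```python
def depict_with_highlight(component, manager, svg_path, atom_names,
                          rgb=(1.0, 0.6, 0.6)):
    """Compute a 2D depiction and save it as SVG with selected atoms highlighted."""
    # Assumption: the 2D conformer must exist before it can be drawn.
    depiction = component.compute_2d(manager)

    # Keep only atom names the component actually contains, because
    # export_2d_svg raises CCDUtilsError for atoms that do not exist.
    known = set(component.atoms_ids)
    highlight = {name: rgb for name in atom_names if name in known}

    component.export_2d_svg(
        svg_path,
        names=True,
        atom_highlight=highlight or None,  # None is the documented default
    )
    return depiction, highlight
```

With the library installed, a call could look like `depict_with_highlight(component, DepictionManager(), 'ATP.svg', ['PA', 'PB', 'PG'])`, where `DepictionManager` comes from `pdbeccdutils.core.depictions`.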
- property external_mappings
List external mappings provided by UniChem. If fetch_external_mappings() was not called before, only the agreed mappings are retrieved.
- fetch_external_mappings(all_mappings=False)
Retrieve external mappings through UniChem based on the InChIKey.
- property formula: str
Supply the chemical formula for the PDB-CCD, for example ‘C2 H6 O’. Obtained from PDB-CCD’s _chem_comp.formula:
http://mmcif.wwpdb.org/dictionaries/mmcif_std.dic/Items/_chem_comp.formula.html
If not defined then the empty string ‘’ will be returned.
- Returns:
the _chem_comp.formula or ‘’.
- Return type:
- property fragments: List[SubstructureMapping]
Lists matched fragments and atom names.
- Returns:
Substructure mapping for all discovered fragments.
- Return type:
- get_conformer(c_type) Conformer
Retrieve the RDKit conformer object for the requested conformer type.
- Parameters:
c_type (ConformerType) – Conformer type to be retrieved.
- Raises:
ValueError – If conformer does not exist
- Returns:
RDKit conformer object
- Return type:
- get_scaffolds(scaffolding_method=ScaffoldingMethod.MurckoScaffold)
Compute scaffolds for a given compound using the selected scaffolding method.
- Parameters:
scaffolding_method (ScaffoldingMethod, optional) – Defaults to MurckoScaffold. Scaffolding method to use
- Returns:
Scaffolds found in the component.
- Return type:
- has_degenerated_conformer(c_type: ConformerType) bool
Determine if the given conformer has missing coordinates or is missing completely from the rdkit.Mol object. This can be used to determine whether or not the coordinates should be regenerated.
- Parameters:
c_type (ConformerType) – Type of conformer to be inspected.
- Returns:
True if more than 1 atom has coordinates [0, 0, 0] or the Conformer is not present
- Return type:
bool
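A short sketch of the regeneration check mentioned above. The helper name `usable_conformer_type` is hypothetical, and the code assumes that `compute_3d()` stores its result as the Computed conformer; verify this against the library before relying on it.

```python
def usable_conformer_type(component, ideal, computed):
    """Return the conformer type whose coordinates can be used, or None.

    `ideal` and `computed` are expected to be ConformerType.Ideal and
    ConformerType.Computed; they are passed in so that this sketch does not
    need to import pdbeccdutils itself.
    """
    if not component.has_degenerated_conformer(ideal):
        return ideal

    # Ideal coordinates are missing or degenerate: try to regenerate them.
    # Assumption: compute_3d() fills the Computed conformer and returns True
    # on success.
    if component.compute_3d():
        return computed
    return None
```

The returned value could then be passed to `get_conformer()` or used as the `conf_type` argument of the writer functions.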
- property id: str
Supply the unique identifier for the PDB-CCD, for example ‘ATP’. Obtained from CCD’s _chem_comp.id:
http://mmcif.wwpdb.org/dictionaries/mmcif_std.dic/Items/_chem_comp.id.html
If not defined then the empty string ‘’ will be returned.
- Returns:
the _chem_comp.id or ‘’.
- Return type:
- property inchi: str
Supply the InChI for the PDB-CCD. Obtained from PDB-CCD’s _pdbx_chem_comp_descriptor table line with _pdbx_chem_comp_descriptor.type=InChI, see:
http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx.dic/Items/_pdbx_chem_comp_descriptor.type.html
If not defined then the empty string ‘’ will be returned.
- Returns:
the InChI or ‘’.
- Return type:
- property inchi_from_rdkit: str
Provides the InChI worked out by RDKit.
- Returns:
the InChI or empty ‘’ if there was an error finding it.
- Return type:
- property inchikey: str
Supply the InChIKey for the PDB-CCD. Obtained from PDB-CCD’s _pdbx_chem_comp_descriptor table line with _pdbx_chem_comp_descriptor.type=InChIKey, see:
http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx.dic/Items/_pdbx_chem_comp_descriptor.type.html
If not defined then the empty string ‘’ will be returned.
- Returns:
the InChIKey or ‘’.
- Return type:
- property inchikey_from_rdkit: str
Provides the InChIKey worked out by RDKit.
- Returns:
the InChIKey or ‘’ if there was an error finding it.
- Return type:
- inchikey_from_rdkit_matches_ccd(connectivity_only: bool = False) bool
Checks whether the InChIKey from the CCD matches the InChIKey computed by RDKit.
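To illustrate what a connectivity-only comparison can mean, here is a standalone sketch that compares only the first 14-character block of two InChIKeys, which encodes the molecular skeleton. This is not the library's implementation, and the function name `inchikeys_match` is made up for the example.

```python
def inchikeys_match(ccd_key, rdkit_key, connectivity_only=False):
    """Compare two InChIKeys, optionally by their connectivity block only."""
    if not ccd_key or not rdkit_key:
        # An empty key means it could not be read or computed.
        return False
    if connectivity_only:
        # The block before the first hyphen hashes the connectivity layer.
        return ccd_key.split("-")[0] == rdkit_key.split("-")[0]
    return ccd_key == rdkit_key


# L- and D-alanine share a skeleton but differ in stereochemistry.
l_ala = "QNAYBMKLOCPYGJ-REOHZZOKSA-N"
d_ala = "QNAYBMKLOCPYGJ-UWTATZPHSA-N"
print(inchikeys_match(l_ala, d_ala))                          # False
print(inchikeys_match(l_ala, d_ala, connectivity_only=True))  # True
```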
- library_search(fragment_library: FragmentLibrary) List[SubstructureMapping]
Identify fragments from the fragment library in this component
- Parameters:
fragment_library (FragmentLibrary) – Fragment library.
- Returns:
Matches found in this run
- Return type:
- locate_fragment(mol: Mol) List[List[Atom]]
Identify substructure match in the component.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – Fragment to be matched with structure
- Returns:
List of fragments identified in the component as a list of atoms.
- Return type:
- property modified_date: date
Supply the pdbx_modified_date for the PDB-CCD. Obtained from PDB-CCD’s _chem_comp.pdbx_modified_date:
http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_chem_comp.pdbx_modified_date.html
- Returns:
Date of the entry’s last modification.
- Return type:
- property mol_no_h: Mol
RDKit mol object without hydrogens
- Returns:
RDKit mol object with stripped Hs.
- Return type:
- property name: str
Supply the ‘full name’ of the PDB-CCD, for example ‘ETHANOL’. Obtained from PDB-CCD’s _chem_comp.name:
http://mmcif.wwpdb.org/dictionaries/mmcif_std.dic/Items/_chem_comp.name.html
If not defined then the empty string ‘’ will be returned.
- Returns:
the _chem_comp.name or ‘’.
- Return type:
- property number_atoms: int
Supplies the number of atoms in the _chem_comp_atom table
- Returns:
the number of atoms in the PDB-CCD
- Return type:
- property pdbx_release_status: ReleaseStatus
Supply the pdbx_release_status for the PDB-CCD. Obtained from PDB-CCD’s _chem_comp.pdbx_release_status:
http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx.dic/Items/_chem_comp.pdbx_release_status.html
- Returns:
enum of the release status (this includes NOT_SET if no value is defined).
- Return type:
pdbeccdutils.core.enums.ReleaseStatus
- property physchem_properties
RDKit calculated properties related to the CCD compound
- property released: bool
Tests pdbx_release_status is REL.
- Returns:
True if PDB-CCD has been released.
- Return type:
- property scaffolds: List[SubstructureMapping]
Lists matched scaffolds and atom names
- Returns:
List of substructure mappings.
- Return type:
pdbeccdutils.core.ccd_writer
- Structure writing module. Presently the following formats are supported:
SDF, CIF, PDB, JSON, XYZ, XML, CML.
- raises CCDUtilsError:
If the requested format is not supported or an unrecoverable error occurs.
- pdbeccdutils.core.ccd_writer.to_cml_str(component: Component, remove_hs=True, conf_type=ConformerType.Ideal)
Converts structure to the EBI representation of the molecule in CML format: http://cml.sourceforge.net/schema/cmlCore.xsd
- Parameters:
component (Component) – Component to be exported.
remove_hs (bool, optional) – Defaults to True.
conf_type (ConformerType, optional) – Defaults to ConformerType.Ideal.
- Returns:
String representation of the component in CML format.
- Return type:
- pdbeccdutils.core.ccd_writer.to_json_dict(component: Component, remove_hs=True, conf_type=ConformerType.Ideal)
Returns component information in a dictionary suitable for JSON formatting.
- Parameters:
component (Component) – Component to be exported.
remove_hs (bool, optional) – Defaults to True.
conf_type (ConformerType, optional) – Defaults to ConformerType.Ideal.
- Raises:
AttributeError – If all conformers are requested. This feature is not supported, nor is it planned.
- Returns:
dictionary representation of the component
- Return type:
- pdbeccdutils.core.ccd_writer.to_json_str(component: Component, remove_hs=True, conf_type=ConformerType.Ideal)
Converts structure into JSON representation. https://www.json.org/
- Parameters:
component (Component) – Component to be exported.
remove_hs (bool, optional) – Defaults to True.
conf_type (ConformerType, optional) – Defaults to ConformerType.Ideal.
- Returns:
json representation of the component as a string.
- Return type:
- pdbeccdutils.core.ccd_writer.to_pdb_ccd_cif_file(path, component: Component, remove_hs=True)
Converts structure to the PDB CIF format. Both model and ideal coordinates are stored. In case ideal coordinates are missing, rdkit attempts to generate 3D coordinates of the conformer.
- pdbeccdutils.core.ccd_writer.to_pdb_str(component: Component, remove_hs: bool = True, alt_names: bool = False, conf_type: ConformerType = ConformerType.Ideal)
Converts structure to the PDB format.
- Parameters:
component (Component) – Component to be exported.
remove_hs (bool, optional) – Defaults to True.
alt_names (bool, optional) – Defaults to False. Whether or not alternate atom names should be exported.
conf_type (ConformerType, optional) – Defaults to ConformerType.Ideal.
- Returns:
String representation of the component in the PDB format.
- Return type:
- pdbeccdutils.core.ccd_writer.to_sdf_str(component: Component, remove_hs: bool = True, conf_type: ConformerType = ConformerType.Ideal)
Converts structure to the SDF format.
- Parameters:
component (Component) – Component to be exported.
remove_hs (bool, optional) – Defaults to True.
conf_type (ConformerType, optional) – Defaults to ConformerType.Ideal.
- Raises:
CCDUtilsError – In case the structure could not be exported.
- Returns:
String representation of the component in the SDF format
- Return type:
- pdbeccdutils.core.ccd_writer.to_xml_str(component: Component, remove_hs=True, conf_type=ConformerType.Ideal)
Converts structure to the XML format. Presently just molecule metadata are serialized without any coordinates, which is in accordance with the content of the PDBeChem area.
- Parameters:
component (Component) – Component to be exported.
remove_hs (bool, optional) – Defaults to True.
conf_type (ConformerType, optional) – Defaults to ConformerType.Ideal.
- Returns:
String representation of the component in XML format.
- Return type:
- pdbeccdutils.core.ccd_writer.to_xml_xml(component, remove_hs=True, conf_type=ConformerType.Ideal)
Converts structure to the XML format and returns its XML repr.
- Parameters:
component (Component) – Component to be exported.
remove_hs (bool, optional) – Defaults to True.
conf_type (ConformerType, optional) – Defaults to ConformerType.Ideal.
- Returns:
XML object
- Return type:
- pdbeccdutils.core.ccd_writer.to_xyz_str(component, remove_hs=True, conf_type=ConformerType.Ideal)
Converts structure to the XYZ format. Does not yet support ConformerType.AllConformers.
- Parameters:
component (Component) – Component to be exported.
remove_hs (bool, optional) – Defaults to True.
conf_type (ConformerType, optional) – Defaults to ConformerType.Ideal.
- Returns:
String representation of the component in the XYZ format
- Return type:
- pdbeccdutils.core.ccd_writer.write_molecule(path, component: Component, remove_hs: bool = True, alt_names: bool = False, conf_type: ConformerType = ConformerType.Ideal)
Export molecule in a specified format. Presently supported formats are: PDB CCD CIF (.cif); Mol file (.sdf); Chemical Markup language (.cml); PDB file (.pdb); XYZ file (.xyz); XML (.xml). ConformerType.AllConformers is presently supported only for PDB.
- Parameters:
path (str|Path) – Path to the file. Suffix determines format to be used.
component (Component) – Component to be exported
remove_hs (bool, optional) – Defaults to True. Whether or not hydrogens should be removed.
alt_names (bool, optional) – Defaults to False. Whether or not alternate names should be exported.
conf_type (ConformerType, optional) – Defaults to ConformerType.Ideal. Conformer type to be exported.
- Raises:
CCDUtilsError – For unsupported format
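Because the file suffix selects the output format, a batch export can be sketched as a loop over suffixes. The helper `export_formats`, the constant `SUPPORTED_SUFFIXES`, and the injectable `writer` argument are illustrative assumptions; the suffix list is copied from the description of `write_molecule` above.

```python
from pathlib import Path

# Suffixes listed for write_molecule in the text above.
SUPPORTED_SUFFIXES = {".cif", ".sdf", ".cml", ".pdb", ".xyz", ".xml"}


def export_formats(component, out_dir, suffixes, writer=None):
    """Write `component` once per requested suffix and return the paths."""
    if writer is None:
        # Imported lazily so the sketch can be loaded without pdbeccdutils.
        from pdbeccdutils.core import ccd_writer
        writer = ccd_writer.write_molecule

    written = []
    for suffix in suffixes:
        suffix = suffix.lower()
        if suffix not in SUPPORTED_SUFFIXES:
            continue  # skip formats that write_molecule would reject
        path = Path(out_dir) / f"{component.id}{suffix}"
        writer(str(path), component)
        written.append(path)
    return written
```

Filtering the suffixes up front avoids the CCDUtilsError that is documented for unsupported formats; catching that exception around the `writer` call would be an alternative design.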
pdbeccdutils.core.prd_reader
- pdbeccdutils.core.prd_reader.read_pdb_cif_file(path_to_cif: str, sanitize: bool = True) CCDReaderResult
Read in single wwPDB CCD CIF component and create its internal representation.
- Parameters:
- Raises:
ValueError – if file does not exist
- Returns:
Results of the parsing altogether with the internal representation of the component.
- Return type:
CCDReaderResult
- pdbeccdutils.core.prd_reader.read_pdb_components_file(path_to_cif: str, sanitize: bool = True) Dict[str, CCDReaderResult]
Process multiple compounds stored in the wwPDB CCD components.cif file.
- Parameters:
- Raises:
ValueError – if the file does not exist.
- Returns:
Internal representation of all the components in the components.cif file.
- Return type:
pdbeccdutils.core.prd_writer
- Structure writing module. Presently the following formats are supported:
SDF, CIF, PDB, JSON, XYZ, XML, CML.
- raises CCDUtilsError:
If the requested format is not supported or an unrecoverable error occurs.
- pdbeccdutils.core.prd_writer.to_pdb_ccd_cif_file(path, component: Component, remove_hs=True)
Converts structure to the PDB CIF format. Both model and ideal coordinates are stored. In case ideal coordinates are missing, rdkit attempts to generate 3D coordinates of the conformer.
- pdbeccdutils.core.prd_writer.to_pdb_str(component: Component, remove_hs: bool = True, alt_names: bool = False, conf_type: ConformerType = ConformerType.Ideal)
Converts structure to the PDB format.
- Parameters:
component (Component) – Component to be exported.
remove_hs (bool, optional) – Defaults to True.
alt_names (bool, optional) – Defaults to False. Whether or not alternate atom names should be exported.
conf_type (ConformerType, optional) – Defaults to ConformerType.Ideal.
- Returns:
String representation of the component in the PDB format.
- Return type:
- pdbeccdutils.core.prd_writer.write_molecule(path, component: Component, remove_hs: bool = True, alt_names: bool = False, conf_type: ConformerType = ConformerType.Ideal)
Export molecule in a specified format. Presently supported formats are: PDB CCD CIF (.cif); Mol file (.sdf); Chemical Markup language (.cml); PDB file (.pdb); XYZ file (.xyz); XML (.xml). ConformerType.AllConformers is presently supported only for PDB.
- Parameters:
path (str|Path) – Path to the file. Suffix determines format to be used.
component (Component) – Component to be exported
remove_hs (bool, optional) – Defaults to True. Whether or not hydrogens should be removed.
alt_names (bool, optional) – Defaults to False. Whether or not alternate names should be exported.
conf_type (ConformerType, optional) – Defaults to ConformerType.Ideal. Conformer type to be exported.
- Raises:
CCDUtilsError – For unsupported format
pdbeccdutils.core.clc_reader
A set of methods for identifying bound molecules (covalently bonded CCDs) from mmCIF files of proteins and creating a Component representation of the molecules.
- class pdbeccdutils.core.clc_reader.CLCReaderResult(warnings, errors, component, sanitized, bound_molecule)
- bound_molecule
Alias for field number 4
- component
Alias for field number 2
- errors
Alias for field number 1
- sanitized
Alias for field number 3
- warnings
Alias for field number 0
- pdbeccdutils.core.clc_reader.get_chem_comp_bonds(cif_block: Block, residue: str)
Returns _chem_comp_bond associated with a residue
- Parameters:
cif_block – gemmi.cif.Block object of protein mmCIF file
residue – CCD ID
- pdbeccdutils.core.clc_reader.infer_multiple_chem_comp(path_to_cif, bm, bm_id, sanitize=True)
- Parameters:
path_to_cif – Path to input structure
bm – bound-molecules identified from input structure
bm_id – ID of bound-molecule
sanitize – True if the bound molecule needs to be sanitized
- Returns:
Namedtuple containing Component representation of bound-molecule
- Return type:
- pdbeccdutils.core.clc_reader.read_clc_cif_file(path_to_cif: str, sanitize: bool = True) CCDReaderResult
Read in single CLC CIF component and create its internal representation.
- Parameters:
- Raises:
ValueError – if file does not exist
- Returns:
Results of the parsing altogether with the internal representation of the component.
- Return type:
CCDReaderResult
- pdbeccdutils.core.clc_reader.read_clc_components_file(path_to_cif: str, sanitize: bool = True) dict[str, CCDReaderResult]
Process multiple compounds stored in the wwPDB CCD components.cif file.
- Parameters:
- Raises:
ValueError – if the file does not exist.
- Returns:
Internal representation of all the components in the components.cif file.
- Return type:
- pdbeccdutils.core.clc_reader.read_pdb_cif_file(path_to_cif: str, to_discard: set[str] = {'HOH', 'UNX'}, sanitize: bool = True, assembly: bool = False) list[CLCReaderResult]
Read in single wwPDB Model CIF and create internal representation of its bound-molecules with multiple components.
- Parameters:
- Raises:
ValueError – if file does not exist
- Returns:
A list of CLCReaderResult representations, one for each bound molecule.
pdbeccdutils.core.clc_writer
- pdbeccdutils.core.clc_writer.to_cml_str(component: Component, remove_hs=True, conf_type=ConformerType.Model)
Converts structure to the EBI representation of the molecule in CML format: http://cml.sourceforge.net/schema/cmlCore.xsd
- Parameters:
component (Component) – Component to be exported.
remove_hs (bool, optional) – Defaults to True.
conf_type (ConformerType, optional) – Defaults to ConformerType.Model.
- Returns:
String representation of the component in CML format.
- Return type:
- pdbeccdutils.core.clc_writer.to_pdb_clc_cif_file(path, component: Component, remove_hs=True)
Converts structure to the PDB mmCIF format.
- Parameters:
path (str) – Path to save cif file.
component (Component) – Component to be exported.
remove_hs (bool, optional) – Defaults to True.
- pdbeccdutils.core.clc_writer.to_pdb_str(component: Component, remove_hs: bool = True, conf_type: ConformerType = ConformerType.Model)
Converts structure to the PDB format.
- Parameters:
component (Component) – Component to be exported.
remove_hs (bool, optional) – Defaults to True.
conf_type (ConformerType, optional) – Defaults to ConformerType.Model.
- Returns:
String representation of the component in the PDB format.
- Return type:
- pdbeccdutils.core.clc_writer.to_xml_xml(component)
Converts structure to the XML format and returns its XML repr.
- Parameters:
component (Component) – Component to be exported.
- Returns:
XML object
- Return type:
- pdbeccdutils.core.clc_writer.write_molecule(path, component: Component, remove_hs: bool = True, conf_type: ConformerType = ConformerType.Model)
Export molecule in a specified format. Presently supported formats are: PDB mmCIF (.cif); Mol file (.sdf); Chemical Markup language (.cml); PDB file (.pdb); XYZ file (.xyz); XML (.xml). ConformerType.AllConformers is presently supported only for PDB.
- Parameters:
path (str|Path) – Path to the file. Suffix determines format to be used.
component (Component) – Component to be exported
remove_hs (bool, optional) – Defaults to True. Whether or not hydrogens should be removed.
conf_type (ConformerType, optional) – Defaults to ConformerType.Model. Conformer type to be exported.
- Raises:
CCDUtilsError – For unsupported format
pdbeccdutils.core.boundmolecule
A set of methods for identifying bound molecules (covalently bonded CCDs) from mmCIF files of proteins and creating a MultiDiGraph representation of the molecules.
- pdbeccdutils.core.boundmolecule.find_pntr_entry(struct_conn: dict[str, list[str]], residue_pool: list[Residue], partner: int, i: int)
Helper method to find ligand residue in parsed ligands and check its connections.
- pdbeccdutils.core.boundmolecule.infer_bound_molecules(structure, to_discard, assembly=False)
Identify bound molecules in the input protein structure.
- pdbeccdutils.core.boundmolecule.parse_bound_molecules(path: str, to_discard: list[str], assembly=False) MultiDiGraph
Parse information about HETATMs from _pdbx_nonpoly_scheme and the connectivity among them from _struct_conn.
- pdbeccdutils.core.boundmolecule.parse_ligands_from_branch_scheme(branch_scheme: dict[str, list[str]], to_discard: list[str], g: MultiDiGraph, assembly=False)
Parse ligands from _pdbx_branch_scheme category of mmCIF file
- Parameters:
branch_scheme – Dictionary of _pdbx_branch_scheme category
to_discard – List of residue names not to be considered as bound molecules
g – A Graph object with nodes as Residues and their connectivity as edges
- Returns:
A MultiDiGraph object with nodes as Residues and their connectivity as edges
- pdbeccdutils.core.boundmolecule.parse_ligands_from_nonpoly_scheme(nonpoly_scheme, to_discard, assembly=False)
Parse ligands from the mmcif file.
pdbeccdutils.core.depictions
Module to aid generation of 2D depictions and evaluation of their quality
- class pdbeccdutils.core.depictions.DepictionManager(pubchem_templates_path: str = '', general_templates_path: str = '/home/runner/work/ccdutils/ccdutils/pdbeccdutils/data/general_templates')
Toolkit for depicting ligand’s structure using RDKit. One can supply either templates or 2D depictions by pubchem. PubChem templates can be downloaded using PubChemDownloader class.
- depict_molecule(het_id: str, mol: Mol) DepictionResult
Given input molecule tries to generate its depictions.
- Presently 3 methods are used:
Pubchem template - find 2d depiction in pubchem db
User-provided templates - try to use general templates
From 3D conformer - just apply default RDKit functionality
- Parameters:
het_id (str) – id of the ligand
mol (rdkit.Chem.rdchem.Mol) – molecule to be depicted
- Returns:
Summary of the ligand depiction process.
- Return type:
- class pdbeccdutils.core.depictions.DepictionValidator(mol)
Toolkit for estimation of depiction quality
- count_bond_collisions()
Counts number of collisions among all bonds. Can be used for estimations of how ‘wrong’ the depiction is.
- Returns:
number of bond collisions per molecule
- Return type:
- count_suboptimal_atom_positions(lower_bound, upper_bound)
Detects whether the structure has a pair of atoms whose distance lies in the range <lower_bound, upper_bound>, meaning that the depiction could be improved.
- depiction_score()
Calculate quality of the ligand depiction. The higher the worse. Ideally that should be 0.
- Returns:
Penalty score.
- Return type:
- has_bond_crossing()
Tells if the structure contains collisions
- Returns:
Indication about bond collisions
- Return type:
- has_degenerated_atom_positions(threshold)
Detects whether the structure has a pair of atoms closer to each other than the threshold. This can detect structures which may need a template because they cannot be handled by RDKit correctly.
pdbeccdutils.core.fragment_library
- class pdbeccdutils.core.fragment_library.FragmentLibrary(path: str = '/home/runner/work/ccdutils/ccdutils/pdbeccdutils/data/fragment_library.tsv', header: bool = True, delimiter: str = '\t', quotechar: str = '"')
Implementation of fragment library.
- generate_conformers()
Generate 3D coordinates for the fragment library.
pdbeccdutils.core.models
Module housing some of the dataclasses used throughout the pdbeccdutils application.
- class pdbeccdutils.core.models.AssemblyResidue(name: str, chain: str, res_id: str, ins_code: str, ent_id: str, orig_chain: str, operator: str)
- class pdbeccdutils.core.models.CCDProperties(id: str, name: str, formula: str, modified_date: date, pdbx_release_status: ReleaseStatus, weight: float)
Properties of the component coming from _chem_comp namespace.
- class pdbeccdutils.core.models.ConformerType(value)
Conformer type of the Component object.
- Ideal
- Model
- Depiction
2D conformation
- Computed
- AllConformers
- class pdbeccdutils.core.models.DepictionResult(source: DepictionSource, template_name: str, mol: Mol, score: float)
Depictions result details.
- Parameters:
source (DepictionSource) – Source of the depiction.
template_name (str) – template name.
mol (rdkit.Chem.rdchem.Mol) – RDKit mol object.
score (float) – Quality of the depiction, lower is better.
- source: DepictionSource
Alias for field number 0
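Since a lower score means a better depiction, choosing among several candidate results reduces to taking a minimum. The helper `best_depiction` below is a hypothetical sketch that only relies on the `score` field described above.

```python
def best_depiction(results):
    """Return the DepictionResult-like object with the lowest score, or None."""
    # Skip missing results so that a failed attempt does not break the choice.
    candidates = [result for result in results if result is not None]
    return min(candidates, key=lambda result: result.score, default=None)
```

This could be useful when comparing depictions produced with different template sets for the same ligand.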
- class pdbeccdutils.core.models.DepictionSource(value)
Where does the depiction come from.
- Pubchem - Pubchem layout used
- Template - general substructure used
- RDKit - RDKit functionality using Coordgen.
- Failed - Nothing worked.
- class pdbeccdutils.core.models.Descriptor(type: str, program: str, program_version: str, value: str)
Descriptor obtained from the cif file. This is essentially _pdbx_chem_comp_descriptor field.
- Parameters:
- class pdbeccdutils.core.models.FragmentEntry(name: str, source: str, mol: Mol)
Fragment entry in the fragment library
- Parameters:
name (str) – Name or id of the fragment.
source (str) – where does this fragment come from.
mol (rdkit.Chem.rdchem.Mol) – rdkit mol object with the fragment.
- class pdbeccdutils.core.models.InChIFromRDKit(inchi: str, warnings: str, errors: str)
InChI calculated by RDKit from rdkit.Chem.rdchem.Mol
- Parameters:
inchi – InChI calculated by RDKit
warnings – WARNINGS generated by rdkit.Chem.inchi.MolToInchi API
errors – ERRORS generated by rdkit.Chem.inchi.MolToInchi API
- class pdbeccdutils.core.models.LogType(value)
Type of logged output
- ERROR
RDKit generated error
- WARNING
RDKit generated warning
- DEPICTION_FAILED
If 2D depiction failed
- DEPICTION_SCORE
2D depiction score from DepictionValidator
- class pdbeccdutils.core.models.MolFromRDKit(mol: str, warnings: str, errors: str)
rdkit.Chem.rdchem.Mol object generated from RDKit
- Parameters:
mol – mol object generated by RDKit
warnings – WARNINGS generated by rdkit’s API
errors – ERRORS generated by rdkit’s API
- class pdbeccdutils.core.models.ParityResult(mapping: Dict[str, str], similarity_score: float)
NamedTuple for the result of parity method along with the details necessary for calculating the similarity score.
- mapping (dict of str: str): Atom-level mapping template->query.
- class pdbeccdutils.core.models.ReleaseStatus(value)
An enumeration for pdbx_release_status. Allowed values include REL and HOLD, see: http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx.dic/Items/_chem_comp.pdbx_release_status.html
Notes
An additional value ‘NOT_SET’ has been added for case where pdbx_release_status has not been set.
- class pdbeccdutils.core.models.Residue(name: str, chain: str, res_id: str, ins_code: str, ent_id: str)
Represents a single residue.
Attributes:
name: Corresponds to _atom_site.label_comp_id
chain: Corresponds to _atom_site.auth_asym_id
res_id: Corresponds to _atom_site.auth_seq_id
ins_code: Corresponds to _atom_site.pdbx_PDB_ins_code
ent_id: Entity id
id: ID of the Residue
- to_arpeggio()
Gets Arpeggio style representation of a residue e.g. /A/129/ or /A/129A/ in case there is an insertion code.
- Returns:
Residue description in Arpeggio style.
- Return type:
str
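The Arpeggio-style string described above can be reproduced with a small standalone sketch. `ResidueSketch` is a hypothetical stand-in, not the library's Residue class, and it assumes that a missing insertion code is stored as an empty string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResidueSketch:
    """Hypothetical stand-in carrying only the fields needed for the format."""
    chain: str
    res_id: str
    ins_code: str = ""  # assumption: empty string when there is no insertion code

    def to_arpeggio(self) -> str:
        # /<chain>/<residue id><insertion code>/
        return f"/{self.chain}/{self.res_id}{self.ins_code}/"


print(ResidueSketch("A", "129").to_arpeggio())       # /A/129/
print(ResidueSketch("A", "129", "A").to_arpeggio())  # /A/129A/
```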
- class pdbeccdutils.core.models.SanitisationResult(mol: Mol, status: str)
Sanitisation result details.
- Parameters:
mol – rdkit.Chem.rdchem.RWMol
status – Status of sanitisation process.
- class pdbeccdutils.core.models.ScaffoldingMethod(value)
Rdkit scaffold methods
- class pdbeccdutils.core.models.Subcomponent(name, id)
Represents a subcomponent in a component
- Parameters:
name – Name of the subcomponent
id – Id of the subcomponent
pdbeccdutils.core.exceptions
- exception pdbeccdutils.core.exceptions.CCDUtilsError
Internal error of the pdbeccdutils package.
- exception pdbeccdutils.core.exceptions.EntryFailedException