pdbecif¶
This mmcif module contains all the classes necessary to read and write either a data or a dictionary mmCIF file.
Reading files can be achieved using either CifFileReader or MMCIF2Dict.
pdbecif.mmcif¶
The module contains all the objects necessary to represent either a data CIF file or a dictionary CIF file.
mmCIF data files¶
DATA mmCIF files are represented in one of three interchangeable ways:
1. As a series of objects that encapsulate each major component of mmCIF
CifFile -> DataBlock -> [ SaveFrame -> ] Category -> Item
2. As a Python wrapper to a dictionary. Categories and items are accessed through the familiar Python dot (.) notation.
3. As a dictionary of the form
{ DATABLOCK_ID: { CATEGORY: { ITEM: VALUE } } }
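The dictionary form can be sketched in plain Python. This is only an illustration: the block ID `DEMO` and all values are made up, the leading underscore on category keys and the list-per-item layout for looped categories are assumptions based on the descriptions on this page, and `iter_rows` is a hypothetical helper, not part of pdbecif.

```python
# Hypothetical mmCIF-like dictionary: {DATABLOCK_ID: {CATEGORY: {ITEM: VALUE}}}
mmcif_like = {
    "DEMO": {
        "_entry": {"id": "DEMO"},  # single-row category: scalar values
        "_atom_site": {            # looped category: one list per item
            "id": ["1", "2", "3"],
            "type_symbol": ["N", "C", "O"],
        },
    }
}


def iter_rows(category):
    """Yield one {item: value} dict per row of a category.

    Scalar values are treated as a single-row column. All list-valued
    items are assumed to have the same length.
    """
    columns = {
        name: value if isinstance(value, list) else [value]
        for name, value in category.items()
    }
    n_rows = max((len(col) for col in columns.values()), default=0)
    for i in range(n_rows):
        yield {name: col[i] for name, col in columns.items()}


rows = list(iter_rows(mmcif_like["DEMO"]["_atom_site"]))
entry_rows = list(iter_rows(mmcif_like["DEMO"]["_entry"]))
```

Here `rows` holds one dictionary per atom, while `entry_rows` holds a single row because `_entry` is not looped.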
mmCIF dictionaries¶
DICTIONARY mmCIF files can ONLY be represented as (1) above i.e.:
As a series of objects that encapsulate each major component of mmCIF
CifFile -> DataBlock -> [ SaveFrame -> ] Category -> Item
Because dictionary files contain SaveFrame objects, these representations are not interchangeable: the conversion to dictionary-type objects has not yet been implemented.
- class pdbecif.mmcif.CIFWrapper(d, data_id=None, preserve_token_order=False)¶
CIFWrapper is a wrapper object for the output of the MMCIF2Dict object i.e., an mmCIF-like python dictionary object. This implies that mmCIF-like dictionaries written outside this package may be used to initialize the CIFWrapper class as well. The CIFWrapper object emulates python objects by providing access to mmCIF categories and items using the familiar python ‘dot’ notation.
- unwrap()¶
Extract encapsulated data to return an mmCIF-like python dictionary
- class pdbecif.mmcif.CIFWrapperTable(d, preserve_token_order=False)¶
CIFWrapperTable represents (and wraps up) mmCIF category-like dictionaries. Categories that are stored as dictionary-like objects are represented as tables, and their items and data are accessed using familiar python ‘dot’ notation.
- search(item, value)¶
Search for values of items in tables.
- Parameters
item (str) – Name of the data item to be looked up.
value (str) – Search key. Can also be a compiled regular expression, e.g. re.compile(r'[A-Z][a-z]')
- Returns
Effectively a dictionary with a row-like structure: {row_id: {"category_name": "value"}}.
- Return type
dict
- searchiter(item, value)¶
Highly optimised search for values of items in tables.
- Parameters
item (str) – Name of the data item to be looked up.
value (str) – Search key. Can also be a compiled regular expression, e.g. re.compile(r'[A-Z][a-z]')
- Returns
Effectively a dictionary with a row-like structure: {row_id: {"category_name": "value"}}.
- Return type
dict
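The documented return shape of search can be illustrated with a small pure-Python sketch. This is not the library's implementation: `search_table` is a hypothetical stand-in that works on a column-oriented dictionary, and both the exact-equality test for plain strings and the use of `re.search` for compiled patterns are assumptions.

```python
import re


def search_table(table, item, value):
    """Return {row_id: {item_name: value}} for rows where `item` matches.

    `table` maps item names to equally long lists of strings.
    `value` is either a plain string (exact match) or a compiled regex.
    """
    hits = {}
    for row_id, cell in enumerate(table[item]):
        if hasattr(value, "search"):      # compiled regular expression
            matched = value.search(cell) is not None
        else:                             # plain string: exact comparison
            matched = cell == value
        if matched:
            hits[row_id] = {name: col[row_id] for name, col in table.items()}
    return hits


# Made-up data for illustration only.
atom_site = {
    "id": ["1", "2", "3"],
    "label_atom_id": ["N", "CA", "C"],
    "type_symbol": ["N", "C", "C"],
}

by_string = search_table(atom_site, "type_symbol", "C")
by_regex = search_table(atom_site, "label_atom_id", re.compile(r"^C"))
```

Both calls select the second and third rows, keyed by their zero-based row index.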
- class pdbecif.mmcif.Category(category_id, parent)¶
Category objects store and manage Item objects. Categories that contain Items that are lists of values would represent looped categories. Category objects are stored and managed by either DataBlock or SaveFrame objects.
- getItemNames()¶
List the Items (by name) stored by Category
- getItems()¶
Retrieve all Item objects
- remove()¶
Remove Category from SaveFrame or DataBlock and add Category to SaveFrame or DataBlock recycle bin
- removeChild(child)¶
Remove Item from the Category using Item(object) or item name ID(string)
- class pdbecif.mmcif.CifFile(file_path=None, mmcif_data_map=None, preserve_token_order=False)¶
CifFile represents all the objects contained/part of an mmCIF file or dictionary. It stores and manages DataBlock objects.
- getDataBlockIds()¶
List the DataBlocks (by ID) stored by CifFile
- getDataBlocks()¶
Retrieve all DataBlock objects stored by CifFile
- import_mmcif_data_map(mmcif_data_map)¶
Populates all objects necessary to represent mmCIF data files. mmcif_data_map is an mmCIF-like dictionary of the form:
{
DATABLOCK_ID: { CATEGORY: { ITEM: VALUE } }
}
- removeChild(child)¶
Remove DataBlock from the CifFile using DataBlock(object) or DataBlock ID(string). Returns True if the child was removed, otherwise False.
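The object hierarchy CifFile -> DataBlock -> Category -> Item can be walked with the getter methods listed on this page. The sketch below is a hypothetical helper, `outline`, that uses only getDataBlocks(), getCategories() and getItemNames(); it assumes each of these returns an iterable or sized collection, which the page does not state explicitly, and it ignores SaveFrames.

```python
def outline(cif_file):
    """Summarise a CifFile-like object.

    Returns one (number of categories, total number of items) pair
    per data block, in the order the blocks are returned.
    """
    summary = []
    for block in cif_file.getDataBlocks():
        categories = list(block.getCategories())
        n_items = sum(len(category.getItemNames()) for category in categories)
        summary.append((len(categories), n_items))
    return summary
```

Because the helper relies only on method names, any object exposing the same three methods can be passed in, which keeps it easy to check in isolation.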
- class pdbecif.mmcif.DataBlock(block_id, parent)¶
DataBlock stores and manages SaveFrame and Category objects in CIF files.
- getCategories()¶
Retrieve all Category objects
- getCategoryIds()¶
List the Categories (by ID) stored by DataBlock
- getSaveFrameIds()¶
List the SaveFrames (by ID) stored by DataBlock
- getSaveFrames()¶
Retrieve all SaveFrame objects stored by DataBlock
- remove()¶
Remove DataBlock from CifFile and add DataBlock to CifFile recycle bin
- removeChild(child)¶
Remove Category/SaveFrame from the DataBlock using Category/SaveFrame(object) or Category/SaveFrame ID(string)
- updateId(block_id)¶
Change the DataBlock ID
- class pdbecif.mmcif.Item(item_name, parent)¶
Item objects are stored and managed by Category objects while Item objects store and manage values in CIF files. Items that are lists would represent looped categories.
- getFormattedValue()¶
Return the value as it should appear (formatted) in the CIF file
- getRawValue()¶
Raw value is the unformatted value stored by the item
- remove()¶
Remove Item from Category and add Item to the Category recycle bin
- reset()¶
Reset one or all values of the Item to ‘.’
- class pdbecif.mmcif.SaveFrame(saveFrame_id, parent)¶
SaveFrame objects store and manage Category objects (Dictionary CIF only). SaveFrame objects are stored and managed by DataBlock objects.
- getCategories()¶
Retrieve all Category objects
- getCategoryIds()¶
List the Categories (by ID) stored by SaveFrame
- remove()¶
Remove SaveFrame from DataBlock and add SaveFrame to DataBlock recycle bin
- removeChild(child)¶
Remove Category from the SaveFrame using Category(object) or Category ID(string)
- updateId(saveFrame_id)¶
Change the SaveFrame definition ID
pdbecif.mmcif_io¶
- class pdbecif.mmcif_io.CifFileReader(input='data', verbose=False, preserve_order=False)¶
CifFileReader takes a path to an mmCIF file (data or dictionary CIF) and, once the file is read, returns a representation of the mmCIF file.
- read(file_path, output='cif_dictionary', ignore=[], preserve_order=False, only=None)¶
Read in mmCIF file
- Parameters
file_path (str) – Path to the mmCIF file
output (str, optional) – Type of object the CIF file should be read into. Should be one of: cif_dictionary (plain python dictionary); cif_wrapper (CIFWrapper); or cif_file (CifFile). Defaults to “cif_dictionary”.
ignore (list, optional) – List of category names to be ignored. Defaults to [].
preserve_order (bool, optional) – Whether the order of categories should be kept. Defaults to False.
only (list, optional) – List of category names to be retrieved. Others are discarded. Defaults to None.
- Returns
In-memory representation of the mmCIF file based on the passed parameters.
- Return type
object
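A minimal usage sketch for read, built from the signature above. The helper names `read_categories` and `first_datablock` are hypothetical, the import is done lazily so the snippet loads even where pdbecif is not installed, and the assumption that category names are passed with a leading underscore follows the `_atom_site` example elsewhere on this page.

```python
def read_categories(file_path, categories=None, skip=None):
    """Read an mmCIF data file into a plain dictionary via CifFileReader.

    `categories` is forwarded as `only`, `skip` as `ignore`.
    """
    # Lazy import: requires the pdbecif package at call time only.
    from pdbecif.mmcif_io import CifFileReader

    reader = CifFileReader(input="data")
    return reader.read(
        file_path,
        output="cif_dictionary",
        ignore=skip or [],
        only=categories,
    )


def first_datablock(cif_dictionary):
    """Return (block_id, categories) for the first data block."""
    block_id = next(iter(cif_dictionary))
    return block_id, cif_dictionary[block_id]
```

A call such as `read_categories(path, categories=["_entry"])` would then be expected to return a dictionary of the form { DATABLOCK_ID: { CATEGORY: { ITEM: VALUE } } } restricted to the requested categories.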
- class pdbecif.mmcif_io.CifFileWriter(file_path=None, compress=False, mode='wt', preserve_order=False)¶
CifFileWriter writes mmCIF formatted files and accepts mmCIF-like dictionaries, CIFWrapper objects, and CifFile objects.
- write(cifObjIn, compress=False, mode='wt', preserve_order=False)¶
Write out object into a mmCIF file.
- Parameters
cifObjIn (object) – Can be one of CifFile, CIFWrapper or dict
compress (bool, optional) – Whether or not the resulting file should be gzipped. Defaults to False.
mode (str, optional) – Mode used for file opening. Defaults to “wt”.
preserve_order (bool, optional) – Preserve order of category names in the input object. Defaults to False.
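A minimal sketch of writing an mmCIF-like dictionary, using only the constructor and write signatures shown above with their defaults. The helper `write_dictionary` and the `demo` data are hypothetical, and the dictionary layout (underscore-prefixed category keys, lists for looped items) is an assumption.

```python
def write_dictionary(file_path, mmcif_like):
    """Write an mmCIF-like dictionary to `file_path` with CifFileWriter."""
    # Lazy import: requires the pdbecif package at call time only.
    from pdbecif.mmcif_io import CifFileWriter

    writer = CifFileWriter(file_path)
    writer.write(mmcif_like)


# Made-up content for illustration only.
demo = {
    "DEMO": {
        "_entry": {"id": "DEMO"},
        "_citation_author": {
            "name": ["Doe, J.", "Roe, R."],
            "ordinal": ["1", "2"],
        },
    }
}
```

Calling `write_dictionary("demo.cif", demo)` would be the intended usage; the same call should also accept a CIFWrapper or CifFile object, since write is documented to take any of the three.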
pdbecif.mmcif_tools¶
Very low-level access to mmCIF data files. MMCIF2Dict has one method, ‘parse()’, that returns (datablock_id, mmCIF_data) tuples as (str, dict).
MMCIF2Dict is very fast at reading mmCIF data.
- class pdbecif.mmcif_tools.MMCIF2Dict¶
MMCIF2Dict is a purely algorithmic parser that takes as input public mmCIF files and creates a python dictionary from them.
Because this parser is highly optimised for public mmCIF format, it is highly unlikely that it will work successfully on any other formatted mmCIF file.
MMCIF2Dict will not work on mmCIF dictionaries!
Users are able to speed up parsing of public mmCIF data files substantially by including a list of categories that the parser can ignore if encountered.
For example:
parser.parse(path, ignoreCategories=["_atom_site", "_atom_site_anisotrop"])
will ignore all coordinate lines in the file.
- parse(file_path, ignoreCategories=[], preserve_token_order=False, onlyCategories=[])¶
Public method which only functions to check the existence of the mmCIF file in preparation for reading in the private parseFile method.