Atomselection feature of CodeEntropy

Arghya Chakravorty edited this page Apr 2, 2022 · 3 revisions

Atomselection in CodeEntropy

CodeEntropy can select atoms by their properties. Selections make it easier to create cells (groups of atoms or beads) for which entropies are computed. For example, selecting all the atoms and making a single cell from them gives the entropy at the molecule level, while selecting the atoms in each residue of a protein and creating one cell per residue gives the entropy at the residue level.

The design of the built-in Atomselection feature is inspired by the selection syntax of CHARMM and VMD. The feature is implemented as an Atomselection class. An object of this class holds information about the selected atoms, which can be accessed for different purposes.

To make a selection, an object of class BaseMolecule must already exist. One can be created using one of the Readers in the program.

An example that selects all the atoms in the residue with resid 6 is shown below:

  from CodeEntropy.ClassCollection import Atomselection as SEL
  
  # assumes an object of Class BaseMolecule called mainMolecule exists.
  sel6 = SEL.Atomselection(mainMolecule,"resid 6")

Its equivalent in VMD (on the top molecule) would be

  set sel6 [atomselect top "resid 6"]

Selection tokens

Selections can be made using the following selection tokens. As in CHARMM's selection syntax, tokens are case-insensitive and only the first four letters of a token are required to make a valid selection.

  • "ALL" # selects all atoms

  • "NONE" # selects no atom

  • "HYDRogen" # selects all the hydrogen atoms (atoms with names starting with letter H)

  • "NAME < atom_name >" # selects atoms with name atom_name

  • "RESName < residue_name >" # selects atoms in residues with name residue_name

  • "RESId < residue_id >" # selects atoms in a residue with resid residue_id

  • "RESD <residue index (0-based)>" # selects atoms in a residue matched by a 0-indexed residue index

  • "BYNUmber < index of an atom (0-indexed) > " # selects a single atom with provided index (as present in the topology file)

  • "BONDed_to < index of an atom (0-indexed) >" # selects all the atoms covalently bonded to the atom with the provided index

  • "BONHydrogen < index of an atom (0-indexed) >" # selects an atom and the hydrogens covalently bonded to it (includes the atom itself)

As the list shows, selection tokens differ in the arguments they accept: some take no argument, while others take one. The token and its argument must be supplied together as a Python string, which makes it easy to build selections from variables using f-strings or str.format(). The example above can be reproduced in the following way:

  res6 = 6
  sel6 = SEL.Atomselection(mainMolecule, f"resid {res6}")
  
  # or equivalently
  sel6 = SEL.Atomselection(mainMolecule, "resid {}".format(res6))
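
The four-letter, case-insensitive matching rule for selection tokens can be illustrated with a small self-contained sketch. This is not CodeEntropy's actual parser: the names TOKENS, match_token and parse_selection are hypothetical, and the choice to reject words shorter than four letters (unless they spell a complete token such as ALL) is an assumption made for the illustration.

```python
# Hypothetical sketch of four-letter, case-insensitive token matching.
# Not CodeEntropy's real implementation; all names here are illustrative.
TOKENS = (
    "ALL", "NONE", "HYDROGEN", "NAME", "RESNAME", "RESID", "RESD",
    "BYNUMBER", "BONDED_TO", "BONHYDROGEN",
)


def match_token(word):
    """Return the full token that `word` abbreviates, or None if there is none."""
    upper = word.upper()
    for token in TOKENS:
        # Assumption: at least four letters are needed, unless the word
        # is a complete token that is itself shorter (for example "ALL").
        if token.startswith(upper) and (len(upper) >= 4 or upper == token):
            return token
    return None


def parse_selection(text):
    """Split a selection string into (token, argument); argument may be None."""
    parts = text.split(maxsplit=1)
    if not parts:
        raise ValueError("empty selection string")
    token = match_token(parts[0])
    if token is None:
        raise ValueError(f"unknown selection token: {parts[0]!r}")
    argument = parts[1].strip() if len(parts) > 1 else None
    return token, argument


res6 = 6
print(parse_selection(f"resi {res6}"))  # abbreviated, lower-case token
print(parse_selection("HYDR"))          # token without an argument
```

Under these assumptions, "resi 6", "RESID 6" and "ResId 6" all resolve to the same token, while "res 6" is rejected because three letters cannot distinguish RESId from RESName or RESD.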
  

Compound selections

The Atomselection feature also supports compound selections, in which two or more selections are logically combined to create a new selection. This is done using the & (AND) and | (OR) operators.

For example, to select CA atoms in a residue with resid 30, the following can be done.

  selCA30 = SEL.Atomselection(mainMolecule, "name CA") & SEL.Atomselection(mainMolecule, "resid 30")

Its equivalent in VMD would be

  set selCA30 [atomselect top "name CA and resid 30"]

The NOT operation (using ~) is also permitted. To select all the heavy atoms in the residue with resid 30, for example, the following can be done.

  res30heavy = SEL.Atomselection(mainMolecule, "resid 30") & ~SEL.Atomselection(mainMolecule, "hydrogen")

Its equivalent in VMD would be

  set res30heavy [atomselect top "resid 30 and noh"]
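
To show how the &, | and ~ operators can act on selection objects, here is a minimal stand-alone sketch built on Python's operator-overloading hooks. ToySelection is a hypothetical class written for this page, not the Atomselection class itself. It assumes that a selection can be represented as a set of 0-based atom indices plus the total number of atoms, which is needed to compute the complement for ~.

```python
# Hypothetical toy class; it only mimics the operator behaviour described
# above and does not reflect how Atomselection is implemented internally.
class ToySelection:
    def __init__(self, n_atoms, indices):
        self.n_atoms = n_atoms              # total atoms in the molecule
        self.indices = frozenset(indices)   # 0-based indices of selected atoms

    def __and__(self, other):
        # AND: atoms present in both selections
        return ToySelection(self.n_atoms, self.indices & other.indices)

    def __or__(self, other):
        # OR: atoms present in either selection
        return ToySelection(self.n_atoms, self.indices | other.indices)

    def __invert__(self):
        # NOT: every atom of the molecule that is not selected
        everything = frozenset(range(self.n_atoms))
        return ToySelection(self.n_atoms, everything - self.indices)

    def get_indices(self):
        return sorted(self.indices)


# A made-up six-atom molecule: atoms 2-5 belong to the residue of interest,
# and atoms 1, 3 and 5 are hydrogens.
resid30 = ToySelection(6, [2, 3, 4, 5])
hydrogen = ToySelection(6, [1, 3, 5])

heavy = resid30 & ~hydrogen   # ~ binds more tightly than &
print(heavy.get_indices())    # [2, 4]
```

Because ~ has higher precedence than & in Python, and & has higher precedence than |, an expression such as a | b & ~c is evaluated as a | (b & (~c)); parentheses make the intent explicit when several operators are mixed.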

Looping through a selection

The properties of the individual atoms that make up a selection are often useful. The get_indices() function returns the 0-based indices of the selected atoms as a Python list, so the index of any selected atom can be obtained by ordinary list indexing.

Example:

  # print the indices of the atoms in res30heavy
  for i in res30heavy.get_indices():
    print(i)

Another useful feature is retrieving the indices one at a time, in the order in which they are stored in the selection's internal list. This is done using the get_index() function. For example,

  # print the index of first atom
  print(res30heavy.get_indices()[0])
  
  # equivalent to
  print(res30heavy.get_index())
  
  # Another call will return the next index
  print(res30heavy.get_index()) # == print(res30heavy.get_indices()[1])

The function returns None once the list has been exhausted.
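
The one-at-a-time behaviour of get_index(), including the None returned at the end, can be mimicked with a short stand-alone sketch. IndexCursor is a hypothetical class used only for illustration. It assumes that the selection keeps an internal position that advances on each call, which is consistent with the description above but is not taken from the CodeEntropy source.

```python
# Hypothetical stand-in for the cursor-like behaviour of get_index().
class IndexCursor:
    def __init__(self, indices):
        self._indices = list(indices)
        self._pos = 0  # position of the next index to hand out

    def get_indices(self):
        # Return a copy so callers cannot disturb the internal list
        return list(self._indices)

    def get_index(self):
        # Return the next stored index, or None once all have been returned
        if self._pos >= len(self._indices):
            return None
        value = self._indices[self._pos]
        self._pos += 1
        return value


cursor = IndexCursor([0, 7, 9])

collected = []
# Compare against None explicitly: atom index 0 is valid but falsy.
while (i := cursor.get_index()) is not None:
    collected.append(i)

print(collected)           # [0, 7, 9]
print(cursor.get_index())  # None, the list is exhausted
```

The explicit "is not None" test matters because a loop written as "while i:" would stop early whenever atom 0 is part of the selection.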

Number of selected atoms

The number of selected atoms is available through the nsel attribute of the object.

  # to obtain the total number of atoms selected
  natoms = res30heavy.nsel

Overall, the feature is designed to make it easy to work with multiple selections and to combine them using ordinary Python syntax.

This feature is currently under development.