select_atoms, select_residues, isresidue, residuesdict
diegozea committed Jun 25, 2024
1 parent fda198d commit 151548c
Showing 10 changed files with 88 additions and 115 deletions.
20 changes: 20 additions & 0 deletions NEWS.md
@@ -1,5 +1,25 @@
## MIToS.jl Release Notes

### Changes from v2.19.0 to v2.20.0

* *[Breaking change]* The PDB module has deprecated `residues` and `@residues` in favor of
the `select_residues` function that uses keyword arguments.
So, `residues(pdb, "1", "A", "ATOM", All)` or `@residues pdb "1" "A" "ATOM" All` should be
replaced by `select_residues(pdb, model="1", chain="A", group="ATOM")`.

* *[Breaking change]* The PDB module has deprecated `atoms` and `@atoms` in favor of
the `select_atoms` function that uses keyword arguments.
So, `atoms(pdb, "1", "A", "ATOM", All, "CA")` or `@atoms pdb "1" "A" "ATOM" All "CA"` should be
replaced by `select_atoms(pdb, model="1", chain="A", group="ATOM", atom="CA")`.

* *[Breaking change]* The PDB module has deprecated the methods of the `isresidue` and
`residuesdict` functions that rely on positional arguments in favor of the keyword arguments.
So, `isresidue(pdb, "1", "A", "ATOM", "10")` should be replaced by
`isresidue(pdb, model="1", chain="A", group="ATOM", residue="10")`. Similarly,
`residuesdict(pdb, "1", "A", "ATOM", All)` should be replaced by
`residuesdict(pdb, model="1", chain="A", group="ATOM")`.
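
The deprecations above follow a common Julia pattern: the positional method stays for a while, emits a deprecation warning, and forwards to the keyword method. The following is a minimal sketch of that pattern, not the MIToS implementation; `toy_pick` and the named-tuple records are hypothetical names used only for illustration.

```julia
# Hypothetical example: `toy_pick` is NOT a MIToS function.

# Keyword-based method: the replacement API.
toy_pick(items; chain=nothing) =
    chain === nothing ? collect(items) : [x for x in items if x.chain == chain]

# Deprecated positional method: warn, then forward to the keyword method.
# The warning is only displayed when Julia runs with --depwarn=yes.
function toy_pick(items, chain)
    Base.depwarn(
        "toy_pick(items, chain) is deprecated; use toy_pick(items, chain=chain).",
        :toy_pick,
    )
    return toy_pick(items; chain=chain)
end

# Toy records standing in for parsed residues.
items = [(chain="A", number="1"), (chain="B", number="1")]

old_style = toy_pick(items, "A")        # deprecated positional call
new_style = toy_pick(items, chain="A")  # keyword call
```

Both calls return the same selection, so callers can migrate one call site at a time.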


### Changes from v2.18.0 to v2.19.0

* *[Breaking change]* The `shuffle` and `shuffle!` functions are deprecated in favor of the
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,6 +1,6 @@
name = "MIToS"
uuid = "51bafb47-8a16-5ded-8b04-24ef4eede0b5"
version = "2.19.0"
version = "2.20.0"

[deps]
ArgParse = "c7e460c6-2fb9-53a9-8c5b-16f535851c63"
63 changes: 19 additions & 44 deletions docs/src/PDB.md
@@ -81,42 +81,35 @@ CA_1ivo[1] # First residue. It has only the α carbon.
MIToS parses PDB files into a vector of residues, instead of using a hierarchical structure
like other packages. This approach makes the search and selection of residues or atoms a
little different.
To make it easy, this module exports a number of functions and macros to select particular
residues or atoms. Given the fact that residue numbers from different chains, models, etc.
can collide, **it's mandatory to indicate the `model`, `chain`, `group`, `residue` number
and `atom` name in a explicit way** to these functions or macros. If you want to select all
the residues in one of the categories, you are able to use the type `All`. You can also use
regular expressions or functions to make the selections.
To make it easy, this module exports the `select_residues` and `select_atoms` functions.
Because residue numbers from different chains, models, etc. can collide, you can
indicate the `model`, `chain`, `group`, `residue` number and `atom` name through the
keyword arguments of those functions. To select everything in one of those categories,
use the type `All`, which is the default value of these arguments.
You can also use regular expressions or functions to make the selections.

```@example pdb_select
using MIToS.PDB
pdbfile = downloadpdb("1IVO", format=PDBFile)
residues_1ivo = read_file(pdbfile, PDBFile)
# Select residue number 9 from model 1 and chain B
residues(residues_1ivo, "1", "B", All, "9")
# Select residue number 9 from model 1 and chain B (it looks in both ATOM and HETATM groups)
select_residues(residues_1ivo, model="1", chain="B", residue="9")
```
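
To illustrate how keyword selectors that accept a string, a regular expression, a function, or a select-everything default can work, here is a self-contained toy sketch. It is not how MIToS implements `select_residues`; `ToyResidue`, `Everything`, `matches`, and `toy_select` are hypothetical names, with `Everything` standing in for `All`.

```julia
# Toy illustration only: none of these names belong to MIToS.
struct ToyResidue
    model::String
    chain::String
    group::String
    number::String
end

# Stand-in for a select-everything marker such as `All`.
struct Everything end

# One small method per kind of selector.
matches(::Type{Everything}, value::AbstractString) = true
matches(expected::AbstractString, value::AbstractString) = expected == value
matches(pattern::Regex, value::AbstractString) = occursin(pattern, value)
matches(predicate::Function, value::AbstractString) = predicate(value)

# Every keyword defaults to `Everything`, so omitted categories are not filtered.
function toy_select(residues; model=Everything, chain=Everything,
                    group=Everything, residue=Everything)
    filter(residues) do r
        matches(model, r.model) && matches(chain, r.chain) &&
            matches(group, r.group) && matches(residue, r.number)
    end
end

toy = [
    ToyResidue("1", "A", "ATOM", "9"),
    ToyResidue("1", "B", "ATOM", "9"),
    ToyResidue("1", "B", "HETATM", "10A"),
]

by_chain = toy_select(toy, chain="B")      # exact string match
by_regex = toy_select(toy, residue=r"^9$") # regular expression
# Function selector: parse the leading digits, ignoring any insertion code.
by_function = toy_select(toy, chain="B",
                         residue=n -> parse(Int, match(r"^(\d+)", n)[1]) >= 10)
```

Dispatching on the selector type keeps the filtering loop identical for all four kinds of selector, which is one plausible way to support such an interface.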

### Getting a `Dict` of `PDBResidue`s

If you prefer a `Dict` of `PDBResidue`, indexed by their residue numbers, you can use the
`residuedict` function or the `@residuedict` macro.
`residuesdict` function.

```@example pdb_select
# Dict of residues from the model 1, chain A and from the ATOM group
chain_a = residuesdict(residues_1ivo, "1", "A", "ATOM", All)
chain_a = residuesdict(residues_1ivo, model="1", chain="A", group="ATOM")
chain_a["9"]
```

You can do the same with the macro `@residuesdict` to get a more readable code

```@example pdb_select
chain_a = @residuesdict residues_1ivo model "1" chain "A" group "ATOM" residue All
chain_a["9"]
```
```

### Select particular residues

Use the `residues` function to collect specific residues. It's possible to use a single
Use the `select_residues` function to collect specific residues. It's possible to use a single
**residue number** (e.g. `"2"`) or even a **function** which should return true for the
selected residue numbers. Also **regular expressions** can be used to select residues.
Use `All` to select all the residues.
@@ -130,7 +123,7 @@ residue_list = map(string, 2:5)
```

```@example pdb_select
first_res = residues(residues_1ivo, "1", "A", "ATOM", resnum -> resnum in residue_list)
first_res = select_residues(residues_1ivo, model="1", chain="A", group="ATOM", residue=resnum -> resnum in residue_list)
for res in first_res
println(res.id.name, " ", res.id.number)
@@ -142,7 +135,7 @@ A more complex example using an anonymous function:
```@example pdb_select
# Select all the residues of the model 1, chain A of the ATOM group with residue number less than 5
first_res = residues(residues_1ivo, "1", "A", "ATOM", x -> parse(Int, match(r"^(\d+)", x)[1]) <= 5 )
first_res = select_residues(residues_1ivo, model="1", chain="A", group="ATOM", residue=x -> parse(Int, match(r"^(\d+)", x)[1]) <= 5 )
# The anonymous function takes the residue number (a string) and uses a regular expression
# to extract the number (without the insertion code).
# It converts the number to `Int` to test whether it is `<= 5`.
@@ -152,35 +145,17 @@ for res in first_res
end
```

Use the `@residues` macro for a cleaner syntax.

```@example pdb_select
# You can use All, regular expressions or functions also for model, chain and group:
# i.e. Takes the residue 10 from chains A and B
for res in @residues residues_1ivo model "1" chain ch -> ch in ["A","B"] group "ATOM" residue "10"
println(res.id.chain, " ", res.id.name, " ", res.id.number)
end
```

### Select particular atoms

The `atoms` function or macro allow to select a particular set of atoms.
The `select_atoms` function allows you to select a particular set of atoms.

```@example pdb_select
# Select all the atoms with name starting with "C" using a regular expression
# from all the residues of the model 1, chain A of the ATOM group
carbons = @atoms residues_1ivo model "1" chain "A" group "ATOM" residue All atom r"C.+"
carbons = select_atoms(residues_1ivo, model="1", chain="A", group="ATOM", residue=All, atom=r"C.+")
carbons[1]
```

You can also use the `atoms` function instead of the `@atoms` macro:

```@example pdb_select
atoms(residues_1ivo, "1", "A", "ATOM", All, r"C.+")[1]
```

## Protein contact map
@@ -202,7 +177,7 @@ pdbfile = downloadpdb("1IVO", format=PDBFile)
residues_1ivo = read_file(pdbfile, PDBFile)
pdb = @residues residues_1ivo model "1" chain "A" group "ATOM" residue All
pdb = select_residues(residues_1ivo, model="1", chain="A", group="ATOM")
dmap = distance(pdb, criteria="All") # Minimum distance between residues using all their atoms
```
@@ -256,8 +231,8 @@ pdbfile = downloadpdb("2HHB")
res_2hhb = read_file(pdbfile, PDBML)
chain_A = pdb = @residues res_2hhb model "1" chain "A" group "ATOM" residue All
chain_C = pdb = @residues res_2hhb model "1" chain "C" group "ATOM" residue All
chain_A = select_residues(res_2hhb, model="1", chain="A", group="ATOM", residue=All)
chain_C = select_residues(res_2hhb, model="1", chain="C", group="ATOM", residue=All)
using Plots
gr()
Original file line number Diff line number Diff line change
@@ -106,7 +106,7 @@ Hx = mapcolfreq!(entropy,
# functions from the MIToS `PDB` module:

using MIToS.PDB
res_dict = residuesdict(read_file(pdb_file, PDBFile, occupancyfilter=true), "1", "A") # model 1 chain A
res_dict = residuesdict(read_file(pdb_file, PDBFile, occupancyfilter=true), model="1", chain="A")

# Then, we can iterate the mapping dictionary to link the MSA and PDB based
# values:
2 changes: 1 addition & 1 deletion scripts/Distances.jl
@@ -86,7 +86,7 @@ set_parallel(Args["parallel"])
model_arg = string(args["model"]) == "All" ? All : string(args["model"])
chain_arg = string(args["chain"]) == "All" ? All : string(args["chain"])
group_arg = string(args["group"]) == "All" ? All : string(args["group"])
res = residues(res, model_arg, chain_arg, group_arg, All)
res = select_residues(res, model=model_arg, chain=chain_arg, group=group_arg, residue=All)
N = length(res)
inter = !Bool(args["inter"])
for i in 1:(N-1)
2 changes: 2 additions & 0 deletions src/PDB/PDB.jl
@@ -41,10 +41,12 @@ export # PDBResidues
contact,
isresidue,
isatom,
select_residues,
residues,
@residues,
residuesdict,
@residuesdict,
select_atoms,
atoms,
@atoms,
findheavy,
3 changes: 2 additions & 1 deletion test/PDB/Contacts.jl
@@ -236,7 +236,8 @@

code = "2VQC"
pdb = read_file(txt(code), PDBFile)
residues = @residues pdb model "1" chain "A" group "ATOM" residue x -> x in ["62","64","65"]
residues = select_residues(pdb, model="1", chain="A", group="ATOM",
residue = x -> x in ["62","64","65"])

@test contact(residues, 6.05) == ( [1 1 0
1 1 1
8 changes: 4 additions & 4 deletions test/PDB/Kabsch.jl
@@ -107,10 +107,10 @@

hemoglobin = read_file(joinpath(DATA, "2hhb.pdb.gz"),PDBFile,group="ATOM",model="1")

α1 = @residues hemoglobin model "1" chain "A" group "ATOM" residue All
α2 = @residues hemoglobin model "1" chain "C" group "ATOM" residue All
β1 = @residues hemoglobin model "1" chain "B" group "ATOM" residue All
β2 = @residues hemoglobin model "1" chain "D" group "ATOM" residue All
α1 = select_residues(hemoglobin, model="1", chain="A", group="ATOM")
α2 = select_residues(hemoglobin, model="1", chain="C", group="ATOM")
β1 = select_residues(hemoglobin, model="1", chain="B", group="ATOM")
β2 = select_residues(hemoglobin, model="1", chain="D", group="ATOM")

a1, a2, rα = superimpose(α1, α2)


2 comments on commit 151548c

@diegozea
@JuliaRegistrator
Registration pull request created: JuliaRegistries/General/109758

Tip: Release Notes

Did you know you can add release notes too? Just add markdown formatted text underneath the comment after the text
"Release notes:" and it will be added to the registry PR, and if TagBot is installed it will also be added to the
release that TagBot creates. i.e.

@JuliaRegistrator register

Release notes:

## Breaking changes

- blah

To add them here just re-invoke and the PR will be updated.

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v2.20.0 -m "<description of version>" 151548c9cfe3d4a05d40cbf21e4430ccaadf699e
git push origin v2.20.0
