Add agat sp filter feature from kill list #105

Merged
merged 34 commits into from
Oct 26, 2024
Changes from 33 commits
34 commits
464a432
add help
Leila011 Jul 31, 2024
02bd937
add config
Leila011 Jul 31, 2024
c969fd9
add run script
Leila011 Jul 31, 2024
e529fae
add test data and expected output + script to fetch them
Leila011 Jul 31, 2024
54b3dac
update config: kill_list as Inputs
Leila011 Jul 31, 2024
df7c366
all fetch kill_list.txt
Leila011 Jul 31, 2024
6a482c7
add tests
Leila011 Jul 31, 2024
45fd69a
update changelog
Leila011 Jul 31, 2024
e44163f
run script: fixe `verbose` usage
Leila011 Aug 7, 2024
6471a5c
Merge main into add-agat_sp_filter_feature_from_kill_list
rcannood Aug 13, 2024
bf57bd4
Update src/agat/agat_sp_filter_feature_from_kill_list/config.vsh.yaml
Leila011 Aug 19, 2024
09645c4
Update src/agat/agat_sp_filter_feature_from_kill_list/config.vsh.yaml
Leila011 Aug 19, 2024
4430d2f
Update src/agat/agat_sp_filter_feature_from_kill_list/config.vsh.yaml
Leila011 Aug 19, 2024
b7d8954
Update src/agat/agat_sp_filter_feature_from_kill_list/config.vsh.yaml
Leila011 Aug 19, 2024
dfcf5eb
Update src/agat/agat_sp_filter_feature_from_kill_list/config.vsh.yaml
Leila011 Aug 19, 2024
06a37e3
update --config description
Leila011 Aug 19, 2024
53477d9
add requirements
Leila011 Aug 19, 2024
8367f7f
format the description of --type
Leila011 Aug 19, 2024
b974505
update --config description
Leila011 Aug 19, 2024
bccbd5f
update formatting --type description
Leila011 Aug 19, 2024
9c7fd67
add mutliple to --type
Leila011 Aug 19, 2024
7a22888
create temporary directory and clean up on exit
Leila011 Aug 19, 2024
4819514
convert par_type to comma separated list
Leila011 Aug 19, 2024
04198b0
add set -e
Leila011 Aug 19, 2024
e74f583
fix create temporary directory
Leila011 Aug 19, 2024
08b0ad5
add set -eo pipefail to script and test files
Leila011 Aug 19, 2024
409c13c
fix create temporary directory
Leila011 Aug 19, 2024
c74f180
fix typo
Leila011 Aug 19, 2024
31c7662
cleanup changelog
Leila011 Aug 19, 2024
79ecbc6
cleanup changelog
Leila011 Aug 19, 2024
39e3044
Minor chanegs to config
emmarousseau Oct 7, 2024
421158a
reduce test data size
emmarousseau Oct 7, 2024
c7d0f03
Merge branch 'main' into add-agat_sp_filter_feature_from_kill_list
emmarousseau Oct 26, 2024
02892d3
Merge branch 'main' into add-agat_sp_filter_feature_from_kill_list
rcannood Oct 26, 2024
10 changes: 7 additions & 3 deletions CHANGELOG.md
@@ -2,15 +2,17 @@

## NEW FUNCTIONALITY

* `agat`:
- `agat/agat_convert_genscan2gff`: convert a genscan file into a GFF file (PR #100).

* `bd_rhapsody/bd_rhapsody_sequence_analysis`: BD Rhapsody Sequence Analysis CWL pipeline (PR #96).

* `rsem/rsem_calculate_expression`: Calculate expression levels (PR #93).

* `nanoplot`: Plotting tool for long read sequencing data and alignments (PR #95).

* `agat`:
- `agat/agat_convert_genscan2gff`: convert a genscan file into a GFF file (PR #100).
- `agat/agat_sp_filter_feature_from_kill_list`: remove features in a GFF file based on a kill list (PR #105).


## BREAKING CHANGES

* `falco`: Fix a typo in the `--reverse_complement` argument (PR #157).
@@ -55,6 +57,7 @@
- `agat/agat_convert_sp_gff2tsv`: convert gtf/gff file into tabulated file (PR #102).
- `agat/agat_convert_sp_gxf2gxf`: fixes and/or standardizes any GTF/GFF file into full sorted GTF/GFF file (PR #103).


* `bedtools`:
- `bedtools/bedtools_intersect`: Allows one to screen for overlaps between two sets of genomic features (PR #94).
- `bedtools/bedtools_sort`: Sorts a feature file (bed/gff/vcf) by chromosome and other criteria (PR #98).
@@ -91,6 +94,7 @@

* `trimgalore`: Quality and adapter trimming for fastq files (PR #117).


## MINOR CHANGES

* `busco` components: update BUSCO to `5.7.1` (PR #72).
105 changes: 105 additions & 0 deletions src/agat/agat_sp_filter_feature_from_kill_list/config.vsh.yaml
@@ -0,0 +1,105 @@
name: agat_sp_filter_feature_from_kill_list
namespace: agat
description: |
  Remove features based on a kill list. The default behaviour is to look at the feature's ID.
  If the feature has an ID (case insensitive) listed among the kill list it will be removed.
  Removing a level1 or level2 feature will automatically remove all linked subfeatures, and
  removing all children of a feature will automatically remove this feature too.
keywords: [gene annotations, filtering, gff]
links:
  homepage: https://github.com/NBISweden/AGAT
  documentation: https://agat.readthedocs.io/en/latest/tools/agat_sp_filter_feature_from_kill_list.html
  issue_tracker: https://github.com/NBISweden/AGAT/issues
  repository: https://github.com/NBISweden/AGAT
references:
  doi: 10.5281/zenodo.3552717
license: GPL-3.0
requirements:
  - commands: [agat]
authors:
  - __merge__: /src/_authors/leila_paquay.yaml
    roles: [ author, maintainer ]

argument_groups:
  - name: Inputs
    arguments:
      - name: --gff
        alternatives: [-f, --ref, --reffile]
        description: Input GFF3 file that will be read.
        type: file
        required: true
      - name: --kill_list
        alternatives: [--kl]
        description: Text file containing the kill list. One value per line.
        type: file
        required: true
        example: kill_list.txt
  - name: Outputs
    arguments:
      - name: --output
        alternatives: [-o, --out]
        description: |
          Path to the output GFF file that contains filtered features.
        type: file
        direction: output
        required: true
  - name: Arguments
    arguments:
      - name: --type
        alternatives: [-p, -l]
        description: |
          Primary tag option (case insensitive, list). Specifies the feature types that
          will be handled.

          You can select a specific feature type by giving its primary tag name (column 3), e.g.:

          * cds
          * Gene
          * mRNA

          You can also select all features of a particular
          level directly:

          * level2=mRNA,ncRNA,tRNA,etc
          * level3=CDS,exon,UTR,etc.

          By default all features are taken into account. Setting the option to "all"
          has the same behaviour.
        type: string
        multiple: true
      - name: --attribute
        alternatives: [-a]
        description: |
          Attribute tag to specify the attribute to analyse. Case sensitive. Default: ID
        type: string
        example: ID
      - name: --config
        alternatives: [-c]
        description: |
          AGAT config file. By default AGAT takes the original agat_config.yaml shipped with AGAT.
          The `--config` option gives you the possibility to use your own AGAT config file (located
          elsewhere or named differently).
        type: file
        example: custom_agat_config.yaml
      - name: --verbose
        alternatives: [-v]
        description: Verbose option for debugging purposes.
        type: boolean_true
resources:
  - type: bash_script
    path: script.sh
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: test_data
engines:
  - type: docker
    image: quay.io/biocontainers/agat:1.4.0--pl5321hdfd78af_0
    setup:
      - type: docker
        run: |
          agat --version | sed 's/AGAT\s\(.*\)/agat: "\1"/' > /var/software_versions.txt
runners:
  - type: executable
  - type: nextflow
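
The Docker `setup` step in the config above rewrites the tool's version banner into a `name: "version"` line. The sketch below shows that transformation in isolation. The input string `AGAT v1.4.0` is an assumed banner used only for illustration (the real `agat --version` output may differ), `format_version` is a hypothetical helper name, and the GNU-specific `\s` is swapped for the POSIX class `[[:space:]]` so the sketch does not depend on GNU sed.

```shell
# Hypothetical helper: turn a line like "AGAT <version>" into 'agat: "<version>"'.
# Uses [[:space:]] instead of the GNU-only \s found in the config.
format_version() {
  sed 's/AGAT[[:space:]]\(.*\)/agat: "\1"/'
}

# Assumed banner text; the actual output of `agat --version` is not shown in this PR.
version_line=$(printf 'AGAT v1.4.0\n' | format_version)
echo "$version_line"
```

With the assumed input, the captured line has the same shape as an entry in `/var/software_versions.txt`.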
85 changes: 85 additions & 0 deletions src/agat/agat_sp_filter_feature_from_kill_list/help.txt
@@ -0,0 +1,85 @@
```sh
agat_sp_filter_feature_from_kill_list.pl --help
```

------------------------------------------------------------------------------
| Another GFF Analysis Toolkit (AGAT) - Version: v1.4.0 |
| https://github.com/NBISweden/AGAT |
| National Bioinformatics Infrastructure Sweden (NBIS) - www.nbis.se |
------------------------------------------------------------------------------


Name:
agat_sp_filter_feature_from_kill_list.pl

Description:
The script aims to remove features based on a kill list. The default
behaviour is to look at the features's ID. If the feature has an ID
(case insensitive) listed among the kill list it will be removed. /!\
Removing a level1 or level2 feature will automatically remove all linked
subfeatures, and removing all children of a feature will automatically
remove this feature too.

Usage:
agat_sp_filter_feature_from_kill_list.pl --gff infile.gff --kill_list file.txt [ --output outfile ]
agat_sp_filter_feature_from_kill_list.pl --help

Options:
-f, --reffile, --gff or -ref
Input GFF3 file that will be read

-p, --type or -l
primary tag option, case insensitive, list. Allow to specied the
feature types that will be handled. You can specified a specific
feature by given its primary tag name (column 3) as: cds, Gene,
MrNa You can specify directly all the feature of a particular
level: level2=mRNA,ncRNA,tRNA,etc level3=CDS,exon,UTR,etc By
default all feature are taking into account. fill the option by
the value "all" will have the same behaviour.

--kl or --kill_list
Kill list. One value per line.

-a or --attribute
Attribute tag to specify the attribute to analyse. Case
sensitive. Default: ID

-o or --output
Output GFF file. If no output file is specified, the output will
be written to STDOUT.

-v Verbose option for debugging purpose.

-c or --config
String - Input agat config file. By default AGAT takes as input
agat_config.yaml file from the working directory if any,
otherwise it takes the orignal agat_config.yaml shipped with
AGAT. To get the agat_config.yaml locally type: "agat config
--expose". The --config option gives you the possibility to use
your own AGAT config file (located elsewhere or named
differently).

-h or --help
Display this helpful text.

Feedback:
Did you find a bug?:
Do not hesitate to report bugs to help us keep track of the bugs and
their resolution. Please use the GitHub issue tracking system available
at this address:

https://github.com/NBISweden/AGAT/issues

Ensure that the bug was not already reported by searching under Issues.
If you're unable to find an (open) issue addressing the problem, open a new one.
Try as much as possible to include in the issue when relevant:
- a clear description,
- as much relevant information as possible,
- the command used,
- a data sample,
- an explanation of the expected behaviour that is not occurring.

Do you want to contribute?:
You are very welcome, visit this address for the Contributing
guidelines:
https://github.com/NBISweden/AGAT/blob/master/CONTRIBUTING.md
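
To make the matching rule in the help text concrete, here is a rough sketch that previews which GFF lines carry an `ID` listed in a kill list, compared case-insensitively. It is not AGAT's implementation and only approximates the default ID matching: it does not propagate removal to parents or children, and it ignores `--type` and `--attribute`. The sample records, file names, and variable names are all made up for illustration.

```shell
# Hypothetical sample data; real inputs would be a GFF3 file and a kill list.
tmp=$(mktemp -d)
printf 'chr1\tsrc\tgene\t1\t100\t.\t+\t.\tID=Gene1\n' > "$tmp/in.gff"
printf 'chr1\tsrc\tgene\t200\t300\t.\t+\t.\tID=gene2;Name=x\n' >> "$tmp/in.gff"
printf 'GENE1\n' > "$tmp/kill_list.txt"

# First file: load lower-cased kill-list values.
# Second file: print GFF lines whose ID attribute (column 9) is in the list.
hits=$(awk -F '\t' '
  NR == FNR { kill[tolower($1)] = 1; next }
  /^#/ { next }
  {
    n = split($9, attrs, ";")
    for (i = 1; i <= n; i++) {
      if (attrs[i] ~ /^ID=/ && (tolower(substr(attrs[i], 4)) in kill)) {
        print
        break
      }
    }
  }
' "$tmp/kill_list.txt" "$tmp/in.gff")

# Show only the attribute column of the lines that would be hit.
hit_attrs=$(printf '%s\n' "$hits" | cut -f 9)
echo "$hit_attrs"
rm -rf "$tmp"
```

Here `GENE1` in the kill list matches the record with `ID=Gene1`, while `gene2` is left alone.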
22 changes: 22 additions & 0 deletions src/agat/agat_sp_filter_feature_from_kill_list/script.sh
@@ -0,0 +1,22 @@
#!/bin/bash

set -eo pipefail

## VIASH START
## VIASH END

# unset flags
[[ "$par_verbose" == "false" ]] && unset par_verbose

# convert par_type to comma separated list
par_type=$(echo "$par_type" | tr ';' ',')

# run agat_sp_filter_feature_from_kill_list
agat_sp_filter_feature_from_kill_list.pl \
  --gff "$par_gff" \
  --kill_list "$par_kill_list" \
  --output "$par_output" \
  ${par_type:+--type "${par_type}"} \
  ${par_attribute:+--attribute "${par_attribute}"} \
  ${par_config:+--config "${par_config}"} \
  ${par_verbose:+-v}
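
The script relies on two small shell idioms: converting the `;`-separated `--type` values into a comma-separated list, and `${var:+word}`, which expands to `word` only when `var` is set and non-empty. The sketch below isolates both. The `par_*` values are hypothetical stand-ins for what Viash would inject, and the arguments are collected with `set --` instead of being passed to AGAT.

```shell
# Hypothetical values standing in for the variables Viash would provide.
par_type='mRNA;CDS'
par_attribute=''
par_verbose='false'

# A "false" flag is unset so that ${par_verbose:+-v} expands to nothing.
[ "$par_verbose" = "false" ] && unset par_verbose

# Multiple values arrive separated by ';'; the tool expects ','.
par_type=$(printf '%s' "$par_type" | tr ';' ',')

# ${var:+word} yields word only if var is set and non-empty,
# so empty or unset options vanish from the argument list.
set -- ${par_type:+--type "$par_type"} \
       ${par_attribute:+--attribute "$par_attribute"} \
       ${par_verbose:+-v}

args="$*"
echo "$args"   # prints: --type mRNA,CDS
```

Only `--type` survives because `par_attribute` is empty and `par_verbose` was unset.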
36 changes: 36 additions & 0 deletions src/agat/agat_sp_filter_feature_from_kill_list/test.sh
@@ -0,0 +1,36 @@
#!/bin/bash

set -eo pipefail

## VIASH START
## VIASH END

test_dir="${meta_resources_dir}/test_data"

# create temporary directory and clean up on exit
TMPDIR=$(mktemp -d "$meta_temp_dir/$meta_functionality_name-XXXXXX")
function clean_up {
  [[ -d "$TMPDIR" ]] && rm -rf "$TMPDIR"
}
trap clean_up EXIT

echo "> Run $meta_name with test data"
"$meta_executable" \
  --gff "$test_dir/1_truncated.gff" \
  --kill_list "$test_dir/kill_list.txt" \
  --output "$TMPDIR/output.gff"

echo ">> Checking output"
[ ! -f "$TMPDIR/output.gff" ] && echo "Output file output.gff does not exist" && exit 1

echo ">> Check if output is empty"
[ ! -s "$TMPDIR/output.gff" ] && echo "Output file output.gff is empty" && exit 1

echo ">> Check if output matches expected output"
if ! diff "$TMPDIR/output.gff" "$test_dir/test_output.gff"; then
  echo "Output file output.gff does not match expected output"
  exit 1
fi

echo "> Test successful"
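
Two details of this test pattern are easy to get wrong under `set -e`: the cleanup function only runs if it is registered with `trap`, and a bare `diff` that finds differences ends the script before any follow-up `$?` check is reached. The sketch below illustrates both with throwaway files. The names `workdir`, `expected.txt`, and `actual.txt` are hypothetical and not part of the component's test data.

```shell
set -e

# Create a scratch directory and register its removal for script exit.
workdir=$(mktemp -d)
clean_up() {
  [ -d "$workdir" ] && rm -rf "$workdir"
}
trap clean_up EXIT

# Two small files that differ on the second line.
printf 'a\nb\n' > "$workdir/expected.txt"
printf 'a\nc\n' > "$workdir/actual.txt"

# Running diff as the `if` condition keeps `set -e` from aborting the script,
# so the mismatch branch (and its message) is actually reached.
if diff "$workdir/actual.txt" "$workdir/expected.txt" > /dev/null; then
  result="match"
else
  result="mismatch"
fi
echo "$result"
```

In a real test the `else` branch would print an error and `exit 1`; the `EXIT` trap then still removes the scratch directory.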