VEBA_v2.1.0b (pre-release)
Beta release of VEBA v2.1.0b with updates that address peer reviewers' comments. The changes are mostly documentation, but also include the following:
- [2024.4.30] - Added `concatenate_files.py`, which can concatenate files (including a mix of compressed and uncompressed files) given as arguments, a list file, or a glob. The reason for this is that Unix limits the number of arguments a command can take (e.g., `cat *.fasta > output.fasta` will crash when `*.fasta` expands to 50k files).
- [2024.4.29] - Added `/volumes/workspace/` directory to Docker containers for situations when your input and output directories are the same.
- [2024.4.29] - `featureCounts` can only handle 64 threads at a time, so added `min(64, opts.n_jobs)` for all the modules/scripts that use `featureCounts` commands.
- [2024.4.23] - Added `uniprot_to_enzymes.py`, which reformats tables and fasta from https://www.uniprot.org/uniprotkb?query=ec%3A*
- [2024.4.18] - Developed a faster implementation of `KofamScan` called `PyKofamSearch`, which leverages `PyHmmer`. This will be used in future versions of VEBA.
- [2024.3.26] - Added `--metaeuk_split_memory_limit` to `metaeuk_wrapper.py`.
- [2024.3.26] - Added `-d/--genome_identifier_directory_index` to `scaffolds_to_bins.py` for directories that are structured as `path/to/genomes/bin_a/reference.fasta`, where you would use `-d -2`.
- [2024.3.26] - Added `--minimum_af` to `edgelist_to_clusters.py` with an option to accept 4-column inputs: `[id_1]<tab>[id_2]<tab>[weight]<tab>[alignment_fraction]`. `global_clustering.py`, `local_clustering.py`, and `cluster.py` now use this by default with `--af_threshold 30.0`. If you want to retain the previous behavior, just use `--af_threshold 0.0`.
- [2024.3.18] - `edgelist_to_clusters.py` only includes edges where both nodes are in the identifiers set. If `--identifiers` are provided, then only those identifiers are used. If not, then it includes all nodes.
- [2024.3.18] - Added `--export_representatives` argument for `edgelist_to_clusters.py` to output a table with `[id_node]<tab>[id_cluster]<tab>[intra-cluster_connectivity]<tab>[representative]`. Also includes this information in `nx.Graph` objects.
- [2024.3.18] - Changed singleton weight to `np.nan` instead of `np.inf` for `edgelist_to_clusters.py` to allow for representative calculations.
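
To illustrate the idea behind concatenating a mix of compressed and uncompressed files without relying on shell argument expansion, here is a minimal Python sketch. It is not the code of `concatenate_files.py`; the function names `open_maybe_gzip` and `concatenate` are hypothetical, and only gzip compression is handled.

```python
import gzip
import shutil


def open_maybe_gzip(path):
    """Open a file for binary reading, decompressing it if it is gzip."""
    # gzip streams start with the two magic bytes 0x1f 0x8b
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rb")
    return open(path, "rb")


def concatenate(paths, output_path):
    """Write the decompressed contents of each input to output_path, in order."""
    with open(output_path, "wb") as out:
        for path in paths:
            with open_maybe_gzip(path) as handle:
                # Stream in chunks so large files are never fully loaded in memory
                shutil.copyfileobj(handle, out)
```

Because the file list is built inside Python (for example with `glob.glob("genomes/*.fasta*")`), the paths never pass through the shell's argument list, so the argument limit described above does not apply.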
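
The `-d -2` example for `scaffolds_to_bins.py` can be read as a negative index into the components of the file path. The sketch below shows that reading; it is an assumption about the semantics, not the script's code, and `genome_id_from_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def genome_id_from_path(path, directory_index=-2):
    """Return the path component at directory_index as the genome identifier."""
    # parts for "path/to/genomes/bin_a/reference.fasta" are
    # ("path", "to", "genomes", "bin_a", "reference.fasta")
    return PurePosixPath(path).parts[directory_index]


print(genome_id_from_path("path/to/genomes/bin_a/reference.fasta", -2))  # bin_a
```

Under this reading, `-1` would select the file name itself and `-2` the directory that contains it.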
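
The effect of an alignment-fraction threshold on a 4-column edgelist can be sketched as follows. This is a simplified illustration under several assumptions: clusters are taken to be connected components, edges are kept when the alignment fraction is greater than or equal to the threshold, and nodes that lose all their edges stay as singletons. It uses a standard-library union-find in place of the `nx.Graph` objects mentioned above, and `cluster_edges` is a hypothetical name.

```python
import csv


def cluster_edges(lines, minimum_af=30.0):
    """Cluster tab-separated edges [id_1, id_2, weight, alignment_fraction].

    Returns a sorted list of clusters, each a sorted list of node ids.
    Rows with only 3 columns are kept without any alignment-fraction filter.
    """
    parent = {}

    def find(node):
        # Register unseen nodes, then walk to the root with path halving
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for row in csv.reader(lines, delimiter="\t"):
        if not row:
            continue
        id_1, id_2 = row[0], row[1]
        # Register both nodes first so filtered nodes remain as singletons
        find(id_1)
        find(id_2)
        if len(row) > 3 and float(row[3]) < minimum_af:
            continue  # edge fails the alignment-fraction threshold
        parent[find(id_1)] = find(id_2)

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), []).append(node)
    return sorted(sorted(members) for members in clusters.values())
```

With a threshold of `0.0` every edge passes the filter, which corresponds to the previous behavior described in the notes.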