Edited by William B Whitman a, Maria Chuvochina b, Brian P Hedlund c, Philip Hugenholtz b, Kostas T Konstantinidis d, Alison E Murray e, Marike Palmer c, Donovan H Parks b, Alexander J Probst f, Anna-Louise Reysenbach g, Luis M Rodriguez-R h, Ramon Rossello-Mora i, Iain Sutcliffe j and Stephanus N Venter k
- a Department of Microbiology, University of Georgia, Athens, GA, USA
- b The University of Queensland, School of Chemistry and Molecular Biosciences, Australian Centre for Ecogenomics, Australia
- c School of Life Sciences, University of Nevada, Las Vegas, NV, USA
- d School of Civil and Environmental Engineering, Georgia Tech, Atlanta, GA, USA
- e Division of Earth and Ecosystem Sciences, Desert Research Institute, Reno, NV, USA
- f Department of Chemistry, Environmental Microbiology and Biotechnology (EMB), Group for Aquatic Microbial Ecology and Centre of Water and Environmental Research (ZWU), University of Duisburg-Essen, Essen, Germany
- g Biology Department, Portland State University, Portland, OR, USA
- h Department of Microbiology and Digital Science Center (DiSC), University of Innsbruck, Innrain 15/01-05, Innsbruck 6020, Austria
- i Marine Microbiology Group, Department of Animal and Microbial Diversity, Mediterranean Institute of Advanced Studies (CSIC-UIB), Esporles, Illes Balears, Spain
- j Faculty of Health & Life Sciences, Northumbria University, Newcastle upon Tyne, UK
- k Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, South Africa
The progress of prokaryotic microbiology requires a precise system of nomenclature accepted by the majority of microbiologists. For practical purposes the term prokaryotes is a synonym for Archaea and Bacteria.
To achieve order in nomenclature, it is essential that scientific names be regulated by internationally accepted rules.
The rules that govern the scientific nomenclature used in the biological sciences are embodied in international codes of nomenclature.
Rules of nomenclature do not govern the delimitation of taxa or determine their relations. The rules are primarily for assessing the correctness of the names applied to defined taxa; they also prescribe procedures for creating and proposing new names.
The Code of Nomenclature of Prokaryotes Described from Sequence Data applies to the naming of all prokaryotes where the lower taxa (species and subspecies) are typified by a DNA sequence. This code is colloquially referred to as the SeqCode to distinguish it from the International Code of Nomenclature of Prokaryotes (ICNP), which applies to the naming of all prokaryotes where the lower taxa (species and subspecies) are typified by either a strain or illustration/description. The nomenclature of eukaryotic microbial groups is provided for by other Codes: fungi and algae by the International Code of Nomenclature for algae, fungi and plants; protozoa by the International Code of Zoological Nomenclature. The nomenclature of viruses is provided for by the International Code of Virus Classification and Nomenclature.
The Committee on the Systematics of Prokaryotes Described from Sequence Data, colloquially the SeqCode Committee, has been established to provide mechanisms to emend, interpret, and consider exemptions to the rules of the SeqCode.
The SeqCode is divided into principles, rules and recommendations. The principles (Chapter 2) form the basis of the Code, and the rules and recommendations are derived from them. The rules (Chapter 3) are designed to implement the principles. The recommendations (Chapter 3) supplement some of the rules and do not have the force of rules. They are intended as guides to desirable practice. Names contrary to recommendations cannot be rejected for this reason. Appendices may be added to assist in the application of this Code and do not form the legislative part of this Code.
Nomenclature deals with the application of names to the following taxonomic ranks, i.e., “subspecies”, “species”, “genus”, “family”, “order”, “class”, and “phylum”.
The SeqCode is an instrument of scientific communication.
The SeqCode has one fundamental aim, which is to provide a standardized, robust, and stable system of nomenclature for prokaryotes that is compatible with the freedom of scientists to classify prokaryotes according to taxonomic opinion. Nothing in the SeqCode may be construed to restrict the freedom of taxonomic opinion or action.
The nomenclature of prokaryotes is not independent of botanical, zoological and viral nomenclature. When naming new taxa at the rank of genus or higher, names that are already regulated by the International Code of Zoological Nomenclature, the International Code of Nomenclature for algae, fungi and plants and the International Code of Virus Classification and Nomenclature must not be used.
The names formed under the SeqCode are not independent of the names regulated by the International Code of Nomenclature of Prokaryotes (ICNP) and formed before January 1, 2022. Before that date, legitimate names formed under the ICNP have priority. After that date, the rules of priority of the SeqCode are used to determine priority of names formed under the ICNP compared with names formed under the SeqCode.
The scientific names of all taxa are Latin or Latinized words treated as Latin.
The primary purpose of giving a name to a taxon is to supply a means of referring to it. A secondary consideration is that names should aid memorability.
Names of taxa are tied to their nomenclatural types, referred to as types in the SeqCode. Types should serve as reference points that allow the unambiguous identification of taxa.
The correct name of a taxon is based upon valid publication, legitimacy, taxonomic position, and priority of publication.
A name only has standing in nomenclature if it is validly published under the rules of the SeqCode.
Each taxon with a given circumscription, position, and rank can bear only one correct name, i.e., the earliest that is in accordance with the rules of the SeqCode. The circumscription of a taxon is an indication of its limits or the set of biological entities it contains. The position of a taxon or rank is an indication of its relationship to a parent taxon within a taxonomy.
To promote stability, the name of a taxon should not be changed or replaced without sufficient reason based on either further taxonomic studies or the need to rectify a name that is contrary to the rules of the SeqCode.
Names created should be clear enough to avoid errors, confusion, or misunderstandings.
The Code of Nomenclature of Prokaryotes Described from Sequence Data, or SeqCode, will take effect on January 1, 2022.
The SeqCode Legislative Commission has been established as the legislative branch of the SeqCode Committee in accordance with its statutes. The SeqCode Legislative Commission is the only body authorized to amend the SeqCode.
The SeqCode Reconciliation Commission has been established as the judicial branch of the SeqCode Committee to make decisions pertaining to the application of the SeqCode in accordance with its statutes. Examples of cases for the Reconciliation Commission may include: (a) cases in which the consequences or interpretation of a rule are uncertain, (b) cases in which the application of a name is likely to endanger health or have serious economic consequences, or (c) cases where the application of a rule is likely to lead to confusion. When the opinion of the SeqCode Reconciliation Commission is sought, a summary of pertinent facts should be submitted to the Reconciliation Commission. The SeqCode Reconciliation Commission is the only body authorized to render decisions on the application of the SeqCode.
The SeqCode Registry has been established to record and maintain names that are formed or recognized under the SeqCode in accordance with its statutes. Registration of names constitutes valid publication and is required for naming under the SeqCode.
The rules of the SeqCode are retroactive except where specified.
Names contrary to a rule may not be maintained in the SeqCode Registry.
The taxonomic categories covered by these rules are given below in descending taxonomic rank.
- Phylum
- Class
- Order
- Family
- Genus
- Species
- Subspecies
The relative order of these categories may not be altered in any classification even though definitions of taxonomic categories may vary with individual opinion.
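As an illustration only (it is not part of the SeqCode), the fixed descending order of the seven categories can be encoded as an ordered enumeration. The names `Rank` and `is_descending` are hypothetical, and the sketch assumes that a lineage may omit some ranks (subspecies, for instance, is optional) but may never reverse their order.

```python
from enum import IntEnum


class Rank(IntEnum):
    """The seven categories covered by the rules, highest rank first."""
    PHYLUM = 1
    CLASS = 2
    ORDER = 3
    FAMILY = 4
    GENUS = 5
    SPECIES = 6
    SUBSPECIES = 7


def is_descending(ranks):
    """Return True if every rank is strictly lower than the one before it."""
    return all(a < b for a, b in zip(ranks, ranks[1:]))


# A lineage that skips family and order but keeps the relative order.
lineage = [Rank.PHYLUM, Rank.CLASS, Rank.GENUS, Rank.SPECIES]
print(is_descending(lineage))
```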
The use of the taxonomic category of subspecies is optional.
Intermediate or informal ranks or categories not mentioned in Rule 7a are not covered by the SeqCode.
An author who describes and names a new taxon should indicate the rank of the taxon and where possible the rank and name of the parent taxon. If the parent taxon is not currently named, the author should name it.
The scientific names of all taxa must be treated as Latin and spelled only with the Latin alphabet. A species name is a binary combination of a genus name and a specific epithet; names of taxa above the rank of species are single words. Typographical signs, numbers, and additional characters cannot be used.
A name at any taxonomic rank can only refer to a single type.
A later homonym of a name formed under the nomenclatural codes listed in Principle 2 cannot be used.
To form new prokaryotic names, authors are advised as follows:
- Names that are very long or difficult to pronounce should be avoided.
- Names should differ by at least three characters from existing names of genera or species within the same genus.
- Languages other than Latin should be avoided when Latin equivalents exist or can be constructed by combining Latin word elements. Exceptions include names derived from local items such as foods, drinks, geographic localities, and other names for which no Latin words exist.
- Authors should not name organisms after themselves. If names are formed from personal names, they should contain only the name of one person. They may contain the untruncated family and/or first names.
- All personal genus names should be feminine regardless of the gender identity of the person they commemorate.
- Names should not be deliberately contentious or abusive of any person, race, religion, political belief, or ideology.
- Names that include mnemonic cues are preferred because they promote learning and memory.
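The advice that names differ by at least three characters does not say how the difference is counted. One possible reading, sketched here, counts single-character insertions, deletions, and substitutions (Levenshtein distance). This metric is an assumption of the sketch rather than something the SeqCode prescribes, and the function names and the example genus names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def too_similar(candidate: str, existing_names, minimum_difference: int = 3):
    """Return the existing names that differ from the candidate by fewer
    than `minimum_difference` edits, ignoring letter case."""
    return [name for name in existing_names
            if edit_distance(candidate.lower(), name.lower()) < minimum_difference]


# Hypothetical names used only to exercise the check.
print(too_similar("Exemplibacter", ["Exemplobacter", "Escherichia"]))
```

A stricter or looser metric (for example, counting only substitutions) would be an equally defensible reading of the recommendation.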
The name of a genus is a noun or adjective used as a noun, in the singular number and written with an initial capital letter.
Authors should attend to the following Recommendation and those of Recommendation 9 when forming genus names.
- Names that have the same suffixes as those used for the higher taxonomic ranks should be avoided: -aceae, -ales, -ia, and -ota (see Table 1).
The name of a species is a binary combination consisting of the name of the genus followed by a single word known as a species epithet. The genus part of the name must begin with an initial capital letter, and the species epithet must begin with a lowercase letter.
A species epithet must be related to the genus name in one of three ways.
- As an adjective. Example: aureus in Staphylococcus aureus.
- As a substantive (noun) in apposition in the nominative case. Example: Desulfovibrio gigas or other names cited in Trüper and de’Clari (1997).
- As a noun in the genitive case. Example: coli in Escherichia coli.
Authors should attend to the following recommendations and those of Recommendation 9 when forming species names.
- When a species epithet is chosen to indicate a property or source of the species, epithets should not express a character common to all, or nearly all, the species of a genus.
- When the species epithet is an adjective, it should agree in gender with the genus name.
The name of a subspecies is a ternary combination consisting of the genus name, the species epithet, and the subspecies epithet, with the abbreviation “subsp.” (subspecies) placed before the subspecies epithet. The subspecies epithet begins with a lowercase letter.
A subspecies epithet is formed in the same way as a species epithet.
A subspecies that includes the type of the species must bear the same epithet as the species.
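The written form of species and subspecies names described above can be checked mechanically. The following is a minimal sketch, assuming that a genus name is one capital letter followed by lowercase basic Latin letters and that epithets are entirely lowercase; `parse_name` is a hypothetical helper, and it checks spelling conventions only, not Latin grammar or legitimacy.

```python
import re

GENUS = r"[A-Z][a-z]+"   # initial capital, then lowercase basic Latin letters
EPITHET = r"[a-z]+"      # lowercase basic Latin letters only

SPECIES_RE = re.compile(rf"({GENUS}) ({EPITHET})")
SUBSPECIES_RE = re.compile(rf"({GENUS}) ({EPITHET}) subsp\. ({EPITHET})")


def parse_name(name: str):
    """Split a species or subspecies name into its parts.

    Returns a dict with keys "genus", "species", and "subspecies"
    (None for a binary name), or None if the name is malformed.
    """
    match = SUBSPECIES_RE.fullmatch(name)
    if match:
        genus, species, subspecies = match.groups()
        return {"genus": genus, "species": species, "subspecies": subspecies}
    match = SPECIES_RE.fullmatch(name)
    if match:
        genus, species = match.groups()
        return {"genus": genus, "species": species, "subspecies": None}
    return None


def is_type_subspecies(name: str) -> bool:
    """True when the subspecies epithet repeats the species epithet."""
    parts = parse_name(name)
    return bool(parts) and parts["subspecies"] == parts["species"]
```

For example, `parse_name("Escherichia coli")` yields a genus and a species epithet, while a name containing digits or a lowercase genus is rejected.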
The name of a taxon above the rank of genus is a Latinized word. Names of families and orders are in the feminine gender, the plural number, and written with an initial capital letter. Names of classes and phyla are in the neuter gender, the plural number, and written with an initial capital letter.
The name of a family, order, class, or phylum is formed by the addition of the appropriate suffix to the stem of the type genus name (see Section 4). These suffixes are presented in Table 1.
Rank | Suffix | Example for the genus Hadarchaeum a |
---|---|---|
Phylum | -ota | Hadarchaeota |
Class | -ia | Hadarchaeia |
Order | -ales | Hadarchaeales |
Family | -aceae | Hadarchaeaceae |
a From Chuvochina et al., 2019
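The suffix scheme in Table 1 amounts to a simple lookup. The sketch below assumes the caller supplies the stem of the type genus name, because deriving a stem requires Latin grammar that is out of scope here; the stem "Hadarchae" is inferred from the examples in the table, and the function names are hypothetical.

```python
SUFFIXES = {
    "phylum": "ota",
    "class": "ia",
    "order": "ales",
    "family": "aceae",
}


def higher_taxon_name(stem: str, rank: str) -> str:
    """Append the suffix for `rank` to the stem of the type genus name."""
    return stem + SUFFIXES[rank]


def change_rank(name: str, old_rank: str, new_rank: str) -> str:
    """Keep the stem and swap only the suffix when a rank is changed."""
    old_suffix = SUFFIXES[old_rank]
    if not name.endswith(old_suffix):
        raise ValueError(f"{name!r} does not end in -{old_suffix}")
    stem = name[: -len(old_suffix)]
    return higher_taxon_name(stem, new_rank)


for rank in SUFFIXES:
    print(rank, higher_taxon_name("Hadarchae", rank))
```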
Each named taxon must have a designated nomenclatural type. The nomenclatural type, referred to in the SeqCode as “type”, for a species or subspecies is the evidence for that taxon (DNA sequence, see Rule 18a) with which the name is permanently associated. For taxa above the rank of species, the type is one of the subordinate taxa, with which the name is permanently associated. Names of taxa above the rank of genus are formed from the names of their types, which makes it possible to trace which biological entity is included in the taxon. The nomenclatural type is not necessarily the most typical or representative element of the taxon.
Types of the various taxonomic categories are presented in Table 2.
Taxonomic category | Nomenclatural type |
---|---|
Subspecies | Designated DNA sequence |
Species | Designated DNA sequence |
Genus | Designated species |
Family | Designated genus |
Order | Designated genus |
Class | Designated genus |
Phylum | Designated genus |
The type of a taxon must be designated for the name to be validly published (see Section 5).
The type of a species or subspecies is a designated DNA sequence that is compliant with the minimum standards designated by the SeqCode Committee for genome, metagenome-assembled genome, or single-amplified genome sequences. The sequence must be available in the International Nucleotide Sequence Database Collaboration (INSDC). Upon recommendations of the SeqCode Committee or subcommittees on the taxonomy of specific groups, the SeqCode Committee may approve other minimal standards as suitable types for specific groups.
The type of a species or subspecies must allow the unambiguous identification of the taxon. Names based on types that later prove to be ambiguous are not legitimate unless a neotype is proposed.
If the type of a name is lost or demonstrated to be ambiguous, a neotype sequence may be proposed to the SeqCode Reconciliation Commission. If approved, the SeqCode Registry will be amended to reflect the new type.
Unless designated under the rules of this code, a reference DNA sequence is not a type but a sequence used in comparative studies. A reference sequence has no standing in nomenclature.
When a strain belonging to a taxon named under the SeqCode is isolated, a reference strain should be designated and submitted to two culture collections in different countries. Reference strains have no standing in nomenclature.
Only taxa with legitimate names may serve as types for taxa higher than the rank of species.
The nomenclatural type of a genus is the type species that was designated when the genus name was originally validly published.
The valid publication of a new genus name as a deliberate substitute for an earlier name found to be illegitimate does not change the type species of the genus.
When more than one subordinate taxon is available to serve as type, the earliest legitimately named taxon available at the time must be chosen, except where the type is neither a strain nor sequence data (i.e., taxa described from illustrations under the ICNP).
Any taxon with a given circumscription, position, and rank can bear only one correct name, the earliest name that is in accordance with the rules of the SeqCode.
Note 1. In the case of a species epithet, Rule 23a must be applied independently of the genus name. Under most circumstances, the species epithet remains the same on transfer of a species from one genus to another. However, if the species epithet is currently in use in the name of another species or subspecies in the genus to which the species is to be transferred, a new name must be proposed for the transferred species.
Note 2. In the case of a subspecies, Rule 23a must be applied independently of the genus name and species epithet. The subspecies epithet remains the same on transfer of a subspecies from one species to another unless the subspecies epithet has been previously used as the name of another species or subspecies in the genus to which the subspecies is to be transferred.
The priority of a genus, species, or subspecies name is determined by the time and date of its valid publication, i.e., when the registration of the name is completed. For purposes of priority, only legitimate names are taken into consideration.
Species and subspecies names will compete for priority with names in any other code after January 1, 2022. If two names validly published after January 1, 2022 compete for priority, priority is determined by the time and date of valid publication, either under the rules of the SeqCode or other nomenclatural codes. If both names are published at the same time and date, priority will be decided by the SeqCode Reconciliation Commission.
The priority date of names of taxa of rank higher than genus proposed after January 1, 2022 is the same as the priority date of the corresponding type genus name. The priority date for names published before January 1, 2022 is the same as their priority date under the ICNP.
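The priority rules for genus, species, and subspecies names can be sketched as a comparison of validation timestamps restricted to legitimate names. The class `PublishedName`, the function `earliest_legitimate`, and the example names and dates are all hypothetical; a result with more than one entry stands for a tie that the rules refer to the SeqCode Reconciliation Commission.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PublishedName:
    name: str
    validated_at: datetime    # time and date of valid publication
    legitimate: bool = True


def earliest_legitimate(candidates):
    """Return the legitimate names sharing the earliest validation time.

    One entry means that name has priority; several entries mean a tie;
    an empty list means no legitimate name is available.
    """
    legitimate = [c for c in candidates if c.legitimate]
    if not legitimate:
        return []
    earliest = min(c.validated_at for c in legitimate)
    return [c for c in legitimate if c.validated_at == earliest]


# Hypothetical names and timestamps used only for illustration.
names = [
    PublishedName("Exemplum primum", datetime(2023, 1, 5, 12, 0, tzinfo=timezone.utc)),
    PublishedName("Exemplum alterum", datetime(2022, 6, 1, 9, 30, tzinfo=timezone.utc)),
    PublishedName("Exemplum vetus", datetime(2022, 2, 1, 8, 0, tzinfo=timezone.utc),
                  legitimate=False),
]
print([n.name for n in earliest_legitimate(names)])
```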
Legitimate names validly published under the ICNP remain legitimate in the SeqCode even if there are differences in type designations.
Effective publication under the SeqCode means that the name and evidence for the taxon have been published in a peer-reviewed journal or book.
When a name of a new taxon is published in a work written in a language other than English, the author(s) should include a description in English in the publication.
The following are not accepted as forms of effective publication.
- Communication of new names at a meeting, minutes of a meeting, or abstracts of papers presented at meetings.
- Placing of names in listings or catalogues of collections.
- Reports in ephemeral publications, newsletters, white papers, self-published papers, or non-scientific periodicals.
- A published patent application or issued patent including the name.
- A database containing names associated with a sequence or metadata.
- Electronic material available in advance of publication (e.g., papers in press or preprints).
The time and date of validation is the time and date of completion of the registration in the SeqCode Registry.
A name of a new taxon, or a new combination for an existing taxon, is not validly published unless the following criteria are met:
- The name is effectively published under the rules of the SeqCode.
- The name is registered in the SeqCode Registry, along with mandatory data fields listed below.
- The type of the taxon is clearly designated. In the case of species or subspecies, the type sequence is deposited according to Rule 18a and the accession number cited.
- The taxonomic rank is designated.
- The derivation (etymology) of a new name (and if necessary of a new combination) is given wherein one or more distinguishable roots are identified. Roots can originate from any language in use or extinct (see also Recommendation 9).
Note 1. When a new species or a new combination results in the proposal of a new genus, both the new genus name and the new species name or the new combination must be validly published. Publication of the new species epithet or new combination alone does not constitute valid publication of the new genus name.
Note 2. When possible, authors are recommended to include the SeqCode Registry identifier in the effective publication.
Note 3. If the information provided in the registration and the effective publication differ, the registration is considered definitive.
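The criteria for valid publication listed above read naturally as a checklist. The following sketch, with a hypothetical `Proposal` record and placeholder values, reports which criteria are still unmet; it does not model how the SeqCode Registry actually stores or checks registrations.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Proposal:
    name: str
    effectively_published: bool
    registered: bool
    type_designation: Optional[str] = None   # e.g. accession of the type sequence
    rank: Optional[str] = None
    etymology: Optional[str] = None


def unmet_criteria(proposal: Proposal) -> List[str]:
    """Return the valid-publication criteria the proposal does not yet meet."""
    checks = [
        (proposal.effectively_published, "effective publication"),
        (proposal.registered, "registration in the SeqCode Registry"),
        (bool(proposal.type_designation), "designated type"),
        (bool(proposal.rank), "designated rank"),
        (bool(proposal.etymology), "etymology"),
    ]
    return [label for satisfied, label in checks if not satisfied]


# Hypothetical proposal with a placeholder accession and no etymology yet.
draft = Proposal("Exemplum primum", effectively_published=True, registered=False,
                 type_designation="ACCESSION_PLACEHOLDER", rank="species")
print(unmet_criteria(draft))
```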
It is recommended that the name, etymology, type information, and diagnosis of the novel taxon should be clearly identifiable in a designated section of the effective publication (i.e., the section termed the ‘protologue’ by some microbial taxonomists). Authors are encouraged to provide additional information describing the taxon such as predicted or known physiological characteristics, ecological data, location, and additional metadata. Authors are also encouraged to submit metadata with the type sequence in one of the INSDC databases.
Placement of a species or subspecies epithet into a genus or species that is illegitimate does not preclude the legitimacy of the species or subspecies epithet.
The effective publication should be cited with the name of a previously proposed taxon. Correct citation of a name enables the date of publication, the description, and the circumscription of the taxon to be found. For names published under the SeqCode, the validly published name and date of valid publication should be determined from the SeqCode Registry.
When an author proposes transfer of a species to another genus, or a subspecies to another species, then the author who makes the proposal should indicate the formation of the new combination by the addition of the abbreviation “comb. nov.” (combinatio nova). This convention should be used when the author retains the original species epithet in the new combination. However, if an author is obliged to substitute a new species epithet as a result of homonymy, the abbreviation “nom. nov.” (nomen novum) should be used. The original name is referred to as the basonym and should be given, along with the citation of the effective publication, in the description of the novel combination.
If an alteration of a taxon modifies its circumscription, the author responsible may be indicated by the addition to the author citation of the abbreviation “emend.” (emendavit) followed by the name of the author responsible for the change. Only alterations that cause significant changes in the circumscription warrant description as an emendation.
If the type of a taxon is excluded, a type must be designated for the remaining members of the original taxon (see Rule 17), which must be given a new name.
A change in the name of a taxon is not warranted by an alteration of the diagnostic characters or the circumscription.
If a genus is divided into two or more genera, the genus name must be retained for the genus that retains the type species.
When a species is divided into two or more species, the species epithet of the original species must be retained for the taxon that includes the type.
When a species is divided into two or more subspecies, the species epithet of the original species must be retained for the subspecies that includes the type.
Note. Although the species and subspecies epithets in the name of a type subspecies are the same, they do not contravene Rule 9 because they are based on the same type.
When a subspecies is divided into two or more subspecies, the subspecies epithet of the original subspecies must be retained for the subspecies that includes the type.
When a species is transferred to another genus without any change of rank, the species epithet must be retained unless it is already in use in the new genus. In that case, a new species epithet must be chosen for the transferred species. This rule avoids creation of a later homonym.
Regardless of its priority, the transfer of a species that is not the type of another genus does not affect the type species or priority of the receiving genus even if it involves union of the incoming species with the type species of the receiving genus.
When the name of a genus is changed, the epithets of the species within it must be retained unless already in use (see Rule 29).
Note 1. Modification of the gender of the species epithet to accommodate the gender of the new genus name is a minor orthographic variant and encouraged.
When two or more taxa of the same rank are united, then the name and type of the taxon are determined by following the rules of priority (see Section 5). In cases of species and subspecies, if the names or epithets are of the same time and date, the author who first unites the taxa has the right to choose one of them, and their choice must be followed.
When several species are united under one species as subspecies, the subspecies that includes the type of the species under whose name they are united must be designated by the same epithet as the species name (see Rule 13c).
If two or more species of different genera are united to form a single genus, and if those species include the type species of one or more genera, the genus name must be the earliest validly published, legitimate name. If no type species is placed in the genus, a new genus name must be proposed, and a type species must be specified.
When two or more taxa of the same rank from family to class are brought together under a taxon of higher rank, the higher-ranking taxon should derive its name from the name of the earliest legitimate type genus among the lower-ranking taxa.
If no type genera were placed in the taxon, a new name based on the selected type must be proposed (see Rule 22).
When the rank of a taxon of genus or above is changed, the stem of the name must be retained and only the suffix altered (see Rule 15).
When a subspecies is elevated in rank to a species, the subspecies epithet must be used as the species epithet unless the resulting combination is illegitimate.
When a species is lowered in rank to a subspecies, the species epithet must be used as the subspecies epithet unless the resulting combination is illegitimate.
Section 8. Illegitimate Names and Epithets: Replacement, Rejection, and Conservation of Names and Epithets
A name contrary to a rule is illegitimate and must not be used. However, a name of a taxon that is illegitimate when the taxon is in one taxonomic position is not necessarily illegitimate when the taxon is in another taxonomic position.
Note. Some common reasons for which a name may be illegitimate are the following.
- If the taxon to which the name was applied, as circumscribed by the author, included the nomenclatural type of a name which the author ought to have adopted under one or more of the rules.
- If the author did not adopt for a binary or ternary combination the earliest legitimate genus name, species epithet, or subspecies epithet available for the taxon with its particular circumscription, position, and rank.
- A species or subspecies epithet is illegitimate if it duplicates a species or subspecies epithet previously validly published for the same genus but whose name is based upon another type.
An illegitimate name or epithet should be replaced by the earliest legitimate name or epithet in a binary or ternary combination which in the new position will be in accordance with the rules. If no legitimate name or epithet exists, one must be chosen. Since a species epithet is not rendered illegitimate by publication in a species name in which the generic name is illegitimate, authors may use such an epithet if they wish provided that there is no obstacle to its employment in the new position or sense; the resultant combination is treated as a new name (nom. nov.) and is ascribed to the author. The epithet is, however, ascribed to the original author.
A legitimate name or epithet may not be replaced.
Names contrary to the General Considerations or Principles of the code may be rejected by the SeqCode Reconciliation Commission.
All names comprise only the 26 letters of the ISO basic Latin alphabet. Diacritic signs are not to be used.
Any name or epithet should be written in conformity with the spelling of the word from which it is derived and in accordance with the rules of Latin grammar. Exceptions are provided for typographic and orthographic errors and orthographic variants.
Note 1. Consult Appendix 9 of the ICNP for recommendations on forming properly Latinized names.
Note 2. In the SeqCode an orthographic variant is a name (or epithet) applied to the same type that differs from another name only in transliteration into Latin of the same word from a language other than Latin or in its grammatical correctness. Changes in suffixes for consistency with the names of higher taxa are orthographic variants. Names transliterated from the same word and based on different types are not orthographic variants.
The original spelling of a name or epithet must be retained, except for typographical or orthographic errors.
An unintentional typographical or orthographic error later corrected by the author is to be accepted in its corrected form without affecting the status and date of valid publication. It can also be corrected by a subsequent author who may or may not mention that the spelling is corrected. However, the abbreviation “corrig.” (corrigendum) may be appended to the name if an author wishes to draw attention to the correction. Succeeding authors may be unaware that the original usage was incorrect and use the spelling of the original author(s). Other succeeding authors may follow the correction of a previous author or may independently correct the spelling themselves, but in no case is the use of corrig. regarded as obligatory. None of these corrections affects the status and date of validation.
Note. The liberty of correcting a name or epithet must be used with reserve, especially if the change affects the first syllable and above all the first letter of the name or epithet.
The genitive and adjectival forms of a personal name are treated as different epithets and not as orthographic variants unless they are so similar as to cause confusion.
The gender of genus names is governed by the following.
- A Latin or Latinized genus name retains the gender of its language of origin. Authors must give the gender of any proposed genus name. In cases where the classical gender varies, the author has the right of choice between the alternatives.
- Genus names that are compounds from two or more Latin words take the gender of the last component of the compound word.
- Arbitrarily formed genus names or vernacular names used as genus names take the gender assigned to them by their authors.
When it is desirable to distinguish the nature of the type of a name, the following convention is recommended. When the type for a species or subspecies is determined by the ICNP, the superscript “T” will be used immediately following the name or strain identifier. If the type is determined by the SeqCode, the superscript “Ts” or “TS” will be used. When the type is a taxon at the rank of genus or higher, the superscript is determined by the nature of the type of the species. If superscripts are not possible, they may be replaced by the symbols in parentheses, i.e., (T), (Ts), or (TS).
For the purpose of identification in the text, names of taxa at all ranks should be italicized.
The editors thank Mark Pallen and Roman Barco for helpful suggestions on the preprint of the SeqCode.
- Chuvochina M, Rinke C, Parks DH, Rappé MS, Tyson GW, Yilmaz P, Whitman WB, Hugenholtz P (2019) The importance of designating type material for uncultured taxa. Syst Appl Microbiol 42:15–21.
- Trüper HG, de’Clari L (1997) Taxonomic note: necessary correction of specific epithets formed as substantives (nouns) "in Apposition". Int J Syst Bacteriol 47:908–909.
Data quality and reporting requirements and recommendations for an isolate genome, metagenome assembled genome (MAG), or single amplified genome (SAG) to serve as the nomenclatural type for a species named under the SeqCode. Requirements will be checked as part of the validation process on the SeqCode Registry. Recommendations are suggested best practices to guide authors and peer reviewers to ensure high quality data supporting species to be named.
Required
- Name
Recommended
- Etymology
- Name formed with mnemonic cues
- Interpretation of biological properties: inferred or demonstrated physiological traits and ecological information, such as habitat, in the manuscript body and/or protologue.
- Designated genome assembly (e.g., INSDC accession) and access to raw data (e.g., SRA accession).
- Include as much metadata as possible (see Field et al., 2008).
- Provide evidence of the species, taxonomic rank, and position including the uniqueness of the species with respect to existing named species and justify the taxonomic rank and position (e.g., Jain et al., 2018, Karthikeyan et al., 2019, Parks et al., 2020, Rodriguez-R et al., 2018).
- For MAGs and SAGs, compare multiple high-quality genomes representing the species in more than one sample (e.g., Supplemental Information) a
Rationale: Initial requirements encourage wide participation from many microbiological disciplines and enable validation of names published prior to the SeqCode. Critical data will be captured in the SeqCode Registry in any case. Some recommendations could become requirements in the future.
Required
- Type genome assembly quality for MAGs and SAGs: >90% complete and <5% contaminated (modified from Bowers et al., 2017). For isolates, read coverage ≥10x (Field et al., 2008).
- Agreement between genome and 16S rRNA taxonomic assignments
Recommended
- 16S rRNA genes >75% complete and passing chimera checks; >80% of tRNAs present (modified from Bowers et al., 2017).
- High genome integrity (contig # <100; N50 >25 kb; max. contig >100 kb).
- MAG/SAG read coverage ≥10x.
- Assembly available in INSDC databases
- Raw data available in INSDC databases (e.g., Sequence Read Archive) c
Rationale: The SeqCode Registry queries the INSDC databases to perform automatic checks of data quality.
a. Comparison of multiple high-quality genomic assemblies from multiple samples can support the non-chimeric nature of MAGs and provide confidence of the assembly for both MAGs and SAGs.
b. Data quality will be assessed by automated pipelines or other approaches. Exceptions for lower data quality should be justified by authors in the effective publication.
c. Not required for names effectively published before January 1, 2023, to allow for existing published names (e.g., existing Candidatus names) and names currently undergoing peer review to be validated under the SeqCode.
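The numeric thresholds in the data quality requirements and recommendations above can be sketched as a pair of checks. This is only an illustration under stated assumptions: the `AssemblyStats` record and both function names are hypothetical, the non-numeric requirement (agreement between genome and 16S rRNA taxonomic assignments) and the 16S rRNA and tRNA recommendations are not modelled, and nothing here reflects how the SeqCode Registry pipelines are implemented.

```python
from dataclasses import dataclass


@dataclass
class AssemblyStats:
    kind: str              # "isolate", "MAG", or "SAG"
    completeness: float    # percent
    contamination: float   # percent
    read_coverage: float   # fold coverage, e.g. 10.0 for 10x
    contigs: int
    n50_kb: float
    max_contig_kb: float


def meets_required_quality(stats: AssemblyStats) -> bool:
    """Numeric requirements only: >90% complete and <5% contaminated for
    MAGs and SAGs; read coverage of at least 10x for isolates."""
    if stats.kind in ("MAG", "SAG"):
        return stats.completeness > 90 and stats.contamination < 5
    if stats.kind == "isolate":
        return stats.read_coverage >= 10
    raise ValueError(f"unknown assembly kind: {stats.kind!r}")


def unmet_recommendations(stats: AssemblyStats) -> list:
    """List the recommended genome-integrity and coverage criteria not met."""
    unmet = []
    if not stats.contigs < 100:
        unmet.append("fewer than 100 contigs")
    if not stats.n50_kb > 25:
        unmet.append("N50 above 25 kb")
    if not stats.max_contig_kb > 100:
        unmet.append("largest contig above 100 kb")
    if stats.kind in ("MAG", "SAG") and not stats.read_coverage >= 10:
        unmet.append("read coverage of at least 10x")
    return unmet


# Hypothetical MAG statistics used only to exercise the checks.
mag = AssemblyStats("MAG", 95.0, 2.0, 8.0, 150, 30.0, 120.0)
print(meets_required_quality(mag), unmet_recommendations(mag))
```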