diff --git a/search.json b/search.json
index de4092ab..747a501e 100644
--- a/search.json
+++ b/search.json
@@ -417,7 +417,7 @@
"href": "develop/06_pipelines.html",
"title": "6. Processing and analyzing biodata",
"section": "",
- "text": "In this section, we explore essential elements of reproducibility and efficiency in computational research, highlighting techniques and tools for creating robust and transparent coding and workflows. By prioritizing reproducibility and replicability, researchers can enhance the credibility and impact of their findings while fostering collaboration and knowledge dissemination within the scientific community.\n\n\nThrough techniques such as scripting, containerization (e.g., Docker), and virtual environments, researchers can create reproducible analyses that enable others to validate and build upon their work. Emphasizing the documentation of data processing steps, parameters, and results ensures transparency and accountability in research outputs.\nTools for reproducibility:\n\nCode notebooks: Utilize tools like Jupyter Notebook and R Markdown to combine code with descriptive text and visualizations, enhancing data documentation.\n\nIntegrated development environments: Consider using platforms such as (knitr or MLflow) to streamline code development and documentation processes.\nPipeline frameworks or workflow management systems: Implement systems like Nextflow and Snakemake to automate data analysis steps (including data extraction, transformation, validation, visualization, and more). Additionally, they contribute to ensuring interoperability by facilitating seamless integration and interaction between different components or stages.\n\n\n\nComputational notebooks (e.g., Jupyter, R Markdown) provide researchers with a versatile platform for exploratory and interactive data analysis. These notebooks facilitate sharing insights with collaborators and documentation of analysis procedures.\n\n\n\nTools such as Nextflow and Snakemake streamline and automate various data analysis steps, enabling parallel processing and seamless integration with existing tools.\n\nNextflow: offers scalable and portable NGS data analysis pipelines, facilitating data processing across diverse computing environments.\nSnakemake: Utilizing Python-based scripting, Snakemake allows for flexible and automated NGS data analysis pipelines, supporting parallel processing and integration with other tools.\n\n\n\n\n\nTo maintain clarity and organization in the data analysis process, adopt best practices such as:\n\nCreate a README.md file\nAnnotate your pipelines and comment your code\nLabel numerically to maintain clarity and organization in the data analysis process (scripts, notebooks, pipelines etc.).\n\n00.preprocessing., 01.data_analysis_step1., etc.\n\nProvide environment files for reproducing the computational environment, including:\n\nContainerization platforms (e.g., Docker, Singularity): allow the researcher to package their software and dependencies into a standardized container image.\nVirtual Environments (e.g., Conda, virtualenv): provide an isolated environment with specific packages and dependencies that can be installed without affecting the system-wide configuration. These environments are particularly useful for managing conflicting dependencies and ensuring reproducibility. Moreover, Conda allows users to export environment specifications to YAML files enabling easy recreation of the environment on another system.\nEnvironment Configuration Scripts: Researchers can also provide custom scripts or configuration files that automate the setup of the computational environment. 
These scripts may contain commands for installing packages (such as pip for Python packages or apt-get for system-level dependencies), configuring system settings, and setting environment variables.\n\nUpload your code to version control systems (e.g., Git) and code repository Lesson 5.\nIntegrated development environments (e.g., RStudio, PyCharm): Provides tools and features for writing, testing, and debugging code\nLeverage curated pipelines such as the ones developed by the nf-core community, further ensuring adherence to community standards and guidelines.",
+ "text": "In this section, we explore essential elements of reproducibility and efficiency in computational research, highlighting techniques and tools for creating robust and transparent coding and workflows. By prioritizing reproducibility and replicability, researchers can enhance the credibility and impact of their findings while fostering collaboration and knowledge dissemination within the scientific community.\n\n\nThrough techniques such as scripting, containerization (e.g., Docker), and virtual environments, researchers can create reproducible analyses that enable others to validate and build upon their work. Emphasizing the documentation of data processing steps, parameters, and results ensures transparency and accountability in research outputs.\nTools for reproducibility:\n\nCode notebooks: Utilize tools like Jupyter Notebook and R Markdown to combine code with descriptive text and visualizations, enhancing data documentation.\n\nIntegrated development environments: Consider using platforms such as (knitr or MLflow) to streamline code development and documentation processes.\nPipeline frameworks or workflow management systems: Implement systems like Nextflow and Snakemake to automate data analysis steps (including data extraction, transformation, validation, visualization, and more). Additionally, they contribute to ensuring interoperability by facilitating seamless integration and interaction between different components or stages.\n\n\n\nComputational notebooks (e.g., Jupyter, R Markdown) provide researchers with a versatile platform for exploratory and interactive data analysis. These notebooks facilitate sharing insights with collaborators and documentation of analysis procedures.\n\n\n\nTools such as Nextflow and Snakemake streamline and automate various data analysis steps, enabling parallel processing and seamless integration with existing tools.\n\nNextflow: offers scalable and portable NGS data analysis pipelines, facilitating data processing across diverse computing environments.\nSnakemake: Utilizing Python-based scripting, Snakemake allows for flexible and automated NGS data analysis pipelines, supporting parallel processing and integration with other tools.\n\n\n\n\n\nTo maintain clarity and organization in the data analysis process, adopt best practices such as:\n\nCreate a README.md file\nAnnotate your pipelines and comment your code\nLabel numerically to maintain clarity and organization in the data analysis process (scripts, notebooks, pipelines, etc.).\n\n00.preprocessing., 01.data_analysis_step1., etc.\n\nProvide environment files for reproducing the computational environment, including:\n\nContainerization platforms (e.g., Docker, Singularity): allow the researcher to package their software and dependencies into a standardized container image.\nVirtual Environments (e.g., Conda, virtualenv): provide an isolated environment with specific packages and dependencies that can be installed without affecting the system-wide configuration. These environments are particularly useful for managing conflicting dependencies and ensuring reproducibility. Moreover, Conda allows users to export environment specifications to YAML files enabling easy recreation of the environment on another system.\nEnvironment Configuration Scripts: Researchers can also provide custom scripts or configuration files that automate the setup of the computational environment. 
These scripts may contain commands for installing packages (such as pip for Python packages or apt-get for system-level dependencies), configuring system settings, and setting environment variables.\n\nUpload your code to version control systems (e.g., Git) and code repository Lesson 5.\nIntegrated development environments (e.g., RStudio, PyCharm): Provides tools and features for writing, testing, and debugging code\nLeverage curated pipelines such as the ones developed by the nf-core community, further ensuring adherence to community standards and guidelines.",
"crumbs": [
"Course material",
"Key practices",
@@ -429,7 +429,7 @@
"href": "develop/06_pipelines.html#code-and-pipelines-for-data-analysis",
"title": "6. Processing and analyzing biodata",
"section": "",
- "text": "In this section, we explore essential elements of reproducibility and efficiency in computational research, highlighting techniques and tools for creating robust and transparent coding and workflows. By prioritizing reproducibility and replicability, researchers can enhance the credibility and impact of their findings while fostering collaboration and knowledge dissemination within the scientific community.\n\n\nThrough techniques such as scripting, containerization (e.g., Docker), and virtual environments, researchers can create reproducible analyses that enable others to validate and build upon their work. Emphasizing the documentation of data processing steps, parameters, and results ensures transparency and accountability in research outputs.\nTools for reproducibility:\n\nCode notebooks: Utilize tools like Jupyter Notebook and R Markdown to combine code with descriptive text and visualizations, enhancing data documentation.\n\nIntegrated development environments: Consider using platforms such as (knitr or MLflow) to streamline code development and documentation processes.\nPipeline frameworks or workflow management systems: Implement systems like Nextflow and Snakemake to automate data analysis steps (including data extraction, transformation, validation, visualization, and more). Additionally, they contribute to ensuring interoperability by facilitating seamless integration and interaction between different components or stages.\n\n\n\nComputational notebooks (e.g., Jupyter, R Markdown) provide researchers with a versatile platform for exploratory and interactive data analysis. These notebooks facilitate sharing insights with collaborators and documentation of analysis procedures.\n\n\n\nTools such as Nextflow and Snakemake streamline and automate various data analysis steps, enabling parallel processing and seamless integration with existing tools.\n\nNextflow: offers scalable and portable NGS data analysis pipelines, facilitating data processing across diverse computing environments.\nSnakemake: Utilizing Python-based scripting, Snakemake allows for flexible and automated NGS data analysis pipelines, supporting parallel processing and integration with other tools.\n\n\n\n\n\nTo maintain clarity and organization in the data analysis process, adopt best practices such as:\n\nCreate a README.md file\nAnnotate your pipelines and comment your code\nLabel numerically to maintain clarity and organization in the data analysis process (scripts, notebooks, pipelines etc.).\n\n00.preprocessing., 01.data_analysis_step1., etc.\n\nProvide environment files for reproducing the computational environment, including:\n\nContainerization platforms (e.g., Docker, Singularity): allow the researcher to package their software and dependencies into a standardized container image.\nVirtual Environments (e.g., Conda, virtualenv): provide an isolated environment with specific packages and dependencies that can be installed without affecting the system-wide configuration. These environments are particularly useful for managing conflicting dependencies and ensuring reproducibility. Moreover, Conda allows users to export environment specifications to YAML files enabling easy recreation of the environment on another system.\nEnvironment Configuration Scripts: Researchers can also provide custom scripts or configuration files that automate the setup of the computational environment. 
These scripts may contain commands for installing packages (such as pip for Python packages or apt-get for system-level dependencies), configuring system settings, and setting environment variables.\n\nUpload your code to version control systems (e.g., Git) and code repository Lesson 5.\nIntegrated development environments (e.g., RStudio, PyCharm): Provides tools and features for writing, testing, and debugging code\nLeverage curated pipelines such as the ones developed by the nf-core community, further ensuring adherence to community standards and guidelines.",
+ "text": "In this section, we explore essential elements of reproducibility and efficiency in computational research, highlighting techniques and tools for creating robust and transparent coding and workflows. By prioritizing reproducibility and replicability, researchers can enhance the credibility and impact of their findings while fostering collaboration and knowledge dissemination within the scientific community.\n\n\nThrough techniques such as scripting, containerization (e.g., Docker), and virtual environments, researchers can create reproducible analyses that enable others to validate and build upon their work. Emphasizing the documentation of data processing steps, parameters, and results ensures transparency and accountability in research outputs.\nTools for reproducibility:\n\nCode notebooks: Utilize tools like Jupyter Notebook and R Markdown to combine code with descriptive text and visualizations, enhancing data documentation.\n\nIntegrated development environments: Consider using platforms such as (knitr or MLflow) to streamline code development and documentation processes.\nPipeline frameworks or workflow management systems: Implement systems like Nextflow and Snakemake to automate data analysis steps (including data extraction, transformation, validation, visualization, and more). Additionally, they contribute to ensuring interoperability by facilitating seamless integration and interaction between different components or stages.\n\n\n\nComputational notebooks (e.g., Jupyter, R Markdown) provide researchers with a versatile platform for exploratory and interactive data analysis. These notebooks facilitate sharing insights with collaborators and documentation of analysis procedures.\n\n\n\nTools such as Nextflow and Snakemake streamline and automate various data analysis steps, enabling parallel processing and seamless integration with existing tools.\n\nNextflow: offers scalable and portable NGS data analysis pipelines, facilitating data processing across diverse computing environments.\nSnakemake: Utilizing Python-based scripting, Snakemake allows for flexible and automated NGS data analysis pipelines, supporting parallel processing and integration with other tools.\n\n\n\n\n\nTo maintain clarity and organization in the data analysis process, adopt best practices such as:\n\nCreate a README.md file\nAnnotate your pipelines and comment your code\nLabel numerically to maintain clarity and organization in the data analysis process (scripts, notebooks, pipelines, etc.).\n\n00.preprocessing., 01.data_analysis_step1., etc.\n\nProvide environment files for reproducing the computational environment, including:\n\nContainerization platforms (e.g., Docker, Singularity): allow the researcher to package their software and dependencies into a standardized container image.\nVirtual Environments (e.g., Conda, virtualenv): provide an isolated environment with specific packages and dependencies that can be installed without affecting the system-wide configuration. These environments are particularly useful for managing conflicting dependencies and ensuring reproducibility. Moreover, Conda allows users to export environment specifications to YAML files enabling easy recreation of the environment on another system.\nEnvironment Configuration Scripts: Researchers can also provide custom scripts or configuration files that automate the setup of the computational environment. 
These scripts may contain commands for installing packages (such as pip for Python packages or apt-get for system-level dependencies), configuring system settings, and setting environment variables.\n\nUpload your code to version control systems (e.g., Git) and code repository Lesson 5.\nIntegrated development environments (e.g., RStudio, PyCharm): Provides tools and features for writing, testing, and debugging code\nLeverage curated pipelines such as the ones developed by the nf-core community, further ensuring adherence to community standards and guidelines.",
"crumbs": [
"Course material",
"Key practices",
diff --git a/sitemap.xml b/sitemap.xml
index 6a794686..c5785e66 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -2,70 +2,70 @@
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/develop/03_DOD.html</loc>
-    <lastmod>2024-04-15T12:06:33.434Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.832Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/develop/04_metadata.html</loc>
-    <lastmod>2024-04-15T12:06:33.434Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.832Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/develop/02_DMP.html</loc>
-    <lastmod>2024-04-15T12:06:33.434Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.832Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/develop/contributors.html</loc>
-    <lastmod>2024-04-15T12:06:33.450Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.848Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/develop/07_repos.html</loc>
-    <lastmod>2024-04-15T12:06:33.434Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.832Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/develop/examples/NGS_metadata.html</loc>
-    <lastmod>2024-04-15T12:06:33.450Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.848Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/practical_workflows.html</loc>
-    <lastmod>2024-04-15T12:06:33.474Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.876Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/cards/JARomero.html</loc>
-    <lastmod>2024-04-15T12:06:33.434Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.832Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/index.html</loc>
-    <lastmod>2024-04-15T12:06:33.474Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.876Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/cards/AlbaMartinez.html</loc>
-    <lastmod>2024-04-15T12:06:33.434Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.832Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/use_cases.html</loc>
-    <lastmod>2024-04-15T12:06:33.474Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.876Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/develop/examples/NGS_OS_FAIR.html</loc>
-    <lastmod>2024-04-15T12:06:33.450Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.848Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/develop/examples/NGS_management.html</loc>
-    <lastmod>2024-04-15T12:06:33.450Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.848Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/develop/01_RDM_intro.html</loc>
-    <lastmod>2024-04-15T12:06:33.434Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.832Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/develop/06_pipelines.html</loc>
-    <lastmod>2024-04-15T12:06:33.434Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.832Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/develop/05_VC.html</loc>
-    <lastmod>2024-04-15T12:06:33.434Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.832Z</lastmod>
  </url>
  <url>
    <loc>https://hds-sandbox.github.io/RDM_NGS_course/develop/practical_workshop.html</loc>
-    <lastmod>2024-04-15T12:06:33.474Z</lastmod>
+    <lastmod>2024-04-15T12:27:00.872Z</lastmod>
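
The indexed lesson text also mentions environment configuration scripts that automate package installation (e.g., via pip). A minimal, hypothetical sketch of such a script appears below; the pinned package names and versions are assumptions, not taken from the course material.

```python
#!/usr/bin/env python3
"""Hypothetical environment-configuration script: installs pinned analysis
dependencies with pip so another machine can reproduce the same package set.
The package list and version pins below are illustrative assumptions."""
import subprocess
import sys

PINNED_PACKAGES = ["snakemake==7.32.4", "multiqc==1.21"]  # assumed version pins

def install(spec: str) -> None:
    # Call pip through the current interpreter so packages land in the active
    # (virtual) environment rather than the system-wide Python.
    subprocess.run([sys.executable, "-m", "pip", "install", spec], check=True)

if __name__ == "__main__":
    for package in PINNED_PACKAGES:
        install(package)
```

Pinning exact versions is what makes such a script reproducible; it complements the Conda YAML export the text describes, which captures the same information declaratively.
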