diff --git a/.nojekyll b/.nojekyll
index cd6972f7..acb7f826 100644
--- a/.nojekyll
+++ b/.nojekyll
@@ -1 +1 @@
-0d39069e
\ No newline at end of file
+6279f136
\ No newline at end of file
diff --git a/cards/AlbaMartinez.html b/cards/AlbaMartinez.html
index a3b4977b..1f0d3180 100644
--- a/cards/AlbaMartinez.html
+++ b/cards/AlbaMartinez.html
@@ -2,7 +2,7 @@
-
+
diff --git a/cards/JARomero.html b/cards/JARomero.html
index 73d87355..04119fdb 100644
--- a/cards/JARomero.html
+++ b/cards/JARomero.html
@@ -2,7 +2,7 @@
-
+
diff --git a/develop/01_RDM_intro.html b/develop/01_RDM_intro.html
index 8cc4f07b..f8e3424a 100644
--- a/develop/01_RDM_intro.html
+++ b/develop/01_RDM_intro.html
@@ -2,7 +2,7 @@
-
+
@@ -292,7 +292,7 @@
if you’re interested in delving deeper, explore our course on Git and GitHub.
-Alternatively, here are some examples and online resources to expand your understanding:
- -Git is a widely adopted version control system that empowers developers and researchers to efficiently manage their project’s history, collaborate seamlessly, track changes, and ensure data integrity. Git operates on core principles and mechanisms:
@@ -354,6 +335,7 @@
In addition to exploring Git, we will also explore GitHub, a collaborative platform for hosting Git repositories. GitHub enhances Git’s capabilities by offering features like issue tracking, security measures to protect repositories, and GitHub Pages for creating project websites. Additionally, GitHub provides the option to set repositories as private until you are ready to share your work publicly.
+The difference between Git and GitHub is that Git is a version control system used to track changes in code, while GitHub is a cloud-based platform that hosts Git repositories and facilitates collaboration. Essentially, GitHub serves as an online access point for managing and sharing repositories.
We will discuss repositories for archiving experimental or large datasets in lesson 7. However, if you are interested in version-controlling large files, we recommend using git annex. It is important to store files with a checksum (MD5, SHA1, SHA256) so that you can verify that files have not been altered or corrupted by recomputing their signature.
We will discuss repositories for archiving experimental or large datasets in lesson 7. However, if you are interested in version-controlling large files, we recommend using git annex. It is also important to archive files with a checksum (MD5, SHA1, SHA256) so that you can verify that files have not been altered or corrupted by recomputing their signature.
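As a minimal sketch of the checksum idea above, the following Python snippet computes a file's digest with the standard `hashlib` module and compares it against a previously recorded value. The function names `file_checksum` and `verify_checksum` are hypothetical helpers chosen for illustration, not part of the course material or of git annex.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks until the file is exhausted.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected, algorithm="sha256"):
    """Recompute the digest and compare it with the recorded one."""
    return file_checksum(path, algorithm) == expected.strip().lower()
```

A typical workflow would be to record `file_checksum("sample.fastq.gz")` next to the file at archiving time (the file name here is only an example) and to call `verify_checksum` after every transfer or download. SHA256 is used as the default because MD5 and SHA1 are no longer considered collision-resistant, although all three detect accidental corruption.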
In this lesson, we explored version control and utilized Git and GitHub to establish data analysis repositories from our Project folders. Additionally, we delved into creating a GitHub organization and leveraging GitHub Pages to showcase data analysis scripts and notebooks publicly. Remember to complete the corresponding exercise from the practical workshop to reinforce your knowledge.
If you’re interested in delving deeper, explore our course on Git and GitHub.
+Alternatively, here are some examples and online resources to expand your understanding:
While platforms like GitHub excel in version control and collaborative coding, repositories like Zenodo, Gene Expression Omnibus, and Annotare specialize in archiving and sharing scientific data, ensuring long-term accessibility for the global research community.
+While platforms like GitHub excel in version control and collaborative coding, repositories such as Zenodo specialize in archiving and sharing scientific data, ensuring long-term accessibility for the global research community.
+What to archive and how?
+A framework for reproducibility in computational research can generally be divided into three key, though sometimes overlapping, categories:
+Researchers are typically more accustomed to archiving readable components, such as papers or data documentation, compared to executable components like scripts and code. However, for research to be fully reproducible, it is crucial that all key components, including executable ones, are properly archived.
+When choosing an archival solution, it’s important to recognize that there is no one-size-fits-all option. Several factors must be considered, including data size, format requirements, licensing conditions, cost, and tools for data attribution and citation. Each of these features plays a crucial role in selecting the most suitable archive for your needs.
Specialized repositories and archives securely store, curate, and disseminate scientific data, ensuring long-term preservation, transparency, and citability of research findings through standardized formats and rigorous curation processes.
@@ -312,7 +322,7 @@
Check the registry of research data repositories–re3data.org for a full overview. You can browse by subject if you are looking within a specific field.
+Check the registry of research data repositories, re3data.org, for a full overview. You can browse by subject if you are looking within a specific field.
There are two types of repositories:
@@ -425,24 +435,15 @@
By adhering to standards, repositories ensure that submitted data is high quality, well-documented, and compliant with community best practices, promoting data discovery, reproducibility, and interoperability within the scientific community.
Following all the recommendations in this course makes it straightforward to provide the necessary documentation and information for these repositories. For instance, repositories specific to NGS data will require the raw FASTQ files, sample metadata, and protocols as well as final pre-processing results (for instance, read count matrices in BED files).
-Keep in mind that these repositories are not intended for downstream analysis data and associated code. However, you should already have those versions controlled by GitHub, which eliminates any concerns. You can then archive such repositories in a general repository like Zenodo.
+ +Keep in mind that data repositories are not intended for downstream analysis data and associated code. However, you should already have those under version control on GitHub, which eliminates this concern. You can then archive those GitHub repositories in a general repository like Zenodo.
Archives for software source code are essential for long-term accessibility and reproducibility and are becoming very popular. Check Software Heritage if you are developing software.
-There are plenty of data archiving repositories. We recommend to check the Longwood Research Data management website at Harvard for a quick overview. Some of the most well-known are:
+There are plenty of general archiving repositories. We recommend checking the Longwood Research Data management website at Harvard for a quick overview. Some of the most well-known are: