Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Quarto GHA Workflow Runner
committed
Nov 14, 2023
1 parent
8323c44
commit fdc5687
Showing
3 changed files
with
115 additions
and
36 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
863d1db0 | ||
f9ae2ee0 |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,7 +7,7 @@ | |
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes"> | ||
|
||
|
||
<title>BigPicture - Downloading data</title> | ||
<title>BigPicture - Download guide</title> | ||
<style> | ||
code{white-space: pre-wrap;} | ||
span.smallcaps{font-variant: small-caps;} | ||
|
@@ -178,7 +178,7 @@ | |
</nav> | ||
<nav class="quarto-secondary-nav" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }"> | ||
<div class="container-fluid d-flex justify-content-between"> | ||
<h1 class="quarto-secondary-nav-title">Downloading data</h1> | ||
<h1 class="quarto-secondary-nav-title">Download guide</h1> | ||
<button type="button" class="quarto-btn-toggle btn" aria-label="Show secondary navigation"> | ||
<i class="bi bi-chevron-right"></i> | ||
</button> | ||
|
@@ -193,7 +193,7 @@ <h1 class="quarto-secondary-nav-title">Downloading data</h1> | |
<ul class="list-unstyled mt-1"> | ||
<li class="sidebar-item"> | ||
<div class="sidebar-item-container"> | ||
<a href="../download/downloading-data.html" class="sidebar-item-text sidebar-link active">Downloading data</a> | ||
<a href="../download/downloading-data.html" class="sidebar-item-text sidebar-link active">Download guide</a> | ||
</div> | ||
</li> | ||
</ul> | ||
|
@@ -205,10 +205,14 @@ <h1 class="quarto-secondary-nav-title">Downloading data</h1> | |
<h2 id="toc-title">On this page</h2> | ||
|
||
<ul> | ||
<li><a href="#prepare-your-system" id="toc-prepare-your-system" class="nav-link active" data-scroll-target="#prepare-your-system">Prepare your system</a></li> | ||
<li><a href="#install-the-sda-cli-tool" id="toc-install-the-sda-cli-tool" class="nav-link active" data-scroll-target="#install-the-sda-cli-tool">Install the sda-cli tool</a></li> | ||
<li><a href="#generate-public-and-secret-key" id="toc-generate-public-and-secret-key" class="nav-link" data-scroll-target="#generate-public-and-secret-key">Generate public and secret key</a></li> | ||
<li><a href="#decrypt-the-data-files" id="toc-decrypt-the-data-files" class="nav-link" data-scroll-target="#decrypt-the-data-files">Decrypt the data files</a></li> | ||
<li><a href="#validate-the-decrypted-files" id="toc-validate-the-decrypted-files" class="nav-link" data-scroll-target="#validate-the-decrypted-files">Validate the decrypted files</a></li> | ||
<li><a href="#notify-admins-via-email" id="toc-notify-admins-via-email" class="nav-link" data-scroll-target="#notify-admins-via-email">Notify admins via email</a></li> | ||
<li><a href="#receiving-email-from-an-admin" id="toc-receiving-email-from-an-admin" class="nav-link" data-scroll-target="#receiving-email-from-an-admin">Receiving email from an admin</a></li> | ||
<li><a href="#download-dataset" id="toc-download-dataset" class="nav-link" data-scroll-target="#download-dataset">Download dataset</a></li> | ||
<li><a href="#decrypt-the-files" id="toc-decrypt-the-files" class="nav-link" data-scroll-target="#decrypt-the-files">Decrypt the files</a></li> | ||
<li><a href="#validating-decrypted-files" id="toc-validating-decrypted-files" class="nav-link" data-scroll-target="#validating-decrypted-files">Validating decrypted files</a></li> | ||
<li><a href="#notify-us" id="toc-notify-us" class="nav-link" data-scroll-target="#notify-us">Notify us</a></li> | ||
</ul> | ||
</nav> | ||
</div> | ||
|
@@ -217,7 +221,7 @@ <h2 id="toc-title">On this page</h2> | |
|
||
<header id="title-block-header" class="quarto-title-block default"> | ||
<div class="quarto-title"> | ||
<h1 class="title d-none d-lg-block">Downloading data</h1> | ||
<h1 class="title d-none d-lg-block">Download guide</h1> | ||
</div> | ||
|
||
|
||
|
@@ -232,37 +236,56 @@ <h1 class="title d-none d-lg-block">Downloading data</h1> | |
|
||
</header> | ||
|
||
<p>This section provides instructions on how to download the data files. The instructions contain the following steps:</p> | ||
<p>This section provides guidelines on the necessary steps to download data from BigPicture.</p> | ||
<section id="install-the-sda-cli-tool" class="level2"> | ||
<h2 class="anchored" data-anchor-id="install-the-sda-cli-tool">Install the sda-cli tool</h2> | ||
<p>Follow the guidelines <a href="../submission/submission-guide.html#install-the-sda-cli-tool">here</a> to install the sda-cli tool.</p> | ||
</section> | ||
<section id="generate-public-and-secret-key" class="level2"> | ||
<h2 class="anchored" data-anchor-id="generate-public-and-secret-key">Generate public and secret key</h2> | ||
<p>The initial step involves creating a crypt4gh keypair using the sda-cli:</p> | ||
<div class="sourceCode" id="cb1"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./sda-cli</span> createKey <span class="op"><</span>keypair_name<span class="op">></span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | ||
<p>where <code>keypair_name</code> is the base name of the key files. The command creates two keys named <code>keypair_name.pub.pem</code> and <code>keypair_name.sec.pem</code>. The public key (<code>pub</code>) is sent to the admins (see the next step), who use it to encrypt the files. The secret key (<code>sec</code>) stays with the requester, who uses it to decrypt the files after downloading them.</p> | ||
</section> | ||
<section id="notify-admins-via-email" class="level2"> | ||
<h2 class="anchored" data-anchor-id="notify-admins-via-email">Notify admins via email</h2> | ||
<p>Send an <a href="mailto:[email protected]">email to the admins</a> notifying them that you want to download files from BigPicture. Include the following in the email:</p> | ||
<ul> | ||
<li>The name of the dataset that you want to download.</li> | ||
<li>The crypt4gh public key (<code>keypair_name.pub.pem</code>) generated in the previous step, as an attachment.</li> | ||
</ul> | ||
</section> | ||
<section id="receiving-email-from-an-admin" class="level2"> | ||
<h2 class="anchored" data-anchor-id="receiving-email-from-an-admin">Receiving email from an admin</h2> | ||
<p>After publicizing the encrypted data in a public folder in the S3 outbox bucket, an admin will send an email to the requester. This email contains a URL leading to a text file named <code>urls_list.txt</code>, which lists the paths of all files in the dataset.</p> | ||
<p>The number of URLs will be equal to the number of datasets requested for download.</p> | ||
</section> | ||
<section id="download-dataset" class="level2"> | ||
<h2 class="anchored" data-anchor-id="download-dataset">Download dataset</h2> | ||
<p>Steps for downloading the data:</p> | ||
<ul> | ||
<li>Download the <code>urls_list.txt</code> file:</li> | ||
</ul> | ||
<div class="sourceCode" id="cb2"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a> <span class="ex">curl</span> <span class="at">-OL</span> <span class="op"><</span>url-included-in-email<span class="op">></span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | ||
<ul> | ||
<li>Prepare your system</li> | ||
<li>Generate public and secret key</li> | ||
<li>Decrypt the files</li> | ||
<li>Validate the decrypted files</li> | ||
<li>Use the <code>sda-cli</code> to retrieve all dataset files:</li> | ||
</ul> | ||
<section id="prepare-your-system" class="level3"> | ||
<h3 class="anchored" data-anchor-id="prepare-your-system">Prepare your system</h3> | ||
<p>To download and decrypt files, you first need to get the <a href="https://www.ga4gh.org/news/crypt4gh-a-secure-method-for-sharing-human-genetic-data/">crypt4gh</a> tool.</p> | ||
<div class="sourceCode" id="cb3"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a> <span class="ex">./sda-cli</span> download <span class="op"><</span>path-to-urls_list.txt<span class="op">></span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | ||
</section> | ||
<section id="generate-public-and-secret-key" class="level3"> | ||
<h3 class="anchored" data-anchor-id="generate-public-and-secret-key">Generate public and secret key</h3> | ||
<p>The first step is to generate a keypair using the <code>crypt4gh</code> encryption tool. This can be done using the following command. There are two keys generated - <code>user.sec</code> is the secret key and <code>user.pub</code> is the public key. You must reply to the Helpdesk by sending them the generated public key - <code>user.pub</code>.</p> | ||
<div class="sourceCode" id="cb1"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="ex">crypt4gh-keygen</span> <span class="at">--sk</span> user.sec <span class="at">--pk</span> user.pub</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | ||
<section id="decrypt-the-files" class="level2"> | ||
<h2 class="anchored" data-anchor-id="decrypt-the-files">Decrypt the files</h2> | ||
<p>The files can be decrypted by using the <code>sda-cli</code>:</p> | ||
<div class="sourceCode" id="cb4"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a> <span class="ex">./sda-cli</span> decrypt <span class="at">-key</span> <span class="op"><</span>keypair_name.sec.pem<span class="op">></span> <span class="op"><</span>file-path<span class="op">></span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | ||
<p>Decrypt multiple files by appending file paths:</p> | ||
<div class="sourceCode" id="cb5"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a> <span class="ex">./sda-cli</span> decrypt <span class="at">-key</span> <span class="op"><</span>keypair_name.sec.pem<span class="op">></span> <span class="op"><</span>file-path-1<span class="op">></span> <span class="op"><</span>file-path-2<span class="op">></span> <span class="op"><</span>file-path-3<span class="op">></span> ...</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | ||
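<p>For a dataset with many files, typing every path by hand quickly becomes impractical. The following is a minimal sketch, not part of the official guide, that lets <code>find</code> collect the encrypted files and append them to the same <code>decrypt</code> invocation. The function name <code>decrypt_all</code>, the <code>.c4gh</code> file extension, and the dry-run fallback are assumptions made for illustration.</p>

```shell
# Hypothetical helper: decrypt every .c4gh file found below a directory.
# Arguments: <dataset-dir> <secret-key-file> [path-to-sda-cli]
decrypt_all() {
  dataset_dir="$1"
  secret_key="$2"
  cli="${3:-./sda-cli}"

  if [ -x "$cli" ]; then
    # "{} +" makes find append all matching paths to one command line,
    # mirroring the multi-file form of the decrypt command shown above.
    find "$dataset_dir" -type f -name '*.c4gh' \
      -exec "$cli" decrypt -key "$secret_key" {} +
  else
    # The tool is not executable at the given path: print the command
    # that would be run instead of running it.
    find "$dataset_dir" -type f -name '*.c4gh' \
      -exec echo "dry run: $cli decrypt -key $secret_key" {} +
  fi
}

# Example call (paths are placeholders):
# decrypt_all ./my-dataset keypair_name.sec.pem
```

<p>If no matching files exist, <code>find</code> does not run the command at all, so the helper produces no output in that case.</p>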
</section> | ||
<section id="decrypt-the-data-files" class="level3"> | ||
<h3 class="anchored" data-anchor-id="decrypt-the-data-files">Decrypt the data files</h3> | ||
<p>You will receive an email from the Helpdesk that contains a URL. This URL contains links to download the encrypted data files. Get our <a href="download-data-script.qmd">script for downloading data</a> and then use the following command.</p> | ||
<div class="sourceCode" id="cb2"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./download_data.sh</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | ||
<p>In case the name of the text file is changed or it exists in a different path than the download script, run:</p> | ||
<div class="sourceCode" id="cb3"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="ex">./download_data.sh</span> path/filename</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | ||
<p>This downloads all the encrypted files listed in the text file while preserving the structure of the dataset. Next, transfer the downloaded encrypted files and the secret key to a secure environment. Use the following command to decrypt the files inside the secure environment with the secret key <code>user.sec</code> that was generated in the previous step.</p> | ||
<div class="sourceCode" id="cb4"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="ex">crypt4gh</span> decrypt <span class="at">--sk</span> user.sec <span class="op"><</span> encrypted-file.c4gh <span class="op">></span> encrypted-file</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | ||
<p>Ensure that the <code>encrypted-file.c4gh</code> and the corresponding <code>encrypted-file</code> are in the same folder.</p> | ||
<section id="validating-decrypted-files" class="level2"> | ||
<h2 class="anchored" data-anchor-id="validating-decrypted-files">Validating decrypted files</h2> | ||
<p>During the publicize process, a file named <code>checksums_list.sha256</code> is generated, containing <code>sha256</code> checksums of all unencrypted files. Use this file to validate the correctness of downloaded files and ensure no issues occurred during the download.</p> | ||
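<p>As a rough illustration of how such a check can be run, the sketch below uses <code>sha256sum -c</code>, the command the previous version of this page used. It assumes GNU coreutils is available and that the checksum file lists paths relative to the folder containing the decrypted files. The scratch directory and <code>sample.dcm</code> are stand-ins created only so that the example is self-contained.</p>

```shell
# Hypothetical scratch directory standing in for the folder with decrypted files.
workdir="$(mktemp -d)"

# Stand-in for one decrypted file; a real dataset already contains its files.
printf 'example image data\n' > "$workdir/sample.dcm"

# Stand-in for the checksums_list.sha256 file that accompanies the dataset.
(cd "$workdir" && sha256sum sample.dcm > checksums_list.sha256)

# Verify every file listed in the checksum file.
# sha256sum exits with a non-zero status if any file is missing or differs.
if (cd "$workdir" && sha256sum -c checksums_list.sha256); then
  verify_status="ok"
else
  verify_status="failed"
fi
echo "verification: $verify_status"
```

<p>With a real dataset, only the <code>sha256sum -c checksums_list.sha256</code> step is needed, run from the folder that the paths in the checksum file are relative to.</p>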
</section> | ||
<section id="validate-the-decrypted-files" class="level3"> | ||
<h3 class="anchored" data-anchor-id="validate-the-decrypted-files">Validate the decrypted files</h3> | ||
<p>The next step is to validate the decrypted files. This can be done by calculating checksums of the downloaded files. Executing the <code>download_data</code> script downloads a file <code>checksums_list.sha256</code> that contains the list of checksums. The following command validates the decrypted files.</p> | ||
<div class="sourceCode" id="cb5"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="fu">sha256sum</span> <span class="at">-c</span> checksums_list.sha256</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div> | ||
<p>In the end, please <a href="mailto:[email protected]">email us</a> to confirm successful download and decryption.</p> | ||
<section id="notify-us" class="level2"> | ||
<h2 class="anchored" data-anchor-id="notify-us">Notify us</h2> | ||
<p>Finally, send an <a href="mailto:[email protected]">email to the admins</a> to confirm the successful download and decryption, or to notify them if an error occurred.</p> | ||
|
||
|
||
</section> | ||
|