Showing 1 changed file with 38 additions and 25 deletions.
@@ -73,31 +73,6 @@ A typical command might look like this:
# Dana Farber MBCF
## Notes

- Zach and co. always share raw data, but they may default to sharing it through their pydio web interface, which is not reliable.
- If you email Zach ([email protected]) and tell him whose data you need (cc the researcher), he will set up an FTP site for you to use.
- Make sure to let them know once you've pulled down the data, so they can turn off the site when you're done (it costs money to run this).
- Their data is typically in tar.gz files; it can pay off to decompress them right away so you know whether you have the whole file (a quick integrity check is sketched below).
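A quick way to confirm a tar.gz came over intact, without waiting for a full decompress, is to list the archive and throw the listing away; this is a minimal sketch, and the filename is a hypothetical example:

```bash
# reads the whole gzip stream; a truncated download will error out here
tar -tzf run_data.tar.gz > /dev/null && echo "archive looks complete"
```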
## Getting the data

- You can access the data through a `wget` command.
- Preface it with `nohup` so your job keeps running even if your connection drops.
- The final nohup.out file will have the download progress in it if you want to confirm the download (see the sketch below).

A typical command might be something like this:

`nohup wget -m ftp://userid:[email protected]/*`

`-m` (mirror) copies a mirror image of the directory/data, including all files and subfolders.
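As a rough sketch of the whole pattern (the username, password, and host below are hypothetical placeholders; run it from the directory you want the mirror to land in):

```bash
# start the mirror with nohup so it survives a dropped connection
nohup wget -m ftp://USERNAME:[email protected]/ &

# watch the progress that nohup collects in nohup.out, to confirm it finishes
tail -f nohup.out
```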
Use this if `nohup` isn't working. Double-check the username, password, and IP address, as they change:

`wget -m ftp://HSPH_bfx:MBCFHSPH_bfx\[email protected]`

*Note the escaped exclamation point in the password (`\!`); they like to put characters like that in their passwords. (Old: `wget -m ftp://jhutchinson:MBCFjhutchinson\[email protected]`)*
## MBCF Google Bucket

- Zach set up a Google bucket, since the FTP server was painfully slow for data downloads over 1 TB (3 days vs. 6 hours).
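A minimal sketch of pulling a run folder down from the bucket with `gsutil` (this assumes gcloud authentication is already set up; the bucket/run path is the one from the copy log further down and is only an example):

```bash
# -m parallelizes the transfer, -r recurses into the run folder
gsutil -m cp -r gs://mbcf-hsph/231011_KT10562_fastq .
```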
@@ -138,6 +113,44 @@ Copying gs://mbcf-hsph/231011_KT10562_fastq/multiqc_report.html...
HMS RC also suggested Rclone; here is a good link to get you started: https://rclone.org/googlecloudstorage/
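A rough sketch of the Rclone route, assuming you have already run `rclone config` and named the Google Cloud Storage remote `gcs` (the bucket path is the same example as above):

```bash
# copy the run folder from the bucket to the current directory, showing progress
rclone copy -P gcs:mbcf-hsph/231011_KT10562_fastq ./231011_KT10562_fastq
```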
## OLD DFCI Notes - still helpful for wget!

- Zach and co. always share raw data.
- If you email Zach ([email protected]) and tell him whose data you need (cc the researcher), he will set up an FTP site for you to use.
- Make sure to let them know once you've pulled down the data, so they can turn off the site when you're done (it costs money to run this).
- Their data is typically in tar.gz files; it can pay off to decompress them right away so you know whether you have the whole file.
## Getting the data

- You can access the data through a `wget` command.
- Preface it with `nohup` so your job keeps running even if your connection drops.
- The final nohup.out file will have the download progress in it if you want to confirm the download.

A typical command might be something like this:

`nohup wget -m ftp://userid:[email protected]/*`

`-m` (mirror) copies a mirror image of the directory/data, including all files and subfolders.
Use this if `nohup` isn't working. Double-check the username, password, and IP address, as they change:

`wget -m ftp://HSPH_bfx:MBCFHSPH_bfx\[email protected]`

*Note the escaped exclamation point in the password (`\!`); they like to put characters like that in their passwords. (Old: `wget -m ftp://jhutchinson:MBCFjhutchinson\[email protected]`)*
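An alternative to backslash-escaping, assuming an interactive bash session, is to single-quote the whole URL so the shell never tries to history-expand the `!` (the credentials and host below are made up):

```bash
# inside single quotes, bash leaves ! and other special characters alone
nohup wget -m 'ftp://USERNAME:p4ssw0rd![email protected]/' &
```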
# CosMx Data from BWH
- Only for the Clark lab and CosMx data so far, but who knows...
- Get an email from the lab, then schedule a time with Miles Tran (mtran26 at bwh dot harvard dot edu). (It's great that the Clark lab downloads the data at the same time, so we know they have a copy of the data.)
- They use an AWS download service and send a tarball. Apparently AWS opens permissions on the tarball, so they send a link that's good for 15 minutes.
- At the scheduled time, Miles sends a bit.ly code. Use `wget` and the code (previously he had sent a very long code with instructions to put it in 'single quotes', but that never worked for me, so he sends the bit.ly now). In the example below, I made up the code for the transfer (i.e. https://bit.ly/7d34a6e), and the transferred tarball would then be called 7d34a6e.
- (Preface `wget` with `nohup` so your job keeps running even if your connection drops.)
- What I do (see the sketch after this list):
  - Log in to O2 (transfer node or not) and go to the appropriate directory.
  - `nohup wget https://bit.ly/7d34a6e`
  - `tar zxvf 7d34a6e`
  - Done!
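Putting those steps together, a minimal sketch might look like this (the bit.ly code is the made-up one from the example above; `-O` just gives the tarball a more descriptive name):

```bash
# download in the background so a dropped connection doesn't kill the transfer
nohup wget -O cosmx_transfer.tar.gz https://bit.ly/7d34a6e &

# once nohup.out shows the download has finished, unpack it
tar -zxvf cosmx_transfer.tar.gz
```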
# Broad Institute
- It can depend on the platform the researcher used, but the Broad typically only gives out BAM files for normal RNA-seq runs.