NCBI Example

While developing this tool, one of the primary use cases considered was the National Center for Biotechnology Information's (NCBI) Sequence Read Archive (SRA). In this case, an FTP server hosts biological sequence data. NCBI also supports Aspera Connect for data transfers. The goal is to transfer the file using the faster Aspera client even if the user requests a file from a URL pointing to the FTP server.

In this case, the metadata repository would be configured with two data sources:

  • NCBI SRA FTP
  • NCBI SRA Aspera

The NCBI SRA FTP data source would be configured with a URL matcher that matches URLs with the scheme ftp and the host ftp-trace.ncbi.nlm.nih.gov. It would also be configured with a URL transform that maps URLs from the FTP data source to the Aspera data source by replacing the URL scheme.

The Aspera data source would be configured to use Aspera as the transfer mechanism and would also store the Aspera client parameters required for NCBI.
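As a rough illustration of the matcher and transform described above, here is a minimal Python sketch. The class names `UrlMatcher` and `SchemeTransform` are hypothetical and are not taken from the tool itself. The replacement scheme `fasp` is also an assumption, since this document only says that the scheme is replaced, not what it is replaced with.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit, urlunsplit


@dataclass(frozen=True)
class UrlMatcher:
    """Hypothetical matcher: accepts URLs with a given scheme and host."""
    scheme: str
    host: str

    def matches(self, url: str) -> bool:
        parts = urlsplit(url)
        # urlsplit lowercases the scheme; .hostname is lowercased as well.
        return parts.scheme == self.scheme and parts.hostname == self.host


@dataclass(frozen=True)
class SchemeTransform:
    """Hypothetical transform: maps a URL to another data source by
    replacing only its scheme and leaving host and path untouched."""
    new_scheme: str

    def apply(self, url: str) -> str:
        parts = urlsplit(url)
        return urlunsplit(parts._replace(scheme=self.new_scheme))


# Configuration for the NCBI SRA FTP data source as described in the text.
sra_ftp_matcher = UrlMatcher(scheme="ftp", host="ftp-trace.ncbi.nlm.nih.gov")
# "fasp" is an assumed placeholder for whatever scheme the Aspera source uses.
ftp_to_aspera = SchemeTransform(new_scheme="fasp")
```

Keeping the matcher and the transform as separate objects mirrors the way the text describes them as two independent pieces of data source configuration.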

When a user attempts to transfer a file from the SRA FTP server, such as ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR039/SRR039884/SRR039884.sra, using the BDSS transfer client, the following happens:

  1. The client requests alternate URLs for the file from the metadata repository.
  2. The metadata repository uses its URL matcher configuration to match the URL to the NCBI SRA FTP data source.
  3. The metadata repository applies the NCBI SRA FTP data source's URL transforms to map the URL to the Aspera data source.
  4. The metadata repository responds with the alternate URL and the transfer mechanism (Aspera) information.
  5. The client uses Aspera to download the file.
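The sequence above can be sketched end to end in Python. This is only an in-memory stand-in under stated assumptions: the `DATA_SOURCES` layout, the function names, and the `fasp` scheme are all hypothetical, the Aspera client parameters are left empty because this document does not list them, and the final download step is represented by returning the chosen URL and mechanism rather than invoking a real client.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical stand-in for the metadata repository's configuration.
DATA_SOURCES = {
    "NCBI SRA FTP": {
        "matcher": {"scheme": "ftp", "host": "ftp-trace.ncbi.nlm.nih.gov"},
        "transfer_mechanism": "ftp",
        # Each transform names a target data source and a replacement scheme.
        # The scheme name "fasp" is an assumption.
        "transforms": [{"to": "NCBI SRA Aspera", "new_scheme": "fasp"}],
    },
    "NCBI SRA Aspera": {
        "matcher": None,
        "transfer_mechanism": "aspera",
        "client_parameters": {},  # not specified in this document
        "transforms": [],
    },
}


def find_data_source(url):
    """Step 2: match the URL to a data source by scheme and host."""
    parts = urlsplit(url)
    for name, source in DATA_SOURCES.items():
        matcher = source["matcher"]
        if matcher is None:
            continue
        if parts.scheme == matcher["scheme"] and parts.hostname == matcher["host"]:
            return name
    return None


def alternate_urls(url):
    """Steps 1, 3 and 4: the repository's answer to a client request."""
    name = find_data_source(url)
    if name is None:
        return []
    alternates = []
    for transform in DATA_SOURCES[name]["transforms"]:
        target = DATA_SOURCES[transform["to"]]
        new_url = urlunsplit(urlsplit(url)._replace(scheme=transform["new_scheme"]))
        alternates.append(
            {"url": new_url, "transfer_mechanism": target["transfer_mechanism"]}
        )
    return alternates


def plan_transfer(url):
    """Step 5: the client prefers the first alternate, if there is one."""
    alternates = alternate_urls(url)
    if alternates:
        return alternates[0]
    # No alternate known: fall back to the URL exactly as requested.
    return {"url": url, "transfer_mechanism": urlsplit(url).scheme}


requested = (
    "ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/"
    "SRR/SRR039/SRR039884/SRR039884.sra"
)
plan = plan_transfer(requested)
# plan["transfer_mechanism"] is "aspera" and plan["url"] keeps the same
# host and path with only the scheme replaced.
```

A URL that matches no data source falls through unchanged, so in this sketch the client would still transfer it with whatever mechanism the original scheme implies.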