cildatadownloader.py

Chris Churas edited this page Jan 25, 2018 · 3 revisions

This tool downloads image and video datasets over HTTP. Below is the script's output when --help is passed.

usage: cildatadownloader.py [-h] [--log {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
                            [--id ID] [--skipifexists] [--retryfailed]
                            [--numretries NUMRETRIES]
                            [--retrysleep RETRYSLEEP] [--timeout TIMEOUT]
                            [--version]
                            databaseconf destdir

              Version 0.1.0

              Downloads images and videos from legacy Cell Image Library
              website and Omero webservice. This is one of three programs
              needed to retrieve & convert data. The other two are
              cildataconverter.py and cildataupdatedb.py. Invoking
              those programs with --help will provide more information.

              This script first gets a list of image and video dataset ids
              by querying the database defined by the first argument
              (databaseconf) to this program. The script then downloads,
              via http, the images and videos to images/ & videos/
              subdirectories under the second argument (destdir).

              Each dataset ID gets its own directory, and within it a JSON
              file is written containing information about the data
              downloaded.

              For image datasets the following files are downloaded:

              http://cellimagelibrary.org/images/download_jpeg/<ID>.jpg
              http://grackle.crbs.ucsd.edu:8080/OmeroWebService/images/images/<ID>.tif
              http://grackle.crbs.ucsd.edu:8080/OmeroWebService/images/images/<ID>.raw

              For video datasets the following files are downloaded:

              http://cellimagelibrary.org/images/download_jpeg/<ID>.jpg
              http://cellimagelibrary.org/videos/<ID>.flv
              http://grackle.crbs.ucsd.edu:8080/OmeroWebService/images/images/<ID>.tif
              http://grackle.crbs.ucsd.edu:8080/OmeroWebService/images/images/<ID>.raw

              Database configuration file:

              The first argument is expected to be a database configuration
              file. This file should have the following format:

              [postgres]

              user = <USER>
              password = <PASSWORD>
              port = <PORT>
              host = <HOST>
              database = <DATABASE_NAME>

              Example:

              [postgres]

              user = bob
              password = 12345
              port = 5432
              host = mydb.foo.com
              database = cildb

              For more information please visit:

              https://github.com/slash-segmentation/cildata_util/wiki

    

positional arguments:
  databaseconf          Database configuration file
  destdir               Directory where images and videos will be saved

optional arguments:
  -h, --help            show this help message and exit
  --log {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        Set the logging level (default WARNING)
  --id ID               Only download data with id passed in.
  --skipifexists        Skip download if directory for id exists on filesystem
  --retryfailed         Going off of filesystem, retry any failed downloads
  --numretries NUMRETRIES
                        Number of attempts to make at downloading a file
  --retrysleep RETRYSLEEP
                        Number of seconds to wait before retrying a download
  --timeout TIMEOUT     Number of seconds to wait for response from http when
                        downloading a file
  --version             show program's version number and exit
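
The help text above describes two things that are easy to sketch in code: the `[postgres]` configuration file and the per-dataset download URLs. The following is a minimal illustration only, not the script's actual implementation. The function names `read_database_conf` and `dataset_urls`, the `is_video` flag, and the dataset ID `10` are hypothetical, and the use of Python 3's `configparser` is an assumption. The URL patterns and the example configuration values are copied from the help text.

```python
import configparser

# URL templates copied from the help text; <ID> is written as {id}.
JPEG_URL = "http://cellimagelibrary.org/images/download_jpeg/{id}.jpg"
FLV_URL = "http://cellimagelibrary.org/videos/{id}.flv"
OMERO_URL = (
    "http://grackle.crbs.ucsd.edu:8080/OmeroWebService/images/images/{id}{ext}"
)


def read_database_conf(text):
    """Parse the [postgres] section of a databaseconf file into a dict.

    Hypothetical helper: the real script may read the file differently.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["postgres"]
    return {
        "user": section["user"],
        "password": section["password"],
        "port": section.getint("port"),
        "host": section["host"],
        "database": section["database"],
    }


def dataset_urls(dataset_id, is_video=False):
    """Return the URLs the help text lists for one dataset ID.

    Hypothetical helper: video datasets get one extra .flv URL.
    """
    urls = [JPEG_URL.format(id=dataset_id)]
    if is_video:
        urls.append(FLV_URL.format(id=dataset_id))
    for ext in (".tif", ".raw"):
        urls.append(OMERO_URL.format(id=dataset_id, ext=ext))
    return urls


# Example configuration taken from the help text.
EXAMPLE_CONF = """
[postgres]

user = bob
password = 12345
port = 5432
host = mydb.foo.com
database = cildb
"""

if __name__ == "__main__":
    print(read_database_conf(EXAMPLE_CONF))
    # 10 is a made-up dataset ID used only for illustration.
    for url in dataset_urls(10, is_video=True):
        print(url)
```

Calling `dataset_urls` with `is_video=False` yields the three image URLs (`.jpg`, `.tif`, `.raw`). Passing `is_video=True` adds the `.flv` URL, matching the two lists in the help text.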