initial gateaux output downloader #3

Open
tyla123 opened this issue Jun 11, 2024 · 0 comments
Labels
enhancement New feature or request


tyla123 commented Jun 11, 2024

Leaving code here for future use and until I add it to the bakeR package.

How it works:

  • By default, it fetches the JSON for the first two pages of jobs for the report name you specify (that is, every job listed on those pages, with its details).
  • It then filters that JSON (by this point a data frame) down to the report ids you specify and downloads each job's output as a zip file into a single output/ directory inside your current working directory.
  • You can specify any number of report ids (job IDs), but if the jobs you want span many recent runs (for example, more than 20), increase the page range via page_no so that all of them are covered.
gateaux_download_output <- function(report_name,
                                    report_id,
                                    JWT,
                                    page_no = 0:1,
                                    server = 'gateaux.io') {
  # rjson parses the API response; dplyr supplies %>% and filter()
  library(rjson)
  library(dplyr)

  ## By default, this gets the JSON for all jobs on page 1 [index = 0] & page 2 [index = 1]
  downloadURL <- list()
  for (pn in seq_along(page_no)) {
    # -s silences curl's progress meter; the URL is quoted so the shell leaves '?' alone
    call_getURL <- sprintf('curl -s -H "Authorization: Bearer %s" -H "Content-Type: application/json" "https://%s/api/jobs/%s?page=%s"',
                           JWT, server, report_name, page_no[pn])
    # system() returns one element per output line, so collapse before parsing
    json_getURL <- rjson::fromJSON(paste(system(call_getURL, intern = TRUE), collapse = ''))
    if (length(json_getURL$results) == 0) next  # nothing listed on this page

    downloadURL[[pn]] <- data.frame(jobID = sapply(json_getURL$results, function(i) i[["id"]]),
                                    fURL = sapply(json_getURL$results, function(i) i[["files_url"]]),
                                    stringsAsFactors = FALSE)
  }

  # Combine the pages, then keep only the requested job IDs
  downloadURL <- do.call(rbind, downloadURL)
  if (is.null(downloadURL)) stop('No jobs found for report: ', report_name)
  downloadURL <- downloadURL %>% filter(jobID %in% report_id)

  if (!dir.exists('output')) dir.create('output')
  message('Downloading ...')
  # seq_len() skips the loop cleanly when no job ID matched
  for (f in seq_len(nrow(downloadURL))) {
    out_file <- file.path('output', paste0(f, '_', report_name, '.zip'))
    message(sprintf('[%s] %s -- %s', f, report_name, downloadURL$jobID[f]))
    # shQuote() protects the path and URL from shell interpretation
    call_download <- sprintf('curl -s -o %s %s', shQuote(out_file), shQuote(downloadURL$fURL[f]))
    system(call_download)
    message('Stored here: ', file.path(getwd(), out_file))
  }
}
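
The parse-and-filter step inside the function can be tried offline without a token. The sketch below is only an illustration: `mock_page` and `page_to_df` are hypothetical names, the IDs and URLs are made-up placeholders, and the only things taken from the code above are the `results`, `id` and `files_url` field names.

```r
# Mock of one parsed page, shaped like the `results` list the function reads.
# The IDs and URLs are made-up placeholders, not real gateaux responses.
mock_page <- list(results = list(
  list(id = 10680, files_url = "https://example.invalid/files/10680"),
  list(id = 10681, files_url = "https://example.invalid/files/10681"),
  list(id = 10700, files_url = "https://example.invalid/files/10700")
))

# Hypothetical helper: flatten one parsed page into a data frame (jobID, fURL).
page_to_df <- function(page) {
  data.frame(
    jobID = vapply(page$results, function(i) as.numeric(i[["id"]]), numeric(1)),
    fURL  = vapply(page$results, function(i) i[["files_url"]], character(1)),
    stringsAsFactors = FALSE
  )
}

# Stack all pages, then keep only the requested job IDs.
jobs   <- do.call(rbind, lapply(list(mock_page), page_to_df))
wanted <- jobs[jobs$jobID %in% 10680:10689, ]
print(wanted)
```

Job 10700 is dropped because it falls outside the requested range, which is the same effect `filter(jobID %in% report_id)` has in the function.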

An example:

gateaux_download_output(report_name = 'Shark-catch-rxn-models',
                        report_id = c(10680:10689),
                        JWT = '{{ token here }}')
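
If the jobs you want reach further back than the first two pages, `page_no` needs to be widened. A rough way to pick the range is sketched below; `pages_for` is a hypothetical helper, and the page size of 20 is an assumption based on the "> 20" note above, not a confirmed property of the API.

```r
# Hypothetical helper: zero-based page indices needed to cover the
# `n_recent` most recent jobs, ASSUMING `page_size` jobs per page.
pages_for <- function(n_recent, page_size = 20) {
  stopifnot(n_recent >= 1, page_size >= 1)
  seq(0, ceiling(n_recent / page_size) - 1)
}

pages_for(45)  # three pages: 0, 1, 2
```

The result can be passed straight through, for example `page_no = pages_for(45)`, in the call to `gateaux_download_output()`.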
@tyla123 tyla123 added the enhancement New feature or request label Jun 11, 2024
@tyla123 tyla123 self-assigned this Jun 12, 2024