Update codes

silasprincipe committed Jan 15, 2024
1 parent cf53c7e commit 9edc771
Showing 13 changed files with 987 additions and 1,262 deletions.
4 changes: 4 additions & 0 deletions .gitignore
@@ -50,3 +50,7 @@ po/*~
rsconnect/

codes/drafts/

data/

functions/drafts/
422 changes: 422 additions & 0 deletions README.html

Large diffs are not rendered by default.

32 changes: 27 additions & 5 deletions README.md
@@ -1,9 +1,31 @@
# Understanding how temperature is changing on the Marine Heritage Sites
# eDNA expeditions
## Environmental information on Marine Heritage Sites

Work in progress. Experimental.
This repository documents the process to obtain and analyse environmental information (sea temperature and oxygen) on Marine Heritage Sites. Environmental information is downloaded from Copernicus, from the following data sources:

See the page at https://iobis.github.io/marineheritage_sst
- SST: [Global Ocean Physics Reanalysis](https://data.marine.copernicus.eu/product/GLOBAL_MULTIYEAR_PHY_001_030/description)
- Oxygen: [Global Ocean Biogeochemistry Hindcast](https://data.marine.copernicus.eu/product/GLOBAL_MULTIYEAR_BGC_001_029/description)

To add a section with 'climate vulnerability' for the most abundant/present species (?)
You can see the live page at https://iobis.github.io/marineheritage_sst

![image](https://github.com/iobis/marineheritage_sst/assets/53846571/db63c648-b78a-4154-9559-c042efd3a607)
## Codes

### Data download

There are several ways of obtaining Copernicus satellite data. We provide code for three possible pathways:

1. From WEkEO (`download_temperature_wekeo.ipynb`) - WEkEO is a service from Copernicus that provides a virtual environment (JupyterHub) for satellite data processing. Because all Copernicus data is accessible directly from the virtual environment, this is the fastest way of obtaining the data. WEkEO is free and an account can be created here. Once you have set up your virtual environment, open the Jupyter notebook provided here to download the data.
2. Using OpenDAP (`download_temperature_opendap.R`) - OpenDAP is a data access protocol that is widely used to access satellite data. OpenDAP is also a very quick way to access Copernicus data, and here we adapt code provided by Jorge Assis to subset and download the information we need. Note, however, that OpenDAP support for Copernicus is being deprecated in mid-2024 in favor of the new Python API (see below).
3. Using the new Copernicus API (`download_*_toolbox.R`) - Copernicus introduced major changes in its Marine Data Store in 2023, including a new [toolbox](https://help.marine.copernicus.eu/en/articles/7949409-copernicus-marine-toolbox-introduction) for data access. Unfortunately, the solution is Python-only, so there is no R equivalent. Using it from R relies either on a system interface to the CLI or on a bridge through the `reticulate` package (the approach used here). The code `get_depth_profiles.R` is used to obtain the nearest available depth for each of the chosen depths.
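The depth-matching step handled by `get_depth_profiles.R` reduces to a nearest-neighbour lookup on the product's depth axis. A minimal Python sketch of that idea (the `nearest_depth` helper and the axis values are illustrative, not the repository's code):

```python
def nearest_depth(target, available):
    """Return the value in `available` closest to `target` (ties favor shallower)."""
    return min(available, key=lambda d: (abs(d - target), d))

# Illustrative depth axis (metres), resembling a model product's vertical grid
axis = [0.494, 9.573, 25.211, 77.854, 109.729]

nearest_depth(25, axis)   # closest grid depth to the requested 25 m
nearest_depth(0, axis)    # shallowest available level
```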

To be in line with the future changes in the Copernicus services, we adopted pathway 3 to obtain the data. It is necessary to have **Python** and the toolbox installed ([instructions here](https://help.marine.copernicus.eu/en/articles/7970514-copernicus-marine-toolbox-installation); we recommend using `pip`).
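The request the R scripts assemble through `reticulate` maps onto a plain keyword dictionary for the toolbox's `subset()` function. A hedged sketch in Python - `subset_request` is a hypothetical helper, but the argument names and the 0.5-degree buffer are taken directly from the download scripts:

```python
def subset_request(dataset_id, variables, bbox, years, depth, buffer=0.5):
    """Build keyword arguments for a toolbox subset() call.

    `bbox` is (lon_min, lon_max, lat_min, lat_max) in degrees; `buffer`
    widens it on every side, matching `lon_lat_buffer` in the R scripts.
    """
    lon_min, lon_max, lat_min, lat_max = bbox
    return {
        "dataset_id": dataset_id,
        "variables": list(variables),
        "minimum_longitude": lon_min - buffer,
        "maximum_longitude": lon_max + buffer,
        "minimum_latitude": lat_min - buffer,
        "maximum_latitude": lat_max + buffer,
        "start_datetime": f"{min(years)}-01-01T00:00:00",
        "end_datetime": f"{max(years)}-12-31T23:59:59",
        "minimum_depth": depth,
        # a narrow window so only the single requested level is returned
        "maximum_depth": depth + 0.5,
    }

req = subset_request("cmems_mod_glo_bgc_my_0.25_P1D-m", ["o2"],
                     (-10, 10, -10, 10), range(1992, 2022), depth=0)
```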

Oxygen data download proceeds in the same way as for temperature (use the codes containing the word "oxygen").

### Data processing

Data analysis for Marine Heat Waves and Cold Spells was done using the package [`heatwaveR`](https://robwschlegel.github.io/heatwaveR/). More details to be added after the workshop.
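`heatwaveR` implements the Hobday et al. marine-heatwave definition: a day-of-year climatology, a seasonally varying 90th-percentile threshold, and events lasting at least five consecutive days. The event-detection step alone can be sketched as follows (a simplification using a fixed threshold instead of a climatological one; `heat_spells` is a hypothetical helper, not part of `heatwaveR`):

```python
def heat_spells(sst, threshold, min_len=5):
    """Return (start, end) index pairs of runs of >= min_len days above threshold.

    Simplified illustration of the event-detection step only; the real
    heatwaveR workflow first builds a climatology and percentile threshold.
    """
    spells, start = [], None
    for i, temp in enumerate(sst):
        if temp > threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                spells.append((start, i - 1))
            start = None
    if start is not None and len(sst) - start >= min_len:
        spells.append((start, len(sst) - 1))
    return spells

# Six warm days (qualifies), then three warm days (too short to count)
series = [20] * 3 + [23] * 6 + [20] * 4 + [23] * 3 + [20] * 2
heat_spells(series, threshold=22)  # only the six-day run, days 3-8
```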

----

<img src="https://obis.org/images/logo.png" width="200">
136 changes: 136 additions & 0 deletions codes/download_oxygen_toolbox.R
@@ -0,0 +1,136 @@
#################### eDNA expeditions - scientific analysis ####################
########################## Environmental data download #########################
# January of 2024
# Author: Silas C. Principe
# Contact: [email protected]
#
##################### Oxygen concentration from Copernicus #####################

# Load packages ----
library(terra)
library(reticulate)
cm <- import("copernicus_marine_client")


# Settings ----
.user <- rstudioapi::askForPassword("Enter your user")
.pwd <- rstudioapi::askForPassword("Enter your password")

outfolder <- "data/oxygen"
fs::dir_create(outfolder)
filename <- "var=o2"


# Load areas ----
mhs <- vect("data/shapefiles/marine_world_heritage.gpkg")
# Add an index - we start from 0 to keep consistency with the JupyterHub approaches
mhs$code <- 0:(length(mhs)-1)


# Define target dataset, time and depths
dataset <- "cmems_mod_glo_bgc_my_0.25_P1D-m"
product <- "globgch"
years <- 1992:2021
depths <- c(0, 25, 50, 75, 100, 150, 200, 250, 500, 1000, 2000)
variables <- list("o2")
lon_lat_buffer <- 0.5 # in degrees
failed <- c() # To see if any failed

for (site_idx in mhs$code) {

sel_site <- mhs[mhs$code == site_idx, ]

long_range <- ext(sel_site)[1:2] + c(-lon_lat_buffer, lon_lat_buffer)
lat_range <- ext(sel_site)[3:4] + c(-lon_lat_buffer, lon_lat_buffer)

for (depth in depths) {

outfile <- paste0(filename, "_site=", site_idx, "_depth=", depth, "_product=", product, ".nc")

if (!file.exists(paste0(outfolder, "/", outfile))) {
success <- try(cm$subset(
dataset_id = dataset,
variables = variables,
username = .user,
password = .pwd,
minimum_longitude = long_range[1],
maximum_longitude = long_range[2],
minimum_latitude = lat_range[1],
maximum_latitude = lat_range[2],
start_datetime = paste0(min(years), "-01-01T00:00:00"),
      end_datetime = paste0(max(years), "-12-31T23:59:59"),
minimum_depth = depth,
maximum_depth = depth+0.5,
output_filename = outfile,
output_directory = outfolder,
force_download = TRUE
      ), silent = TRUE)

# It will rarely fail, but can happen due to server connection problems.
# In those cases, sleep and retry
if (inherits(success, "try-error")) {
cat("Retrying... \n")
Sys.sleep(1)
success <- try(cm$subset(
dataset_id = dataset,
variables = variables,
username = .user,
password = .pwd,
minimum_longitude = long_range[1],
maximum_longitude = long_range[2],
minimum_latitude = lat_range[1],
maximum_latitude = lat_range[2],
start_datetime = paste0(min(years), "-01-01T00:00:00"),
          end_datetime = paste0(max(years), "-12-31T23:59:59"),
minimum_depth = depth,
maximum_depth = depth+0.5,
output_filename = outfile,
output_directory = outfolder,
force_download = TRUE
        ), silent = TRUE)
if (inherits(success, "try-error")) {failed <- c(failed, outfile)}
}
} else {
cat(glue::glue("File for site {site_idx} depth {depth} already exists. Skipping.\n"))
}

}

}

# Use the code below to explore a dataset
#
# sst_l3s = cm$open_dataset(
# dataset_id = dataset,
# #variables = variables,
# username = .user,
# password = .pwd,
# minimum_longitude = -10,
# maximum_longitude = 10,
# minimum_latitude = -10,
# maximum_latitude = 10
# )
#
# # Print loaded dataset information
# print(sst_l3s)


# Convert to parquet
proc <- job::job({
  lapply(list.files(outfolder, full.names = TRUE, pattern = "\\.nc"),
         function(fname, redo = TRUE) {

outfile <- gsub("\\.nc", ".parquet", fname)

    if (file.exists(outfile) && !redo) {
cat("File already processed, skipping.\n")
} else {
r <- rast(fname)
r_dat <- as.data.frame(r, xy = TRUE, time = TRUE, wide = FALSE)
r_dat <- subset(r_dat, select = -layer)
arrow::write_parquet(r_dat, outfile, compression = "gzip")
}

return(invisible(NULL))
})
})
214 changes: 0 additions & 214 deletions codes/download_sst_current.R

This file was deleted.
