Add specificity to operation doc
mfisher87 committed Jul 22, 2024
1 parent e4cb83e commit beec098
Showing 1 changed file (doc/operation.md) with 88 additions and 45 deletions.

# Operation

> [!WARNING]
> This software requires a large amount of memory (min. 8GB), increasingly more as time
> goes on, because the entire climatology is sometimes read in to memory. An error may
> occur if the system runs out of available memory.
>
> We should move to a tool like SQLite or XArray for managing this data without needing
> so much memory.

The only files that are required but not generated by this code are the 37h
threshold binaries provided by Tom Mote. These are checked in to this repository.

<details><summary>🛠️ _TODO_</summary>

- [ ] Simpler commands/CLI; users shouldn't have to know to set PYTHONPATH.
- [ ] Simplify steps; either consolidate or express step order in code, e.g. in CLI
with clear order.
- [ ] Convert from storage via picklefiles to NetCDF. See issue:
https://github.com/nsidc/Antarctica_Today/issues/19

</details>
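
A possible shape for that conversion (see the issue above), shown only as a sketch: it
assumes the picklefile holds a mapping from date to a 2D melt array on a fixed grid,
which may not match the real structure.

```python
import pickle

import numpy as np
import xarray as xr

# Hypothetical structure: {datetime.date: 2D numpy array} -- verify before relying on it.
with open("database/v3_1979-present_raw.pickle", "rb") as f:
    melt_by_date = pickle.load(f)

dates = sorted(melt_by_date)
stack = np.stack([melt_by_date[d] for d in dates])  # (time, y, x)

ds = xr.Dataset(
    {"melt": (("time", "y", "x"), stack)},
    coords={"time": [np.datetime64(d) for d in dates]},
)
ds.to_netcdf("database/v3_1979-present_raw.nc")

# Downstream code could then read the climatology lazily (chunked reads need dask)
# instead of loading the whole array into memory, which is what the warning above
# is about.
ds_lazy = xr.open_dataset("database/v3_1979-present_raw.nc", chunks={"time": 365})
```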

> [!NOTE]
> The path commands are for users working in a bash environment! For csh users,
> the commands are:
>
> ```csh
> setenv PYTHONPATH /path/to/repository/Antarctica_Today
> python antarctica_today/program.py
> ```
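>
> The equivalent bash setup (the repository path is a placeholder) is:
>
> ```bash
> export PYTHONPATH=/path/to/repository/Antarctica_Today
> python antarctica_today/program.py
> ```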

> [!IMPORTANT]
> :bangbang: Steps must be performed in order :bangbang:

## 1. Download NSIDC-0080

Download NSIDC-0080 granules into the `Tb/` directory:

```bash
PYTHONPATH=.
python antarctica_today download-tb
```

> [!IMPORTANT]
> Data before 2022-01-10 have already been processed through step 2 and are
> available as binary ".bin" files in this repo's `/data/daily_melt_bin_files`
> directory. These data were generated from the `NSIDC-0001` and `NSIDC-0007` datasets.
>
> Data newer than that date are calculated freshly from the `NSIDC-0080` v2 product
> (https://nsidc.org/data/nsidc-0080/versions/2), which is in NetCDF format. This step
> downloads that raw data.
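
To spot-check what was downloaded, a granule can be opened with xarray. A minimal
sketch; the filename below is hypothetical, and the variable and attribute names
depend on the product:

```python
import xarray as xr

# Hypothetical filename -- substitute an actual file from the Tb/ directory.
ds = xr.open_dataset("Tb/NSIDC0080_20230909_example.nc")

print(ds.data_vars)                         # brightness-temperature variables
print(ds.attrs.get("time_coverage_start"))  # may be None if the attribute differs
```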

## 2. Generate all the daily melt binary files

Generate new data in `data/daily_melt_bin_files/`:

```bash
PYTHONPATH=.
python antarctica_today generate-daily-melt
```
> [!IMPORTANT]
> Binaries provided in this repo's `data/daily_melt_bin_files/` directory with pre-2016
> dates are already calibrated by Tom Mote and don't need to be generated. As noted in
> the previous step, pre-generated data extend through 2022-01-10.
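
For a quick look at one of these flat binaries, something like the following works. The
grid shape and dtype here are assumptions (25 km south polar stereographic grid,
332 rows x 316 columns, 16-bit integers) and the filename is hypothetical; check the
generation code for the real layout.

```python
import numpy as np

# Hypothetical filename and assumed layout -- adjust to match the actual .bin files.
arr = np.fromfile(
    "data/daily_melt_bin_files/antarctica_melt_20230909.bin", dtype="<i2"
).reshape(332, 316)

print(arr.shape)
print(np.unique(arr))  # distinct melt codes present in the grid
```
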
<details><summary>🛠️ _TODO_</summary>

I receive a large number of warnings like:

```
UserWarning: Warning: At least one NSIDC Tb file on date '20230909' is missing. Skipping
that date.
```

Why?
</details>


## 3. Generate the database

This software manages a database covering the full climatology in the form of pickle
files. This step creates, primarily, four pickle files:

* `daily_cumulative_melt_averages.pickle`
* `daily_melt_pixel_averages.pickle`
* `database/v3_1979-present_gap_filled.pickle`
* `database/v3_1979-present_raw.pickle`

Additionally:

* `.csv` files will be created in the `database/` directory
* `.tif` files will be created in the `data/mean_climatology/` directory
* `.tif` files will be created in the `data/annual_*_geotifs` directories


```bash
PYTHONPATH=.
python antarctica_today preprocess
```

> [!NOTE]
> This command may take up to tens of minutes.
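
To sanity-check the result, one of the picklefiles can be loaded directly. A minimal
sketch, assuming only that the file unpickles cleanly (its internal structure is not
documented here):

```python
import pickle
from pathlib import Path

# Quick, exploratory look at one of the generated picklefiles.
path = Path("database/v3_1979-present_raw.pickle")
with path.open("rb") as f:
    obj = pickle.load(f)

print(type(obj))
# Print a shallow summary without assuming the exact structure.
if isinstance(obj, dict):
    print("keys (first 5):", list(obj)[:5])
elif isinstance(obj, (list, tuple)):
    print("length:", len(obj), "element types:", {type(x).__name__ for x in obj})
```
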
<details><summary>🛠️ _TODO_</summary>

- [ ] Why is the next section called "Initializing"? Are there multiple pickle files?
Does each command initialize one? Can we combine them all into one command?
- [ ] After this command, `git status` shows untracked files. Which should be ignored?
Which should be committed?

Untracked:

```
database/baseline_percentiles_1990-2020.csv
database/baseline_percentiles_1990-2020_gap_filled.csv
database/daily_melt_totals.csv
database/daily_melt_totals_gap_filled.csv
```
</details>


### Database initialization (?)


<details><summary>🛠️ _TODO_</summary>

Is this step necessary? It seems like new files aren't being created when this step is
run.
</details>

Create the melt array picklefile, a file containing a 2d grid for each day:

```bash
PYTHONPATH=.
python antarctica_today gap-filled-melt-picklefile
```

### Daily updates

This step will download any new Tb data files from NSIDC since its last run, and
generate new plots from the last day's data (for all of Antarctica and for each
individual region), including:

1) A "daily melt" map of the most recent day's melt extent
2) A "sum" map of that season's total melt days
3) An "anomaly" map of that season's total melt days compared to the baseline average
for that day of year
4) A line plot of melt extent up to that date, compared to historical baseline averages

It will copy these plots into a sub-directory `/plots/daily_plots_gathered/[date]/` for
easy collection.

> [!WARNING]
> All initialization steps above must be completed first.
```bash
PYTHONPATH=.
python antarctica_today/update_data.py
```
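
If the daily update is run on a schedule, a crontab entry along these lines works; the
time, repository path, and log location are placeholders.

```bash
# Hypothetical crontab entry: run the daily update at 06:30 UTC from the repo root.
30 6 * * * cd /path/to/repository/Antarctica_Today && PYTHONPATH=. python antarctica_today/update_data.py >> update_data.log 2>&1
```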

## 4. Generate outputs (optional)

This will go through the entire database and produce summary maps and plots for every
year on record in the `plots/` directory.

> [!NOTE]
> This command may take up to tens of minutes.

```bash
PYTHONPATH=.
python antarctica_today process
```

<details><summary>🛠️ _TODO_</summary>

- [ ] After this step, `git status` shows changed files. Should they be committed?

</details>
