verification brainstorming #6
Comments
A hint at spatiotemporal resampling: https://mlr3spatiotempcv.mlr-org.com/
As noted in this paper (https://ml4physicalsciences.github.io/2019/files/NeurIPS_ML4PS_2019_75.pdf) and in other literature, power spectral density (PSD) could be better suited than MSE or PSNR (peak signal-to-noise ratio) for evaluating high-resolution features.
The verification script of the Spanish team can be found at https://github.com/ECMWFCode4Earth/DeepR/tree/main/deepr/validation/netcdf. It calculates well-established skill scores for individual coordinates: https://github.com/ECMWFCode4Earth/DeepR/blob/main/deepr/validation/netcdf/metrics.py#L50-L56
I did some testing on radially averaged PSD using an R package. For lead time 12, with a radially averaged 2D Fourier transform per timestep (912 in total), the first plots look as follows (the line is the mean over all timesteps, the shaded area is the min-max range):
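For readers who want to reproduce the idea outside R, here is a minimal Python sketch of a radially averaged PSD, followed by the mean and min-max range over timesteps. It is not the code used for the plots in this thread: the function name `radially_averaged_psd`, the binning choice (wavenumber in cycles per pixel, up to the Nyquist limit of 0.5) and the random input data are all assumptions made for illustration.

```python
import numpy as np

def radially_averaged_psd(field, n_bins=None):
    """Radially averaged power spectrum of one 2D field (hypothetical helper)."""
    field = np.asarray(field, dtype=float)
    ny, nx = field.shape
    # Power of every 2D Fourier coefficient
    power = np.abs(np.fft.fft2(field)) ** 2 / (nx * ny)
    # Radial wavenumber in cycles per pixel (assumed unit, not the thread's)
    ky = np.fft.fftfreq(ny)
    kx = np.fft.fftfreq(nx)
    k = np.hypot(*np.meshgrid(ky, kx, indexing="ij"))
    if n_bins is None:
        n_bins = min(nx, ny) // 2
    edges = np.linspace(0.0, 0.5, n_bins + 1)
    # Assign each coefficient to a radial bin; drop wavenumbers >= 0.5
    idx = np.digitize(k.ravel(), edges) - 1
    valid = (idx >= 0) & (idx < n_bins)
    sums = np.bincount(idx[valid], weights=power.ravel()[valid], minlength=n_bins)
    counts = np.bincount(idx[valid], minlength=n_bins)
    psd = np.full(n_bins, np.nan)
    psd[counts > 0] = sums[counts > 0] / counts[counts > 0]
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, psd

# Synthetic stand-in for a (time, y, x) stack; real data would go here
rng = np.random.default_rng(0)
fields = rng.standard_normal((5, 64, 64))
spectra = np.array([radially_averaged_psd(f)[1] for f in fields])

# Line and shaded area as described above: mean and min-max over timesteps
mean_psd = spectra.mean(axis=0)
min_psd = spectra.min(axis=0)
max_psd = spectra.max(axis=0)
```

Note that the wavenumber here is in cycles per pixel, so the x axis would differ from plots that scale the wavenumber by the domain size.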
I am not familiar with the score, but I guess a skill score is mainly needed when the differences between the methods are small; if they are big enough, I don't think it is necessary. The same goes for scaling and normalizing. Regarding the lead times: maybe we can have a summarizing score across the lead times, show a lead-time-wise graphic, and use the power spectrum at just one or two lead times (night or day). Just for my understanding: does PSD penalize a bias?
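On the bias question, a small numerical check may help. For a discrete Fourier transform, adding a constant offset to a field changes only the zero-wavenumber (mean) coefficient, so a spatially constant bias is invisible at all other wavenumbers. The sketch below uses synthetic random data and an arbitrary offset of 2.0, both assumptions for illustration. A bias that varies in space would, in contrast, show up at the wavenumbers of its own spatial pattern.

```python
import numpy as np

rng = np.random.default_rng(42)
field = rng.standard_normal((32, 32))  # synthetic stand-in for one timestep
biased = field + 2.0                   # constant offset, value chosen arbitrarily

power = np.abs(np.fft.fft2(field)) ** 2
power_biased = np.abs(np.fft.fft2(biased)) ** 2

diff = np.abs(power_biased - power)
dc_change = diff[0, 0]         # change at wavenumber zero (the field mean)
diff[0, 0] = 0.0
max_other_change = diff.max()  # largest change at any other wavenumber

print(f"change at k=0: {dc_change:.3e}, elsewhere: {max_other_change:.3e}")
```

So if the spectrum is plotted on a logarithmic wavenumber axis (which cannot show k = 0), or if the mean is removed beforehand, a constant bias is not penalized at all and would need a separate score.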
Here, for comparison, is the plot with a logarithmic x axis (the last two plots were without scaling and normalization; the one in #6 (comment) was with both):
@r3xth0r, do you think that comparing variograms between CERRA and the models would be a useful addition/alternative to PSD?
Pushed my first tests on PSD with 011f663 to its own experimental branch. Any suggestions and ideas are very welcome.
Just did some tests with variography. I aggregated the time steps (to seasons), as it's computationally much more expensive than PSD. Here's a test output for lead_time 12. This is for testing only, but we would interpret here that spatial variability is underestimated by the downscaled data (samos in this case) in summer, but overestimated in the other seasons, compared to CERRA. In general, the shape of the variograms is more or less reproduced. I think this analysis (same for PSD) is generally only valid for projected coordinates, as otherwise distance is not uniform across space, and projected coordinates were not used for this plot or the ones above! (Added this point to the list in #6 (comment); @r3xth0r, any thoughts on this?) Here the unit of x is degrees; in the PSD plots above, the wavenumber is diagonal_domain_size (in px)^-1.
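To make the variogram computation concrete, here is a minimal Python sketch of an isotropic empirical semivariogram. It is not the code behind the plot above: the function name `empirical_variogram`, the tiny three-point data set and the bin edges are assumptions for illustration. Distances are plain Euclidean distances in whatever unit the coordinates use, so feeding in longitude/latitude in degrees carries exactly the non-uniform-distance caveat raised above.

```python
import numpy as np

def empirical_variogram(coords, values, bin_edges):
    """Isotropic empirical semivariogram (hypothetical helper).

    coords: (n, 2) point coordinates, values: (n,) observations,
    bin_edges: increasing distance bin edges in the unit of coords.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # All unique point pairs; memory grows with n^2, hence the cost
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    semivar = 0.5 * (values[i] - values[j]) ** 2
    n_bins = len(bin_edges) - 1
    idx = np.digitize(dist, bin_edges) - 1
    valid = (idx >= 0) & (idx < n_bins)
    sums = np.bincount(idx[valid], weights=semivar[valid], minlength=n_bins)
    counts = np.bincount(idx[valid], minlength=n_bins)
    gamma = np.full(n_bins, np.nan)  # NaN marks bins without any pairs
    gamma[counts > 0] = sums[counts > 0] / counts[counts > 0]
    return gamma, counts

# Toy example: three points on a line with a linear trend in the values
coords = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
values = [0.0, 1.0, 2.0]
gamma, counts = empirical_variogram(coords, values, bin_edges=[0.5, 1.5, 2.5])
print(gamma, counts)
```

The pairwise step is what makes variography so much more expensive than an FFT-based PSD on full grids, which is why aggregating or subsampling is usually needed.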
However, aren't we especially interested in distances below the ERA5 pixel size (which is 0.25), which are not covered at all by the above variograms (first bin at 0.54)? Is it even reasonable to derive a variogram with small enough bins to learn something about the short-distance variability we are mainly interested in?
(1) CRS: you are right. The effect of using geographic instead of projected coordinates might be negligible on small AOIs, but could be considerable on a continental scale. (2) I am somewhat unsure about the added value of using variograms here. We would probably need to consider anisotropy to some extent, but this might not be straightforward, as it does not occur consistently across the whole area. It's probably sufficient to stick to PSD. (3) A 3D FT would probably be better suited indeed, but I doubt that the additional effort of a manual implementation is really worth it.
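To put a number on point (1): on a sphere, one degree of latitude has a fixed length, while one degree of longitude shrinks with the cosine of the latitude. The sketch below assumes a spherical Earth with a mean radius of 6371 km, and the latitudes are arbitrary examples, not the actual domain bounds of this project.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation (assumption)

def km_per_degree_lon(lat_deg):
    """Length of one degree of longitude at a given latitude, in km."""
    return math.radians(1.0) * EARTH_RADIUS_KM * math.cos(math.radians(lat_deg))

# One degree of latitude has the same length everywhere on a sphere
km_per_degree_lat = math.radians(1.0) * EARTH_RADIUS_KM

for lat in (35.0, 50.0, 70.0):  # illustrative latitudes only
    ratio = km_per_degree_lon(lat) / km_per_degree_lat
    print(f"lat {lat:4.1f}: 1 deg lon = {km_per_degree_lon(lat):6.1f} km "
          f"({ratio:.2f} x 1 deg lat)")
```

Over a small AOI the ratio is nearly constant, so distances in degrees are only stretched by a roughly fixed factor in one direction. Over a continental domain the ratio changes noticeably from south to north, which distorts both variogram lags and PSD wavenumbers.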
Here is the PSD as in #6 (comment), but stratified by season:
Inspired by yesterday's meeting (thanks, @mc4117), I added PSD for (bilinearly interpolated) ERA5 to this analysis. This is what it looks like for individual timesteps:
We can clearly see that the power spectrum of the downscaled field (samos) closely follows the PS of the CERRA data, while ERA5 shows bigger differences.
Thanks! It's great to see the comparison |
issue for brainstorming and material collection