Consider regular expression for --dataset DATASETID #66

Open
jbfaden opened this issue May 17, 2024 · 9 comments
Comments

@jbfaden
Contributor

jbfaden commented May 17, 2024

It would be useful for me to be able to run suites of tests on parts of a server, perhaps by specifying a regular expression for the --dataset argument. For example, the new CDAWeb HAPI server has many Barrel mission CDF files that are all very similar, and it would be nice to exclude them from the test (with ^(?!BAR).+, for example).

@berniegsfc

I agree this is useful and the cdasws supports regex for many "dataset search" query parameters. But implementing it is difficult if you defend against ReDoS. I had to protect the cdasws from ReDoS years ago. It never caused a problem until after a recent upgrade when many (simple) requests involving client-supplied regex values were rejected as being "too complex". I never understood what happened but it hasn't happened again (even though the same regex request is sent 100s/1000s of times/day). Sometimes I wish I never implemented regex support nearly 20 years ago.

@berniegsfc

I didn't look closely at the project for this issue. I thought you were suggesting adding regex support to the hapi specification. ReDoS is less of a concern for the verifier. So, ignore my previous comment.

@jbfaden
Contributor Author

jbfaden commented May 17, 2024

That's good to know, Bernie. I was thinking of it in terms of running it locally, but I can see that with the server mode this would be a vulnerability. So maybe a constrained set of regular expressions, or maybe wildcards (--exclude=BAR*)?

(The validator has spent over an hour and it is still going through BAR*... I hope this isn't causing any problems for you, Bernie.)
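
The wildcard idea could be sketched roughly as follows. This is only an illustration: `--exclude` is not an existing verifier option, and `wildcardToRegExp`, `excludeDatasets`, and the `BAR_EXAMPLE` id are made up for the example. Only `*` is treated as special, and everything else is escaped, so the resulting pattern has no nested quantifiers.

```javascript
// Hypothetical helper, not part of verifier-nodejs.
// Converts a simple wildcard pattern (only `*` is special) to an anchored RegExp.
function wildcardToRegExp(pattern) {
  // Escape every regex metacharacter, then turn the escaped `*` back into `.*`.
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\\\*/g, '.*') + '$');
}

// Hypothetical --exclude semantics: drop every dataset whose id matches the wildcard.
function excludeDatasets(datasets, pattern) {
  const re = wildcardToRegExp(pattern);
  return datasets.filter((ds) => !re.test(ds.id));
}

// 'BAR_EXAMPLE' is a made-up id standing in for the Barrel datasets.
const sample = [{ id: 'BAR_EXAMPLE' }, { id: 'DE1_1MIN_RIMS' }, { id: 'DSCOVR_H0_MAG' }];
console.log(excludeDatasets(sample, 'BAR*'));
```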

@jbfaden
Contributor Author

jbfaden commented May 17, 2024

I see the --help option shows that "^regex" is supported, so it's really more of a documentation issue. I'd like to try:

node ~/temp/verifier-nodejs/verify.js --url http://localhost:8080/HapiServer/hapi --dataset='^BAR.*'

but this doesn't seem to work. Also, how would I use a caret in the regex? Is it just:

node ~/temp/verifier-nodejs/verify.js --url http://localhost:8080/HapiServer/hapi --dataset='^^BAR.*'

@rweigel
Collaborator

rweigel commented May 17, 2024

Not sure what the issue is. These work:

node verify.js --url 'http://hapi-server.org/servers/TestData2.0/hapi' --dataset='^dataset[1-2]'
node verify.js --url 'http://hapi-server.org/servers/TestData2.0/hapi' --dataset='^dataset.*'

To debug, put console.log(datasets); process.exit() in place of line 1420.

I'll add some examples to the docs in the future.

Regarding escaping, the code is very simple: https://github.com/hapi-server/verifier-nodejs/blob/master/tests.js#L1399. So I'd experiment in Chrome Debugger, for example, re = new RegExp('^a\\^bc'); re.test('a^bc') to figure out the regular expression to pass on the command line.
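
The same experiment can be run in Node. This is a sketch that assumes the `--dataset` value reaches `new RegExp` unchanged (the variable names are only for illustration). Inside single quotes the shell keeps the backslash, so one backslash on the command line corresponds to a doubled backslash in a JavaScript string literal.

```javascript
// What the shell delivers for --dataset='^a\^bc' (single quotes keep the backslash).
const fromCommandLine = String.raw`^a\^bc`;

// The same pattern written as an ordinary JS string literal needs a doubled backslash.
const fromLiteral = '^a\\^bc';

const re = new RegExp(fromCommandLine);
console.log(fromCommandLine === fromLiteral); // true
console.log(re.test('a^bc')); // true: the escaped caret matches a literal '^'
console.log(re.test('abc'));  // false

// An unescaped second caret is just another start-of-string assertion, not a negation.
console.log(new RegExp('^^BAR.*').test('BAR_X')); // true
console.log(new RegExp('^^BAR.*').test('DE1_X')); // false
```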

I'd write a script for more complex use cases that generates a sequence of node verify.js commands.
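
Such a script might look like the sketch below. The id list is hard-coded for illustration (in practice it would come from the server's catalog), and `shellQuote` and `buildCommands` are hypothetical helpers. Only the `--url` and `--dataset` options mentioned in this thread are used.

```javascript
// Sketch: build one verify.js command per dataset id that passes a filter.
const url = 'https://cdaweb.gsfc.nasa.gov/hapi';
const ids = ['DE1_1MIN_RIMS', 'DE2_ION2S_RPA', 'DSCOVR_H0_MAG', 'DN_K0_GBAY'];

// Hypothetical helper: single-quote a value for a POSIX shell.
function shellQuote(value) {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Keep only ids matching `keep`, then format one command line per id.
function buildCommands(serverUrl, datasetIds, keep) {
  return datasetIds
    .filter((id) => keep.test(id))
    .map((id) => `node verify.js --url ${shellQuote(serverUrl)} --dataset=${shellQuote(id)}`);
}

const commands = buildCommands(url, ids, /^DE[12]_/);
console.log(commands.join('\n'));
```

The printed lines can be redirected to a file and run as a shell script, which keeps arbitrary filtering logic out of the verifier itself.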

@jbfaden
Contributor Author

jbfaden commented May 17, 2024

Should this test every dataset whose ID starts with D?

node ~/temp/verifier-nodejs/verify.js --url https://cdaweb.gsfc.nasa.gov/hapi --dataset='^D.*' 

I was expecting it to do the D's. Is this not right?

@rweigel
Collaborator

rweigel commented May 17, 2024

I would think so. Are you using the latest version? Try git pull; npm install.

But given you saw the regex option in the help, it seems you are using a version with the regex feature. If I put `console.log(datasets); process.exit()` in place of line 1420 and run your command, I see

[
  { id: 'DE1_1MIN_RIMS' },
  { id: 'DE1_6SEC_MAGAGMS' },
  { id: 'DE1_PWI_LFC-SPECTRA' },
  { id: 'DE1_PWI_OR-AT' },
  { id: 'DE1_PWI_SFC-SPECTRA' },
  { id: 'DE2_62MS_VEFIMAGB@0' },
  { id: 'DE2_62MS_VEFIMAGB@1' },
  { id: 'DE2_AC500MS_VEFI' },
  { id: 'DE2_DCA500MS_VEFI' },
  { id: 'DE2_DUCT16MS_RPA@0' },
  { id: 'DE2_DUCT16MS_RPA@1' },
  { id: 'DE2_DUCT16MS_RPA@2' },
  { id: 'DE2_DUCT16MS_RPA@3' },
  { id: 'DE2_DUCT16MS_RPA@4' },
  { id: 'DE2_DUCT16MS_RPA@5' },
  { id: 'DE2_DUCT16MS_RPA@6' },
  { id: 'DE2_DUCT16MS_RPA@7' },
  { id: 'DE2_ION2S_RPA' },
  { id: 'DE2_NEUTRAL1S_NACS' },
  { id: 'DE2_NEUTRAL8S_FPI' },
  { id: 'DE2_PLASMA500MS_LANG' },
  { id: 'DE2_UA16S_ALL' },
  { id: 'DE2_VION250MS_IDM@0' },
  { id: 'DE2_VION250MS_IDM@1' },
  { id: 'DE2_VION250MS_IDM@2' },
  { id: 'DE2_WIND2S_WATS' },
  { id: 'DE_UV_SAI' },
  { id: 'DE_VS_EICS' },
  { id: 'DMSP-F13_SSJ_PRECIPITATING-ELECTRONS-IONS' },
  { id: 'DMSP-F16_SSIES-3_THERMAL-PLASMA' },
  { id: 'DMSP-F16_SSJ_PRECIPITATING-ELECTRONS-IONS' },
  { id: 'DMSP-F16_SSM_MAGNETOMETER' },
  { id: 'DMSP-F17_SSIES-3_THERMAL-PLASMA' },
  { id: 'DMSP-F17_SSJ_PRECIPITATING-ELECTRONS-IONS' },
  { id: 'DMSP-F17_SSM_MAGNETOMETER' },
  { id: 'DMSP-F18_SSIES-3_THERMAL-PLASMA' },
  { id: 'DMSP-F18_SSJ_PRECIPITATING-ELECTRONS-IONS' },
  { id: 'DMSP-F18_SSM_MAGNETOMETER' },
  { id: 'DN_K0_GBAY' },
  { id: 'DN_K0_HANK' },
  { id: 'DN_K0_ICEW' },
  { id: 'DN_K0_KAPU' },
  { id: 'DN_K0_PACE' },
  { id: 'DN_K0_PYKK' },
  { id: 'DN_K0_SASK' },
  { id: 'DSCOVR_AT_DEF' },
  { id: 'DSCOVR_AT_PRE' },
  { id: 'DSCOVR_H0_MAG' },
  { id: 'DSCOVR_H1_FC' },
  { id: 'DSCOVR_ORBIT_PRE' },
  { id: 'DYNAMO-2_DESA_NX02A-ESA-FLUX' }
]

@jbfaden
Contributor Author

jbfaden commented May 17, 2024

I should have started by pulling the latest code. This is working for me now. Do you know if I can do --dataset='^^BAR.*' to exclude all the BAR ones? (It doesn't seem to work for me.)

@rweigel
Collaborator

rweigel commented May 17, 2024

Probably. I'd try the hints at https://stackoverflow.com/questions/1538512/how-can-i-invert-a-regular-expression-in-javascript and test in https://regex101.com/ (make sure to select JavaScript on the left). Perhaps ^(?!BAR)(.*).
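
A quick way to check a negative-lookahead pattern locally before passing it to `--dataset` is sketched below. It assumes the verifier simply tests the pattern against each dataset id, and the `BAR_EXAMPLE_*` ids are made up for the example.

```javascript
// Made-up BAR ids plus two ids from the listing above.
const ids = ['BAR_EXAMPLE_A', 'BAR_EXAMPLE_B', 'DE1_1MIN_RIMS', 'DSCOVR_H0_MAG'];

// (?!BAR) is a negative lookahead: at the start of the string, the next
// characters must not be 'BAR'. The rest of the id is then matched by '.+'.
const notBar = new RegExp('^(?!BAR).+');

const kept = ids.filter((id) => notBar.test(id));
console.log(kept); // [ 'DE1_1MIN_RIMS', 'DSCOVR_H0_MAG' ]
```

On the command line this would be passed as --dataset='^(?!BAR).+', single-quoted so the shell leaves the `!` and parentheses alone.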
