Proposal: Publish a list of parser-challenging valid & invalid UCUM codes #291
Comments
(Note: this response is based on the original issue title, which referred to a "list of invalid UCUM codes".) The exact aim here needs to be clarified. The number of possible valid UCUM codes is infinite, so the set of all possible text strings minus the valid codes is just another infinity. A well-implemented UCUM parser will confirm whether a specific string is a valid UCUM expression. What it WON'T tell you is whether a particular expression is misleading, i.e. whether what it conveys to a typical human reader differs from what a UCUM library will make of the same expression. It would certainly be possible to produce a "rogues' gallery" of UCUM codes that either illustrate specific "foot guns" in the syntax or are dangerous examples seen in live use. I listed some such examples myself in the past, although as I'm now retired I don't have easy access to that work any more.
I meant it as you describe in your last paragraph. A rather short list (tens) of UCUM codes that challenge the parser and help implementers to reach
I have some more invalid UCUM codes in the test suite of my attempt at a "well-implemented" UCUM parser in Python here. There are also some challenging valid UCUM codes like "dar", which is only parsed correctly if prefixes (or at least "da") have lower priority than unit atoms. Knowing about such cases would also help when working on a parser/validator. Some examples are in the same test file. Since I have completed the parser, such a list will be of less value to me. But others may find it useful when they start a new implementation or want to validate existing code. (related #157)
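To illustrate the "dar" case, here is a minimal sketch. It is not taken from the parser mentioned above; the function names and the tiny prefix/atom tables are made up for this example and cover only a handful of real UCUM prefixes and metric atoms. It compares a naive strategy that commits to the longest matching prefix with one that gives unit atoms priority.

```python
# Toy subsets of UCUM prefixes and metric unit atoms (illustration only).
PREFIXES = ("da", "d", "k", "m")   # deka, deci, kilo, milli
ATOMS = ("ar", "m", "g", "l")      # are, meter, gram, liter


def split_prefix_first(symbol):
    """Naive: commit to the longest matching prefix, never reconsider."""
    if symbol in ATOMS:
        return ("", symbol)
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if symbol.startswith(prefix):
            rest = symbol[len(prefix):]
            # No backtracking: if the rest is not an atom, give up.
            return (prefix, rest) if rest in ATOMS else None
    return None


def split_atom_first(symbol):
    """Atoms have priority: find the atom first, then check the prefix."""
    if symbol in ATOMS:
        return ("", symbol)
    for atom in sorted(ATOMS, key=len, reverse=True):
        if symbol.endswith(atom):
            prefix = symbol[:-len(atom)]
            if prefix in PREFIXES:
                return (prefix, atom)
    return None


for code in ("dar", "dam", "km"):
    print(code, split_prefix_first(code), split_atom_first(code))
```

With these toy tables, the prefix-first variant grabs "da" from "dar", is left with "r", and rejects a valid code, while the atom-first variant finds "ar" and reads the rest as the prefix "d" (deci-are). Both variants agree on "dam" and "km". A real parser would more likely resolve this through backtracking or token priorities in its grammar than through string slicing.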
Ah, so you really WERE talking about invalid codes to challenge the parser, while I had taken it to lean more towards "valid but misleading" codes, which are really a different issue. So, my mistake, but a useful clarification, thank you @dalito.
For implementing UCUM in software, the collection of common UCUM unit codes is very helpful for creating unit tests.
Also very useful would be a list of invalid UCUM codes which include invalid cases such as
Good sources for such a list are bug reports in UCUM implementations and old issues that led to clarifications of the specification.