Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Things to do for Bert Verhees #17

Open
pieterbos opened this issue Oct 27, 2015 · 13 comments
Open

Things to do for Bert Verhees #17

pieterbos opened this issue Oct 27, 2015 · 13 comments

Comments

@pieterbos
Copy link
Contributor

Since @BertVerhees asked for things to do, a list of suggestions of things he can do:

  • In the adl-anltr grammar (this project):
    1. Fix the regular expression recognition in the ANTLR Lexer. The difficulty is that paths resemble regular expressions.

In archie (https://github.com/nedap/archie) :

  1. Implement tests for any language construct - you'll test this grammar as well as the parser. the framework is already in place
  2. Implement parsing of CTemporal types. The grammar should parse those already, but the treewalker does not.
@ghost
Copy link

ghost commented Oct 27, 2015

Hi Pieter, I appreciate the work you do, and I admire the craftsmanship.

But I test grammars in another way, less integrated, but more on the level
of the strings which should fit in the detailed rules.

Working on this detail level, testing every parsing rule in the grammar has
some advantages. I have my own boilerplate code which is quite easy to
understand.
The parser, in the end is tested thoroughly, but also every rule is
tested, and because, complicated rules are in fact collections of small
rules, there is complicated rule integrating testing also.

The advantage of this approach is that the tests are easy to understand,
pinpoint immediately what goes wrong and where and when, which strings,
etc. Working in this way also causes a very detailed understanding of the
grammar, and I sometimes found hard to find bugs in grammars this way.

Do you think that is useful? I will write these tests anyway for future
use, if not for the public, then for private use.

So I will start, writing tests, and publish them in a forked git
repository, and you can judge later if that is useful for your way of
working.

I will notify you where to find the work. Feel free to criticize it, I am
glad to learn, I developed my grammar testing myself, and on my own, and I

am sure there are many other ways to do so.

Your request to fix a regular expression, I will check it. Which expression
in the lexer is it? And what is the problem?
My teacher used to say, if you solve a problem using regular expressions,
you create another problem, but sometimes we have no choice. So I can take

a look, maybe I see something you did not see.

Regarding the CTemporal types, that the parser works ok, but the tree
walker does not is very strange. I can check that also, but now I have no
clue at all. Where can I find i formstion, How does the problem make itself
visible? I saw some emails but did not read them well. Do you think the

problem is well enough described in those emails?

Let me explain my position. I am an independent developer, nobody is paying
me, and I must find time to do this. I am also not a student, time is
money.
It is because it fits in some future plans which I have next year why I
want to do this. So don't expect a high production, but I will do what I
can.

Thanks for giving the opportunity.

Bert
Op 27 okt. 2015 21:14 schreef "Pieter Bos" [email protected]:

Since @BertVerhees https://github.com/BertVerhees asked for things to
do, a list of suggestions of things he can do:

  • In the adl-anltr grammar (this project):
    1. Fix the regular expression recognition in the ANTLR Lexer. The
      difficulty is that paths resemble regular expressions.

In archie (https://github.com/nedap/archie) :

  1. Implement tests for any language construct - you'll test this
    grammar as well as the parser. the framework is already in place
  2. Implement parsing of CTemporal types. The grammar should parse
    those already, but the treewalker does not.


Reply to this email directly or view it on GitHub
#17.

@pieterbos
Copy link
Contributor Author

Hello Bert,

there are not really emails (although you can receive them this way), but github issues. You can find them on https://github.com/openEHR/adl-antlr/issues .

#7 explains the regular expression problem.

The CTemporal types is not strange at all - i just have not implemented them yet, the grammar works fine and it is not an antlr problem - just missing java code.

I think any testing in this case is useful. In any case, it's open source software, you can add anything you want in your own forks :) This grammar is mainly work by Thomas though, not me, I implemented a java library on top of it and tested the grammar that way.

@ghost
Copy link

ghost commented Oct 28, 2015

Hi Pieter, thanks for your explanation.

I know that the work is mainly Thomas's work. I appreciate his
contribution also very much.

I think testing is useful, because, I know from the previous grammar,
which was written in JavaCC, that there were changes.
Some even after five years.
And it is an easy job to do, more or less, something to listen to
classical music while doing, kind of meditation ;-).
So I do it in my own fork as a contribution, and as an opportunity to
learn the grammar very well.

Which will be very good for the next 10 years, I guess.

In the afternoon, I take a look to the at the regular expression issue.

Best regard
Bert

On 28-10-15 10:31, Pieter Bos wrote:

Hello Bert,

there are not really emails (although you can receive them this way),
but github issues. You can find them on
https://github.com/openEHR/adl-antlr/issues .

#7 #7 explains the
regular expression problem.

The CTemporal types is not strange at all - i just have not
implemented them yet, the grammar works fine and it is not an antlr
problem - just missing java code.

I think any testing in this case is useful. In any case, it's open
source software, you can add anything you want in your own forks :)
This grammar is mainly work by Thomas though, not me, I implemented a
java library on top of it and tested the grammar that way.


Reply to this email directly or view it on GitHub
#17 (comment).

@wolandscat
Copy link
Member

Re: testing code / test cases - if we can produce a nice library of test cases to add to this (adl-antlr) project it would be nice - I don't want it to just be the raw grammar - it should become a resource that a developer can use directly. To do that, I'll need help ;-)

@ghost
Copy link

ghost commented Oct 28, 2015

On 28-10-15 14:31, Thomas Beale wrote:

Re: testing code / test cases - if we can produce a nice library of
test cases to add to this (adl-antlr) project it would be nice - I
don't want it to just be the raw grammar - it should become a resource
that a developer can use directly. To do that, I'll need help ;-)

Can you give an example of what you intend and what the purpose is?

I remember in the old grammar, there was an error, there was a 13606
class-property as keyword, and that made it impossible to specify that
particular class in adl. I have changed that, and I was happy to find
out, everything still worked out, (no one complained until now) because
there was no way to test the grammar, except the test-files Rong had
written, which are not sure to cover the complete grammar.

But I believe you want a solution like that?
I could write that too, all in good time.

I agree that testing the raw grammar is not used very often, it is
mainly to be sure that there are no unintended side-paths in the
grammar, and for those two or three times the grammar needs to be changed.

When googling, I found some test-tools on Antlr, until now, I tested my
grammars only on raw-grammar, because, in fact, that tests the parser
and the lexer also, but is quite some work.
However, for a DSL, it is maybe two days, but for the OpenEHR grammars,
I am afraid it is more then a week.

So, concluding, I don't know what is wise to do.
I did not find any wise man having an opinion on this.

Maybe Terence Parr is the right person to ask.
As main architect of OpenEHR, and also having English as first language,
it could be better if you ask him.

If you can come to a wise decision, I am happy to help you create it

@wolandscat
Copy link
Member

At the moment the test resource I am using is the adl-archetypes reference archetype set - what we use to test the parser and validator in the ADL Workbench. Each of these archetypes is intended to either illustrate some AOM/ADL feature in its working form, or else demonstrate a single kind of failure corresponding to a 'V' code, e.g. VCACA etc, which are the validity codes defined in the AOM, and that ADL WB emits.

Testing Antlr rules seems easier by testing each rule individually, and therefore using small fragments of archetypes, e.g. just a single C_COMPLEX_OBJECT structure etc. A test set based on this approach would presumably consist of many small files like this, and an automated test rig to run them all against specific rules; then we would also want to test full arcetypes and generate the same V-codes out from the reference archetypes.

Doing this probably means upgrading the current Antlr rules to include the V-codes due to specific error conditions.

@ghost
Copy link

ghost commented Oct 28, 2015

On 28-10-15 16:20, Thomas Beale wrote:

At the moment the test resource I am using is the adl-archetypes
reference archetype set - what we use to test the parser and validator
in the ADL Workbench. Each of these archetypes is intended to either
illustrate some AOM/ADL feature in its working form, or else
demonstrate a single kind of failure corresponding to a 'V' code, e.g.
VCACA etc, which are the validity codes defined in the AOM, and that
ADL WB emits.

Testing Antlr rules seems easier by testing each rule individually,
and therefore using small fragments of archetypes, e.g. just a single
C_COMPLEX_OBJECT structure etc. A test set based on this approach
would presumably consist of many small files like this, and an
automated test rig to run them all against specific rules; then we
would also want to test full arcetypes and generate the same V-codes
out from the reference archetypes.

Doing this probably means upgrading the current Antlr rules to include
the V-codes due to specific error conditions.


Reply to this email directly or view it on GitHub
#17 (comment).

Sounds good, only the last sentence, anticipating op probably to make
errors,
I have no idea if it is possible in a grammar.
I think that part must be reserved to a class derived from the
parser-class, which catches ANTLR errors and translates them, and will
be used to process the parsing information.
Probably there are also possibilities in the listener class, but I must
study that.

@ghost
Copy link

ghost commented Oct 29, 2015

Hi Thomas, in the ANTLR Reference is a complete chapter (9 Error Reporting and Recovery) about error-handling/reporting, and explanation and example how to change error messages. It is just a matter of adding your own errorhandling which translates the errors.

You can find the predefined errors in org.antlr4.v4.tool.ErrorType and ErrorSeverity

@wolandscat
Copy link
Member

Indeed (I have the book) - I'll get onto this at some point, but if anyone wants to work out a few patterns, I'll be happy to incorporate them. Since I'm not normally a Java programmer, I'll need to make sure that what we do in this project is the 'normal' way to do it.

@ghost
Copy link

ghost commented Oct 29, 2015

I can help you with that, we have the test-pattern Rong used, that could be a good starting point. I must see how to create the necessary boilerplate. Maybe you have already some archetypes for test-purposes, and according error messages/codes from the ADL Workbench, which you want the parser to produce.

And I must say, I must study the syntax from ADL2.0 (I postponed that ;-) But I must study that anyway, so this is a good occasion. So, if it is OK for you, I start this afternoon, and have at least the boilerplate code ready, and a first test. After that, things can go fast.

@ghost
Copy link

ghost commented Oct 29, 2015

Hi Thomas, I checked the boilerplate code, it is in fact very simple, create a parser, add your errorlistener and parse the file, the errorlistener will then contain the errors that occured.

And for checking the AOM-classes, in fact Pieter Bos has build a good fundament. I think, this is ready, except for ADL-Workbench conforming errorhandling

Maybe you have test archetypes, they are easy to implement in the parser, with expected errors.

@pieterbos
Copy link
Contributor Author

The error handling in Archie is still very primitive. The basic structures are in place (an ANTLR ErrorListener and a place to put errors and warnings), but that's it - the error messages are really primitive. I suggest something like the Bean Validation to handle archetype model validation and errors. If there is a list of standard errors - great, let's use it.

This is however not the first thing on my todo-list.

@ghost
Copy link

ghost commented Oct 29, 2015

Pieter, error handling in ANTLR is, IMHO, good enough, it breaks at severe errors, and runs further at non severe errors. I believe there are five levels of severity.
It must be easy to map some error-coding to antlr-errors, as long as the error-coding is parser-technical, the infrastructure is ready for it, it will be difficult to map semantic errors to parser errors.

I don't know what the adl-workbench has, Thomas knows, so it is for him to decide if it is possible to use that error-system, and maybe partly.

So, best thing to do is to wait for his opinion on this, and then, if appropriate, make a plan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants