Flatten parsing result #345

Open
BrunoGugli opened this issue Sep 27, 2024 · 0 comments

Is there an expression in the grammar syntax that turns a capture into a single flattened string? Here is an example to explain the question:

import tatsu

grammar = r'''
    @@grammar::Base64Command
    @@whitespace :: /[ \t]+/

    start = target ;

    # The literal 'Taking' followed by the Base64 payload, captured as "code".
    target = 'Taking' code:code ;

    code = { let_dig | special_char }+ ;

    let_dig = letter | digit ;

    special_char = '+' | '/' | '=' ;

    letter = /[a-zA-Z]+/ ;

    digit = /\d+/ ;
'''

text = 'Taking VGhpcyBpcyBhIGJhc2U2NCBjb2RlIGZvciBhbiBleGFtcGxlCg=='

# Compile the grammar and parse the sample input.
parser = tatsu.compile(grammar)

parse_result = parser.parse(text)

print("Result: ", parse_result)

The output of this script is:

Result:  {'code': ['VGhpcyBpcyBhIGJhc', '2', 'U', '2', 'NCBjb', '2', 'RlIGZvciBhbiBleGFtcGxlCg', '=', '=']}

What I'm looking for is an expression that lets me specify in the grammar that "code" must be flattened, so that the output becomes:

Result:  {'code': ['VGhpcyBpcyBhIGJhc2U2NCBjb2RlIGZvciBhbiBleGFtcGxlCg==']}

Of course, "code" can already be captured as a flat string by defining it as a single pattern:

code = /[a-zA-Z0-9+/=]+/ ;

But my goal is to keep the structure and clarity of the grammar, with separate rules for letters, digits and special characters.
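For comparison, the result I'm after can be produced outside the grammar by joining the fragments after parsing. The following is only a minimal sketch of that workaround, not the grammar-level expression I'm asking about: `flatten` is a hypothetical helper of my own, and the `parse_result` dictionary is copied by hand from the output above so the snippet runs without TatSu.

```python
def flatten(node):
    """Join a (possibly nested) list or tuple of string fragments into one string."""
    if isinstance(node, str):
        return node
    return ''.join(flatten(child) for child in node)


# Fragments copied from the output printed above (not produced by a parser here).
parse_result = {'code': ['VGhpcyBpcyBhIGJhc', '2', 'U', '2', 'NCBjb', '2',
                         'RlIGZvciBhbiBleGFtcGxlCg', '=', '=']}

# Replace the list of fragments with a single flattened string.
flat_result = {'code': flatten(parse_result['code'])}

print("Result: ", flat_result)
```

I believe the same helper could be called from a semantic action for the `code` rule (a method named `code` on an object passed as `semantics=` to `parse`), but that still moves the flattening out of the grammar itself.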
