Alternative to Internals.encode #133

Open
lindig opened this issue Nov 17, 2018 · 2 comments

@lindig
Contributor

lindig commented Nov 17, 2018

This is just an idea: the escape rules for XML (or other transports) could also be delegated to an efficient regexp-based scanner generated by OCamlLex:

{
    (* short names for important modules *)
    module L = Lexing 
    module B = Buffer

let get      = L.lexeme

exception Error of string
let error fmt = Printf.kprintf (fun msg -> raise (Error msg)) fmt

}

rule escape b = parse
| '&'       { B.add_string b "&amp;";  escape b lexbuf }
| '"'       { B.add_string b "&quot;"; escape b lexbuf }
| '\''      { B.add_string b "&apos;"; escape b lexbuf }
| '>'       { B.add_string b "&gt;";   escape b lexbuf }
| '<'       { B.add_string b "&lt;";   escape b lexbuf }
| [^'&' '"' '\'' '>' '<']+ 
            { B.add_string b @@ get lexbuf
            ; escape b lexbuf
            }
| eof       { let x = B.contents b in B.clear b; x }
| _         { error "don't know how to quote: %s" (get lexbuf) }

{
    let escape str = escape (B.create 100) (L.from_string str)
}
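
For comparison, here is a minimal hand-written sketch of the same five replacements in plain OCaml, which can be handy as a reference when testing the generated scanner. The name `escape_xml` is hypothetical and not part of the library; the sketch assumes the input is treated byte by byte, as the lexer rule above does.

```ocaml
(* Hypothetical hand-written counterpart of the lexer rule above;
   [escape_xml] is an illustrative name, not part of the library. *)
let escape_xml (s : string) : string =
  (* Reserve a little extra room, since escaping only grows the string. *)
  let b = Buffer.create (String.length s + 16) in
  String.iter
    (function
      | '&' -> Buffer.add_string b "&amp;"
      | '"' -> Buffer.add_string b "&quot;"
      | '\'' -> Buffer.add_string b "&apos;"
      | '>' -> Buffer.add_string b "&gt;"
      | '<' -> Buffer.add_string b "&lt;"
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

let () =
  (* Prints: a &lt; b &amp;&amp; c &gt; &quot;d&quot; *)
  print_endline (escape_xml "a < b && c > \"d\"")
```

Unlike the lexer version, this copies unescaped characters one at a time instead of in runs, so the OCamlLex rule is likely the better choice for long inputs with few special characters.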
@mseri
Collaborator

mseri commented Dec 13, 2019

This could also be useful in other places, like the JSON parser; see e.g. https://github.com/nojb/tinyjson/blob/master/lib/json.mll

On a separate note, I think markup.ml has the most comprehensive dictionary of entities for encode and decode: https://github.com/aantron/markup.ml/blob/master/src/entities.ml

@lindig
Contributor Author

lindig commented Dec 13, 2019

This is a nice example of the power of OCamlLex and of how to use the different states (or sub-scanners) effectively.
