USX implementations in node and web modules #263
Merged: joelthe1 merged 32 commits into Bridgeconn:js-tree-sitter-usfm-parser from kavitharaju:js/list-n-usx on Oct 22, 2024
kavitharaju changed the title from "toUSX() implementation in node and web modules" to "USX implementations in node and web modules" on Oct 19, 2024
joelthe1 merged commit 7664959 into Bridgeconn:js-tree-sitter-usfm-parser on Oct 22, 2024. 5 checks passed.
joelthe1 pushed a commit that referenced this pull request on Oct 28, 2024:
* initial
* using base64 grammar to fix the missing magic number error while loading wasm
* Use published tree-sitter-usfm3 library (#254)
* update python builds for macOS intel and apple silicon
* Pin dependency versions to latest working
* Bump version: 3.0.0-beta.5 → 3.0.0-beta.6
* Bump version of tree-sitter-usfm3 as well
* Revert "update python builds for macOS intel and apple silicon"
* Pin dependency versions in pyptoject.toml as well
* Bump version: 3.0.0-beta.6 → 3.0.0-beta.7
* Bump version in tree-sitter module as well
* Upgrade the tree-sitter and tree-sitter-cli node libraries
* Regenerate grammar, binaries and other files using newer version
* Configure setup.py and .toml files for new python tree-sitter-usfm3 grammar package
* Update the python parser module to use published grammar library instead of binary .so file
* Change the installation and packaging config of python usfm-grammar to not use binary but the published python package for grammar
* Update documentations(dev notes)
* Keep bump version configs common for py-parser module and tree-sitter-usfm3 module
* Git actions, trail #1
* Save changes in setup.py
* Fix issues in bumpversion.cfg
* Bump version: 3.0.0-beta.7 → 3.0.0-alpha.8
* Gitactions trials
* Try publish to test.pypi
* Try publish to test.pypi
* Try publish to test.pypi
* pin tree-sitter-usfm3 version in pyproject.toml, but remove it when running on github actions
* Bump version: 3.0.0-alpha.8 → 3.0.0-alpha.9
* Add changes to PyPi publish workflow
* Bump version: 3.0.0-alpha.9 → 3.0.0-alpha.10
* Try to fix bumpversion
* Bump version: 3.0.0-alpha.10 → 3.0.0-alpha.11
* Bump version: 3.0.0-alpha.11 → 3.0.0-beta.8
* Change the build process in npm publish
* Bump version: 3.0.0-beta.8 → 3.0.0-beta.9
* Update pypi_publish.yml
* Bump version: 3.0.0-beta.9 → 3.0.0-beta.10
* Test suite update from tcdocs (#253)
* Pull in latest changes of Test Suite from tcdocs
* Update tests readme
* Name the versionNumber node in \usfm
* Allow \fig within footnote
* Make attribute in rb mandatory
* Allow \cat within footnote
* Allow cat within crossrefs too
* New tag ref and allow \xt and \+xt in footnotes
* Allow multiple imt blocks in same file
* Override 3 tests for parsing
* Change version number to 3.1 in USX and USJ
* Add loc as default attrib of ref in USX/J generation code
* Add ref node handling and its default attrib in python module
* Add ref to the list of char markers in conversion code
* Hand difference in space and line handling in tcdocs and our usj
* Bring \fig as noteText content and not note content
* Strip all text values in testsuite and generated USJ before comparison
* Seperate footnoteText and crossrefText to allow \xt in one and not in the other
* Exclude the \usfm marker when creating USJ
* \fv is within footnoteText, not within footnote
* List issue USXs in test script to exlcude them
* Include the valid testcases back as per new rules in 3.1
* Type of b is not optbreak, but para
* Type of b is not optbreak, but para
* Allow attributes with and without link- prefix
* Add space before closing marker in usfm generation
* Handle list of wrong USXs in test script
* Add \esbe in generating USFM from USX
* Fix imports in tests
* All char marker nesting without +
* Include tests with default markers in milestone
* Undo allowing char within char without +
* exclude two failing tests for now
* exclude two failing tests for now
* fix linting issues
* fix requirements file
* update python builds for macOS intel and apple silicon
* Pin dependency versions to latest working
* Bump version: 3.0.0-beta.5 → 3.0.0-beta.6
* Bump version of tree-sitter-usfm3 as well
* Revert "update python builds for macOS intel and apple silicon"
* Pin dependency versions in pyptoject.toml as well
* Bump version: 3.0.0-beta.6 → 3.0.0-beta.7
* Bump version in tree-sitter module as well
* Upgrade the tree-sitter and tree-sitter-cli node libraries
* Regenerate grammar, binaries and other files using newer version
* Configure setup.py and .toml files for new python tree-sitter-usfm3 grammar package
* Update the python parser module to use published grammar library instead of binary .so file
* Change the installation and packaging config of python usfm-grammar to not use binary but the published python package for grammar
* Update documentations(dev notes)
* Keep bump version configs common for py-parser module and tree-sitter-usfm3 module
* Git actions, trail #1
* Save changes in setup.py
* Fix issues in bumpversion.cfg
* Bump version: 3.0.0-beta.7 → 3.0.0-alpha.8
* Gitactions trials
* Try publish to test.pypi
* Try publish to test.pypi
* Try publish to test.pypi
* pin tree-sitter-usfm3 version in pyproject.toml, but remove it when running on github actions
* Bump version: 3.0.0-alpha.8 → 3.0.0-alpha.9
* Add changes to PyPi publish workflow
* Bump version: 3.0.0-alpha.9 → 3.0.0-alpha.10
* Try to fix bumpversion
* Bump version: 3.0.0-alpha.10 → 3.0.0-alpha.11
* Bump version: 3.0.0-alpha.11 → 3.0.0-beta.8
* Change the build process in npm publish
* Bump version: 3.0.0-beta.8 → 3.0.0-beta.9
* Update pypi_publish.yml
* Bump version: 3.0.0-beta.9 → 3.0.0-beta.10
* Try using web-tree-sitter and wasm file
* Test publish alpha.3
* More Grammar changes as per USFM/X 3.1 (#255)
* Allow any 3 letter code as bookcode
* Support nested markers without +
* Make closing mandatory for \fv
* Introduce 'key' attribute in \k
* Try fixing the issue with init()
* Include the grammar wasm in the npm package
* try publish locally
* Use babel for generating commonjs distribution from the esm code base
* Remove the src/grammar folder with grammar.base64 array and use wasm
* Fix the basic test scripts to use files in dist/ folder generated after npm run build command
* Add notes on how to build and locally publish the js module
* Add js-module files in bumpbersion config
* Update the usage in README
* Use named export instead of default for USFMParser
* revert to parcel build instead of webpack
* Test publish alpha.4 using web-tree-sitter, parcel, wasms etc in cjs and esm formats. Tested in node, and NextJs
---------
Co-authored-by: Chris Vaughn <[email protected]>
* React support with seperate usfm-grammar-web (#256)
* Bundle tree-sitter.js instead of keeping web-tree-sitter dependency
* Fix the parser init to use input wasm path
* Keep a seperate usfm-grammar-web
* keep a separate usfm-grammar for node w/o using wasm files
* Test publish another alpha.8 version of usfm-grammar at npm
* Test node n web (#257)
* Setup mocha and a sanity testcase in node-usfm-grammar
* Pass USFM at object creation not to function. Not throw errors immediately, allow fromUsj parser creation
* Add 7 tests to check object initialization and error handling
* Add test for parsing usfm and validating pass/fail
* Add tests for parsing pass or fail check of testsuite samples
* Override \s5 fails and milestone w/o \* fails
* Increase buffer size for large file parsing
* Add a toSyntaxTree() method like in python
* Undo overriding \s5 w/o \p fails and milestone w/o \* fails
* Find and report MISSING nodes in tree as errors
* Override samples with \s5 w/o space to fail and wrap ups test for parsing check
* Rename usfmToUsj() to toUSJ() to be similar to the python API
* change USJ version to 3.1 in generation code
* Start testing the successful conversion of USFM to USJ for positive testsuite samples
* Increase the timeout threshold for mocha test case runs as USJ conversion takes time
* Extract \cat value correctly
* Use toUSJ() in prevoius test case too
* Start test by comparing generated USJ to testsuite reference
* Handle difference in space, line and 'lemma' handing in tcdocs and usfm-grammar before comparing USJs
* Exclude the \usfm markers node from USJ
* Keep b as an empty paragraph
* Implement \ref node
* Handle default attribute handling for k, ref, xt etc
* Include table cell processing in USJ generation
* Exclude USJ file with \ref not present in USFM from comparison test
* Replace ~ with space(which is not done in python)
* Fix undecalred var issue in para object conversion
* Exclude 2 USJs from comparison: an \lit issue and a lemma issue from USX
* Fix issue in adding ca and cp values in chapter node
* Avoid adding \+ for fv marker when regenerating USFM
* Convert altnumber and pubnumber back to usfm markers in USFM generation
* Handle ref object not having marker field
* Test USFM round tripping via USJ and cross check presence of all input markers in output USFM also
* update API changes in Readme
* Fix the example in Readme.md
* S5 support (#259)
* Handle error reporting and use of ignore-errors in node module properly
* Prepare for alpha.9 release of node and web modules for \s5 support with ignore errors
* Make API changes and do error handling in web, similar to python and node packages
* Change documentations corresponding to web module usage and development
* Update dev notes regarding node and web modules
* Test node n web (#260)
* Tests in web:Get the basic tests for APIs, error reports to work as in node module
* Tests in web:Get the parsing tests to work as in node module
* Tests in web:Copy the usj tests from node module
* Web: keep version as 3.1
* Web: extract \cat's value correctly
* Web: include notes in USJ that was missed
* Web: exclude \usfm node when making USJ, as version is added by default
* Web: treat \b as para not optbreak
* Web: process ref nodes and changes from link-href to href
* Web: check for table cells and add them in USJ
* Web: fix issue with ca and cp in USJ generation
* Web: replace ~ with space in USJ generation
* Web: handle ref node lacking marker field in USJ while USFM generation
* Web: handle ca cp va vp in usfmGeneration from usj
* Web testing: All the 770 tests in node module running successfully in web too
* Correct the version tag of usfm-grammar-web
* Tests/python node web (#261)
* Copy the list of overriding tests from node to python
* Fix issue in python test script where space in lemma is handled for comparison
* Node: Add tests to check for all markers in generated USJ
* Web: Add tests to check for all markers in generated USJ
* Update the USJ schema with proper schema name
* Node: Add tests to validate the generated USJs against JSON schema
* Web: Add tests to validate the generated USJs against JSON schema
* Node: Fix issues in include and exclude markers options
* Node: Add tests for exclude and include markers
* Web: Fix issues in inlcude and exclude markers options
* Web: Add tests for include and exclude marker options
* Web: Fix the 'table index is out of bounds' error when running all tests at once
* Python: Report MISSING nodes are errors
* Gitactions for js tests: Trial #1
* Gitactions for js tests: Trial #2
* Gitactions for js tests: Trial #3
* Gitactions for js tests: Trial #4
* Pylint error fix
* Gitactions: Let tests run on all PRs
* USFM toList() Implementation (#262)
* Node: Implement toList method like we have in python
* Node: Add tests for list conversion
* Web: Implement toList method like we have in python and node
* Web: Add tests for list conversion
* Remove old trials for web version from node module
* USX implementations in node and web modules (#263)
* Node: Start with toUSX() with xmldom library
* Node: Startwith USXGenerator class, constructore and Id node
* Node: Implement chapter, verse, text etc methods in USX Generation
* Node: Implement content paragraph node in USX Generation
* Node: Implement Notes conversion to USX
* Node: Implement char nodes and attributes conversion to USX
* Node: Implement esb, cat, ref etc and generic parastyle markers in USX generation
* Node: Implement milestone and table nodes in USX generation
* Node: Make verse nodes empty and not carrying the text in USX generation
* Node: Return xlmdom element instead of string after USX generation
* Node: Fix issue of not adding node to xml tree before processing children
* Node: Add verse end node at chapter end
* Node: More minor fixes in USX generation
* Node: Use @xmldom/xlmdom instead of xmldom and xml2js
* Node: Add tests for errorless usfm-usx conversion checks
* Node: Switch back to xmldom for speed
* Node: exlcude usfm(version) node in USX
* Node: Fix issue with numbered markers
* Node: Fix marker usage instead od style
* Node: Fix issues of pi style value
* Node: Fix the similar issue with numbered marker fiun in USJ generation
* Node: Keep ref marker not as char in USX generation
* Node: tests for checking all markers in generated USX
* Node: Minor fix in ref handling
* Node: Allow tests to run in parallel
* Web: Replicate toUSX() implementation as in Node
* Web: Add tests for toUSX() connersion as in Node
* Node: Fix issue is error handling
* Node: Implement USFM Generation from USX and fromUsx initialization of parser
* Node: Add tests for roundtripping USFM via USX
* Web: Implement USFM Generation from USX and fromUsx initialization of parser
* Web: Add tests for roundtripping USFM via USX
---------
Co-authored-by: kavitharaju <[email protected]>
Co-authored-by: Chris Vaughn <[email protected]>
fromUSX initialization of USFMParser class.

xmldom supports:
1. the USX way of nesting multiple text components in a single node
2. repeating nodes of the same type in a single node
3. both string-to-DOM and DOM-to-string parsing.
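The three points above can be illustrated with a small sketch. It is not code from this PR: it uses Python's standard-library xml.dom.minidom as a stand-in for xmldom (an assumption, chosen so the example runs without installing anything), and the USX fragment and the function names describe_children and round_trip are hypothetical.

```python
from xml.dom import minidom

# Hypothetical, simplified USX-like fragment (not from the test suite):
# one <para> holds several text runs interleaved with repeated
# <verse> elements and a <char> element.
USX_SAMPLE = (
    '<usx version="3.1">'
    '<para style="p">'
    '<verse number="1" style="v"/>In the beginning '
    '<char style="nd">God</char> created the heavens. '
    '<verse number="2" style="v"/>The earth was formless.'
    '</para>'
    '</usx>'
)


def describe_children(xml_text):
    """Parse a string into a DOM and list the children of the first <para> in document order."""
    doc = minidom.parseString(xml_text)
    para = doc.getElementsByTagName("para")[0]
    summary = []
    for child in para.childNodes:
        if child.nodeType == child.TEXT_NODE:
            # Points 1 and 2: text runs stay separate nodes between elements.
            summary.append(("text", child.data))
        elif child.nodeType == child.ELEMENT_NODE:
            summary.append((child.tagName, child.getAttribute("style")))
    return summary


def round_trip(xml_text):
    """Point 3: string-to-DOM, then DOM-to-string."""
    doc = minidom.parseString(xml_text)
    return doc.documentElement.toxml()


if __name__ == "__main__":
    for kind, value in describe_children(USX_SAMPLE):
        print(kind, repr(value))
    print(round_trip(USX_SAMPLE))
```

A DOM-style API keeps the order of text and element children, which an object-mapping parser may lose when the same element name repeats inside one parent.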
Libraries considered: fast-xml-parser, xml2js, xml, @xmldom/xmldom, libxmljs.

xmldom is an old library, last updated 4 years ago, even though it has 2k+ dependants. @xmldom/xmldom took several times as long as xmldom.

Not Done