https://github.com/fiatjaf/awesome-jq
https://crates.io/crates/zerocopy
https://github.com/simdjson/simdjson
and a discussion of whether it works with streaming simdjson/simdjson#1361
BUT their context-blindness issue can be solved with a path. Like I'm doing.
And one could hash the path if a flat data-address-space was preferred.
json-stream by alexmaco is mostly dead
Struson seems alive, but uses a fn for parsing elements of arrays, so would need a coroutine (or something) to invert that. https://github.com/marcono1234/struson https://crates.io/crates/struson
cargo-make cos its nice https://github.com/sagiegurari/cargo-make
rapidjson in SAX mode https://rapidjson.org/md_doc_sax.html
wrapper for simdjson. Does not have a streaming mode. simdjson-rust = "0.3.0-alpha.2"
fast, but does tapes, ie pointers into the data which seems to be the main approach of simdjson https://github.com/datalust/squirrel-json?ref=blog.datalust.co
Maybe serde with this? https://stackoverflow.com/questions/75936264/how-can-i-build-up-a-stateful-streaming-parser-with-serde-json
25M https://raw.githubusercontent.com/json-iterator/test-data/master/large-file.json
181M https://github.com/zemirco/sf-city-lots-json/blob/master/citylots.json
Some seriously big data here, finally. But take a wild guess which ones are large.
This one seems to be large: https://data.cdc.gov/api/views/qnzd-25i4/rows.json?accessType=DOWNLOAD
Large json datasets here, but not really helpful because it's records, each containing some json. https://openlibrary.org/developers/dumps
some more test files here https://transparency-in-coverage.uhc.com/
But it's pretty horrible.