Analysis-ready Parquet download? #2

Open
marklit opened this issue Mar 5, 2024 · 4 comments
Labels
question Further information is requested

Comments

@marklit

marklit commented Mar 5, 2024

I built an ETL script that turns the current download into a Parquet file. It has names for every field, is columnar so it is much quicker to query, and is compressed with Zstandard, so a day's worth of data is still around 1.2 GB. There are also H3 indices, which help filter specific geographies quickly.

https://tech.marksblogg.com/global-flight-tracking-adsb.html

Is there any chance the above ETL script could work its way into your infrastructure and produce a daily Parquet file in addition to the current daily download tar file?

@iakat
Member

iakat commented Mar 5, 2024 via email

@wiedehopf

Hey nice blog post! :)

If you're gonna make such a nice new format, you should include whether the airplane is on the ground.

            'altitude':
                trace[3]
                if str(trace[3]).strip().lower() != 'ground'
                else None,

I didn't see that saved anywhere.
Possibly just a bool in your schema?
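
A minimal sketch of that idea, assuming the altitude field (`trace[3]` in the snippet above) holds either a number, the string 'ground', or nothing at all. The helper name `split_altitude` and the `on_ground` column are hypothetical and not part of the blog post's ETL script:

```python
from typing import Any, Optional, Tuple


def split_altitude(raw: Any) -> Tuple[Optional[Any], Optional[bool]]:
    """Return (altitude, on_ground) for one trace point's altitude field.

    Hypothetical helper: keeps the numeric altitude as-is and records the
    'ground' marker as a separate boolean instead of discarding it.
    """
    if raw is None:
        # No altitude reported, so the ground state is unknown as well.
        return None, None
    if str(raw).strip().lower() == 'ground':
        # Same comparison as the original snippet, but the fact is kept.
        return None, True
    return raw, False


# Usage inside a record-building step (field names are assumptions):
altitude, on_ground = split_altitude('ground')
record = {'altitude': altitude, 'on_ground': on_ground}
```

Storing the flag as a nullable bool keeps "unknown" distinct from "airborne", which a plain True/False column would merge.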

You probably already referenced it while using the data, but here is some explanation on the format: https://github.com/wiedehopf/readsb/blob/dev/README-json.md#trace-jsons
The aircraft object is only present for every 4th point, but I assume you didn't need much data from there or your DB schema handles that somehow.
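
One way a schema could handle the sparse aircraft object is to carry the last seen one forward to the points that lack it. The sketch below assumes each trace point is a list and that the aircraft object sits at index 8 (check the linked README before relying on that); `AIRCRAFT_IDX` and `carry_forward_aircraft` are hypothetical names. Whether stale details are acceptable for the in-between points depends on the analysis.

```python
from typing import Any, Iterable, Iterator, List, Optional, Tuple

# Assumed position of the aircraft object within a trace point.
AIRCRAFT_IDX = 8


def carry_forward_aircraft(
    trace_points: Iterable[List[Any]],
) -> Iterator[Tuple[List[Any], Optional[dict]]]:
    """Yield (point, aircraft), reusing the most recent aircraft object
    for points where it is missing or null."""
    last: Optional[dict] = None
    for point in trace_points:
        current = point[AIRCRAFT_IDX] if len(point) > AIRCRAFT_IDX else None
        if current is not None:
            last = current
        yield point, last


# Made-up example points; only the aircraft slot matters here.
example_points = [
    [0, 1.0, 2.0, 'ground', None, None, 0, None, {'flight': 'ABC123'}],
    [5, 1.1, 2.1, 500, None, None, 0, None, None],
    [9, 1.2, 2.2, 900],
]
filled = [aircraft for _, aircraft in carry_forward_aircraft(example_points)]
```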

Also sorry for the format, it's a bit of a mess.

@iakat
Member

iakat commented Mar 18, 2024

@marklit of course nothing is preventing you from tackling this project yourself and making the parquet-ready data available similar to this repo. :)

iakat added the "help wanted" and "enhancement" labels on Mar 18, 2024

@alexey-milovidov

@marklit, I've created a ClickHouse database with the data and also added ADSB-E: https://github.com/ClickHouse/adsb.exposed/
Contact me if there are further ideas.

iakat added the "question" label and removed the "enhancement" and "help wanted" labels on Apr 21, 2024