Plans for Apache Atlas support/integration #51
Comments
Aha! I knew there were more of you. :) I'm super interested in building this out, but I still need to scope it out - largely, I haven't looked at the amundsen metadata library or the apache atlas API enough to be able to tell. I can take a look today and let you know ASAP how feasible it would be.
No worries, no need to do it ASAP. Atlas' API is quite involved (at least from my experience), but there's https://github.com/jpoullet2000/atlasclient/tree/master/atlasclient which many people seem to be using. I'm tempted to write an Atlas client in Rust, but for now I'm forced to work in Java and Python; plus I can't justify bringing in JNI or FFI for just a REST client :(
Ah! Yeah, Java and Python are still much more widely used these days. A Rust Atlas API would be amazing, though. Even without it, this actually doesn't seem too difficult -- this Python client seems pretty reasonable. Let me give it a stab and I'll get back to you (it might be a little wait until I can get to this though). FYI, my current thinking is to periodically scrape from Atlas with the registered whale cron job or the GitHub Actions script, rather than hitting the API in realtime. Does that feel acceptable to you? Updates wouldn't propagate in realtime, but if the API is performant enough, the scrape could run quite frequently.
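A minimal sketch of what one periodic scrape pass could look like, in case it helps the discussion. It assumes Atlas exposes its v2 basic-search endpoint at `/api/atlas/v2/search/basic` with `typeName`, `limit`, and `offset` parameters and returns matches under an `entities` key. The function names (`http_fetch_page`, `scrape_entities`) are hypothetical, authentication is left out, and this is not how whale actually does it.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator, List

# Assumption: path of the Atlas v2 basic-search endpoint.
SEARCH_PATH = "/api/atlas/v2/search/basic"


def http_fetch_page(base_url: str, type_name: str, limit: int, offset: int) -> List[Dict]:
    """Fetch one page of search results (hypothetical helper, no auth handling)."""
    query = urllib.parse.urlencode(
        {"typeName": type_name, "limit": limit, "offset": offset}
    )
    with urllib.request.urlopen(f"{base_url}{SEARCH_PATH}?{query}") as resp:
        # Treat a missing or null "entities" key as an empty page.
        return json.load(resp).get("entities") or []


def scrape_entities(
    fetch_page: Callable[[str, int, int], List[Dict]],
    type_name: str,
    page_size: int = 100,
) -> Iterator[Dict]:
    """Yield every entity of one type by paging until a short page comes back."""
    offset = 0
    while True:
        page = fetch_page(type_name, page_size, offset)
        yield from page
        if len(page) < page_size:
            break
        offset += page_size
```

A cron job or CI script could call `scrape_entities(lambda t, l, o: http_fetch_page(base_url, t, l, o), "hive_table")` on each run and write the results to local files, so nothing queries Atlas in realtime.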
Hey, I think the most involved work with a Rust client could be entity mappings. Atlas has an inheritance model where certain entities share the same core properties, but differ a lot depending on which entity typedef has been created. That said, it seems like Whale only uses Rust for the CLI, so perhaps writing a Rust client might be a tangent, as you could use the Python client.
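To illustrate the mapping problem, here is a rough Python sketch that splits an entity's attributes into a shared core and a type-specific remainder. It assumes entities arrive as JSON objects with `guid`, `typeName`, and `attributes` keys, and that `qualifiedName`, `name`, `description`, and `owner` are the attributes inherited by most types. The `MappedEntity` class and `map_entity` function are hypothetical names, not part of any existing client.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Assumption: attributes shared by most entity types through inheritance.
CORE_ATTRIBUTES = ("qualifiedName", "name", "description", "owner")


@dataclass
class MappedEntity:
    guid: str
    type_name: str
    core: Dict[str, Any]                                   # shared attributes
    extra: Dict[str, Any] = field(default_factory=dict)    # typedef-specific ones


def map_entity(raw: Dict[str, Any]) -> MappedEntity:
    """Split a raw entity dict into core and typedef-specific attributes."""
    attrs = raw.get("attributes") or {}
    core = {key: attrs.get(key) for key in CORE_ATTRIBUTES}
    extra = {k: v for k, v in attrs.items() if k not in CORE_ATTRIBUTES}
    return MappedEntity(
        guid=raw.get("guid", ""),
        type_name=raw.get("typeName", ""),
        core=core,
        extra=extra,
    )
```

Keeping the type-specific attributes in an untyped `extra` dict sidesteps the need to model every typedef up front. A statically typed client would have to decide how much of that to encode in its type system.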
Hm. It is a bit of a tangent, but it is absolutely worth considering. I'll think about it more over the weekend. And definitely let me know as soon as you get to a point where you start building the Rust Atlas client. I think the big question for me is what the best architectural choice is. The options in my head right now are just:
Feels like 3 is the easiest, but if you end up creating a Rust Atlas client, 2 could be a more elegant workaround. Also that spark lineage bit sounds SUPER interesting. Would love to know more about it :)
I have seen Atlas at work and can say that the API is performant enough if there are enough text-based optimizations around (NLP et al.). I believe the 3rd option should be easy to go with and should serve most purposes, considering Atlas is also working to improve their search over time. A Rust client would be a good first step.
Thanks, @prakharcode! Yeah let's go with this for now. I'll post here if I can get to this at some point, but in the meantime, either of you should feel free to post and take this if you're feeling ambitious. :)
Hi,
This is related to #3. Are there plans to support Apache Atlas (https://atlas.apache.org)? It's a metadata store that'll include other things like business catalogs and glossaries.
There's some integration with Amundsen, where the latter can store data on Atlas instead of Neo4j. In that case, supporting the Amundsen API might be one way to support Atlas.