Placeholder for data registry capability #218

Open
alexanderdean opened this issue Oct 6, 2016 · 4 comments

Comments

@alexanderdean
Member

This is based on @brapse's idea of SoundCloud's internal "Data AMT" service, which provides a registry of what data lives where. This is distinct from (but complementary to) a schema registry, which governs your data types.

To give a Snowplow example, in the data registry you could record:

  • Where the raw collector payloads live
  • Where the enriched events live
  • Where the shredded data lives in Redshift

It can record more than just locations, of course: relevant metadata, provenance, etc.
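
To make the idea concrete, here is a minimal in-memory sketch of what such a registry could hold. Nothing here is an existing Iglu, WhereHows or Atlas API: the `DatasetEntry` and `DataRegistry` names, the fields, and the example locations are all hypothetical, chosen only to mirror the three Snowplow datasets listed above.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """One registry record: what a dataset is and where it lives."""
    name: str
    location: str                                  # hypothetical URI, illustration only
    metadata: dict = field(default_factory=dict)   # free-form descriptive metadata
    derived_from: tuple = ()                       # provenance: names of upstream datasets


class DataRegistry:
    """Hypothetical registry mapping dataset names to their entries."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Refuse entries whose declared upstream datasets are not registered yet.
        for parent in entry.derived_from:
            if parent not in self._entries:
                raise KeyError(f"unknown upstream dataset: {parent}")
        self._entries[entry.name] = entry

    def locate(self, name):
        """Return the recorded location of a dataset."""
        return self._entries[name].location

    def lineage(self, name):
        """Return all upstream dataset names, nearest first."""
        seen = []
        queue = list(self._entries[name].derived_from)
        while queue:
            current = queue.pop(0)
            if current not in seen:
                seen.append(current)
                queue.extend(self._entries[current].derived_from)
        return seen


# Assumed example locations; real buckets and clusters would differ.
registry = DataRegistry()
registry.register(DatasetEntry("raw_collector_payloads", "s3://example-bucket/raw/"))
registry.register(DatasetEntry(
    "enriched_events",
    "s3://example-bucket/enriched/",
    metadata={"format": "tsv"},
    derived_from=("raw_collector_payloads",),
))
registry.register(DatasetEntry(
    "shredded_data",
    "redshift://example-cluster/atomic",
    derived_from=("enriched_events",),
))

print(registry.locate("enriched_events"))
print(registry.lineage("shredded_data"))
```

Recording provenance as a list of upstream names is only one possible design; it keeps "where does this live" and "where did this come from" answerable from the same record.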

@alexanderdean alexanderdean self-assigned this Oct 6, 2016
@andybritz

Is this analogous to the LinkedIn WhereHows project?

@alexanderdean
Member Author

alexanderdean commented Oct 25, 2016

Yes @andybritz! It's funny because at Crunch conf there was a LinkedIn talk about WhereHows straight after @brapse's.

We may be able to just use WhereHows instead of adding this to Iglu, but that needs further investigation.

@gregbonnette

gregbonnette commented Oct 25, 2016

May be worth looking at Atlas too.

http://atlas.incubator.apache.org/

@alexanderdean
Member Author

Yes, have been meaning to take a look at Atlas! Thanks Greg.
