feat: write up ipni+datalog sketch #22
This can be controlled if we use our own indexer and only allow known publishers to publish ads. We can even modify it to only allow certain types of index data.
The client can filter out results to remove unknown publishers and unwanted types of metadata. This can be done without having to fully read the results.
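A minimal sketch of the client-side filtering described above, assuming each result exposes a publisher ID and a metadata type next to an opaque payload. `ProviderResult`, `filter_results`, and the field names are hypothetical and are not the IPNI response format.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class ProviderResult:
    # Hypothetical shape of one lookup result; not the real IPNI schema.
    publisher: str       # ID of whoever published the advertisement
    metadata_type: str   # label describing the kind of index data
    payload: bytes       # opaque body, never decoded by the filter


def filter_results(
    results: Iterable[ProviderResult],
    trusted_publishers: Set[str],
    allowed_types: Set[str],
) -> List[ProviderResult]:
    """Keep results from known publishers that carry wanted metadata types.

    Only the publisher and metadata_type fields are inspected, so the
    payload does not have to be read.
    """
    return [
        r
        for r in results
        if r.publisher in trusted_publishers and r.metadata_type in allowed_types
    ]
```

The same check could run in the indexer instead of the client; the sketch only shows the client-side variant.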
Yeah, that is a good call; the client could leverage existing trust where possible. I have been mostly interested in an open-ended scenario in which the publisher of the advertisement can be different from the author and bears no accountability for its accuracy. But if the publisher is accountable for advertisements, trust in the publisher could be leveraged to address this concern.
I think IPNI should be able to support using UCAN to allow the content provider to authorize a publisher to publish on its behalf. I am going to propose that as a change to the IPNI spec. What is the appropriate type of UCAN to do this?
What relationships should be captured and how would they be used?
I go into more detail below, but to put it simply, if you look at the example index you can see that:
- `block..1` is slice of `blb...left` blob.
- `block..1` is block of `bafy..dag` dag.
When I look up `block..1`, which in datalog could be something like `["?e", "?relation", "block...1"]`, I would expect to get back something like this:
But I would also like to be able to query a specific relation like `["?dag", "[email protected]/block", "block..1"]` to get something like:
A single query to IPNI or to a result cache?
I am not sure I understand. Let's be specific about what key is used to look up what data. What I thought was that if IPNI is queried by a block multihash, the location of the DAG index is returned. The DAG index can be read to get location commitments. This can be cached in a separate result cache. A subsequent query for that same block multihash can first query the cache and then get the index and location commitments back in a single response.
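The flow described above could be sketched as a cache-first lookup keyed by block multihash. This assumes two injected callables that stand in for the IPNI query and for reading the DAG index; `ResultCache`, `find_index_location`, and `read_index` are hypothetical names, not existing APIs.

```python
from typing import Callable, Dict, List


class ResultCache:
    """Cache-first lookup keyed by block multihash (illustrative only)."""

    def __init__(
        self,
        find_index_location: Callable[[str], str],
        read_index: Callable[[str], List[str]],
    ) -> None:
        # find_index_location stands in for the IPNI query.
        # read_index stands in for fetching the DAG index and
        # extracting its location commitments.
        self._find_index_location = find_index_location
        self._read_index = read_index
        self._cache: Dict[str, dict] = {}

    def lookup(self, multihash: str) -> dict:
        cached = self._cache.get(multihash)
        if cached is not None:
            # Cache hit: index location and commitments in one response.
            return cached
        location = self._find_index_location(multihash)
        commitments = self._read_index(location)
        result = {
            "index_location": location,
            "location_commitments": commitments,
        }
        self._cache[multihash] = result
        return result
```

Only the first lookup for a given multihash reaches the two callables; later lookups are served from the cache.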
I imagine something like this https://www.tldraw.com/r/C41cjsP2iRxuMGyrIByXQ?v=181,118,2832,2120&p=page
Would datalog be used as a result cache for IPNI query results? So, datalog queries would only be available after getting IPNI results and then getting additional related data such as the blob index and location commitments. Is that right?
I'm not sure I follow what you're asking here. I think of a datalog query as a composite of multiple queries. The service that receives them can decompose them in order to query a combination of local cache + IPNI, aggregate / filter the results, and send them back to the querying client.
This creates an opportunity to reduce the number of queries to IPNI by caching at several layers:
So in the worst-case scenario it would be:
But once the engine finds responses it can cache them, so all future queries would require no IPNI queries. We could also tie the cache TTL to the TTLs of all the responses; that way we get invalidation and recompilation of joined results without having to manually deal with resource management (like invalidating some indexes when a commitment expires).
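One way to read the TTL idea above is that a joined result expires as soon as its shortest-lived source response does. The sketch below assumes TTLs are given in seconds and that every joined result has at least one source; `JoinedResultCache` and its methods are hypothetical names.

```python
import time
from typing import Any, Callable, Dict, Iterable, Optional, Tuple


class JoinedResultCache:
    """Cache whose entries expire with their shortest-lived source."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def put(self, key: str, value: Any, ttls: Iterable[float]) -> None:
        # The joined result is only as fresh as its least fresh input,
        # so it expires at now + min(ttls). ttls must not be empty.
        self._entries[key] = (value, self._clock() + min(ttls))

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the caller recomputes the join.
            del self._entries[key]
            return None
        return value
```

A miss after expiry is what triggers the requery, so no separate invalidation step is needed when a commitment expires.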