Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Excluded provider does not appear in /providers after a successful announce if head has not changed #2193

Closed
masih opened this issue Aug 2, 2023 · 7 comments
Assignees

Comments

@masih
Copy link
Member

masih commented Aug 2, 2023

Observed in prod on inga. Provider ID, using HTTP publisher,12D3KooWDCXxiSsLi1NT9tsiyimwV6YstQkrjTjD2hAkz2KRVAGG was excluded due to timeout which was a result of bad port number:

{"level":"error","ts":"2023-07-31T23:14:32.243Z","logger":"indexer/ingest","caller":"ingest/ingest.go:883","msg":"Failed to auto-sync with publisher","publisher":"12D3KooWDCXxiSsLi1NT9tsiyimwV6YstQkrjTjD2hAkz2KRVAGG","addr":"/ip4/203.217.18.114/tcp/3737/http/httpath/_ipni%2F","err":"cannot query head for sync: fetch request failed: Get \"http://203.217.18.114:3737/_ipni/head\": GET http://203.217.18.114:3737/_ipni/head giving up after 5 attempt(s): Get \"http://203.217.18.114:3737/_ipni/head\": context deadline exceeded (Client.Timeout exceeded while awaiting headers). Possibly incorrect topic configured"}

Then a successful announcement was made by the provider :

{"level":"info","ts":"2023-08-02T00:15:59.952Z","logger":"announce","caller":"announce/receiver.go:316","msg":"Handling direct announce","peer":"12D3KooWDCXxiSsLi1NT9tsiyimwV6YstQkrjTjD2hAkz2KRVAGG"}

This did not trigger a sync because the stop node in selector matched the sync node:

{"level":"info","ts":"2023-08-02T00:20:51.372Z","logger":"dagsync","caller":"dagsync/subscriber.go:812","msg":"cid to sync to is the stop node. Nothing to do","cid":"baguqeeraxstl6rsun6vgozbkoyy5gl67ebwvpjxaejgviiugdovrj3gol4yq","peer":"12D3KooWDCXxiSsLi1NT9tsiyimwV6YstQkrjTjD2hAkz2KRVAGG"}

Provider remains excluded from the providers list and does not appear at /providers.

The logs do not show the address inside the announcement.

  • Change the announce logs to also include the announced address so that we can confirm the later announcement had the correct address.
  • check that successful connection to a provider re includes it.
  • On receiving announcements for an excluded provider, contact it regardless of sync node and if the connection is established re include it.
@rvagg
Copy link
Contributor

rvagg commented Aug 2, 2023

My matching log for that last log entry:

Aug 02 00:20:50 fletcher frisbii[3574021]: 2023-08-02T10:20:50.725+1000        INFO        frisbii        frisbii/main.go:128        Announcing to indexer as /dns/ipfs.va.gg/tcp/3747/http

That log entry's multiaddr string is the same one that's passed into Engine, from here through to here where we pass it to httpsync.NewPublisherWithoutServer(). So it shouldn't be caching an older address with the wrong port on my end (I think).

@masih
Copy link
Member Author

masih commented Aug 2, 2023

Has there been any new advertisements since the announcement was made?

@gammazero
Copy link
Collaborator

gammazero commented Aug 2, 2023

Adding addresses to announcement logs is OK. However...

The reason a provider is excluded is that there has not been a new advertisement for some amount of time. If the publisher address is incorrect, and then is corrected by a later announcement, then newer ads will be seen and the provider will no longer be excluded. If the address is corrected and there are still no new ads, being able to connect to the provider using the correct address does not change the fact that no new ads are available and the provider should still be excluded.

So, I don't think we want to do this:

contact it regardless of sync node and if the connection is established re include it.

If the indexer has already seen all the ads there, then connecting to and even fetching the already-seen ad will not constitute any new ad ingestion and should not re-include the provider.

@masih
Copy link
Member Author

masih commented Aug 3, 2023

I see, that makes sense thanks Andrew.

@rvagg can I ask you to confirm if there was any new ads published with the right address?

@rvagg
Copy link
Contributor

rvagg commented Aug 4, 2023

Screenshot 2023-08-04 at 1 27 33 pm

Yeah, I've been restarting my Frisbii a couple of times a day hoping that it'll somehow prod ipni into action. See above log.

That address should be fine, see a fetch with lassie:

Screenshot 2023-08-04 at 1 29 30 pm

I do have these logs though, after this issue was opened I started getting pings from IPNI from the head and one for the advert itself yesterday. The log line at the top of this screenshot is when my normal traffic dried up completely, 2023-08-01T11:05:08+10:00 was apparently when ipni stopped telling lassie clients that they could fetch wikipedia from me (context deadline exceeded is expected, client was fetching more than I've configured my Frisbii to be allowed to serve - I have a time & byte limit set). The log line after this screenshot was the fetch I performed above with lassie.

Screenshot 2023-08-04 at 1 33 56 pm

@rvagg
Copy link
Contributor

rvagg commented Aug 8, 2023

I'm still just getting the /_ipni/head pings every 1-2 hours and nothing more. Today I tried changing the order of the CIDs I want to advertise so I got a different advert CID; instead of baguqeeraxstl6rsun6vgozbkoyy5gl67ebwvpjxaejgviiugdovrj3gol4yq as per the logs above, it's now baguqeera5ydfpew5afc7zoqkowtndf3blikejt5cq24qtcpukngqqpu5pn5q; but that doesn't seem to have had an effect.


btw I thought the user-agent was fixed with ipni/go-libipni#60 but perhaps storetheindex has a different http client?

@rvagg
Copy link
Contributor

rvagg commented Aug 10, 2023

K, I figured this out and I think it was all me. Frisbii had a bug, which I guess was introduced at some point during updating my engine fork where the advertisements were giving the / path rather than the /_ipni/ httpath that Frisbii uses. So the persistent /_ipni/head pings I've been getting are probably from much older advertisements? I ended up having to burn through 2 more peer IDs to make it work properly in the end, it seems that storetheindex has a bit too much of a memory of where the addresses should be. At one stage when I figured this out, I was getting the /_ipni/head and then the advert CID request, /_ipni/baf..., but the next request for the chain was going to /baf... which is a bit weird but I'm assuming some cached addr for my peer ID.

Fixed, I hope, in ipld/frisbii#13, my Frisbii is back online serving data again at least!

@rvagg rvagg closed this as completed Aug 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants