diff --git a/docs/changelogs/v0.21.md b/docs/changelogs/v0.21.md
index dccdb505d9e4..e68414864187 100644
--- a/docs/changelogs/v0.21.md
+++ b/docs/changelogs/v0.21.md
@@ -10,6 +10,7 @@
   - [`Gateway.DeserializedResponses` config flag](#gatewaydeserializedresponses-config-flag)
   - [`client/rpc` migration of `go-ipfs-http-client`](#clientrpc-migration-of-go-ipfs-http-client)
   - [Gateway: DAG-CBOR/-JSON previews and improved error pages](#gateway-dag-cbor-json-previews-and-improved-error-pages)
+  - [Accelerated DHT Client is no longer experimental](#accelerated-dht-client-is-no-longer-experimental)
 - [📝 Changelog](#-changelog)
 - [👨‍👩‍👧‍👦 Contributors](#-contributors)
 
@@ -74,7 +75,7 @@ for Kubo `v0.21`.
 
 In this release, we improved the HTML templates of our HTTP gateway:
 
-1. You can now preview the contents of a DAG-CBOR and DAG-JSON document from your browser, as well as follow any IPLD Links ([CBOR Tag 42](https://github.com/ipld/cid-cbor/)) contained within them. 
+1. You can now preview the contents of a DAG-CBOR and DAG-JSON document from your browser, as well as follow any IPLD Links ([CBOR Tag 42](https://github.com/ipld/cid-cbor/)) contained within them.
 2. The HTML directory listings now contain [updated, higher-definition icons](https://user-images.githubusercontent.com/5447088/241224419-5385793a-d3bb-40aa-8cb0-0382b5bc56a0.png).
 3. On gateway error, instead of a plain text error message, web browsers will now get a friendly HTML response with more details regarding the problem.
 
@@ -84,6 +85,20 @@ HTML responses are returned when request's `Accept` header includes `text/html`.
 | ---- | ---- |
 | ![DAG-CBOR Preview](https://github.com/ipfs/boxo/assets/5447088/973f05d1-5731-4469-9da5-d1d776891899) | ![Error Page](https://github.com/ipfs/boxo/assets/5447088/14c453df-adbc-4634-b038-133121914550) |
 
+#### Accelerated DHT Client is no longer experimental
+
+The [accelerated DHT client](../config.md#routingaccelerateddhtclient) is now
+the main recommended solution for users who are hosting lots of data.
+In exchange for some upfront DHT caching and increased memory usage,
+provide throughput can improve by up to 6 million times, which allows much larger datasets to be advertised.
+See [the docs](../config.md#routingaccelerateddhtclient) for more info.
+
+The `Experimental.AcceleratedDHTClient` flag moved to [`Routing.AcceleratedDHTClient`](../config.md#routingaccelerateddhtclient).
+A config migration has been added to handle this automatically.
+
+A new tracker estimates the providing speed and warns users who are
+falling behind that they should enable `Routing.AcceleratedDHTClient`.
+
 ### 📝 Changelog
 
 ### 👨‍👩‍👧‍👦 Contributors
diff --git a/docs/config.md b/docs/config.md
index bf2b7750c641..10994f68b881 100644
--- a/docs/config.md
+++ b/docs/config.md
@@ -110,6 +110,7 @@ config file at runtime.
   - [`Reprovider.Strategy`](#reproviderstrategy)
 - [`Routing`](#routing)
   - [`Routing.Type`](#routingtype)
+  - [`Routing.AcceleratedDHTClient`](#routingaccelerateddhtclient)
   - [`Routing.Routers`](#routingrouters)
     - [`Routing.Routers: Type`](#routingrouters-type)
     - [`Routing.Routers: Parameters`](#routingrouters-parameters)
@@ -1348,7 +1349,7 @@ Type: `array[peering]`
 ### `Reprovider.Interval`
 
 Sets the time between rounds of reproviding local content to the routing
-system. 
+system.
 
 - If unset, it uses the implicit safe default.
 - If set to the value `"0"` it will disable content reproviding.
@@ -1423,6 +1424,52 @@ Default: `auto` (DHT + IPNI)
 
 Type: `optionalString` (`null`/missing means the default)
 
+
+### `Routing.AcceleratedDHTClient`
+
+Enables an alternative DHT client that uses a Full-Routing-Table strategy: every hour
+it performs a complete scan of the DHT and records all nodes found.
+Then, when a lookup is attempted, instead of going through multiple Kad hops it
+finds the 20 final nodes by looking them up in the recorded network table.
+
+This trades extra CPU and memory, spent on the periodic scans and the additional
+records to keep, for speed: the latency of individual operations should be ~10x lower
+and the provide throughput up to 6 million times higher.
+
+This is not compatible with `Routing.Type` `custom`. If you are using composable routers,
+you can configure this individually on each router.
+
+When it is enabled:
+- DHT operations (reads and writes) should complete much faster, giving lower latency
+- The provider will now use a keyspace sweeping mode, which allows it to keep alive
+  CID sets that are multiple orders of magnitude bigger.
+- The standard Bucket-Routing-Table DHT will still run for the DHT server if a
+  mode that enables the DHT server is used. This means the classical routing
+  table will still be used to answer other nodes, because getting this right
+  is critical in order to not harm the network.
+- The `ipfs stats dht` command will default to showing information about the accelerated DHT client
+
+**Caveats:**
+1. Running the accelerated client will likely result in more resource consumption (connections, RAM, CPU, bandwidth)
+   - Users that are limited in the number of parallel connections their machines/networks can perform will likely suffer
+   - The resource usage is not smooth as the client crawls the network in rounds and reproviding is similarly done in rounds
+   - Users who previously had a lot of content but were unable to advertise it on the network will see an increase in
+     egress bandwidth as their nodes start to advertise all of their CIDs into the network. If you have lots of data
+     entering your node that you don't want to advertise, then consider using [Reprovider Strategies](#reproviderstrategy)
+     to reduce the number of CIDs that you are reproviding. Similarly, if you are running a node that deals mostly with
+     short-lived temporary data (e.g. you use a separate node for ingesting data than for storing and serving it) then
+     you may benefit from using [Strategic Providing](experimental-features.md#strategic-providing) to prevent advertising
+     of data that you ultimately will not have.
+2. Currently, the DHT is not usable for queries for the first 5-10 minutes of operation as the routing table is being
+prepared. This means operations like searching the DHT for particular peers or content will not work initially.
+   - You can see if the DHT has been initially populated by running `ipfs stats dht`
+3. Currently, the accelerated DHT client is not compatible with LAN-based DHTs and will not perform operations against
+them.
+
+Default: `false`
+
+Type: `bool` (missing means `false`)
+
 ### `Routing.Routers`
 
 **EXPERIMENTAL: `Routing.Routers` configuration may change in future release**
@@ -1465,7 +1512,7 @@ HTTP:
 
 DHT:
   - `"Mode"`: Mode used by the DHT. Possible values: "server", "client", "auto"
-  - `"AcceleratedDHTClient"`: Set to `true` if you want to use the experimentalDHT.
+  - `"AcceleratedDHTClient"`: Set to `true` if you want to use the accelerated DHT client.
   - `"PublicIPNetwork"`: Set to `true` to create a `WAN` DHT. Set to `false` to create a `LAN` DHT.
 
 Parallel:
diff --git a/docs/experimental-features.md b/docs/experimental-features.md
index 83a5fdf7b863..07f7f30f5e83 100644
--- a/docs/experimental-features.md
+++ b/docs/experimental-features.md
@@ -513,7 +513,7 @@ ipfs config --json Experimental.StrategicProviding true
 - [ ] provide roots
 - [ ] provide all
 - [ ] provide strategic
- 
+
 ## GraphSync
 
 ### State
@@ -546,59 +546,6 @@ Stable, enabled by default
 
 [Noise](https://github.com/libp2p/specs/tree/master/noise) libp2p transport based on the [Noise Protocol Framework](https://noiseprotocol.org/noise.html). While TLS remains the default transport in Kubo, Noise is easier to implement and is thus the "interop" transport between IPFS and libp2p implementations.
 
-## Accelerated DHT Client
-
-### In Version
-
-0.9.0
-
-### State
-
-Experimental, default-disabled.
-
-Utilizes an alternative DHT client that searches for and maintains more information about the network
-in exchange for being more performant.
-
-When it is enabled:
-- DHT operations should complete much faster than with it disabled
-- A batching reprovider system will be enabled which takes advantage of some properties of the experimental client to
-  very efficiently put provider records into the network
-- The standard DHT client (and server if enabled) are run alongside the alternative client
-- The operations `ipfs stats dht` and `ipfs stats provide` will have different outputs
-  - `ipfs stats provide` only works when the accelerated DHT client is enabled and shows various statistics regarding
-    the provider/reprovider system
-  - `ipfs stats dht` will default to showing information about the new client
-
-**Caveats:**
-1. Running the experimental client likely will result in more resource consumption (connections, RAM, CPU, bandwidth)
-   - Users that are limited in the number of parallel connections their machines/networks can perform will likely suffer
-   - Currently, the resource usage is not smooth as the client crawls the network in rounds and reproviding is similarly
-     done in rounds
-   - Users who previously had a lot of content but were unable to advertise it on the network will see an increase in
-     egress bandwidth as their nodes start to advertise all of their CIDs into the network. If you have lots of data
-     entering your node that you don't want to advertise consider using [Reprovider Strategies](config.md#reproviderstrategy)
-     to reduce the number of CIDs that you are reproviding. Similarly, if you are running a node that deals mostly with
-     short-lived temporary data (e.g. you use a separate node for ingesting data then for storing and serving it) then
-     you may benefit from using [Strategic Providing](#strategic-providing) to prevent advertising of data that you
-     ultimately will not have.
-2. Currently, the DHT is not usable for queries for the first 5-10 minutes of operation as the routing table is being
-prepared. This means operations like searching the DHT for particular peers or content will not work
-   - You can see if the DHT has been initially populated by running `ipfs stats dht`
-3. Currently, the accelerated DHT client is not compatible with LAN-based DHTs and will not perform operations against
-them
-
-### How to enable
-
-```
-ipfs config --json Experimental.AcceleratedDHTClient true
-```
-
-### Road to being a real feature
-
-- [ ] Needs more people to use and report on how well it works
-- [ ] Should be usable for queries (even if slower/less efficient) shortly after startup
-- [ ] Should be usable with non-WAN DHTs
-
 ## Optimistic Provide
 
 ### In Version
@@ -640,7 +587,7 @@ than the classic client.
    size estimation available the client will transparently fall back to the classic approach.
 2. The chosen peers to store the provider records might not be the actual closest ones. Measurements showed that this
    is not a problem.
-3. The optimistic provide process returns already after 15 out of the 20 provider records were stored with peers. The 
+3. The optimistic provide process returns already after 15 out of the 20 provider records were stored with peers. The
    reasoning here is that one out of the remaining 5 peers are very likely to time out and delay the whole process. To
    limit the number of in-flight async requests there is the second `OptimisticProvideJobsPoolSize` setting. Currently,
    this is set to 60. This means that at most 60 parallel background requests are allowed to be in-flight. If this