readd badger, allow configurable blockstores, default to flatfs, and add docs
aschmahmann authored and acejam committed Dec 22, 2023
1 parent c7d1f34 commit 416b256
Showing 7 changed files with 136 additions and 3 deletions.
8 changes: 8 additions & 0 deletions README.md
@@ -89,6 +89,14 @@ Denylists can be manually placed in the `$RAINBOW_DATADIR/denylists` folder too.

See [NoPFS](https://github.com/ipfs-shipyard/nopfs) for an explanation of the denylist format. Note that denylists should only be appended to while Rainbow is running. Editing differently, or adding new denylist files, should be done with Rainbow stopped.

## Blockstores

Rainbow ships with a number of possible blockstores for the purposes of caching data locally.
Because Rainbow, as a gateway-only IPFS implementation, is not designed for long-term data storage, there are no
long-term guarantees of support for any particular backing data storage.

See [Blockstores](./docs/blockstores.md) for more details.

## Garbage Collection

Over time, the datastore can fill up with previously fetched blocks. To free up this used disk space, garbage collection can be run. Garbage collection needs to be manually triggered. This process can also be automated by using a cron job.
33 changes: 33 additions & 0 deletions docs/blockstores.md
@@ -0,0 +1,33 @@
# Rainbow Blockstores

`rainbow` ships with a number of possible backing block storage options for the purposes of caching data locally.
Because `rainbow`, as a gateway-only IPFS implementation, is not designed for long-term data storage, there are no
long-term guarantees of support for any particular backing blockstore.

`rainbow` currently ships with the following blockstores:

- [FlatFS](#flatfs)
- [Badger](#badger)

Note: `rainbow` exposes minimal configurability for each blockstore. If your experiments show that tuning some
parameters is a big benefit to you, file an issue/PR to discuss changing the blockstore's parameters or, if there's
demand, exposing more configurability.

## FlatFS

FlatFS is a fairly simple blockstore that puts each block into a separate file on disk. Because it leans heavily on the
filesystem (not just for how bytes are stored on disk, but for the file and directory structure as well), the choice of
filesystem and disk type leaves room for optimization. For example, choosing a filesystem that can keep file metadata
on a fast SSD while the actual data stays on a slower disk might speed up various kinds of lookups.
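
To make "each block in a separate file" more concrete, here is a minimal sketch of how a next-to-last sharding scheme can map a key to a shard directory and a file path. `rainbow` opens FlatFS with `flatfs.NextToLast(3)`; the `nextToLast` and `shardPath` helpers, the `.data` extension, and the example key below are illustrative assumptions, not the library's code.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// nextToLast returns a shard directory name for key: the suffixLen
// characters that precede the final character. Short keys are left-padded
// with "_" so the slice is always in range.
func nextToLast(suffixLen int, key string) string {
	padded := strings.Repeat("_", suffixLen+1) + key
	offset := len(padded) - suffixLen - 1
	return padded[offset : offset+suffixLen]
}

// shardPath joins the store root, the shard directory, and a per-block
// file name (the ".data" extension is an assumption of this sketch).
func shardPath(root, key string) string {
	return filepath.Join(root, nextToLast(3, key), key+".data")
}

func main() {
	fmt.Println(nextToLast(3, "ABCDEFGH"))
	fmt.Println(shardPath("flatfs", "ABCDEFGH"))
}
```

Every lookup in such a layout touches a directory entry and a file, which is why filesystem metadata performance matters so much for this kind of store.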

## Badger

`rainbow` ships with [Badger-v4](https://github.com/dgraph-io/badger).
The main reasons to choose Badger compared to FlatFS are:
- It uses far fewer file descriptors and disk operations
- It comes with the ability to compress data on disk
- Generally faster reads and writes
- Native bloom filters

The main difficulty with Badger is that its internal garbage collection (Badger's own, not `rainbow`'s) depends on the
workload, which makes it difficult to judge ahead of time how much capacity you need.
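
The workload dependence can be illustrated with a minimal sketch of a value-log GC loop: a round reclaims space only if enough of a log file is stale, and the loop stops once a round finds nothing to rewrite. Everything here is a hypothetical stand-in (the `errNoRewrite` sentinel, the `gcRound` callback, `collectGarbage`, and the fake store in `main`); it is loosely modelled on how this style of GC is usually driven, not copied from Badger or `rainbow`. The `0.3` ratio matches the `GcDiscardRatio` set in `setup.go`.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoRewrite is a hypothetical sentinel meaning "this round reclaimed nothing".
var errNoRewrite = errors.New("nothing to rewrite")

// collectGarbage calls gcRound until it reports that nothing more can be
// reclaimed, and returns how many rounds actually reclaimed space.
func collectGarbage(gcRound func(discardRatio float64) error, discardRatio float64) (int, error) {
	rounds := 0
	for {
		err := gcRound(discardRatio)
		if errors.Is(err, errNoRewrite) {
			return rounds, nil // done: no file is stale enough to rewrite
		}
		if err != nil {
			return rounds, err
		}
		rounds++
	}
}

func main() {
	// Fake store: the fraction of stale data in each value-log file.
	stale := []float64{0.9, 0.5, 0.1}
	gcRound := func(ratio float64) error {
		for i, s := range stale {
			if s >= ratio {
				stale = append(stale[:i], stale[i+1:]...) // "rewrite" the file
				return nil
			}
		}
		return errNoRewrite
	}
	rounds, err := collectGarbage(gcRound, 0.3)
	fmt.Println(rounds, err, len(stale))
}
```

In this toy model, files that never cross the discard ratio are never reclaimed, so disk usage depends on how writes and deletes happen to be distributed across files.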
8 changes: 8 additions & 0 deletions gc.go
@@ -2,6 +2,7 @@ package main

import (
	"context"
	badger4 "github.com/ipfs/go-ds-badger4"
)

// GC is a really stupid simple algorithm where we just delete things until
@@ -35,5 +36,12 @@ deleteBlocks:
		}
	}

	if ds, ok := nd.datastore.(*badger4.Datastore); ok {
		err = ds.CollectGarbage(ctx)
		if err != nil {
			return err
		}
	}

	return nil
}
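
The type assertion above ties the extra GC step to one concrete datastore type. A possible alternative, shown here only as a sketch, is to assert on a small interface so that any datastore offering a `CollectGarbage` method takes part. The block is self-contained and uses stand-in types: `garbageCollector`, `fakeBadger`, `fakeFlatFS`, and `runStoreGC` are hypothetical names, not part of `rainbow` or its dependencies.

```go
package main

import (
	"context"
	"fmt"
)

// garbageCollector is a hypothetical capability interface: any store that
// can reclaim space on demand implements it.
type garbageCollector interface {
	CollectGarbage(ctx context.Context) error
}

// fakeBadger stands in for a store with its own internal GC.
type fakeBadger struct{ collected bool }

func (b *fakeBadger) CollectGarbage(ctx context.Context) error {
	b.collected = true
	return nil
}

// fakeFlatFS stands in for a store with no internal GC.
type fakeFlatFS struct{}

// runStoreGC runs the store's own GC if it has one. It reports whether a
// GC pass was attempted, plus any error from that pass.
func runStoreGC(ctx context.Context, ds any) (bool, error) {
	gc, ok := ds.(garbageCollector)
	if !ok {
		return false, nil
	}
	return true, gc.CollectGarbage(ctx)
}

func main() {
	ctx := context.Background()
	for _, ds := range []any{&fakeBadger{}, fakeFlatFS{}} {
		ran, err := runStoreGC(ctx, ds)
		fmt.Printf("%T: ran=%v err=%v\n", ds, ran, err)
	}
}
```

The trade-off is that an interface check is looser than the concrete assertion: it would also trigger for wrappers or future stores that expose the same method, which may or may not be desirable.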
7 changes: 7 additions & 0 deletions go.mod
@@ -10,6 +10,7 @@ require (
github.com/ipfs/boxo v0.16.0
github.com/ipfs/go-cid v0.4.1
github.com/ipfs/go-datastore v0.6.0
github.com/ipfs/go-ds-badger4 v0.0.0-20231006150127-9137bcc6b981
github.com/ipfs/go-ds-flatfs v0.5.1
github.com/ipfs/go-ipfs-delay v0.0.1
github.com/ipfs/go-log/v2 v2.5.1
@@ -54,6 +55,8 @@ require (
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/davidlazar/go-crypto v0.0.0-20200604182044-b73af7476f6c // indirect
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.2.0 // indirect
github.com/dgraph-io/badger/v4 v4.2.0 // indirect
github.com/dgraph-io/ristretto v0.1.1 // indirect
github.com/docker/go-units v0.5.0 // indirect
github.com/elastic/gosigar v0.14.2 // indirect
github.com/felixge/httpsnoop v1.0.4 // indirect
@@ -67,7 +70,11 @@ require (
github.com/godbus/dbus/v5 v5.1.0 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang/gddo v0.0.0-20180823221919-9d8ff1c67be5 // indirect
github.com/golang/glog v1.1.0 // indirect
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
github.com/golang/protobuf v1.5.3 // indirect
github.com/golang/snappy v0.0.4 // indirect
github.com/google/flatbuffers v1.12.1 // indirect
github.com/google/gopacket v1.1.19 // indirect
github.com/google/pprof v0.0.0-20231023181126-ff6d637d2a7b // indirect
github.com/google/uuid v1.3.0 // indirect
14 changes: 14 additions & 0 deletions go.sum
@@ -119,6 +119,11 @@ github.com/decred/dcrd/crypto/blake256 v1.0.1 h1:7PltbUIQB7u/FfZ39+DGa/ShuMyJ5il
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.2.0 h1:8UrgZ3GkP4i/CLijOJx79Yu+etlyjdBU4sfcs2WYQMs=
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.2.0/go.mod h1:v57UDF4pDQJcEfFUCRop3lJL149eHGSe9Jvczhzjo/0=
github.com/dgraph-io/badger v1.6.0/go.mod h1:zwt7syl517jmP8s94KqSxTlM6IMsdhYy6psNgSztDR4=
github.com/dgraph-io/badger/v4 v4.2.0 h1:kJrlajbXXL9DFTNuhhu9yCx7JJa4qpYWxtE8BzuWsEs=
github.com/dgraph-io/badger/v4 v4.2.0/go.mod h1:qfCqhPoWDFJRx1gp5QwwyGo8xk1lbHUxvK9nK0OGAak=
github.com/dgraph-io/ristretto v0.1.1 h1:6CWw5tJNgpegArSHpNHJKldNeq03FQCwYvfMVWajOK8=
github.com/dgraph-io/ristretto v0.1.1/go.mod h1:S1GPSBCYCIhmVNfcth17y2zZtQT6wzkzgwUve0VDWWA=
github.com/dgryski/go-farm v0.0.0-20190423205320-6a90982ecee2 h1:tdlZCpZ/P9DhczCTSixgIKmwPv6+wP5DGjqLYw5SUiA=
github.com/dgryski/go-farm v0.0.0-20190423205320-6a90982ecee2/go.mod h1:SqUrOPUnsFjfmXRMNPybcSiG0BgUW2AuFH8PAnS2iTw=
github.com/docker/go-units v0.4.0/go.mod h1:fgPhTUdO+D/Jk86RDLlptpiXQzgHJF7gydDDbaIK4Dk=
github.com/docker/go-units v0.5.0 h1:69rxXcBk27SvSaaxTtLh/8llcHD8vYHT7WSdRZ/jvr4=
@@ -180,10 +185,12 @@ github.com/golang/gddo v0.0.0-20180823221919-9d8ff1c67be5/go.mod h1:xEhNfoBDX1hz
github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b/go.mod h1:SBH7ygxi8pfUlaOkMMuAQtPIUF8ecWP5IEl/CR7VP2Q=
github.com/golang/glog v1.0.0/go.mod h1:EWib/APOK0SL3dFbYqvxE3UYd8E6s1ouQ7iEp/0LWV4=
github.com/golang/glog v1.1.0 h1:/d3pCKDPWNnvIWe0vVUpNP32qc8U3PDVxySP/y360qE=
github.com/golang/glog v1.1.0/go.mod h1:pfYeQZ3JWZoXTV5sFc986z3HTpwQs9At6P4ImfuP3NQ=
github.com/golang/groupcache v0.0.0-20190702054246-869f871628b6/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc=
github.com/golang/groupcache v0.0.0-20191227052852-215e87163ea7/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc=
github.com/golang/groupcache v0.0.0-20200121045136-8c9f03a8e57e/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc=
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da h1:oI5xCqsCo564l8iNU+DwB5epxmsaqB+rhGL0m5jtYqE=
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc=
github.com/golang/lint v0.0.0-20180702182130-06c8688daad7/go.mod h1:tluoj9z5200jBnyusfRPU2LqT6J+DAorxEvtC7LHB+E=
github.com/golang/mock v1.1.1/go.mod h1:oTYuIxOrZwtPieC+H1uAHpcLFnEyAGVDL/k47Jfbm0A=
github.com/golang/mock v1.2.0/go.mod h1:oTYuIxOrZwtPieC+H1uAHpcLFnEyAGVDL/k47Jfbm0A=
@@ -211,8 +218,12 @@ github.com/golang/protobuf v1.5.2/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiu
github.com/golang/protobuf v1.5.3 h1:KhyjKVUg7Usr/dYsdSqoFveMYd5ko72D+zANwlG1mmg=
github.com/golang/protobuf v1.5.3/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiuN0vRsmY=
github.com/golang/snappy v0.0.0-20180518054509-2e65f85255db/go.mod h1:/XxbfmMg8lxefKM7IXC3fBNl/7bRcc72aCRzEWrmP2Q=
github.com/golang/snappy v0.0.4 h1:yAGX7huGHXlcLOEtBnF4w7FQwA26wojNCwOYAEhLjQM=
github.com/golang/snappy v0.0.4/go.mod h1:/XxbfmMg8lxefKM7IXC3fBNl/7bRcc72aCRzEWrmP2Q=
github.com/google/btree v0.0.0-20180813153112-4030bb1f1f0c/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ=
github.com/google/btree v1.0.0/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ=
github.com/google/flatbuffers v1.12.1 h1:MVlul7pQNoDzWRLTw5imwYsl+usrS1TXG2H4jg6ImGw=
github.com/google/flatbuffers v1.12.1/go.mod h1:1AeVuKshWv4vARoZatz6mlQ0JxURH0Kv5+zNeJKJCa8=
github.com/google/go-cmp v0.2.0/go.mod h1:oXzfMopK8JAjlY9xF4vHSVASa0yLyX7SntLO5aqRK0M=
github.com/google/go-cmp v0.3.0/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
github.com/google/go-cmp v0.3.1/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
@@ -305,6 +316,8 @@ github.com/ipfs/go-datastore v0.6.0/go.mod h1:rt5M3nNbSO/8q1t4LNkLyUwRs8HupMeN/8
github.com/ipfs/go-detect-race v0.0.1 h1:qX/xay2W3E4Q1U7d9lNs1sU9nvguX0a7319XbyQ6cOk=
github.com/ipfs/go-detect-race v0.0.1/go.mod h1:8BNT7shDZPo99Q74BpGMK+4D8Mn4j46UU0LZ723meps=
github.com/ipfs/go-ds-badger v0.0.7/go.mod h1:qt0/fWzZDoPW6jpQeqUjR5kBfhDNB65jd9YlmAvpQBk=
github.com/ipfs/go-ds-badger4 v0.0.0-20231006150127-9137bcc6b981 h1:GOKV62VnjerKwO7mwOyeoArzlaVrDLyoC/YPNtxxGwg=
github.com/ipfs/go-ds-badger4 v0.0.0-20231006150127-9137bcc6b981/go.mod h1:LUU2FbhNdmhAbJmMeoahVRbe4GsduAODSJHWJJh2Vo4=
github.com/ipfs/go-ds-flatfs v0.5.1 h1:ZCIO/kQOS/PSh3vcF1H6a8fkRGS7pOfwfPdx4n/KJH4=
github.com/ipfs/go-ds-flatfs v0.5.1/go.mod h1:RWTV7oZD/yZYBKdbVIFXTX2fdY2Tbvl94NsWqmoyAX4=
github.com/ipfs/go-ds-leveldb v0.1.0/go.mod h1:hqAW8y4bwX5LWcCtku2rFNX3vjDZCy5LZCg+cSZvYb8=
@@ -916,6 +929,7 @@ golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBc
golang.org/x/sys v0.0.0-20210630005230-0f9fa26af87c/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220908164124-27713097b956/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20221010170243-090e33056c14/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.15.0 h1:h48lPFYpsTvQJZF4EKyI4aLHaev3CxivZmv7yZig9pc=
9 changes: 8 additions & 1 deletion main.go
@@ -139,7 +139,7 @@ Generate an identity seed and launch a gateway:
Name: "inmem-block-cache",
Value: 1 << 30,
EnvVars: []string{"RAINBOW_INMEM_BLOCK_CACHE"},
Usage: "Size of the in-memory block cache. 0 to disable (disables compression on disk too)",
Usage: "Size of the in-memory block cache (currently only used for badger). 0 to disable (disables compression on disk too)",
},
&cli.Uint64Flag{
Name: "max-memory",
@@ -176,6 +176,12 @@ Generate an identity seed and launch a gateway:
EnvVars: []string{"RAINBOW_PEERING"},
Usage: "Multiaddresses of peers to stay connected to (comma-separated)",
},
&cli.StringFlag{
Name: "blockstore",
Value: "flatfs",
EnvVars: []string{"RAINBOW_BLOCKSTORE"},
Usage: "Type of blockstore to use, such as flatfs or badger. See https://github.com/ipfs/rainbow/blob/main/docs/blockstores.md for more details",
},
}

app.Commands = []*cli.Command{
@@ -261,6 +267,7 @@ share the same seed as long as the indexes are different.

cfg := Config{
DataDir: ddir,
BlockstoreType: cctx.String("blockstore"),
GatewayDomains: getCommaSeparatedList(cctx.String("gateway-domains")),
SubdomainGatewayDomains: getCommaSeparatedList(cctx.String("subdomain-gateway-domains")),
ConnMgrLow: cctx.Int("connmgr-low"),
60 changes: 58 additions & 2 deletions setup.go
@@ -11,6 +11,8 @@ import (
"path/filepath"
"time"

"github.com/dgraph-io/badger/v4"
"github.com/dgraph-io/badger/v4/options"
nopfs "github.com/ipfs-shipyard/nopfs"
nopfsipfs "github.com/ipfs-shipyard/nopfs/ipfs"
bsclient "github.com/ipfs/boxo/bitswap/client"
@@ -27,6 +29,7 @@ import (
httpcontentrouter "github.com/ipfs/boxo/routing/http/contentrouter"
"github.com/ipfs/go-cid"
"github.com/ipfs/go-datastore"
badger4 "github.com/ipfs/go-ds-badger4"
flatfs "github.com/ipfs/go-ds-flatfs"
delay "github.com/ipfs/go-ipfs-delay"
metri "github.com/ipfs/go-metrics-interface"
@@ -78,7 +81,8 @@ type Node struct {
}

type Config struct {
DataDir string
DataDir string
BlockstoreType string

ListenAddrs []string
AnnounceAddrs []string
@@ -360,7 +364,59 @@ func Setup(ctx context.Context, cfg Config, key crypto.PrivKey, dnsCache *cached
}

func setupDatastore(cfg Config) (datastore.Batching, error) {
	return flatfs.CreateOrOpen(filepath.Join(cfg.DataDir, "flatfs"), flatfs.NextToLast(3), false)
	switch cfg.BlockstoreType {
	case "flatfs":
		return flatfs.CreateOrOpen(filepath.Join(cfg.DataDir, "flatfs"), flatfs.NextToLast(3), false)
	case "badger":
		badgerOpts := badger.DefaultOptions("")
		badgerOpts.CompactL0OnClose = false
		// ValueThreshold: defaults to 1MB! For us that means everything goes
		// into the LSM tree and that means more stuff in memory. We only
		// put very small things on the LSM tree by default (i.e. a single
		// CID).
		badgerOpts.ValueThreshold = 256

		// BlockCacheSize: instead of using blockstore, we cache things
		// here. This only makes sense if using compression, according to
		// docs.
		badgerOpts.BlockCacheSize = cfg.InMemBlockCache // default 1 GiB.

		// Compression: default. Trades reading less from disk for using more
		// CPU. Given gateways are usually IO bound, I think we can make this
		// trade.
		if badgerOpts.BlockCacheSize == 0 {
			badgerOpts.Compression = options.None
		} else {
			badgerOpts.Compression = options.Snappy
		}

		// If we write something twice, we do it with the same values so
		// *shrug*.
		badgerOpts.DetectConflicts = false

		// MemTableSize: Defaults to 64MiB which seems an ok amount to flush
		// to disk from time to time.
		badgerOpts.MemTableSize = 64 << 20
		// NumMemtables: more means more memory, faster writes, but more to
		// commit to disk if they get full. Default is 5.
		badgerOpts.NumMemtables = 5

		// IndexCacheSize: 0 means all in memory (default). All means indexes,
		// bloom filters etc. Usually this does not use a huge amount of
		// memory.
		badgerOpts.IndexCacheSize = 0

		opts := badger4.Options{
			GcDiscardRatio: 0.3,
			GcInterval:     20 * time.Minute,
			GcSleep:        10 * time.Second,
			Options:        badgerOpts,
		}

		return badger4.NewDatastore(filepath.Join(cfg.DataDir, "badger4"), &opts)
	default:
		return nil, fmt.Errorf("unsupported blockstore type: %s", cfg.BlockstoreType)
	}
}

type bundledDHT struct {
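
The selection logic in `setupDatastore` can be sketched without any datastore dependencies: map the configured type to an on-disk directory under the data directory and reject unknown values. `blockstoreDir` is a hypothetical helper used only for illustration; the subdirectory names `flatfs` and `badger4` and the error text are taken from the diff above.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// blockstoreDir returns the directory a given blockstore type would use
// under dataDir, or an error for types this sketch does not know about.
func blockstoreDir(dataDir, blockstoreType string) (string, error) {
	switch blockstoreType {
	case "flatfs":
		return filepath.Join(dataDir, "flatfs"), nil
	case "badger":
		return filepath.Join(dataDir, "badger4"), nil
	default:
		return "", fmt.Errorf("unsupported blockstore type: %s", blockstoreType)
	}
}

func main() {
	for _, t := range []string{"flatfs", "badger", "leveldb"} {
		dir, err := blockstoreDir("data", t)
		fmt.Println(dir, err)
	}
}
```

One consequence worth noting: with a `switch` of this shape, an empty type string also lands in the `default` branch, so any caller that builds a `Config` by hand without setting `BlockstoreType` would get the "unsupported blockstore type" error rather than the `flatfs` default, which is applied only by the CLI flag.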