How to use carbon-clickhouse with distributed tables? #119
Comments
I am not sure what you mean. I've used distributed tables for inserts there.
The readme uses partitioned tables, not distributed tables. ClickHouse distributed table:
With a distributed table you can spread data across ClickHouse shards and scale linearly with the number of nodes (depending on the efficiency of the distribution key).
You should rather use a single table, not an "ON CLUSTER" clause. See https://clickhouse.com/docs/en/engines/table-engines/special/distributed/
That documentation is what I was searching for! The idea is to store data not on a single node but in a cluster with multiple shards, in order to scale. The readme's CREATE TABLE instructions are for a single node, or have I missed something? Would it be possible to add an example of CREATE TABLE in distributed mode to the readme?
You should create regular tables on each node in the cluster. For reading from all nodes in one request, use a Distributed table. When creating the Distributed table, you can set a sharding_key; that allows you to write "to the Distributed table", meaning all incoming data will be routed by sharding_key. Here are examples of configs from my prod. Tables: CREATE TABLE IF NOT EXISTS graphite_repl ON CLUSTER datalayer (
`Path` String CODEC(ZSTD(3)),
`Value` Float64 CODEC(Gorilla, LZ4),
`Time` UInt32 CODEC(DoubleDelta, LZ4),
`Date` Date CODEC(DoubleDelta, LZ4),
`Timestamp` UInt32 CODEC(DoubleDelta, LZ4)
)
ENGINE = ReplicatedGraphiteMergeTree('/clickhouse/tables/{shard}/graphite_repl', '{replica}', 'graphite_rollup')
PARTITION BY toYYYYMMDD(Date)
ORDER BY (Path, Time)
TTL
Date + INTERVAL 1 WEEK TO VOLUME 'cold_volume',
Date + INTERVAL 4 MONTH DELETE
SETTINGS
index_granularity = 512;
CREATE TABLE IF NOT EXISTS graphite_dist ON CLUSTER datalayer AS graphite_repl
ENGINE = Distributed(datalayer, ..., graphite_repl);
carbon-clickhouse:
graphite-clickhouse:
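The prod example above omits the optional sharding_key argument of the Distributed engine. A minimal sketch of the same table with explicit routing by metric path, so that all points for one metric land on the same shard; `default` is a placeholder database name (the original elides it), substitute your own:

```sql
-- Sketch only: as graphite_dist above, but with an explicit sharding_key
-- so inserts into the Distributed table are routed by a hash of Path.
-- 'default' is a placeholder database name; replace it with yours.
CREATE TABLE IF NOT EXISTS graphite_dist ON CLUSTER datalayer AS graphite_repl
ENGINE = Distributed(datalayer, default, graphite_repl, cityHash64(Path));
```

Hashing on Path keeps each metric's points on a single shard, which matters for the GraphiteMergeTree rollup to see all points of a series together.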
I will go test that!
Could it be useful to put chproxy in front to cache requests (https://www.chproxy.org/)?
If you use carbonapi, it can also cache requests, so that depends on your use case. Overall I would suggest starting with a simple setup and adding extra pieces once you encounter a bottleneck.
No. chproxy can't cache requests with external data (used in points table queries). graphite-clickhouse can cache finder queries (in render requests). So there is no reason to use chproxy for caching, but it is useful as a bouncer / connection pool limiter.
How to configure carbon-clickhouse with ClickHouse distributed tables?