Commit
Balance streaming parents (netdata#18945)
* recreate the circular buffer from time to time
* do not update cloud url if the node id is not updated
* remove deadlock and optimize pipe size
* removed const
* finer control on randomized delays
* restore children re-connecting to parents
* handle partial pipe reads; sender_commit() now checks if the sender is still connected to avoid bombarding it with data that cannot be sent
* added commented code about optimizing the array of pollfds
* improve interactivity of sender; code cleanup
* do not use the pipe for sending messages, instead use a queue in memory (that can never be full)
* fix dictionaries families
* do not destroy aral on replication exit - it crashes the senders
* support multiple dispatchers and connectors; code cleanup
* more cleanup
* Add serde support for KMeans models.
  - Serialization/Deserialization support of KMeans models.
  - Send/receive ML models between a child/parent.
  - Fix some rare and old crash reports.
  - Reduce allocations by a couple thousand per second when training.
  - Enable ML statistics temporarily which might increase CPU consumption.
* fix ml models streaming
* up to 10 dispatchers and 2 connectors
* experiment: limit the number of receivers to the number of cores - 2
* reworked compression at the receiver to minimize read operations
* multi-core receivers
* use slot 0 on receivers
* use slot 0 on receivers
* use half the cores for receivers with a minimum of 4
* cancel receiver threads
* use offsets instead of pointers in the compressed buffer; track last reads
* fix crash on using freed decompressor; core re-org
* fix incorrect job registration
* fix send_to_plugin() for SSL
* add reason to disconnect message
* fix signaling receivers to stop
* added --dev option to netdata-installer.sh to prevent it from removing the build directory
* Fix serde of double values.
  NaNs and +/- infinities are encoded as strings.
* unused param
* reset max cbuffer size when it is recreated
* struct receiver_state is now private
* 1 dispatcher, 1 connector, 2/3 cores for receivers
* all replication requests are served by replication threads - never the dispatcher threads
* optimize partitions and cache lines for dbengine cache
* fix crash on receiver shutdown
* rw spinlock now prioritizes writers
* backfill all higher tiers
* extent cache to 10%
* automatic sizing of replication threads
* add more replication threads
* configure cache eviction parameters to avoid running in aggressive mode all the time
* run evictions and flushes every 100ms
* add missing initialization
* add missing initialization - again
* add evictors for all caches
* add dedicated evict thread per cache
* destroy the completion
* avoid sending too many signals to eviction threads
* alternative way to make sure there are data to evict
* measure inline cache events
* disable inline evictions and flushing for open and extent cache
* use a spinlock to avoid sending too many signals
* batch evictions are not in steps of pages
* fix wanted cache size when there are no clean entries in it
* fix wanted cache size when there are no clean entries in it
* fix wanted cache size again
* adaptive batch evictions; batch evictions first try all partitions
* move waste events to waste chart
* added evict_traversed
* evict in smaller steps
* removed obsolete code
* disabled inlining of evictions and flushing; added timings for evictions
* more detailed timings for evictions
* use inline evictors
* use aral for gorilla pages of 512 bytes, when they are loaded from disk
* use aral for all gorilla page sizes loaded from disk
* disable inlining again to test it after the memory optimization
* timings for dbengine evictions
* added timing names
* detailed timings
* detailed timings - again
* removed timings and restored inline evictions
* eviction on release only under critical pressure
* cleanup and replication tuning
* tune cache size calculation
* tune replication threads calculation
* make streaming receiver exit
* Do not allocate/copy extent data twice.
* Build/link mimalloc
  Just for testing, it will be reverted.
* lower memory requirements
* Link mimalloc statically
* run replication with synchronous queries
* added missing worker jobs in sender dispatcher
* enable batch evictions in pgc
* fix sender-dispatcher workers
* set max dispatchers to 2
* increase the default replication threads
* log stream_info errors
* increase replication threads
* log the json text when we fail to parse json response of stream_info
* stream info response may come back in multiple steps
* print the socket error of stream info
* added debug to stream info socket error
* loop while content-length is smaller than the payload received
* Revert "Link mimalloc statically"
  This reverts commit c98e482.
* Revert "Build/link mimalloc"
  This reverts commit 8aae22a.
* Remove NEED_PROTOBUF
* Use mimalloc
* Revert "Use mimalloc"
  This reverts commit 9a68034.
* Use mimalloc
* support 256 bytes gorilla pages, when they are loaded from disk
* added os_mem_available()
* test memory protection
* use protection only on one cache
* use the free memory of the main cache in the other caches too
* use the free memory of the main cache in the open cache too
* Batch gorilla writes by tracking the last written number.
  In a setup with 200 children, `perf` shows that the worst offender is the gorilla write operation, reporting ~17% overhead. With this change `perf` reports ~4% overhead and netdata's CPU consumption decreased by ~16%.
* make buffered_reader_next_line() a couple times faster
* flushing open cache
* Use re2c for the line splitting pluginsd.
  Function gets optimized around 3x. We should delete old code and use the re2c for the rest of the functions, but we need to keep the PR size as minimal as possible. Will do in follow-up PRs.
* use cores - 1 for receivers, use only 1 sender
* move sender processing to a separate function
* Revert "Batch gorilla writes by tracking the last written number."
  This reverts commit 2e72a5c.
* Batch gorilla writes only from writers
  This reapplies df79be2f01145bd79091a8934d7c80b4b3eb915b and introduces a couple changes to remove writes from readers.
* log information for buffer overflow
* fix heap use after free
* added comments to the main stream receiver loop
* 3 dispatchers
* single threaded receiver and sender
* code cleanup
* de-associate hosts from streaming threads when both the receiver and sender stop, so that each time the threads are re-balanced
* fix heap use after free
* properly get the slot number of pollfd
* fixes
* fixes
* revert worker changes
* reuse streaming threads
* backfilling should be synchronous
* remove the node last
* do not keep a pointer to reallocatable buffer
* give to pgc the right page size, not less
* restore spreading metrics size across time
* use the calculated slots for gorilla pages
* accurately track gorilla page size changes
* check the sth pointer for validity
* code cleanup, files re-org and renames to reflect the new structure of streaming
* updated referenced size when the size of a page changes; removed flush spins - flushes cancelled is a waste event
* improve families in netdata statistics
* page size histogram per cache
* page size histogram per cache queue (hot, dirty, clean)
* fix heap use after free in pdc.c
* rw_spinlocks: when preferring a writer yield so that the writer has the chance to get the lock
* do not balloon open and extent caches more than needed (it fragments memory and there is not enough memory for the main cache)
* fixed typo
* enable trace allocations to work
* Skip adding kmeans model when ML dimension has not been created.
* PGD is now entirely on ARAL for all types of pages
* 2 partitions for PGD
* Check for ML queue prior to pushing as well.
* merge multiple arals, to avoid wasting memory
* significantly less arals; proper calculation of gorilla efficiency
* report pgd buffers separately from pgc
* aral only for sizes less than 512 bytes
* tune aral caches
* log the functions using the streaming buffer when concurrent use is detected
* aral supporting different pages for collected pages and clean pages - an attempt to minimize fragmentation at high performance
* fix misuse of sender thread buffers
* select the right buffer, based on the receiver tid
* no more rrdpush, renamed to stream
* lower aral max page size to 16KiB - in an attempt to lower fragmentation under memory pressure
* update opcode handling
* automatic sizing of aral limiting its size to 200 items per page or 4 x system pages
* tune cache eviction strategy
* renamed global statistics to telemetry and split it into multiple files
* left over renames of global statistics to telemetry
* added heatmap to chart types
* note about re-balancing a parents cluster
* fix formatting
* added aral telemetry to find the fragmentation per aral
* experiment with a different strategy when making clean pages: always append so that the cache is being constantly rotated; aral telemetry reports utilization instead of fragmentation
* aral now takes into account waiting deallocators when it creates new pages
* split netdata-conf functions into multiple files; added dbengine use all caches and dbengine out of memory protection settings
* tune cache eviction strategy
* cache parameters cleanup
* rename mem_available to system_memory
* Fix variable type.
* Add fuzzer for pluginsd line splitter.
* use cgroup v1 and v2 to detect memory protection; log on start the detection of memory
* fixed typo
* added logs about system memory detection
* remove debug logs from system memory detection
* move the rest of dbengine config to netdata-conf
* respect streaming buffer size configured
* add workers to pgc eviction threads
* renamed worker
* fixed flip-flop in size and entries conversions
* use aral_by_size when we actually aggregate stats to aral by size
* use keyword definitions
* move opcode definitions to stream-thread.h
* swap struct pollfd slots to make sure all the sockets have an equal chance of being processed
* Revert "Add fuzzer for pluginsd line splitter."
  This reverts commit 454cbcf.
* Revert "Use re2c for the line splitting pluginsd."
  This reverts commit 2b2f9d3.
* stream thread use judy arrays instead of linked lists and pre-allocated arrays
* added comment about pfd structure on sender and receiver
* fixed logs and made the default sender timeout 5 seconds
* Spawn ML worker threads based on number of CPUs.
* Add statistics for ML allocations/deallocations.
* Add host flag to check for pending alert transitions to save
  Remove precompiled statements
  Offload processing of alerts in the event loop
  Queue alert transitions to the metadata event loop to be saved
  Run metadata checks every 5 seconds
* do not block doing socket retries when errno indicates EWOULDBLOCK; insist sending data in send_to_plugin()
* Revert "Add host flag to check for pending alert transitions to save"
  This reverts commit 86ade0e.
* fix error reasons
* Disable ML memory statistics when using mimalloc
* add reason when ml cannot acquire the dimension
* added ML memory and depending on the DICT_WITH_STATS define, add aral by size too
* do not stream ML when the parent does not have ML enabled
* nd_poll() to overcome the starvation of poll() and use epoll() under Linux
* nd_poll() optimization to minimize the number of system calls
* nd_poll() fix
* nd_poll() fix again
* make glibc release memory to the system when the system is critical in memory
* try bigger aral pages, to enable releasing memory back to the system
* Queue alert transitions to the metadata event loop (global list not per host)
  Add host count to check for pending alert transitions to save
  Remove precompiled statements
  Offload processing of alerts in the event loop
  Run metadata checks every 5 seconds
* round robin aral allocations
* fix aral round robin
* ask glibc to release memory when the allocations are aggressive
* tinysleep yields the processor instead of waiting
* run malloc_trim() more frequently
* Add reference count on alarm_entry
* selective tinysleep and processor yielding
* revert gorilla batch writes
* codacy fixes

---------

Co-authored-by: vkalintiris <[email protected]>
Co-authored-by: Stelios Fragkakis <[email protected]>