Memory leak when saving and indexing (dreyfus/clouseau) a big document (10MiB) #54

Open
rmartinez-dasnano opened this issue Aug 1, 2021 · 0 comments

Comments

@rmartinez-dasnano

Originally posted to the couchdb project, but it seems to be related to dreyfus.

We are storing a big document in CouchDB using the IBM docker image (couchdb + lucene index) based on CouchDB 3.1.1 (ibmcom/couchdb3:3.1.1).
One of the fields in the document contains an array of arrays created from a 10 MiB CSV. When this document is created, container memory usage starts growing until it reaches the memory limit for the container (8 GiB), the host limit (running on a 16 GiB machine), or a process limit:

  • If machine limit is reached, oom-killer kills beam.smp
  • If container limit is reached, container is restarted.
  • If neither of these limits is reached, the issue seems to be in the dreyfus index updater. There seems to be an OOM in the OS process:
out of memory
[info] 2021-08-01T10:09:57.661891Z [email protected] <0.236.0> -------- couch_proc_manager <0.18737.1> died normal
[error] 2021-08-01T10:09:57.661891Z [email protected] <0.22307.1> -------- OS Process Error <0.18737.1> :: {os_process_error,{exit_status,1}}
[error] 2021-08-01T10:09:57.662507Z [email protected] emulator -------- Error in process <0.22307.1> on node '[email protected]' with exit value:
{{nocatch,{os_process_error,{exit_status,1}}},[{couch_os_process,prompt,2,[{file,"src/couch_os_process.erl"},{line,59}]},{couch_query_servers,proc_prompt,2,[{file,"src/couch_query_servers.erl"},{line,520}]},{dreyfus_index_updater,update_or_delete_index,4,[{file,"src/dreyfus_index_updater.erl"},{line,141}]},{dreyfus_index_updater,load_docs,2,[{file,"src/dreyfus_index_updater.erl"},{line,80}]},{couch_bt_engine,drop_reductions,4,[{file,"src/couch_bt_engine.erl"},{line,1177}]},{couch_btree,stream_kv_node2,8,[{file,"src/couch_btree.erl"},{line,851}]},{couch_btree,stream_kp_node,7,[{file,"src/couch_btree.erl"},{line,778}]},{couch_btree,fold,4,[{file,"src/couch_btree.erl"},{line,224}]}]}

JavaScript fragment in the design document that indexes this field: it joins position 1 of each inner array into one variable, which is then indexed in a single call to the index function (the built string is ~11 MiB). Calling the index function once per row of the array instead gives the same error.

if (doc.content) {
    // Concatenate position 1 of every row into one space-separated string.
    var stringToIndex = "";
    for (var i = 0; i < doc.content.length; i++) {
        stringToIndex = stringToIndex + doc.content[i][1] + " ";
    }
    // Index the whole string (~11 MiB) in a single call.
    index("doc_number", stringToIndex, {"boost": 1, "facet": false, "index": true, "store": false});
}
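For reference, the per-row variant mentioned above could look roughly like the sketch below. This is not the exact code we ran: the function name indexPerRow and the fakeIndex stand-in are hypothetical, and index is passed in as a parameter only so the fragment can run outside CouchDB (inside a design document, index is provided by the runtime).

```javascript
// Sketch of the per-row variant: one index() call per row instead of one
// call with an ~11 MiB string. `indexPerRow` is a hypothetical name.
function indexPerRow(doc, index) {
    if (!Array.isArray(doc.content)) {
        return 0;
    }
    var calls = 0;
    for (var i = 0; i < doc.content.length; i++) {
        var value = doc.content[i][1];
        // Skip rows that have no value at position 1.
        if (value !== undefined && value !== null) {
            index("doc_number", String(value), {"boost": 1, "facet": false, "index": true, "store": false});
            calls++;
        }
    }
    return calls;
}

// Stand-in for the runtime-provided index() function (assumption, for local runs only).
var recorded = [];
function fakeIndex(name, value, options) {
    recorded.push([name, value]);
}

var count = indexPerRow({content: [["a", "111"], ["b", "222"]]}, fakeIndex);
```

In our case this variant fails with the same os_process_error as the single-call version.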

To avoid the os_process_error, we tried to increase the maximum memory of the couchjs processes through the COUCHDB_QUERY_SERVER_JAVASCRIPT environment variable:

environment:
  - COUCHDB_USER=****
  - COUCHDB_PASSWORD=****
  - COUCHDB_QUERY_SERVER_JAVASCRIPT="/opt/couchdb/bin/couchjs -S 536870912 /opt/couchdb/share/server/main.js"

But it seems to have no effect: the couchjs processes listed below are still started without the -S option.

docker container top 5bb095ba7a85
UID                 PID                 PPID                C                   STIME               TTY                 TIME                CMD
1001                333932              333911              0                   09:21               ?                   00:00:00            runsvdir -P -H /etc/service log: ...........................................................................................................................................................................................................................................................................................................................................................................................................
1001                334124              333932              0                   09:21               ?                   00:00:00            runsv couchdb
1001                334125              333932              0                   09:21               ?                   00:00:00            runsv couchdb-search
1001                334127              334125              0                   09:21               ?                   00:00:56            java -server -Xmx2G -Dsun.net.inetaddr.ttl=30 -Dsun.net.inetaddr.negative.ttl=30 -Dlog4j.configuration=file:/opt/couchdb-search/etc/log4j.properties -XX:OnOutOfMemoryError=kill -9 %p -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -classpath /opt/couchdb-search/lib/* com.cloudant.clouseau.Main /opt/couchdb-search/etc/clouseau.ini
1001                334141              333932              0                   09:21               ?                   00:00:00            /opt/couchdb/bin/../erts-9.3.3.14/bin/epmd -daemon
1001                366563              334124              13                  11:41               ?                   00:09:12            /opt/couchdb/bin/../erts-9.3.3.14/bin/beam.smp -K true -A 16 -Bd -- -root /opt/couchdb/bin/.. -progname couchdb -- -home /opt/couchdb -- -boot /opt/couchdb/bin/../releases/3.1.1/couchdb -kernel inet_dist_listen_min 9100 -kernel inet_dist_listen_max 9100 -crypto fips_mode true -kernel error_logger silent -sasl sasl_error_logger false -noshell -noinput -name [email protected] -config /opt/couchdb/bin/../releases/3.1.1/sys.config -setcookie monster
1001                366594              366563              0                   11:41               ?                   00:00:00            erl_child_setup 1048576
1001                366647              366594              0                   11:41               ?                   00:00:00            inet_gethost 4
1001                366648              366647              0                   11:41               ?                   00:00:00            inet_gethost 4
1001                367618              366594              0                   11:45               ?                   00:00:30            ./bin/couchjs ./share/server/main.js
1001                367619              366594              0                   11:45               ?                   00:00:30            ./bin/couchjs ./share/server/main.js
1001                367659              366594              0                   11:45               ?                   00:00:29            ./bin/couchjs ./share/server/main.js
1001                369407              366594              0                   11:55               ?                   00:00:16            ./bin/couchjs ./share/server/main.js
1001                369417              366594              0                   11:55               ?                   00:00:17            ./bin/couchjs ./share/server/main.js
1001                369427              366594              0                   11:55               ?                   00:00:17            ./bin/couchjs ./share/server/main.js
1001                370634              366594              0                   12:07               ?                   00:00:12            ./bin/couchjs ./share/server/main.js
1001                370808              366594              0                   12:09               ?                   00:00:06            ./bin/couchjs ./share/server/main.js

Steps to Reproduce
Start docker image ibmcom/couchdb3:3.1.1
Create the document
Wait 5 seconds and memory will grow from approximately 350 MiB to 6 GiB
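
A document of roughly the right shape for the "Create the document" step can be generated with a sketch like the one below. The row layout, the field values, the _id and the buildBigDoc helper are assumptions for illustration; our real rows come from a CSV file. Only position 1 of each row matters to the index function.

```javascript
// Hypothetical helper: builds a document whose `content` field is an array of
// arrays and whose JSON serialization is at least `targetBytes` long.
function buildBigDoc(targetBytes) {
    var content = [];
    var size = 0;
    var i = 0;
    while (size < targetBytes) {
        // Row shape is an assumption; the index function only reads position 1.
        var row = [String(i), "value-" + i, "padding-padding-padding"];
        content.push(row);
        size += JSON.stringify(row).length + 1; // +1 for the separating comma
        i++;
    }
    return {_id: "big-doc", content: content};
}

var doc = buildBigDoc(10 * 1024 * 1024);
var body = JSON.stringify(doc); // request body to send to the database
```

The resulting body can then be sent to the database with any HTTP client.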

Expected Behaviour

  • Memory consumption does not grow so much. A 10 MiB document causing 6 GiB of memory consumption looks like a memory leak.
  • COUCHDB_QUERY_SERVER_JAVASCRIPT env variable should work as expected

Environment

  • CouchDB version used: 3.1.1
  • Docker version: Docker version 20.10.7, build f0df350
  • Container limits:
    deploy:
      resources:
        limits:
          cpus: "4"
          memory: 8192M
        reservations:
          cpus: "4"
          memory: 4096M
  • Couchdb configuration:
[attachments] compressible_types="text/*, application/javascript, application/json, application/xml"
[attachments] compression_level="8"
[chttpd] backlog="512"
[chttpd] bind_address="any"
[chttpd] max_db_number_for_dbs_info_req="100"
[chttpd] port="5984"
[chttpd] prefer_minimal="Cache-Control, Content-Length, Content-Range, Content-Type, ETag, Server, Transfer-Encoding, Vary"
[chttpd] require_valid_user="false"
[chttpd] server_options="[{backlog, 512}, {acceptor_pool_size, 64}, {max, 4096}]"
[chttpd] socket_options="[{sndbuf, 262144}, {nodelay, true}]"
[cluster] n="3"
[cluster] q="2"
[cors] credentials="false"
[couch_httpd_auth] allow_persistent_cookies="true"
[couch_httpd_auth] auth_cache_size="50"
[couch_httpd_auth] authentication_db="_users"
[couch_httpd_auth] authentication_redirect="/_utils/session.html"
[couch_httpd_auth] iterations="10"
[couch_httpd_auth] require_valid_user="false"
[couch_httpd_auth] secret="aaaaaa"
[couch_httpd_auth] timeout="600"
[couch_peruser] database_prefix="userdb-"
[couch_peruser] delete_dbs="false"
[couch_peruser] enable="false"
[couchdb] attachment_stream_buffer_size="4096"
[couchdb] changes_doc_ids_optimization_threshold="100"
[couchdb] database_dir="./data"
[couchdb] default_engine="couch"
[couchdb] default_security="admin_only"
[couchdb] file_compression="snappy"
[couchdb] max_dbs_open="10000"
[couchdb] max_document_size="4294967296"
[couchdb] os_process_timeout="120000"
[couchdb] single_node="true"
[couchdb] users_db_security_editable="false"
[couchdb] uuid="aaaa"
[couchdb] view_index_dir="./data"
[couchdb_engines] couch="couch_bt_engine"
[csp] enable="true"
[dreyfus] name="[email protected]"
[fabric] request_timeout="infinity"
[feature_flags] partitioned||*="true"
[httpd] allow_jsonp="false"
[httpd] authentication_handlers="{couch_httpd_auth, cookie_authentication_handler}, {couch_httpd_auth, default_authentication_handler}"
[httpd] bind_address="any"
[httpd] enable_cors="false"
[httpd] enable_xframe_options="false"
[httpd] max_http_request_size="4294967296"
[httpd] port="5986"
[httpd] secure_rewrites="true"
[httpd] socket_options="[{sndbuf, 262144}]"
[indexers] couch_mrview="true"
[ioq] concurrency="10"
[ioq] ratio="0.01"
[ioq.bypass] compaction="false"
[ioq.bypass] os_process="true"
[ioq.bypass] read="true"
[ioq.bypass] shard_sync="false"
[ioq.bypass] view_update="true"
[ioq.bypass] write="true"
[log] level="debug"
[log] writer="stderr"
[query_server_config] os_process_limit="2000"
[query_server_config] os_process_soft_limit="1000"
[query_server_config] reduce_limit="true"
[replicator] connection_timeout="30000"
[replicator] http_connections="20"
[replicator] interval="60000"
[replicator] max_churn="20"
[replicator] max_jobs="500"
[replicator] retries_per_request="5"
[replicator] socket_options="[{keepalive, true}, {nodelay, false}]"
[replicator] ssl_certificate_max_depth="3"
[replicator] startup_jitter="5000"
[replicator] verify_ssl_certificates="false"
[replicator] worker_batch_size="500"
[replicator] worker_processes="4"
[ssl] port="6984"
[uuids] algorithm="sequential"
[uuids] max_count="1000"
[vendor] name="The Apache Software Foundation"