SNOW-1679281: High CPU Utilization and Event Loop Blockage Due to Synchronous Code Leading to CPU Blockage For large Files in PUT Command #917
Comments
Hi, thank you for raising this concern about the driver with us. We'll take a look.
Took a quick look at it. Without information on how exactly the performance and event loop blocking were measured, apparently there might be at least 2 operations in
all of the involved functions are async (except maybe the part where we normalize timestamp in gzipped file header with
In the meantime, as mitigation, you could try:
By this you could avoid blocking the event loop with the CPU-intensive operation of compressing the file inside the Node process.
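One way to keep compression out of the driver's process, as a minimal sketch: gzip the file in a separate Node process and then upload the already-compressed file with AUTO_COMPRESS=FALSE. The helper names `gzipInChildProcess` and `buildPutStatement` and the `GZIP_SRC`/`GZIP_DST` environment variables are made up for this example and are not part of the driver.

```javascript
const { spawn } = require('node:child_process');

// Script run by the child process. It reads the source and destination paths
// from environment variables (names chosen for this sketch only).
const CHILD_SCRIPT = `
  const fs = require('node:fs');
  const zlib = require('node:zlib');
  const { pipeline } = require('node:stream');
  pipeline(
    fs.createReadStream(process.env.GZIP_SRC),
    zlib.createGzip(),
    fs.createWriteStream(process.env.GZIP_DST),
    (err) => { if (err) { console.error(err); process.exitCode = 1; } }
  );
`;

// Hypothetical helper: gzip `src` into `dst` in a separate Node process, so the
// compression work is not done in the process that runs the driver.
function gzipInChildProcess(src, dst) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', CHILD_SCRIPT], {
      env: { ...process.env, GZIP_SRC: src, GZIP_DST: dst },
      stdio: ['ignore', 'ignore', 'inherit'],
    });
    child.on('error', reject);
    child.on('close', (code) =>
      code === 0
        ? resolve(dst)
        : reject(new Error(`gzip child exited with code ${code}`)));
  });
}

// Build a PUT statement for a file that is already gzip-compressed, so the
// driver is asked not to compress it again.
function buildPutStatement(gzPath) {
  return `PUT file://${gzPath} @~ AUTO_COMPRESS=FALSE SOURCE_COMPRESSION=GZIP;`;
}
```

The resulting statement would then be passed to the driver in the usual way. Whether this removes all of the observed blocking is not confirmed here, since digest calculation and encryption still happen inside the driver.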
Not sure what the ultimate goal is with the files put to the stage. If it's ingesting them with COPY INTO, that can also be run once and made to match multiple files (or run on the whole stage); it doesn't necessarily need one single file to work on. Nevertheless, we'll look at the possible enhancement options here (without any ETA), but at this stage I don't feel this is a bug per se. If you feel I'm missing something, please provide further input.
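Since COPY INTO can match multiple staged files, one option is to split the large CSV into smaller parts before uploading. The following is only a sketch: `splitCsvByRows` is a hypothetical helper, and it assumes the file has a header row and no quoted fields containing line breaks.

```javascript
const fs = require('node:fs');
const path = require('node:path');
const readline = require('node:readline');
const { once } = require('node:events');

// Hypothetical helper: split `src` into files of at most `rowsPerChunk` data
// rows each, repeating the header row at the top of every part.
async function splitCsvByRows(src, outDir, rowsPerChunk) {
  const rl = readline.createInterface({
    input: fs.createReadStream(src),
    crlfDelay: Infinity,
  });
  const parts = [];
  let header = null;
  let out = null;
  let rows = 0;

  for await (const line of rl) {
    if (header === null) {
      header = line; // first line is assumed to be the header
      continue;
    }
    if (out === null || rows === rowsPerChunk) {
      if (out) {
        out.end();
        await once(out, 'finish');
      }
      const name = path.join(
        outDir,
        `part-${String(parts.length).padStart(5, '0')}.csv`
      );
      parts.push(name);
      out = fs.createWriteStream(name);
      out.write(header + '\n');
      rows = 0;
    }
    // Respect backpressure so memory use stays bounded on large inputs.
    if (!out.write(line + '\n')) {
      await once(out, 'drain');
    }
    rows++;
  }
  if (out) {
    out.end();
    await once(out, 'finish');
  }
  return parts;
}
```

Each part can then be uploaded with its own PUT, which keeps every individual compression and upload step short.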
Edit: now I see #922 was raised by someone else for the exact same topic. Closing this one in favor of #922 for tracking.
#922 seems to be related to memory consumption. Will you be fixing CPU utilization in the same issue as well? There is significant CPU usage (in getDigestAndSizeForFile and encryption).
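For illustration only, and not the driver's actual getDigestAndSizeForFile: a digest and size can be computed chunk by chunk from a read stream, so the event loop gets a turn between chunks. The function name `digestAndSizeStreaming`, the SHA-256 algorithm, the base64 encoding, and the 1 MiB chunk size are all assumptions of this sketch.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');

// Hypothetical helper: compute a SHA-256 digest (base64) and the byte size of
// a file without loading the whole file into memory.
async function digestAndSizeStreaming(filePath) {
  const hash = crypto.createHash('sha256');
  let size = 0;
  const stream = fs.createReadStream(filePath, { highWaterMark: 1024 * 1024 });
  // Each loop iteration awaits the next chunk, which yields to the event loop.
  // hash.update() itself still runs on the main thread, but only for one
  // chunk at a time.
  for await (const chunk of stream) {
    hash.update(chunk);
    size += chunk.length;
  }
  return { digest: hash.digest('base64'), size };
}
```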
Internally, we're indeed tracking it as one enhancement.
Sure, thank you!
Please answer these questions before submitting your issue.
In order to accurately debug the issue this information is required. Thanks!
What version of NodeJS driver are you using?
1.9.3
What operating system and processor architecture are you using?
macOS arm64
What version of NodeJS are you using?
(node --version and npm --version)
node: 18.12.1, npm: 8.19.2
What are the component versions in the environment (npm list)?
Server version: E.g. 1.90.1
You may get the server version by running a query:
What did you do?
During the execution of file uploads via the put command, there are repeated instances where the event loop was blocked for over 7 seconds, leading to CPU spikes and unresponsiveness. This was identified as happening during the compression and upload stages, particularly involving synchronous code in the file transfer process.
File details: 2.5 GB CSV file, each row ~6 KB of random data. According to the PUT command response, the compressed file size is ~1.8 GB.
Running this query:
PUT file://${yourFilePath} @~ AUTO_COMPRESS=TRUE;
To reproduce the issue:
Use the Snowflake SDK put command to upload a large file (e.g., 2.5 GB CSV).
Observe the CPU utilization and event loop blockage during the process, especially in the file compression and transfer phases.
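To make the event loop blockage in these steps measurable, one option is Node's built-in `perf_hooks.monitorEventLoopDelay`. The sketch below wraps an arbitrary async task; `measureEventLoopDelay` is a hypothetical helper, and the 10 ms sampling resolution is an arbitrary choice.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: run `task` while sampling event loop delay and return
// summary statistics in milliseconds (the histogram reports nanoseconds).
async function measureEventLoopDelay(task, resolutionMs = 10) {
  const histogram = monitorEventLoopDelay({ resolution: resolutionMs });
  histogram.enable();
  // Give the sampling timer time to take its first reading before the task.
  await sleep(resolutionMs * 3);
  try {
    await task();
  } finally {
    // Let the timer fire once more so a stall at the very end is recorded.
    await sleep(resolutionMs * 3);
    histogram.disable();
  }
  return {
    maxMs: histogram.max / 1e6,
    meanMs: histogram.mean / 1e6,
    p99Ms: histogram.percentile(99) / 1e6,
  };
}
```

Passing a function that executes the PUT statement and resolves when it completes would give a concrete number to attach to a report; a `maxMs` in the range of seconds would correspond to the kind of blockage described here.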
What did you expect to see?
We suspect the synchronous code is causing this issue.
What should have happened and what happened instead?
Other observations
https://community.snowflake.com/s/article/How-to-generate-log-file-on-Snowflake-connectors
e.g
Add this to get standard output.