perf(silo-prepro): speed up by using more efficient jq command (#3396)
It turned out that `jq -c '.'` was the rate-limiting step and caused significant extra processing time.

The new command has been tested for correctness and performance, and it still validates the input.

see https://loculus.slack.com/archives/C05G172HL6L/p1733500150097089?thread_ts=1733478069.696049&cid=C05G172HL6L
corneliusroemer authored Dec 6, 2024
1 parent 32d75d8 commit 1e70d50
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion kubernetes/loculus/silo_import_job.sh
@@ -109,7 +109,7 @@ download_data() {
echo "Response should contain a total of : $expected_record_count records"

# jq validates each individual json object, to catch truncated lines
- true_record_count=$(zstd -d -c "$new_input_data_path" | jq -c . | wc -l | tr -d '[:space:]')
+ true_record_count=$(zstd -d -c "$new_input_data_path" | jq -n 'reduce inputs as $item (0; . + 1)' | tr -d '[:space:]')
echo "Response contained a total of : $true_record_count records"

if [ "$true_record_count" -ne "$expected_record_count" ]; then
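The counting step changed in this hunk can be tried in isolation. The following is a minimal sketch, not part of the commit: the `count_records` helper and the sample records are hypothetical, and the `zstd` decompression step is left out so the example only needs `jq`. With `-n`, jq does not read an input implicitly, so `inputs` streams every JSON value into `reduce`, which only increments a counter. A plausible reason for the speedup is that nothing is re-serialized and written to a pipe for `wc -l`, though the commit message does not spell this out.

```shell
#!/bin/sh
# count_records (hypothetical helper): count the JSON values on stdin.
# jq still has to parse every value, so malformed input makes it fail.
count_records() {
    jq -n 'reduce inputs as $item (0; . + 1)'
}

# Hypothetical sample data: three complete records, one per line.
good=$(printf '%s\n' '{"id":1}' '{"id":2}' '{"id":3}' | count_records)
echo "complete stream: $good records"

# Same data, but the last record is cut off mid-object.
if printf '%s\n' '{"id":1}' '{"id":2}' '{"id":' | count_records >/dev/null 2>&1; then
    echo "truncated stream: accepted"
else
    echo "truncated stream: rejected by jq"
fi
```

In the real script the input would come from `zstd -d -c "$new_input_data_path"` instead of `printf`.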