Index Corruption During Large Import #303

Open
kierangirvan opened this issue Oct 29, 2024 · 12 comments
@kierangirvan

Describe the bug
We regularly import a large jtl file after each test (we've previously discussed feeding the file in during the test, but that is not possible for us). The import has worked fine until now. However, we recently increased our load profile, so the jtl file has grown from 800 MB to 1 GB, and as a result we are seeing an index corruption message in the backend logs.

{"level":"error","message":"Error occurred during data parsing and saving into database: error: index \"_hyper_1_14_chunk_samples_item_idx\" contains unexpected zero page at block 28401, item_id: fb38060d-63a9-4703-b278-5a972734d624"}

We have reindexed this index and tried the import again, but it fails with the same error message.

To Reproduce
Try to import a large (>1 GB) jtl file.

Expected behavior
The jtl file uploads successfully.

Screenshots
Screenshot 2024-10-29 at 17 02 28

@ludeknovy
Owner

Hi @kierangirvan,
which version are you running?

@kierangirvan
Author

Looks like v4.10

@ludeknovy
Owner

Thanks. My advice would be to upgrade to v5, as it uses a newer version of TimescaleDB.

But to be fair, I have no idea whether this is going to solve your problem or not :|

@kierangirvan
Author

We've upgraded to v5 and are still seeing the same exception.

@kierangirvan
Author

We've decided to completely rebuild the service (unfortunately meaning we lose our previous tests), but this seems to have introduced a new issue:
{"level":"error","message":"Error occurred during samples aggregation in database: Error while processing dataId: 28f5f5c9-ac17-455d-9e97-49e4f7e50530 for item: 28f5f5c9-ac17-455d-9e97-49e4f7e50530, error: error: function lttb(timestamp with time zone, integer, unknown) does not exist, item_id: 28f5f5c9-ac17-455d-9e97-49e4f7e50530"}

@ludeknovy
Owner

What kind of database image do you use? It looks like you are not using the one from here: https://github.com/ludeknovy/jtl-reporter/blob/main/db/Dockerfile

@kierangirvan
Author

I've just reproduced this by cloning the main branch, running docker-compose up, and attempting to import a tiny jtl file:
{"level":"info","message":"Starting KPI file streaming and saving to db, item_id: d5087134-684a-4ee2-8ef7-babe379e89e8"}
{"level":"info","message":"Deleting file: uploads/ecc7cc4eebd1f00a4081cc4d850feaab"}
{"level":"info","message":"Parsed 9 records in 0.078 seconds"}
{"level":"error","message":"Error occurred during samples aggregation in database: Error while processing dataId: d5087134-684a-4ee2-8ef7-babe379e89e8 for item: d5087134-684a-4ee2-8ef7-babe379e89e8, error: error: function lttb(timestamp with time zone, integer, unknown) does not exist, item_id: d5087134-684a-4ee2-8ef7-babe379e89e8"}
{"level":"info","message":"Trying to set item: d5087134-684a-4ee2-8ef7-babe379e89e8 to error state."}
{"level":"info","message":"Deleting file: uploads/ecc7cc4eebd1f00a4081cc4d850feaab"}
{"level":"error","message":"File uploads/ecc7cc4eebd1f00a4081cc4d850feaab does not exist anymore"}
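For anyone sifting through similar backend output, here is a minimal Python sketch that pulls only the error-level entries out of a log dump like the one above. It assumes each entry is a JSON object with `level` and `message` keys, as in these logs; the helper names `iter_log_entries` and `error_messages` and the shortened sample are made up for illustration and are not part of jtl-reporter.

```python
import json


def iter_log_entries(text):
    """Yield each JSON object found in text.

    Objects may be separated by newlines or plain spaces, so this
    works whether or not the log lines were flattened into one line.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between consecutive objects.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed object and the index where it ended.
        entry, pos = decoder.raw_decode(text, pos)
        yield entry


def error_messages(text):
    """Return the message of every entry whose level is "error"."""
    return [e["message"] for e in iter_log_entries(text) if e.get("level") == "error"]


# Shortened sample taken from the log output above.
SAMPLE_LOG = (
    '{"level":"info","message":"Parsed 9 records in 0.078 seconds"} '
    '{"level":"error","message":"File uploads/ecc7cc4eebd1f00a4081cc4d850feaab does not exist anymore"}'
)

for message in error_messages(SAMPLE_LOG):
    print(message)
```

Running it against the full dump should surface the `lttb` failure first, which is the entry that matters here; the later "does not exist anymore" error looks like a follow-on from the file having already been deleted.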

@kierangirvan
Author

My cloud engineer shared the following from our AWS-hosted jtl reporter. It is the same as my local install:
postgres=# \dx
                                          List of installed extensions
    Name     | Version |   Schema   |                                     Description
-------------+---------+------------+--------------------------------------------------------------------------------------
 plpgsql     | 1.0     | pg_catalog | PL/pgSQL procedural language
 timescaledb | 2.14.2  | public     | Enables scalable inserts and complex queries for time-series data (Apache 2 Edition)
 uuid-ossp   | 1.1     | public     | generate universally unique identifiers (UUIDs)

@ludeknovy
Owner

A missing lttb function means you don't have the TimescaleDB Toolkit installed.
If you use FROM timescale/timescaledb-ha:pg16.4-ts2.16.1-all, you don't have to do anything special, as the toolkit is already preinstalled. If you've chosen a different database image, you have to install it manually.
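To check this against a `\dx` listing like the one posted above, a minimal Python sketch follows. It assumes the toolkit shows up in `\dx` under the extension name `timescaledb_toolkit`; the helper names and the trimmed sample listing are illustrative only and not part of jtl-reporter.

```python
# Extensions this sketch expects; "timescaledb_toolkit" is assumed to be
# the name under which the toolkit (which provides lttb) is registered.
REQUIRED = ("timescaledb", "timescaledb_toolkit")


def installed_extensions(dx_output):
    """Collect extension names (first column) from psql \\dx output."""
    names = set()
    for line in dx_output.splitlines():
        # Rows use "|" as the column separator; title and rule lines do not.
        if "|" not in line:
            continue
        name = line.split("|", 1)[0].strip()
        if name and name != "Name":  # skip the header row
            names.add(name)
    return names


def missing_extensions(dx_output, required=REQUIRED):
    """Return the required extensions that are absent from the listing."""
    present = installed_extensions(dx_output)
    return [name for name in required if name not in present]


# Trimmed copy of the listing shared earlier in this thread.
SAMPLE_DX = """\
    Name     | Version |   Schema   | Description
-------------+---------+------------+------------------------------
 plpgsql     | 1.0     | pg_catalog | PL/pgSQL procedural language
 timescaledb | 2.14.2  | public     | Enables scalable inserts
 uuid-ossp   | 1.1     | public     | generate UUIDs
"""

print(missing_extensions(SAMPLE_DX))
```

If the toolkit is reported missing and the database image ships its files, running `CREATE EXTENSION IF NOT EXISTS timescaledb_toolkit;` in the target database should register it. On an image without those files, that statement fails and the toolkit package has to be installed first.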

@kierangirvan
Author

I have fed this back to the cloud engineer; I'll let you know if this fixes the issue.

Obviously the original error in this ticket is still a concern, as we had to entirely rebuild the stack and therefore lost all of our historical tests.

@ludeknovy
Owner

But why didn't you make a backup of your data?

@kierangirvan
Author

We do have backups of the jtl files, so we can re-upload them.
