Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Processor crashed with missing blocks despite correct DB entries #253

Open
poka-IT opened this issue Mar 7, 2024 · 11 comments
Open

Processor crashed with missing blocks despite correct DB entries #253

poka-IT opened this issue Mar 7, 2024 · 11 comments

Comments

@poka-IT
Copy link

poka-IT commented Mar 7, 2024

My subsquid node processor crashed with this logs:

duniter-squid-processor-1     | {"level":2,"time":1709683176077,"ns":"sqd:processor","msg":"435159 / 435159, rate: 0 blocks/sec, mapping: 85 blocks/sec, 254 items/sec, eta: 0s"}
duniter-squid-processor-1     | {"level":2,"time":1709683182174,"ns":"sqd:processor","msg":"435160 / 435160, rate: 0 blocks/sec, mapping: 86 blocks/sec, 259 items/sec, eta: 0s"}
duniter-squid-processor-1     | {"level":2,"time":1709683188188,"ns":"sqd:processor","msg":"435161 / 435161, rate: 0 blocks/sec, mapping: 91 blocks/sec, 274 items/sec, eta: 0s"}
duniter-squid-processor-1     | {"level":2,"time":1709683194091,"ns":"sqd:processor","msg":"435162 / 435162, rate: 0 blocks/sec, mapping: 90 blocks/sec, 270 items/sec, eta: 0s"}
duniter-squid-processor-1     | {"level":2,"time":1709683200099,"ns":"sqd:processor","msg":"435163 / 435163, rate: 0 blocks/sec, mapping: 90 blocks/sec, 269 items/sec, eta: 0s"}
duniter-squid-processor-1     | {"level":2,"time":1709683608909,"ns":"sqd:processor","msg":"435172 / 435172, rate: 0 blocks/sec, mapping: 138 blocks/sec, 415 items/sec, eta: 0s"}
duniter-squid-processor-1     | {"level":5,"time":1709683609149,"ns":"sqd:processor","err":{"generatedMessage":true,"code":"ERR_ASSERTION","actual":false,"expected":true,"operator":"==","stack":"AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:\n\n  (0, assert_1.default)(head.height >= this.chain[0].height)\n\n    at HotProcessor.moveToBlocks (/squid/node_modules/@subsquid/util-internal-ingest-tools/lib/hot.js:60:30)\n    at HotProcessor.goto (/squid/node_modules/@subsquid/util-internal-ingest-tools/lib/hot.js:38:24)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async RpcDataSource.handleNewHeads (/squid/node_modules/@subsquid/substrate-data-raw/lib/datasource.js:213:24)\n    at async RpcDataSource.subscription (/squid/node_modules/@subsquid/substrate-data-raw/lib/datasource.js:198:25)\n    at async RpcDataSource.processHotBlocks (/squid/node_modules/@subsquid/substrate-data-raw/lib/datasource.js:113:13)"}}

Error: [ERR_ASSERTION]: The expression evaluated to a falsy value:\n\n (0, assert_1.default)(head.height >= this.chain[0].height)

It look likes it missed 8 higher blocks, between 435163 and 435172, which is the cause of this error.

This indexer is listening a local duniter v2s archive node.

indexer enpoind: https://gdev-squid.axiom-team.fr/v1/graphql

duniter endpoint: wss://gdev.p2p.legal/ws


We observe that the problem occurs exactly at midnight, which would be a strange coincidence:

Bloc 435160 : 2024-03-05 23:59:42
Bloc 435161 : 2024-03-05 23:59:48
Bloc 435162 : 2024-03-05 23:59:54
Bloc 435163 : 2024-03-06 00:00:00
Bloc 435172 : 2024-03-06 00:06:48

We also notice a gap of 6 minutes and 48 seconds between block 435163 and 435172, whereas only 48 seconds should have passed.


Disturbingly, the missing blocks in the logs seem to be present in the DB, and correctly dated:

{
  "data": {
    "block": [
      {
        "height": 435172,
        "timestamp": "2024-03-06T00:01:06+00:00"
      },
      {
        "height": 435171,
        "timestamp": "2024-03-06T00:01:00+00:00"
      },
      {
        "height": 435170,
        "timestamp": "2024-03-06T00:00:54.001+00:00"
      },
      {
        "height": 435169,
        "timestamp": "2024-03-06T00:00:48.002+00:00"
      },
      {
        "height": 435168,
        "timestamp": "2024-03-06T00:00:42+00:00"
      },
      {
        "height": 435167,
        "timestamp": "2024-03-06T00:00:24+00:00"
      },
      {
        "height": 435166,
        "timestamp": "2024-03-06T00:00:18+00:00"
      },
      {
        "height": 435165,
        "timestamp": "2024-03-06T00:00:12.001+00:00"
      },
      {
        "height": 435164,
        "timestamp": "2024-03-06T00:00:06.001+00:00"
      },
      {
        "height": 435163,
        "timestamp": "2024-03-06T00:00:00.001+00:00"
      },
      {
        "height": 435162,
        "timestamp": "2024-03-05T23:59:54+00:00"
      },
      {
        "height": 435161,
        "timestamp": "2024-03-05T23:59:48+00:00"
      },
      {
        "height": 435160,
        "timestamp": "2024-03-05T23:59:42.001+00:00"
      },
      {
        "height": 435159,
        "timestamp": "2024-03-05T23:59:36.001+00:00"
      },
      {
        "height": 435158,
        "timestamp": "2024-03-05T23:59:30.001+00:00"
      }
    ]
  }
}

Here a simple processor restart solve the problem, but whatever the cause of the problem, processor should be able to recover using its checkpoints, without manual restart.

Version: "@subsquid/substrate-processor": "^8.1.1"

@poka-IT poka-IT changed the title Subsquid Processor Crash with Missing Blocks Despite Correct DB Entries Subsquid processor crash with missing blocks despite correct DB entries Mar 7, 2024
@poka-IT poka-IT changed the title Subsquid processor crash with missing blocks despite correct DB entries Processor crashed with missing blocks despite correct DB entries Mar 7, 2024
@belopash
Copy link
Contributor

belopash commented Mar 7, 2024

Not sure that caused the error, but blocks weren't missed. More looks like there was some issue on the rpc node side due to that it was unable to return new blocks for a certain period of time, and these 8 blocks weren't printed in the logs because only the last block of the batch is printed.

Regarding to the error itself, it seems that there was a fork/rollback, but processor didn't handle it correctly, will investigate.

@poka-IT
Copy link
Author

poka-IT commented Jun 26, 2024

@belopash is this completed issue link to a MR ?

@belopash
Copy link
Contributor

@belopash is this completed issue link to a MR ?

Is it still relevant?

Perhaps this should be fixed with
5b076d9
fb6f10a

@poka-IT
Copy link
Author

poka-IT commented Jun 26, 2024

Perhaps
Didn't throw it again, so let's says it's ok =)

@poka-IT
Copy link
Author

poka-IT commented Jul 12, 2024

@belopash I just have the same issue again with the processor:

duniter-squid-processor-1     | {"level":2,"time":1720554246107,"ns":"sqd:processor","msg":"2169697 / 2169697, rate: 0 blocks/sec, mapping: 43 blocks/sec, 129 items/sec, eta: 0s"}
duniter-squid-processor-1     | {"level":2,"time":1720554252211,"ns":"sqd:processor","msg":"2169698 / 2169698, rate: 0 blocks/sec, mapping: 43 blocks/sec, 130 items/sec, eta: 0s"}
duniter-squid-processor-1     | {"level":2,"time":1720554264256,"ns":"sqd:processor","msg":"2169699 / 2169699, rate: 0 blocks/sec, mapping: 31 blocks/sec, 93 items/sec, eta: 0s"}
duniter-squid-processor-1     | {"level":2,"time":1720554708561,"ns":"sqd:processor","msg":"2169707 / 2169707, rate: 0 blocks/sec, mapping: 25 blocks/sec, 75 items/sec, eta: 0s"}
duniter-squid-processor-1     | {"level":5,"time":1720554708768,"ns":"sqd:processor","err":{"generatedMessage":true,"code":"ERR_ASSERTION","actual":false,"expected":true,"operator":"==","stack":"AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:\n\n  (0, assert_1.default)(head.height >= this.chain[0].height)\n\n    at HotProcessor.moveToBlocks (/squid/node_modules/@subsquid/util-internal-ingest-tools/lib/hot.js:60:30)\n    at HotProcessor.goto (/squid/node_modules/@subsquid/util-internal-ingest-tools/lib/hot.js:38:24)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async RpcDataSource.handleNewHeads (/squid/node_modules/@subsquid/substrate-data-raw/lib/datasource.js:213:24)\n    at async RpcDataSource.subscription (/squid/node_modules/@subsquid/substrate-data-raw/lib/datasource.js:198:25)\n    at async RpcDataSource.processHotBlocks (/squid/node_modules/@subsquid/substrate-data-raw/lib/datasource.js:113:13)"}}

Nothing happend after this.
Zoom on the error:

{
    "generatedMessage": true,
    "code": "ERR_ASSERTION",
    "actual": false,
    "expected": true,
    "operator": "==",
    "stack": "AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:\n\n" +
             "  (0, assert_1.default)(head.height >= this.chain[0].height)\n\n" +
             "    at HotProcessor.moveToBlocks (/squid/node_modules/@subsquid/util-internal-ingest-tools/lib/hot.js:60:30)\n" +
             "    at HotProcessor.goto (/squid/node_modules/@subsquid/util-internal-ingest-tools/lib/hot.js:38:24)\n" +
             "    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n" +
             "    at async RpcDataSource.handleNewHeads (/squid/node_modules/@subsquid/substrate-data-raw/lib/datasource.js:213:24)\n" +
             "    at async RpcDataSource.subscription (/squid/node_modules/@subsquid/substrate-data-raw/lib/datasource.js:198:25)\n" +
             "    at async RpcDataSource.processHotBlocks (/squid/node_modules/@subsquid/substrate-data-raw/lib/datasource.js:113:13)"
}

So this issue is not fixed =)

@belopash
Copy link
Contributor

@poka-IT seems you are still using old packages. This is how the current code looks like, as you can see the assertion is on a different line. I suggest you to delete both lockfile and node_modules/@subsquid folder, and then run reinstall
image

@Hugo-Trentesaux
Copy link

Hugo-Trentesaux commented Sep 20, 2024

After an update, getting the same error with assertion on line 62 as on your screenshot.

AssertionError [ERR_ASSERTION]: The expression evaluated to a falsy value:
  (0, assert_1.default)(head.height >= this.chain[0].height)
    at HotProcessor.moveToBlocks (/squid/node_modules/.pnpm/@[email protected]_@[email protected]_@subs_pouyxye7trl5ald52mmup34cna/node_modules/@subsquid/util-internal-ingest-tools/lib/hot.js:62:30)
    at HotProcessor.goto (/squid/node_modules/.pnpm/@[email protected]_@[email protected]_@subs_pouyxye7trl5ald52mmup34cna/node_modules/@subsquid/util-internal-ingest-tools/lib/hot.js:38:24)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async RpcDataSource.handleNewHeads (/squid/node_modules/.pnpm/@[email protected]_@[email protected]_@[email protected]_@subsqui_vn76oakevpk2lvovyggbaloyyq/node_modules/@subsquid/substrate-data-raw/lib/datasource.js:220:24)
    at async RpcDataSource.subscription (/squid/node_modules/.pnpm/@[email protected]_@[email protected]_@[email protected]_@subsqui_vn76oakevpk2lvovyggbaloyyq/node_modules/@subsquid/substrate-data-raw/lib/datasource.js:205:25)
    at async RpcDataSource.processHotBlocks (/squid/node_modules/.pnpm/@[email protected]_@[email protected]_@[email protected]_@subsqui_vn76oakevpk2lvovyggbaloyyq/node_modules/@subsquid/substrate-data-raw/lib/datasource.js:120:13)

Here you can see the current version of our dependencies: https://git.duniter.org/nodes/duniter-squid/-/blob/main/package.json?ref_type=heads:

    "@subsquid/substrate-data-raw": "^1.2.0",
    "@subsquid/substrate-processor": "^8.5.1",
    "@subsquid/substrate-runtime": "^2.0.0",
    "@subsquid/util-internal": "^3.2.0",

@eldargab
Copy link
Collaborator

Did it happen again or you are trying to run against the same DB?

@eldargab eldargab reopened this Sep 20, 2024
@Hugo-Trentesaux
Copy link

Yes, it happened again on a fresh db. Currently our indexer is still moving a lot so we are re-indexing from scratch at every update that touches the db. Multiple instances met the same bug.

@eldargab
Copy link
Collaborator

@Hugo-Trentesaux Is this endpoint still valid - wss://gdev.p2p.legal/ws? Is it what you use?

@Hugo-Trentesaux
Copy link

Hugo-Trentesaux commented Sep 23, 2024

Yes, this endpoint is on our current dev network. You can also use:

  • wss://gdev.coinduf.eu
  • wss://gdev.gyroi.de
  • wss://gdev.cgeek.fr

And see other nodes connected to this network on the telemetry: https://telemetry.polkadot.io/#list/0xc184c4ccde8e771483bba7a01533d007a3e19a66d3537c7fd59c5d9e3550b6c3

About the indexers, you can interact with graphql API on :

  • https://squid.gdev.gyroi.de/v1/graphql
  • https://squid.gdev.coinduf.eu/v1/graphql (currently stopped because of this bug)

and see latest indexed block with:

query LatestBlock {
  block(orderBy: {height: DESC}, limit: 1) {
    timestamp
    height
    hash
  }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants
@eldargab @poka-IT @Hugo-Trentesaux @belopash and others