Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pgpool goes into hung state when it notices flush size is 0 pool_stream.c#L650 #79

Open
HariKalakada opened this issue Nov 10, 2024 · 9 comments
Assignees

Comments

@HariKalakada
Copy link

https://github.com/pgpool/pgpool2/blob/master/src/utils/pool_stream.c#L650

When the flush size is 0, then pgpool is going into hung

@tatsuo-ishii
Copy link
Collaborator

Can you elaborate more? What do you exactly mean "hung"?

@tatsuo-ishii tatsuo-ishii self-assigned this Nov 11, 2024
@HariKalakada
Copy link
Author

Can you elaborate more? What do you exactly mean "hung"?

pgloader successfully loads data from MSSQL's to PostgreSQL when connected directly to 5432 port of postgreSQL, but when connected via PGPOOL, it does not respond after

2024-11-11 03:06:26: pid 7: DETAIL: destination = 0 for query= "
copy public.a (a) FROM STDIN"

and here is that log after that "DEBUG: pool_flush_it: flush size: 0", it just does not move ahead

2024-11-11 03:06:26: pid 7: DETAIL:  destination = 0 for query= "
        copy public.a (a) FROM STDIN"
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:1984
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  pool_write: to backend: 0 kind:Q
2024-11-11 03:06:26: pid 7: LOCATION:  pool_stream.c:467
2024-11-11 03:06:26: pid 7: DEBUG:  pool_flush_it: flush size: 35
2024-11-11 03:06:26: pid 7: LOCATION:  pool_stream.c:650
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  waiting for query response
2024-11-11 03:06:26: pid 7: DETAIL:  waiting for backend:0 to complete the query
2024-11-11 03:06:26: pid 7: LOCATION:  pool_process_query.c:539
2024-11-11 03:06:26: pid 7: DEBUG:  pool_read: read 10 bytes from backend 0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_stream.c:199
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  extract_message: kind: G
2024-11-11 03:06:26: pid 7: LOCATION:  pool_process_query.c:4477
2024-11-11 03:06:26: pid 7: DEBUG:  extract_message: kind: G
2024-11-11 03:06:26: pid 7: LOCATION:  pool_process_query.c:4477
2024-11-11 03:06:26: pid 7: DEBUG:  extract_message: kind: G
2024-11-11 03:06:26: pid 7: LOCATION:  pool_process_query.c:4477
2024-11-11 03:06:26: pid 7: DEBUG:  extract_message: kind: G
2024-11-11 03:06:26: pid 7: LOCATION:  pool_process_query.c:4477
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  reading backend data packet kind
2024-11-11 03:06:26: pid 7: DETAIL:  main node id: 0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_process_query.c:3352
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  reading backend data packet kind
2024-11-11 03:06:26: pid 7: DETAIL:  backend:0 kind:'G'
2024-11-11 03:06:26: pid 7: LOCATION:  pool_process_query.c:3414
2024-11-11 03:06:26: pid 7: DEBUG:  reading backend data packet kind
2024-11-11 03:06:26: pid 7: DETAIL:  backend:0 of 1 kind = 'G'
2024-11-11 03:06:26: pid 7: LOCATION:  pool_process_query.c:3479
2024-11-11 03:06:26: pid 7: DEBUG:  read_kind_from_backend max_count:1.000000 num_executed_nodes:1
2024-11-11 03:06:26: pid 7: LOCATION:  pool_process_query.c:3495
2024-11-11 03:06:26: pid 7: DEBUG:  processing backend response
2024-11-11 03:06:26: pid 7: DETAIL:  received kind 'G'(47) from backend
2024-11-11 03:06:26: pid 7: LOCATION:  pool_proto_modules.c:3045
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  pool_virtual_main_db_node_id: virtual_main_node_id:0 load_balance_node_id:0 PRIMARY_NODE_ID:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_query_context.c:412
2024-11-11 03:06:26: pid 7: DEBUG:  pool_write: to frontend: kind:G po:0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_stream.c:478
2024-11-11 03:06:26: pid 7: DEBUG:  pool_write: to frontend: length:4 po:1
2024-11-11 03:06:26: pid 7: LOCATION:  pool_stream.c:481
2024-11-11 03:06:26: pid 7: DEBUG:  pool_write: to frontend: length:5 po:5
2024-11-11 03:06:26: pid 7: LOCATION:  pool_stream.c:481
2024-11-11 03:06:26: pid 7: DEBUG:  SimpleForwardToFrontend: packet:G length:5
2024-11-11 03:06:26: pid 7: LOCATION:  pool_process_query.c:802
2024-11-11 03:06:26: pid 7: DEBUG:  pool_flush_it: flush size: 10
2024-11-11 03:06:26: pid 7: LOCATION:  pool_stream.c:650
2024-11-11 03:06:26: pid 7: DEBUG:  copy data rows
2024-11-11 03:06:26: pid 7: DETAIL:  read kind from frontend H(48)
2024-11-11 03:06:26: pid 7: LOCATION:  pool_proto_modules.c:3383
2024-11-11 03:06:26: pid 7: DEBUG:  pool_write: to backend: 0 kind:H
2024-11-11 03:06:26: pid 7: LOCATION:  pool_stream.c:467
2024-11-11 03:06:26: pid 7: DEBUG:  pool_flush_it: flush size: 5
2024-11-11 03:06:26: pid 7: LOCATION:  pool_stream.c:650
2024-11-11 03:06:26: pid 7: DEBUG:  copy data rows
2024-11-11 03:06:26: pid 7: DETAIL:  invalid copyin kind. expected 'd' got 'H'
2024-11-11 03:06:26: pid 7: LOCATION:  pool_proto_modules.c:3412
2024-11-11 03:06:26: pid 7: DEBUG:  pool_flush_it: flush size: 0
2024-11-11 03:06:26: pid 7: LOCATION:  pool_stream.c:650```

@tatsuo-ishii
Copy link
Collaborator

I see the problem:
2024-11-11 03:06:26: pid 7: DETAIL: read kind from frontend H(48)
It seems your tool uses extended query protocol mode while doing COPY TO, and send 'H' (flush) message. This message used in COPY is totally useless and should be ignored. Unfortunately pgpool is not ready for the situation. I'll try to create a patch for this.

@tatsuo-ishii
Copy link
Collaborator

Here is the fix.
fix_copy.txt

@HariKalakada
Copy link
Author

Here is the fix. fix_copy.txt

Thanks for the patch, it helped ignoring the messages used in COPYin and moved on, but now, below error is coming

Max connections reached, increase value of TDS_MAX_CONN
2024-11-11T08:32:07.748350Z WARNING PostgreSQL warning: pool_write: to frontend: kind:N po:0
2024-11-11T08:32:07.748350Z WARNING PostgreSQL warning: Unable to parse the query: "CREATE TABLE public.abc
(
  col1 timestamptz default NULL,
  col2             text default NULL,
  con4  text default NULL,
  group            int default '0',
  group_
KABOOM!
FATAL error: PostgreSQL protocol error: Unexpected message received: F 70
An unhandled error condition has been signalled:
   PostgreSQL protocol error: Unexpected message received: F 70



2024-11-11T08:32:07.748350Z FATAL Failed to create the schema, see above.
2024-11-11T08:32:07.752350Z LOG report summary reset
       table name     errors       rows      bytes      total time
-----------------  ---------  ---------  ---------  --------------
  fetch meta data          0       4056                     0.980s
   Create Schemas          0          0                     0.052s
 Create SQL Types          0          0                     0.060s
    Create tables          0          0                     0.000s
-----------------  ---------  ---------  ---------  --------------
-----------------  ---------  ---------  ---------  --------------

@tatsuo-ishii
Copy link
Collaborator

Seems like a pgloader problem. PostgreSQL complains invalid CREATE TABLE statement.

@HariKalakada
Copy link
Author

Yes, looks like that, it is repeating when tried directly with PostgreSQL without pgpool.
I'll have that checked out.
Thanks @tatsuo-ishii for your quick responses.

pengbo0328 pushed a commit that referenced this issue Nov 11, 2024
During COPY IN state (i.e. COPY FROM STDIN), frontend can send Flush
or Sync messages.  According to the F/B protocol specification, they
should be ignored but Pgpool-II treated as an invalid message and this
causes COPY hung.

Discussion: #79
Backpatch-through: v4.1
pengbo0328 pushed a commit that referenced this issue Nov 11, 2024
During COPY IN state (i.e. COPY FROM STDIN), frontend can send Flush
or Sync messages.  According to the F/B protocol specification, they
should be ignored but Pgpool-II treated as an invalid message and this
causes COPY hung.

Discussion: #79
Backpatch-through: v4.1
pengbo0328 pushed a commit that referenced this issue Nov 11, 2024
During COPY IN state (i.e. COPY FROM STDIN), frontend can send Flush
or Sync messages.  According to the F/B protocol specification, they
should be ignored but Pgpool-II treated as an invalid message and this
causes COPY hung.

Discussion: #79
Backpatch-through: v4.1
pengbo0328 pushed a commit that referenced this issue Nov 11, 2024
During COPY IN state (i.e. COPY FROM STDIN), frontend can send Flush
or Sync messages.  According to the F/B protocol specification, they
should be ignored but Pgpool-II treated as an invalid message and this
causes COPY hung.

Discussion: #79
Backpatch-through: v4.1
pengbo0328 pushed a commit that referenced this issue Nov 11, 2024
During COPY IN state (i.e. COPY FROM STDIN), frontend can send Flush
or Sync messages.  According to the F/B protocol specification, they
should be ignored but Pgpool-II treated as an invalid message and this
causes COPY hung.

Discussion: #79
Backpatch-through: v4.1
pengbo0328 pushed a commit that referenced this issue Nov 11, 2024
During COPY IN state (i.e. COPY FROM STDIN), frontend can send Flush
or Sync messages.  According to the F/B protocol specification, they
should be ignored but Pgpool-II treated as an invalid message and this
causes COPY hung.

Discussion: #79
Backpatch-through: v4.1
@HariKalakada
Copy link
Author

Here is the fix. fix_copy.txt

Hi @tatsuo-ishii, I see that the fix has been applied onto main, any tentative date of having a new release/tarball which includes this fix?

@tatsuo-ishii
Copy link
Collaborator

Next minor releases are scheduled on November 21st.
https://pgpool.net/mediawiki/index.php/Roadmap

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants