Release Notes
===============================================================================
3.3 Series (2013/xx/xx - )
===============================================================================
3.3 (tokariboshi) 2013/xx/xx
* Version 3.3
This is the first version of pgpool-II 3.3 series.
That is, a "major version up" from 3.2 series.
__________________________________________________________________
* Incompatible changes
- All the following are about watchdog.
See "New features" section below for details of these changes.
- Default monitoring method was changed from query mode to heartbeat mode.
- Failover/failback commands are executed in only one pgpool-II.
- By default, all the query caches on shared memory are cleared when
standby pgpool-II escalates to active.
- Database name, user name, and password used for monitoring other
pgpool-II by query are specified by dedicated parameters.
Previously, template1, recovery_user, and recovery_password were used.
__________________________________________________________________
* New features
** Watchdog
- Add a new monitoring method using heartbeat signal of UDP packet in
lifecheck. (Yugo Nagata)
You can select either of two methods, "heartbeat" mode or
"query" mode.
The heartbeat mode is the new method. In this mode, watchdog monitors
other pgpool-II processes by using heartbeat signals. Watchdog receives
heartbeat signals sent by other pgpool-IIs periodically. If there is no
signal for a certain period, watchdog regards it as failure of the
pgpool-II. For redundancy you can use multiple network interface devices
for heartbeat exchange between pgpool-IIs. This mode is the default and
the recommended one.
The query mode is the conventional method. In this mode, watchdog monitors
pgpool-II's service rather than process. Watchdog sends queries to other
pgpool-II and checks the response. Note that this method requires connections
from other pgpool-IIs, so monitoring would fail if num_init_children isn't
large enough. This mode is deprecated and left for backward compatibility.
Add these new parameters:
- wd_lifecheck_method
- wd_heartbeat_port
- wd_heartbeat_keepalive
- wd_heartbeat_deadtime
- heartbeat_deviceX
- heartbeat_destinationX
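The failure rule of heartbeat mode (no signal for a certain period means the
peer is regarded as failed) can be sketched as below. This is only an
illustration, not pgpool-II's code: the function name heartbeat_expired is
hypothetical, and it assumes that wd_heartbeat_deadtime is given in seconds.

```c
#include <time.h>

/* Hypothetical helper, not part of pgpool-II: returns 1 when no heartbeat
 * has arrived for more than deadtime_sec seconds, 0 otherwise. */
int heartbeat_expired(time_t last_heartbeat, time_t now, int deadtime_sec)
{
    /* difftime() returns the elapsed time in seconds as a double. */
    return difftime(now, last_heartbeat) > (double) deadtime_sec;
}
```

A lifecheck loop would call such a helper for each peer with the time of the
last received heartbeat and the configured dead time.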
- Add interlocking mechanism of exclusive failover/failback command
execution. (Yugo Nagata)
When using multiple pgpool-IIs with watchdog enabled, failover commands
(failover_command, failback_command, and follow_master_command) get
executed only at one pgpool-II.
Previously, these commands were executed at all pgpool-IIs.
- Add authentication mechanism for watchdog packet communication.
(Yugo Nagata)
Watchdog packets (including heartbeat signals) from a pgpool-II with a wrong
authentication key are rejected. All pgpool-IIs must have the same key,
which is specified by the wd_authkey parameter in pgpool.conf. A pgpool-II
with a wrong authkey can't even start watchdog, because the startup packet is
rejected by other pgpool-IIs.
- Add clear_memqcache_on_escalation parameter. (Yugo Nagata)
If this is on, all the query caches on shared memory are cleared when
standby pgpool-II escalates to active.
This prevents the new active pgpool-II from using query caches that are
inconsistent with those of the previous active one.
- Add wd_escalation_command parameter. (Yugo Nagata)
This specifies a command which is executed at escalation on the new active
pgpool-II server. The command runs just before the virtual IP is brought up.
- Add parameters wd_lifecheck_dbname, wd_lifecheck_user, and
wd_lifecheck_password. (Yugo Nagata)
These parameters specify the database name, the user name, and password
used in query mode lifecheck of watchdog. Previously, these were hard
coded to use template1, recovery_user, and recovery_password.
** Others
- Import PostgreSQL 9.2 raw parser. (Nozomi Anzai, Tatsuo Ishii)
- Add a tool called pgpool_setup to set up pgpool-II and PostgreSQL
temporary installation in current directory for *testing* purpose.
(Tatsuo Ishii)
usage: ./pgpool_setup [-m r|s] [-n num_clusters]
-m s: create an installation as streaming replication mode.
(the default)
-m r: create an installation as native replication mode.
- Support installation method using CREATE EXTENSION for pgpool-recovery
and pgpool-regclass. (Tatsuo Ishii)
The older installation method is still preserved.
Note: extension names are "pgpool_recovery" and "pgpool_regclass", not
"pgpool-recovery" and "pgpool-regclass" because the latter are not
convenient in using CREATE EXTENSION command (requires double quotes).
- Add a function "pgpool_pgctl()" which enables executing
pg_ctl stop/restart/reload (except for start) by SQL. (Nozomi Anzai)
$ psql sales -c "select pgpool_pgctl('reload', 'fast')";
pgpool_pgctl
--------------
t
(1 row)
This function always ignores the actual result and returns 't', so the
user can't know if pg_ctl succeeded or failed. To use this function, a
custom variable must be set for security; it limits execution of pg_ctl
to users who have permission on the data directory.
__________________________________________________________________
* Bug fixes
- Consider timeout waiting for completion of failback request in online
recovery. (Tatsuo Ishii)
This will prevent the situation that recovery operation continues forever
and we cannot even shutdown pgpool-II main process. This could happen
especially while executing follow master command.
- Fix a bug that %H of follow_master_command is not correctly assigned the
new primary node in streaming replication mode.
(Tatsuo Ishii)
- Fix not to execute escalation when the pgpool-II which is already active
receives down notification from other pgpool-II. (Yugo Nagata)
- Fix wd_create_send_socket() not to execute select() before connect().
(Yugo Nagata)
How select() works on an unconnected socket is undefined, and differs
between platforms. On Linux, this returns 2 and it is eventually harmless.
However, on Solaris, this returns 0 and it is indistinguishable from a
timeout, so watchdog wouldn't work correctly.
- Fix error when pgpool_regclass is not installed. (Tatsuo Ishii)
The query used in pool_has_pgpool_regclass() fails if pgpool_regclass
does not exist. The bug was introduced in 3.2.4. See [pgpool-general:
1722] for more details.
- Fix do_query() not to hang when PostgreSQL returns an error.
(Tatsuo Ishii)
The typical symptom is "I see SELECT is keep on running according to
pg_stat_activity". To fix this pgpool-II just exits the process and
kills the existing connection. This is not gentle but at this point I
believe this is the best solution.
- Fix possible deadlock during failover with watchdog enabled.
(Yugo Nagata)
This is reported in Bug track #54 by arshu arora
http://www.pgpool.net/mantisbt/view.php?id=54
__________________________________________________________________
* Enhancements
- Fix to restart watchdog processes automatically when these exit abnormally.
(Yugo Nagata)
- Add more error checks and reporting to functions executing ping command
with watchdog enabled. (Tatsuo Ishii)
- Replace some unsafe functions, sprintf and strncpy, with safer ones,
snprintf and strlcpy respectively. (Yugo Nagata)
- Replace "sticky bit" with "setuid bit" in log messages, comments and
function names. (Yugo Nagata)
These words were used mistakenly and caused confusion.
- Fix description on SSL in pool_hba.conf.sample. (Tatsuo Ishii)
- Fix some mistakes and update the Chinese manual to the latest. (Bambo Huang)
===============================================================================
3.2 Series (2012/08/03 - )
===============================================================================
3.2.4 (namameboshi) 2013/04/26
* Version 3.2.4
This is a bugfix release against pgpool-II 3.2.3.
__________________________________________________________________
* Bug fixes
- Fix connect_inet_domain_socket_by_port() to set more appropriate value
for timeout parameter of select(2). (Tatsuo Ishii)
Some platforms such as Solaris do not allow a microseconds timeout value
that is too large (>=1000000), so the timeout value is split into
seconds and microseconds.
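The split into seconds and microseconds can be sketched as below. This is a
minimal illustration under the assumption that the timeout is held as a single
microsecond count; the helper name split_timeout is hypothetical and is not
taken from the pgpool-II source.

```c
#include <sys/time.h>

/* Hypothetical helper: convert a timeout in microseconds into a
 * struct timeval whose tv_usec field always stays below 1000000. */
struct timeval split_timeout(long timeout_usec)
{
    struct timeval tv;

    tv.tv_sec = timeout_usec / 1000000L;    /* whole seconds */
    tv.tv_usec = timeout_usec % 1000000L;   /* remaining microseconds */
    return tv;
}
```

The resulting struct timeval can then be passed to select(2) as its timeout
argument.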
- Fix connect_inet_domain_socket_by_port() to not return as normal when
interrupted by alarm. (Tatsuo Ishii)
This confuses health checking because connect_inet_domain_socket_by_port()
returns an unusable fd. This makes detecting errors in health checking take
longer.
See the following for more details:
[pgpool-general: 1458]
health check timeout in pgpool-II-3.2.3
http://www.pgpool.net/pipermail/pgpool-general/2013-March/001482.html
- Fix long standing bug with timestamp rewriting code for processing
extended protocol. (Tatsuo Ishii)
Parse() allocates memory using palloc() while rewriting the parse
message. The problem is that the rewritten message was kept in the data
which is managed by pool_create_sent_message() etc. The function assumes
that all the data is in session context memory. However, palloc()
allocates memory in query context of course, and it gets freed later on
when the query context disappears. And the function tries to free the
memory as well, which causes various problems, including segfault and
double free. To fix this, memory to store the rewritten message is
allocated using session context. The bug has been there since pgpool-II
3.0 was born.
Problem analysis and patch contributed by Naoya Anzai.
[pgpool-general-jp: 1146]. (in Japanese)
http://www.pgpool.net/pipermail/pgpool-general-jp/2013-March/001145.html
- Fix bug with md5 auth long user name handling. (Tatsuo Ishii)
If user name is longer than 32 bytes, md5 authentication doesn't work.
Problem reported in [pgpool-general: 1526] by Thomas Martin.
[pgpool-general: 1526]
[pgPool-II 3.2.3] MD5 authentication and username longer than 32 characters.
http://www.pgpool.net/pipermail/pgpool-general/2013-March/001551.html
- Fix to calculate replication delay only if the standby server is behind
the primary server. (Yugo Nagata)
When the primary server is behind the standby server, a negative value of
delay is calculated and the value is assigned to an unsigned variable. It
causes a log message reporting a negative replication delay. And what is
worse, it also causes SELECT queries to be sent to the primary in load
balance even though there is no replication delay in fact.
The problem is reported and analyzed by Saitoh Hidenori in
[pgpool-general-jp: 1145].
[pgpool-general-jp: 1145] (in Japanese)
http://www.pgpool.net/pipermail/pgpool-general-jp/2013-March/001144.html
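The guard described in the replication delay fix can be sketched as below.
This is an illustration only: WAL positions are modelled as plain 64-bit
byte offsets, which is an assumption, and the function name replication_delay
is hypothetical. Without the comparison, subtracting a larger unsigned value
from a smaller one wraps around to a huge "delay".

```c
#include <stdint.h>

/* Hypothetical helper: returns how far the standby is behind the primary,
 * or 0 when the standby is level with or ahead of the primary. */
uint64_t replication_delay(uint64_t primary_pos, uint64_t standby_pos)
{
    if (standby_pos >= primary_pos)
        return 0;                       /* never subtract into wrap-around */
    return primary_pos - standby_pos;
}
```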
- pgpool-recovery adapts to PostgreSQL 9.3. (Tatsuo Ishii)
Patch contributed by Asif Rehman. Slight editing by Tatsuo Ishii.
[pgpool-hackers: 180]
compile error in ppool-recovery
http://www.pgpool.net/pipermail/pgpool-hackers/2013-April/000179.html
- Fix pool_has_pgpool_regclass() to check execute privilege of
pgpool_regclass(). (Tatsuo Ishii)
Even though pgpool_regclass() exists, if pgpool cannot execute the
function, the connection to backend hangs. You can reproduce the problem
by just dropping the execute privilege from pgpool_regclass and doing some
inserts in native replication mode.
The problem is reported in bugtrack #53.
#53 pgpool_regclas hangs all connections
Date: 2013-04-04 13:35
Reporter: tmandke
http://www.pgpool.net/mantisbt/view.php?id=53
- Fix error message mistakes in detect_postmaster_down_error(). (Tatsuo Ishii)
For example, "LOG: detect_stop_postmaster_error: detect_error error" is
fixed to "LOG: detect_postmaster_down_error: detect_error error", and so on.
- Remove root user check when watchdog is enabled. (Tatsuo Ishii)
Per discussion [pgpool-general: 1627] Re: watchdog root requirement.
[pgpool-general: 1627]
Re: watchdog root requirement.
http://www.pgpool.net/pipermail/pgpool-general/2013-April/001654.html
- Fix bug with on memory query cache in handling UPDATE/DELETE with table
alias. (Tatsuo Ishii)
If UPDATE/DELETE is with table alias (UPDATE t1 AS foo...) pgpool thinks
the table name is "t1 AS foo" and fails to invalidate query cache. This
is caused by _outRangeVar() called from nodeToString() which generates a
query string from RangeVar node in raw parse tree. The solution is removing
"AS foo" part from the output of the string.
Reported in bugtrack #56.
#56 UPDATE with alias does not discard cache
Date: 2013-04-18 17:33
Reporter: harukat
http://www.pgpool.net/mantisbt/view.php?id=56
===============================================================================
3.2.3 (namameboshi) 2013/02/18
* Version 3.2.3
This is a bug fix release against pgpool-II 3.2.2. The main purpose
of this release is to fix a fatal problem with pgpool-II 3.2.2's
health checking. If all of the following conditions are met, the pgpool
main process disappears and all client connections to pgpool-II
hang forever when failover happens. The only way to recover
from it is to manually kill the pgpool child process and restart
pgpool-II.
- health checking is enabled
- connecting method to PostgreSQL is TCP/IP, not UNIX domain
socket(i.e. "backend_hostnameN" is not empty string)
__________________________________________________________________
* Bug fixes
- Fix connect_inet_domain_socket_by_port() bug introduced in
3.2.2. (Tatsuo Ishii)
When a non-blocking connect() reports EINPROGRESS or EALREADY, it
calls select(2) to wait until the fd is ready for reading or
writing. However, it mistakenly checked the error condition with
getsockopt() when select() returned 0. getsockopt() should be
called when select() returns > 0, not 0. Because of this,
connect_inet_domain_socket_by_port() could return an fd as if it
had succeeded even though the connection actually failed.
What is worse, health_check() then mistakenly believes that the
backend is alive and tries to write to the backend socket, which
of course fails. This triggers a call to notice_backend_error(),
which sends a SIGUSR1 signal to the parent process of pgpool main.
This results in various weird things: for example, if you start
pgpool from a shell, the signal kills the shell. If you start
pgpool in the background, pgpool's parent is process #1. As long
as you started pgpool as non-root, it's ok. Even if you start
pgpool as root, init just reopens /dev/initctl on receiving
SIGUSR1. All these annoying bugs have been there since pgpool
was born. The connect_inet_domain_socket_by_port() bug just
revealed them. To fix this, I modified notice_backend_error() and
child_exit() so that they do nothing when called from the pgpool
main process itself, to prevent pgpool from shooting itself in
the foot.
- Fix to show pool_passwd in "SHOW pool_status". (Yugo Nagata)
- Fix a typo at configure's help in configure.in. (Yugo Nagata)
===============================================================================
3.2.2 (namameboshi) 2013/02/08
* Version 3.2.2
This is a bugfix release against pgpool-II 3.2.1.
__________________________________________________________________
* Bug fixes
- Fix compile errors on FreeBSD. (Tatsuo Ishii)
- Fix pgpool does not recognize VIEWs other than in default schema,
which is usually "public". (Tatsuo Ishii)
This makes pgpool create caches for such a VIEW's query results,
which of course should not be allowed.
Problem reported and patch provided by jgentsch in bug id #30.
#30 pgpool 3.2.1 - views in schema other than public are caching
Reporter: jgentsch
Date: 2012-10-19 23:13
http://www.pgpool.net/mantisbt/view.php?id=30
- Fix race condition when using md5 authentication. (Tatsuo Ishii)
The file descriptor to pool_passwd is opened in pgpool main and pgpool
child inherits it. When concurrent connections try to authenticate with the
md5 method, they call pool_get_passwd and seek the fd and cause random md5
auth failure because the underlying fd is shared. The fix is to let each
pgpool child open the file by calling pool_reopen_passwd_file.
Problem reported and analyzed by Jason Slagle in pgpool-general:1141.
[pgpool-general: 1141] Possible race condition in pool_get_passwd
From: Jason Slagle
Date: Sun, 28 Oct 2012 01:12:52 -0400
http://www.sraoss.jp/pipermail/pgpool-general/2012-October/001160.html
- Clarify load balance condition information in manual. (Tatsuo Ishii)
- Fix segfault due to bug with query cache array handling. (Tatsuo Ishii)
The cache array is used to keep temporary cache results in a transaction.
If there are more than 128 SELECTs in a transaction, the module expands
cache_array by using realloc. However it does not record the new pointer
returned by realloc. So the module keeps on using the old pointer which is
obsolete.
This problem is reported in bug track #31 by jgentsch.
#31 pgpool V3_2_STABLE - segfault in pool_memqcache.c:2529
Reporter:jgentsch
Date: 2012-10-23 06:25
http://www.pgpool.net/mantisbt/view.php?id=31
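The realloc mistake described in this entry is a common one. The sketch below
shows the correct pattern with a generic growable array; the type IntArray and
the function int_array_push are hypothetical stand-ins, not pgpool-II's cache
array code, and the initial size of 128 merely echoes the number mentioned
above.

```c
#include <stdlib.h>

/* Hypothetical growable array standing in for the per-transaction array. */
typedef struct {
    int   *items;
    size_t used;
    size_t capacity;
} IntArray;

/* Append a value, growing the storage when it is full.
 * Returns 0 on success, -1 if memory could not be allocated. */
int int_array_push(IntArray *a, int value)
{
    if (a->used == a->capacity) {
        size_t new_cap = a->capacity ? a->capacity * 2 : 128;
        int *p = realloc(a->items, new_cap * sizeof(int));
        if (p == NULL)
            return -1;          /* old block is still valid and unchanged */
        a->items = p;           /* record the pointer realloc() returned */
        a->capacity = new_cap;
    }
    a->items[a->used++] = value;
    return 0;
}
```

realloc() may move the block, so every later access has to go through the
pointer it returned; the previous pointer must not be used again.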
- Fix hang while repeating pcp_attach_node and pcp_detach_node.
(Tatsuo Ishii)
When node status is changed by pcp_attach_node and pcp_detach_node,
failover() sends SIGUSR1 to the pcp_child process expecting it to exit to
refresh node status. In this situation lots of pgpool children exit and
produce SIGCHLD as well. The SIGCHLD handler reaper() tries to catch all
SIGCHLD but sometimes it fails depending on the system load and timing.
If the SIGCHLD produced by the pcp child is not caught, the process becomes
a zombie and is never restarted.
This problem is reported in bug track #32 (by oleg_myrk) etc.
#32 PGPool hangs on pcp_attach/detach
Reporter: oleg_myrk
Date: 2012-10-24 00:01
http://www.pgpool.net/mantisbt/view.php?id=32
- Fix pool_send_severity_message() not to use uninitialized memory.
(Tatsuo Ishii)
It caused a segmentation fault.
Reported in Bug #33's attached valgrind output by dudee.
#33 pgpool-II 3.2.1 segfault
Reporter: dudee
Date: 2012-10-30 19:16
http://www.pgpool.net/mantisbt/view.php?id=33
- Fix bug with query cache returning incorrect data in some cases when a
persistent table and temp table have same name. (Tatsuo Ishii)
Here is a sequence to trigger the bug:
1) CREATE TABLE t1(i int); -- create a persistent table
2) INSERT INTO t1 VALUES(1);
3) SELECT * FROM t1; -- query cache entry created
4) CREATE TEMP TABLE t1(i int); -- create a temp table
5) SELECT * FROM t1; -- query cache entry mistakenly created!
The problem is that #3 creates a relcache entry for t1, and #5 incorrectly
uses it and believes that temp table t1 is not a temp table.
- Add a description about "-f" to help message. (Tatsuo Ishii)
- Fix reaper() not to exit wait3() loop when catches pcp or worker child
exit event. (Tatsuo Ishii)
Otherwise reaper() mistakenly ignores some process exit events, which risks
leaving zombie processes and forgetting to create new processes.
Problem reported and fix suggested by Goto in [pgpool-general-jp: 1123].
http://www.sraoss.jp/pipermail/pgpool-general-jp/2012-November/001122.html
(in Japanese)
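The idea of not leaving the wait loop early can be sketched as below. This is
an illustration, not reaper() itself: it uses waitpid() with WNOHANG instead
of wait3(), and the function name reap_exited_children is hypothetical.
Signals of the same type are not queued, so a single SIGCHLD may stand for
several exited children, and the loop has to run until nothing is left.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical reaper body: collect every child that has already exited.
 * Returns the number of children reaped by this call. */
int reap_exited_children(void)
{
    int   reaped = 0;
    int   status;
    pid_t pid;

    /* WNOHANG: return 0 instead of blocking when no child has exited yet.
     * The loop ends on 0 (nothing to reap now) or -1 (no children left). */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
        reaped++;       /* a real reaper would restart the child here */

    return reaped;
}
```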
- Fix a typo of configure help message. (Yugo Nagata)
- Add wd_hostname to pool_process_reporting.c. (Yugo Nagata)
Otherwise, wd_hostname is not contained in results of SHOW pool_status and
pcp_pool_status.
- Fix connect_inet_domain_socket_by_port() to not error out when connect(2)
returns EISCONN (Socket is already connected) error. (Tatsuo Ishii)
This could happen with non blocking socket and should be treated as normal.
Per bug track #29 (by spork) and pgpool-general 1218 (by Mikola Rose).
#29 pgpool 3.2.1 cannot connect to db hosts
Reporter: spork
Date: 2012-10-18 15:03
http://www.pgpool.net/mantisbt/view.php?id=29
[pgpool-general: 1218] pgpool 3.2.1 - Health check failing to connect
From: Mikola Rose
Date: Tue, 4 Dec 2012 20:21:55 +0000
http://www.sraoss.jp/pipermail/pgpool-general/2012-December/001237.html
- Fix health_check() to check the health check timer before retrying
with template1 database. (Tatsuo Ishii)
Without this, the retry with node 0 always fails because health check timer
may be already expired.
- Fix pool_search_relcache() to use MASTER or MASTER_NODE_ID macro, rather
than REAL_MASTER_NODE_ID. (Tatsuo Ishii)
When node 0 fails back in streaming replication mode, pgpool does not
restart child process. So REAL_MASTER_NODE_ID looks for node 0 con info,
which is not present until a new connection to the backend is made. Thus
referring to node con info results in segfault. MASTER or MASTER_NODE_ID are
safe in this situation because they look at the cached former master node id.
- Fix long standing bug "portal not found" error when replication delay
is too much in streaming replication mode. (Tatsuo Ishii)
The bug had been there since the delay threshold was introduced.
We changed the destination DB node if the delay threshold was exceeded in
bind, describe and execute. However, if parse was sent to a different node,
bind, describe or execute will fail because no parsed statement or portal
exists there. The solution is not to send them to a node different from the
parse node, even if the delay threshold is exceeded.
- Fix pg_md5 to output "\n" after user inputs password. (Yugo Nagata)
- Fix to print error message when the port number for watchdog is already used.
(Yugo Nagata)
This issue was reported by Will Ferguson in [pgpool-general: 1167].
[pgpool-general: 1167] Re: Watchdog error - wd_init: delegate_IP already exists
From: Will Ferguson
Date: Tue, 6 Nov 2012 13:03:36 +0000
http://www.sraoss.jp/pipermail/pgpool-general/2012-November/001186.html
- Fix child_exit() to not call send_frontend_exits() if there's no
connection pool. (Tatsuo Ishii)
Otherwise, it segfaults because send_frontend_exits() refers to objects
pointed to by pool_connection_pool. Per bug track #44 by tuomas.
#44 pgpool went haywire after slave shutdown triggering master failover
Reporter: tuomas
Date: 2012-12-11 00:33
http://www.pgpool.net/mantisbt/view.php?id=44
- Fix bug that only tables in white_memqcache_table_list were cached when
black_memqcache_table_list has any tables. (Yugo Nagata)
- Fix read_startup_packet() to reset alarm and free StartupPacket when
pool_read() returns 0 which means incorrect packet length. (Nozomi Anzai)
Previously, authentication timeout occurred when connected by a program
monitoring the pgpool port. It is reported in bug track #35.
#35 Authentication is timeout
Reporter: tuomas
Date: 2012-11-20 11:54
http://www.pgpool.net/mantisbt/view.php?id=35
- Fix long standing bug with pool_open(). (Tatsuo Ishii)
It initializes wrong buffer pointer. Actually this is harmless because the
pointer is initialized by prior memset() call, though.
- Log that failover is avoided because "fail_over_on_backend_error" is
turned off. (Tatsuo Ishii)
- Fix LISTEN/NOTIFY handling bugs. (Tatsuo Ishii)
1) In streaming replication mode:
Session 1: LISTEN aaa;
Session 2: NOTIFY aaa;
Session 1: LISTEN aaa; --- hangs
(If LISTEN and NOTIFY are issued in a same session, it works fine.)
We assume that packets come from all nodes. However in streaming
replication mode, notification message only comes from primary
node and we should avoid reading from standby nodes.
2) In streaming replication mode: If primary node is not node 0, it
hangs like #1 even if fix applies. This is because MASTER_NODE_ID
macro (actually pool_virtual_master_db_node_id()) always returns
REAL_MASTER_NODE_ID, which is node 0 (if it is alive). The function
should return PRIMARY_NODE_ID in master/slave mode.
3) In replication mode, LISTEN/NOTIFY simply does not work. In the
mode, NOTIFY is sent to all backends. However the order of arrival of
'Notification response' is not necessarily the master first and then
slaves. So if it arrives slave first, we should try to read from
master, rather than just discard it. Fixed in pool_process_query().
4) In replication mode, if LISTEN and NOTIFY are issued in a same
session, the session is disconnected because do_command() may receive
other than 'N', 'E', 'S' and 'C'. The solution is, put 'A' packet on a
stack and pop out when it is convenient. For this purpose, new
functions pool_push(), pool_pop() and pool_stacklen() are added.
This problem is reported in bug track #45 by rpashin.
#45 LISTEN/NOTIFY doesn't work if cluster contains more then 1 node in
streaming replication mode
Reporter: rpashin
Date: 2012-12-12 00:09
http://www.pgpool.net/mantisbt/view.php?id=45
Considering the size of the patch, I do not back patch to 3.1 or
before(so far, we have not heard any complaints on 3.1 or before).
- Fix connect_inet_domain_socket_by_port() to call select(2) rather than
error out when connect(2) returns EINPROGRESS or EALREADY error.
(Tatsuo Ishii)
When using a non-blocking socket, the connection may actually have been
established despite errors like "Connection timed out".
To solve the problem we should use select(2) to wait for the connection
to be established when connect(2) reports EINPROGRESS or EALREADY, instead
of retrying in a tight loop.
This problem is reported in bug track #46 by mcousin.
#46 Watchdog failing to connect sometimes
Reporter: mcousin
Date: 2012-12-15 01:01
http://www.pgpool.net/mantisbt/view.php?id=46
- Add caution to increase num_init_children if watchdog enabled in manual.
(Tatsuo Ishii)
See [pgpool-general: 1330] for more details.
[pgpool-general: 1330] WatchDog and pgool sudden stop working
From: Tomas Halgas
Date: Fri, 18 Jan 2013 14:47:23 +0100
http://www.sraoss.jp/pipermail/pgpool-general/2013-January/001350.html
- Fix segmentation fault while pgpool-II is starting up or failing over when
watchdog is enabled. (Yugo Nagata)
This is caused by wrong usage of pthread, namely pthread_detach and
pthread_join are mixed together. The solution is to use pthread_join only
if we need to get the status of the child thread. BTW, the problem could
occur on a moderately modern OS such as Fedora 17; the reason the problem
is not observed on other OSs is simply that we were lucky.
Problem reported in [pgpool-general: 1179] by Lonni J Friedman.
[pgpool-general: 1179] 3.2.1 segfaults at startup on Fedora17.
From: Lonni J Friedman
Date: Mon, 12 Nov 2012 15:58:29 -0800
http://www.sraoss.jp/pipermail/pgpool-general/2012-November/001198.html
Patch provided by chads in bug track #48.
pthread_detach is being used wrong; causes pgpool to segfault.
Reporter: chads
Date: 2013-01-16 05:44
http://www.pgpool.net/mantisbt/view.php?id=48
- Avoid split-brain situation reported in [pgpool-general: 1046] (Yugo Nagata)
After all backend nodes are detached from pgpools and then some backend nodes
are attached to these, multiple active pgpools could exist simultaneously,
that is to say, a split-brain situation occurs.
In this fix, once all backend DB nodes are detached from a pgpool, the pgpool
stays in DOWN status until it is restarted. A pgpool in DOWN status cannot
escalate to active (delegate IP holder), so the split-brain situation is
avoided.
[pgpool-general: 1046]
watchdog enabled delegate_IP on multiple nodes simultaneously
From: Lonni J Friedman
Date: Wed, 26 Sep 2012 09:05:09 -0700
http://www.sraoss.jp/pipermail/pgpool-general/2012-September/001064.html
- Avoid a possible hang during the active pgpool exits. (Yugo Nagata)
When exiting, the active pgpool brings down the virtual IP and then
sends a packet to other pgpools. However, the packet sometimes is sent
before the virtual IP is brought down completely. In this case the packet
sender is set to this IP. When the IP is brought down before other pgpools
respond, the active pgpool cannot receive the response, and hangs.
In this fix, the active pgpool confirms that the virtual IP is brought
down before sending the packet.
- Modify descriptions of restrictions on watchdog. (Yugo Nagata)
- Modify pgpool.conf.sample* and documents to correct information of
whether a certain parameter change requires restart. (Yugo Nagata)
- Add pool_passwd option to pgpool.conf.sample*, pool_process_reporting.c,
and documents. (Yugo Nagata)
Otherwise, wd_hostname is not contained in results of SHOW pool_status and
pcp_pool_status.
===============================================================================
3.2.1 (namameboshi) 2012/10/12
* Version 3.2.1
This is a bugfix release against pgpool-II 3.2.0.
__________________________________________________________________
* Bug fixes
- Fix send_cached_messages(). (Tatsuo Ishii)
Previously it had a fixed-length buffer of 8192 bytes for each row of data,
and if the data exceeded 8192 bytes, it just crashed.
To fix this, eliminate copying raw data which is passed as an argument
to buffer and pass the pointer to send_message.
- Fix that extended queries failed due to query cache. (Nozomi Anzai)
- Fix read_startup_packet(). (Tatsuo Ishii)
If the packet length is less than 0, the function should return immediately.
Otherwise a memory allocation error occurs later on.
Per pgpool-general: 886. Also add cancellation of the alarm.
[pgpool-general: 886] read_startup_packet: out of memory
From: Lonni J Friedman
Date: Wed, 8 Aug 2012 10:18:15 -0700
http://www.sraoss.jp/pipermail/pgpool-general/2012-August/000896.html
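The length check described for read_startup_packet() can be sketched in
isolation. This is a minimal illustration, not the pgpool-II source: the
helper name startup_body_length and the upper bound MAX_STARTUP_PACKET_LEN
are assumptions made for the example. It relies only on the fact that a
startup packet begins with a 4-byte big-endian length that counts itself.

```c
#include <stdint.h>

/* Assumed sanity limit for this sketch only; not taken from pgpool-II. */
#define MAX_STARTUP_PACKET_LEN 10000

/*
 * Hypothetical helper: decode the 4-byte big-endian length header of a
 * startup packet and return the length of the body that follows, or -1
 * if the value must be rejected before any memory is allocated.
 */
static int startup_body_length(const unsigned char header[4])
{
    uint32_t raw = ((uint32_t) header[0] << 24) |
                   ((uint32_t) header[1] << 16) |
                   ((uint32_t) header[2] << 8)  |
                    (uint32_t) header[3];
    /* Reinterpret as signed so a corrupt header shows up as negative. */
    int32_t total = (int32_t) raw;

    /* total < 4 means the body length (total - 4) would be below 0. */
    if (total < 4 || total > MAX_STARTUP_PACKET_LEN)
        return -1;

    return (int) (total - 4);
}
```

A caller would return immediately on -1 instead of passing the bogus length
on to the allocator.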
- Fix watchdog process killing other processes too aggressively when pgpool
shuts down. (Tatsuo Ishii)
The watchdog process calls kill(0,SIG) to kill all processes related to
watchdog. Unfortunately this kills not only watchdog related
processes but also the parent pgpool, and even httpd when pgpool was
invoked from pgpoolAdmin, because they are in the same process
group.
So for now, the fix removes the calls to kill() and setpgid(), because
setpgid() does nothing useful.
In the future, we should call setsid() to establish a new process group
in any case.
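The effect of setsid() mentioned above can be demonstrated with plain POSIX
calls. This is a minimal sketch, not pgpool-II code; the function name
child_gets_own_group is hypothetical. It shows that a forked child which
calls setsid() becomes the leader of a new process group, so a later
kill(0, SIG) issued in that child would no longer reach the parent's group.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demonstration: fork a child, have it call setsid(), and
 * report whether the child ended up as leader of its own process group.
 * Returns 1 on success, 0 if the group did not change, -1 on error.
 */
static int child_gets_own_group(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        /* Child: it is not a group leader yet, so setsid() may succeed. */
        if (setsid() < 0)
            _exit(2);
        /* After setsid() the process group id equals the child's pid. */
        _exit(getpgrp() == getpid() ? 0 : 1);
    }

    /* Parent: collect the child's verdict. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;
}
```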
- Fix query cache to register queries that start with "-- comment" or
contain more than one comment. (Nozomi Anzai)
- Fix query cache to ignore multi-statement queries. (Nozomi Anzai)
Previously, queries like "SELECT 1;UPDATE..." were cached as well, which
was wrong.
- Add a watchdog restriction to documents. (Yugo Nagata)
- Add NOTICE message handling to s_do_auth(). (Tatsuo Ishii)
Without this, the health check raises a false alarm and causes failover.
per bug track:
#25 s_do_auth doesn't handle NoticeResponse (N) message
Date: 2012-08-28 03:57
Reporter: singh.gurjeet
http://www.pgpool.net/mantisbt/view.php?id=25
- Remove unnecessary/confusing debug log from s_do_auth(). (Tatsuo Ishii)
- Fix buffer overrun in Execute when memory cache enabled.
(Tatsuo Ishii)
If one of the bind parameter bytes was < 0, it was possible to produce a
string longer than 2 bytes for "%02X" due to sign extension.
Also use snprintf rather than sprintf to prevent possible buffer
overruns in the future.
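The sign-extension problem is easy to reproduce outside pgpool-II. The
following is a minimal sketch, assuming a hypothetical helper named
byte_to_hex (it is not the function used in the pgpool-II source). A plain
char holding a negative value is promoted to a negative int when passed to a
printf-style function, so "%02X" can emit more than two hex digits unless
the value is first converted to unsigned char.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the two-digit upper-case hex form of one
 * byte into out. outlen is the size of out and should be at least 3.
 */
static void byte_to_hex(char byte, char *out, size_t outlen)
{
    /*
     * Convert to unsigned char first. Without the cast, a negative char
     * is sign-extended to int and would print as e.g. "FFFFFFFF" on
     * platforms with a 32-bit int, overrunning a 3-byte buffer if
     * sprintf were used.
     * snprintf additionally bounds the write to outlen bytes.
     */
    snprintf(out, outlen, "%02X", (unsigned char) byte);
}
```

With a 3-byte buffer, snprintf would truncate rather than overrun even if
the cast were forgotten, which is why both changes are useful together.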
- Fix long-standing memory leak bug in free_select_result(), present since
pgpool-II 2.3 was released in December 2009. (Tatsuo Ishii)
Actually this bug only appears when operating in replication mode
(it happens to be triggered by the timestamp rewriting process).
Per bug track #24:
#24 Severe memory leak in an OLTP environment
Date: 2012-08-28 03:43
Reporter: singh.gurjeet
http://www.pgpool.net/mantisbt/view.php?id=24
- Fix typo in cache_reporting(). (Tatsuo Ishii)
- Fix infinite loop in SSL mode. (Tatsuo Ishii)
When there's pending data in the SSL layer of the frontend,
pool_process_query() checks for pending data in the backend. If there's
none, it loops again and checks the frontend/backend receive buffers by
using is_cache_empty().
Unfortunately it first checks for pending data in the SSL layer of the
frontend, then goes on to the backend data and checks again (infinite loop).
The solution is, if there's pending data in the SSL layer of the frontend
and no query is in progress, to call ProcessFrontendResponse() to process
the new request from the frontend.
- Fix is_system_catalog() to use pgpool_regclass if available.
(Tatsuo Ishii)
- Fix memory leak in pool_get_insert_table_name(). (Tatsuo Ishii)
If the session context's memory context is used for nodeToString(), memory
is not freed until the session ends.
See bug track #24 for more details.
#24 Severe memory leak in an OLTP environment
Date: 2012-08-28 03:43
Reporter: singh.gurjeet
http://www.pgpool.net/mantisbt/view.php?id=24
- Use fcntl(2) rather than flock(2) to lock oid map files. (Tatsuo Ishii)
flock(2) is not portable and cannot be used on Solaris.
Patch contributed by Ibrar Ahmed.
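For readers unfamiliar with fcntl(2) record locks, whole-file locking can be
sketched as follows. This is an illustration under assumptions, not the
contributed patch: the helper name lock_whole_file is hypothetical, and the
choice of a blocking F_SETLKW request is made for the example only.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: apply a POSIX advisory lock covering the whole
 * file. type is F_RDLCK, F_WRLCK, or F_UNLCK. Returns 0 on success and
 * -1 on failure with errno set, as fcntl(2) does.
 */
static int lock_whole_file(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;   /* offsets are relative to start of file */
    fl.l_start = 0;
    fl.l_len = 0;             /* 0 means "through end of file" */

    /* F_SETLKW blocks until the lock can be acquired. */
    return fcntl(fd, F_SETLKW, &fl);
}
```

Unlike flock(2), these locks are specified by POSIX, which is what makes
them usable on Solaris.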
- Fix oversight of get_next_master_node() when operated in raw mode.
(Tatsuo Ishii)
When the master node went down, it always returned 0 as the master node id.
See [pgpool-general: 1039] for more details.
[pgpool-general: 1039] Raw failover not working as expected on pgpool-II
v3.2.0
From: Quentin White
Date: Tue, 25 Sep 2012 07:45:34 +0000
http://www.sraoss.jp/pipermail/pgpool-general/2012-September/001057.html
- Fix segfault of do_query(). (Tatsuo Ishii)
When memqcache is enabled and the extended protocol is used, do_query()
accesses the system catalog and uses pool_read2(). Unfortunately the Parse
message packet is given to Parse() while the packet contents are still in
pool_read2's buffer. Thus do_query() could overwrite the packet contents,
which leads to a segfault.
The solution is to allocate memory, copy the packet contents into it, and
pass the copy to Parse(). Note that the query context holds the query
string, which is in the packet as well, so we need to copy it and save the
pointer in the query context.
We think the problem affects not only Parse() but other protocol modules
as well, so this fix covers those modules too. For this purpose
ProcessFrontendResponse() is changed.
See the bug track for details.
#21 pgpool-II 3.2.0 cannot execute sql through jdbc
Date: 2012-08-17 16:31
Reporter: elisechiang
http://www.pgpool.net/mantisbt/view.php?id=21
- Fix to set the unix domain socket path for pgpool PCP communication before
setting up signal handlers. (Yugo Nagata)
Previously, unlinking the socket during exit processing failed because
the path of the socket was not set.
Patch contributed by Gilles Darold
[pgpool-hackers: 131] Found bug with watchdog resulting in pgpool
segmentation fault
From: Gilles Darold
Date: Thu, 13 Sep 2012 18:54:42 +0200
http://www.sraoss.jp/pipermail/pgpool-hackers/2012-September/000130.html
- Fix to output a message when the ifup/down or arping command used by
watchdog does not exist. (Yugo Nagata)
- Fix long-standing problem with do_query(). (Tatsuo Ishii)
When 1) the extended protocol is used, 2) an unnamed portal is used, and
3) no explicit transaction is used, the user's unnamed portal is removed by
the Sync message.
This is because the Sync message closes the transaction, and the unnamed
portal is removed with it. This leads to a "portal "" does not exist" error.
The fix is to use a Flush message instead of Sync. The main difference
between Sync and Flush is that Flush does not return a Ready for Query
message, so do_query() does not return until all expected responses have
been returned. The order of messages returned from the backend appears to
be random, and do_query() manages this by using state bits.
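On the wire, Sync and Flush differ only in their type byte. In the
PostgreSQL frontend/backend protocol both are body-less messages: a type
byte ('S' for Sync, 'H' for Flush) followed by a 4-byte big-endian length of
4. A minimal sketch of building them, with hypothetical helper names that
are not part of pgpool-II:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write a body-less frontend message (type byte
 * plus a 4-byte big-endian length that counts only itself) into buf
 * and return the number of bytes written.
 */
static size_t build_empty_message(char type, unsigned char buf[5])
{
    const uint32_t len = 4;   /* length field only, no message body */

    buf[0] = (unsigned char) type;
    buf[1] = (unsigned char) (len >> 24);
    buf[2] = (unsigned char) (len >> 16);
    buf[3] = (unsigned char) (len >> 8);
    buf[4] = (unsigned char) len;
    return 5;
}

/* Flush: backend delivers pending output, sends no ReadyForQuery. */
static size_t build_flush(unsigned char buf[5])
{
    return build_empty_message('H', buf);
}

/* Sync: ends the implicit transaction; backend replies ReadyForQuery. */
static size_t build_sync(unsigned char buf[5])
{
    return build_empty_message('S', buf);
}
```

Because Flush produces no ReadyForQuery, the sender has to track for itself
which responses are still outstanding, which matches the state-bit approach
described above.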
===============================================================================
3.2.0 (namameboshi) 2012/08/03
* Version 3.2.0
This is the first version of pgpool-II 3.2 series.
That is, a "major version up" from 3.1 series.
__________________________________________________________________
* Incompatible changes
- The new query cache, "On memory query cache", replaces the old one.
- The parameter "enable_query_cache" has been removed.
__________________________________________________________________
* New features
** Memory based query cache
Original author is Masanori Yamazaki, improved by Development Group.
(Tatsuo Ishii, Nozomi Anzai, Yugo Nagata)
Overview:
The on memory query cache is faster because the cache storage is in memory.
Moreover you don't need to restart pgpool-II when the cache becomes outdated
because the underlying table gets updated.
The on memory cache saves pairs of a SELECT statement (with its Bind
parameters if the SELECT is an extended query) and its result. If the same
SELECT comes in, it returns the value from the cache. Since neither SQL
parsing nor access to PostgreSQL is involved, it's extremely fast.
On the other hand, it might be slower than the normal path because it
adds some overhead to store the cache. Moreover, when a table is updated,
pgpool automatically deletes all the caches related to the table, so
performance degrades on a system with a lot of updates. If the
cache_hit_ratio is lower than 70%, you might want to disable the on memory
cache.
Choosing cache storage:
You can choose a cache storage: shared memory or memcached (you can't use
both).
Query cache with shared memory is fast and easy because you don't have
to install and configure memcached, but the maximum cache size is limited
by the size of shared memory. Query cache with memcached incurs network
access overhead, but you can set the size as you like.
Restrictions:
- The on memory query cache automatically deletes all the cache of an
updated table by monitoring whether the executed query is UPDATE, INSERT,
ALTER TABLE and so on. But pgpool-II isn't able to recognize implicit
updates due to triggers, foreign keys and DROP TABLE CASCADE.
You can avoid this problem with memqcache_expire, by which pgpool
automatically deletes old cache after a fixed time, or with
black_memqcache_table_list, by which pgpool's memory cache flow ignores
the listed tables.
- If you want to use multiple instances of pgpool-II with the on memory
cache which uses shared memory, it could happen that one pgpool deletes
its cache when a table gets updated while the other one doesn't, and thus
returns an old cached result. Memcached is the better cache storage in
this case.
New parameters:
- Add parameters for on memory query cache as follows:
memory_cache_enabled, memqcache_method, memqcache_expire,
memqcache_maxcache, memqcache_oiddir. (Tatsuo Ishii)
- Add parameters about shared memory for on memory query cache as
follows:
memqcache_total_size, memqcache_max_num_cache,
memqcache_cache_block_size. (Tatsuo Ishii)
- Add parameters about memcached for on memory query cache as follows:
memqcache_memcached_host, memqcache_memcached_port. (Tatsuo Ishii)
- Add parameters about relation cache for on memory query cache as