Make CI known flakes and errors #12413

Open
13 of 14 tasks
lhoguin opened this issue Oct 1, 2024 · 10 comments

lhoguin commented Oct 1, 2024

This ticket is meant to track flakes and errors that happen in Make CI (currently enabled in main and in pull requests).

OTP-27 bugs

Flakes

  • Failed to create SSL certificates - multiple test suites, one example in comment
  • rabbit > amqp_system > access_failure - details in comment
  • rabbit > metrics_SUITE > connection_metric_count_test - details in comment
  • rabbit > per_node_limit_SUITE > channel_consumers_limit - details in comment
  • rabbitmq_amqp_client > management_SUITE > cluster_size_3 > queue_topology - details in comment - possibly solved via 8c046c7
  • rabbitmq_amqp_client > management_SUITE > cluster_size_3 > classic_queue_stopped - details in comment
  • rabbitmq_cli sometimes fails with 1383 tests, 5 failures - the failures are all tests that set the disk free limit. It is not yet known whether they are proper errors, normal flakes or caused by differences in GH runners
  • rabbitmq_federation > exchange_SUITE > rolling_upgrade > child_id_format - timetrap timeout
  • rabbitmq_federation > queue_SUITE > classic_queue > without_disambiguate > cluster_size_1 > dynamic_plugin_stop_start - details in comment
  • rabbitmq_management > clustering_SUITE > non_parallel_tests > queue_on_other_node - details in comment
  • rabbitmq_management > clustering_prop_SUITE > non_parallel_tests > prop_connection_channel_counts_test - details in comment
  • rabbitmq_mqtt > parallel-ct-set-1 > mqtt_shared_SUITE > cluster_size_3 > v4 rabbit_mqtt_qos0_queue_kill_node - test failure (details in comment) leads to subsequent tests being unable to continue and to the job timing out after 30 minutes. Possibly solved via 17df1b9
lhoguin added the bug label Oct 1, 2024

lhoguin commented Oct 1, 2024

rabbitmq_federation > queue_SUITE > classic_queue > without_disambiguate > cluster_size_1 > dynamic_plugin_stop_start
    #1. {error,{test_case_failed,"Did not receive expected payloads [<<\"HELLO-to-upstream\">>] in time"}}

And

queue_SUITE > mixed > without_disambiguate > cluster_size_1 > multiple_downstreams
    #1. {error,{test_case_failed,"Did not receive expected payloads [<<\"HELLO-to-upstream\">>] in time"}}

And other test cases in that same suite.

lhoguin commented Oct 1, 2024

rabbitmq_amqp_client > management_SUITE > cluster_size_3 > queue_topology
    #1. {error,{test_case_failed,{824,
                                  <<"rmq-ct-cluster_size_3-1-21000@localhost">>}}}

lhoguin commented Oct 1, 2024

rabbitmq_amqp_client > management_SUITE > cluster_size_3 > classic_queue_stopped
    #1. {error,
            {{badmatch,
                 {error,
                     {shutdown,{failed_to_start_child,reader,econnrefused}}}},
             [{management_SUITE,init,2,
                  [{file,"management_SUITE.erl"},{line,1037}]},
              {management_SUITE,classic_queue_stopped,1,
                  [{file,"management_SUITE.erl"},{line,598}]},
              {test_server,ts_tc,3,[{file,"test_server.erl"},{line,1794}]},
              {test_server,run_test_case_eval1,6,
                  [{file,"test_server.erl"},{line,1303}]},
              {test_server,run_test_case_eval,9,
                  [{file,"test_server.erl"},{line,1235}]}]}}

lhoguin commented Oct 1, 2024

rabbitmq_mqtt > parallel-ct-set-1 > mqtt_shared_SUITE > cluster_size_3 > v4 rabbit_mqtt_qos0_queue_kill_node

=== Ended at 2024-10-01 09:59:52
=== Location: [{mqtt_shared_SUITE,rabbit_mqtt_qos0_queue_kill_node,1165},
              {test_server,ts_tc,1793},
              {test_server,run_test_case_eval1,1302},
              {test_server,run_test_case_eval,1234}]
=== === Reason: no match of right hand side value {publish_not_received,
                                                    <<"m1">>}
  in function  mqtt_shared_SUITE:rabbit_mqtt_qos0_queue_kill_node/1 (mqtt_shared_SUITE.erl, line 1165)
  in call from test_server:ts_tc/3 (test_server.erl, line 1793)
  in call from test_server:run_test_case_eval1/6 (test_server.erl, line 1302)
  in call from test_server:run_test_case_eval/9 (test_server.erl, line 1234)

lhoguin commented Oct 17, 2024

Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received\n   at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477\n   at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},

When this flake occurs, the RabbitMQ logs for this test show the expected behaviour (the session is closed with amqp:unauthorized-access):

2024-10-24 14:08:46.051834+00:00 [debug] <0.1321.0> Asked to create a new user 'access_failure', password length in bytes: 24
2024-10-24 14:08:46.052915+00:00 [info] <0.1321.0> Created user 'access_failure'
2024-10-24 14:08:46.055682+00:00 [debug] <0.1324.0> Asked to set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
2024-10-24 14:08:46.056552+00:00 [info] <0.1324.0> Successfully set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
2024-10-24 14:08:48.192887+00:00 [info] <0.1333.0> accepting AMQP connection 127.0.0.1:36248 -> 127.0.0.1:25000
2024-10-24 14:08:48.208538+00:00 [debug] <0.1333.0> User 'access_failure' authenticated successfully by backend rabbit_auth_backend_internal
2024-10-24 14:08:48.212575+00:00 [info] <0.1333.0> Connection from AMQP 1.0 container 'AMQPNetLite-101d7d51': user 'access_failure' authenticated using SASL mechanism PLAIN and granted access to vhost '/'
2024-10-24 14:08:48.212774+00:00 [debug] <0.1333.0> AMQP 1.0 connection.open frame: hostname = 127.0.0.1, extracted vhost = /, idle-time-out = undefined
2024-10-24 14:08:48.224473+00:00 [debug] <0.1333.0> AMQP 1.0 created session process <0.1338.0> for channel number 0
2024-10-24 14:08:48.229608+00:00 [warning] <0.1338.0> Closing session for connection <0.1333.0>: {'v1_0.error',
2024-10-24 14:08:48.229608+00:00 [warning] <0.1338.0>                                             {symbol,
2024-10-24 14:08:48.229608+00:00 [warning] <0.1338.0>                                              <<"amqp:unauthorized-access">>},
2024-10-24 14:08:48.229608+00:00 [warning] <0.1338.0>                                             {utf8,
2024-10-24 14:08:48.229608+00:00 [warning] <0.1338.0>                                              <<"read access to queue 'test' in vhost '/' refused for user 'access_failure'">>},
2024-10-24 14:08:48.229608+00:00 [warning] <0.1338.0>                                             undefined}
2024-10-24 14:08:48.229850+00:00 [debug] <0.1333.0> AMQP 1.0 closed session process <0.1338.0> with channel number 0
2024-10-24 14:08:48.445342+00:00 [warning] <0.1333.0> closing AMQP connection <0.1333.0> (127.0.0.1:36248 -> 127.0.0.1:25000, duration: '269ms'):
2024-10-24 14:08:48.445342+00:00 [warning] <0.1333.0> client unexpectedly closed TCP connection

Deleted this F# test since it's redundant: #12581

lhoguin commented Oct 17, 2024

Node: rabbit_shard4@localhost
Case: metrics_SUITE:connection_metric_count_test
Reason: {error,
            {{assert,
                 [{module,rabbit_ct_proper_helpers},
                  {line,21},
                  {expression,
                      "proper : counterexample ( erlang : apply ( Fun , Args ) , [ { numtests , NumTests } , { on_output , fun ( \".\" , _ ) -> ok ; ( F , A ) -> ct : pal ( ? LOW_IMPORTANCE , F , A ) end } ] )"},
                  {expected,true},
                  {not_boolean,[{21,[remove,add,remove]}]}]},
             [{rabbit_ct_proper_helpers,run_proper,3,
                  [{file,"rabbit_ct_proper_helpers.erl"},{line,21}]},

And

Node: rabbit_shard4@localhost
Case: metrics_SUITE:connection_metric_count_test
Reason: {error,
            {{assert,
                 [{module,rabbit_ct_proper_helpers},
                  {line,21},
                  {expression,
                      "proper : counterexample ( erlang : apply ( Fun , Args ) , [ { numtests , NumTests } , { on_output , fun ( \".\" , _ ) -> ok ; ( F , A ) -> ct : pal ( ? LOW_IMPORTANCE , F , A ) end } ] )"},
                  {expected,true},
                  {not_boolean,
                      [{16,
                        [add,remove,remove,add,add,remove,add,add,remove,
                         remove,add,remove,add,remove,add,remove,add,add,
                         remove,add,add,remove,add,add,add,remove,add,remove,
                         remove,add,remove,remove,add,add,remove,remove,add,
                         add,add,remove,add,add,remove,add,add,remove,remove,
                         add,remove,remove,remove,remove,remove,add,add,
                         remove,remove,remove,remove,remove,add,remove,add,
                         add,remove,remove,add,remove,remove,add,add,add,
                         remove,add,remove,remove,remove,remove,add,remove,
                         add,remove,remove,add,remove,add,add,remove,add,
                         add]}]}]},
             [{rabbit_ct_proper_helpers,run_proper,3,
                  [{file,"rabbit_ct_proper_helpers.erl"},{line,21}]},

lhoguin commented Oct 17, 2024

Node: rabbit_shard1@localhost
Case: auth_SUITE:{init_per_group,authz,[]}
Reason: {error,
            {badarg,
                [{lists,keysearch,
                     [rmq_nodes,1,{skip,"Failed to create SSL certificates"}],
                     [{error_info,#{module => erl_stdlib_errors}}]},
                 {test_server,lookup_config,2,
                     [{file,"test_server.erl"},{line,1778}]},
                 {rabbit_ct_broker_helpers,get_node_config,2,
                     [{file,"rabbit_ct_broker_helpers.erl"},{line,1418}]},
                 {rabbit_ct_broker_helpers,get_node_config,3,
                     [{file,"rabbit_ct_broker_helpers.erl"},{line,1433}]},
                 {rabbit_ct_broker_helpers,rpc,5,
                     [{file,"rabbit_ct_broker_helpers.erl"},{line,1844}]},
                 {auth_SUITE,init_per_group,2,
                     [{file,"auth_SUITE.erl"},{line,149}]},
                 {test_server,ts_tc,3,[{file,"test_server.erl"},{line,1793}]},
                 {test_server,run_test_case_eval1,6,
                     [{file,"test_server.erl"},{line,1390}]}]}}
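The `badarg` above comes from `lists:keysearch/3` being handed the tuple `{skip,"Failed to create SSL certificates"}` where a config list was expected, so the skip result of an earlier setup step appears to have been passed on as if it were the config. A minimal Python sketch of that failure shape follows. The names `create_certs`, `get_node_config` and `init_group` are hypothetical stand-ins, not the RabbitMQ test helpers.

```python
# Hypothetical illustration only: none of these names exist in rabbitmq-server.

def create_certs(config, ok):
    """Setup step: returns the extended config, or a skip marker on failure."""
    if not ok:
        return ("skip", "Failed to create SSL certificates")
    return config + [("rmq_certs", "/tmp/certs")]


def get_node_config(config, key):
    """Look up a key in a list of (key, value) pairs, like a proplist lookup."""
    if not isinstance(config, list):
        # Analogous to the badarg: the "config" is not a list at all.
        raise TypeError(f"config is not a list: {config!r}")
    for k, v in config:
        if k == key:
            return v
    return None


def init_group(config):
    """Guarded variant: propagate a skip instead of using it as config."""
    if isinstance(config, tuple) and config and config[0] == "skip":
        return config
    return config + [("group_ready", True)]
```

With a guard like the one in `init_group`, a failed certificate step would surface as a skipped group rather than as a lookup error further down the stack.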

lhoguin commented Oct 17, 2024

Node: rabbit_shard4@localhost
Case: per_node_limit_SUITE:channel_consumers_limit
Reason: {error,
            {{shutdown,
                 {server_initiated_close,530,
                     <<"NOT_ALLOWED - reached maximum (2) of consumers per channel">>}},
             {gen_server,call,
                 [<9471.5195.0>,
                  {command,
                      {close,
                          {'connection.close',200,<<"Goodbye">>,0,0},
                          infinity}},
                  infinity]}}}

ansd added a commit that referenced this issue Oct 23, 2024
This commit attempts to eliminate the test flake described in
#12413 (comment)

```
rabbitmq_mqtt > parallel-ct-set-1 > mqtt_shared_SUITE > cluster_size_3 > v4 rabbit_mqtt_qos0_queue_kill_node

=== Ended at 2024-10-01 09:59:52
=== Location: [{mqtt_shared_SUITE,rabbit_mqtt_qos0_queue_kill_node,1165},
              {test_server,ts_tc,1793},
              {test_server,run_test_case_eval1,1302},
              {test_server,run_test_case_eval,1234}]
=== === Reason: no match of right hand side value {publish_not_received,
                                                    <<"m1">>}
  in function  mqtt_shared_SUITE:rabbit_mqtt_qos0_queue_kill_node/1 (mqtt_shared_SUITE.erl, line 1165)
  in call from test_server:ts_tc/3 (test_server.erl, line 1793)
  in call from test_server:run_test_case_eval1/6 (test_server.erl, line 1302)
  in call from test_server:run_test_case_eval/9 (test_server.erl, line 1234)
```

This flake could not be reproduced locally.
This commit also assumes that this flake occurred under Khepri but not
under Mnesia.

The hypothesis is the following:
* Node 0 is down
* MQTT client creates binding on node 1
* Khepri commits since the binding is replicated and persisted on node 1
  and node 2. However, the binding isn't reflected yet in node 2's
  routing projection table.
* Publishing a message to node 2 routes to nowhere.
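
If the hypothesis holds, a test that publishes right after the binding is created races against the projection update on the other node. A common way to close such a race in a test is to poll until the binding is visible before publishing. The sketch below shows that pattern in Python with a simulated lagging table. `LaggingTable` and `eventually` are hypothetical names, and this is not necessarily the change made in the referenced commit.

```python
import time


def eventually(predicate, timeout=5.0, interval=0.05):
    """Poll predicate until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class LaggingTable:
    """Simulates a node-local routing table that sees writes after a delay."""

    def __init__(self, lag):
        self.lag = lag
        self._visible_at = {}

    def add_binding(self, key):
        # The write is "committed" now but only becomes visible after the lag.
        self._visible_at[key] = time.monotonic() + self.lag

    def has_binding(self, key):
        visible_at = self._visible_at.get(key)
        return visible_at is not None and time.monotonic() >= visible_at

    def route(self, key):
        # A publish before the binding is visible routes to nowhere.
        return ["queue"] if self.has_binding(key) else []
```

In a test, calling `eventually(lambda: table.has_binding("m1"))` before the publish would make the assertion independent of how long the projection takes to catch up, within the chosen timeout.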
ansd added a commit that referenced this issue Oct 23, 2024
(cherry picked from commit 17df1b9)
ansd added a commit that referenced this issue Oct 24, 2024
As described in #12413 (comment)
test case queue_topology flaked in CI with the following error:
```
rabbitmq_amqp_client > management_SUITE > cluster_size_3 > queue_topology
    #1. {error,{test_case_failed,{824,
                                  <<"rmq-ct-cluster_size_3-1-21000@localhost">>}}}
```

This flake could not be reproduced locally (neither with Mnesia nor with Khepri).
ansd added a commit that referenced this issue Oct 24, 2024
(cherry picked from commit 8c046c7)
ansd added a commit that referenced this issue Oct 24, 2024
in order to troubleshoot the flake described in
#12413 (comment)
```
Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received\n
                                 at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477\n
                                 at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},
```
ansd added a commit that referenced this issue Oct 24, 2024
(cherry picked from commit 0c905f9)
ansd added a commit that referenced this issue Oct 24, 2024
This test flakes in CI as described in
#12413 (comment)

The test case fails with
```
Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received
                                 at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477
                                 at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},
```

However, RabbitMQ closes the session as expected due to the missing read
permissions to the queue as shown in the RabbitMQ logs:
```
[debug] <0.1321.0> Asked to create a new user 'access_failure', password length in bytes: 24
[info] <0.1321.0> Created user 'access_failure'
[debug] <0.1324.0> Asked to set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
[info] <0.1324.0> Successfully set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
[info] <0.1333.0> accepting AMQP connection 127.0.0.1:36248 -> 127.0.0.1:25000
[debug] <0.1333.0> User 'access_failure' authenticated successfully by backend rabbit_auth_backend_internal
[info] <0.1333.0> Connection from AMQP 1.0 container 'AMQPNetLite-101d7d51': user 'access_failure' authenticated using SASL mechanism PLAIN and granted access to vhost '/'
[debug] <0.1333.0> AMQP 1.0 connection.open frame: hostname = 127.0.0.1, extracted vhost = /, idle-time-out = undefined
[debug] <0.1333.0> AMQP 1.0 created session process <0.1338.0> for channel number 0
[warning] <0.1338.0> Closing session for connection <0.1333.0>: {'v1_0.error',
[warning] <0.1338.0>                                             {symbol,
[warning] <0.1338.0>                                              <<"amqp:unauthorized-access">>},
[warning] <0.1338.0>                                             {utf8,
[warning] <0.1338.0>                                              <<"read access to queue 'test' in vhost '/' refused for user 'access_failure'">>},
[warning] <0.1338.0>                                             undefined}
[debug] <0.1333.0> AMQP 1.0 closed session process <0.1338.0> with channel number 0
[warning] <0.1333.0> closing AMQP connection <0.1333.0> (127.0.0.1:36248 -> 127.0.0.1:25000, duration: '269ms'):
[warning] <0.1333.0> client unexpectedly closed TCP connection
```
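
The logs show the user's permissions set to `.*`, `^banana.*`, `^banana.*`, which in RabbitMQ's permission order are the configure, write and read patterns. The queue name `test` does not match the read pattern, hence the `amqp:unauthorized-access` error. A small Python sketch of that check, assuming the patterns are applied as an unanchored regular expression search on the resource name (the `is_allowed` helper is hypothetical):

```python
import re

# Patterns taken from the log lines above, in configure / write / read order.
PERMISSIONS = {
    "configure": r".*",
    "write": r"^banana.*",
    "read": r"^banana.*",
}


def is_allowed(operation, resource_name, permissions=PERMISSIONS):
    """Return True if the resource name matches the pattern for the operation."""
    pattern = permissions[operation]
    return re.search(pattern, resource_name) is not None
```

Under these assumptions, `is_allowed("read", "test")` is false while a queue whose name starts with `banana` would pass, which is consistent with the broker refusing the attach.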

```
let receiver = ReceiverLink(ac.Session, "test-receiver", src)
```
uses the constructor that leaves the onAttached callback null.
ReceiverLink doesn't seem to block.

Given that the exact same authorization error is already tested in test
case attach_source_queue of amqp_auth_SUITE, it's safe to delete this F#
test.
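
One way to read the observation that `ReceiverLink` doesn't seem to block: if the constructor returns before the broker has answered the attach, a test that immediately checks for the expected error can run ahead of it. The Python sketch below imitates that shape with a hypothetical `FakeLink` class (not the AMQP.Net Lite API) and shows how waiting on a completion callback removes the race. Whether this is what actually happened in the F# test is an assumption.

```python
import threading


class FakeLink:
    """Hypothetical link whose attach outcome arrives asynchronously."""

    def __init__(self, delay, on_attached=None):
        self.error = None
        self._on_attached = on_attached
        # The constructor returns immediately; the outcome arrives later.
        self._timer = threading.Timer(delay, self._complete)
        self._timer.start()

    def _complete(self):
        self.error = "amqp:unauthorized-access"
        if self._on_attached is not None:
            self._on_attached(self)


def attach_and_wait(delay, timeout=2.0):
    """Create a link and block until its attach outcome is known."""
    done = threading.Event()
    link = FakeLink(delay, on_attached=lambda _link: done.set())
    if not done.wait(timeout):
        raise TimeoutError("attach outcome not received in time")
    return link
```

A link created without a callback exposes no error until the timer fires, so an immediate check sees nothing, whereas `attach_and_wait` only returns once the outcome has been recorded.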
mergify bot pushed a commit that referenced this issue Oct 24, 2024
(cherry picked from commit b1169d0)
ansd added a commit that referenced this issue Oct 24, 2024
(cherry picked from commit b1169d0)
michaelklishin pushed a commit that referenced this issue Oct 25, 2024
As described in #12413 (comment)
test case queue_topology flaked in CI with the following error:
```
rabbitmq_amqp_client > management_SUITE > cluster_size_3 > queue_topology
    #1. {error,{test_case_failed,{824,
                                  <<"rmq-ct-cluster_size_3-1-21000@localhost">>}}}
```

This flake could not be reproduced locally (neither with Mnesia nor with Khepri).
michaelklishin pushed a commit that referenced this issue Oct 25, 2024
in order to troubleshoot the flake described in
#12413 (comment)
```
Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received\n
                                 at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477\n
                                 at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},
```
michaelklishin pushed a commit that referenced this issue Nov 4, 2024
As described in #12413 (comment)
test case queue_topology flaked in CI with the following error:
```
rabbitmq_amqp_client > management_SUITE > cluster_size_3 > queue_topology
    #1. {error,{test_case_failed,{824,
                                  <<"rmq-ct-cluster_size_3-1-21000@localhost">>}}}
```

This flake could not be reproduced locally (neither with Mnesia nor with Khepri).
michaelklishin pushed a commit that referenced this issue Nov 4, 2024
in order to troubleshoot the flake described in
#12413 (comment)
```
Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received\n
                                 at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477\n
                                 at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},
```
michaelklishin pushed a commit that referenced this issue Nov 4, 2024
This test flakes in CI as described in
#12413 (comment)

The test case fails with
```
Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received
                                 at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477
                                 at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},
```

However, RabbitMQ closes the session as expected due to the missing read
permissions to the queue as shown in the RabbitMQ logs:
```
[debug] <0.1321.0> Asked to create a new user 'access_failure', password length in bytes: 24
[info] <0.1321.0> Created user 'access_failure'
[debug] <0.1324.0> Asked to set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
[info] <0.1324.0> Successfully set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
[info] <0.1333.0> accepting AMQP connection 127.0.0.1:36248 -> 127.0.0.1:25000
[debug] <0.1333.0> User 'access_failure' authenticated successfully by backend rabbit_auth_backend_internal
[info] <0.1333.0> Connection from AMQP 1.0 container 'AMQPNetLite-101d7d51': user 'access_failure' authenticated using SASL mechanism PLAIN and granted access to vhost '/'
[debug] <0.1333.0> AMQP 1.0 connection.open frame: hostname = 127.0.0.1, extracted vhost = /, idle-time-out = undefined
[debug] <0.1333.0> AMQP 1.0 created session process <0.1338.0> for channel number 0
[warning] <0.1338.0> Closing session for connection <0.1333.0>: {'v1_0.error',
[warning] <0.1338.0>                                             {symbol,
[warning] <0.1338.0>                                              <<"amqp:unauthorized-access">>},
[warning] <0.1338.0>                                             {utf8,
[warning] <0.1338.0>                                              <<"read access to queue 'test' in vhost '/' refused for user 'access_failure'">>},
[warning] <0.1338.0>                                             undefined}
[debug] <0.1333.0> AMQP 1.0 closed session process <0.1338.0> with channel number 0
[warning] <0.1333.0> closing AMQP connection <0.1333.0> (127.0.0.1:36248 -> 127.0.0.1:25000, duration: '269ms'):
[warning] <0.1333.0> client unexpectedly closed TCP connection
```

```
let receiver = ReceiverLink(ac.Session, "test-receiver", src)
```
uses a constructor that leaves the onAttached callback null.
The ReceiverLink constructor doesn't seem to block.

Given that the exact same authorization error is already tested in test
case attach_source_queue of amqp_auth_SUITE, it's safe to delete this F#
test.