Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client: check hub url connectivity before initialize HubProxy #254

Closed
wants to merge 2 commits into from

Conversation

rhyw
Copy link
Contributor

@rhyw rhyw commented Mar 18, 2024

Previously it waits long time till throws 504 Gateway Timeout when the specified hub url is unreachable.

Related: https://issues.redhat.com/browse/OSH-144

Previously it waits long time till throws 504 Gateway Timeout when
the specified hub url is unreachable.

Related: https://issues.redhat.com/browse/OSH-144
@rhyw rhyw marked this pull request as draft March 18, 2024 08:29
@lzaoral
Copy link
Contributor

lzaoral commented Mar 18, 2024

@rhyw This approach will not work, unfortunately. While your check may succeed, there is zero guarantee that the subsequent XML-RPC connection will succeed as well. After that, you may still get an ugly Traceback while making redundant HTTP connections to the server. I believe that is a good idea to attempt few times to connect and then timeout which kobo does already:

kobo/kobo/xmlrpc.py

Lines 471 to 492 in 8c7fef3

def request(self, *args, **kwargs):
if self.retry_count == 0:
return transport_class.request(self, *args, **kwargs)
for i in range(self.retry_count + 1):
try:
result = transport_class.request(self, *args, **kwargs)
return result
except KeyboardInterrupt:
raise
except (socket.error, socket.herror, socket.gaierror, socket.timeout,
httplib.CannotSendRequest, xmlrpclib.ProtocolError) as ex:
if i >= self.retry_count:
raise
retries_left = self.retry_count - i
retries = "%d %s left" % (retries_left, retries_left == 1 and "retry" or "retries") # 1 retry left / X retries left
print("XML-RPC connection to %s failed: %s, %s" % (args[0], ex.args[1:], retries), file=sys.stderr)
time.sleep(self.retry_timeout)
RetryTransportClass.__name__ = transport_class.__name__
RetryTransportClass.__doc__ = transport_class.__name__
return RetryTransportClass

The only issue with the current code is that the re-raised exceptions are not being caught anywhere else which leads to ugly error traces and that is the thing to improve, not the other way around:

$ osh-cli list-tasks --hub='https://hub.example.com/osh/xmlrpc' --free
XML-RPC connection to hub.example.com failed: ('Name or service not known',), 5 retries left
XML-RPC connection to hub.example.com failed: ('Name or service not known',), 4 retries left
XML-RPC connection to hub.example.com failed: ('Name or service not known',), 3 retries left
XML-RPC connection to hub.example.com failed: ('Name or service not known',), 2 retries left
XML-RPC connection to hub.example.com failed: ('Name or service not known',), 1 retry left
Traceback (most recent call last):
  File "/usr/bin/osh-cli", line 85, in <module>
    main()
  File "/usr/bin/osh-cli", line 78, in main
    parser.run()
  File "/usr/lib/python3.12/site-packages/kobo/cli.py", line 296, in run
    cmd.run(*cmd_args, **cmd_kwargs)
  File "/usr/lib/python3.12/site-packages/kobo/client/commands/cmd_list_tasks.py", line 69, in run
    self.set_hub(username, password, hub)
  File "/usr/lib/python3.12/site-packages/kobo/client/__init__.py", line 115, in set_hub
    self.hub = HubProxy(conf=self.conf)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/kobo/client/__init__.py", line 178, in __init__
    self._login(verbose=self._conf.get("DEBUG_XMLRPC"))
  File "/usr/lib/python3.12/site-packages/kobo/client/__init__.py", line 222, in _login
    if force or self._hub.auth.renew_session():
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/xmlrpc/client.py", line 1122, in __call__
    return self.__send(self.__name, args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/xmlrpc/client.py", line 1461, in __request
    response = self.__transport.request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/kobo/xmlrpc.py", line 477, in request
    result = transport_class.request(self, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/xmlrpc/client.py", line 1166, in request
    return self.single_request(host, handler, request_body, verbose)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/kobo/xmlrpc.py", line 369, in _single_request3
    h = self.send_request(host, handler, request_body, verbose)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/xmlrpc/client.py", line 1291, in send_request
    self.send_content(connection, request_body)
  File "/usr/lib64/python3.12/xmlrpc/client.py", line 1321, in send_content
    connection.endheaders(request_body)
  File "/usr/lib64/python3.12/http/client.py", line 1326, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.12/http/client.py", line 1085, in _send_output
    self.send(msg)
  File "/usr/lib64/python3.12/http/client.py", line 1029, in send
    self.connect()
  File "/usr/lib64/python3.12/http/client.py", line 1465, in connect
    super().connect()
  File "/usr/lib64/python3.12/http/client.py", line 995, in connect
    self.sock = self._create_connection(
                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/socket.py", line 828, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/socket.py", line 963, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -2] Name or service not known

@rohanpm Just for the context, the linked Jira ticket relates to making Tracebacks thrown by the client on network-related issues less scary and more readable to users.

@rhyw
Copy link
Contributor Author

rhyw commented Mar 19, 2024

list-tasks --hub='https://hub.example.com/osh/xmlrpc' --free

@lzaoral I'm not sure if you applied this patch, I tried and got below instant error message:

$ docker exec -it osh-client python3 osh/client/osh-cli list-tasks --hub='https://hub.example.com/osh/xmlrpc' --free
Traceback (most recent call last):
  File "osh/client/osh-cli", line 85, in <module>
    main()
  File "osh/client/osh-cli", line 78, in main
    parser.run()
  File "/src/kobo/kobo/cli.py", line 296, in run
    cmd.run(*cmd_args, **cmd_kwargs)
  File "/src/kobo/kobo/client/commands/cmd_list_tasks.py", line 69, in run
    self.set_hub(username, password, hub)
  File "/src/kobo/kobo/client/__init__.py", line 115, in set_hub
    self.hub = HubProxy(conf=self.conf)
  File "/src/kobo/kobo/client/__init__.py", line 178, in __init__
    self._login(verbose=self._conf.get("DEBUG_XMLRPC"))
  File "/src/kobo/kobo/client/__init__.py", line 252, in _login
    raise ValueError(f"Failed to connect to {self._hub_url}: {message}")
ValueError: Failed to connect to https://hub.example.com/osh/xmlrpc: Error: HTTPSConnectionPool(host='hub.example.com', port=443): Max retries exceeded with url: /osh/xmlrpc (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f8d14bfe320>: Failed to establish a new connection: [Errno -2] Name or service not known',))

Correct it's hard to guarantee the xmlrpc endpoint with the connectivity check, but at least it helps with those long pending timeouts or non-existent hub urls specified from cli.

@@ -216,7 +244,12 @@ def _login(self, force=False, verbose=False):

# create new self._hub instance (only once, when calling constructor)
if self._hub is None:
self._hub = xmlrpclib.ServerProxy("%s/%s/" % (self._hub_url, self._client_type), allow_none=True, transport=self._transport, verbose=verbose)
success, message = self.check_hub_connectivity(self._hub_url)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this will completely block initialization when it returns False, I'm a little nervous about a couple of the details of how it works.

Firstly, it doesn't use the same transport that the self._hub itself uses. This means there's some behavior differences. I don't necessarily know all the differences, but at least the SSL verification behavior will be different, I believe. Because you just use requests.get without any particular SSL-related options, this change probably has the side-effect of suddenly requiring the hub's cert to verify against whatever CA bundle the requests library is using, while it didn't previously have this requirement. (If you check the initialization of self._transport, it seems like it doesn't actually verify certs by default - unless the caller has set CA_CERT config option - but if they have set that, the new code added here doesn't respect it, so at the very least that's a bug.)

Secondly, there's the fact that at least some login methods are allowed and expected to have some side-effect on self._transport, e.g. adding a session cookie into the cookie jar. Doing this check before that happens means not seeing that side-effect yet. If the server is configured to, say, always require auth for that endpoint or else give a 401 or 403 response, then the check prior to login isn't going to work. I'm not sure it's safe to assume that it won't be configured that way, seeing as kobo doesn't own the web server config.

Is there a way to write this so that it's not adding a new HTTP-level check which works differently from most requests? Can we just use the XML-RPC client "normally" and somehow make the resulting errors more meaningful to end-users?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried various approaches earlier, including passing in the _transport. I failed to find a way to verify the accessibility/validaty of the endpoint without initializing a HubProxy instance along with transport/auth.

The approach attempts to fix one issue but indeed poses various other problems. I'll see if there's a better way, and I'll close this PR, thanks

@kdudka
Copy link
Contributor

kdudka commented Mar 19, 2024

... but at least it helps with those long pending timeouts or non-existent hub urls specified from cli.

@rhyw I think that specifying a non-existing hub URL is a corner case. If a user is advanced enough to specify a custom hub URL, they will most likely be able to check possible problems with network connectivity themselves if needed.

I do not think we should introduce an additional round trip for each single invocation of the client just in case. The client should be optimized for the most common scenario, not vice versa.

@rhyw rhyw closed this Mar 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants