Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySql sessions dying on GCP #179

Open
snekiam opened this issue Jul 15, 2020 · 4 comments
Open

MySql sessions dying on GCP #179

snekiam opened this issue Jul 15, 2020 · 4 comments
Labels
bug Something isn't working

Comments

@snekiam
Copy link
Member

snekiam commented Jul 15, 2020

Describe the bug
This is quite the curious bug - its not clear to me why this occurs, but once it does the worker doesn't respond anymore. Perhaps the OsError is related to the transaction being trashed. I'm mostly pointing this issue at the sqlalchemy.exc.InvalidRequestError:, but I'm guessing some of the other log content is relevant.

To Reproduce
Unfortunately...I can only tell you to run this on GCP (or in docker somewhere else, most likely), leave it running for an extended period of time, and ask it questions periodically.

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • OS: Ubuntu 20.04 Focal
  • Browser: N/A
  • Version: N/A

Additional context
Here's the (possibly) relevant log content:

[2020-07-14 08:14:18 +0000] [7] [ERROR] Socket error processing request.
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/workers/sync.py", line 133, in handle
    req = next(parser)
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/parser.py", line 41, in __next__
    self.mesg = self.mesg_class(self.cfg, self.unreader, self.req_count)
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/message.py", line 186, in __init__
    super().__init__(cfg, unreader)
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/message.py", line 53, in __init__
    unused = self.parse(self.unreader)
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/message.py", line 198, in parse
    self.get_data(unreader, buf, stop=True)
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/message.py", line 189, in get_data
    data = unreader.read()
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/unreader.py", line 37, in read
    d = self.chunk()
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/unreader.py", line 64, in chunk
    return self.sock.recv(self.mxchunk)
  File "/usr/lib/python3.8/ssl.py", line 1226, in recv
    return self.read(buflen)
  File "/usr/lib/python3.8/ssl.py", line 1101, in read
    return self._sslobj.read(len)
OSError: [Errno 0] Error
[2020-07-14 15:51:00 +0000] [7] [ERROR] Socket error processing request.
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/workers/sync.py", line 133, in handle
    req = next(parser)
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/parser.py", line 41, in __next__
    self.mesg = self.mesg_class(self.cfg, self.unreader, self.req_count)
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/message.py", line 186, in __init__
    super().__init__(cfg, unreader)
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/message.py", line 53, in __init__
    unused = self.parse(self.unreader)
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/message.py", line 198, in parse
    self.get_data(unreader, buf, stop=True)
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/message.py", line 189, in get_data
    data = unreader.read()
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/unreader.py", line 37, in read
    d = self.chunk()
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/http/unreader.py", line 64, in chunk
    return self.sock.recv(self.mxchunk)
  File "/usr/lib/python3.8/ssl.py", line 1226, in recv
    return self.read(buflen)
  File "/usr/lib/python3.8/ssl.py", line 1101, in read
    return self._sslobj.read(len)
OSError: [Errno 0] Error
Exception during reset or similar
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/mysql/connector/network.py", line 249, in recv_plain
    chunk = self.sock.recv(4 - packet_len)
  File "/usr/lib/python3.8/ssl.py", line 1226, in recv
    return self.read(buflen)
  File "/usr/lib/python3.8/ssl.py", line 1101, in read
    return self._sslobj.read(len)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/pool/base.py", line 693, in _finalize_fairy
    fairy._reset(pool)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/pool/base.py", line 880, in _reset
    pool._dialect.do_rollback(self)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/dialects/mysql/base.py", line 2302, in do_rollback
    dbapi_connection.rollback()
  File "/usr/local/lib/python3.8/dist-packages/mysql/connector/connection.py", line 958, in rollback
    self._execute_query("ROLLBACK")
  File "/usr/local/lib/python3.8/dist-packages/mysql/connector/connection.py", line 970, in _execute_query
    self.cmd_query(query)
  File "/usr/local/lib/python3.8/dist-packages/mysql/connector/connection.py", line 590, in cmd_query
    result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))
  File "/usr/local/lib/python3.8/dist-packages/mysql/connector/connection.py", line 350, in _send_cmd
    return self._socket.recv()
  File "/usr/local/lib/python3.8/dist-packages/mysql/connector/network.py", line 277, in recv_plain
    raise errors.OperationalError(
mysql.connector.errors.OperationalError: 2055: Lost connection to MySQL server at '35.247.107.51:3306', system error: 104 Connection reset by peer

[2020-07-15 00:48:51,841] ERROR in app: Exception on /ask [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.8/dist-packages/flask_cors/extension.py", line 161, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.8/dist-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/nimbus/flask_api.py", line 147, in handle_question
    response = {"answer": nimbus.answer_question(question)}
  File "/nimbus/nimbus.py", line 42, in answer_question
    answer = qa.answer(ans_dict)
  File "/nimbus/QA.py", line 52, in answer
    db_data = self.db_query(extracted_vars, self.db)
  File "/nimbus/QA.py", line 231, in _chain_db_access
    for key, val in fn(extracted_vars, db).items():
  File "/nimbus/QA.py", line 95, in _get_property
    value = db.get_property_from_entity(
  File "/nimbus/database_wrapper.py", line 533, in get_property_from_entity
    props = self._get_property_from_entity(
  File "/nimbus/database_wrapper.py", line 505, in _get_property_from_entity
    for row in query_obj.all():
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/query.py", line 3233, in all
    return list(self)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/query.py", line 3388, in __iter__
    self.session._autoflush()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 1586, in _autoflush
    self.flush()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 2479, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 2571, in _flush
    flush_context.transaction = transaction = self.begin(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 947, in begin
    self.transaction = self.transaction._begin(nested=nested)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 316, in _begin
    self._assert_active()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 288, in _assert_active
    raise sa_exc.InvalidRequestError(
sqlalchemy.exc.InvalidRequestError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (mysql.connector.errors.OperationalError) MySQL Connection not available.

[2020-07-15 00:57:51,144] ERROR in app: Exception on /ask [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.8/dist-packages/flask_cors/extension.py", line 161, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.8/dist-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/nimbus/flask_api.py", line 147, in handle_question
    response = {"answer": nimbus.answer_question(question)}
  File "/nimbus/nimbus.py", line 42, in answer_question
    answer = qa.answer(ans_dict)
  File "/nimbus/QA.py", line 52, in answer
    db_data = self.db_query(extracted_vars, self.db)
  File "/nimbus/QA.py", line 231, in _chain_db_access
    for key, val in fn(extracted_vars, db).items():
  File "/nimbus/QA.py", line 95, in _get_property
    value = db.get_property_from_entity(
  File "/nimbus/database_wrapper.py", line 533, in get_property_from_entity
    props = self._get_property_from_entity(
  File "/nimbus/database_wrapper.py", line 505, in _get_property_from_entity
    for row in query_obj.all():
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/query.py", line 3233, in all
    return list(self)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/query.py", line 3388, in __iter__
    self.session._autoflush()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 1586, in _autoflush
    self.flush()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 2479, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 2571, in _flush
    flush_context.transaction = transaction = self.begin(
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 947, in begin
    self.transaction = self.transaction._begin(nested=nested)
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 316, in _begin
    self._assert_active()
  File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 288, in _assert_active
    raise sa_exc.InvalidRequestError(
sqlalchemy.exc.InvalidRequestError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (mysql.connector.errors.OperationalError) MySQL Connection not available.
[SQL: INSERT INTO `QuestionLog` (question, timestamp) VALUES (%(question)s, %(timestamp)s)]
[parameters: [{'question': 'what is?', 'timestamp': datetime.datetime(2020, 7, 15, 0, 44, 11, 221947)}]]
(Background on this error at: http://sqlalche.me/e/e3q8) (Background on this error at: http://sqlalche.me/e/7s2a)

@snekiam snekiam added the bug Something isn't working label Jul 15, 2020
@chidiewenike
Copy link
Contributor

Interesting addressing of the problem here: benoitc/gunicorn#1885

@snekiam
Copy link
Member Author

snekiam commented Aug 1, 2020

Very interesting - I wonder if GCP is doing a 'health ping' which is breaking things. I'll check that this Sunday.

@mfekadu
Copy link
Member

mfekadu commented Aug 2, 2020

@snekiam
I am unsure if GCP does health checks by default. I believe it would need to be set up by an administrator && I don't believe we have done that yet.

@snekiam
Copy link
Member Author

snekiam commented Aug 23, 2020

#195 should help this not kill the api completely, but doesn't solve the root issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants