vtorc/vttablet: can't downgrade from v20 to v19 #16300
Comments
@derekperkins what is your MySQL version? That is important in this case. The PR where we adopted the new commands is #15907 and the way it's written, we'd expect this to work. |
We aren't using semi-sync, with Percona Server v8.0.36
# ENGINE SETTINGS #
default_storage_engine = InnoDB
default-tmp-storage-engine = InnoDB
# OTHER CONFIG #
default_authentication_plugin = mysql_native_password
secure_file_priv = NULL
explicit_defaults_for_timestamp = 1
group_concat_max_len = 4M
event_scheduler = 0
symbolic-links = 0
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
binlog_format = ROW
binlog_row_image = full
binlog_expire_logs_seconds = 259200 # 3 days
sync-binlog = 0
binlog-transaction-compression = ON
log-error-suppression-list = MY-013360
slow_query_log = OFF
# PERCONA
binlog_space_limit = 10G
userstat = OFF
long_query_time = 0
log_slow_rate_type = query
log_slow_verbosity = full
log_slow_rate_limit = 100
max_slowlog_size = 1G
slow_query_log_always_write_time = 2
slow_query_log_use_global_control = all
# CACHES AND LIMITS #
tmp-table-size = 32M
max-heap-table-size = 32M
max-connections = 2500
thread-cache-size = 50
open-files-limit = 65535
table-definition-cache = 4096
table-open-cache = 4096
# INNODB #
innodb-flush-method = O_DIRECT
innodb-log-files-in-group = 2
innodb-log-file-size = 4G
innodb-flush-log-at-trx-commit = 2
innodb_buffer_pool_instances = 10
innodb_buffer_pool_chunk_size = 1G
innodb-buffer-pool-size = 10G
innodb_lock_wait_timeout = 300
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000
innodb_lru_scan_depth = 2000
innodb_flush_neighbors = 0
innodb_read_io_threads = 16
innodb_write_io_threads = 16
innodb_purge_threads = 4 |
OK. Attempting to set semi-sync properties with a durability policy of "none" is clearly a bug. We'll need to fix this and backport the fix to release-20.0. |
v19 ran fine with these settings for months, and v20 runs fine now, I just wasn't able to downgrade to v19 |
And FWIW, I don't need to downgrade anymore, so this isn't urgent from my perspective. |
Hello @derekperkins! In order to debug this properly could you tell me the outputs of running the following 2 queries in your MySQL -
|
We are treating this as semi-urgent because it breaks upgrade/downgrade for the "none" DurabilityPolicy |
|
Okay, this is very interesting. From the outputs that you show, it looks like the plugin for semi-sync is loaded.
After upgrading to v20, if you upgrade your MySQL version to |
We didn't change MySQL versions at any point during this upgrade. We upgraded from v8.0.34 to v8.0.36 while on v19. We've been >= 8.0.26 since Nov 2021 and Vitess v11 |
Did you by any chance change the plugin that was being loaded?
VS
|
Oh, maybe you didn't do anything explicitly and it just happened implicitly because
For an immediate workaround, though, for anyone facing this issue: to downgrade, load the old plugins instead of the new ones in my.cnf and restart MySQL so that they take effect. |
I looked into this further today, and here is what I found. We reinitialize the
So, what I believe happened is as follows -
@derekperkins Could you let me know if ☝️ this is the correct sequence of operations? If it is, then the fix is to also downgrade |
I was able to reproduce the problem and have a fix for it - #16357 |
@GuptaManan100 sorry for the late reply, but yes, your assumptions were exactly correct |
Awesome! Thank you for confirming! We have a fix ready and should be able to merge it soon 🚀 |
#16357 has been merged. Once we do a patch release, this won't be a problem. Until then the workaround is to downgrade both mysqlctl and vttablet if downgrading from v20.0.0 to v19. |
Overview of the Issue
We're seeing vttablet OOM incredibly fast on v20.0.0 for some reason, after running fine for a couple of weeks. We attempted to downgrade to v19.0.4 to see if that changed anything, but the primary was unable to start. vtorc attempted to recover by running UndoDemotePrimary and couldn't ever succeed.
SET GLOBAL rpl_semi_sync_master_enabled = 0, GLOBAL rpl_semi_sync_slave_enabled = 0) failed: Unknown system variable 'rpl_semi_sync_master_enabled'
When I reverted that change back to v20.0.0, vtorc was able to successfully run UndoDemotePrimary.
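The "Unknown system variable" error is what MySQL returns when a statement names the old `rpl_semi_sync_master_*` / `rpl_semi_sync_slave_*` variables while only the newer `rpl_semi_sync_source` / `rpl_semi_sync_replica` plugins (available since MySQL 8.0.26) are loaded. As a rough illustration only, and not the actual Vitess code or the fix in #16357, the sketch below builds the disable statement from whichever plugins are loaded. The function name `semiSyncDisableStatement` and its input (plugin names as they might be read from `information_schema.plugins`) are hypothetical.

```go
package main

import "strings"

// semiSyncVariables maps each semi-sync plugin name to the "enabled"
// system variable that only exists while that plugin is loaded.
var semiSyncVariables = []struct {
	plugin   string
	variable string
}{
	{"rpl_semi_sync_source", "rpl_semi_sync_source_enabled"},
	{"rpl_semi_sync_replica", "rpl_semi_sync_replica_enabled"},
	{"rpl_semi_sync_master", "rpl_semi_sync_master_enabled"},
	{"rpl_semi_sync_slave", "rpl_semi_sync_slave_enabled"},
}

// semiSyncDisableStatement is a hypothetical helper. Given the plugin names
// currently loaded (assumed to come from information_schema.plugins), it
// returns a SET statement that disables semi-sync using only variables that
// exist on this server. It returns "" when no semi-sync plugin is loaded,
// in which case there is nothing to set.
func semiSyncDisableStatement(loadedPlugins []string) string {
	loaded := make(map[string]bool, len(loadedPlugins))
	for _, name := range loadedPlugins {
		loaded[strings.ToLower(name)] = true
	}

	var assignments []string
	for _, entry := range semiSyncVariables {
		if loaded[entry.plugin] {
			assignments = append(assignments, "GLOBAL "+entry.variable+" = 0")
		}
	}
	if len(assignments) == 0 {
		return ""
	}
	return "SET " + strings.Join(assignments, ", ")
}
```

With the new plugins loaded, this would produce `SET GLOBAL rpl_semi_sync_source_enabled = 0, GLOBAL rpl_semi_sync_replica_enabled = 0` rather than the statement that failed above. With no semi-sync plugin loaded, it produces no statement at all.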
Related issues:
Reproduction Steps
This was tested on a single node keyspace with only a single tablet.
Binary Version
v20.0.0 for most components, downgrading vttablet to v19.0.4
Operating System and Environment details
Log Fragments