Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some VReplication e2e Refactoring #14735

Merged
merged 21 commits into from
Dec 27, 2023

Conversation

rohit-nayak-ps
Copy link
Contributor

@rohit-nayak-ps rohit-nayak-ps commented Dec 8, 2023

We need to change the required tests before we merge, I will do that once it has reviews

Description

This PR contains some simplifications and improvements of the e2e test framework. Some of these are:

  • Wait for tablets to get healthy and a primary to be elected right when we create shards add tablets to the it, rather than for every test.
  • Explicitly set UTC timezone on underlying mysql instances to fix issues where local tests fail sometimes because the local instances are on SYSTEM time zones.
  • Change mixed case tables srcCharset and dstCharset to src_charset and dst_charset, since these were causing failures on Mac.
  • Reduce global variable usage.
  • Common vtgate connection helpers.
  • Remove repeated logic around cells and cell names in the cluster.

Related Issue(s)

Partly fixes: #13224

Checklist

  • "Backport to:" labels have been added if this change should be back-ported
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on the CI
  • Documentation was added or is not required

Copy link
Contributor

vitess-bot bot commented Dec 8, 2023

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • Ensure there is a link to an issue (except for internal cleanup and flaky test fixes), new features should have an RFC that documents use cases and test cases.

Tests

  • Bug fixes should have at least one unit or end-to-end test, enhancement and new features should have a sufficient number of tests.

Documentation

  • Apply the release notes (needs details) label if users need to know about this change.
  • New features should be documented.
  • There should be some code comments as to why things are implemented the way they are.
  • There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • Is this flag really necessary?
  • Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • vtctl command output order should be stable and awk-able.

@vitess-bot vitess-bot bot added NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsIssue A linked issue is missing for this Pull Request NeedsWebsiteDocsUpdate What it says labels Dec 8, 2023
@rohit-nayak-ps rohit-nayak-ps added Component: VReplication Type: Testing and removed NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsWebsiteDocsUpdate What it says NeedsIssue A linked issue is missing for this Pull Request labels Dec 8, 2023
@github-actions github-actions bot added this to the v19.0.0 milestone Dec 8, 2023
@rohit-nayak-ps rohit-nayak-ps force-pushed the rohit/vrep-refactor-e2e branch 2 times, most recently from bf361fd to 46916a5 Compare December 14, 2023 23:10
@rohit-nayak-ps rohit-nayak-ps changed the title WIP: VReplication e2e Refactor Some VReplication e2e Refactoring Dec 17, 2023
@rohit-nayak-ps rohit-nayak-ps force-pushed the rohit/vrep-refactor-e2e branch from c7dad2a to 3035407 Compare December 17, 2023 18:55
@rohit-nayak-ps
Copy link
Contributor Author

The expected tests are due to the combining/renaming of workflows. At time of merge we will need to mark these Not Required, Merge and mark the new ones Required.

Copy link
Contributor

@mattlord mattlord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I had some relatively minor comments, questions, suggestions. Let me know what you think -- I can come back to review/discuss quickly.

"vreplication_basic",
"vreplication_cellalias",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one could also be merged IMO, but doesn't have to be here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do this later if the current set of workflows runs without flakiness.

Comment on lines +218 to +221
if vtgateConn == nil {
return
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't feel right to me. I think that we should either return the error and handle it, or we should use the test reference to fail the test within the function using e.g. require.NoError(t, err).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to race conditions this call is sometimes made after vtgates have shutdown. This is not an error.

go/test/endtoend/vreplication/cluster_test.go Outdated Show resolved Hide resolved
go/test/endtoend/vreplication/cluster_test.go Outdated Show resolved Hide resolved
return vc
}

func (vc *VitessCluster) setupVtctld() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this have an non-obvious dependency on setupVtctl(), because of the AddCell parts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we remove setupVtctl() we will need to add it to setupVtctld() but not otherwise.

numRDOnly++
}
}
numReplicas-- // account for primary, which also has replica type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is really about the fact that the shard hasn't elected a primary yet? The comment is not clear to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we wait for n replicas, the primary is not counted as one, even though the Vitess tablet type is replica.


err := vc.VtctlClient.ExecuteCommand("RebuildKeyspaceGraph", keyspace.Name)
require.NoError(t, err)

log.Infof("Waiting for throttler config to be applied on all shards")
for _, shard := range keyspace.Shards {
for _, shardName := range shardNames {
shard := keyspace.Shards[shardName]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another place where a nil check is probably worth it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only add to the map when we have a valid shard.

@@ -810,6 +916,13 @@ func (vc *VitessCluster) GetVTGateConn(t *testing.T) *mysql.Conn {
return getConnection(t, vc.ClusterConfig.hostname, vc.ClusterConfig.vtgateMySQLPort)
}

func getVTGateConn() (*mysql.Conn, func()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick comment would be nice. Just so that it's easy to see w/o looking at the function that it returns a connection (obvious) and a function to close it in a defer.

cell := vc.Cells[fkextConfig.cell]
vtgate := cell.Vtgates[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another place that IMO is worth a nil check. I'm thinking that tracking down issues when there's a panic will be much harder than a NotNil check with a helpful error message (with file and line number).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We error out right at cluster setup if we are unable to create a vtgate for a cell.

Comment on lines +114 to +117
if err != nil {
return nil
}
return conn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this. We have the t reference here, why not use require.NoError(t, err)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is specifically to do this. Look at comment for insertRow() above.

@rohit-nayak-ps
Copy link
Contributor Author

Added "Do Not Merge" so that we don't end up merging before the patch releases, since we need to change set of required tests on merging and that might need fresh pushes to open PRs.

…mp columns are failing on e2e tests on Mac

Signed-off-by: Rohit Nayak <[email protected]>
… Mac. We are not really testing case sensitiveness here

Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
…Cell which has been deleted

Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
@rohit-nayak-ps rohit-nayak-ps force-pushed the rohit/vrep-refactor-e2e branch from a068cd6 to 4ffe54e Compare December 27, 2023 10:10
Signed-off-by: Rohit Nayak <[email protected]>
@rohit-nayak-ps rohit-nayak-ps merged commit 1ee8f28 into vitessio:main Dec 27, 2023
99 of 100 checks passed
@rohit-nayak-ps rohit-nayak-ps deleted the rohit/vrep-refactor-e2e branch December 27, 2023 13:41
mattlord added a commit to planetscale/vitess that referenced this pull request Dec 27, 2023
We merged these two PRs concurrently and thus
ended up with a breakage:
  - vitessio#14735
  - vitessio#14786

This PR addresses that issue by using a locally
scoped vtgate connection according to the newly
established model.

Signed-off-by: Matt Lord <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tracking Issue: VReplication test refactoring, cleanup, documentation enhancements.
3 participants