Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

take pruning logic out of create_transaction logic #11845

Merged
merged 17 commits into from
Jan 25, 2024

Conversation

poopoothegorilla
Copy link
Contributor

@poopoothegorilla poopoothegorilla commented Jan 22, 2024

This PR removes the pruning logic from the createTransaction logic

  • txStore.CreateTransaction no longer prunes the transaction queue
  • txmgr.SendNativeToken calls PruneQueue after successful txStore.CreateTransaction
  • txmgr.CreateTransaction calls PruneQueue after successful txStore.CreateTransaction
  • changes PruneQueue and PruneUnstartedTxQueue to return ids []int64, err error instead of n int64, err error
  • Updated impacted tests

This change helps the in-memory work move forward #11625

Copy link
Contributor

I see that you haven't updated any README files. Would it make sense to do so?

@poopoothegorilla poopoothegorilla marked this pull request as draft January 22, 2024 20:16
@poopoothegorilla poopoothegorilla marked this pull request as ready for review January 22, 2024 21:49
@patrick-dowell
Copy link
Contributor

Is it important that we always call PruneQueue after txStore.CreateTransaction? What's the motivation behind removing it from txStore.CreateTransaction? Also, what's the reasoning behind the change to PruneUnstartedTxQueue to return the array of ids?

@poopoothegorilla
Copy link
Contributor Author

poopoothegorilla commented Jan 22, 2024

@patrick-dowell

Is it important that we always call PruneQueue after txStore.CreateTransaction?

yes, since we want to make sure that adding unstarted transactions to the unstarted transaction queue also keeps the queue a manageable size

What's the motivation behind removing it from txStore.CreateTransaction?
Also, what's the reasoning behind the change to PruneUnstartedTxQueue to return the array of ids?

The biggest reason is due to the in-memory work. The original txstore.CreateTransaction method returns the created transaction and an error. Since the in-memory store needs to ensure that the state of the transactions inmemory match the presisted txns we needed stronger gaurantees that the pruned transactions are the same. If we were to do this without this step we could run the risk of potential divergence of data. So the idea is that the inmemory unstarted queue will take the pruned persisted txns as truth and remove those from inmemory. We can then have a future work step to remove or reverse the evm_txstore prune step and just use the inmemory step instead.

The other reason to remove the prune step from the evm txstore create transaction method is since this prune step shouldnt be chain specific for createtransaction... the pruning strategy might... but in my opinion it makes sense to always ensure prune is called in the generalized txmgr after a createtransaction.

core/chains/evm/txmgr/evm_tx_store.go Outdated Show resolved Hide resolved
common/txmgr/txmgr.go Outdated Show resolved Hide resolved
common/txmgr/txmgr.go Outdated Show resolved Hide resolved
common/txmgr/txmgr.go Outdated Show resolved Hide resolved
common/txmgr/txmgr.go Show resolved Hide resolved
common/txmgr/strategies.go Outdated Show resolved Hide resolved
core/chains/evm/txmgr/evm_tx_store.go Show resolved Hide resolved
Copy link
Contributor

@patrick-dowell patrick-dowell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good, but I am confused about the queueSize math here.

var cancel context.CancelFunc
ctx, cancel = context.WithTimeout(ctx, s.queryTimeout)
defer cancel()

n, err = pruneService.PruneUnstartedTxQueue(ctx, s.queueSize, s.subject)
// NOTE: We prune one less than the queue size because we will be adding a new transaction to the queue right after this PruneQueue call
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comments helps! But I'm not yet sure why we can't just use the queueSize. The prune is occurring before the create call right? This means that the new transaction isn't in the queue yet, so it should be safe to use s.queueSize as opposed to queueSize - 1 right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concern that prashant brought up was that if we didnt do a -1 we could end up with a queue size +1 above the max queue size. Here is an example

  1. Queue is full at 500 txns
  2. PruneQueue is run... it sees 500txns which matches the max so it moves on without pruning
  3. CreateTransaction puts a new txn in the queue making the new queue size 501 which is technically above the max limit of 500

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be only off by 1 if we chose to stick with queueSize, but let me know if i am not fully understanding your idea

core/chains/evm/txmgr/strategies_test.go Show resolved Hide resolved
@@ -682,3 +682,36 @@ func (n *NullTxManager[CHAIN_ID, HEAD, ADDR, TX_HASH, BLOCK_HASH, SEQ, FEE]) Fin
func (n *NullTxManager[CHAIN_ID, HEAD, ADDR, TX_HASH, BLOCK_HASH, SEQ, FEE]) CountTransactionsByState(ctx context.Context, state txmgrtypes.TxState) (count uint32, err error) {
return count, errors.New(n.ErrMsg)
}

func (b *Txm[CHAIN_ID, HEAD, ADDR, TX_HASH, BLOCK_HASH, R, SEQ, FEE]) pruneQueueAndCreateTxn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It occurred to me during our call that both PruneQueue and createTransaction touch the queue, but there's no locking on this function. As a result, its possible there could be a race condition that occurs when multiple pruneQueueAndCreateTxn calls are done at the same time. For example, if thread A calls pruneQueue and then gets pre-empted, then thread B could run pruneQueueAndCreate putting the queueSize to the max queue size, and then when thread A finishes, it could then call createTransaction when the queueSize is at maximum, thereby allowing the queue to grow beyond the max queue size.

My question here is: do we need a lock to guard the interactions with the queue within this function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's a good point!

We do have a constraint in our other loops where only 1 thread is operating for each fromAddress. Thus no thread safety issues.
However the CreateTransaction method is called by the client, and can be called concurrently.
So the situation you described can potentially happen.
Note that the only harm in this case would be that our queue will momentarily grow to a larger size. As soon as there's yet another call from client, that prune will again cut down the queue size to right values.

So this isn't a big problem, except that theoretically we could have a larger queue.

I won't mind adding a new lock to prevent this situation, but even if we don't, it won't break/affect anything, according to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great catch patrick. Added the Lock

Copy link
Contributor

@patrick-dowell patrick-dowell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks for the changes

@cl-sonarqube-production
Copy link

@prashantkumar1982 prashantkumar1982 added this pull request to the merge queue Jan 25, 2024
Merged via the queue into develop with commit e214014 Jan 25, 2024
93 checks passed
@prashantkumar1982 prashantkumar1982 deleted the jtw/refactor-create-transaction branch January 25, 2024 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants