You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The ReindexQueueFactory class handles most of the DB queries connecting to the persistent reindex queue. This factory class is called through ReindexQueueAPIImpl the main implementation of ReindexQueueAPI.
The problem is that although we have a single ReindexThread class and thread that attempts to read and process the items in the dist_reindex_journal table we are not consistently updating the table with the same thread. Each separate thread will act in its own transaction and connection. Reads of this table by the reindex thread will be consistent as far as other transactions having completed before those updates can be processed, but we cannot currently guarantee what thread and transaction is currently updating the table which can cause conflicts.
Some of the api methods are called synchronously and will use whatever connection is in the calling method. Some are called in a post transaction trigger, that will ensure that changes in the transaction have at least completed and are visible.
One big issue is that the ReindexThread creates requests to the ElasticSearchAPI using an asynchronous method that passes the BulkProcessorListner class this defines actions to run asynchronously before and after the class. In the handleSuccess method it calls APILocator.getReindexQueueAPI().deleteReindexEntry(successful); that directly tries to update the dist_reindex_journal table to remove the entry. The problem is that due to the asynchronous nature this delete is being done on a separate thread for the ElasticSearch bulkProcessor call. The removal and update of this row can conflict with other threads.
Making the BulkProcessor called from the reindexThread synchronous may solve some of these issues ensuring that the removal is done by the main ReindexThread thread. There are still opportunities in calling the ReindexQueueFactory from multiple threads/connections simultaniously causing conflict.
One of the ways of solving this is to
make ReindexQueueFactory to the ReindexQueueAPIImpl and ensure that no other code outside of the scope of the reindex service modifies or reads the dist_reindex_journal
Instead of allowing calls to ReindexQueueAPIImpl methods directly make DB updates we use an in memory thread safe queue like a BlockingQueue to simply accept the request.
I single thread is used to process the requests and update the state in the dist_reindex_journal.
An im memory list of items that are yet to be persisted including wating response from elasticsearch can be maintained to enable de-duplication or conflict. We should not attempt to store all items awaiting reindex in memory as the list can be large, the db should be the source of truth, but the in memory list should handle the items changes that are yet to be persisted to the db. For handling large amounts of updates the thread writing to the dist_reindex_journal can write in batches from the in memory queue rather than individually.
If an item version 1 is currently being reindexed and in quick succession we create versions 2 and 3 before version 1 is up to date we can't stop the indexing of version 1, but we should not just remove the index entry after which would then prevent version 3 from being indexed. We do not need to index both version 2 and 3 though, as soon as version 1 has finished indexing we can proceed to index the latest version 3.
The text was updated successfully, but these errors were encountered:
The ReindexQueueFactory class handles most of the DB queries connecting to the persistent reindex queue. This factory class is called through ReindexQueueAPIImpl the main implementation of ReindexQueueAPI.
The problem is that although we have a single ReindexThread class and thread that attempts to read and process the items in the dist_reindex_journal table we are not consistently updating the table with the same thread. Each separate thread will act in its own transaction and connection. Reads of this table by the reindex thread will be consistent as far as other transactions having completed before those updates can be processed, but we cannot currently guarantee what thread and transaction is currently updating the table which can cause conflicts.
Some of the api methods are called synchronously and will use whatever connection is in the calling method. Some are called in a post transaction trigger, that will ensure that changes in the transaction have at least completed and are visible.
One big issue is that the ReindexThread creates requests to the ElasticSearchAPI using an asynchronous method that passes the BulkProcessorListner class this defines actions to run asynchronously before and after the class. In the handleSuccess method it calls APILocator.getReindexQueueAPI().deleteReindexEntry(successful); that directly tries to update the dist_reindex_journal table to remove the entry. The problem is that due to the asynchronous nature this delete is being done on a separate thread for the ElasticSearch bulkProcessor call. The removal and update of this row can conflict with other threads.
Making the BulkProcessor called from the reindexThread synchronous may solve some of these issues ensuring that the removal is done by the main ReindexThread thread. There are still opportunities in calling the ReindexQueueFactory from multiple threads/connections simultaniously causing conflict.
One of the ways of solving this is to
If an item version 1 is currently being reindexed and in quick succession we create versions 2 and 3 before version 1 is up to date we can't stop the indexing of version 1, but we should not just remove the index entry after which would then prevent version 3 from being indexed. We do not need to index both version 2 and 3 though, as soon as version 1 has finished indexing we can proceed to index the latest version 3.
The text was updated successfully, but these errors were encountered: