Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update: Make database usage thread safe #341

Closed
wants to merge 1 commit into from

Conversation

fstachura
Copy link
Collaborator

Current update script serializes database access using mutexes. According to a user of Oracle support forums, this is not enough.

https://forums.oracle.com/ords/apexds/post/berkeley-db-file-corrupted-while-operating-for-hours-panic-4953

if you are accessing the same database from multiple threads or
multiple processes, they must share a cache (memory pool). In other
words, it is not sufficient to just make sure no DB->put or DB->get
operations are run simultaneously as you do with mutexes. Berkeley
DB also maintains information about database files across calls in
the cache, such as the list of free pages. If two threads accessing
a database file have independent freelists, they will eventually
both try to allocate the same page for different purposes, and the
structure of the file will be compromised.

DB.open provides a flag that should be specified if database is to be shared between threads

https://docs.oracle.com/cd/E17276_01/html/api_reference/C/dbopen.html

DB_THREAD
Cause the DB handle returned by DB->open() to be free-threaded; that
is, concurrently usable by multiple threads in the address space.
You should use this flag only in the absence of an encompassing
environment.

While this probably won't solve all database concurrency issues (web accessing the database during updates likely still will behave weird) it could help with recent database corruption issues.

https://docs.oracle.com/cd/E17276_01/html/programmer_reference/program_mt.html

The DB_THREAD flag must be specified to the DB_ENV->open() and
DB->open() methods if the Berkeley DB handles returned by those
interfaces will be used in the context of more than one thread.
Setting the DB_THREAD flag inconsistently may result in database
corruption.

When using the non-cursor Berkeley DB calls to retrieve key/data
items (for example, DB->get()), the memory to which the pointer
stored into the Dbt refers is valid only until the next call using
the DB handle returned by DB->open(). This includes any use of the
returned DB handle, including by another thread within the process.

For this reason, if the DB_THREAD handle was specified to the
DB->open() method, either DB_DBT_MALLOC, DB_DBT_REALLOC or
DB_DBT_USERMEM must be specified in the DBT when performing any
non-cursor key or data retrieval.

It seems that bsddb3 sets appropriate flags in DBTs for us if DB_THREAD is specified.

https://hg.jcea.es/pybsddb/file/tip/src/Module/berkeleydb.c#l2025

(ctrl+f for DB_THREAD)

I believe DBTs used in DB_put shouldn't require any extra flags because the DBTs are only read by Berkeley DB (doesn't matter if they get invalidated on the next call).

Current update script serializes database access using mutexes.
According to a user of Oracle support forums, this is not enough.

https://forums.oracle.com/ords/apexds/post/berkeley-db-file-corrupted-while-operating-for-hours-panic-4953
> if you are accessing the same database from multiple threads or
> multiple processes, they must share a cache (memory pool). In other
> words, it is not sufficient to just make sure no DB->put or DB->get
> operations are run simultaneously as you do with mutexes. Berkeley
> DB also maintains information about database files across calls in
> the cache, such as the list of free pages. If two threads accessing
> a database file have independent freelists, they will eventually
> both try to allocate the same page for different purposes, and the
> structure of the file will be compromised.

DB.open provides a flag that should be specified if database is to be
shared between threads

https://docs.oracle.com/cd/E17276_01/html/api_reference/C/dbopen.html

> DB_THREAD
> Cause the DB handle returned by DB->open() to be free-threaded; that
> is, concurrently usable by multiple threads in the address space.
> You should use this flag only in the absence of an encompassing
> environment.

While this probably won't solve all database concurrency issues (web
accessing the database during updates likely still will behave weird)
it could help with recent database corruption issues.

https://docs.oracle.com/cd/E17276_01/html/programmer_reference/program_mt.html

> The DB_THREAD flag must be specified to the DB_ENV->open() and
> DB->open() methods if the Berkeley DB handles returned by those
> interfaces will be used in the context of more than one thread.
> Setting the DB_THREAD flag inconsistently may result in database
> corruption.

> When using the non-cursor Berkeley DB calls to retrieve key/data
> items (for example, DB->get()), the memory to which the pointer
> stored into the Dbt refers is valid only until the next call using
> the DB handle returned by DB->open(). This includes any use of the
> returned DB handle, including by another thread within the process.
>
> For this reason, if the DB_THREAD handle was specified to the
> DB->open() method, either DB_DBT_MALLOC, DB_DBT_REALLOC or
> DB_DBT_USERMEM must be specified in the DBT when performing any
> non-cursor key or data retrieval.

It seems that bsddb3 sets appropriate flags in DBTs for us if DB_THREAD
is specified.

https://hg.jcea.es/pybsddb/file/tip/src/Module/berkeleydb.c#l2025

(ctrl+f for DB_THREAD)

I believe DBTs used in DB_put shouldn't require any extra flags because
the DBTs are only read by Berkeley DB (doesn't matter if they get
invalidated on the next call).
@fstachura
Copy link
Collaborator Author

fstachura commented Oct 1, 2024

I have done a basic test by indexing a database for musl. I didn't see any changes when compared with original database, but the test was rather short.
I'm still going to implement the solution discussed in #328 as I think it should be good enough for now.
This is kind of separate, as the part of the problem probably is inside of the update script itself.

Copy link
Member

@tleb tleb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart for the nit, I am testing the series out, as a second pass.

elixir/data.py Show resolved Hide resolved
@tleb
Copy link
Member

tleb commented Oct 11, 2024

Merged as 0b8d735. Added tiny commit c54a4c3 on top. Tested indexing and web on staging. Then did the first indexing using that patch on prod manually to see it working. Also modified the crontab to use the virtualenv. Thanks!

@tleb tleb closed this Oct 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants