Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix caching detection for PostgreSQL 16+ versions #274

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

ashenBlade
Copy link
Contributor

For caching separate MemoryContext context is used. And to detect that StringInfo instance is cached we use MemoryContextContains function.

But this works only for PG below 16 version - starting from 16 header, which used to contain pointer to MemoryContext, now used as special value with bitfields. So now this function can not be used - it just incorrect.

I fixed this in such way. For PG <16 behaviour does not change - we use MemoryContextContains. But for newer we create special decorator like this:

typedef struct cached_StringInfo
{
    StringInfoData data;
    bool cached;
} cached_StringInfo;

When we saved string info to cache we repalloc real StringInfo with this struct and set cached member to true (otherwise to false). It should not hit performance, because size of struct isn't really big and we just do noop (in repalloc).

This fixes bug described in #273

ashenBlade and others added 3 commits October 7, 2024 18:57
To detect cached StringInfo instance we repalloc decorator structure
with 'cache' flag - this instance created in cached memory context.
Older versions used 'MemoryContextContains' function which is not
available in newer versions (deleted).
@JerrySievert
Copy link
Contributor

do you happen to have a test case that exposes the issue?

@ashenBlade
Copy link
Contributor Author

Yes. That case passed, at least

@ivan-v-kush
Copy link
Contributor

@ashenBlade don't we need somehow to take into account also pfree in EvictCache?

static void
EvictCache(uint64 size)
{
...
	StringInfo str = entry->store;
				if (str->data) {
					pfree(str->data);
				}

				pfree(str);

...
}
  1. We have valueBuffer in chunkData->valueBufferArray[columnIndex]
static ChunkData *
DeserializeChunkData(StripeBuffers *stripeBuffers, uint64 chunkIndex,
					 uint32 rowCount, TupleDesc tupleDescriptor,
					 List *projectedColumnList, StripeReadState *state, uint64 stripeId)
{
...
    chunkData->valueBufferArray[columnIndex] = valueBuffer;
...
}
  1. Then we try to free a cache, so call pfree(valueBuffer) in the cache.
static void
EvictCache(uint64 size)
{
...
	StringInfo str = entry->store;
				if (str->data) {
					pfree(str->data);
				}

				pfree(str);

...
}
  1. And our chunkData->valueBufferArray[columnIndex] will be invalid.
  2. In FreeChunkData we try to pfree chunkData->valueBufferArray[columnIndex] as it's not NULL. But we have already pfreed this memory when we have emptied 10% of a chache. And this memory was returned to memory pool.
void
FreeChunkData(ChunkData *chunkData)
{
...
	for (columnIndex = 0; columnIndex < chunkData->columnCount; columnIndex++)
	{
		if (chunkData->existsArray[columnIndex] != NULL)
		{
			pfree(chunkData->existsArray[columnIndex]);
		}

		if (chunkData->valueArray[columnIndex] != NULL)
		{
			pfree(chunkData->valueArray[columnIndex]);
		}
	}
...
}

So we pfree a garbage

@ivan-v-kush
Copy link
Contributor

ivan-v-kush commented Dec 20, 2024

Maybe use uint8 to note, in how many resources valueBuffer is used? Takes 1 byte as a bool

typedef struct cached_StringInfo
{
    StringInfoData data;
    uint8 count = 0;
};
if(AddToCache)
  count++;

if(AddToArr) {
  count++
}

if(Remove)
    count--;
if(count == 0)
  pfree

@ashenBlade
Copy link
Contributor Author

No, we do not free garbage. valueBufferArray elements freed:

  • Always in EvictCache
  • In FreeChunkBufferValueArray only if cache = false

But StringInfo may be in cache only if cache = true, so we do not need to check this.

existsArray and valueArray are not cached (only valueBufferArray cached), so no need to check them.

@ivan-v-kush
Copy link
Contributor

ivan-v-kush commented Dec 23, 2024

I've added a regression test, you can add it to your MR. +1 vote for your approach
https://github.com/hydradatabase/hydra/pull/273/files

@ashenBlade
Copy link
Contributor Author

Thanks, @ivan-v-kush. I've added these tests to my branch and they pass

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants