Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reading cache breaks with int-enums with redis-py >= 4.0 #595

Open
syphar opened this issue Mar 29, 2022 · 4 comments · May be fixed by #602
Open

reading cache breaks with int-enums with redis-py >= 4.0 #595

syphar opened this issue Mar 29, 2022 · 4 comments · May be fixed by #602
Labels

Comments

@syphar
Copy link

syphar commented Mar 29, 2022

Describe the bug

When trying to update our app from redis-py==3.5.3 to redis-py==4.2.0 we stumbled onto a problem. While it only started breaking with the new redis-py version, I believe it's actually a bug here in django-redis. This also happens with any version of redis-py>=4.0.2.

I'm happy to provide a PR if this is seen as a bug.

I see in DefaultClient.encode that we have special behaviour where everything that isinstance(obj, int) will not be passed through the serializer (pickle in our case) or the compressor. In DefaultClient.decode we just try int(value) to bypass deserialization and decompression for these.

The problem is: for models.IntegerChoices and an integer-enum (see example) instance(enum_variant, int) is True. This then leads to a text representation of the enum being stored in redis (something like <Values1.CHOICE_1: 1>), which of course cannot be converted back to an integer via int(). And also it cannot be decoded with pickle.

The diff between 3.5.3 and 4.0.2 is huge ( redis/redis-py@3.5.3...v4.0.2 ) and I couldn't directly see why this worked before, but I believe our encode/decode mapping is in the wrong here.

To Reproduce
This is with a default django-redis setup.

With models.IntegerChoices:

from django.core.cache import cache
from django.db import models

class Values1(models.IntegerChoices):
    CHOICE_1 = 1
    CHOICE_2 = 2

cache.set("key", Values1.CHOICE_1)

# this breaks with : 
# UnpicklingError: invalid load key, '<'.
cache.get("key") 

With a python standard int-enum:

import enum
from django.core.cache import cache

class Values2(int, enum.Enum):
    SOMETHING_1 = 1
    SOMETHING_2 = 2


cache.set("key", Values2.SOMETHING_1)
# this breaks with : 
# UnpicklingError: invalid load key, '<'.
cache.get("key")

Expected behavior
I would expect the enum-variant to be cached and returned as enum-variant.

Stack trace
I don't think we need this, I'm happy to provide one if needed.

Environment (please complete the following information):

  • Python version: 3.9.12
  • Django Redis Version: 5.2.0
  • Django Version: 3.2.12
  • Redis Version: [e.g. 2.5.0]
  • redis-py Version: 4.2.0

Additional context
Add any other context about the problem here.

@syphar syphar added the bug label Mar 29, 2022
@WisdomPill
Copy link
Member

Hello @syphar.

To me it sounds very weird that django-redis is the issue,
as far as I remember there are no changes in the encoding or decoding.

I would recommend to do a test to be sure, forcing redis-py to version 3.x.

It seems also pretty strange to me that you want to save an enum to cache,
you can just save the integer and when you read it parse it to the desired enum. It is much faster.
For speed, security and "debuggability" reasons I would also drop pickle and use json or msgpack.

@syphar
Copy link
Author

syphar commented Mar 30, 2022

It seems also pretty strange to me that you want to save an enum to cache,
you can just save the integer and when you read it parse it to the desired enum. It is much faster.
For speed, security and "debuggability" reasons I would also drop pickle and use json or msgpack.

I get the debugability & speed points, security only if you assume your cache is compromised (but then the world is on fire anyways :) ).

In our case "ease of use" wins.

I would expect a cache with pickle to be able to cache every python object I throw at it. In our case we have a decorator that caches expensive method results no matter the arguments or return values ( similar to @cachetools.func.lru_cache ).

I would recommend to do a test to be sure, forcing redis-py to version 3.x.

I already did that, with 3.x it works, with >=4.0 it breaks.

To me it sounds very weird that django-redis is the issue,
as far as I remember there are no changes in the encoding or decoding.

In the end, we have enumeration types only since python 3.4, enumeration types for choices in Django only since 3.0.

Looking at the .encode and .decode code in the DefaultClient it seems clear to me that django-redis has a bug here. It will not encode an object that should be encoded.

( I'm happy be be convinced otherwise :) )

@WisdomPill
Copy link
Member

I just wanted to let you know that I have tried and confirmed your bug,
but I am not sure, I think it is writing some kind of representation as a string and reading it as a number,
I did not have time to finish having a look at it but I will keep you updated when I trace the root of the bug.

Thank you for raising the issue.

@WisdomPill
Copy link
Member

Hello @syphar sorry for the delay, yes were right, I will add an option to force encoding, in that case the decoding if using pickle as a serializer will always try to decode it. Would that work for you?

@WisdomPill WisdomPill linked a pull request Apr 18, 2022 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants