JS_ToBigInt64 handles overflow incorrectly #364

Icemic · 2024-04-09T19:07:00Z

JS_ToBigInt64 handles number larger than 2^64 incorrectly.

Divive into the function, it calls JS_ToBigInt64Free then JS_ToBigInt64Free:

static int JS_ToBigInt64Free(JSContext *ctx, int64_t *pres, JSValue val)
{
    bf_t a_s, *a;

    a = JS_ToBigIntFree(ctx, &a_s, val);
    if (!a) {
        *pres = 0;
        return -1;
    }
    bf_get_int64(pres, a, BF_GET_INT_MOD);
    JS_FreeBigInt(ctx, a, &a_s);
    return 0;
}

For numbers larger than 2^64, a has a valid value so it goes to bf_get_int64 where the problem occurs.

/* The rounding mode is always BF_RNDZ. Return BF_ST_INVALID_OP if there
   is an overflow and 0 otherwise. */
int bf_get_int64(int64_t *pres, const bf_t *a, int flags)
{
    uint64_t v;
    int ret;
    if (a->expn >= BF_EXP_INF) {
        ret = BF_ST_INVALID_OP;
        if (flags & BF_GET_INT_MOD) {
            v = 0;
        } else if (a->expn == BF_EXP_INF) {
            v = (uint64_t)INT64_MAX + a->sign;
        } else {
            v = INT64_MAX;
        }
    } else if (a->expn <= 0) {
        v = 0;
        ret = 0;
    } else if (a->expn <= 63) {
#if LIMB_BITS == 32
        if (a->expn <= 32)
            v = a->tab[a->len - 1] >> (LIMB_BITS - a->expn);
        else
            v = (((uint64_t)a->tab[a->len - 1] << 32) |
                 get_limbz(a, a->len - 2)) >> (64 - a->expn);
#else
        v = a->tab[a->len - 1] >> (LIMB_BITS - a->expn);
#endif
        if (a->sign)
            v = -v;
        ret = 0;
    } else if (!(flags & BF_GET_INT_MOD)) {
        ret = BF_ST_INVALID_OP;
        if (a->sign) {
            uint64_t v1;
            v = (uint64_t)INT64_MAX + 1;
            if (a->expn == 64) {
                v1 = a->tab[a->len - 1];
#if LIMB_BITS == 32
                v1 = (v1 << 32) | get_limbz(a, a->len - 2);
#endif
                if (v1 == v)
                    ret = 0;
            }
        } else {
            v = INT64_MAX;
        }
    } else {
        slimb_t bit_pos = a->len * LIMB_BITS - a->expn;
        v = get_bits(a->tab, a->len, bit_pos);
#if LIMB_BITS == 32
        v |= (uint64_t)get_bits(a->tab, a->len, bit_pos + 32) << 32;
#endif
        if (a->sign)
            v = -v;
        ret = 0;
    }
    *pres = v;
    return ret;
}

For example, the number 2^64 + 2, its a->expn is 64. While using BF_GET_INT_MOD flag, it goes to the final else block.
As BF_GET_INT_MOD implies that modulo 2^n instead of saturation. NaN and infinity return 0, finally we got the v as 2 which is from (2^64 + 2) % 2^64, and ret=0 which means returns a right result.
Then everything mess up.

That is to say that we cannot judge the return value from JS_ToBigInt64Free (as well as JS_ToBigInt64) whether it is the actual result or the actual result mods 2^64

It seems to be a problem that has been around for a long time, and a solution has already been suggested: https://github.com/theduke/quickjs-rs/blob/master/libquickjs-sys/embed/patches/js-tobigint64-overflow.patch

I've tried to apply the patch and the problem goes fixed.
So I think maybe it should be adopted?

The text was updated successfully, but these errors were encountered:

saghul · 2024-04-09T19:19:33Z

Interesting! I guess there is no test that checks for this.

A good first step would be to add one that fails.

Icemic · 2024-04-09T19:27:27Z

Here is a Rust example: https://github.com/Icemic/quickjspp-rs/blob/221fff9e177fa5b490bf349aa74f3550eb7f0397/tests/runtime.rs#L641

create a quickjs context
just evals 9223372036854775807n + 11n, and get the result ret of type JSValue
try to get value from ret via JS_ToBigInt64 (will get 10 now)

chqrlie · 2024-04-09T19:50:24Z

Hello @Icemic, nice to see QuickJS used in the rust world...

I don't understand your point: JS_ToBigInt64Free is defined as computing BigInt value modulo 64-bit modulo.

bf_get_int64 never fails with BF_GET_INT_MOD

It follows the ECMA spec: 7.1.15 ToBigInt64 ( argument ).

You seem to expect some other behavior, what is your use case ?

Icemic · 2024-04-10T03:56:50Z

Thanks for the reply! I hadn't noticed the ECMA spec aspect before. Given that this function adheres to an established specification, it should indeed be fine.

I'm maintaining a QuickJS to Rust binding, which is forked from quickjs-rs. There's a piece of code designed to extract the actual value from a JSValue tagged as bigint, and then convert it into a Rust type: https://github.com/Icemic/quickjspp-rs/blob/221fff9e177fa5b490bf349aa74f3550eb7f0397/src/value/value.rs#L276-L306

Its approach begins with an attempt to convert using JS_ToBigInt64. If that fails, it then resorts to a string conversion. The intention behind using JS_ToBigInt64 was presumably to serve as a performance optimization. However, it seems both the original author and I have misunderstood the true definition of this function.

So, my question now shifts to: In similar binding libraries or FFI scenarios, what is the recommended way to retrieve and convert values from a TAG_BIGINIT type JSValue to a native type?

chqrlie · 2024-04-10T08:05:40Z

So, my question now shifts to: In similar binding libraries or FFI scenarios, what is the recommended way to retrieve and convert values from a JS_TAG_BIGINT type JSValue to a native type?

I would recommend adding these APIs:

int JS_IsInt64(JSContext *ctx, JSValue obj): return 1 iff the value is an integer and in the range INT64_MIN to INT64_MAX.
int JS_IsUint64(JSContext *ctx, JSValue obj): return 1 iff the value fits is an integer and in the range 0 to UINT64_MAX.

With these your can determine if JS_ToBigInt64 will give you the expected result. If the number is not in the proper range, you can use the string conversion, but I recommend using a base 16 conversion that is much faster than a base 10 conversion.

For efficient conversions to and from another language native representation, I would add these:

int JS_GetValueSign(JSContext *ctx, JSValue obj): returns 0 for positive or zero, 1 for negative, -1 for NaN and values that are not numerical.
JSValue JS_NegateFree(JSContext *ctx, JSValue obj): return the opposite of a numerical value (value is freed).
int JS_GetBigIntBitLength(JSContext *ctx, JSValue obj): returns the number of significant bits, eg: 0 for 0n, 1 for 1n and -1n, 64 for UINT64_MAX, etc.
int JS_GetBigIntBits(JSContext *ctx, JSValue obj, void *dest, size_t size, size_t count, size_t offset): get the bits by groups of 1, 2, 4 or 8 bytes (a word) into an array dest of such words (uint8_t, uint16_t, uint32_t, uint64_t respectively), from a given offset in the same unit and from least significant to most significant word. Negative values would be retrieved as 2's complement.
JSValue JS_NewBigIntFromBits(JSContext *ctx, const void *src, size_t size, size_t count, int sign): create a JS_TAG_BIGINT value from the array word provided with sign sign. Negative values should be represented in 2's complement.

The current implementation uses a general floating point representation that makes these conversion APIs non trivial but still much faster than a string conversion. We are likely to change the internal representation in the near future to make this API straightforward.

chqrlie added the enhancement New feature or request label Apr 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JS_ToBigInt64 handles overflow incorrectly #364

JS_ToBigInt64 handles overflow incorrectly #364

Icemic commented Apr 9, 2024

saghul commented Apr 9, 2024

Icemic commented Apr 9, 2024

chqrlie commented Apr 9, 2024 •

edited

Loading

Icemic commented Apr 10, 2024

chqrlie commented Apr 10, 2024

JS_ToBigInt64 handles overflow incorrectly #364

JS_ToBigInt64 handles overflow incorrectly #364

Comments

Icemic commented Apr 9, 2024

saghul commented Apr 9, 2024

Icemic commented Apr 9, 2024

chqrlie commented Apr 9, 2024 • edited Loading

Icemic commented Apr 10, 2024

chqrlie commented Apr 10, 2024

chqrlie commented Apr 9, 2024 •

edited

Loading