Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dictionary array transpose not handle null in kernel function TransposeInts #44827

Open
chenbaggio opened this issue Nov 25, 2024 · 6 comments
Open

Comments

@chenbaggio
Copy link

Describe the enhancement requested

Due to some unify dictionary applications not handling the index value corresponding to null, a very large value may randomly occur. As a result, in the final TransposeInts function, src[index] may become a very large value, even exceeding the range of transpose_map, leading to a crash.”

If this is a description of a bug, it suggests that the merge dictionary application should properly handle null indices to avoid generating invalid or out-of-bounds index values, which can cause crashes in later stages of the program

Component(s)

C++

@mapleFU
Copy link
Member

mapleFU commented Nov 25, 2024

Would you mind provide the crash stack? Seems that TransposeInts itself should not handle validity logic, perhaps the caller of TransposeInts wrongly handles validity bitmap

@chenbaggio
Copy link
Author

the core stack has been removed, and here fix the issue using constant index value 0 instead of random value,

the core stack is in TransposeInts , the code snippet:

···
dest[1] = static_cast(transpose_map[src[1]]);

···
here the src[1] should be null, but here not check, the src[1] is random value, a massive number, so crash

here question is :

should check the null bitmap in here or handle it in application, make sure the index would not overflow ?

@chenbaggio
Copy link
Author

chenbaggio commented Nov 26, 2024

another fix version is implement the version which check the bitmap:


template <typename InputInt, typename OutputInt>
void TransposeIntsWithNullBitmap(const InputInt* src, OutputInt* dest, int64_t length,
                   const int32_t* transpose_map, const uint8_t * null_bitsmap, int64_t null_offset) {

  if (null_bitsmap != nullptr) {
     int64_t idx = 0;  
    while (length > 0) {  
    if (bit_util::GetBit(null_bitsmap, (null_offset + idx))) {
        *dest = static_cast<OutputInt>(transpose_map[*src]);
     }
      --length, ++src, ++idx, ++dest;
    }
  } else {
    TransposeInts(src, dest, length, transpose_map);
  }
}
template <typename SrcType>
struct TransposeIntsDestWithNullBitmap {
  const SrcType* src;
  uint8_t* dest;
  int64_t dest_offset;
  int64_t length;
  const int32_t* transpose_map;
  const uint8_t* null_bitmap;
  int64_t null_offset;

  template <typename T>
  enable_if_integer<T, Status> Visit(const T&) {
    using DestType = typename T::c_type;
    TransposeIntsWithNullBitmap(src, reinterpret_cast<DestType*>(dest) + dest_offset, length,
                  transpose_map, null_bitmap, null_offset);
    return Status::OK();
  }

  Status Visit(const DataType& type) {
    return Status::TypeError("TransposeInts received non-integer dest_type");
  }

  Status operator()(const DataType& type) { return VisitTypeInline(type, this); }
};


struct TransposeIntsSrcWithNullBitmap {
  const uint8_t* src;
  uint8_t* dest;
  int64_t src_offset;
  int64_t dest_offset;
  int64_t length;
  const int32_t* transpose_map;
  const DataType& dest_type;
  const uint8_t *null_bitmap;

  template <typename T>
  enable_if_integer<T, Status> Visit(const T&) {
    using SrcType = typename T::c_type;
    return TransposeIntsDestWithNullBitmap<SrcType>{reinterpret_cast<const SrcType*>(src) + src_offset,
                                      dest, dest_offset, length,
                                      transpose_map, null_bitmap, src_offset}(dest_type);
  }

  Status Visit(const DataType& type) {
    return Status::TypeError("TransposeInts received non-integer dest_type");
  }

  Status operator()(const DataType& type) { return VisitTypeInline(type, this); }
};


Status TransposeIntsWithNullBitmap(const DataType& src_type, const DataType& dest_type,
                     const uint8_t* src, uint8_t* dest, int64_t src_offset,
                     int64_t dest_offset, int64_t length, const int32_t* transpose_map,
					 const uint8_t *null_bitmap) {
  TransposeIntsSrcWithNullBitmap transposer{src, dest, src_offset, dest_offset,
                              length, transpose_map, dest_type, null_bitmap};
  return transposer(src_type);
}

@mapleFU
Copy link
Member

mapleFU commented Nov 26, 2024

I mean, the internal::TransposeInts might not designed for null-inputs, so, the I think TransposeIntsWithNullBitmap is a good way, but perhaps we need to find which caller triggers this problem.

There're TransposeDictIndices and concatenate which calls TransposeInts, so i mean which caller might trigger this problem?

@chenbaggio
Copy link
Author

TransposeDictIndices is the trigger, you can consturct a dictionary array with null value, and its index set to a massive value, the issue may reproduced

@mapleFU
Copy link
Member

mapleFU commented Nov 26, 2024

I've check the code of CompactTransposeMapVisitor and RecursiveUnifier::Unify, in tuition they didn't generate invalid ranges... I don't know which caller would make this happens 🤔, would you mind show the upper caller of this case, or add a test in this case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants