This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

DeviceRunLengthEncode returns incorrect result with nan. #596

Closed
@kshitij12345

Description

Reproducer snippet:

#include <cub/cub.cuh>

#include <cmath>     // NAN
#include <cstdlib>   // malloc
#include <iostream>  // std::cout

#define CUB_WRAPPER(func, ...)                           \
  do {                                                   \
    size_t temp_storage_bytes = 0;                       \
    func(nullptr, temp_storage_bytes, __VA_ARGS__);      \
    void* temp_storage = nullptr;                        \
    cudaMalloc(&temp_storage, temp_storage_bytes);       \
    func(temp_storage, temp_storage_bytes, __VA_ARGS__); \
    cudaFree(temp_storage);                              \
  } while (false)

int main() {
  float* host_array;
  host_array = (float*)malloc(sizeof(float));
  // host_array[0] = 42; // Works fine!
  host_array[0] = NAN;
  float* device_array;
  float* device_out_array;
  int* counts;
  int* length;

  int num_inp = 1;
  cudaMalloc((void**)&device_array, sizeof(float) * num_inp);
  cudaMalloc((void**)&device_out_array, sizeof(float) * num_inp);
  cudaMalloc((void**)&counts, sizeof(int) * num_inp);
  cudaMalloc((void**)&length, sizeof(int) * num_inp);
  cudaMemcpy(device_array, host_array, sizeof(float), cudaMemcpyHostToDevice);

  CUB_WRAPPER(
      cub::DeviceRunLengthEncode::Encode,
      device_array,
      device_out_array,
      counts,
      length,
      num_inp);

  int length_cpu;
  cudaMemcpy(&length_cpu, length, sizeof(int), cudaMemcpyDeviceToHost);
  std::cout << length_cpu << "\n"; // prints `2`; expected `1` for a single-element input
}
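
For comparison, here is a minimal host-side sketch of what I would expect run-length encoding to produce when adjacent elements are grouped with `operator==`. `rle_reference` and `check_rle_reference` are hypothetical helpers written for this issue, not part of CUB, and the sketch makes no claim about how `cub::DeviceRunLengthEncode::Encode` is implemented internally. Since `NAN == NAN` is false, two adjacent NaNs would form two runs under this definition, but a single NaN should still produce exactly one run.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical host-side reference, not part of CUB.
// Returns (value, run length) pairs. An element joins the previous run
// only if it compares equal to its predecessor with operator==.
std::vector<std::pair<float, int>> rle_reference(const std::vector<float>& in) {
  std::vector<std::pair<float, int>> runs;
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (i > 0 && in[i] == in[i - 1]) {
      ++runs.back().second;         // extend the current run
    } else {
      runs.emplace_back(in[i], 1);  // start a new run
    }
  }
  return runs;
}

// Self-check of the expectations discussed in this issue.
inline void check_rle_reference() {
  assert(rle_reference({NAN}).size() == 1);         // one element, one run
  assert(rle_reference({NAN, NAN}).size() == 2);    // NaN never equals NaN
  assert(rle_reference({42.f, 42.f}).size() == 1);  // equal values merge
}
```

Under this reference the number of runs can never exceed the number of input elements, so a run count of `2` for `num_inp = 1` looks wrong regardless of how NaN comparisons are treated.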
