[BUG] Parquet chunked reader can throw an 'unexpected short subpass' exception under certain conditions.


This was discovered as a byproduct of changes to nvcomp's temporary memory usage for decompression. The change caused us to produce a slightly different set of chunks, which exposed an underlying bug in the chunked reader itself (nvcomp was not doing anything wrong). Spark Rapids customers have hit this as well, under difficult-to-reproduce conditions, so it is useful to have a clean repro case here.

To reproduce, build cudf using nvcomp 4.2.0.11 (https://github.com/rapidsai/cudf/pull/18042) and run the tests. Two of the list tests, `ParquetChunkedReaderInputLimitConstrainedTest.MixedColumns` and `ParquetChunkedReaderInputLimitTest.List`, will throw the exception.

[BUG] Parquet chunked reader can throw an 'unexpected short subpass' exception under certain conditions. #18043
