[FEA] Generate offsets from element labels #10955

Closed
@ttnghia

Description

Given an array of integer values, which may be the labels of some list elements (each label identifying the list an element belongs to), we want to generate an array of offsets. These offsets can then be used to create a lists column whose elements are gathered from another lists column, with the input labels serving as the basis for the gather map.

For example:

input_labels = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 5, 5]
output_offsets = [0, 4, 6, 10, 12]
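The example above can be reproduced with a small host-side sketch. This is only an illustration, not the libcudf implementation: the function name offsets_from_labels is hypothetical, it runs on the CPU with std::vector rather than on device columns, and it assumes the labels are sorted so that equal labels are contiguous. It also assumes, as in the example, that labels which never appear (3 and 4 here) produce no entry in the output.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side helper, not part of libcudf.
// Assumes `labels` is sorted in non-decreasing order, so each run of equal
// labels corresponds to exactly one output list.
std::vector<std::int32_t> offsets_from_labels(std::vector<std::int32_t> const& labels)
{
  // The first offset is always 0, even for empty input.
  std::vector<std::int32_t> offsets{0};

  // Every position where the label changes starts a new list.
  for (std::size_t i = 1; i < labels.size(); ++i) {
    if (labels[i] != labels[i - 1]) { offsets.push_back(static_cast<std::int32_t>(i)); }
  }

  // The last offset closes the final list at the total element count.
  if (!labels.empty()) { offsets.push_back(static_cast<std::int32_t>(labels.size())); }

  return offsets;
}
```

A device-side version would more likely count the elements per label and then run a scan over the counts. If absent labels should instead yield empty lists, the function would also need the number of lists as an input.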

This basically extracts the existing function from drop_list_duplicates, whose signature begins:

std::unique_ptr<column> generate_output_offsets(size_type num_lists,

As such, it will continue to be used in drop_list_duplicates. In addition, it should be used to implement the set-like operations (#10409).

Labels

Spark: Functionality that helps Spark RAPIDS
feature request: New feature or request
libcudf: Affects libcudf (C++/CUDA) code.
non-breaking: Non-breaking change
