Skip to content
This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Add DeviceMergeSort::StableSortKeysCopy API #565

Merged
merged 4 commits into from
Sep 12, 2022

Conversation

davidwendt
Copy link
Contributor

CUB has a SortKeysCopy and a StableSortKeys. This PR adds StableSortKeysCopy so a sort in-place is not required. The in-place StableSortKeys requires a temporary vector if a copy is needed.

Since StableSortKeys calls SortKeys and SortKeysCopy uses the same kernels as SortKeys (which appear to already be stable-sort enabled), the new StableSortKeysCopy simply calls SortKeysCopy.

@jrhemstad jrhemstad requested a review from gevtushenko August 31, 2022 15:58
@davidwendt davidwendt marked this pull request as ready for review September 1, 2022 20:25
Copy link
Collaborator

@gevtushenko gevtushenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution! A few minor suggestions below.

Comment on lines 1117 to 1142
template <typename KeyInputIteratorT,
typename KeyIteratorT,
typename OffsetT,
typename CompareOpT>
CUB_DETAIL_RUNTIME_DEBUG_SYNC_IS_NOT_SUPPORTED
CUB_RUNTIME_FUNCTION static cudaError_t
StableSortKeysCopy(void *d_temp_storage,
std::size_t &temp_storage_bytes,
KeyInputIteratorT d_input_keys,
KeyIteratorT d_output_keys,
OffsetT num_items,
CompareOpT compare_op,
cudaStream_t stream,
bool debug_synchronous)
{
CUB_DETAIL_RUNTIME_DEBUG_SYNC_USAGE_LOG

return StableSortKeysCopy<KeyInputIteratorT, KeyIteratorT, OffsetT, CompareOpT>(d_temp_storage,
temp_storage_bytes,
d_input_keys,
d_output_keys,
num_items,
compare_op,
stream);
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
template <typename KeyInputIteratorT,
typename KeyIteratorT,
typename OffsetT,
typename CompareOpT>
CUB_DETAIL_RUNTIME_DEBUG_SYNC_IS_NOT_SUPPORTED
CUB_RUNTIME_FUNCTION static cudaError_t
StableSortKeysCopy(void *d_temp_storage,
std::size_t &temp_storage_bytes,
KeyInputIteratorT d_input_keys,
KeyIteratorT d_output_keys,
OffsetT num_items,
CompareOpT compare_op,
cudaStream_t stream,
bool debug_synchronous)
{
CUB_DETAIL_RUNTIME_DEBUG_SYNC_USAGE_LOG
return StableSortKeysCopy<KeyInputIteratorT, KeyIteratorT, OffsetT, CompareOpT>(d_temp_storage,
temp_storage_bytes,
d_input_keys,
d_output_keys,
num_items,
compare_op,
stream);
}

This is not required. The debug_synchronous overloads exist so as not to break API. For the new functions, we don't have anything to break, so let's remove this overload.

@@ -234,6 +234,18 @@ void TestKeys(std::int64_t num_items,
CustomLess()));

AssertTrue(CheckResult(d_keys));

CubDebugExit(cub::DeviceMergeSort::StableSortKeysCopy(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized that there might be a value in something like:

Suggested change
CubDebugExit(cub::DeviceMergeSort::StableSortKeysCopy(
thrust::fill(d_keys.begin(), d_keys.end(), KeyType{});
CubDebugExit(cub::DeviceMergeSort::StableSortKeysCopy(

gevtushenko added a commit to gevtushenko/thrust that referenced this pull request Sep 9, 2022
Copy link
Collaborator

@miscco miscco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution and the excellent documentation

@gevtushenko gevtushenko merged commit 33a6a81 into NVIDIA:main Sep 12, 2022
@gevtushenko gevtushenko added testing: gpuCI passed Passed gpuCI testing. type: enhancement New feature or request. labels Sep 12, 2022
@gevtushenko gevtushenko added this to the 2.1.0 milestone Sep 12, 2022
@davidwendt davidwendt deleted the stable-sort-keys-copy branch September 12, 2022 12:46
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
testing: gpuCI passed Passed gpuCI testing. type: enhancement New feature or request.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants