Improve `dpnp.partition` implementation by antonwolfy · Pull Request #2766 · IntelPython/dpnp

antonwolfy · 2026-02-11T19:33:15Z

This PR proposes to improve the implementation and to use a dpnp.sort call when:

- the input array has more than one dimension
- the input array has an integer dtype that was previously not supported
- the axis keyword is passed (previously not supported)
- a sequence of kth values is passed (previously not supported)

For ndim > 1, the implementation from the legacy backend was previously used, which is significantly slower (see the performance comparison below). It copied the input data into shared USM memory and performed computations on the host.

This PR proposes to reuse dpnp.sort for all the above cases.
Where the legacy implementation is stable and fast (for a 1D input array), it remains in place, because it relies on std::nth_element from OneDPL.
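
Falling back to a full sort is valid because a sorted array satisfies the partition contract for every kth at once. The following is a minimal sketch of that reasoning, using NumPy as a stand-in for dpnp; the helper `is_partitioned` is hypothetical and not part of either library, and only non-negative kth values are considered:

```python
import numpy as np


def is_partitioned(x, kth, axis=-1):
    """Check the partition contract along `axis` for one non-negative `kth`."""
    x = np.moveaxis(np.asarray(x), axis, -1)
    pivot = x[..., kth:kth + 1]
    # Everything before kth must be <= pivot, everything from kth on >= pivot.
    left_ok = np.all(x[..., :kth] <= pivot)
    right_ok = np.all(x[..., kth:] >= pivot)
    return bool(left_ok and right_ok)


rng = np.random.default_rng(117)
a = rng.integers(0, 1000, size=(50, 40))

s = np.sort(a, axis=1)           # full sort along the axis
p = np.partition(a, 7, axis=1)   # reference partition for a single kth

# A sorted array is a valid partition for any kth, including a sequence of them.
assert all(is_partitioned(s, k, axis=1) for k in (0, 7, 39))
assert is_partitioned(p, 7, axis=1)
# The element at position kth is the same in both results.
assert np.array_equal(s[:, 7], p[:, 7])
```

The trade-off is that sorting does more work than strictly required, which is why keeping a dedicated nth_element path for the 1D case can still make sense.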

The benchmark results were collected on PVC with the help of the code below:

import dpnp, numpy as np
from dpnp.tests.helper import generate_random_numpy_array

a = generate_random_numpy_array(10**7, dtype=np.float64, seed_value=117)
ia = dpnp.array(a)
%timeit x = dpnp.partition(ia, 513); x.sycl_queue.wait()
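
Outside IPython, a similar measurement can be approximated with the standard `timeit` module. This is only a sketch: the helper name `time_partition` is an assumption, the input is generated with NumPy's random generator rather than `generate_random_numpy_array`, and the demo call runs on NumPy with a smaller array. Passing `dpnp` as `xp` should exercise the device path, assuming `dpnp.asarray` accepts the NumPy input.

```python
import timeit

import numpy as np


def time_partition(xp, shape, kth, repeats=5, seed=117):
    """Return the best wall-clock time in seconds of xp.partition over `repeats` runs."""
    rng = np.random.default_rng(seed)
    a = xp.asarray(rng.random(shape))  # float64 input, like the benchmark in the PR

    def run():
        x = xp.partition(a, kth)
        # dpnp arrays expose a SYCL queue; wait so the kernel time is included.
        queue = getattr(x, "sycl_queue", None)
        if queue is not None:
            queue.wait()

    run()  # warm-up run, excluded from the timing
    return min(timeit.repeat(run, number=1, repeat=repeats))


best = time_partition(np, (10**5,), 513)
print(f"best of 5: {best * 1e3:.3f} ms")
```

The numbers reported in this PR come from the `%timeit` runs on PVC, not from this helper.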

The table below contains data for a 1D input array (shape=(10**7,)), where the implementation path is unchanged, plus support for the previously missing integer dtypes via a fallback to the sort function:

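| Implementation | int32 | uint32 | int64 | uint64 | float32 | float64 | complex64 | complex128 |
|--------|--------|--------|--------|--------|--------|--------|--------|--------|
| old (legacy backend) | 7.46 ms | not supported | 9.46 ms | not supported | 7.39 ms | 8.92 ms | 10.9 ms | 21.2 ms |
| new (backend + sort) | 7.34 ms | 10.8 ms | 9.48 ms | 12.5 ms | 7.37 ms | 8.89 ms | 11 ms | 21.2 ms |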

The following code was used for a 2D input array with shape=(10**4, 10**4):

import dpnp, numpy as np
from dpnp.tests.helper import generate_random_numpy_array

a = generate_random_numpy_array((10**4, 10**4), dtype=np.float64, seed_value=117)
ia = dpnp.array(a)
%timeit x = dpnp.partition(ia, 1513); x.sycl_queue.wait()

In that case the new implementation is fully based on the sort call:

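| Implementation | int32 | int64 | float32 | float64 | complex64 | complex128 |
|--------|--------|--------|--------|--------|--------|--------|
| old (legacy backend) | 6.4 s | 6.89 s | 7.36 s | 7.66 s | 8.61 s | 10 s |
| new (sort) | 57.4 ms | 64.7 ms | 62.2 ms | 68 ms | 77 ms | 151 ms |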
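
To make the 2D comparison easier to read, the speedup per dtype can be derived directly from the timings in the table above. The short calculation below uses only those reported values (old in seconds, new in milliseconds); the variable names are illustrative.

```python
# Timings copied from the 2D table: legacy backend in seconds, sort-based path in ms.
old_s = {"int32": 6.4, "int64": 6.89, "float32": 7.36,
         "float64": 7.66, "complex64": 8.61, "complex128": 10.0}
new_ms = {"int32": 57.4, "int64": 64.7, "float32": 62.2,
          "float64": 68.0, "complex64": 77.0, "complex128": 151.0}

# Convert the new timings to seconds before dividing.
speedup = {dt: old_s[dt] / (new_ms[dt] / 1000.0) for dt in old_s}

for dt, factor in speedup.items():
    print(f"{dt:>10}: {factor:.0f}x")
```

With these inputs the speedup ranges from roughly 66x (complex128) to roughly 118x (float32).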

Have you provided a meaningful PR description?
Have you added a test, reproducer or referred to an issue with a reproducer?
Have you tested your changes locally for CPU and GPU devices?
Have you made sure that new changes do not introduce compiler warnings?
Have you checked performance impact of proposed changes?
Have you added documentation for your changes, if necessary?
Have you added your changes to the changelog?

github-actions · 2026-02-11T19:55:21Z

View rendered docs @ https://intelpython.github.io/dpnp/index.html

github-actions · 2026-02-11T20:03:19Z

Array API standard conformance tests for dpnp=0.20.0dev2=py313h509198e_33 ran successfully.
Passed: 1353
Failed: 4
Skipped: 7

coveralls · 2026-02-11T20:13:12Z

coverage: 81.124% (-0.04%) from 81.167%
when pulling 6422635 on resolve-gh-2762
into 9d6d5a5 on master.

vlad-perevezentsev · 2026-02-16T10:37:19Z

dpnp/dpnp_iface_sorting.py

+    axis = normalize_axis_index(axis, nd)
+    length = a.shape[axis]
+
+    if isinstance(kth, int):


This looks too strict, because it does not support integer-like scalars such as dpnp.int64.
In other dpnp functions this type is accepted for axes:

# axis : {None, int, tuple}, optional
In [22]: dpnp.count_nonzero(a, axis=dpnp.int64(0))
Out[22]: array([1, 1, 2, 1])

# axis : {None, int, tuple of ints}, optional
np.all(x, axis=np.int64(0))
Out[25]: array([ True, False])

antonwolfy · 2026-02-16

But axes are validated by a dedicated function, normalize_axis_tuple, from NumPy.

vlad-perevezentsev · 2026-02-16

I meant the kth argument:

dpnp.partition(a, dpnp.int64(4), axis=None)
TypeError: kth must be int or sequence of ints, but got <class 'numpy.int64'>

Should we use operator.index for this, as is done for k in dpnp.tri?

antonwolfy · 2026-02-16

We could, but it is not mandatory, because the documentation explicitly states support for int and a sequence of ints.
We could implement a general improvement and add a helper function such as normalize_int_sequence, but I'd prefer to do that separately.
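
For illustration, a helper along the lines discussed above could be built on `operator.index`, which accepts any object implementing `__index__` (including NumPy integer scalars) and rejects floats. This is a hypothetical sketch, not code from the PR: the name `normalize_int_sequence` is taken from the comment, while its signature, bounds check, and error messages are assumptions.

```python
import operator

import numpy as np


def normalize_int_sequence(kth, length):
    """Return `kth` as a tuple of non-negative Python ints valid for `length`."""
    try:
        # Scalar case: plain ints and integer-like scalars such as np.int64.
        items = (operator.index(kth),)
    except TypeError:
        try:
            # Sequence case: every element must itself be integer-like.
            items = tuple(operator.index(k) for k in kth)
        except TypeError as exc:
            raise TypeError(
                f"kth must be int or sequence of ints, but got {type(kth)}"
            ) from exc

    out = []
    for k in items:
        if not -length <= k < length:
            raise ValueError(f"kth(={k}) out of bounds ({length})")
        out.append(k % length)  # map negative indices to their positive form
    return tuple(out)


print(normalize_int_sequence(np.int64(4), 10))
print(normalize_int_sequence([1, np.int32(-1)], 10))
```

One design point to note: `operator.index` also accepts `bool`, since `bool` is a subclass of `int`, so a stricter helper would need an explicit check if that is undesirable.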

dpnp/backend/kernels/dpnp_krnl_sorting.cpp

vlad-perevezentsev

LGTM
Thank you @antonwolfy !

antonwolfy added 5 commits February 11, 2026 10:54

Use dpnp.sort() for nd > 1 which speeds up dpnp.partition and resolve…

2da3758

…s the computation issue

Update backend implememntation of dpnp.partition to base on std::nth_…

bd67b44

…element from OneDPL

Update dpnp.ndarray.partition method

3967a79

Update third party tests

c90f0b2

Update dpnp tests

680ae20

antonwolfy added this to the 0.20.0 release milestone Feb 11, 2026

antonwolfy self-assigned this Feb 11, 2026

antonwolfy added 2 commits February 11, 2026 12:32

Add a test to improve the coverage

69b42d4

Add PR to the changelog

f70212a

antonwolfy linked an issue Feb 11, 2026 that may be closed by this pull request

dpnp.partition fails on 2D array of complex dtype #2762

Closed

antonwolfy added 3 commits February 12, 2026 02:05

Needs to call dpnp.sort() in case of unsupported integer dtypes in le…

06d3950

…gacy backend

Add more testing with different dtypes

b939685

Add more tests for axis keyword

5f41788

antonwolfy marked this pull request as ready for review February 12, 2026 13:05

antonwolfy requested review from ndgrigorian and vlad-perevezentsev as code owners February 12, 2026 13:05

vlad-perevezentsev reviewed Feb 16, 2026

Add a note when a multidemension array passed into dpnp_partition

9ebb038

vlad-perevezentsev self-requested a review February 16, 2026 14:40

vlad-perevezentsev approved these changes Feb 16, 2026

Merge branch 'master' into resolve-gh-2762

6422635

antonwolfy merged commit 9c4aed2 into master Feb 16, 2026
94 of 97 checks passed

antonwolfy deleted the resolve-gh-2762 branch February 16, 2026 19:16

antonwolfy merged 12 commits into master from resolve-gh-2762
