001/*
002 * Licensed to the Apache Software Foundation (ASF) under one or more
003 * contributor license agreements.  See the NOTICE file distributed with
004 * this work for additional information regarding copyright ownership.
005 * The ASF licenses this file to You under the Apache License, Version 2.0
006 * (the "License"); you may not use this file except in compliance with
007 * the License.  You may obtain a copy of the License at
008 *
009 *      http://www.apache.org/licenses/LICENSE-2.0
010 *
011 * Unless required by applicable law or agreed to in writing, software
012 * distributed under the License is distributed on an "AS IS" BASIS,
013 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
014 * See the License for the specific language governing permissions and
015 * limitations under the License.
016 */
017
018package org.apache.commons.numbers.arrays;
019
020/**
021 * Select indices in array data.
022 *
023 * <p>Arranges elements such that indices {@code k} correspond to their correctly
024 * sorted value in the equivalent fully sorted array. For all indices {@code k}
025 * and any index {@code i}:
026 *
027 * <pre>{@code
028 * data[i < k] <= data[k] <= data[k < i]
029 * }</pre>
030 *
031 * <p>Examples:
032 *
033 * <pre>
034 * data    [0, 1, 2, 1, 2, 5, 2, 3, 3, 6, 7, 7, 7, 7]
035 *
036 *
037 * k=4   : [0, 2, 1, 1], [2], [6, 3, 2, 3, 5, 7, 7, 7, 7]
038 * k=4,8 : [0, 1, 2, 1], [2], [3, 3, 2], [5], [7, 7, 6, 7, 7]
039 * </pre>
040 *
041 * <p>Notes:
042 *
043 * <ul>
044 * <li>The algorithm does not require that all data with a value equal to the target index
045 *     are ordered consecutively in a range containing the target index.
046 * <li>The algorithm may reorder any part of the array above and below the target indices.
047 * <li>Correct usage for multiple target indices should not call multiple times with each
048 *     index but instead call selection only once with all indices. Use of consecutive calls
049 *     can reorder previously selected indices.
050 * </ul>
051 *
052 * <p>Selection on multiple indices will handle duplicate and
053 * unordered indices. The method detects ordered indices (with or without duplicates) and
054 * uses this during processing. Passing ordered indices is recommended if the order is already
055 * known; for example using uniform spacing through the array data, or to select the top and
056 * bottom {@code n} values from the data.
057 *
058 * <p>Implementation details
059 *
060 * <p>A quickselect adaptive method is used for single indices. This uses analysis of the
061 * partition sizes after each division to update the algorithm mode. If the partition
062 * containing the target does not sufficiently reduce in size then the algorithm is
063 * progressively changed to use partitions with guaranteed margins. This ensures a set fraction
064 * of data is eliminated each step and worse-case linear run time performance. This method can
065 * handle a range of indices {@code [ka, kb]} with a small separation by targeting the start of
066 * the range {@code ka} and then selecting the remaining elements {@code (ka, kb]} that are at
067 * the edge of the partition bounded by {@code ka}.
068 *
069 * <p>Multiple keys are partitioned collectively using an introsort method which only recurses
070 * into partitions containing indices. Excess recursion will trigger use of a heapselect
071 * on the remaining range of indices ensuring non-quadratic worse case performance. Any
072 * partition containing a single index, adjacent pair of indices, or range of indices with a
073 * small separation will use the quickselect adaptive method for single keys. Note that the
074 * maximum number of times that {@code n} indices can be split is {@code n - 1} before all
075 * indices are handled as singles.
076 *
077 * <p>Floating-point order
078 *
079 * <p>The {@code <} relation does not impose a total order on all floating-point values.
080 * This class respects the ordering imposed by {@link Double#compare(double, double)}.
081 * {@code -0.0} is treated as less than value {@code 0.0}; {@code Double.NaN} is
082 * considered greater than any other value; and all {@code Double.NaN} values are
083 * considered equal.
084 *
085 * <p>References
086 *
087 * <p>Quickselect is introduced in Hoare [1]. This selects an element {@code k} from {@code n}
088 * using repeat division of the data around a partition element, recursing into the
089 * partition that contains {@code k}.
090 *
091 * <p>Introsort/select is introduced in Musser [2]. This detects excess recursion in
092 * quicksort/select and reverts to a heapsort or linear select to achieve an improved worst
093 * case bound.
094 *
095 * <p>Use of sampling to identify a pivot that places {@code k} in the smaller partition is
096 * performed in the SELECT algorithm of Floyd and Rivest [3, 4].
097 *
098 * <p>A worst-case linear time algorithm PICK is described in Blum <i>et al</i> [5]. This uses
099 * the median of medians as a partition element for selection which ensures a minimum fraction of
100 * the elements are eliminated per iteration. This was extended to use an asymmetric pivot choice
101 * with efficient placement of the medians sample location in the QuickselectAdpative algorithm of
102 * Alexandrescu [6].
103 *
104 * <ol>
105 * <li>Hoare (1961)
106 * Algorithm 65: Find
107 * <a href="https://doi.org/10.1145%2F366622.366647">Comm. ACM. 4 (7): 321–322</a></li>
108 * <li>Musser (1999)
109 * Introspective Sorting and Selection Algorithms
110 * <a href="https://doi.org/10.1002/(SICI)1097-024X(199708)27:8%3C983::AID-SPE117%3E3.0.CO;2-%23">
111 * Software: Practice and Experience 27, 983-993.</a></li>
112 * <li>Floyd and Rivest (1975)
113 * Algorithm 489: The Algorithm SELECT—for Finding the ith Smallest of n elements.
114 * Comm. ACM. 18 (3): 173.</li>
115 * <li>Kiwiel (2005)
116 * On Floyd and Rivest's SELECT algorithm.
117 * Theoretical Computer Science 347, 214-238.</li>
118 * <li>Blum, Floyd, Pratt, Rivest, and Tarjan (1973)
119 * Time bounds for selection.
120 * <a href="https://doi.org/10.1016%2FS0022-0000%2873%2980033-9">
121 * Journal of Computer and System Sciences. 7 (4): 448–461</a>.</li>
122 * <li>Alexandrescu (2016)
123 * Fast Deterministic Selection
124 * <a href="https://arxiv.org/abs/1606.00484">arXiv:1606.00484</a>.</li>
125 * <li><a href="https://en.wikipedia.org/wiki/Quickselect">Quickselect (Wikipedia)</a></li>
126 * <li><a href="https://en.wikipedia.org/wiki/Introsort">Introsort (Wikipedia)</a></li>
127 * <li><a href="https://en.wikipedia.org/wiki/Introselect">Introselect (Wikipedia)</a></li>
128 * <li><a href="https://en.wikipedia.org/wiki/Floyd%E2%80%93Rivest_algorithm">Floyd-Rivest algorithm (Wikipedia)</a></li>
129 * <li><a href="https://en.wikipedia.org/wiki/Median_of_medians">Median of medians (Wikipedia)</a></li>
130 * </ol>
131 *
132 * @since 1.2
133 */
134public final class Selection {
135
136    /** No instances. */
137    private Selection() {}
138
139    /**
140     * Partition the array such that index {@code k} corresponds to its correctly
141     * sorted value in the equivalent fully sorted array.
142     *
143     * @param a Values.
144     * @param k Index.
145     * @throws IndexOutOfBoundsException if index {@code k} is not within the
146     * sub-range {@code [0, a.length)}
147     */
148    public static void select(double[] a, int k) {
149        IndexSupport.checkIndex(0, a.length, k);
150        doSelect(a, 0, a.length, k);
151    }
152
153    /**
154     * Partition the array such that indices {@code k} correspond to their correctly
155     * sorted value in the equivalent fully sorted array.
156     *
157     * @param a Values.
158     * @param k Indices (may be destructively modified).
159     * @throws IndexOutOfBoundsException if any index {@code k} is not within the
160     * sub-range {@code [0, a.length)}
161     */
162    public static void select(double[] a, int[] k) {
163        IndexSupport.checkIndices(0, a.length, k);
164        doSelect(a, 0, a.length, k);
165    }
166
167    /**
168     * Partition the array such that index {@code k} corresponds to its correctly
169     * sorted value in the equivalent fully sorted array.
170     *
171     * @param a Values.
172     * @param fromIndex Index of the first element (inclusive).
173     * @param toIndex Index of the last element (exclusive).
174     * @param k Index.
175     * @throws IndexOutOfBoundsException if the sub-range {@code [fromIndex, toIndex)} is out of
176     * bounds of range {@code [0, a.length)}; or if index {@code k} is not within the
177     * sub-range {@code [fromIndex, toIndex)}
178     */
179    public static void select(double[] a, int fromIndex, int toIndex, int k) {
180        IndexSupport.checkFromToIndex(fromIndex, toIndex, a.length);
181        IndexSupport.checkIndex(fromIndex, toIndex, k);
182        doSelect(a, fromIndex, toIndex, k);
183    }
184
185    /**
186     * Partition the array such that indices {@code k} correspond to their correctly
187     * sorted value in the equivalent fully sorted array.
188     *
189     * @param a Values.
190     * @param fromIndex Index of the first element (inclusive).
191     * @param toIndex Index of the last element (exclusive).
192     * @param k Indices (may be destructively modified).
193     * @throws IndexOutOfBoundsException if the sub-range {@code [fromIndex, toIndex)} is out of
194     * bounds of range {@code [0, a.length)}; or if any index {@code k} is not within the
195     * sub-range {@code [fromIndex, toIndex)}
196     */
197    public static void select(double[] a, int fromIndex, int toIndex, int[] k) {
198        IndexSupport.checkFromToIndex(fromIndex, toIndex, a.length);
199        IndexSupport.checkIndices(fromIndex, toIndex, k);
200        doSelect(a, fromIndex, toIndex, k);
201    }
202
203    /**
204     * Partition the array such that index {@code k} corresponds to its correctly
205     * sorted value in the equivalent fully sorted array.
206     *
207     * <p>This method pre/post-processes the data and indices to respect the ordering
208     * imposed by {@link Double#compare(double, double)}.
209     *
210     * @param fromIndex Index of the first element (inclusive).
211     * @param toIndex Index of the last element (exclusive).
212     * @param a Values.
213     * @param k Index.
214     */
215    private static void doSelect(double[] a, int fromIndex, int toIndex, int k) {
216        if (toIndex - fromIndex <= 1) {
217            return;
218        }
219        // Sort NaN / count signed zeros.
220        // Caution: This loop contributes significantly to the runtime.
221        int cn = 0;
222        int end = toIndex;
223        for (int i = toIndex; --i >= fromIndex;) {
224            final double v = a[i];
225            // Count negative zeros using a sign bit check
226            if (Double.doubleToRawLongBits(v) == Long.MIN_VALUE) {
227                cn++;
228                // Change to positive zero.
229                // Data must be repaired after selection.
230                a[i] = 0.0;
231            } else if (v != v) {
232                // Move NaN to end
233                a[i] = a[--end];
234                a[end] = v;
235            }
236        }
237
238        // Partition
239        if (end - fromIndex > 1 && k < end) {
240            QuickSelect.select(a, fromIndex, end - 1, k);
241        }
242
243        // Restore signed zeros
244        if (cn != 0) {
245            // Use partition index below zero to fast-forward to zero as much as possible
246            for (int j = a[k] < 0 ? k : -1;;) {
247                if (a[++j] == 0) {
248                    a[j] = -0.0;
249                    if (--cn == 0) {
250                        break;
251                    }
252                }
253            }
254        }
255    }
256
257    /**
258     * Partition the array such that indices {@code k} correspond to their correctly
259     * sorted value in the equivalent fully sorted array.
260     *
261     * <p>This method pre/post-processes the data and indices to respect the ordering
262     * imposed by {@link Double#compare(double, double)}.
263     *
264     * @param fromIndex Index of the first element (inclusive).
265     * @param toIndex Index of the last element (exclusive).
266     * @param a Values.
267     * @param k Indices (may be destructively modified).
268     */
269    private static void doSelect(double[] a, int fromIndex, int toIndex, int[] k) {
270        if (k.length == 0 || toIndex - fromIndex <= 1) {
271            return;
272        }
273        // Sort NaN / count signed zeros.
274        // Caution: This loop contributes significantly to the runtime for single indices.
275        int cn = 0;
276        int end = toIndex;
277        for (int i = toIndex; --i >= fromIndex;) {
278            final double v = a[i];
279            // Count negative zeros using a sign bit check
280            if (Double.doubleToRawLongBits(v) == Long.MIN_VALUE) {
281                cn++;
282                // Change to positive zero.
283                // Data must be repaired after selection.
284                a[i] = 0.0;
285            } else if (v != v) {
286                // Move NaN to end
287                a[i] = a[--end];
288                a[end] = v;
289            }
290        }
291
292        // Partition
293        int n = 0;
294        if (end - fromIndex > 1) {
295            n = k.length;
296            // Filter indices invalidated by NaN check
297            if (end < toIndex) {
298                for (int i = n; --i >= 0;) {
299                    final int index = k[i];
300                    if (index >= end) {
301                        // Move to end
302                        k[i] = k[--n];
303                        k[n] = index;
304                    }
305                }
306            }
307            // Return n, the count of used indices in k.
308            // Use this to post-process zeros.
309            n = QuickSelect.select(a, fromIndex, end - 1, k, n);
310        }
311
312        // Restore signed zeros
313        if (cn != 0) {
314            // Use partition indices below zero to fast-forward to zero as much as possible
315            int j = -1;
316            if (n < 0) {
317                // Binary search on -n sorted indices: hi = (-n) - 1
318                int lo = 0;
319                int hi = ~n;
320                while (lo <= hi) {
321                    final int mid = (lo + hi) >>> 1;
322                    if (a[k[mid]] < 0) {
323                        j = mid;
324                        lo = mid + 1;
325                    } else {
326                        hi = mid - 1;
327                    }
328                }
329            } else {
330                // Unsorted, process all indices
331                for (int i = n; --i >= 0;) {
332                    if (a[k[i]] < 0) {
333                        j = k[i];
334                    }
335                }
336            }
337            for (;;) {
338                if (a[++j] == 0) {
339                    a[j] = -0.0;
340                    if (--cn == 0) {
341                        break;
342                    }
343                }
344            }
345        }
346    }
347
348    /**
349     * Partition the array such that index {@code k} corresponds to its correctly
350     * sorted value in the equivalent fully sorted array.
351     *
352     * @param a Values.
353     * @param k Index.
354     * @throws IndexOutOfBoundsException if index {@code k} is not within the
355     * sub-range {@code [0, a.length)}
356     */
357    public static void select(int[] a, int k) {
358        IndexSupport.checkIndex(0, a.length, k);
359        if (a.length <= 1) {
360            return;
361        }
362        QuickSelect.select(a, 0, a.length - 1, k);
363    }
364
365    /**
366     * Partition the array such that indices {@code k} correspond to their correctly
367     * sorted value in the equivalent fully sorted array.
368     *
369     * @param a Values.
370     * @param k Indices (may be destructively modified).
371     * @throws IndexOutOfBoundsException if any index {@code k} is not within the
372     * sub-range {@code [0, a.length)}
373     */
374    public static void select(int[] a, int[] k) {
375        IndexSupport.checkIndices(0, a.length, k);
376        if (k.length == 0 || a.length <= 1) {
377            return;
378        }
379        QuickSelect.select(a, 0, a.length - 1, k, k.length);
380    }
381
382    /**
383     * Partition the array such that index {@code k} corresponds to its correctly
384     * sorted value in the equivalent fully sorted array.
385     *
386     * @param a Values.
387     * @param fromIndex Index of the first element (inclusive).
388     * @param toIndex Index of the last element (exclusive).
389     * @param k Index.
390     * @throws IndexOutOfBoundsException if the sub-range {@code [fromIndex, toIndex)} is out of
391     * bounds of range {@code [0, a.length)}; or if index {@code k} is not within the
392     * sub-range {@code [fromIndex, toIndex)}
393     */
394    public static void select(int[] a, int fromIndex, int toIndex, int k) {
395        IndexSupport.checkFromToIndex(fromIndex, toIndex, a.length);
396        IndexSupport.checkIndex(fromIndex, toIndex, k);
397        if (toIndex - fromIndex <= 1) {
398            return;
399        }
400        QuickSelect.select(a, fromIndex, toIndex - 1, k);
401    }
402
403    /**
404     * Partition the array such that indices {@code k} correspond to their correctly
405     * sorted value in the equivalent fully sorted array.
406     *
407     * @param a Values.
408     * @param fromIndex Index of the first element (inclusive).
409     * @param toIndex Index of the last element (exclusive).
410     * @param k Indices (may be destructively modified).
411     * @throws IndexOutOfBoundsException if the sub-range {@code [fromIndex, toIndex)} is out of
412     * bounds of range {@code [0, a.length)}; or if any index {@code k} is not within the
413     * sub-range {@code [fromIndex, toIndex)}
414     */
415    public static void select(int[] a, int fromIndex, int toIndex, int[] k) {
416        IndexSupport.checkFromToIndex(fromIndex, toIndex, a.length);
417        IndexSupport.checkIndices(fromIndex, toIndex, k);
418        if (k.length == 0 || toIndex - fromIndex <= 1) {
419            return;
420        }
421        QuickSelect.select(a, fromIndex, toIndex - 1, k, k.length);
422    }
423
424    /**
425     * Partition the array such that index {@code k} corresponds to its correctly
426     * sorted value in the equivalent fully sorted array.
427     *
428     * @param a Values.
429     * @param k Index.
430     * @throws IndexOutOfBoundsException if index {@code k} is not within the
431     * sub-range {@code [0, a.length)}
432     * @since 1.3
433     */
434    public static void select(long[] a, int k) {
435        IndexSupport.checkIndex(0, a.length, k);
436        if (a.length <= 1) {
437            return;
438        }
439        QuickSelect.select(a, 0, a.length - 1, k);
440    }
441
442    /**
443     * Partition the array such that indices {@code k} correspond to their correctly
444     * sorted value in the equivalent fully sorted array.
445     *
446     * @param a Values.
447     * @param k Indices (may be destructively modified).
448     * @throws IndexOutOfBoundsException if any index {@code k} is not within the
449     * sub-range {@code [0, a.length)}
450     * @since 1.3
451     */
452    public static void select(long[] a, int[] k) {
453        IndexSupport.checkIndices(0, a.length, k);
454        if (k.length == 0 || a.length <= 1) {
455            return;
456        }
457        QuickSelect.select(a, 0, a.length - 1, k, k.length);
458    }
459
460    /**
461     * Partition the array such that index {@code k} corresponds to its correctly
462     * sorted value in the equivalent fully sorted array.
463     *
464     * @param a Values.
465     * @param fromIndex Index of the first element (inclusive).
466     * @param toIndex Index of the last element (exclusive).
467     * @param k Index.
468     * @throws IndexOutOfBoundsException if the sub-range {@code [fromIndex, toIndex)} is out of
469     * bounds of range {@code [0, a.length)}; or if index {@code k} is not within the
470     * sub-range {@code [fromIndex, toIndex)}
471     * @since 1.3
472     */
473    public static void select(long[] a, int fromIndex, int toIndex, int k) {
474        IndexSupport.checkFromToIndex(fromIndex, toIndex, a.length);
475        IndexSupport.checkIndex(fromIndex, toIndex, k);
476        if (toIndex - fromIndex <= 1) {
477            return;
478        }
479        QuickSelect.select(a, fromIndex, toIndex - 1, k);
480    }
481
482    /**
483     * Partition the array such that indices {@code k} correspond to their correctly
484     * sorted value in the equivalent fully sorted array.
485     *
486     * @param a Values.
487     * @param fromIndex Index of the first element (inclusive).
488     * @param toIndex Index of the last element (exclusive).
489     * @param k Indices (may be destructively modified).
490     * @throws IndexOutOfBoundsException if the sub-range {@code [fromIndex, toIndex)} is out of
491     * bounds of range {@code [0, a.length)}; or if any index {@code k} is not within the
492     * sub-range {@code [fromIndex, toIndex)}
493     * @since 1.3
494     */
495    public static void select(long[] a, int fromIndex, int toIndex, int[] k) {
496        IndexSupport.checkFromToIndex(fromIndex, toIndex, a.length);
497        IndexSupport.checkIndices(fromIndex, toIndex, k);
498        if (k.length == 0 || toIndex - fromIndex <= 1) {
499            return;
500        }
501        QuickSelect.select(a, fromIndex, toIndex - 1, k, k.length);
502    }
503}