\begin{page}{manpageXXm01}{NAG On-line Documentation: m01} \beginscroll \begin{verbatim} M01(3NAG) Foundation Library (12/10/92) M01(3NAG) M01 -- Sorting Introduction -- M01 Chapter M01 Sorting 1. Scope of the Chapter This chapter is concerned with sorting numeric data. It handles only the simplest types of data structure and it is concerned only with internal sorting -- that is, sorting a set of data which can all be stored within the program. Users with large files of data or complex data structures to be sorted should use a comprehensive sorting program or package. 2. Background to the Problems The usefulness of sorting is obvious (perhaps a little too obvious, since sorting can be expensive and is sometimes done when not strictly necessary). Sorting may traditionally be associated with data-processing and non-numerical programming, but it has many uses within the realm of numerical analysis, for example, to arrange eigenvalues in ascending order of absolute value and in the ranking of observations for nonparametric statistics. The general problem may be defined as follows. We are given N items of data R ,R ,...,R . 1 2 N Each item R contains a key K which can be ordered relative to i i any other key according to some specified criterion (for example, ascending numeric value). The problem is to determine a permutation p(1),p(2),...,p(N) which puts the keys in order: K <=K <=...<=K p(1) p(2) p(N) Sometimes we may wish actually to rearrange the items so that their keys are in order; for other purposes we may simply require a table of indices so that the items can be referred to in sorted order; or yet again we may require a table of ranks, that is, the positions of each item in the sorted order. For example, given the single-character items, to be sorted into alphabetic order: E B A D C the indices of the items in sorted order are 3 2 5 4 1 and the ranks of the items are 5 2 1 4 3 Indices may be converted to ranks, and vice versa, by simply computing the inverse permutation. The items may consist solely of the key (each item may simply be a number). On the other hand, the items may contain additional information (for example, each item may be an eigenvalue of a matrix and its associated eigenvector, the eigenvalue being the key). In the latter case there may be many distinct items with equal keys, and it may be important to preserve the original order among them (if this is achieved, the sorting is called ' stable'). There are a number of ingenious algorithms for sorting. For a fascinating discussion of them, and of the whole subject, see Knuth [1]. 2.1. References [1] Knuth D E (1973) The Art of Computer Programming, Vol 3. Addison-Wesley. 3. Recommendations on Choice and Use of Routines Four categories of routines are provided: - routines which rearrange the data into sorted order (M01C-); - routines which determine the ranks of the data, leaving the data unchanged (M01D-); - routines which rearrange the data according to pre-determined ranks (M01E-); - service routines (M01Z-). Routines are provided for sorting double precision data only. If the task is simply to rearrange a one-dimensional array of data into sorted order, then M01CAF should be used, since this requires no extra workspace and is faster than any other method. For many applications it is in fact preferable to separate the task of determining the sorted order (ranking) from the task of rearranging data into a pre-determined order; the latter task may not need to be performed at all. Frequently it may be sufficient to refer to the data in sorted order via an index vector, without rearranging it. Frequently also one set of data (e.g. a column of a matrix) is used for determining a set of ranks, which are then applied to other data (e.g. the remaining columns of the matrix). To determine the ranks of a set of data, use an M01D- routine. Routines are provided for ranking one-dimensional arrays, and for ranking rows or columns of two-dimensional arrays. To create an index vector so that data can be referred to in sorted order, first call an M01D- routine to determine the ranks, and then call M01ZAF to convert the vector of ranks into an index vector. To rearrange data according to pre-determined ranks: use M01EAF if the data is stored in a one-dimensional array; or if the data is stored in a more complicated structure, use an index vector to generate a new copy of the data in the desired order. Examples of these operations can be found in the routine documents of the relevant routines. 4. Index Ranking: columns of a matrix, double precision numbers M01DJF rows of a matrix, double precision numbers M01DEF vector, double precision numbers M01DAF Rearranging (according to pre-determined ranks): vector, double precision numbers M01EAF Service routines: invert a permutation (ranks to indices or vice M01ZAF versa) Sorting (i.e., rearranging into sorted order): vector, double precision numbers M01CAF M01 -- Sorting Contents -- M01 Chapter M01 Sorting M01CAF Sort a vector, double precision numbers M01DAF Rank a vector, double precision numbers M01DEF Rank rows of a matrix, double precision numbers M01DJF Rank columns of a matrix, double precision numbers M01EAF Rearrange a vector according to given ranks, double precision numbers M01ZAF Invert a permutation \end{verbatim} \endscroll \end{page} \begin{page}{manpageXXm01caf}{NAG On-line Documentation: m01caf} \beginscroll \begin{verbatim} M01CAF(3NAG) Foundation Library (12/10/92) M01CAF(3NAG) M01 -- Sorting M01CAF M01CAF -- NAG Foundation Library Routine Document Note: Before using this routine, please read the Users' Note for your implementation to check implementation-dependent details. The symbol (*) after a NAG routine name denotes a routine that is not included in the Foundation Library. 1. Purpose M01CAF rearranges a vector of double precision numbers into ascending or descending order. 2. Specification SUBROUTINE M01CAF (RV, M1, M2, ORDER, IFAIL) INTEGER M1, M2, IFAIL DOUBLE PRECISION RV(M2) CHARACTER*1 ORDER 3. Description M01CAF is based on Singleton's implementation of the 'median-of- three' Quicksort algorithm [2], but with two additional modifications. First, small subfiles are sorted by an insertion sort on a separate final pass (Sedgewick [1]). Second, if a subfile is partitioned into two very unbalanced subfiles, the larger of them is flagged for special treatment: before it is partitioned, its end-points are swapped with two random points within it; this makes the worst case behaviour extremely unlikely. 4. References [1] Sedgewick R (1978) Implementing Quicksort Programs. Comm. ACM. 21 847--857. [2] Singleton R C (1969) An Efficient Algorithm for Sorting with Minimal Storage: Algorithm 347. Comm. ACM. 12 185--187. 5. Parameters 1: RV(M2) -- DOUBLE PRECISION array Input/Output On entry: elements M1 to M2 of RV must contain double precision values to be sorted. On exit: these values are rearranged into sorted order. 2: M1 -- INTEGER Input On entry: the index of the first element of RV to be sorted. Constraint: M1 > 0. 3: M2 -- INTEGER Input On entry: the index of the last element of RV to be sorted. Constraint: M2 >= M1. 4: ORDER -- CHARACTER*1 Input On entry: if ORDER is 'A', the values will be sorted into ascending (i.e., non-decreasing) order; if ORDER is 'D', into descending order. Constraint: ORDER = 'A' or 'D'. 5: IFAIL -- INTEGER Input/Output On entry: IFAIL must be set to 0, -1 or 1. For users not familiar with this parameter (described in the Essential Introduction) the recommended value is 0. On exit: IFAIL = 0 unless the routine detects an error (see Section 6). 6. Error Indicators and Warnings Errors detected by the routine: If on entry IFAIL = 0 or -1, explanatory error messages are output on the current error message unit (as defined by X04AAF). IFAIL= 1 On entry M2 < 1, or M1 < 1, or M1 > M2. IFAIL= 2 On entry ORDER is not 'A' or 'D'. 7. Accuracy Not applicable. 8. Further Comments The average time taken by the routine is approximately proportional to n*logn, where n = M2 - M1 + 1. The worst case 2 time is proportional to n but this is extremely unlikely to occur. 9. Example The example program reads a list of double precision numbers and sorts them into ascending order. The example program is not reproduced here. The source code for all example programs is distributed with the NAG Foundation Library software and should be available on-line. \end{verbatim} \endscroll \end{page} \begin{page}{manpageXXm01daf}{NAG On-line Documentation: m01daf} \beginscroll \begin{verbatim} M01DAF(3NAG) Foundation Library (12/10/92) M01DAF(3NAG) M01 -- Sorting M01DAF M01DAF -- NAG Foundation Library Routine Document Note: Before using this routine, please read the Users' Note for your implementation to check implementation-dependent details. The symbol (*) after a NAG routine name denotes a routine that is not included in the Foundation Library. 1. Purpose M01DAF ranks a vector of double precision numbers in ascending or descending order. 2. Specification SUBROUTINE M01DAF (RV, M1, M2, ORDER, IRANK, IFAIL) INTEGER M1, M2, IRANK(M2), IFAIL DOUBLE PRECISION RV(M2) CHARACTER*1 ORDER 3. Description M01DAF uses a variant of list-merging, as described by Knuth [1] pp 165-166. The routine takes advantage of natural ordering in the data, and uses a simple list insertion in a preparatory pass to generate ordered lists of length at least 10. The ranking is stable: equal elements preserve their ordering in the input data. 4. References [1] Knuth D E (1973) The Art of Computer Programming, Vol 3. Addison-Wesley. 5. Parameters 1: RV(M2) -- DOUBLE PRECISION array Input On entry: elements M1 to M2 of RV must contain double precision values to be ranked. 2: M1 -- INTEGER Input On entry: the index of the first element of RV to be ranked. Constraint: M1 > 0. 3: M2 -- INTEGER Input On entry: the index of the last element of RV to be ranked. Constraint: M2 >= M1. 4: ORDER -- CHARACTER*1 Input On entry: if ORDER is 'A', the values will be ranked in ascending (i.e., non-decreasing) order; if ORDER is 'D', into descending order. Constraint: ORDER = 'A' or 'D'. 5: IRANK(M2) -- INTEGER array Output On exit: elements M1 to M2 of IRANK contain the ranks of the corresponding elements of RV. Note that the ranks are in the range M1 to M2: thus, if RV(i) is the first element in the rank order, IRANK(i) is set to M1. 6: IFAIL -- INTEGER Input/Output On entry: IFAIL must be set to 0, -1 or 1. For users not familiar with this parameter (described in the Essential Introduction) the recommended value is 0. On exit: IFAIL = 0 unless the routine detects an error (see Section 6). 6. Error Indicators and Warnings Errors detected by the routine: If on entry IFAIL = 0 or -1, explanatory error messages are output on the current error message unit (as defined by X04AAF). IFAIL= 1 On entry M2 < 1, or M1 < 1, or M1 > M2. IFAIL= 2 On entry ORDER is not 'A' or 'D'. 7. Accuracy Not applicable. 8. Further Comments The average time taken by the routine is approximately proportional to n*logn, where n = M2 - M1 + 1. 9. Example The example program reads a list of double precision numbers and ranks them in ascending order. The example program is not reproduced here. The source code for all example programs is distributed with the NAG Foundation Library software and should be available on-line. \end{verbatim} \endscroll \end{page} \begin{page}{manpageXXm01def}{NAG On-line Documentation: m01def} \beginscroll \begin{verbatim} M01DEF(3NAG) Foundation Library (12/10/92) M01DEF(3NAG) M01 -- Sorting M01DEF M01DEF -- NAG Foundation Library Routine Document Note: Before using this routine, please read the Users' Note for your implementation to check implementation-dependent details. The symbol (*) after a NAG routine name denotes a routine that is not included in the Foundation Library. 1. Purpose M01DEF ranks the rows of a matrix of double precision numbers in ascending or descending order. 2. Specification SUBROUTINE M01DEF (RM, LDM, M1, M2, N1, N2, ORDER, IRANK, 1 IFAIL) INTEGER LDM, M1, M2, N1, N2, IRANK(M2), IFAIL DOUBLE PRECISION RM(LDM,N2) CHARACTER*1 ORDER 3. Description M01DEF ranks rows M1 to M2 of a matrix, using the data in columns N1 to N2 of those rows. The ordering is determined by first ranking the data in column N1, then ranking any tied rows according to the data in column N1 + 1, and so on up to column N2. M01DEF uses a variant of list-merging, as described by Knuth [1] pp 165-166. The routine takes advantage of natural ordering in the data, and uses a simple list insertion in a preparatory pass to generate ordered lists of length at least 10. The ranking is stable: equal rows preserve their ordering in the input data. 4. References [1] Knuth D E (1973) The Art of Computer Programming, Vol 3. Addison-Wesley. 5. Parameters 1: RM(LDM,N2) -- DOUBLE PRECISION array Input On entry: columns N1 to N2 of rows M1 to M2 of RM must contain double precision data to be ranked. 2: LDM -- INTEGER Input On entry: the first dimension of the array RM as declared in the (sub)program from which M01DEF is called. Constraint: LDM >= M2. 3: M1 -- INTEGER Input On entry: the index of the first row of RM to be ranked. Constraint: M1 > 0. 4: M2 -- INTEGER Input On entry: the index of the last row of RM to be ranked. Constraint: M2 >= M1. 5: N1 -- INTEGER Input On entry: the index of the first column of RM to be used. Constraint: N1 > 0. 6: N2 -- INTEGER Input On entry: the index of the last column of RM to be used. Constraint: N2 >= N1. 7: ORDER -- CHARACTER*1 Input On entry: if ORDER is 'A', the rows will be ranked in ascending (i.e., non-decreasing) order; if ORDER is 'D', into descending order. Constraint: ORDER = 'A' or 'D'. 8: IRANK(M2) -- INTEGER array Output On exit: elements M1 to M2 of IRANK contain the ranks of the corresponding rows of RM. Note that the ranks are in the range M1 to M2: thus, if the ith row of RM is the first in the rank order, IRANK(i) is set to M1. 9: IFAIL -- INTEGER Input/Output On entry: IFAIL must be set to 0, -1 or 1. For users not familiar with this parameter (described in the Essential Introduction) the recommended value is 0. On exit: IFAIL = 0 unless the routine detects an error (see Section 6). 6. Error Indicators and Warnings Errors detected by the routine: If on entry IFAIL = 0 or -1, explanatory error messages are output on the current error message unit (as defined by X04AAF). IFAIL= 1 On entry M2 < 1, or N2 < 1, or M1 < 1, or M1 > M2, or N1 < 1, or N1 > N2, or LDM < M2. IFAIL= 2 On entry ORDER is not 'A' or 'D'. 7. Accuracy Not applicable. 8. Further Comments The average time taken by the routine is approximately proportional to n*logn, where n = M2 - M1 + 1. 9. Example The example program reads a matrix of double precision numbers and ranks the rows in ascending order. The example program is not reproduced here. The source code for all example programs is distributed with the NAG Foundation Library software and should be available on-line. \end{verbatim} \endscroll \end{page} \begin{page}{manpageXXm01djf}{NAG On-line Documentation: m01djf} \beginscroll \begin{verbatim} M01DJF(3NAG) Foundation Library (12/10/92) M01DJF(3NAG) M01 -- Sorting M01DJF M01DJF -- NAG Foundation Library Routine Document Note: Before using this routine, please read the Users' Note for your implementation to check implementation-dependent details. The symbol (*) after a NAG routine name denotes a routine that is not included in the Foundation Library. 1. Purpose M01DJF ranks the columns of a matrix of double precision numbers in ascending or descending order. 2. Specification SUBROUTINE M01DJF (RM, LDM, M1, M2, N1, N2, ORDER, IRANK, 1 IFAIL) INTEGER LDM, M1, M2, N1, N2, IRANK(N2), IFAIL DOUBLE PRECISION RM(LDM,N2) CHARACTER*1 ORDER 3. Description M01DJF ranks columns N1 to N2 of a matrix, using the data in rows M1 to M2 of those columns. The ordering is determined by first ranking the data in row M1, then ranking any tied columns according to the data in row M1 + 1, and so on up to row M2. M01DJF uses a variant of list-merging, as described by Knuth [1] pp 165-166. The routine takes advantage of natural ordering in the data, and uses a simple list insertion in a preparatory pass to generate ordered lists of length at least 10. The ranking is stable: equal columns preserve their ordering in the input data. 4. References [1] Knuth D E (1973) The Art of Computer Programming, Vol 3. Addison-Wesley. 5. Parameters 1: RM(LDM,N2) -- DOUBLE PRECISION array Input On entry: rows M1 to M2 of columns N1 to N2 of RM must contain double precision data to be ranked. 2: LDM -- INTEGER Input On entry: the first dimension of the array RM as declared in the (sub)program from which M01DJF is called. Constraint: LDM >= M2. 3: M1 -- INTEGER Input On entry: the index of the first row of RM to be used. Constraint: M1 > 0. 4: M2 -- INTEGER Input On entry: the index of the last row of RM to be used. Constraint: M2 >= M1. 5: N1 -- INTEGER Input On entry: the index of the first column of RM to be ranked. Constraint: N1 > 0. 6: N2 -- INTEGER Input On entry: the index of the last column of RM to be ranked. Constraint: N2 >= N1. 7: ORDER -- CHARACTER*1 Input On entry: if ORDER is 'A', the columns will be ranked in ascending (i.e., non-decreasing) order; if ORDER is 'D', into descending order. Constraint: ORDER = 'A' or 'D'. 8: IRANK(N2) -- INTEGER array Output On exit: elements N1 to N2 of IRANK contain the ranks of the corresponding columns of RM. Note that the ranks are in the range N1 to N2: thus, if the ith column of RM is the first in the rank order, IRANK(i) is set to N1. 9: IFAIL -- INTEGER Input/Output On entry: IFAIL must be set to 0, -1 or 1. For users not familiar with this parameter (described in the Essential Introduction) the recommended value is 0. On exit: IFAIL = 0 unless the routine detects an error (see Section 6). 6. Error Indicators and Warnings Errors detected by the routine: If on entry IFAIL = 0 or -1, explanatory error messages are output on the current error message unit (as defined by X04AAF). IFAIL= 1 On entry M2 < 1, or N2 < 1, or M1 < 1, or M1 > M2, or N1 < 1, or N1 > N2, or LDM < M2. IFAIL= 2 On entry ORDER is not 'A' or 'D'. 7. Accuracy Not applicable. 8. Further Comments The average time taken by the routine is approximately proportional to n*logn, where n = N2 - N1 + 1. 9. Example The example program reads a matrix of double precision numbers and ranks the columns in ascending order. The example program is not reproduced here. The source code for all example programs is distributed with the NAG Foundation Library software and should be available on-line. \end{verbatim} \endscroll \end{page} \begin{page}{manpageXXm01eaf}{NAG On-line Documentation: m01eaf} \beginscroll \begin{verbatim} M01EAF(3NAG) Foundation Library (12/10/92) M01EAF(3NAG) M01 -- Sorting M01EAF M01EAF -- NAG Foundation Library Routine Document Note: Before using this routine, please read the Users' Note for your implementation to check implementation-dependent details. The symbol (*) after a NAG routine name denotes a routine that is not included in the Foundation Library. 1. Purpose M01EAF rearranges a vector of double precision numbers into the order specified by a vector of ranks. 2. Specification SUBROUTINE M01EAF (RV, M1, M2, IRANK, IFAIL) INTEGER M1, M2, IRANK(M2), IFAIL DOUBLE PRECISION RV(M2) 3. Description M01EAF is designed to be used typically in conjunction with the M01D- ranking routines. After one of the M01D- routines has been called to determine a vector of ranks, M01EAF can be called to rearrange a vector of real numbers into the rank order. If the vector of ranks has been generated in some other way, then M01ZBF(*) should be called to check its validity before M01EAF is called. 4. References None. 5. Parameters 1: RV(M2) -- DOUBLE PRECISION array Input/Output On entry: elements M1 to M2 of RV must contain double precision values to be rearranged. On exit: these values are rearranged into rank order. For example, if IRANK(i) = M1, then the initial value of RV(i) is moved to RV(M1). 2: M1 -- INTEGER Input 3: M2 -- INTEGER Input On entry: M1 and M2 must specify the range of the ranks supplied in IRANK and the elements of RV to be rearranged. Constraint: 0 < M1 <= M2. 4: IRANK(M2) -- INTEGER array Input On entry: elements M1 to M2 of IRANK must contain a permutation of the integers M1 to M2, which are interpreted as a vector of ranks. 5: IFAIL -- INTEGER Input/Output On entry: IFAIL must be set to 0, -1 or 1. For users not familiar with this parameter (described in the Essential Introduction) the recommended value is 0. On exit: IFAIL = 0 unless the routine detects an error (see Section 6). 6. Error Indicators and Warnings Errors detected by the routine: If on entry IFAIL = 0 or -1, explanatory error messages are output on the current error message unit (as defined by X04AAF). IFAIL= 1 On entry M2 < 1, or M1 < 1, or M1 > M2. IFAIL= 2 Elements M1 to M2 of IRANK contain a value outside the range M1 to M2. IFAIL= 3 Elements M1 to M2 of IRANK contain a repeated value. If IFAIL = 2 or 3, elements M1 to M2 of IRANK do not contain a permutation of the integers M1 to M2. On exit, the contents of RV may be corrupted. To check the validity of IRANK without the risk of corrupting RV, use M01ZBF(*). 7. Accuracy Not applicable. 8. Further Comments The average time taken by the routine is approximately proportional to n, where n = M2 - M1 + 1. 9. Example The example program reads a matrix of double precision numbers and rearranges its rows so that the elements of the kth column are in ascending order. To do this, the program first calls M01DAF to rank the elements of the kth column, and then calls M01EAF to rearrange each column into the order specified by the ranks. The value of k is read from the data-file. The example program is not reproduced here. The source code for all example programs is distributed with the NAG Foundation Library software and should be available on-line. \end{verbatim} \endscroll \end{page} \begin{page}{manpageXXm01zaf}{NAG On-line Documentation: m01zaf} \beginscroll \begin{verbatim} M01ZAF(3NAG) Foundation Library (12/10/92) M01ZAF(3NAG) M01 -- Sorting M01ZAF M01ZAF -- NAG Foundation Library Routine Document Note: Before using this routine, please read the Users' Note for your implementation to check implementation-dependent details. The symbol (*) after a NAG routine name denotes a routine that is not included in the Foundation Library. 1. Purpose M01ZAF inverts a permutation, and hence converts a rank vector to an index vector, or vice versa. 2. Specification SUBROUTINE M01ZAF (IPERM, M1, M2, IFAIL) INTEGER IPERM(M2), M1, M2, IFAIL 3. Description There are two common ways of describing a permutation using an integer vector IPERM. The first uses ranks: IPERM(i) holds the position to which the ith data element should be moved in order to sort the data; in other words its rank in the sorted order. The second uses indices: IPERM(i) holds the current position of the data element which would occur in ith position in sorted order. For example, given the values 3.5 5.9 2.9 0.5 to be sorted in ascending order, the ranks would be 3 4 2 1 and the indices would be 4 3 1 2 The M01D- routines generate ranks, and the M01E- routines require ranks to be supplied to specify the re-ordering. However if it is desired simply to refer to the data in sorted order without actually re-ordering them, indices are more convenient than ranks (see the example in Section 9). M01ZAF can be used to convert ranks to indices, or indices to ranks, as the two permutations are inverses of one another. 4. References None. 5. Parameters 1: IPERM(M2) -- INTEGER array Input/Output On entry: elements M1 to M2 of IPERM must contain a permutation of the integers M1 to M2. On exit: these elements contain the inverse permutation of the integers M1 to M2. 2: M1 -- INTEGER Input 3: M2 -- INTEGER Input On entry: M1 and M2 must specify the range of elements used in the array IPERM and the range of values in the permutation, as specified under IPERM. Constraint: 0 < M1 <= M2. 4: IFAIL -- INTEGER Input/Output On entry: IFAIL must be set to 0, -1 or 1. For users not familiar with this parameter (described in the Essential Introduction) the recommended value is 0. On exit: IFAIL = 0 unless the routine detects an error (see Section 6). 6. Error Indicators and Warnings Errors detected by the routine: If on entry IFAIL = 0 or -1, explanatory error messages are output on the current error message unit (as defined by X04AAF). IFAIL= 1 On entry M2 < 1, or M1 < 1, or M1 > M2. IFAIL= 2 Elements M1 to M2 of IPERM contain a value outside the range M1 to M2. IFAIL= 3 Elements M1 to M2 of IPERM contain a repeated value. If IFAIL = 2 or 3, elements M1 to M2 of IPERM do not contain a permutation of the integers M1 to M2; on exit these elements are usually corrupted. To check the validity of a permutation without the risk of corrupting it, use M01ZBF(*). 7. Accuracy Not applicable. 8. Further Comments None. 9. Example The example program reads a matrix of double precision numbers and prints its rows in ascending order as ranked by M01DEF. The program first calls M01DEF to rank the rows, and then calls M01ZAF to convert the rank vector to an index vector, which is used to refer to the rows in sorted order. The example program is not reproduced here. The source code for all example programs is distributed with the NAG Foundation Library software and should be available on-line. \end{verbatim} \endscroll \end{page}