
NumPy Indirect Sorting: argsort and lexsort

2019-03-13 13:00

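I have recently been trying to learn the tools needed for machine learning in a systematic way: the Python language and libraries such as NumPy and pandas. I am using this blog to record the questions I run into along the way and how I resolved them.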

NumPy is short for Numerical Python. It is a foundational package for scientific computing in Python.

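Sorting in NumPy falls broadly into two kinds, direct and indirect. Direct sorting is the intuitive one: the values themselves are sorted. Indirect sorting orders a dataset by one or more keys and returns the indices that put the data in order, rather than the sorted values. In NumPy, direct sorting is mainly done with the sort function, while indirect sorting usually relies on argsort and lexsort.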
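For contrast with the indirect functions covered in the rest of this post, here is a minimal sketch of direct sorting. The array `values` is illustrative data only; `np.sort` and `ndarray.sort` are the two usual entry points.

```python
import numpy as np

values = np.array([2, 3, 6, 8, 0, 7])  # illustrative data

# np.sort returns a sorted copy and leaves the input untouched.
sorted_copy = np.sort(values)
print(sorted_copy)   # [0 2 3 6 7 8]
print(values)        # [2 3 6 8 0 7]

# ndarray.sort sorts the array in place and returns None.
values.sort()
print(values)        # [0 2 3 6 7 8]
```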

This post focuses on indirect sorting.

argsort()

[code]>>> import numpy as np
>>> arr = np.array([2, 3, 6, 8, 0, 7])
>>> print('Sorted array:', arr.argsort())

Sorted array: [4 0 1 2 5 3]

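The above shows one run of argsort(). You may be surprised: why do the elements of the "sorted" array not match the elements of the input array? Where does the 4 come from? Where does the 5 come from?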

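You may also notice that every element of the result lies in the range [0, 6). That is because argsort() does not return sorted values. It returns the indices of the array's elements, ordered from the smallest value to the largest. The leading 4, for example, says that the smallest element, 0, sits at index 4 of the input.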
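To make the meaning of those indices concrete, the following sketch applies them back to the array. The `labels` array is a hypothetical companion array added only for illustration; it is not part of the original example.

```python
import numpy as np

arr = np.array([2, 3, 6, 8, 0, 7])
order = arr.argsort()            # indices that would sort arr in ascending order

# Indexing with those indices yields the sorted values.
print(arr[order])                # [0 2 3 6 7 8]

# The same indices can reorder a parallel array (hypothetical labels).
labels = np.array(['c', 'd', 'e', 'g', 'a', 'f'])
print(labels[order])             # ['a' 'c' 'd' 'e' 'f' 'g']

# Reversing the index array gives descending order.
print(arr[order[::-1]])          # [8 7 6 3 2 0]
```

Reordering a second array with the same index array is the main reason to prefer argsort over sort when several arrays describe the same records.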

lexsort()

[code]numpy.lexsort(keys, axis=-1)

Given multiple sorting keys, which can be interpreted as columns in a spreadsheet,
lexsort returns an array of integer indices that describes the sort order by multiple
columns. The last key in the sequence is used for the primary sort order, the second-
to-last key for the secondary sort order, and so on. The keys argument must be a sequence
of objects that can be converted to arrays of the same shape. If a 2D array is provided
for the keys argument, its rows are interpreted as the sorting keys and sorting is
according to the last row, second last row etc.

Parameters:
keys : (k, N) array or tuple containing k (N,)-shaped sequences

The k different “columns” to be sorted. The last column (or row if keys is a 2D
array) is the primary sort key.

axis : int, optional

Axis to be indirectly sorted. By default, sort over the last axis.

Returns:
indices : (N,) ndarray of ints

Array of indices that sort the keys along the specified axis.

The above is taken from the official reference documentation. Next, let's try an indirect sort with lexsort().

[code]>>> a = np.array([3, 2, 6, 4, 5])
>>> b = np.array([50, 30, 40, 20, 10])
>>> c = np.array([400, 300, 600, 100, 200])
>>> d = np.lexsort((a, b, c))    # all keys are passed together as one sequence, here the tuple (a, b, c)
>>> print(d)

[3 4 1 0 2]

>>> print('Sorted array:', list(zip(a[d], b[d], c[d])))    # zip "stitches" the sequences together and returns an iterable

Sorted array: [(4, 20, 100), (5, 10, 200), (2, 30, 300), (3, 50, 400), (6, 40, 600)]

In the example above we define three arrays, then assign the result of sorting a, b and c with lexsort() to d.

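print(d) shows [3 4 1 0 2]. An attentive reader may have noticed that this is exactly the sequence of indices that sorts array c. Here it matches np.argsort(c), because c contains no duplicate values. The final list call prints the "stitched" arrays in the order given by the sort result, [3 4 1 0 2], which produces the output shown above. Put simply, in this example lexsort() sorts by the last of the three arrays passed in, and arrays a and b are "forced" into that same order without being consulted. If c contained equal elements, those ties would be ordered by the second-to-last key.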
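The tie-breaking case does not occur in the example above, so here is a small sketch of it. The arrays `primary` and `secondary` are hypothetical data chosen so that the primary key contains duplicates.

```python
import numpy as np

# Hypothetical data: the primary key contains duplicates.
primary = np.array([1, 2, 1, 2, 1])
secondary = np.array([30, 10, 20, 40, 10])

# The LAST key in the tuple (primary) is the primary sort key;
# ties in it are broken by the key before it (secondary).
order = np.lexsort((secondary, primary))
print(order)             # [4 2 0 1 3]
print(primary[order])    # [1 1 1 2 2]
print(secondary[order])  # [10 20 30 10 40]

# A stable argsort on the primary key alone keeps tied elements in their
# original positions, so it no longer matches the lexsort result.
print(np.argsort(primary, kind='stable'))  # [0 2 4 1 3]
```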

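As the official documentation states, the last key in the sequence determines the primary sort order, the second-to-last key the secondary sort order, and so on.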
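One way to remember the key order is to compare lexsort with a plain-Python sort on tuples, where the keys appear in reverse. The sketch below reuses the arrays a, b and c from the example and also checks the 2D-array form described in the documentation. The names `py_order` and `stacked` are introduced here only for illustration.

```python
import numpy as np

a = np.array([3, 2, 6, 4, 5])
b = np.array([50, 30, 40, 20, 10])
c = np.array([400, 300, 600, 100, 200])

order = np.lexsort((a, b, c))

# Plain-Python counterpart: sort the positions by the tuple (c, b, a),
# that is, with the keys listed in the reverse of the lexsort order.
py_order = sorted(range(len(c)), key=lambda i: (c[i], b[i], a[i]))
print(order.tolist() == py_order)                  # True

# Passing a 2D array: each row is a key and the last row is primary.
stacked = np.array([a, b, c])
print(np.array_equal(np.lexsort(stacked), order))  # True
```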
