
spatialhadoop2.3 Source Code Reading (8): RTree Index Generation (1)

2015-12-21 19:59

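SpatialHadoop builds its indexes in the class edu.umn.cs.spatialHadoop.operations.Repartition. Its main method, its repartition method, and the first and third parts of repartitionMapReduce are the same as described in "spatialhadoop2.3 Source Code Reading (5): Grid Index Generation (1)". This post focuses on the second part of repartitionMapReduce. The code is as follows: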

/**
 * Create rectangles that together pack all points in sample such that
 * each rectangle contains roughly the same number of points. In other words
 * it tries to balance number of points in each rectangle.
 * Works similar to the logic of bulkLoad but does only one level of
 * rectangles.
 * @param sample - The sampled points; this array is reordered in place
 * @param gridInfo - Used as a hint for number of rectangles per row or column
 * @return one rectangle per grid cell (columns * rows in total)
 */
public static Rectangle[] packInRectangles(GridInfo gridInfo, final Point[] sample) {
    Rectangle[] rectangles = new Rectangle[gridInfo.columns * gridInfo.rows];
    int iRectangle = 0;
    // Sort in x direction
    final IndexedSortable sortableX = new IndexedSortable() {
        @Override
        public void swap(int i, int j) {
            Point temp = sample[i];
            sample[i] = sample[j];
            sample[j] = temp;
        }

        @Override
        public int compare(int i, int j) {
            if (sample[i].x < sample[j].x)
                return -1;
            if (sample[i].x > sample[j].x)
                return 1;
            return 0;
        }
    };

    // Sort in y direction
    final IndexedSortable sortableY = new IndexedSortable() {
        @Override
        public void swap(int i, int j) {
            Point temp = sample[i];
            sample[i] = sample[j];
            sample[j] = temp;
        }

        @Override
        public int compare(int i, int j) {
            if (sample[i].y < sample[j].y)
                return -1;
            if (sample[i].y > sample[j].y)
                return 1;
            return 0;
        }
    };

    final QuickSort quickSort = new QuickSort();

    quickSort.sort(sortableX, 0, sample.length);
    for (int i = 0; i < sample.length; i++) { // Debug output: print the x-sorted samples
        System.out.println(sample[i]);
    }
    int xindex1 = 0;
    double x1 = gridInfo.x1;
    for (int col = 0; col < gridInfo.columns; col++) {
        int xindex2 = sample.length * (col + 1) / gridInfo.columns;

        // Determine extents for all rectangles in this column
        double x2 = col == gridInfo.columns - 1 ?
            gridInfo.x2 : sample[xindex2 - 1].x;

        // Sort all points in this column according to its y-coordinate
        quickSort.sort(sortableY, xindex1, xindex2);

        // Create rectangles in this column
        double y1 = gridInfo.y1;
        for (int row = 0; row < gridInfo.rows; row++) {
            int yindex2 = xindex1 + (xindex2 - xindex1) * (row + 1) / gridInfo.rows;
            double y2 = row == gridInfo.rows - 1 ? gridInfo.y2 : sample[yindex2 - 1].y;

            rectangles[iRectangle++] = new Rectangle(x1, y1, x2, y2);
            y1 = y2;
        }

        xindex1 = xindex2;
        x1 = x2;
    }
    return rectangles;
}

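The line numbers below count from the first line of the listing above (the opening /** is line 1).

Line 12: allocates the array that is returned at the end.

Lines 15-50: define the two sort helpers, sortableX and sortableY, which compare and swap samples by x and by y.

Lines 52-57: sort all sampled points by x-coordinate in ascending order, then print them.

Line 60: the outer loop iterates over each column along the x-axis.

Line 61: splits the x-sorted points evenly into columns slices, one per loop iteration. Each iteration handles the points from xindex1 (inclusive) to xindex2 (exclusive); both are indexes into the sample array.

Line 64: determines the right edge x2 of the column: the x-coordinate of the last point in the slice, sample[xindex2-1].x, or gridInfo.x2 for the last column.

Line 68: sorts the points between xindex1 and xindex2 by y-coordinate in ascending order.

Line 72: the inner loop iterates over each row along the y-axis.

Line 73: splits the y-sorted points between xindex1 and xindex2 evenly into rows slices, one per loop iteration. Each iteration handles the slice that ends at yindex2; the slice starts where the previous one ended, so the code keeps no explicit start index.

Line 74: determines the top edge y2 of the cell: sample[yindex2-1].y, or gridInfo.y2 for the last row. At this point x1, x2, y1 and y2 are all known.

Line 76: creates the rectangle (x1,y1)-(x2,y2) for the current grid cell.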
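
The even split described for lines 61 and 73 is a single integer expression. Below is a minimal sketch of it in isolation. SliceBounds and ends are hypothetical names used only for illustration and are not part of SpatialHadoop. With from = 0 the expression matches xindex2; with from = xindex1 and to = xindex2 it matches yindex2.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of SpatialHadoop. */
final class SliceBounds {
    /**
     * Splits the index range [from, to) into `parts` nearly equal slices and
     * returns the exclusive end index of each slice.
     */
    static int[] ends(int from, int to, int parts) {
        int[] ends = new int[parts];
        for (int i = 0; i < parts; i++) {
            // Integer division rounds down, so slice sizes differ by at most one
            // and the last slice always ends exactly at `to`.
            ends[i] = from + (to - from) * (i + 1) / parts;
        }
        return ends;
    }

    public static void main(String[] args) {
        // 10 samples in 3 columns: slices [0,3), [3,6), [6,10)
        System.out.println(Arrays.toString(ends(0, 10, 3)));
    }
}
```

Because the product is computed in int before the division, very large sample arrays combined with many columns could overflow. The sketch assumes sample sizes well below that range.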

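In outline, the algorithm sorts all points by x-coordinate in ascending order and splits them into equal-sized slices (columns), then sorts each slice by y-coordinate in ascending order and splits it again into equal-sized slices (rows). Each resulting slice is one grid cell holding roughly the same number of points. Because the points are sorted, the cell boundaries can be read directly from the points: x2 and y2 come from the last point of the slice, x1 and y1 are the x2 and y2 of the previous slice, and the outermost edges come from gridInfo. Each cell is then described by its rectangle (x1,y1)-(x2,y2).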
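
The outline above can be tried without Hadoop. The following is a sketch under stated assumptions, not the SpatialHadoop code: PackSketch and pack are hypothetical names, points are plain {x, y} arrays, cells are {x1, y1, x2, y2} arrays, and java.util.Arrays.sort stands in for Hadoop's QuickSort and IndexedSortable. It also assumes at least columns * rows points, so that no slice is empty.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical stand-alone sketch of the packing idea; not SpatialHadoop code. */
final class PackSketch {
    /**
     * Packs points (each {x, y}) into columns * rows cells (each {x1, y1, x2, y2})
     * holding roughly equal numbers of points. Reorders pts in place.
     * Assumes pts.length >= columns * rows.
     */
    static double[][] pack(double[][] pts, int columns, int rows,
                           double minX, double minY, double maxX, double maxY) {
        double[][] cells = new double[columns * rows][];
        int k = 0;
        // Step 1: sort every point by x.
        Arrays.sort(pts, Comparator.comparingDouble((double[] p) -> p[0]));
        int from = 0;
        double x1 = minX;
        for (int col = 0; col < columns; col++) {
            int to = pts.length * (col + 1) / columns; // exclusive end of this column
            // Right edge: last point of the slice, or the space boundary.
            double x2 = (col == columns - 1) ? maxX : pts[to - 1][0];
            // Step 2: sort only this column's points by y.
            Arrays.sort(pts, from, to, Comparator.comparingDouble((double[] p) -> p[1]));
            double y1 = minY;
            for (int row = 0; row < rows; row++) {
                int yTo = from + (to - from) * (row + 1) / rows;
                double y2 = (row == rows - 1) ? maxY : pts[yTo - 1][1];
                cells[k++] = new double[] {x1, y1, x2, y2};
                y1 = y2; // the next cell starts where this one ends
            }
            from = to;
            x1 = x2;
        }
        return cells;
    }

    public static void main(String[] args) {
        double[][] pts = {{1, 1}, {2, 8}, {7, 3}, {9, 9}};
        for (double[] cell : pack(pts, 2, 2, 0, 0, 10, 10)) {
            System.out.println(Arrays.toString(cell));
        }
    }
}
```

Since the first cell starts at the space boundary and every later cell starts at its neighbour's edge, the cells tile the whole space with no gaps, even where no sample point falls.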
