
Algorithm: Binary Search That Supports Duplicate Elements

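2015-01-08 23:40
For the past few days I have been working on a project that frequently looks up targets in very large sorted collections, and binary search is the natural fit for that kind of problem. However, when the collection contains duplicates and the target happens to be one of them, Java's Arrays.binarySearch() returns only a single index (with no guarantee which one) rather than all of the matching elements. There was nothing for it but to reinvent the wheel.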
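To make the limitation concrete, here is a minimal sketch. The class name `BuiltInSearchDemo`, the helper `findOne`, and the sample array are assumptions made for illustration. It only shows that the JDK method hands back one index for a key that occurs three times.

```java
import java.util.Arrays;

final class BuiltInSearchDemo {

    // Returns the single index reported by the JDK's binary search
    // (negative if the key is absent).
    static int findOne(int[] sorted, int key) {
        return Arrays.binarySearch(sorted, key);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 5, 5, 6, 7, 8, 9};
        int index = findOne(a, 5);
        // Only one index comes back; it is one of 4, 5 or 6,
        // and the JDK does not promise which.
        System.out.println("index = " + index);
    }
}
```

The other two positions holding 5 have to be recovered by the caller, which is what the rest of this post is about.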

First, a quick review of the classic binary search algorithm:

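private int binarySearch1(Integer[] A, Integer x) {
    int low = 0, high = A.length - 1;
    while (low <= high) {
        // low + (high - low) / 2 avoids int overflow of low + high on huge arrays
        int mid = low + (high - low) / 2;
        if (A[mid].equals(x)) {
            return mid;
        } else if (x < A[mid]) {
            high = mid - 1; // target is in the lower half
        } else {
            low = mid + 1;  // target is in the upper half
        }
    }
    return -1; // not found
}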


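The idea is simple: probe the middle element first. If the middle element is greater than the target, discard the upper half; if it is smaller, discard the lower half; if it matches, return its index immediately. When the range becomes empty, the target is not present and the method returns -1.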
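The halving can be made visible by recording every index the loop probes. The sketch below is an instrumented copy of the same loop on a primitive `int[]`; the names `BinarySearchTrace` and `probes` are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class BinarySearchTrace {

    // Same loop as the classic search, but records every probed index.
    // If the key is present, the last recorded index is the match.
    static List<Integer> probes(int[] sorted, int key) {
        List<Integer> visited = new ArrayList<>();
        int low = 0, high = sorted.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            visited.add(mid);
            if (sorted[mid] == key) {
                break;
            } else if (key < sorted[mid]) {
                high = mid - 1; // discard the upper half
            } else {
                low = mid + 1;  // discard the lower half
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 5, 5, 6, 7, 8, 9};
        System.out.println(probes(a, 8)); // prints [5, 8, 9]
    }
}
```

Three probes are enough to locate 8 among eleven elements, because each miss removes roughly half of the remaining range.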

A slight improvement:

private List<Integer> binarySearch2(Integer[] A, Integer x) {
    List<Integer> result = new ArrayList<Integer>();
    int low = 0, high = A.length - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2; // avoids int overflow of low + high
        if (A[mid].equals(x)) {
            // Walk backward while the preceding elements equal the target.
            // Note: these indices are added in descending order.
            for (int i = mid - 1; i >= 0; i--) {
                if (A[i].equals(x)) {
                    result.add(i);
                } else break;
            }
            // Add the index of the hit (mid), not the value x.
            result.add(mid);
            // Walk forward while the following elements equal the target.
            // Everything past high is already known to be greater than x.
            for (int i = mid + 1; i <= high; i++) {
                if (A[i].equals(x)) {
                    result.add(i);
                } else break;
            }
            return result;
        } else if (x < A[mid]) {
            high = mid - 1;
        } else {
            low = mid + 1;
        }
    }
    return result; // empty list: x was not found
}


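The idea: after hitting the target, check whether the immediately preceding element is also the target; if so, keep walking backward until an element that differs from the target is reached. Then do the same with the elements that follow.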
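One caveat with walking outward from the hit: when the target occurs many times, the walk is linear in the number of duplicates. An alternative that the code above does not use is to run two binary searches, one for the first occurrence and one for the position just past the last. The sketch below assumes a primitive `int[]`; the names `RangeSearch`, `lowerBound`, `upperBound` and `equalRange` are hypothetical.

```java
final class RangeSearch {

    // First index whose element is >= key (sorted.length if there is none).
    static int lowerBound(int[] sorted, int key) {
        int low = 0, high = sorted.length;
        while (low < high) {
            int mid = low + (high - low) / 2;
            if (sorted[mid] < key) {
                low = mid + 1;
            } else {
                high = mid;
            }
        }
        return low;
    }

    // First index whose element is > key (sorted.length if there is none).
    static int upperBound(int[] sorted, int key) {
        int low = 0, high = sorted.length;
        while (low < high) {
            int mid = low + (high - low) / 2;
            if (sorted[mid] <= key) {
                low = mid + 1;
            } else {
                high = mid;
            }
        }
        return low;
    }

    // Returns {first, last} (inclusive) for key, or an empty array if key is absent.
    static int[] equalRange(int[] sorted, int key) {
        int first = lowerBound(sorted, key);
        int end = upperBound(sorted, key);
        return first == end ? new int[0] : new int[]{first, end - 1};
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 5, 5, 6, 7, 8, 9};
        int[] range = equalRange(a, 5);
        System.out.println(range[0] + ".." + range[1]); // prints 4..6
    }
}
```

Both searches take a logarithmic number of steps regardless of how many duplicates there are. The trade-off is that the caller gets a range rather than a ready-made list of indices.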

Test:

Integer[] A = new Integer[]{1, 2, 3, 4, 5, 5, 5, 6, 7, 8, 9};

System.out.println("binarySearch1 => ");
System.out.println(binarySearch1(A, 5));

System.out.println("binarySearch2 => ");
System.out.println(binarySearch2(A, 5));


binarySearch1 =>
5
binarySearch2 =>
[4, 5, 6]

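Judging by the returned indices, everything is as expected. But the story does not end here: the elements being searched are usually not plain numbers but instances of more complex objects, so to make the method general it needs a generic version: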

private <T> List<Integer> binarySearch4(List<T> A, T x, Comparator<? super T> comparator) {
    List<Integer> result = new ArrayList<Integer>();
    int low = 0, high = A.size() - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2; // avoids int overflow of low + high
        int temp = comparator.compare(x, A.get(mid));
        if (temp == 0) {
            // Walk backward over elements that compare equal to x
            // (indices are added in descending order).
            for (int i = mid - 1; i >= 0; i--) {
                if (comparator.compare(x, A.get(i)) == 0) {
                    result.add(i);
                } else break;
            }
            result.add(mid);
            // Walk forward over elements that compare equal to x.
            // Everything past high is already known to be greater than x.
            for (int i = mid + 1; i <= high; i++) {
                if (comparator.compare(x, A.get(i)) == 0) {
                    result.add(i);
                } else break;
            }
            return result;
        } else if (temp < 0) {
            high = mid - 1;
        } else {
            low = mid + 1;
        }
    }
    return result; // empty list: x was not found
}


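To compare two complex object instances, the Comparator interface is introduced, so the caller can define the comparison rule according to business needs. The list must already be sorted according to that same comparator.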

Let's test it:

First, define a business object class, AwbDto:

package com.cnblogs.yjmyzz.test.domain;

/**
 * Created by jimmy on 15/1/8.
 */
public class AwbDto {

    /**
     * Air waybill number
     */
    private String awbNumber;

    /**
     * Origin station
     */
    private String originAirport;

    public AwbDto(String awbNumber, String originAirport) {
        this.awbNumber = awbNumber;
        this.originAirport = originAirport;
    }

    public String getAwbNumber() {
        return awbNumber;
    }

    public void setAwbNumber(String awbNumber) {
        this.awbNumber = awbNumber;
    }

    public String getOriginAirport() {
        return originAirport;
    }

    public void setOriginAirport(String originAirport) {
        this.originAirport = originAirport;
    }
}


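We also need a business rule for comparing AwbDto instances. Assume that two objects are considered equal whenever their waybill numbers are the same (that is, objects are ordered by waybill number):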

private class AwbDtoComparator implements Comparator<AwbDto> {

    @Override
    public int compare(AwbDto x, AwbDto y) {
        return x.getAwbNumber().compareTo(y.getAwbNumber());
    }
}
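On Java 8 or later, the same ordering can also be expressed with `Comparator.comparing`. The sketch below uses a stripped-down stand-in class (`Awb`, hypothetical, as are `AwbOrderingDemo` and `BY_AWB_NUMBER`) so that it compiles on its own. It illustrates the comparison rule and is not meant as a replacement for the classes above.

```java
import java.util.Comparator;

final class AwbOrderingDemo {

    // Hypothetical stand-in for AwbDto with only the fields needed here.
    static final class Awb {
        final String awbNumber;
        final String originAirport;

        Awb(String awbNumber, String originAirport) {
            this.awbNumber = awbNumber;
            this.originAirport = originAirport;
        }

        String getAwbNumber() {
            return awbNumber;
        }
    }

    // Orders by waybill number only; the origin is ignored.
    static final Comparator<Awb> BY_AWB_NUMBER = Comparator.comparing(Awb::getAwbNumber);

    public static void main(String[] args) {
        Awb a = new Awb("112-10010044", "Wuhan");
        Awb b = new Awb("112-10010044", "Beijing");
        Awb c = new Awb("112-10010055", "Guangzhou");
        System.out.println(BY_AWB_NUMBER.compare(a, b));     // prints 0
        System.out.println(BY_AWB_NUMBER.compare(a, c) < 0); // prints true
    }
}
```

Because the origin plays no part in the comparison, two waybills with the same number but different origins still count as duplicates for the search.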


Test code:

List<AwbDto> awbList = new ArrayList<AwbDto>();
awbList.add(new AwbDto("112-10010011", "北京"));
awbList.add(new AwbDto("112-10010022", "上海"));
awbList.add(new AwbDto("112-10010033", "天津"));
awbList.add(new AwbDto("112-10010044", "武汉"));
awbList.add(new AwbDto("112-10010044", "武汉"));
awbList.add(new AwbDto("112-10010055", "广州"));

AwbDtoComparator comparator = new AwbDtoComparator();

AwbDto x = new AwbDto("112-10010044", "武汉");

System.out.println("binarySearch4 => ");
System.out.println(binarySearch4(awbList, x, comparator));


binarySearch4 =>
[3, 4]

The test passes, everything went smoothly, and it is time to get some rest.