JDK 1.8 Collections Framework Source Code Analysis (1): ArrayList

2017-08-24 00:00
Summary: Learn by sharing, and enjoy helping others.

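Java's collection classes come up constantly at work. I have had some free time lately, so I am writing up what I know about them to help my own understanding.
Open the source of java.util.ArrayList (the OpenJDK source can be browsed online). When reading a class's source, the first thing to gossip about is who wrote it: @author Josh Bloch and @author Neal Gafter. (If you are curious, look up their stories. Josh Bloch is the author of the famous Effective Java, which is well worth reading.)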

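OK, back to the point. When looking at a class, first figure out what it does and what properties it has; both show up in the interfaces it implements. (Obvious, perhaps, but plenty of beginners do not know where to start reading. I used to be one of them.) So let's start from the top-level interface. How do we find it? Look at the class declaration: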

public class ArrayList<E> extends AbstractList<E>
implements List<E>, RandomAccess, Cloneable, java.io.Serializable

From this declaration, click through List<E> and keep going up until you reach the top of the interface hierarchy, where you find this interface:

package java.lang;
import java.util.Iterator;

public interface Iterable<T> {
Iterator<T> iterator();
}

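Comments are removed to save space. They state that a class implementing this interface can be the target of the "foreach" statement. In JDK 8, Iterable also gained the default methods forEach and spliterator (omitted above), so most loops no longer need to touch the Iterator explicitly.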
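To make the "foreach" point concrete, here is a minimal sketch of a class that implements Iterable and is then used in a for-each loop. IntRange and its sum helper are hypothetical names made up for this illustration; they are not part of the JDK.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical class for illustration: the ints in [start, endExclusive) exposed as an Iterable.
class IntRange implements Iterable<Integer> {
    private final int start;
    private final int endExclusive;

    IntRange(int start, int endExclusive) {
        this.start = start;
        this.endExclusive = endExclusive;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int next = start;

            @Override
            public boolean hasNext() {
                return next < endExclusive;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return next++;
            }
        };
    }

    // Sums the range with a for-each loop; the compiler calls iterator() for us.
    static int sum(IntRange range) {
        int total = 0;
        for (int value : range) {
            total += value;
        }
        return total;
    }
}
```

Because ArrayList reaches Iterable through List and Collection, the same loop syntax works on it without any extra code.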

Next, let's work down from the top-level interface. The next one is java.util.Collection:

package java.util;
public interface Collection<E> extends Iterable<E> {
int size();
boolean isEmpty();
boolean contains(Object o);
Iterator<E> iterator();
Object[] toArray();
<T> T[] toArray(T[] a);
boolean add(E e);
boolean remove(Object o);
boolean containsAll(Collection<?> c);
boolean addAll(Collection<? extends E> c);
boolean removeAll(Collection<?> c);
boolean retainAll(Collection<?> c);
void clear();
boolean equals(Object o);
int hashCode();
}

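Anyone who has read Algorithms, 4th Edition will find this long list of methods familiar. The comments in the source are very detailed. Roughly (my English is not great): this is the root interface of the collection hierarchy, a collection represents a group of objects, and every general-purpose implementation should provide two "standard" constructors: a no-argument constructor and a constructor that takes a single Collection argument. In short, this interface is a highly abstract definition of a collection :)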

Next comes the java.util.List interface.

public interface List<E> extends Collection<E> {
// Methods duplicated from Collection are omitted here, but keep in mind that they exist
boolean addAll(int index, Collection<? extends E> c);
E get(int index);
E set(int index, E element);
void add(int index, E element);
E remove(int index);
int indexOf(Object o);
int lastIndexOf(Object o);
ListIterator<E> listIterator();
ListIterator<E> listIterator(int index);
List<E> subList(int fromIndex, int toIndex);
}

A List lets you access and search for elements by index. Elements may be duplicated, because each one occupies a different index (position in the list).

Another interface that ArrayList implements is java.util.RandomAccess:

package java.util;
public interface RandomAccess {
}

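An interface with no methods is clearly a marker interface. Collection classes that implement it promise good performance for random access to elements. For example, ArrayList implements RandomAccess while LinkedList does not. So when handed a list (assume it is either an ArrayList or a LinkedList), we can traverse it by index if it is an ArrayList and with an iterator if it is a LinkedList. The binary search method in the Collections utility class makes this kind of choice, as this snippet shows: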

public static <T>
int binarySearch(List<? extends Comparable<? super T>> list, T key) {
if (list instanceof RandomAccess || list.size()<BINARYSEARCH_THRESHOLD)
return Collections.indexedBinarySearch(list, key);
else
return Collections.iteratorBinarySearch(list, key);
}
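The same instanceof RandomAccess test can be used in our own code. The following is a small sketch, not JDK code: Traversal, strategyFor, and sum are hypothetical names chosen for this example.

```java
import java.util.List;
import java.util.RandomAccess;

// Hypothetical helper for illustration; not part of the JDK.
class Traversal {
    // Reports which traversal strategy would be chosen, so the decision is observable.
    static String strategyFor(List<?> list) {
        return (list instanceof RandomAccess) ? "indexed" : "iterator";
    }

    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            // get(i) is O(1) for array-backed lists, so an indexed loop is cheap.
            for (int i = 0, n = list.size(); i < n; i++) {
                total += list.get(i);
            }
        } else {
            // A linked list walks its nodes on every get(i), so use the iterator instead.
            for (int value : list) {
                total += value;
            }
        }
        return total;
    }
}
```

Both branches return the same result; only the cost differs, which is exactly what the marker interface is meant to signal.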

There are two more marker interfaces, java.lang.Cloneable and java.io.Serializable, which indicate that ArrayList can be cloned and serialized.

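Also take a look at AbstractCollection and AbstractList. These two abstract classes implement some of the shared functionality, and their code is fairly easy to follow.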

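One thing to note is the modCount field in AbstractList. It is used mainly by the collection's iterators. For a List, calling iterator(), listIterator(), and so on creates an iterator, and that iterator saves the list's modCount when it is created (the iterator is implemented as an inner class of the List). During iteration it checks whether the list's current modCount differs from the saved value; if it does, it immediately throws java.util.ConcurrentModificationException. This behavior is called fail-fast.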
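A short sketch of fail-fast behavior in practice, assuming an ArrayList with at least three elements. FailFastDemo and its two methods are hypothetical names for this example. The first method modifies the list directly while a for-each loop is running; the second removes through the iterator, which keeps the iterator's saved count in sync.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; not part of the JDK.
class FailFastDemo {
    // Removes "a" directly from the list inside a for-each loop.
    // Returns true if the iterator detected the change and threw.
    static boolean removeDuringForEach(List<String> list) {
        try {
            for (String s : list) {
                if (s.equals("a")) {
                    list.remove(s); // changes modCount behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes every occurrence of target through the iterator itself.
    static void removeSafely(List<String> list, String target) {
        for (Iterator<String> it = list.iterator(); it.hasNext(); ) {
            if (it.next().equals(target)) {
                it.remove(); // the iterator updates its saved modCount
            }
        }
    }
}
```

Note that the check is best-effort: the exception is raised on the next call to next(), so a removal that happens to end the loop early can go unnoticed.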

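Good, now we can finally look at ArrayList. Before we do, think for a moment: if we had to implement ArrayList ourselves, how would we do it?

Thinking...

First, we usually think of ArrayList as a dynamic array (as opposed to LinkedList's linked-list implementation): most of the time we use it as an array whose length can change. So the problem becomes how to implement a variable-length array.

So, first, we would use an array as the underlying storage and give it a default length on creation. Adding or removing elements may change the array's length (when the default length is no longer enough, or in other situations), which means copying data; System.arraycopy can do the copy. Copying the array on every add or remove because its length changes would be silly, so when the array runs out of room we can grow it by a generous amount, say to twice its previous size. This improves performance a little (adding or removing anywhere other than the end still requires copying), at the cost of some wasted space. Now look back at the interfaces above. To implement size, we can no longer just return the array's length (for the reason above), nor can we count elements by scanning the array (because null elements may be present). Instead we keep the length in a private field and update it in add, remove, and similar methods. (Thread safety is not considered here, because ArrayList itself is not thread-safe.)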
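The design just described can be sketched in a few dozen lines. This is a deliberately simplified illustration, not the real ArrayList: the class name SimpleDynamicArray, the starting capacity of 4, and the capacity() accessor are all assumptions made for the example.

```java
import java.util.Arrays;

// A simplified sketch of the idea above; NOT the real ArrayList.
class SimpleDynamicArray<E> {
    private Object[] elements = new Object[4]; // assumed default capacity for this sketch
    private int size;                          // number of elements actually stored

    int size() {
        return size;
    }

    // Exposed only so the growth is observable in this example.
    int capacity() {
        return elements.length;
    }

    void add(E e) {
        if (size == elements.length) {
            // Grow to twice the old capacity, as proposed in the text.
            elements = Arrays.copyOf(elements, elements.length * 2);
        }
        elements[size++] = e;
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        return (E) elements[index];
    }

    E remove(int index) {
        E old = get(index);
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            // Shift the tail one slot to the left over the removed element.
            System.arraycopy(elements, index + 1, elements, index, numMoved);
        }
        elements[--size] = null; // drop the stale reference
        return old;
    }
}
```

Keeping size separate from elements.length is what lets null be a legal element: the count never depends on the contents of the array.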

With these questions in mind, let's look at the ArrayList code.
First, the elements are actually stored in an array.

/**
* The array buffer into which the elements of the ArrayList are stored.
* The capacity of the ArrayList is the length of this array buffer. Any
* empty ArrayList with elementData == EMPTY_ELEMENTDATA will be expanded to
* DEFAULT_CAPACITY when the first element is added.
*/
private transient Object[] elementData;

So how is this array initialized, and how is its length decided? Look at the constructors.

/**
* Constructs an empty list with the specified initial capacity.
*
* @param  initialCapacity  the initial capacity of the list
* @throws IllegalArgumentException if the specified initial capacity
*         is negative
*/
public ArrayList(int initialCapacity) {
super();
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal Capacity: "+
initialCapacity);
this.elementData = new Object[initialCapacity];
}

/**
* Constructs an empty list with an initial capacity of ten.
*/
public ArrayList() {
super();
this.elementData = EMPTY_ELEMENTDATA;
}

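As you can see, we can construct an ArrayList with an argument that sets the capacity of the internal array, or use the no-argument constructor, which gives a default capacity of 10. Note that the no-argument constructor does not allocate those 10 slots right away: it starts with EMPTY_ELEMENTDATA and expands to DEFAULT_CAPACITY when the first element is added, as the comment on elementData says. In practice, choose the constructor that fits the situation. For example, if some piece of logic needs to store exactly 200 elements (a fixed number) in an ArrayList, use the constructor that takes an int, which avoids copying the internal array when it would otherwise run out of room and grow.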

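Of course, following the (non-mandatory) convention of the java.util.Collection interface, ArrayList also provides a constructor that takes a collection as its argument.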

/**
* Constructs a list containing the elements of the specified
* collection, in the order they are returned by the collection's
* iterator.
*
* @param c the collection whose elements are to be placed into this list
* @throws NullPointerException if the specified collection is null
*/
public ArrayList(Collection<? extends E> c) {
elementData = c.toArray();
size = elementData.length;
// c.toArray might (incorrectly) not return Object[] (see 6260652)
if (elementData.getClass() != Object[].class)
elementData = Arrays.copyOf(elementData, size, Object[].class);
}
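A small usage sketch of this constructor. CopyDemo and copyOf are hypothetical names for this example. The point it illustrates is that the new list gets its own backing array (c.toArray() is copied into elementData) while the element objects themselves are shared.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical demo class; not part of the JDK.
class CopyDemo {
    // Builds a new ArrayList from any collection, in the collection's iteration order.
    static <T> List<T> copyOf(Collection<? extends T> source) {
        return new ArrayList<>(source);
    }
}
```

Adding to or removing from the copy leaves the source untouched, but mutating an element through one list is visible through the other, since both hold the same references.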

Earlier we considered using a private field to store the size (the actual number of elements in the collection). ArrayList has exactly such a field.

/**
* The size of the ArrayList (the number of elements it contains).
*
* @serial
*/
private int size;

With this field, some methods become very simple to implement. They are textbook functions.

/**
* Returns the number of elements in this list.
*
* @return the number of elements in this list
*/
public int size() {
return size;
}

/**
* Returns <tt>true</tt> if this list contains no elements.
*
* @return <tt>true</tt> if this list contains no elements
*/
public boolean isEmpty() {
return size == 0;
}

/**
* Returns <tt>true</tt> if this list contains the specified element.
* More formally, returns <tt>true</tt> if and only if this list contains
* at least one element <tt>e</tt> such that
* <tt>(o==null ? e==null : o.equals(e))</tt>.
*
* @param o element whose presence in this list is to be tested
* @return <tt>true</tt> if this list contains the specified element
*/
public boolean contains(Object o) {
return indexOf(o) >= 0;
}

/**
* Returns the index of the first occurrence of the specified element
* in this list, or -1 if this list does not contain the element.
* More formally, returns the lowest index <tt>i</tt> such that
* <tt>(o==null ? get(i)==null : o.equals(get(i)))</tt>,
* or -1 if there is no such index.
*/
public int indexOf(Object o) {
if (o == null) {
for (int i = 0; i < size; i++)
if (elementData[i]==null)
return i;
} else {
for (int i = 0; i < size; i++)
if (o.equals(elementData[i]))
return i;
}
return -1;
}

/**
* Returns the index of the last occurrence of the specified element
* in this list, or -1 if this list does not contain the element.
* More formally, returns the highest index <tt>i</tt> such that
* <tt>(o==null ? get(i)==null : o.equals(get(i)))</tt>,
* or -1 if there is no such index.
*/
public int lastIndexOf(Object o) {
if (o == null) {
for (int i = size-1; i >= 0; i--)
if (elementData[i]==null)
return i;
} else {
for (int i = size-1; i >= 0; i--)
if (o.equals(elementData[i]))
return i;
}
return -1;
}

Beyond these, there are the methods that maintain the array, which are also fairly easy to understand.

/**
* Returns the element at the specified position in this list.
*
* @param  index index of the element to return
* @return the element at the specified position in this list
* @throws IndexOutOfBoundsException {@inheritDoc}
*/
public E get(int index) {
rangeCheck(index);

return elementData(index);
}

/**
* Replaces the element at the specified position in this list with
* the specified element.
*
* @param index index of the element to replace
* @param element element to be stored at the specified position
* @return the element previously at the specified position
* @throws IndexOutOfBoundsException {@inheritDoc}
*/
public E set(int index, E element) {
rangeCheck(index);

E oldValue = elementData(index);
elementData[index] = element;
return oldValue;
}

/**
* Appends the specified element to the end of this list.
*
* @param e element to be appended to this list
* @return <tt>true</tt> (as specified by {@link Collection#add})
*/
public boolean add(E e) {
ensureCapacityInternal(size + 1);  // Increments modCount!!
elementData[size++] = e;
return true;
}

/**
* Inserts the specified element at the specified position in this
* list. Shifts the element currently at that position (if any) and
* any subsequent elements to the right (adds one to their indices).
*
* @param index index at which the specified element is to be inserted
* @param element element to be inserted
* @throws IndexOutOfBoundsException {@inheritDoc}
*/
public void add(int index, E element) {
rangeCheckForAdd(index);

ensureCapacityInternal(size + 1);  // Increments modCount!!
System.arraycopy(elementData, index, elementData, index + 1,
size - index);
elementData[index] = element;
size++;
}

/**
* Removes the element at the specified position in this list.
* Shifts any subsequent elements to the left (subtracts one from their
* indices).
*
* @param index the index of the element to be removed
* @return the element that was removed from the list
* @throws IndexOutOfBoundsException {@inheritDoc}
*/
public E remove(int index) {
rangeCheck(index);

modCount++;
E oldValue = elementData(index);

int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null; // clear to let GC do its work

return oldValue;
}

/**
* Removes the first occurrence of the specified element from this list,
* if it is present.  If the list does not contain the element, it is
* unchanged.  More formally, removes the element with the lowest index
* <tt>i</tt> such that
* <tt>(o==null ? get(i)==null : o.equals(get(i)))</tt>
* (if such an element exists).  Returns <tt>true</tt> if this list
* contained the specified element (or equivalently, if this list
* changed as a result of the call).
*
* @param o element to be removed from this list, if present
* @return <tt>true</tt> if this list contained the specified element
*/
public boolean remove(Object o) {
if (o == null) {
for (int index = 0; index < size; index++)
if (elementData[index] == null) {
fastRemove(index);
return true;
}
} else {
for (int index = 0; index < size; index++)
if (o.equals(elementData[index])) {
fastRemove(index);
return true;
}
}
return false;
}

/*
* Private remove method that skips bounds checking and does not
* return the value removed.
*/
private void fastRemove(int index) {
modCount++;
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null; // clear to let GC do its work
}

/**
* Removes all of the elements from this list.  The list will
* be empty after this call returns.
*/
public void clear() {
modCount++;

// clear to let GC do its work
for (int i = 0; i < size; i++)
elementData[i] = null;

size = 0;
}
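The two remove overloads shown above are easy to confuse on a List<Integer>. The sketch below shows which one the compiler picks; RemoveOverloadDemo and its methods are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of the JDK.
class RemoveOverloadDemo {
    static List<Integer> sample() {
        return new ArrayList<>(Arrays.asList(10, 20, 30));
    }

    // An int argument selects remove(int index): the element at that position goes.
    static List<Integer> removeAtIndex(int index) {
        List<Integer> list = sample();
        list.remove(index);
        return list;
    }

    // An Integer argument selects remove(Object o): the first equal element goes.
    static List<Integer> removeValue(int value) {
        List<Integer> list = sample();
        list.remove(Integer.valueOf(value));
        return list;
    }
}
```

With the sample list [10, 20, 30], removeAtIndex(1) leaves [10, 30], while removeValue(10) leaves [20, 30].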

There are a few things to note. The first is the rangeCheck method, which, as the name suggests, checks bounds. Here is its implementation.

/**
* Checks if the given index is in range.  If not, throws an appropriate
* runtime exception.  This method does *not* check if the index is
* negative: It is always used immediately prior to an array access,
* which throws an ArrayIndexOutOfBoundsException if index is negative.
*/
private void rangeCheck(int index) {
if (index >= size)
throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

/**
* A version of rangeCheck used by add and addAll.
*/
private void rangeCheckForAdd(int index) {
if (index > size || index < 0)
throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

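Look at what it throws: throw new IndexOutOfBoundsException(outOfBoundsMsg(index)); Plenty of us have made this mistake and seen this exception; it is a common one. Now we can see why we run into it: the index passed in was not less than size.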
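A quick sketch of how these checks surface to callers. BoundsDemo and its methods are hypothetical names for this example. For a negative index, the rangeCheck comment above says the array access itself throws ArrayIndexOutOfBoundsException; that class is a subclass of IndexOutOfBoundsException, so one catch clause covers both cases.

```java
import java.util.List;

// Hypothetical demo class; not part of the JDK.
class BoundsDemo {
    // Returns true if list.get(index) is rejected with an out-of-bounds exception.
    static boolean getIsRejected(List<?> list, int index) {
        try {
            list.get(index);
            return false;
        } catch (IndexOutOfBoundsException e) { // also catches ArrayIndexOutOfBoundsException
            return true;
        }
    }

    // Returns true if list.add(index, element) is rejected.
    // Unlike get, add accepts index == size (appending at the end).
    static <T> boolean addIsRejected(List<T> list, int index, T element) {
        try {
            list.add(index, element);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }
}
```

The asymmetry between get and add mirrors the two checks in the source: rangeCheck rejects index >= size, while rangeCheckForAdd rejects only index > size or index < 0.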

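Then there is the ensureCapacityInternal method called by the add methods. We considered the growth problem above, and this method, together with the public ensureCapacity, does exactly that job.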

/**
* Increases the capacity of this <tt>ArrayList</tt> instance, if
* necessary, to ensure that it can hold at least the number of elements
* specified by the minimum capacity argument.
*
* @param   minCapacity   the desired minimum capacity
*/
public void ensureCapacity(int minCapacity) {
int minExpand = (elementData != EMPTY_ELEMENTDATA)
// any size if real element table
? 0
// larger than default for empty table. It's already supposed to be
// at default size.
: DEFAULT_CAPACITY;

if (minCapacity > minExpand) {
ensureExplicitCapacity(minCapacity);
}
}

private void ensureCapacityInternal(int minCapacity) {
if (elementData == EMPTY_ELEMENTDATA) {
minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
}

ensureExplicitCapacity(minCapacity);
}

private void ensureExplicitCapacity(int minCapacity) {
modCount++;

// overflow-conscious code
if (minCapacity - elementData.length > 0)
grow(minCapacity);
}

/**
* The maximum size of array to allocate.
* Some VMs reserve some header words in an array.
* Attempts to allocate larger arrays may result in
* OutOfMemoryError: Requested array size exceeds VM limit
*/
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

/**
* Increases the capacity to ensure that it can hold at least the
* number of elements specified by the minimum capacity argument.
*
* @param minCapacity the desired minimum capacity
*/
private void grow(int minCapacity) {
// overflow-conscious code
int oldCapacity = elementData.length;
int newCapacity = oldCapacity + (oldCapacity >> 1);
if (newCapacity - minCapacity < 0)
newCapacity = minCapacity;
if (newCapacity - MAX_ARRAY_SIZE > 0)
newCapacity = hugeCapacity(minCapacity);
// minCapacity is usually close to size, so this is a win:
elementData = Arrays.copyOf(elementData, newCapacity);
}

private static int hugeCapacity(int minCapacity) {
if (minCapacity < 0) // overflow
throw new OutOfMemoryError();
return (minCapacity > MAX_ARRAY_SIZE) ?
Integer.MAX_VALUE :
MAX_ARRAY_SIZE;
}
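To see what the growth rule in grow() means in numbers, here is a sketch that reproduces only its core arithmetic: the new capacity is the old capacity plus half of it, or minCapacity if that is larger. GrowthDemo, nextCapacity, and sequence are hypothetical names, and the MAX_ARRAY_SIZE / hugeCapacity branch is intentionally left out.

```java
// Hypothetical demo class; mirrors only the arithmetic of grow(), not the allocation.
class GrowthDemo {
    // new = old + old/2, but never less than minCapacity.
    // The subtraction-based comparison is kept from the original for its overflow behavior.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return newCapacity;
    }

    // Capacities reached when elements are appended one at a time,
    // starting from the default capacity of 10.
    static int[] sequence(int steps) {
        int[] capacities = new int[steps];
        int capacity = 10;
        for (int i = 0; i < steps; i++) {
            capacities[i] = capacity;
            capacity = nextCapacity(capacity, capacity + 1);
        }
        return capacities;
    }
}
```

Starting from 10, the first five capacities are 10, 15, 22, 33, 49, so the array grows by a factor of about 1.5 each time rather than the factor of 2 guessed earlier.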

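As you can see, methods such as size, isEmpty, get, and set run in O(1) time, while methods such as indexOf, add(int, E), and remove are O(n) in the worst case (for example, inserting at the first position, removing the first element, or finding the index of the last element). Appending with add(E) is amortized O(1), since the array is copied only when it has to grow.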

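The internal array is usually longer than the number of elements in the collection. ArrayList also provides a method to release the extra space, which we can call at an appropriate time to reduce memory usage.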

/**
* Trims the capacity of this <tt>ArrayList</tt> instance to be the
* list's current size.  An application can use this operation to minimize
* the storage of an <tt>ArrayList</tt> instance.
*/
public void trimToSize() {
modCount++;
if (size < elementData.length) {
elementData = Arrays.copyOf(elementData, size);
}
}
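A typical moment to call trimToSize is after a large list has been cut down and will stay small. The sketch below assumes that scenario; TrimDemo and shrink are hypothetical names, and the capacity itself cannot be observed through the public API, so the example only shows the call sequence.

```java
import java.util.ArrayList;

// Hypothetical demo class; not part of the JDK.
class TrimDemo {
    // Keeps the first `keep` elements and releases the unused tail of the backing array.
    // Assumes 0 <= keep <= list.size(); otherwise subList throws.
    static ArrayList<Integer> shrink(ArrayList<Integer> list, int keep) {
        list.subList(keep, list.size()).clear(); // removes elements; capacity is unchanged
        list.trimToSize();                       // copies into an array of exactly `keep` slots
        return list;
    }
}
```

Since trimToSize copies the array, it is only worth calling when the list is not expected to grow again soon.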

Finally, don't forget that it is not thread-safe.
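When several threads must share one list, one option (among others) is to wrap the ArrayList with Collections.synchronizedList. The sketch below is an illustration under that assumption; SyncListDemo and concurrentAdds are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of the JDK.
class SyncListDemo {
    // Starts `threads` workers that each append `perThread` elements to a shared,
    // synchronized list, waits for them, and returns the final size.
    static int concurrentAdds(int threads, int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<Integer>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i); // each call is synchronized by the wrapper
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return list.size();
    }
}
```

With a plain ArrayList in place of the wrapper, the same code can lose elements or throw, because size++ and the array growth are not atomic. Iterating over the synchronized wrapper still needs an explicit synchronized block on the list.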

References:

http://brokendreams.iteye.com/blog/1913619

Recommended resources:

Recommended book: Algorithms, 4th Edition

Recommended blog: http://brokendreams.iteye.com/blog/1913619