Memory Optimization: A Source Code Walkthrough of ArrayMap

2015-09-21 21:47

I. Why use ArrayMap

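ArrayMap is a general-purpose key-value mapping data structure that uses memory more efficiently than the traditional HashMap. HashMap is convenient, but its memory footprint is large. To address this, Android provides the more memory-efficient ArrayMap.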

II. How ArrayMap works

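Internally it works with two arrays. One array holds the hash codes of the keys in sorted order; the other holds the key-value pairs in that same order, so the hash at position i belongs to the key at position 2*i and the value at position 2*i+1. The figure below shows the layout: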

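The benefit of this design is that no extra object has to be created for each entry added to the map. When an ArrayMap grows, it only needs to copy the contents of the two arrays rather than rebuild a hash table.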

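Note that this data structure is not suited to holding a large number of entries: the storage is plain arrays, so every insertion and deletion has to shift elements, which is comparatively slow.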
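To make the two-array layout concrete, here is a minimal read-only sketch. The class name TwoArrayLayoutSketch and its methods are hypothetical and are not part of the Android class; hash collisions are deliberately ignored to keep it short.

```java
import java.util.Arrays;

/** Hypothetical, read-only illustration of the two-array layout (not the Android class). */
final class TwoArrayLayoutSketch {
    // Sorted hash codes, one per entry.
    final int[] hashes;
    // Interleaved pairs [key0, value0, key1, value1, ...] in the same order as hashes.
    final Object[] pairs;

    TwoArrayLayoutSketch(int[] hashes, Object[] pairs) {
        this.hashes = hashes;
        this.pairs = pairs;
    }

    /** Returns the value stored for key, or null when absent. Ignores hash collisions. */
    Object get(Object key) {
        int i = Arrays.binarySearch(hashes, key.hashCode());
        if (i >= 0 && key.equals(pairs[i << 1])) {
            return pairs[(i << 1) + 1];   // the value sits right after its key
        }
        return null;
    }

    /** Builds a two-entry example; "a" hashes to 97 and "b" to 98, so hashes is sorted. */
    static TwoArrayLayoutSketch demo() {
        return new TwoArrayLayoutSketch(
                new int[] {"a".hashCode(), "b".hashCode()},
                new Object[] {"a", 1, "b", 2});
    }
}
```

Note that a lookup touches only the int array until a candidate hash is found, and no per-entry node object exists anywhere.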

III. Source code analysis

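As described above, the class holds two arrays. Their declarations are shown below, together with a size field that records the number of entries stored: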

int[] mHashes;
Object[] mArray;
int mSize;

1. Constructors

public ArrayMap() {
    mHashes = ContainerHelpers.EMPTY_INTS;
    mArray = ContainerHelpers.EMPTY_OBJECTS;
    mSize = 0;
}

This simply initializes the two arrays and the size: both arrays are set to empty arrays and the size is set to 0.

The empty arrays are defined in ContainerHelpers:

static final int[] EMPTY_INTS = new int[0];
static final Object[] EMPTY_OBJECTS = new Object[0];

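As these definitions show, no storage is allocated for the arrays at construction time. Every empty map shares the same zero-length arrays, and real arrays are allocated only when they are first needed. This is one reason ArrayMap uses less memory than HashMap.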

The second constructor

public ArrayMap(int capacity) {
    if (capacity == 0) {
        mHashes = ContainerHelpers.EMPTY_INTS;
        mArray = ContainerHelpers.EMPTY_OBJECTS;
    } else {
        allocArrays(capacity);
    }
    mSize = 0;
}

This constructor takes an initial capacity. If capacity is 0, it behaves exactly like the first constructor; otherwise it allocates the arrays by calling allocArrays:

private void allocArrays(final int size) {
    if (mHashes == EMPTY_IMMUTABLE_INTS) {
        throw new UnsupportedOperationException("ArrayMap is immutable");
    }
    if (size == (BASE_SIZE*2)) {
        synchronized (ArrayMap.class) {
            if (mTwiceBaseCache != null) {
                final Object[] array = mTwiceBaseCache;
                mArray = array;
                mTwiceBaseCache = (Object[])array[0];
                mHashes = (int[])array[1];
                array[0] = array[1] = null;
                mTwiceBaseCacheSize--;
                if (DEBUG) Log.d(TAG, "Retrieving 2x cache " + mHashes
                        + " now have " + mTwiceBaseCacheSize + " entries");
                return;
            }
        }
    } else if (size == BASE_SIZE) {
        synchronized (ArrayMap.class) {
            if (mBaseCache != null) {
                final Object[] array = mBaseCache;
                mArray = array;
                mBaseCache = (Object[])array[0];
                mHashes = (int[])array[1];
                array[0] = array[1] = null;
                mBaseCacheSize--;
                if (DEBUG) Log.d(TAG, "Retrieving 1x cache " + mHashes
                        + " now have " + mBaseCacheSize + " entries");
                return;
            }
        }
    }

    mHashes = new int[size];
    mArray = new Object[size<<1];
}

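Arrays of the two sizes BASE_SIZE*2 and BASE_SIZE are not simply discarded when they are released; they are cached instead. When an allocation asks for one of these two sizes and the cache is not empty, the arrays are taken straight from the cache. Otherwise two new arrays are created. The second array stores key-value pairs, so its length is twice size; size<<1 is a left shift by one bit, which is the same as multiplying by 2, as the figure above shows.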

Next we look at how arrays are cached and how the cache is used, starting with the code that releases arrays.

private static void freeArrays(final int[] hashes, final Object[] array, final int size) {
    if (hashes.length == (BASE_SIZE*2)) {
        synchronized (ArrayMap.class) {
            if (mTwiceBaseCacheSize < CACHE_SIZE) {
                array[0] = mTwiceBaseCache;
                array[1] = hashes;
                for (int i=(size<<1)-1; i>=2; i--) {
                    array[i] = null;
                }
                mTwiceBaseCache = array;
                mTwiceBaseCacheSize++;
                if (DEBUG) Log.d(TAG, "Storing 2x cache " + array
                        + " now have " + mTwiceBaseCacheSize + " entries");
            }
        }
    } else if (hashes.length == BASE_SIZE) {
        synchronized (ArrayMap.class) {
            if (mBaseCacheSize < CACHE_SIZE) {
                array[0] = mBaseCache;
                array[1] = hashes;
                for (int i=(size<<1)-1; i>=2; i--) {
                    array[i] = null;
                }
                mBaseCache = array;
                mBaseCacheSize++;
                if (DEBUG) Log.d(TAG, "Storing 1x cache " + array
                        + " now have " + mBaseCacheSize + " entries");
            }
        }
    }
}

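The caching process is the same for mBaseCache and mTwiceBaseCache, so we use mTwiceBaseCache as the example.

To keep the cache from growing without bound, the code sets an upper limit, CACHE_SIZE, whose value is 10. Once that many arrays are cached, no more are added, so the first step is to check mTwiceBaseCacheSize.

Assuming the limit CACHE_SIZE has not been reached, the core code runs: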

array[0] = mTwiceBaseCache;
array[1] = hashes;
for (int i=(size<<1)-1; i>=2; i--) {
    array[i] = null;
}
mTwiceBaseCache = array;
mTwiceBaseCacheSize++;

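After the first call to freeArrays (assuming the cache was empty), mTwiceBaseCache points to the released array, whose element 0 is null and whose element 1 is its hash array, as in the figure below:

After the second call to freeArrays, mTwiceBaseCache points to the newly released array, whose element 0 points to the array cached by the first call, as in the figure below:

Reading the core code alongside the figures makes the process clear.

In allocArrays, the core code that uses the cache is: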

final Object[] array = mTwiceBaseCache;
mArray = array;
mTwiceBaseCache = (Object[])array[0];
mHashes = (int[])array[1];
array[0] = array[1] = null;
mTwiceBaseCacheSize--;

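mTwiceBaseCache points to a whole Object array, which is itself a key-value array, so it can be taken and used directly to store key-value pairs. Its first element holds a reference to the previously cached key-value array; that reference is stored back into mTwiceBaseCache so the next allocation can pick up the next cached array. Its second element holds a reference to the matching hash array, which becomes mHashes. Both slots are then cleared before the arrays are used.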
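The release and reuse steps can be sketched together as a small pool. This is a simplified illustration under stated assumptions: the names ArrayPoolSketch, Slot, release and acquire are hypothetical, there is a single size class, and the Slot holder is used only so the sketch can return both arrays (the real code assigns them to fields and allocates nothing).

```java
import java.util.Arrays;

/** Hypothetical sketch (not the Android class): cached arrays form a singly linked list. */
final class ArrayPoolSketch {
    /** A pair array together with its matching hash array. */
    static final class Slot {
        final int[] hashes;
        final Object[] pairs;
        Slot(int[] hashes, Object[] pairs) { this.hashes = hashes; this.pairs = pairs; }
    }

    static final int CACHE_SIZE = 10;  // upper bound quoted in the text
    private static Object[] head;      // plays the role of mTwiceBaseCache
    private static int cached;         // plays the role of mTwiceBaseCacheSize

    /** Caches both arrays; pairs must have at least 2 slots. Returns false when full. */
    static synchronized boolean release(int[] hashes, Object[] pairs) {
        if (cached >= CACHE_SIZE) return false;
        Arrays.fill(pairs, null);  // drop stale keys and values so they can be collected
        pairs[0] = head;           // link to the previously cached pair array
        pairs[1] = hashes;         // keep the matching hash array next to it
        head = pairs;
        cached++;
        return true;
    }

    /** Takes the most recently cached arrays, or returns null when the cache is empty. */
    static synchronized Slot acquire() {
        if (head == null) return null;
        Object[] pairs = head;
        head = (Object[]) pairs[0];        // the next cached array becomes the new head
        int[] hashes = (int[]) pairs[1];
        pairs[0] = pairs[1] = null;        // clear the bookkeeping slots before reuse
        cached--;
        return new Slot(hashes, pairs);
    }
}
```

Because the link and the hash array are stored inside the cached array itself, the cache needs no extra list nodes, and arrays come back in last-in, first-out order.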

Next is another constructor, which creates an immutable ArrayMap.

private ArrayMap(boolean immutable) {
    mHashes = EMPTY_IMMUTABLE_INTS;
    mArray = ContainerHelpers.EMPTY_OBJECTS;
    mSize = 0;
}

The only difference is the value assigned to mHashes.

static final int[] EMPTY_IMMUTABLE_INTS = new int[0];

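EMPTY_IMMUTABLE_INTS is a static zero-length array that serves as a marker. It has the same contents as ContainerHelpers.EMPTY_INTS but is a different object, so allocArrays can detect an immutable map by comparing mHashes against it with == and throw UnsupportedOperationException, as seen at the top of allocArrays above.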
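The marker works because == on arrays compares identity, not contents. A minimal sketch of the idea follows; the class SentinelSketch and the method isImmutable are hypothetical names used only for illustration.

```java
import java.util.Arrays;

/** Hypothetical sketch: two equal-looking empty arrays told apart by identity. */
final class SentinelSketch {
    static final int[] EMPTY_INTS = new int[0];
    static final int[] EMPTY_IMMUTABLE_INTS = new int[0];

    /** True only for the marker object itself, never for another empty array. */
    static boolean isImmutable(int[] hashes) {
        return hashes == EMPTY_IMMUTABLE_INTS;
    }

    /** The two arrays have identical contents, so a content comparison cannot separate them. */
    static boolean sameContents() {
        return Arrays.equals(EMPTY_INTS, EMPTY_IMMUTABLE_INTS);
    }
}
```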

The last constructor initializes one ArrayMap from another.

public ArrayMap(ArrayMap<K, V> map) {
    this();
    if (map != null) {
        putAll(map);
    }
}

As the implementation shows, it just calls putAll.

public void putAll(ArrayMap<? extends K, ? extends V> array) {
    final int N = array.mSize;
    ensureCapacity(mSize + N);
    if (mSize == 0) {
        if (N > 0) {
            System.arraycopy(array.mHashes, 0, mHashes, 0, N);
            System.arraycopy(array.mArray, 0, mArray, 0, N<<1);
            mSize = N;
        }
    } else {
        for (int i=0; i<N; i++) {
            put(array.keyAt(i), array.valueAt(i));
        }
    }
}

It first calls ensureCapacity, whose job is to make sure enough space is allocated. Here is its implementation.

public void ensureCapacity(int minimumCapacity) {
    if (mHashes.length < minimumCapacity) {
        final int[] ohashes = mHashes;
        final Object[] oarray = mArray;
        allocArrays(minimumCapacity);
        if (mSize > 0) {
            System.arraycopy(ohashes, 0, mHashes, 0, mSize);
            System.arraycopy(oarray, 0, mArray, 0, mSize<<1);
        }
        freeArrays(ohashes, oarray, mSize);
    }
}

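This code is easy to follow: it compares the current length of the hash array with the requested capacity to decide whether new arrays must be allocated, and if so copies the existing entries over and releases the old arrays.

Once the space is ensured, control returns to putAll. If mSize is 0, meaning the map holds no entries yet, both arrays are copied in bulk with System.arraycopy; otherwise the entries are added one at a time with put.

Following that thread, let's look at put next. It is probably the function we use most often: it adds an entry to the map.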

@Override
public V put(K key, V value) {
    final int hash;
    int index;
    if (key == null) {
        hash = 0;
        index = indexOfNull();
    } else {
        hash = key.hashCode();
        index = indexOf(key, hash);
    }
    if (index >= 0) {
        index = (index<<1) + 1;
        final V old = (V)mArray[index];
        mArray[index] = value;
        return old;
    }

    index = ~index;
    if (mSize >= mHashes.length) {
        final int n = mSize >= (BASE_SIZE*2) ? (mSize+(mSize>>1))
                : (mSize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);

        if (DEBUG) Log.d(TAG, "put: grow from " + mHashes.length + " to " + n);

        final int[] ohashes = mHashes;
        final Object[] oarray = mArray;
        allocArrays(n);

        if (mHashes.length > 0) {
            if (DEBUG) Log.d(TAG, "put: copy 0-" + mSize + " to 0");
            System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
            System.arraycopy(oarray, 0, mArray, 0, oarray.length);
        }

        freeArrays(ohashes, oarray, mSize);
    }

    if (index < mSize) {
        if (DEBUG) Log.d(TAG, "put: move " + index + "-" + (mSize-index)
                + " to " + (index+1));
        System.arraycopy(mHashes, index, mHashes, index + 1, mSize - index);
        System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
    }

    mHashes[index] = hash;
    mArray[index<<1] = key;
    mArray[(index<<1)+1] = value;
    mSize++;
    return null;
}

As shown above, if the key is null, hash is set to 0 and indexOfNull is called to find the index of the null key.

int indexOfNull() {
    //Number of entries
    final int N = mSize;

    // If there are no entries, return immediately; there is nothing to search
    if (N == 0) {
        return ~0;
    }
    //Binary-search mHashes for an entry whose hash is 0
    int index = ContainerHelpers.binarySearch(mHashes, N, 0);

    // Not found: there is no entry with hash 0, so index < 0
    if (index < 0) {
        return index;
    }

    // A hash of 0 was found. If the key stored at that index is null, return index
    //index<<1 equals index*2 because mArray stores interleaved key-value pairs
    if (null == mArray[index<<1]) {
        return index;
    }

    // Otherwise scan the entries after index that also have hash 0
    int end;
    for (end = index + 1; end < N && mHashes[end] == 0; end++) {
        if (null == mArray[end << 1]) return end;
    }

    // Then scan the entries before index that also have hash 0
    for (int i = index - 1; i >= 0 && mHashes[i] == 0; i--) {
        if (null == mArray[i << 1]) return i;
    }

    //Key not found: return the bitwise complement of the insertion position (a negative value), meaning a new entry must be added
    return ~end;
}

Back in put: if the key is not null, indexOf is called to find its index.

int indexOf(Object key, int hash) {
    final int N = mSize;

    // Important fast case: if nothing is in here, nothing to look for.
    if (N == 0) {
        return ~0;
    }

    int index = ContainerHelpers.binarySearch(mHashes, N, hash);

    // If the hash code wasn't found, then we have no entry for this key.
    if (index < 0) {
        return index;
    }

    // If the key at the returned index matches, that's what we want.
    if (key.equals(mArray[index<<1])) {
        return index;
    }

    // Search for a matching key after the index.
    int end;
    for (end = index + 1; end < N && mHashes[end] == hash; end++) {
        if (key.equals(mArray[end << 1])) return end;
    }

    // Search for a matching key before the index.
    for (int i = index - 1; i >= 0 && mHashes[i] == hash; i--) {
        if (key.equals(mArray[i << 1])) return i;
    }

    // Key not found -- return negative value indicating where a
    // new entry for this key should go.  We use the end of the
    // hash chain to reduce the number of array entries that will
    // need to be copied when inserting.
    return ~end;
}
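To see the collision handling in action, the same search strategy can be run against plain arrays. This is a sketch, not the Android code: the class IndexOfSketch is hypothetical, and java.util.Arrays.binarySearch over a range stands in for ContainerHelpers.binarySearch on the assumption that both return the complement of the insertion point on a miss.

```java
import java.util.Arrays;

/** Hypothetical sketch of the lookup: binary search on hashes, then scan the equal-hash run. */
final class IndexOfSketch {
    static int indexOf(int[] hashes, Object[] pairs, int size, Object key, int hash) {
        if (size == 0) return ~0;
        int index = Arrays.binarySearch(hashes, 0, size, hash);
        if (index < 0) return index;                       // hash absent: already ~insertionPoint
        if (key.equals(pairs[index << 1])) return index;   // first candidate matches
        int end;
        // Other keys with the same hash sit next to each other; look to the right first.
        for (end = index + 1; end < size && hashes[end] == hash; end++) {
            if (key.equals(pairs[end << 1])) return end;
        }
        // Then look to the left.
        for (int i = index - 1; i >= 0 && hashes[i] == hash; i--) {
            if (key.equals(pairs[i << 1])) return i;
        }
        return ~end;   // not present: insert at the end of the equal-hash run
    }

    /** Example data: "Aa", "BB" and "C#" all have hash code 2112; "z" has 122. */
    static int lookup(String key) {
        int[] hashes = {"z".hashCode(), "Aa".hashCode(), "BB".hashCode()};
        Object[] pairs = {"z", 0, "Aa", 1, "BB", 2};
        return indexOf(hashes, pairs, 3, key, key.hashCode());
    }
}
```

With this data, looking up "BB" first lands on the slot holding "Aa" and finds the right entry by scanning forward, while "C#", which shares the hash but is absent, yields a negative result.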

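The process is essentially the same as in indexOfNull, so we won't analyze it again. The figure below illustrates it:

Back in the calling function, put: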

If the returned index is >= 0, the array already contains this key, so we only need to update its value.

If the returned index is < 0, the key was not found, so a new entry must be inserted.
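The negative result is the bitwise complement of the insertion position, not its arithmetic negation. The small sketch below (class and method names are hypothetical) shows why this matters: position 0 still maps to a negative number, which plain negation could not express.

```java
/** Hypothetical sketch of the "~index" convention for reporting a missing key. */
final class ComplementSketch {
    /** Encodes "not found, insert here" as a negative number. */
    static int encodeMissing(int insertionPoint) {
        return ~insertionPoint;   // equals -insertionPoint - 1, so 0 becomes -1
    }

    /** Recovers the insertion position from a negative result. */
    static int decode(int result) {
        return ~result;
    }
}
```

Applying ~ twice returns the original value, which is why put can recover the position with index = ~index.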

Let's pull out the core code from above and analyze it in detail.

//Recover the insertion position encoded in the negative return value
index = ~index;
//Check whether the arrays need to grow
if (mSize >= mHashes.length) {
    //New capacity: BASE_SIZE first, then BASE_SIZE*2, and after that 1.5 times the current size
    final int n = mSize >= (BASE_SIZE*2) ? (mSize+(mSize>>1))
            : (mSize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);

    if (DEBUG) Log.d(TAG, "put: grow from " + mHashes.length + " to " + n);
    //Keep references to the two current arrays
    final int[] ohashes = mHashes;
    final Object[] oarray = mArray;
    //Allocate two new arrays with the capacity n computed above
    allocArrays(n);

    //Copy the contents of the two old arrays into the new ones
    if (mHashes.length > 0) {
        if (DEBUG) Log.d(TAG, "put: copy 0-" + mSize + " to 0");
        System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
        System.arraycopy(oarray, 0, mArray, 0, oarray.length);
    }
    //Release the two old arrays
    freeArrays(ohashes, oarray, mSize);
}
//Shift everything from index onward back by one position to free slot index for the entry being inserted
if (index < mSize) {
    if (DEBUG) Log.d(TAG, "put: move " + index + "-" + (mSize-index)
            + " to " + (index+1));
    System.arraycopy(mHashes, index, mHashes, index + 1, mSize - index);
    System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
}
//Store the new entry at index and increment mSize
mHashes[index] = hash;
mArray[index<<1] = key;
mArray[(index<<1)+1] = value;
mSize++;

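As noted earlier, insertion is comparatively slow because it has to shift the array contents as a block. The figure below illustrates the effect: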
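The growth rule and the shift can be tried out in isolation. The sketch below is an illustration under stated assumptions: the class GrowthSketch and its methods are hypothetical, and BASE_SIZE is taken to be 4, which is not stated in the listings above.

```java
/** Hypothetical sketch of the growth rule and the insertion shift used by put. */
final class GrowthSketch {
    static final int BASE_SIZE = 4;   // assumption: value not shown in the listings above

    /** Capacity chosen when a map holding size entries is full. */
    static int grownCapacity(int size) {
        return size >= (BASE_SIZE * 2) ? (size + (size >> 1))
                : (size >= BASE_SIZE ? (BASE_SIZE * 2) : BASE_SIZE);
    }

    /** Opens a gap at index in both arrays, stores the entry, and returns the new size. */
    static int insertAt(int[] hashes, Object[] pairs, int size, int index,
                        int hash, Object key, Object value) {
        if (index < size) {
            // One slot per entry in hashes, two slots per entry in pairs.
            System.arraycopy(hashes, index, hashes, index + 1, size - index);
            System.arraycopy(pairs, index << 1, pairs, (index + 1) << 1, (size - index) << 1);
        }
        hashes[index] = hash;
        pairs[index << 1] = key;
        pairs[(index << 1) + 1] = value;
        return size + 1;
    }
}
```

Under that assumption the capacity goes 4, 8, 12, 18 and so on, and every insertion in the middle moves all later entries, which is the cost the text refers to.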

Now let's look at removal:

public V removeAt(int index) {
    //Get the value stored at the given index
    final Object old = mArray[(index << 1) + 1];
    //If mSize <= 1, removing this entry leaves the map empty, so release the arrays and reset to the empty state
    if (mSize <= 1) {
        // Now empty.
        if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to 0");
        freeArrays(mHashes, mArray, mSize);
        mHashes = ContainerHelpers.EMPTY_INTS;
        mArray = ContainerHelpers.EMPTY_OBJECTS;
        mSize = 0;
    } else {
        //If the allocated capacity is greater than BASE_SIZE*2 and fewer than mHashes.length/3 entries are in use
        if (mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3) {
            // Shrunk enough to reduce size of arrays.  We don't allow it to
            // shrink smaller than (BASE_SIZE*2) to avoid flapping between
            // that and BASE_SIZE.
            //Shrink the arrays to avoid holding space that is not needed
            final int n = mSize > (BASE_SIZE*2) ? (mSize + (mSize>>1)) : (BASE_SIZE*2);

            if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to " + n);

            final int[] ohashes = mHashes;
            final Object[] oarray = mArray;
            //Allocate new, smaller arrays
            allocArrays(n);

            mSize--;
            //Copy every entry except the removed one from the old arrays into the new ones
            if (index > 0) {
                if (DEBUG) Log.d(TAG, "remove: copy from 0-" + index + " to 0");
                System.arraycopy(ohashes, 0, mHashes, 0, index);
                System.arraycopy(oarray, 0, mArray, 0, index << 1);
            }
            if (index < mSize) {
                if (DEBUG) Log.d(TAG, "remove: copy from " + (index+1) + "-" + mSize
                        + " to " + index);
                System.arraycopy(ohashes, index + 1, mHashes, index, mSize - index);
                System.arraycopy(oarray, (index + 1) << 1, mArray, index << 1,
                        (mSize - index) << 1);
            }
        } else {
            //No reallocation is needed:
            //shift the entries after index forward by one position
            mSize--;
            if (index < mSize) {
                if (DEBUG) Log.d(TAG, "remove: move " + (index+1) + "-" + mSize
                        + " to " + index);
                System.arraycopy(mHashes, index + 1, mHashes, index, mSize - index);
                System.arraycopy(mArray, (index + 1) << 1, mArray, index << 1,
                        (mSize - index) << 1);
            }
            //The last slot is now unused; set it to null
            mArray[mSize << 1] = null;
            mArray[(mSize << 1) + 1] = null;
        }
    }
    return (V)old;
}

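The comments above explain the process. There are two cases:

1. When space utilization is too low, smaller arrays are allocated, the remaining entries are copied into them, and the old arrays are dropped for the garbage collector, which keeps memory usage low.

2. When space utilization is acceptable, the removal is done in place in the existing arrays by shifting the later entries forward.

Removal also has to shift array contents as a block, so it is comparatively slow. This is another reason ArrayMap is not suited to large data sets.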
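The two cases can be sketched as a shrink test plus an in-place removal. As before this is only an illustration: the class RemoveSketch and its methods are hypothetical, and BASE_SIZE is assumed to be 4.

```java
/** Hypothetical sketch of the shrink decision and the in-place removal path. */
final class RemoveSketch {
    static final int BASE_SIZE = 4;   // assumption: value not shown in the listings above

    /** True when removing from a map with this capacity and size should reallocate. */
    static boolean shouldShrink(int capacity, int size) {
        return capacity > (BASE_SIZE * 2) && size < capacity / 3;
    }

    /** Capacity of the smaller arrays; never below BASE_SIZE*2. */
    static int shrunkCapacity(int size) {
        return size > (BASE_SIZE * 2) ? (size + (size >> 1)) : (BASE_SIZE * 2);
    }

    /** Removes the entry at index by shifting later entries forward; returns the new size. */
    static int removeInPlace(int[] hashes, Object[] pairs, int size, int index) {
        int newSize = size - 1;
        if (index < newSize) {
            System.arraycopy(hashes, index + 1, hashes, index, newSize - index);
            System.arraycopy(pairs, (index + 1) << 1, pairs, index << 1, (newSize - index) << 1);
        }
        // Clear the vacated last pair so the old key and value can be collected.
        pairs[newSize << 1] = null;
        pairs[(newSize << 1) + 1] = null;
        return newSize;
    }
}
```

Keeping the floor at BASE_SIZE*2 matches the comment in removeAt about avoiding flapping between the two cached sizes.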

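The figure below illustrates the effect:

With the above in mind, ArrayMap is worth considering only when both of the following conditions hold:

1. The number of objects is ideally under a thousand;

2. The data is organized as a map structure.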
