In the JDK version quoted here, StringBuilder keeps a char[] internally, and every append() call simply writes more data into that array.

With new StringBuilder(), the char[] starts at a default length of 16. So what happens when you append the 17th character?


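The array has to grow: a larger char[] is allocated and the old contents are copied into it. That costs an array copy, and the old char[] is wasted and left for the GC. For a 129-character string appended one character at a time, the growth rule shown later (length * 2 + 2) takes the capacity through 16, 34, 70 and 142: three arrays are copied and discarded, and 262 chars are allocated in total. In high-performance code that is hard to tolerate.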
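To make that cost visible, here is a minimal sketch that counts how often capacity() changes while characters are appended one at a time, and compares a default builder with one presized through the StringBuilder(int capacity) constructor. The class and method names (CapacityGrowthDemo, countGrowths) are made up for this illustration, and the exact number of growths for the default builder depends on the JDK's growth policy.

```java
class CapacityGrowthDemo {
    /** Appends n chars one at a time and returns how many times capacity() changed. */
    static int countGrowths(StringBuilder sb, int n) {
        int growths = 0;
        int last = sb.capacity();
        for (int i = 0; i < n; i++) {
            sb.append('x');
            if (sb.capacity() != last) {   // the backing array was replaced
                growths++;
                last = sb.capacity();
            }
        }
        return growths;
    }

    public static void main(String[] args) {
        // Default capacity is 16, so appending 129 chars forces several reallocations.
        System.out.println("default:  " + countGrowths(new StringBuilder(), 129));
        // Presizing to the final length avoids every reallocation.
        System.out.println("presized: " + countGrowths(new StringBuilder(129), 129));
    }
}
```

When the final length is known, or can be estimated, passing it to the constructor is the simplest way to avoid the copy-and-discard cycle.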



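StringBuilder and StringBuffer both extend AbstractStringBuilder. AbstractStringBuilder is an abstract class (not an interface) that implements the Appendable and CharSequence interfaces, so let's look at those two interfaces first: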

package java.lang;

public interface CharSequence {

    /**
     * Returns the length of this character sequence.  The length is the number
     * of 16-bit <code>char</code>s in the sequence.
     */
    int length();

    /**
     * Returns the <code>char</code> value at the specified index.  An index ranges from zero
     * to <tt>length() - 1</tt>.  The first <code>char</code> value of the sequence is at
     * index zero, the next at index one, and so on, as for array
     * indexing.
     */
    char charAt(int index);

    /**
     * Returns a new <code>CharSequence</code> that is a subsequence of this sequence.
     * The subsequence starts with the <code>char</code> value at the specified index and
     * ends with the <code>char</code> value at index <tt>end - 1</tt>.  The length
     * (in <code>char</code>s) of the
     * returned sequence is <tt>end - start</tt>, so if <tt>start == end</tt>
     * then an empty sequence is returned.
     */
    CharSequence subSequence(int start, int end);

    /**
     * Returns a string containing the characters in this sequence in the same
     * order as this sequence.  The length of the string will be the length of
     * this sequence.
     */
    public String toString();
}
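To see how small this contract is, here is a minimal sketch of a custom CharSequence. RepeatedChar is a hypothetical class invented for this example, not part of the JDK; it represents one character repeated a number of times and can be passed straight to StringBuilder.append(CharSequence).

```java
/** Hypothetical example type: the same char repeated count times. */
class RepeatedChar implements CharSequence {
    private final char c;
    private final int count;

    RepeatedChar(char c, int count) {
        if (count < 0) throw new IllegalArgumentException("count " + count);
        this.c = c;
        this.count = count;
    }

    @Override
    public int length() {
        return count;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= count)
            throw new IndexOutOfBoundsException("index " + index);
        return c;
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > count || start > end)
            throw new IndexOutOfBoundsException("start " + start + ", end " + end);
        return new RepeatedChar(c, end - start);   // length is end - start
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) sb.append(c);
        return sb.toString();
    }

    public static void main(String[] args) {
        StringBuilder line = new StringBuilder("title ");
        line.append(new RepeatedChar('-', 5));   // any CharSequence is accepted
        System.out.println(line);
    }
}
```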



package java.lang;

import java.io.IOException;

/**
 * An object to which <tt>char</tt> sequences and values can be appended.  The
 * <tt>Appendable</tt> interface must be implemented by any class whose
 * instances are intended to receive formatted output from a {@link
 * java.util.Formatter}.
 */
public interface Appendable {

    /**
     * Appends the specified character sequence to this <tt>Appendable</tt>.
     * <p> Depending on which class implements the character sequence
     * <tt>csq</tt>, the entire sequence may not be appended.  For
     * instance, if <tt>csq</tt> is a {@link java.nio.CharBuffer} then
     * the subsequence to append is defined by the buffer's position and limit.
     * @param  csq
     *         The character sequence to append.  If <tt>csq</tt> is
     *         <tt>null</tt>, then the four characters <tt>"null"</tt> are
     *         appended to this Appendable.
     * @return  A reference to this <tt>Appendable</tt>
     * @throws  IOException
     *          If an I/O error occurs
     */
    Appendable append(CharSequence csq) throws IOException;

    /**
     * Appends a subsequence of the specified character sequence to this
     * <tt>Appendable</tt>.
     * <p> An invocation of this method of the form <tt>out.append(csq, start,
     * end)</tt> when <tt>csq</tt> is not <tt>null</tt>, behaves in
     * exactly the same way as the invocation
     * <pre>
     *     out.append(csq.subSequence(start, end)) </pre>
     * @param  csq
     *         The character sequence from which a subsequence will be
     *         appended.  If <tt>csq</tt> is <tt>null</tt>, then characters
     *         will be appended as if <tt>csq</tt> contained the four
     *         characters <tt>"null"</tt>.
     * @param  start
     *         The index of the first character in the subsequence
     * @param  end
     *         The index of the character following the last character in the
     *         subsequence
     * @return  A reference to this <tt>Appendable</tt>
     * @throws  IndexOutOfBoundsException
     *          If <tt>start</tt> or <tt>end</tt> are negative, <tt>start</tt>
     *          is greater than <tt>end</tt>, or <tt>end</tt> is greater than
     *          <tt>csq.length()</tt>
     * @throws  IOException
     *          If an I/O error occurs
     */
    Appendable append(CharSequence csq, int start, int end) throws IOException;

    /**
     * Appends the specified character to this <tt>Appendable</tt>.
     * @param  c
     *         The character to append
     * @return  A reference to this <tt>Appendable</tt>
     * @throws  IOException
     *          If an I/O error occurs
     */
    Appendable append(char c) throws IOException;
}
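As a small illustration of why this interface is useful, the sketch below writes to an Appendable without caring whether it is a StringBuilder or a StringWriter. AppendableDemo, writePair and render are names invented for this example. It also shows the documented rule that a null CharSequence is appended as the four characters "null".

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

class AppendableDemo {
    /** Writes key=value to any Appendable; each append returns the target, so calls chain. */
    static void writePair(Appendable out, String key, CharSequence value) throws IOException {
        out.append(key).append('=').append(value);
    }

    /** Convenience wrapper around a StringBuilder target. */
    static String render(String key, CharSequence value) {
        StringBuilder sb = new StringBuilder();
        try {
            writePair(sb, key, value);
        } catch (IOException e) {
            // Appendable declares IOException, so the call still has to be guarded.
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(render("name", "duke"));
        System.out.println(render("name", null));   // null is written as "null"

        StringWriter writer = new StringWriter();   // a different Appendable
        writePair(writer, "mode", "fast");
        System.out.println(writer);
    }
}
```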

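The comments are superbly written, and I couldn't bring myself to cut them. This interface is exactly what its Javadoc says: "An object to which char sequences and values can be appended". Java took the "append" in "can be appended" and gave it an interface of its own, which feels meticulous and disciplined (forgive my clumsy wording). Now we finally reach AbstractStringBuilder. To understand anything in Java you have to peel it back layer by layer, since there is always a pile of superclasses and interfaces behind it!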
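Before digging in, the inheritance chain can be checked from the outside with plain reflection on Class objects. This is only a sketch; HierarchyDemo and describe are names invented here. AbstractStringBuilder is package-private, so it is reached through getSuperclass() rather than referenced by name in code.

```java
import java.lang.reflect.Modifier;

class HierarchyDemo {
    /** Describes the direct superclass of the given type. */
    static String describe(Class<?> type) {
        Class<?> parent = type.getSuperclass();
        String kind = parent.isInterface() ? "interface"
                : Modifier.isAbstract(parent.getModifiers()) ? "abstract class"
                : "class";
        return type.getSimpleName() + " extends " + kind + " " + parent.getName();
    }

    public static void main(String[] args) {
        System.out.println(describe(StringBuilder.class));
        System.out.println(describe(StringBuffer.class));
        // Both inherit the two interfaces discussed above.
        System.out.println(CharSequence.class.isAssignableFrom(StringBuilder.class));
        System.out.println(Appendable.class.isAssignableFrom(StringBuffer.class));
    }
}
```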


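It contains these methods: ensureCapacity(int minimumCapacity), ensureCapacityInternal(int minimumCapacity) and expandCapacity(int minimumCapacity). Their job is to make sure the current capacity, i.e. value.length, is at least minimumCapacity. If the capacity is smaller than that argument, the internal array (value) has to be reallocated, which is what expandCapacity(int minimumCapacity) does: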

void expandCapacity(int minimumCapacity) {
    int newCapacity = value.length * 2 + 2;
    if (newCapacity - minimumCapacity < 0)
        newCapacity = minimumCapacity;
    if (newCapacity < 0) {
        if (minimumCapacity < 0) // overflow
            throw new OutOfMemoryError();
        newCapacity = Integer.MAX_VALUE;
    }
    value = Arrays.copyOf(value, newCapacity);
}

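First the old length is doubled and 2 is added. If that is still less than minimumCapacity, the new capacity is simply set to minimumCapacity. The array is then grown with Arrays.copyOf(value, newCapacity). I naively thought we wouldn't need to drag in anything from other classes; allow me to cough up a little more blood. Here is the copyOf method: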

public static char[] copyOf(char[] original, int newLength) {
    char[] copy = new char[newLength];
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}

public static native void arraycopy(Object src, int srcPos, Object dest, int destPos, int length);
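The growth rule and the copy step can both be observed through public API. The sketch below is an illustration only: grownCapacity is a hypothetical helper that mirrors the rule quoted above while ignoring the overflow branch, and GrowAndCopyDemo is an invented class name. It compares the helper with what ensureCapacity actually does, then shows that Arrays.copyOf keeps the old contents and zero-fills the new slots.

```java
import java.util.Arrays;

class GrowAndCopyDemo {
    /** Mirrors the quoted rule: the larger of (current * 2 + 2) and the requested minimum. */
    static int grownCapacity(int current, int minimum) {
        return Math.max(current * 2 + 2, minimum);
    }

    public static void main(String[] args) {
        StringBuilder small = new StringBuilder();   // capacity 16
        small.ensureCapacity(20);                    // 16 * 2 + 2 already covers 20
        System.out.println(small.capacity() + " / " + grownCapacity(16, 20));

        StringBuilder big = new StringBuilder();     // capacity 16
        big.ensureCapacity(100);                     // doubling is not enough, so 100 wins
        System.out.println(big.capacity() + " / " + grownCapacity(16, 100));

        // Arrays.copyOf allocates a new array; the extra slots hold the zero char.
        char[] original = {'a', 'b'};
        char[] grown = Arrays.copyOf(original, 4);
        System.out.println(grown.length + " " + (grown[2] == '\u0000') + " " + (grown != original));
    }
}
```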



AbstractStringBuilder has many append overloads (for String, StringBuffer, CharSequence and so on), but most of them follow the same idea. For example:

public AbstractStringBuilder append(String str) {
    if (str == null) str = "null";
    int len = str.length();
    ensureCapacityInternal(count + len);
    str.getChars(0, len, value, count);
    count += len;
    return this;
}


public void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin)
{
    if (srcBegin < 0)
        throw new StringIndexOutOfBoundsException(srcBegin);
    if ((srcEnd < 0) || (srcEnd > count))
        throw new StringIndexOutOfBoundsException(srcEnd);
    if (srcBegin > srcEnd)
        throw new StringIndexOutOfBoundsException("srcBegin > srcEnd");
    System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
}
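Two details of the code above are easy to confirm from the caller's side: a null String is appended as "null", and getChars copies a range of characters into a destination array. A minimal sketch, with AppendDemo, appendAll and copyRange as names invented for the example:

```java
class AppendDemo {
    /** Appends every part in order; a null part ends up as the text "null". */
    static String appendAll(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    /** Copies the chars in [begin, end) out of s using String.getChars. */
    static String copyRange(String s, int begin, int end) {
        char[] dst = new char[end - begin];
        s.getChars(begin, end, dst, 0);   // source range, destination array, destination offset
        return new String(dst);
    }

    public static void main(String[] args) {
        System.out.println(appendAll("id=", null));
        System.out.println(copyRange("hello world", 6, 11));
    }
}
```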



public AbstractStringBuilder delete(int start, int end) {
    if (start < 0)
        throw new StringIndexOutOfBoundsException(start);
    if (end > count)
        end = count;
    if (start > end)
        throw new StringIndexOutOfBoundsException();
    int len = end - start;
    if (len > 0) {
        System.arraycopy(value, start+len, value, start, count-end);
        count -= len;
    }
    return this;
}
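The bounds handling in delete is worth trying out: an end past the current length is silently clamped, while a negative start or a start greater than end is rejected. The sketch below exercises those branches through StringBuilder; DeleteDemo, deleteRange and rejects are names invented for this example.

```java
class DeleteDemo {
    /** Removes the chars in [start, end) from a copy of text. */
    static String deleteRange(String text, int start, int end) {
        return new StringBuilder(text).delete(start, end).toString();
    }

    /** Returns true if delete refuses the given range. */
    static boolean rejects(String text, int start, int end) {
        try {
            deleteRange(text, start, end);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(deleteRange("hello world", 5, 100));   // end is clamped to the length
        System.out.println(deleteRange("hello world", 0, 6));     // the tail is shifted left
        System.out.println(rejects("abc", 2, 1));                 // start > end
        System.out.println(rejects("abc", -1, 2));                // negative start
    }
}
```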



public AbstractStringBuilder insert(int index, char[] str, int offset,
                                    int len)
{
    if ((index < 0) || (index > length()))
        throw new StringIndexOutOfBoundsException(index);
    if ((offset < 0) || (len < 0) || (offset > str.length - len))
        throw new StringIndexOutOfBoundsException(
            "offset " + offset + ", len " + len + ", str.length "
            + str.length);
    ensureCapacityInternal(count + len);
    System.arraycopy(value, index, value, index + len, count - index);
    System.arraycopy(str, offset, value, index, len);
    count += len;
    return this;
}
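The two arraycopy calls in insert first shift the tail to the right to open a gap, then copy the requested slice of the source array into it. A short usage sketch, with InsertDemo and insertSlice as names invented for this example:

```java
class InsertDemo {
    /** Inserts len chars of src, starting at offset, into a copy of text at index. */
    static String insertSlice(String text, int index, char[] src, int offset, int len) {
        return new StringBuilder(text).insert(index, src, offset, len).toString();
    }

    public static void main(String[] args) {
        char[] src = {'x', ',', ' ', 'y'};
        // Only src[1] and src[2] (", ") are inserted, at position 5.
        System.out.println(insertSlice("helloworld", 5, src, 1, 2));

        try {
            insertSlice("abc", 4, src, 0, 1);   // index is past the end of the text
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```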



public AbstractStringBuilder reverse() {
    boolean hasSurrogate = false;
    int n = count - 1;
    for (int j = (n-1) >> 1; j >= 0; --j) {
        char temp = value[j];
        char temp2 = value[n - j];
        if (!hasSurrogate) {
            hasSurrogate = (temp >= Character.MIN_SURROGATE && temp <= Character.MAX_SURROGATE)
                || (temp2 >= Character.MIN_SURROGATE && temp2 <= Character.MAX_SURROGATE);
        }
        value[j] = temp2;
        value[n - j] = temp;
    }
    if (hasSurrogate) {
        // Reverse back all valid surrogate pairs
        for (int i = 0; i < count - 1; i++) {
            char c2 = value[i];
            if (Character.isLowSurrogate(c2)) {
                char c1 = value[i + 1];
                if (Character.isHighSurrogate(c1)) {
                    value[i++] = c1;
                    value[i] = c2;
                }
            }
        }
    }
    return this;
}
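The second loop in reverse exists to put surrogate pairs back in high-then-low order after the plain char swap. The sketch below contrasts StringBuilder.reverse() with a naive char-by-char reversal on a string containing U+1F600, which needs two chars. ReverseDemo, builderReverse and naiveReverse are names invented for this illustration.

```java
class ReverseDemo {
    /** Reverses using StringBuilder, which repairs surrogate pairs. */
    static String builderReverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    /** Reverses char by char with no surrogate handling (for contrast only). */
    static String naiveReverse(String s) {
        char[] chars = s.toCharArray();
        for (int i = 0, j = chars.length - 1; i < j; i++, j--) {
            char tmp = chars[i];
            chars[i] = chars[j];
            chars[j] = tmp;
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00b";   // 'a', U+1F600 as a surrogate pair, 'b'
        // The pair stays in high, low order after StringBuilder.reverse().
        System.out.println(builderReverse(s).equals("b\uD83D\uDE00a"));
        // The naive version leaves low before high, which is no longer a valid pair.
        System.out.println(naiveReverse(s).equals("b\uDE00\uD83Da"));
    }
}
```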


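A complete Unicode character is called a code point, while a single Java char is a code unit. String objects store Unicode text as UTF-16, so a character outside the Basic Multilingual Plane (for example a Chinese character from one of the large extension sets) needs two chars. This representation is called a surrogate pair: the first char is the high surrogate and the second is the low surrogate. The points to watch out for are: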

To test whether a char lies in the surrogate range, use Character.isHighSurrogate() / Character.isLowSurrogate(). To combine a high/low surrogate pair into one complete Unicode code point, use Character.toCodePoint() or codePointAt().

A code point may take one char or two, so CharSequence.length() does not tell you how many characters (code points) a string contains; use String.codePointCount() / Character.codePointCount() instead.

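To locate the Nth character of a string you cannot use N directly as an offset; you have to walk from the start of the string, using String.offsetByCodePoints() / Character.offsetByCodePoints().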

To step from the current character to the previous one, offset-- is not enough either; use String.codePointBefore() / Character.codePointBefore(), or String/Character.offsetByCodePoints().

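To step from the current character to the next one, offset++ is not enough; work out the length of the current code point first and then compute the new offset, or use String/Character.offsetByCodePoints().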
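The points above can be tried out on a short string that contains one supplementary character. This is a minimal sketch; CodePointDemo and its helper methods are names invented for the example, and the sample character U+20BB7 is simply one CJK character that needs a surrogate pair.

```java
class CodePointDemo {
    /** Number of code points, as opposed to the number of chars. */
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    /** The n-th code point (0-based), found by walking from the start. */
    static int nthCodePoint(String s, int n) {
        return s.codePointAt(s.offsetByCodePoints(0, n));
    }

    /** Char offset of the code point after the one starting at offset. */
    static int nextOffset(String s, int offset) {
        return offset + Character.charCount(s.codePointAt(offset));
    }

    /** Char offset of the code point that ends just before offset. */
    static int previousOffset(String s, int offset) {
        return offset - Character.charCount(s.codePointBefore(offset));
    }

    public static void main(String[] args) {
        String s = "a\uD842\uDFB7b";   // 'a', U+20BB7 as a surrogate pair, 'b'
        System.out.println(s.length() + " chars, " + countCodePoints(s) + " code points");
        System.out.println(Character.isHighSurrogate(s.charAt(1)) + " " + Character.isLowSurrogate(s.charAt(2)));
        System.out.println(Integer.toHexString(Character.toCodePoint(s.charAt(1), s.charAt(2))));
        System.out.println((char) nthCodePoint(s, 2));   // the third character is 'b'
        System.out.println(nextOffset(s, 1) + " " + previousOffset(s, 3));
    }
}
```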



