集合的源码分析 - wtdig/study GitHub Wiki

集合的源码分析

参考资料

一、原理解析

1、提升查询效率

哈希表:将存储的对象,通过hash函数,算出为一个位置后,将存储的对象放入进去;可能存在哈希碰撞(此时,使用数组+链表的方式进行解决)

二叉树:

二叉树的优化:插入过程中对二叉排序树进行调整,从而提升性能;比如:平衡二叉树、红黑树;

红黑树是JDK中TreeMap、TreeSet的底层数据结构,在JDK1.8中HashMap也用到了红黑树

2、ArrayList:

数组是一个连续的内存空间,所以需要容量大小的确认;

创建一个无参的ArrayList,首先后创建一个空的数组,当添加第一个元素时,会默认扩容到默认的大小10,该过程涉及到数组的复

制,会导致效率降低;

创建一个指定大小的ArrayList,会直接初始化一个指定大小的数组,会一直占用空间;可以在添加元素的时候,手动进行容器的扩

容,使用ensureCapacity(int minCapacity),在插入前先扩容;

进行元素的删除时,先先进行数组的复制,以便gc的回收

源码的解析

/**
     * Appends the specified element to the end of this list.
     *
     * @param e element to be appended to this list
     * @return <tt>true</tt> (as specified by {@link Collection#add})
     */
    public boolean add(E e) {
        //添加元素时,进行容器大小初始化
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }


    private void ensureCapacityInternal(int minCapacity) {
        //如果容器为空
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            //进行大小的初始化:如果指定的大小超过默认值的10,就取指定大小,否则为默认大小10
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        //按照大小进行容器初始化
        ensureExplicitCapacity(minCapacity);
    }

    private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            //进行容器的扩容
            grow(minCapacity);
    }

/**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        //进行数据的复制
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }

LinkedList采用链表的方式进行数据的存储:在介绍链表结构时提到过,其数据单元分为数据域和指针域,使用node节点进行维护,而node节点是一个对象,在内存空间中不是连续的,没有空间大小的初始化问题,所以数据的增和删比较快,而进行数据查询时,就比较慢了。

private static class Node<E> {
        E item;//存储数据
        Node<E> next;//下一个节点
        Node<E> prev;//先前的一个节点

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }

数组与链表在处理数据时各有优缺点,数组查询速度很快而插入很慢,链表在插入时表现优秀但查询无力。哈希表则整合了数组与

链表的优点,能在插入和查找等方面都有不错的速度。

3、HashMap

HashMap是哈希表的数据结构,底层是数组+链表的结构,jdk1.8之后,加上了红黑树的优化;

首先初始化一个数组,下标存储的是key的hash值,当下标的hash冲突时,以链表的形式挂载起来,当链表挂载的元素超过8个元素后,链表将后转换成红黑树进行存储;容量的扩容,默认初始容量是16,默认的负载因子是0.75,当存储的数据超过16*0.75=12后,进行容量的扩容,直接扩大2倍,就是32了。

构造函数

public HashMap(int initialCapacity, float loadFactor) {
        //指定容量大小,负载因子
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        //寻找最近的2次幂
        this.threshold = tableSizeFor(initialCapacity);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and the default load factor (0.75).
     *
     * @param  initialCapacity the initial capacity.
     * @throws IllegalArgumentException if the initial capacity is negative.
     */
    public HashMap(int initialCapacity) {
        //指定容量大小
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the default initial capacity
     * (16) and the default load factor (0.75).
     */
    public HashMap() {
        //无参构造,采用默认的负载因子0.75
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }



 /**
     * Returns a power of two size for the given target capacity.
     */
    //寻找最近的2次幂
    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

添加一个元素

 public V put(K key, V value) {
        //添加一个元素
        return putVal(hash(key), key, value, false, true);
    }

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab;
        Node<K,V> p;
        int n, i;
        //开始为空,进行空间的分配;resize()
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            // 将元素直接插进去
            tab[i] = newNode(hash, key, value, null);
        else {
            //这时就需要链表或红黑树了
            // e是用来查看是不是待插入的元素已经有了,有就替换
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                //要插入的元素就是p,这说明目的是修改值,key值相同进行值的覆盖
                e = p;
            else if (p instanceof TreeNode)
                //将节点存储到树中
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                //进行链表的处理
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        // 链表比较长,需要树化,
                        // 由于初始即为p.next,所以当插入第9个元素才会树化
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        // size太大,达到了capacity的0.75,需要扩容
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                //容量变成原先的2倍
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        else {
            // zero initial threshold signifies using defaults
            //开始,分配默认大小空间16
            newCap = DEFAULT_INITIAL_CAPACITY;
            //上限临界值: 16*0.75 = 12 ,元素超过12时,进行2倍扩容,变成32
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
                //常见一个指定大小的数组
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    if (e.next == null)
                        //数组的指定下标放入元素
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        // 重新规划树,如果树的size很小,默认为6,就退化为链表
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        // 处理链表的数据
                        // loXXX指的是在原表中出现的位置
                        Node<K,V> loHead = null, loTail = null;
                        // hiXXX指的是在原表中不包含的位置
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            //这里把hash值与oldCap按位与。
                            //oldCap是2的次幂,所以除了最高位为1以外其他位都是0
                            // 和它按位与的结果为0,说明hash比它小,原表有这个位置
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        // 挂在原表相应位置
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        // 挂在后边
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

4、TreeMap

TreeMap是红黑树的java实现,底层是二叉树升级后的红黑树实现;

源码分析

构造函数

 public TreeMap() {
        //无参构造,比较器为空,使用自然排序
        comparator = null;
    }

    /**
     * Constructs a new, empty tree map, ordered according to the given
     * comparator.  All keys inserted into the map must be <em>mutually
     * comparable</em> by the given comparator: {@code comparator.compare(k1,
     * k2)} must not throw a {@code ClassCastException} for any keys
     * {@code k1} and {@code k2} in the map.  If the user attempts to put
     * a key into the map that violates this constraint, the {@code put(Object
     * key, Object value)} call will throw a
     * {@code ClassCastException}.
     *
     * @param comparator the comparator that will be used to order this map.
     *        If {@code null}, the {@linkplain Comparable natural
     *        ordering} of the keys will be used.
     */
    public TreeMap(Comparator<? super K> comparator) {
        //指定比较器,采用比较器的方法
        this.comparator = comparator;
    }

红黑树的存储实体类

/**
     * Node in the Tree.  Doubles as a means to pass key-value pairs back to
     * user (see Map.Entry).
     */
    //树的实体类
    static final class Entry<K,V> implements Map.Entry<K,V> {
        //键
        K key;
        //值
        V value;
        //左树节点
        Entry<K,V> left;
        //右树节点
        Entry<K,V> right;
        //父节点
        Entry<K,V> parent;
        //节点默认的颜色,黑色
        boolean color = BLACK;

        /**
         * Make a new cell with given key, value, and parent, and with
         * {@code null} child links, and BLACK color.
         */
        Entry(K key, V value, Entry<K,V> parent) {
            this.key = key;
            this.value = value;
            this.parent = parent;
        }

        /**
         * Returns the key.
         *
         * @return the key
         */
        public K getKey() {
            return key;
        }

        /**
         * Returns the value associated with the key.
         *
         * @return the value associated with the key
         */
        public V getValue() {
            return value;
        }

        /**
         * Replaces the value currently associated with the key with the given
         * value.
         *
         * @return the value associated with the key before this method was
         *         called
         */
        public V setValue(V value) {
            V oldValue = this.value;
            this.value = value;
            return oldValue;
        }

        public boolean equals(Object o) {
            if (!(o instanceof Map.Entry))
                return false;
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;

            return valEquals(key,e.getKey()) && valEquals(value,e.getValue());
        }

        public int hashCode() {
            int keyHash = (key==null ? 0 : key.hashCode());
            int valueHash = (value==null ? 0 : value.hashCode());
            return keyHash ^ valueHash;
        }

        public String toString() {
            return key + "=" + value;
        }
    }

添加方法

 public V put(K key, V value) {
        //树的根节点
        Entry<K,V> t = root;
        //开始为空
        if (t == null) {
            compare(key, key); // type (and possibly null) check
            //将第一个元素作为树的根节点
            root = new Entry<>(key, value, null);
            size = 1;
            modCount++;
            return null;
        }
        int cmp;
        Entry<K,V> parent;
        // split comparator and comparable paths
        Comparator<? super K> cpr = comparator;
        //进行key值的比较,比根节点key值小的,放到左树节点上,比根节点key值大的,放到右树节点上
        //如果相等,进行值的覆盖
        if (cpr != null) {
            do {
                parent = t;
                cmp = cpr.compare(key, t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        else {
            if (key == null)
                throw new NullPointerException();
            @SuppressWarnings("unchecked")
                Comparable<? super K> k = (Comparable<? super K>) key;
            do {
                parent = t;
                cmp = k.compareTo(t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        Entry<K,V> e = new Entry<>(key, value, parent);
        if (cmp < 0)
            parent.left = e;
        else
            parent.right = e;
        //进行红黑树的调整,使其达到平衡
        fixAfterInsertion(e);
        size++;
        modCount++;
        return null;
    }

红黑树的优化算法策略

 /** From CLR */
    //进行红黑树的调整,使其达到数的平衡,比如树节点的旋转,节点颜色的变换
    private void fixAfterInsertion(Entry<K,V> x) {
        x.color = RED;

        while (x != null && x != root && x.parent.color == RED) {
            if (parentOf(x) == leftOf(parentOf(parentOf(x)))) {
                Entry<K,V> y = rightOf(parentOf(parentOf(x)));
                if (colorOf(y) == RED) {
                    setColor(parentOf(x), BLACK);
                    setColor(y, BLACK);
                    setColor(parentOf(parentOf(x)), RED);
                    x = parentOf(parentOf(x));
                } else {
                    if (x == rightOf(parentOf(x))) {
                        x = parentOf(x);
                        rotateLeft(x);
                    }
                    setColor(parentOf(x), BLACK);
                    setColor(parentOf(parentOf(x)), RED);
                    rotateRight(parentOf(parentOf(x)));
                }
            } else {
                Entry<K,V> y = leftOf(parentOf(parentOf(x)));
                if (colorOf(y) == RED) {
                    setColor(parentOf(x), BLACK);
                    setColor(y, BLACK);
                    setColor(parentOf(parentOf(x)), RED);
                    x = parentOf(parentOf(x));
                } else {
                    if (x == leftOf(parentOf(x))) {
                        x = parentOf(x);
                        rotateRight(x);
                    }
                    setColor(parentOf(x), BLACK);
                    setColor(parentOf(parentOf(x)), RED);
                    rotateLeft(parentOf(parentOf(x)));
                }
            }
        }
        root.color = BLACK;
    }

5、LinkedHashMap

LinkedHashMap还维护了一个双向链表,以保证通过Iterator遍历时顺序与插入顺序一致。除此之外,它还支持Access Order,即按

照元素被访问的顺序来排序,我们熟知的LRUCache底层就依赖于此; Linked内部含有一个private transient Entry header;来记录元素插入的顺序或者是元素被访问的顺序。利用这个线性结构的对象

,可以帮助记录entry加入的前后顺序或者记录entry被访问的频率(最少被访问的entry靠前,最近访问的entry靠后)

5、set

HashSet底层的实现是HashMap,存储的是key,value为默认的new Object()对象;

TreeMap底层的实现是TreeMap,存储的是key;

⚠️ **GitHub.com Fallback** ⚠️