學習一下 JVM （二） -- 學習一下 JVM 中物件、String 相關知識-有解無憂

一、JDK 8 版本下 JVM 物件的分配、布局、訪問（簡單了解下）

1、物件的創建程序

（1）前言
　　Java 是一門面向物件的編程語言，程式運行程序中在任意時刻都可能有物件被創建，開發中常用 new 關鍵字、反射等方式創建物件， JVM 底層是如何處理的呢？

（2）物件的創建的幾種常見方式？
　　Type1：使用 new 關鍵字創建（常見比如：單例模式、工廠模式等創建），
　　Type2：反射機制創建（呼叫 class 的 newInstance() 方法），
　　Type3：克隆創建（實作 Cloneable 介面，并重寫 clone() 方法），
　　Type4：反序列化創建，

（3）物件創建步驟
Step1：判斷物件對應的類是否已經被加載、決議、初始化過，
　　虛擬機執行 new 指令時，先去檢查該指令的引數能否在方法區（元空間）的運行時常量池中定位到某個類的符號參考，并檢查這個符號參考代表的類是否被加載、決議、初始化過，如果沒有，則在雙親委派模式下，查找相應的類并加載，

Step2：為物件分配記憶體空間，
　　類加載完成后，即可確定物件所需的記憶體大小，在堆中根據適當演算法劃分記憶體空間給物件，
劃分演算法：
　　劃分演算法根據 Java 堆中記憶體是否規整進行可劃分為：指標碰撞、空閑串列，
　　堆記憶體規整時，采用指標碰撞方式分配記憶體空間，由于記憶體規整，即指標只需移動所需物件記憶體大小即可，
　　堆記憶體不規整時，采用空閑串列方式分配記憶體空間，存在記憶體碎片，需要維護一個串列用于記錄哪些記憶體塊可用，在串列中找到足夠大的記憶體空間分配給物件，

堆記憶體是否規整：
　　堆記憶體是否規整由垃圾回收器演算法決定，
　　使用 Serial、ParNew 等帶有 Compact（壓縮）程序的垃圾回收器時，堆記憶體規整，即指標碰撞，
　　使用 CMS 等帶有 Mark-Sweep（標記清除）演算法的垃圾回收器時，堆記憶體不規整，即空閑串列，

Step3：處理并發安全問題，
　　分配記憶體空間時，指標修改可能會碰到并發問題（比如物件 A 分配記憶體后，但指標還沒修改，此時物件 B 仍使用原來指標進行記憶體分配，那么 A 與 B 就會出現沖突），
解決方式一：
　　對分配記憶體空間的動作進行同步處理（CAS 加上失敗重試保證更新操作的原子性），

解決方式二：
　　將分配記憶體空間的動作按照執行緒劃分到不同空間中執行（Thread Local Allocation Buffer，TLAB，每個執行緒在堆中預先分配一小塊記憶體空間，哪個執行緒需要分配記憶體，就在哪個 TLAB 上進行分配），

Step4：初始化屬性值，
　　將記憶體空間中的屬性賦零值（默認值），

Step5：設定物件的物件頭，
　　將物件所屬類的元資料資訊、物件的哈希值、物件 GC 分代年齡等資訊存盤在物件的物件頭，

Step6：執行 <init> 方法進行初始化，
　　執行 <init> 方法，加載非靜態代碼塊、非靜態變數、構造器，且執行順序為從上到下執行，但構造器最后執行，并將堆內物件的首地址賦值給參考變數，

2、物件記憶體布局

（1）物件記憶體布局
　　物件在記憶體中存盤布局可以分為：物件頭（Header）、實體資料（Instance Data）、對齊填充（Padding），

（2）物件頭（Header）
　　物件頭用于存盤運行時元資料以及型別指標，
運行時元資料：
　　物件的哈希值、GC 分代年齡、鎖狀態標志、偏向時間戳等，

型別指標：
　　即物件指向類元資料的指標（通過該指標確定該物件屬于哪個類），

（3）實體資料（Instance Data）
　　其為物件存盤的真實有效資訊，即程式中各型別欄位的內容，

（4）對齊填充（Padding）
　　不是必然存在的，起著占位符的作用，比如 HotSpot 中物件大小為 8 位元組的整數倍，當物件實體資料不是 8 位元組的整數倍時，通過對齊填充補全，

3、物件訪問定位（句柄訪問、直接指標）

（1）問題
　　物件存于堆中，而物件的參考存放在堆疊幀中，如何根據堆疊幀存放的參考定位堆中存盤的物件，即為物件訪問定位問題，取決于 JVM 的具體實作，常見方式：句柄訪問、直接指標，

（2）句柄訪問
　　在堆中劃分出一塊記憶體作為句柄池，用于保存物件的句柄地址（指標），而堆疊幀中存放的即為句柄地址，
　　當物件被移動（垃圾回收）時，只需要改變句柄池中指向物件實體資料的指標即可，不需要修改堆疊幀中的資料，

（3）直接訪問（HotSpot 使用）
　　堆疊幀中直接存放物件實體資料的地址，物件移動時，需要修改堆疊幀中的資料，
　　相較于句柄訪問，減少了一次指標定位的時間開銷（積少成多還是很可觀的），

二、JDK8 中的 String（可以深入研究一下，有不對的地方還望不吝賜教）

1、String 基本概念（JDK9 稍作改變）

（1）基本概念
　　String 指的是字串，一般使用雙引號括起來 "" 表示（比如： "hello"），
　　使用 final 型別修飾 String 類，表示不可被繼承，
　　String 類實作了 Serializable 介面，表示字串支持序列化，
　　String 類實作了 Comparable 介面，表示可以比較大小，
　　String 類內部使用 final 修飾的陣列存盤字符，
注：
　　JDK8 及以前內部使用 final char[] value 用于存盤字串資料，
　　JDK9 時改為 final byte[] 存盤資料（內部將每個字符與 0xFF 比較，當有一個比 0xFF 大時，使用 2 個位元組存盤，否則使用 1 個位元組存盤），

【JDK 8：】
public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];
}

（2）賦值方式
　　String 賦值一般分為：字面量直接賦值、使用 new 關鍵字通過構造器賦值，

【字面量直接賦值：（值會存放于 字串常量池 中）】
    String a = "hello";

【new + 構造器賦值：（值可能會存放于 字串常量池 中，且 new 關鍵字會在堆中創建一個物件）】
    String a = new String("hello");    
注：
    值不一定會存放于 字串常量池中，可以呼叫 String 的 intern() 方法將值放于字串常量池中，
    intern() 方法在不同 JDK 版本中實作不同，后面會舉例，此處大概有個印象即可，

2、字串常量池（String Pool）、String 不可變性

（1）字串常量池（String Pool）
　　JVM 內部維護一個字串常量池（String Pool），當 String 以字面量形式賦值時，此時字串會宣告在字串常量池中（比如：String a = "hello" 賦值時，會生成一個 "hello" 字串存于常量池中），
　　字串常量池中不會存盤相同內容的字串，其內部實作是一個固定大小的 Hashtable，如果常量池中存盤 String 過多，將會造成 hash 沖突，從而造成性能下降，可以通過 -XX:StringTableSize 設定 StringTable 大小（比如：-XX:StringTableSize=2000），

注：
　　常量池類似于快取，使程式運行更快、節省記憶體，
　　JDK 6 及以前，字串常量池存放于永久代中，StringTable 默認長度為 1009，
　　JDK 7 及之后，字串常量池存放于堆中，StringTable 默認長度為 60013，其最小值為 1009，

【常用 JVM 引數：】
-XX:StringTableSize    配置字串常量池中的 StringTable 大小，JDK 8 默認：60013，
-XX:+PrintStringTableStatistics  在JVM 行程退出時，列印出 StringTable 相關統計資訊，

（2）String 不可變性：
　　String 一旦在記憶體中創建，其值將是不可變的（反射場景除外），當值改變時，改變的是指向記憶體的參考，而非直接修改記憶體中的值，

JDK 8 String 不可變：
　　JDK8 采用 final 修飾 String 類，表示該類不可被繼承，
　　String 類內部采用 private final char value[] 存盤字串，使用 private 修飾陣列且不對外提供 setter 方法，即外部不可修改字串，使用 final 修飾陣列，表示內部不可修改字串（參考地址不變，內容可變，使用反射可能會改變字串），且 String 提供的相關方法中，并沒有去修改原有字串中的值，而是回傳一個新的參考指向記憶體中新的 String 值（比如 replace() 方法回傳一個 new String() 物件），

（3）常見場景（修改參考地址）

　　對現有字串重新賦值時，
　　對現有字串進行連接操作時，
　　使用字串的 replace() 方法修改指定字串時，

【舉例：（給現有字串重新賦值）】
package com.test;

public class JVMDemo {

    public static void main(String[] args) {
        String C1 = new String("abc");
        String C2 = C1;
        System.out.println(C1 == C2); // true
        System.out.println(System.identityHashCode(C1));
        System.out.println(System.identityHashCode(C2));
        C2 = "abc";
        System.out.println(C1 == C2); // false
        System.out.println(System.identityHashCode(C1));
        System.out.println(System.identityHashCode(C2));
    }
}

3、String 拼接操作 -- 筆試題

（1）拼接操作可能存在的情況：
　　常量與常量（字面量或者 final 修飾的變數）的拼接結果會存放于常量池中，由編譯期優化導致，
　　拼接資料中若有一個是變數，則拼接結果會存放于堆中，由 StringBuilder 拼接，
　　如果拼接結果呼叫 intern() 方法，且常量池中不存在該字串物件，則將拼接結果存放于常量池中，

（2）常量（字面量）拼接 -- 拼接結果存于常量池
　　對于兩個及以上字面量拼接操作，在編譯時會進行優化，若該拼接結果不存在于常量池中，則直接將其拼接結果存于常量池，并回傳其參考地址，否則，回傳常量池中該結果所在的參考地址，

【舉例：】
package com.test;

public class JVMDemo {

    public static void main(String[] args) {
        String c1 = "a" + "b" + "c";// 編譯期優化，等同于 "abc"，并存放于常量池中
        String c2 = "abc"; // "abc" 已存在于常量池，此時直接將常量池中的地址 賦給 c2
        System.out.println(c1 == c2); // true
        System.out.println(System.identityHashCode(c1));
        System.out.println(System.identityHashCode(c2));
    }
}

（3）final 修飾的變數拼接（可以理解為常量） -- 拼接結果存于常量池
　　由于 final 修飾的變數不可被修改，在編譯期優化等同于常量進行拼接操作，所以結果存放于常量池中，

【舉例：】
package com.test;

public class JVMDemo {

    public static void main(String[] args) {
        final String c1 = "hello";
        final String c2 = "world";
        String c3 = "helloworld"; // "helloworld" 存放于常量池中
        String c4 = c1 + c2; // 等價于 "hello" + "world" 常量進行拼接操作
        System.out.println(c3 == c4); // true
        System.out.println(System.identityHashCode(c3));
        System.out.println(System.identityHashCode(c4));
    }
}

（4）一般變數拼接 -- 拼接結果存于堆
　　拼接操作中出現變數時，會觸發 new StringBuilder 操作，并使用 StringBuilder 的 append 方法進行字串拼接，最終呼叫其 toString 方法轉為字串，并回傳參考地址，
注：
　　StringBuilder 的 toString 方法內部呼叫 new String()，其最終拼接結果存放于堆中（不會將拼接結果存放于常量池，可以手動呼叫 intern() 方法將結果放入常量池，后面介紹，往下看），

使用 StringBuilder 進行字串拼接操作效率要遠高于使用 String 進行字串拼接操作，
　　使用 String 直接進行拼接操作時，若出現變數，則會先創建 StringBuilder 物件，最終輸出結果還得轉為 String 物件，即使用 String 進行字串拼接程序中可能出現多個 StringBuilder 和 String 物件（比如在回圈中進行字串拼接操作），且創建物件過多會占用更多的記憶體，
　　而使用 StringBuilder 進行拼接操作時，只需要創建一個 StringBuilder 物件即可，可以節省記憶體空間以及提高效率執行，

【舉例：】
package com.test;

public class JVMDemo {

    public static void main(String[] args) {
        String c1 = "hello";
        String c2 = "world";
        String c3 = "hello" + "world";  // 等價于 "helloworld"，存于常量池
        String c4 = "helloworld"; // 常量池中已存在，直接賦值常量池參考
        String c5 = "hello" + c2; // 拼接結果存于 堆
        String c6 = c1 + "world"; // 拼接結果存于 堆
        String c7 = c1 + c2; // 拼接結果存于 堆
        System.out.println(System.identityHashCode(c3));
        System.out.println(System.identityHashCode(c4));
        System.out.println(System.identityHashCode(c5));
        System.out.println(System.identityHashCode(c6));
        System.out.println(System.identityHashCode(c7));
    }
}

（5）拼接結果呼叫 intern 方法 -- 結果存放于常量池
　　由于不同版本 JDK 的 intern() 方法執行結果不同，此處暫時略過，接著往下看，

4、String 使用 new 關鍵字創建物件問題 -- 筆試題

（1）new String("hello") 會創建幾個物件？
　　可能會創建 1 個或 2 個物件，
　　new 關鍵字會在堆中創建一個物件，而當字串常量池中不存在 "hello" 時，會創建一個物件存入字串常量池，若常量池中存在物件，則不會創建、會直接參考，

【舉例：】
public class JVMDemo {
    public static void main(String[] args) {
        String c1 = new String("hello");
        String c2 = new String("hello");
    }
}

（2）new String("hello") + new String("world") 創建了幾個物件？
　　創建了 6 個物件（不考慮常量池是否存在資料），
物件創建：
　　由于涉及到變數的拼接，所以會觸發 new StringBuilder() 操作，此處創建 1 個物件，
　　new String("hello") 通過上例分析，可以知道會創建 2 個物件（堆 1 個，字串常量池 1 個），
　　同理 new String("world") 也會創建 2 個物件，
　　最終拼接結果會觸發 StringBuilder 的 toString() 方法，內部呼叫 new String() 在堆中創建一個物件（此處不會在字串常量池中創建物件），

注：
　　StringBuilder 的 toString() 內部的 new String() 并不會在字串常量池中創建物件，
　　String str = new String("hello"); 這種形式創建的字串可以在字串常量池中創建物件，
此處我是根據位元組碼檔案中是否有 ldc 指令來判斷的（后續根據 intern() 方法同樣也可以證明這點），有不對的地方，還望不吝賜教，

【舉例：】
public class JVMDemo {
    public static void main(String[] args) {
        String a = new String("hello") + new String("world");
    }
}

5、String 中的 intern() 相關問題 -- 筆試題

（1）intern() 作用
　　對于非字面量直接宣告的 String 物件（通過 new 創建的物件），可以使用 String 提供的 intern 方法獲取字串常量池中的資料，

該方法作用：

　　從字串常量池中查詢當前字串是否存在（通過 equals 方法比較），如果不存在，則會將當前字串放入常量池中并回傳該參考地址（此處不同版本的 JDK 有不同的實作），若存在則直接回傳參考地址，

【JDK 8 注釋】
/**
 * Returns a canonical representation for the string object.
 * <p>
 * A pool of strings, initially empty, is maintained privately by the
 * class {@code String}.
 * <p>
 * When the intern method is invoked, if the pool already contains a
 * string equal to this {@code String} object as determined by
 * the {@link #equals(Object)} method, then the string from the pool is
 * returned. Otherwise, this {@code String} object is added to the
 * pool and a reference to this {@code String} object is returned.
 * <p>
 * It follows that for any two strings {@code s} and {@code t},
 * {@code s.intern() == t.intern()} is {@code true}
 * if and only if {@code s.equals(t)} is {@code true}.
 * <p>
 * All literal strings and string-valued constant expressions are
 * interned. String literals are defined in section 3.10.5 of the
 * <cite>The Java&trade; Language Specification</cite>.
 *
 * @return  a string that has the same contents as this string, but is
 *          guaranteed to be from a pool of unique strings.
 */
public native String intern();

（2）不同 JDK 版本中 intern() 使用的區別
　　JDK 6：嘗試將該字串物件放入字串常量池中（字串常量池位于方法區中），
　　　　若字串常量池中已經存在該物件，則回傳字串常量池當前物件的參考地址，
　　　　若沒有該物件，則將當前物件值復制一份放入字串常量池，并回傳此時物件的參考地址，

　　JDK 7 之后：嘗試將該字串物件放入字串常量池中（字串常量池位于堆中），
　　　　若字串常量池中已經存在該物件，則回傳字串常量池當前物件的參考地址，
　　　　若沒有該物件，則將當前物件的參考地址復制一份放入字串常量池，并回傳參考地址，

（3）使用 JDK8 演示 intern()
　　此處使用 JDK8 演示 intern() 方法，有興趣可以自行研究 JDK6 的操作，

【例一：】
public class JVMDemo {
    public static void main(String[] args) {
        String a = new String("hello"); // 此時常量池存在 "hello"
        String b = "hello"; // 直接參考常量池中 "hello"
        String c = a.intern(); // 直接參考常量池中 "hello"
        System.out.println(System.identityHashCode(a));
        System.out.println(System.identityHashCode(b));
        System.out.println(System.identityHashCode(c));
        System.out.println(b == a); // false，a 指向 堆 內物件，b 指向 字串常量池
        System.out.println(b == c); // true，b,c 均指向字串常量池
    }
}

【例二：（b,c 互換位置）】
public class JVMDemo {
    public static void main(String[] args) {
        String a = new String("hello"); // 此時常量池存在 "hello"
        String c = a.intern(); // 直接參考常量池中 "hello"
        String b = "hello"; // 直接參考常量池中 "hello"
        System.out.println(System.identityHashCode(a));
        System.out.println(System.identityHashCode(b));
        System.out.println(System.identityHashCode(c));
        System.out.println(b == a); // false，a 指向 堆 內物件，b 指向 字串常量池
        System.out.println(b == c); // true，b,c 均指向字串常量池
    }
}

【分析：】
    例一 與 例二 的區別在于 intern() 執行時機不同，且兩者輸出結果相同，
    JDK 8 中 intern() 執行時，若字串常量池中 equals 未比較出相同資料，則將當前物件的參考地址 復制一份并放入常量池，
    若存在資料，則回傳常量池中資料的參考地址，
    
    即 new String() 操作后，若常量池中不存在 資料，則呼叫 intern() 后，會復制 堆的地址 并存入 常量池中，后續獲得的均為 堆的地址，也即上述 例一、例二 中 a、b、c 操作后，均相同且指向 堆，
　　若常量池存在資料，則呼叫 intern() 后，回傳常量池參考，后續獲得的均為 常量池參考，也即上述 例一、例二 中 a 為指向堆 的參考地址，b,c 均為指向常量池的參考地址，

    通過輸出結果可以看到，上述 例一、例二 中 a、b、c 操作后，b, c 相同且不同于 a（即 b、c 指向常量池），從側面也反映出 new String("hello") 執行后 常量池中存在 "hello"，

【例四：】
public class JVMDemo {
    public static void main(String[] args) {
        char[] a = new char[]{'h', 'e', 'l', 'l', 'o'};
        String b = new String(a, 0 , a.length); // 此時常量池中不存在 "hello"
        String c = "hello"; // 此時常量池中存在 "hello"
        String d = b.intern(); // 直接參考常量池中的 "hello"

        System.out.println(System.identityHashCode(b));
        System.out.println(System.identityHashCode(c));
        System.out.println(System.identityHashCode(d));
        System.out.println(c == b); // false, b 指向堆物件， c 指向常量池
        System.out.println(c == d); // true，c，d 均指向常量池
    }
}

【例五：】
public class JVMDemo {
    public static void main(String[] args) {
        char[] a = new char[]{'h', 'e', 'l', 'l', 'o'};
        String b = new String(a, 0 , a.length); // 此時常量池中不存在 "hello"
        String d = b.intern(); // 常量池不存在 "hello"，復制 b 的參考到常量池中（指向堆的參考）
        String c = "hello"; // 直接獲取常量池中的參考（指向堆的參考）

        System.out.println(System.identityHashCode(b));
        System.out.println(System.identityHashCode(c));
        System.out.println(System.identityHashCode(d));
        System.out.println(c == b); // true, c, b 均指向堆
        System.out.println(c == d); // true, c, d 均指向堆
    }
}

【分析：】
    例四 與 例五 的區別在于 intern() 執行時機不同，且兩者輸出結果相同，
    JDK 8 中 intern() 執行時，若字串常量池中 equals 未比較出相同資料，則將當前物件的參考地址 復制一份并放入常量池，
    若存在資料，則回傳常量池中資料的參考地址，
    
    例四中，new String() 執行后，常量池中不存在 "hello"，
    但 String c = "hello" 執行后，常量池中存在 "hello"，從而 intern() 獲取的是常量池中的參考地址，
    也即 b 為指向 堆的參考，c,d 均為指向常量池的參考，
    
    例五中，new String() 執行后，常量池中不存在 "hello"，
    intern() 執行后會將當前物件地址（指向堆的參考）復制并放入常量池，從而 String c = "hello" 獲取的是常量池的參考地址，
    也即 b,c,d 獲取的均是指向 堆 的參考，

對例四、例五進行一下延伸，

【例六：】
public class JVMDemo {
    public static void main(String[] args) {
        String a = new String("hello") + new String("world");
        String b = "helloworld";
        String c = a.intern();
        System.out.println(System.identityHashCode(a));
        System.out.println(System.identityHashCode(b));
        System.out.println(System.identityHashCode(c));
        System.out.println(b == a); // false, a 指向堆, b 指向常量池
        System.out.println(b == c); // true, b、c 均指向常量池
    }
}

【例七：】
public class JVMDemo {
    public static void main(String[] args) {
        String a = new String("hello") + new String("world");
        String c = a.intern();
        String b = "helloworld";
        System.out.println(System.identityHashCode(a));
        System.out.println(System.identityHashCode(b));
        System.out.println(System.identityHashCode(c));
        System.out.println(b == a); // true, a、b 均指向堆,
        System.out.println(b == c); // true, b、c 均指向堆
    }
}

【分析：】
    涉及到變數字串拼接，會觸發 StringBuilder 進行相關操作，
    最終觸發 toString() 轉為 String，其內部呼叫的是 String(char value[], int offset, int count) 構造方法，
    此方法在堆中創建 字串 但不會向常量池中添加資料（與 例四、例五 是同樣的場景），

轉載請註明出處，本文鏈接：https://www.uj5u.com/houduan/53066.html

標籤：Java

上一篇：面試題系列第1篇：說說==和equals的區別？你的回答可能是錯誤的

下一篇：理解傳輸層中UDP協議首部校驗和以及校驗和計算方法的Java實作