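Table of Contents
- Overview
- Cache Line
- Verifying That the Cache Line Exists
- References
- Your encouragement keeps me writing
- Posted by Weibo @Yangsc_o
- Original article; copyright notice: free to repost, non-commercial, no derivatives, attribution required | Creative Commons BY-NC-ND 3.0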
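Overview
This article aims to verify the cache misses that the cache coherence protocol of the CPU cache can cause, as mentioned in 《深入刨析volatile關鍵詞》 ("An In-Depth Analysis of the volatile Keyword").
Cache Line
A cache is made up of cache lines, typically 64 bytes each, and the CPU operates on the cache in units of whole cache lines. The line size can be checked with the following command: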
[root@yangsc-01 ~]# cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
64
[root@yangsc-01 ~]#
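The same value can also be read from inside a Java program. The following is only a minimal sketch: it assumes a Linux system that exposes the sysfs file used in the command above, and the class name `CacheLineSize` and the fallback value of 64 are illustrative choices, not part of the original experiment.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: reads the cache line size from the sysfs file shown above.
final class CacheLineSize {
    // Linux-only path, identical to the one used in the shell command.
    static final Path SYSFS =
            Paths.get("/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size");

    // Assumption: fall back to 64 bytes when the file is missing or unreadable.
    static final int FALLBACK = 64;

    static int read() {
        try {
            String text = new String(Files.readAllBytes(SYSFS), StandardCharsets.US_ASCII).trim();
            int size = Integer.parseInt(text);
            return size > 0 ? size : FALLBACK;
        } catch (IOException | NumberFormatException | SecurityException e) {
            // Not on Linux, running in a restricted sandbox, or unexpected file content.
            return FALLBACK;
        }
    }

    public static void main(String[] args) {
        System.out.println("cache line size = " + read() + " bytes");
    }
}
```

On other operating systems the file does not exist, so the sketch simply reports the fallback value.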
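Because the CPU reads and writes its cache one whole line at a time, neighbouring data travels together. A long is 64 bits, i.e. 8 bytes, and array elements are stored at consecutive addresses, so loading the first element of an array also pulls the following elements into the same cache line. A long array of length 8 holds exactly 64 bytes of data, so it might seem that the CPU would place every element in a single cache line. It does not: in Java, an object's in-memory layout also includes an object header, as described in the section on object memory layout in 《深入剖析synchronized關鍵詞》 ("An In-Depth Analysis of the synchronized Keyword").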
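The arithmetic behind this can be sketched in a few lines. The numbers below are assumptions rather than measurements: a 64-byte cache line, a 16-byte array header (typical for 64-bit HotSpot with compressed class pointers), and an array that happens to start exactly on a cache line boundary. The class and method names are hypothetical.

```java
// Illustrative arithmetic only: which cache line does each element of a long[8] fall into?
final class ArrayCacheLines {
    static final int CACHE_LINE = 64;   // bytes per cache line
    static final int ARRAY_HEADER = 16; // assumption: mark word + class pointer + length
    static final int LONG_SIZE = 8;     // bytes per long element

    // Index of the cache line that holds element `index` of an array starting at `arrayBase`.
    static long lineOf(long arrayBase, int index) {
        return (arrayBase + ARRAY_HEADER + (long) LONG_SIZE * index) / CACHE_LINE;
    }

    // Number of distinct cache lines covered by the elements of the array.
    static int linesTouched(long arrayBase, int length) {
        if (length == 0) {
            return 0;
        }
        return (int) (lineOf(arrayBase, length - 1) - lineOf(arrayBase, 0) + 1);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 8; i++) {
            System.out.println("element " + i + " -> line " + lineOf(0, i));
        }
        System.out.println("lines touched: " + linesTouched(0, 8));
    }
}
```

Under these assumptions the header pushes the last two elements into the next line, so the eight elements span two cache lines instead of one.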
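[Figure: a classic cache line diagram]
A thread running on core 1 wants to update variable X while another thread running on core 2 wants to update variable Y, but these two frequently modified variables sit in the same cache line. The two threads take turns sending RFO (Request For Ownership) messages to claim ownership of the line. When core 1 gains ownership and updates X, core 2's copy of the line must be set to the I (Invalid) state. When core 2 gains ownership and updates Y, core 1's copy must be set to the I state in turn. This tug-of-war not only generates a large number of RFO messages: whenever a thread needs to read the line, its L1 and L2 copies are invalid and only L3 holds up-to-date data, and L3 is considerably slower than L1 and L2.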
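The ping-pong described above can be made concrete with a toy model. This is a deliberately simplified sketch, not a faithful MESI implementation: it only tracks whether a core currently owns a line in the Modified state, and it counts one RFO every time a core writes to a line it does not own. All names are hypothetical.

```java
import java.util.Arrays;

// Toy model of RFO traffic for two cores; a simplification of MESI for illustration.
final class RfoPingPong {
    enum State { MODIFIED, EXCLUSIVE, SHARED, INVALID }

    // One cache line as seen by each core's private cache.
    static final class Line {
        final State[] perCore;
        int rfoCount;

        Line(int cores) {
            perCore = new State[cores];
            Arrays.fill(perCore, State.INVALID);
        }

        // A write needs ownership: if this core does not hold the line in M,
        // it sends an RFO, every other copy becomes I, and this copy becomes M.
        void write(int core) {
            if (perCore[core] != State.MODIFIED) {
                rfoCount++;
                Arrays.fill(perCore, State.INVALID);
                perCore[core] = State.MODIFIED;
            }
        }
    }

    // X and Y live in the same line: core 0 and core 1 alternate writes to it.
    static int rfosWhenSharingOneLine(int writesPerCore) {
        Line line = new Line(2);
        for (int i = 0; i < writesPerCore; i++) {
            line.write(0); // core 0 updates X
            line.write(1); // core 1 updates Y
        }
        return line.rfoCount;
    }

    // X and Y live in different lines: each core keeps ownership of its own line.
    static int rfosWithSeparateLines(int writesPerCore) {
        Line x = new Line(2);
        Line y = new Line(2);
        for (int i = 0; i < writesPerCore; i++) {
            x.write(0);
            y.write(1);
        }
        return x.rfoCount + y.rfoCount;
    }

    public static void main(String[] args) {
        System.out.println("same line:      " + rfosWhenSharingOneLine(5) + " RFOs");
        System.out.println("separate lines: " + rfosWithSeparateLines(5) + " RFOs");
    }
}
```

In this model every write to the shared line costs an RFO, while with separate lines only the first write on each core does, which is the effect the padding in the experiment below is meant to achieve.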
Verifying That the Cache Line Exists
Results first:
- VolatileLong took 31028 ms
private static VolatileLong[] longs = new VolatileLong[NUM_THREADS]; // 31028
- VolatileLong2 took 7650 ms
private static VolatileLong2[] longs = new VolatileLong2[NUM_THREADS]; // 7650
- VolatileLong3 took 7385 ms
private static VolatileLong3[] longs = new VolatileLong3[NUM_THREADS]; // 7385
import org.openjdk.jol.info.ClassLayout; // JOL (Java Object Layout), used to print object layouts

public class FalseSharing implements Runnable {
    public final static int NUM_THREADS = 4; // change
    public final static long ITERATIONS = 500L * 1000L * 1000L;
    private final int arrayIndex;

    // Enable exactly one of the three declarations, and the matching constructor in the static block.
    // private static VolatileLong[] longs = new VolatileLong[NUM_THREADS]; // 31028 ms
    private static VolatileLong2[] longs = new VolatileLong2[NUM_THREADS]; // 7650 ms
    // private static VolatileLong3[] longs = new VolatileLong3[NUM_THREADS]; // 7385 ms

    static {
        for (int i = 0; i < longs.length; i++) {
            longs[i] = new VolatileLong2();
        }
        // Print the memory layout of all three variants for comparison.
        VolatileLong volatileLong = new VolatileLong();
        VolatileLong2 volatileLong2 = new VolatileLong2();
        VolatileLong3 volatileLong3 = new VolatileLong3();
        System.out.println(ClassLayout.parseInstance(volatileLong).toPrintable());
        System.out.println(ClassLayout.parseInstance(volatileLong2).toPrintable());
        System.out.println(ClassLayout.parseInstance(volatileLong3).toPrintable());
    }

    public FalseSharing(final int arrayIndex) {
        this.arrayIndex = arrayIndex;
    }

    public static void main(final String[] args) throws Exception {
        long start = System.currentTimeMillis();
        runTest();
        System.out.println("duration = " + (System.currentTimeMillis() - start));
    }

    // Start one thread per array slot and wait for all of them to finish.
    private static void runTest() throws InterruptedException {
        Thread[] threads = new Thread[NUM_THREADS];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(new FalseSharing(i));
        }
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
    }

    // Each thread repeatedly writes only to its own slot; the slots are independent objects.
    @Override
    public void run() {
        long i = ITERATIONS + 1;
        while (0 != --i) {
            longs[arrayIndex].value = i;
        }
    }

    // No padding: instances are small enough to share a cache line.
    public final static class VolatileLong {
        public volatile long value = 0L;
    }

    // Manual long padding on both sides of value to avoid false sharing.
    public final static class VolatileLong2 {
        volatile long p0, p1, p2, p3, p4, p5, p6;
        public volatile long value = 0L;
        volatile long q0, q1, q2, q3, q4, q5, q6;
    }

    /**
     * New in JDK 8: the Contended annotation avoids false sharing.
     * Must be run with the JVM flag: -XX:-RestrictContended
     * (On JDK 9 and later the annotation lives in jdk.internal.vm.annotation.)
     */
    @sun.misc.Contended
    public final static class VolatileLong3 {
        public volatile long value = 0L;
    }
}
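A variant of the manual padding in VolatileLong2 is sometimes written with a small class hierarchy instead of a single class. The motivation is that the JVM specification does not promise to keep fields in declaration order inside one class, whereas HotSpot lays out superclass fields before subclass fields. The following is only a sketch under that assumption, with hypothetical class names; it is not part of the measured experiment, and the resulting layout should be checked with JOL on the JVM actually in use.

```java
// Sketch: padding placed in a superclass and a subclass so that it stays on both sides of value.
final class InheritancePadding {
    static class LeftPad { long p0, p1, p2, p3, p4, p5, p6; }          // 56 bytes before value
    static class Value extends LeftPad { volatile long value; }        // the hot field
    static final class PaddedLong extends Value { long q0, q1, q2, q3, q4, q5, q6; } // 56 bytes after

    // Runs `threads` writers, each updating only its own PaddedLong, and returns the final values.
    static long[] runWriters(int threads, long iterations) {
        PaddedLong[] slots = new PaddedLong[threads];
        for (int i = 0; i < threads; i++) {
            slots[i] = new PaddedLong();
        }
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final PaddedLong slot = slots[t];
            workers[t] = new Thread(() -> {
                for (long i = 1; i <= iterations; i++) {
                    slot.value = i; // volatile write to this thread's private slot
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        long[] result = new long[threads];
        for (int i = 0; i < threads; i++) {
            result[i] = slots[i].value;
        }
        return result;
    }

    public static void main(String[] args) {
        long[] finalValues = runWriters(4, 1_000_000L);
        System.out.println("last value written by thread 0: " + finalValues[0]);
    }
}
```

To compare timings, the loop count and thread count here would need to match ITERATIONS and NUM_THREADS in the test above.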
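- ClassLayout analysis of VolatileLong
With compressed pointers enabled, the layout is mark word + class pointer + padding + the value field, so a VolatileLong occupies 24 bytes. That is far less than the 64-byte cache line found on most machines, so several instances can land in the same line. Because value is volatile, a write by one thread must become visible to the other threads, which invalidates the copies of that line held in the other cores' caches.
- ClassLayout analysis of VolatileLong2
Mark word + class pointer + padding + the hand-written p and q padding fields come to 136 bytes, so the value fields of different instances fall on different cache lines.
- ClassLayout analysis of VolatileLong3
Mark word + class pointer + the padding that the JVM inserts for @Contended (132 bytes before value and 128 bytes after it) come to 280 bytes, so the value fields of different instances again fall on different cache lines.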
com.yangsc.juc.FalseSharing$VolatileLong object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                VALUE
      0     4        (object header)            01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4        (object header)            00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)            c1 c1 00 f8 (11000001 11000001 00000000 11111000) (-134168127)
     12     4        (alignment/padding gap)
     16     8   long VolatileLong.value         0
Instance size: 24 bytes
Space losses: 4 bytes internal + 0 bytes external = 4 bytes total
com.yangsc.juc.FalseSharing$VolatileLong2 object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                VALUE
      0     4        (object header)            01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4        (object header)            00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)            47 c1 00 f8 (01000111 11000001 00000000 11111000) (-134168249)
     12     4        (alignment/padding gap)
     16     8   long VolatileLong2.p0           0
     24     8   long VolatileLong2.p1           0
     32     8   long VolatileLong2.p2           0
     40     8   long VolatileLong2.p3           0
     48     8   long VolatileLong2.p4           0
     56     8   long VolatileLong2.p5           0
     64     8   long VolatileLong2.p6           0
     72     8   long VolatileLong2.value        0
     80     8   long VolatileLong2.q0           0
     88     8   long VolatileLong2.q1           0
     96     8   long VolatileLong2.q2           0
    104     8   long VolatileLong2.q3           0
    112     8   long VolatileLong2.q4           0
    120     8   long VolatileLong2.q5           0
    128     8   long VolatileLong2.q6           0
Instance size: 136 bytes
Space losses: 4 bytes internal + 0 bytes external = 4 bytes total
com.yangsc.juc.FalseSharing$VolatileLong3 object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                VALUE
      0     4        (object header)            01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4        (object header)            00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)            05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
     12   132        (alignment/padding gap)
    144     8   long VolatileLong3.value        0
    152   128        (loss due to the next object alignment)
Instance size: 280 bytes
Space losses: 132 bytes internal + 128 bytes external = 260 bytes total
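The instance sizes in these dumps can be turned into a rough upper bound on how many value fields could end up in one cache line. The sketch below assumes a 64-byte line and, as a worst case, that instances are allocated back to back on the heap, which the JVM does not guarantee. The class and method names are hypothetical.

```java
// Rough upper bound: how many `value` fields of adjacent instances can share one cache line?
final class LineSharing {
    static final int CACHE_LINE = 64; // bytes, as reported by coherency_line_size
    static final int LONG_SIZE = 8;   // size of the value field

    // With instances laid out back to back, consecutive value fields are `instanceSize`
    // bytes apart. The first field uses 8 bytes of the line; each further field needs
    // another `instanceSize` bytes of distance within the remaining 56.
    static int maxValuesPerLine(int instanceSize) {
        return (CACHE_LINE - LONG_SIZE) / instanceSize + 1;
    }

    public static void main(String[] args) {
        System.out.println("VolatileLong  (24 bytes):  " + maxValuesPerLine(24));
        System.out.println("VolatileLong2 (136 bytes): " + maxValuesPerLine(136));
        System.out.println("VolatileLong3 (280 bytes): " + maxValuesPerLine(280));
    }
}
```

A result above 1 means that different threads' value fields can collide in one line, which matches the slow VolatileLong run; a result of 1 means each value has a line to itself.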
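The referenced articles explain this better than I do. To learn more, please see the articles listed under References.
References
Understanding False Sharing from a Java Perspective (從Java視角理解偽共享)
Understanding the CPU Cache from a Java Perspective (從Java視角理解CPU快取)
Understanding CPU Cache (理解CPU-Cache)
Your encouragement keeps me writing
Tip link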
Please credit the source when reposting. Link to this article: https://www.uj5u.com/houduan/174989.html
Tags: Java
