Missing bounds-check elimination in the String constructor?

While looking into UTF-8 decoding performance, I noticed that protobuf's UnsafeProcessor::decodeUtf8 outperforms String(byte[] bytes, int offset, int length, Charset charset) for the following non-ASCII string: "Quizdeltagerne spiste jordb\u00e6r med fl\u00f8de, mens cirkusklovnen".

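To find out why, I copied the relevant code from String and replaced the array accesses with unsafe array accesses, the same way UnsafeProcessor::decodeUtf8 does it. Here are the JMH benchmark results: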

Benchmark                       Mode  Cnt    Score   Error  Units
StringBenchmark.safeDecoding    avgt   10  127.107 ± 3.642  ns/op
StringBenchmark.unsafeDecoding  avgt   10  100.915 ± 4.090  ns/op

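I assume the difference is due to a missing bounds-check elimination, which I expected to kick in, especially since there is an explicit up-front check in the form of a call to checkBoundsOffCount(offset, length, bytes.length) in String(byte[] bytes, int offset, int length, Charset charset).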
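
For reference, the up-front check can be pictured roughly as follows. This is a minimal sketch, assuming checkBoundsOffCount rejects negative offsets or counts and any range that extends past the end of the array; the class name BoundsPrecheck is hypothetical and the method is a stand-in, not the JDK source. Once the check passes, every index in [offset, offset + length) is a valid index into bytes, which is why the per-access checks inside the loop look redundant.

```java
final class BoundsPrecheck {
    // Stand-in for the JDK-internal check (assumed semantics, see the text above).
    static void checkBoundsOffCount(int offset, int count, int length) {
        // Written as "offset > length - count" so the comparison cannot overflow:
        // count is already known to be non-negative when it is evaluated.
        if (offset < 0 || count < 0 || offset > length - count) {
            throw new StringIndexOutOfBoundsException(
                    "offset " + offset + ", count " + count + ", length " + length);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = new byte[8];
        checkBoundsOffCount(2, 6, bytes.length); // passes: indices 2..7 are valid
        try {
            checkBoundsOffCount(2, 7, bytes.length); // would reach index 8
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```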

Is the issue really a missing bounds-check elimination?

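Here is the code I benchmarked with OpenJDK 17 and JMH. Note that it is only part of the String(byte[] bytes, int offset, int length, Charset charset) constructor code and works correctly only for this specific string. The static methods were copied from String. Look for the "// the unsafe version:" comments, which mark where I replaced the safe access with the unsafe one.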

    private static byte[] safeDecode(byte[] bytes, int offset, int length) {
        checkBoundsOffCount(offset, length, bytes.length);
        int sl = offset + length;
        int dp = 0;
        byte[] dst = new byte[length];
        while (offset < sl) {
            int b1 = bytes[offset];
            // the unsafe version:
            // int b1 = UnsafeUtil.getByte(bytes, offset);
            if (b1 >= 0) {
                dst[dp++] = (byte)b1;
                offset++;
                continue;
            }
            if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) &&
                    offset + 1 < sl) {
                // the unsafe version:
                // int b2 = UnsafeUtil.getByte(bytes, offset + 1);
                int b2 = bytes[offset + 1];
                if (!isNotContinuation(b2)) {
                    dst[dp++] = (byte)decode2(b1, b2);
                    offset += 2;
                    continue;
                }
            }
            // anything not a latin1, including the repl
            // we have to go with the utf16
            break;
        }
        if (offset == sl) {
            if (dp != dst.length) {
                dst = Arrays.copyOf(dst, dp);
            }
            return dst;
        }

        return dst;
    }

Follow-up

Apparently, if I change the while loop condition from offset < sl to 0 <= offset && offset < sl, I get similar performance in both versions:

Benchmark                       Mode  Cnt    Score    Error  Units
StringBenchmark.safeDecoding    avgt   10  100.802 ± 13.147  ns/op
StringBenchmark.unsafeDecoding  avgt   10  102.774 ± 3.893  ns/op
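
The modified loop can be sketched as a standalone class. This is an illustration under assumptions, not the JDK code: the class name RangeCheckedDecoder is hypothetical, Objects.checkFromIndexSize stands in for the JDK-internal checkBoundsOffCount, and isNotContinuation and decode2 are reimplemented here on the assumption that they perform the standard two-byte UTF-8 continuation test and decoding. Like the snippet above, it only handles input that decodes entirely to Latin-1; it stops at the first sequence outside that range and returns what was decoded so far.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Objects;

final class RangeCheckedDecoder {
    // Assumed helper: true if b is NOT a UTF-8 continuation byte (10xxxxxx).
    private static boolean isNotContinuation(int b) {
        return (b & 0xc0) != 0x80;
    }

    // Assumed helper: combines a two-byte UTF-8 sequence into one code unit.
    private static char decode2(int b1, int b2) {
        return (char) (((b1 & 0x1f) << 6) | (b2 & 0x3f));
    }

    static byte[] decode(byte[] bytes, int offset, int length) {
        // Stand-in for checkBoundsOffCount (assumption, see the text above).
        Objects.checkFromIndexSize(offset, length, bytes.length);
        int sl = offset + length;
        int dp = 0;
        byte[] dst = new byte[length];
        // The only change to the loop: the lower bound is spelled out as well.
        while (0 <= offset && offset < sl) {
            int b1 = bytes[offset];
            if (b1 >= 0) {
                dst[dp++] = (byte) b1;
                offset++;
                continue;
            }
            if ((b1 == (byte) 0xc2 || b1 == (byte) 0xc3) && offset + 1 < sl) {
                int b2 = bytes[offset + 1];
                if (!isNotContinuation(b2)) {
                    dst[dp++] = (byte) decode2(b1, b2);
                    offset += 2;
                    continue;
                }
            }
            break; // not Latin-1: this sketch simply stops here
        }
        return dp == dst.length ? dst : Arrays.copyOf(dst, dp);
    }

    public static void main(String[] args) {
        String text = "Quizdeltagerne spiste jordb\u00e6r med fl\u00f8de, mens cirkusklovnen";
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = decode(utf8, 0, utf8.length);
        // Round-trip check: the Latin-1 bytes should reproduce the input text.
        System.out.println(new String(latin1, StandardCharsets.ISO_8859_1).equals(text));
    }
}
```

The explicit lower bound does not change what the loop computes, since offset starts non-negative and only grows; it only restates a fact the compiler apparently did not derive on its own.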

Answer:

To measure the branch you are interested in, specifically the scenario where the while loop becomes hot, I used the following benchmark:

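import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class StringConstructorBenchmark {
  private byte[] array;

  @Setup
  public void setup() {
    String str = "Quizdeltagerne spiste jordb\u00e6r med fl\u00f8de, mens cirkusklovnen. Я";
    array = str.getBytes(StandardCharsets.UTF_8);
  }

  @Benchmark
  public String newString() {
    return new String(array, 0, array.length, StandardCharsets.UTF_8);
  }
}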

Indeed, with the modified constructor there is a significant improvement:

//baseline
Benchmark                             Mode  Cnt    Score   Error  Units
StringConstructorBenchmark.newString  avgt   50  173,092 ± 3,048  ns/op

//patched
Benchmark                             Mode  Cnt    Score   Error  Units
StringConstructorBenchmark.newString  avgt   50  126,908 ± 2,355  ns/op

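This is most likely a HotSpot issue: for some reason the optimizing compiler fails to eliminate the array bounds checks inside the while loop. I suppose the reason is that offset is modified within the loop: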

while (offset < sl) {
    int b1 = bytes[offset];
    if (b1 >= 0) {
        dst[dp++] = (byte)b1;
        offset++;                     // <---
        continue;
    }
    if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) &&
            offset + 1 < sl) {
        int b2 = bytes[offset + 1];
        if (!isNotContinuation(b2)) {
            dst[dp++] = (byte)decode2(b1, b2);
            offset += 2;
            continue;
        }
    }
    // anything not a latin1, including the repl
    // we have to go with the utf16
    break;
}
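
To make the hypothesis concrete, the two loop shapes can be put side by side. This is only an illustration under assumptions: the class LoopShapes and both methods are hypothetical, and whether HotSpot actually removes the range checks in either loop depends on the JVM version and has to be confirmed from the generated assembly. The first loop advances its index by a fixed step of one, the classic shape for range-check elimination; the second advances by one or two depending on the data, like the decoding loop above.

```java
import java.util.Objects;

final class LoopShapes {
    // Fixed stride: the index grows by exactly 1 on every iteration.
    static int countNonAsciiBytes(byte[] bytes, int offset, int length) {
        Objects.checkFromIndexSize(offset, length, bytes.length);
        int sl = offset + length;
        int count = 0;
        for (int i = offset; i < sl; i++) {
            if (bytes[i] < 0) {
                count++;
            }
        }
        return count;
    }

    // Data-dependent stride: the index grows by 1 or 2 depending on the byte read.
    // Assumes every non-ASCII byte starts a well-formed two-byte sequence.
    static int countTwoByteSequences(byte[] bytes, int offset, int length) {
        Objects.checkFromIndexSize(offset, length, bytes.length);
        int sl = offset + length;
        int count = 0;
        while (offset < sl) {
            if (bytes[offset] >= 0) {
                offset++;
                continue;
            }
            count++;
            offset += 2; // skip the lead byte and its continuation byte
        }
        return count;
    }
}
```

Both methods are correct with respect to bounds; the difference is only in how much the compiler has to reason about the index to prove it.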

I also looked at the generated code with LinuxPerfAsmProfiler. Here is the link for the baseline, https://gist.github.com/stsypanov/d2524f98477d633fb1d4a2510fedeea6, and here is the one for the patched constructor, https://gist.github.com/stsypanov/16c787e4f9fa23b1628

What should we pay attention to? Let's find the code corresponding to int b1 = bytes[offset]; (line 538). In the baseline we have this:

  3.62%           ││            │  0x00007fed70eb4c1c:   mov    





        