因此,我使用 Robert W. Sebesta 的書“編程語言的概念”學習了編程語言。有一個有趣的段落heterogeneous array與record. 顯然,與靜態欄位名稱相比,陣列的訪問時間要慢得多,因為下標是動態的。
問題:那么為什么它更快?和堆疊和堆記憶體的利用率有關系嗎?
第 265 頁,第 6 章:
當資料值的集合是異構的并且不同的欄位以不同的方式處理時使用記錄。此外,記錄的欄位通常不需要按特定順序處理。欄位名稱就像文字或常量下標。因為它們是靜態的,所以它們提供了對欄位的非常有效的訪問。動態下標可用于訪問記錄欄位,但它不允許型別檢查并且速度也較慢。
uj5u.com熱心網友回復:
首先,我想提一下,“性能”的概念并不真正存在于實際硬體和低級實作之外。當本書講述某種資料組織方式“更快”時,它對實際編譯器/解釋器實作和硬體(馮諾依曼架構、CPU 指令集、記憶體布局)做出了許多假設。
順便說一下,讓我們談談靜態記錄、同構和異構陣列的實作可能存在的差異。
1.靜態記錄
struct Point {
int x;
int y;
};
這是一個簡單的 C 結構。當你這樣做時:
Point *p = new Point();
記憶體分配器在堆上分配 8 個位元組,并將該塊的起始地址存盤在p.
編譯器Point預先知道大小(8 個位元組),對于 Point 編譯器的每個欄位都知道它的大小和移位(即y有大小 4 和移位 4)。
當您訪問 Point ( p -> x) 的某個欄位時,編譯器將其替換為計算該欄位的實際記憶體地址x(p shift of 的x地址)并訪問該地址。沒有任何運行時開銷。
2. 同構陣列
int *arr = new int[10];
這是 C 中的一個簡單的同構陣列。與結構體類似,分配器4 * 10在堆上分配位元組并將其起始地址存盤在arr.
當您訪問陣列的元素時arr[i],編譯器知道arr其元素的地址、大小(它們是相同的,因為陣列是同構的),因此它可以計算您正在訪問的記憶體地址為arr i * element_size。
同樣,如果您不計算i除arr.
3. 異構陣列
現在,這就是事情變得有趣的地方。異構陣列沒有直接的低級表示。有多種可能的實作方式可以將這個高級概念映射到實際的記憶體和機器代碼中。
讓我們考慮其中之一。
enum ElType {
INT, POINT
};
struct ArrEl {
ElType type;
char* elPtr;
};
ArrEl *arr = new ArrEl[10];
arr[1].type = POINT;
arr[1].elPtr = (char*)(new Point {3,4} );
arr[2].type = INT;
arr[2].elPtr = (char*)(new int {2} );
請注意,由于陣列現在可以存盤任何型別的元素(僅適用于本演示int或Point):
- Array elements can have different size (
Pointis 8 bytes,intis4), so actual elements have to be allocated on the heap, and array stores only pointers (Alternative approach would allocate the memory equal to the size of the largest possible element for each element, and it is too wasteful in practice). - To know the actual type of the stored element, additional metadata has to be stored. This approach choses to store it in the array, but most actual implementations (including python and java) store it together with the actual object, as an "object header". See the simplified implementation here.
Now to access an element of such array, this metadata has to be checked:
ArrEl el = arr[2];
if (el.type == INT) {
int *value = (int *) el.elPtr;
std::cout << *value;
} else if (el.type == POINT) {
Point *p = (Point *) el.elPtr;
std::cout << p->x;
}
The additional overhead of accessing heterogeneous array consists of an additional indirection: you have to first access an element of the array, and then follow the pointer to the actual element on the heap, and checking the metadata.
There is also an additional memory overhead of storing all the pointers and metadata. This is especially noticeable, when you compare homogeneous arrays of "simple" types, such as int to heterogeneous arrays that can store, say int and float.
Ok, now, when I've given the general idea why heterogeneous arrays can be slow in theory, let's talk about the real world.
Most modern VMs for languages that have heterogeneous arrays have JIT. Opposite to the static compilers (like C ), JIT can make some optimistic assumptions about the executed code and, if such assumptions fail, recompile code to the more pessimistic variant in runtime.
Back to arrays, although arrays in dynamic languages, such as Javascript and Python, are heterogeneous, it's possible that when array is used in a homogeneous way, JIT compiles it to be homogeneous internally! For example, V8 certainly does that. I don't think Python does that at the moment, but it might in the future.
Also, modern optimizing compilers, including static compilers, can rewrite your code in ways you wouldn't expect. For example, your code creates an object and performs some operations on its fields, but in reality compiler will replace field access with CPU registers. No "actual" object is created at all.
That is why it's always important to verify theoretical assumptions about performance with actual benchmarks.
P.S. all C code in this demo is not idiomatic, unsafe and bad. Don't use it at home.
轉載請註明出處,本文鏈接:https://www.uj5u.com/houduan/355008.html
