英文原文地址Memory optimization for feeds on Android
讀后感
在Java中HashSet只能存放繼承自Objcet的物件,這中情況下“基本資料型別”轉化為繼承自Object的(Integer、Long等)會產生很多中間Object物件,占用過多的記憶體,從而引發垃圾回收,造成系統卡頓,
下邊是原文及翻譯
Millions of people use Facebook on Android devices every day, scrolling through News Feed, Profile, Events, Pages, and Groups to interact with the people and information they care about. All of these different feed types are powered by a platform created by the Android Feed Platform team, so any optimization we make to the Feed platform could potentially improve performance across our app. We focus on scroll performance, as we want people to have a smooth experience when scrolling through their feeds.
每天有數百萬的用戶,在Android平臺上使用Facebook,通過滑動 "新聞動態"、"用戶資料"、"Events"、"Pages"和"群組"動態串列 來與他們關心的人和資訊互動,所有這些不同的動態型別是由Android動態平臺組提供的,因此我們對動態串列的任何優化,都有可能提高Fackbook這個app的性能,我們專注于動態串列滑動時的性能,當用戶滑動動態串列時,我們希望可以為用戶提供一種流暢的用戶體驗,
To help us achieve that, we have several automatic tools that run performance tests on the Feed platform, across different scenarios and on different devices, measuring how our code performs in runtime, memory use, frame rate, and more. One of those tools, Traceview, showed a relatively high number of calls to the Long.valueOf() function, which led to objects accumulating in memory and causing the app to stall. This post describes the problem, the potential solutions we weighed, and the set of optimizations we made to improve the Feed platform.
為了幫助我們實作這一點,我們使用了幾個自動工具,在不同的設備和不同的情景下,測驗動態串列滑動時候的記憶體使用情況、幀率等,這些測驗工具中Traceview顯示Long.valueof()方法的呼叫次數較多,使記憶體空閑物件迅速增加,從而影響App的運行速度,在這篇文章中,我們介紹了問題、潛在解決方案、以及在動態串列上我們做的優化,
The downside of convenience (便利化的缺陷)
After noticing the high number of calls to the Long.valueOf() function in one of our method profiling reports from Traceview, we ran further tests that confirmed that as we scrolled through News Feed, there were an unexpectedly high number of invocations for this method.
通過分析Traceview的運行報告,在注意到Long.valueof()方法被多次呼叫后,我們進行了進一步的測驗,確認當我們滑動新聞動態時,這個方法的呼叫數量出乎意料地高,
When we looked at the stacktrace, we found that the function was not being called explicitly from Facebook's code but implicitly by code inserted by the compiler. This function was called when assigning(分派、設定) a primitive long value where a Long object is expected. Java supports both Object and primitive representation of the simple types (e.g., integer, long character) and provides a way to seamlessly convert between them. This feature is called autoboxing, because it “boxes” a primitive type into a corresponding Object type. While that's a convenient development feature, it creates new objects unbeknownst to the developer.
當我們查看stacktrace時,我們發現在Facebook的代碼并沒有直接呼叫該方法,而是由編譯器插入的代碼隱式呼叫的,這個函式在long轉化為Long時被呼叫,Java提供了“基本資料型別”(例如: int、long)與繼承自Object資料型別(例如: Integer、Long)轉化的支持,這個特性被稱為 autoboxing,它將“基本資料型別”轉化為相應的物件型別,這是一個方便的開發特性,在開發者無感知的情況下,為開發者創建了一個新的物件,
And these objects add up.
In this heap dump taken from a sample app, Long objects have a noticeable presence; while each object is not big by itself, there are so many that they occupy(占據) a large portion( 部分) of the app's memory in the heap. This is especially problematic(問題的) for devices running the Dalvik runtime. Unlike ART, which is the newer Android runtime environment, Dalvik doesn't have a generational garbage collection, known to be more optimal for handling many small objects. As we scroll through News Feed and the number of objects grows, the garbage collection will cause the app to pause and sweep(打掃) unused objects from the memory. The more objects that accumulate(積攢), the more frequently( 頻繁地) the garbage collector will have to pause the app, causing it to stutter(結巴) and stall and making for a poor user experience.
截圖取自一個示例App,其中Long有一個明顯的呼叫,從圖中可以看出,每個物件本身并不大,卻占用了App堆記憶體的很大一部分空間,這肯定是有問題的,不像Android runtime(ART虛擬機)環境,Dalvik虛擬機在處理許多小物件時,并沒有分代垃圾收集機制,當我們滾動 “新聞動態” 時,物件的數量迅速增長,垃圾回識訓制會導致應用程式暫停并清理沒有參考的物件的記憶體,物件積累越多,垃圾回收的次數就越頻繁,進行垃圾回收時,Dalvik將暫停應用程式,導致App卡頓,造成一個很差的用戶體驗,
Luckily, tools such as Traceview and Allocation Tracker can help find where these calls are made. After reviewing the origins of these autoboxing occurrences, we found that the majority of them were done while inserting long values into a HashSet
幸運的是,Traceview和Allocation Tracker這些工具可以幫助我們查找這些呼叫,回顧這些autoboxing轉化,我們發現大部分的autoboxing轉化發生在long插入到HashSet<Long>時,(我們用HashSet<Long>這個資料結構存盤“新聞動態”資料,并通過hash查找某一個“新聞動態”資料是否在此資料結構中),當用戶滑動“新聞動態”串列時,HashSet提供快速快速的資料查找能力,HashSet只能存盤繼承自Object的物件,并不能存盤“基本資料型別”,而我們用于計算和存盤的物件是一個long型的變數,當進行setStories.put(lStoryHash)操作時,不可避免的發生long到Long的轉化,
As a solution, a Set implementation( 實作) for primitive types can be used, but that turned out not to be as straightforward(簡單的) as we expected.
其中一個解決方案,可以實作一種支持"“基本資料型別”資料型別"的集合,但這種解決方案并不像我們預期的那么簡單,
Existing solutions (解決方案)
There are a few existing Java libraries that provide a Set implementation for primitive types. Almost all of these libraries were created more than 10 years ago, when the only Java running on mobile devices was J2ME. So, to determine viability, we needed to test them under Dalvik/ART and ensure they could perform well on more constrained mobile devices. We created a small testing framework to help compare these libraries with the existing HashSet and with one another. The results showed that a few of these libraries had a faster runtime than HashSet, and with fewer Long objects, but they still internally allocated a lot of objects. As an example, TLongHashSet, part of the Trove library, allocated about 2 MB worth of objects when tested with 1,000 items:
有一些現有的Java庫,為"“基本資料型別”提供Set實作,幾乎所有這些庫創建十多年前,當時在移動設備上使用Java的設備只有J2ME,因此,為了確定是否可行,我們需要在Dalvik/ART上進行測驗,確保他們可以運行在更多的移動設備上,我們構建了一個小測驗框架來,來實作這些現有庫與HashSet進行比較,結果表明,其中一些庫相比于HashSet運行速度更快,創建的Long物件更少,但他們仍然在記憶體中分配了大量的物件,例如,Trove庫中的TLongHashSet,1000項Items分配了約2 MB記憶體控制元件的Object物件,
Testing other libraries, such as PCJ and Colt, showed similar results.
測驗其他的庫,例如 PCJ 和Colt得到與之相似的結果,
It was possible that the existing solutions didn't fit our needs. We considered whether we could instead create a new Set implementation and optimize it for Android. Looking inside Java's HashSet, there's a relatively simple implementation using a single HashMap to do the heavy lifting.
現有的解決方案不符合我們的需要,我們考慮是否可以創建一個對Android優化的Set實作,查看Java HashSet原始碼實作,發現其實作相對簡單,內部采用HashMap來實作,
public class HashSet<E> extends AbstractSet<E> implements Set<E>, ... {
transient HashMap<E, HashSet<E>> backingMap;
...
@Override public boolean add(E object) {
return backingMap.put(object, this) == null;
}
@Override public boolean contains(Object object) {
return backingMap.containsKey(object);
}
...
}
Adding a new item to the HashSet means adding it to the internal HashMap where the object is the key and the HashSet's instance is the value. To check object membership, HashSet checks whether its internal HashMap contains the object as a key. An alternative to HashSet could be implemented using an Android-optimized map and the same principles.
添加一個Item到HashSet意味著將此Item添加到HashMap中,在HashMap中,該Item物件做為Key存在;在HashSet中查找該Item物件時,實際是在對比HashMap 中的Key中查找是否存在該Item物件,在Android平臺上的資料交換同樣使用的這套準則,
Introducing(介紹) LongArraySet
You may already be familiar with LongSparseArray, a class in the Android Support Library that acts as a map using a primitive long as a key. Example usage:
您可能已經熟悉LongSparseArray,其存在于Android的Support Library庫中,做為一個Map實作,其key為long型資料變數,使用示例:
LongSparseArray<String> longSparseArray = new LongSparseArray<>();
longSparseArray.put(3L, "Data");
String data = https://www.cnblogs.com/xiaxveliang/p/longSparseArray.get(3L); // the value of data is "Data"
LongSparseArray works differently than HashMap, though. When calling mapHashmap.get(KEY5), this is how the value is found in the HashMap:
LongSparseArray與HashMap作業原理不同,當呼叫mapHashmap.get(KEY5)時,以下為HashMap查找原理:
When retrieving a value using a key on a HashMap, it's accessing the value in the array using a hash of the key as the index, a direct access in O(1). Making the same call on a LongSparseArray looks like this:
在HashMap中檢索一個值時,通過其對應key的hash值進行比較,直接檢索時時間復雜度為O(1),LongSparseArray 做同樣操作時:
LongSparseArray searches a sorted keys array for the key's value using a binary search, an operation with runtime of O(log N). The index of the key in the array is later used to find the value in the values array.
LongSparseArray搜索某一個key時,采用二分查找演算法查找存放key的陣列,時間復雜度O(log N),查找到的key的index用于在Value陣列中找到對應的value,
HashMap allocates one big array in an effort to avoid collisions, which are causing the search to be slower. LongSparseArray allocates two small arrays, making its memory footprint smaller. But to support its search algorithm, LongSparseArray needs to allocate its internal arrays in consecutive memory blocks. Adding more items will require allocating new arrays when there's no more space in the current ones. The way LongSparseArray works makes it less ideal when holding more than 1,000 items, where these differences have a more significant impact on performance. (You can learn more about LongSparseArray in the official documentation and by watching this short video by Google.)
為了避免資料沖突HashMap創建了一個很大的陣列,從而導致搜索速度較慢,LongSparseArray分配兩個較小的陣列,使其記憶體占用較小,但是為了支持其搜索演算法,LongSparseArray需要分配連續的記憶體塊,添加更多的Item時,當陣列空間不足時,需要分配新的陣列空間,LongSparseArray的作業方式,使得Item數量超過1000項時,其表現不是很理想,這些差異對性能有更重大的影響,(想了解更多關于LongSparseArray的內容,可以閱讀official documentation和谷歌提供的short video ) ,
Since the LongSparseArray's keys are of a primitive long type, we can create a data structure with the same approach as HashSet but with a LongSparseArray as the internal map instead of HashMap.
由于LongSparseArray使用“基本資料型別”long做為key,我們可以用LongSparseArray替換HashSet中的HashMap,從而創建一個與HashSet相似的資料結構,
And so LongArraySet was created.
因此LongArraySet被創建出來了,
The new data structure looked promising, but the first rule of optimization is “always measure.” By using the same testing framework from earlier, we compared the new data structure with HashSet. Each data structure was tested by adding X number of items, checking the existence of each item, and later removing all of them. We ran tests using different numbers of items (X=10, X=100, X=1,000 ...) and averaged the time it took to complete each operation per item.
新的資料結構看起來是很有前景的,但是最優的解決方案總是在不斷的權衡的,通過之間介紹的小測驗程式,我們把新的資料結構與HashSet進行了對比,對每一種資料結構進行添加X條資料、檢測Item資料存在情況和移除資料測驗,我們測驗使用不同數量的Item(X = 10、X = 100、X = 1000…),并計算對應操作的平均時間,
The runtime results (time shown is in nanoseconds):
測驗結果(單位:納秒)
We saw a runtime improvement for the contains and remove methods using the new data structure. Also, as the number of items increased in the array set, it took more time to add new items. That's consistent with what we already knew about LongSparseArray — it doesn't perform as well as HashMap when the number of items is more than 1,000. In our use cases, we're dealing with only hundreds of items, so this is a trade-off we're willing to make.
我們看到使用新的資料結構后,contains與remove操作性能的顯著提升,同時,隨著item數量的增加,需要花更多的時間去添加一個新的item,這符合我們對于 LongSparseArray中item超過1000后的操作表現預期,在我們的案例中,我們僅僅要處理數百條的資料,因此經過權衡,我們將采用這種方案,
We also saw a big improvement related to memory. In reviewing the heap dumps and Allocation Tracker reports, we noticed a decrease in object allocations. Here's a side-by-side allocation report for the HashSet and LongArraySet implementations, when adding 1,000 items for 20 iterations:
我們還看到一個記憶體的重大提升,回顧heap dumps和Allocation Tracker報告,我們注意到物件分配的減少,以下添加1000條資料時,HashSet和LongArraySet的記憶體分配報告,
In addition to avoiding all the Long object allocations, LongSparseArray was more memory-efficient internally, with about 30 percent fewer allocations in this scenario.
避免了Long物件占用記憶體,在這種情景下,LongSparseArray減少了30%的記憶體占用,
Conclusion 結論
By understanding how other data structures work, we were able to create a more optimized data structure for our needs. The less the garbage collector has to work, the lower the likelihood of dropped frames. Using the new LongArraySet class, and a similar IntArraySet for the primitive int data type, we were able to cut down a significant number of allocations in our entire app.
通過了解資料結構的作業原理,我們可以創建一個滿足我們要求的更優的資料結構,垃圾回收的次數越少,對幀率的影響就越小,使用LongArraySet和IntArraySet ,我們可以減少整個應用程式中,物件的記憶體分配,
This case study demonstrates the importance of measuring our options to ensure that we were introducing an improvement. We also accepted that while this solution is not perfect for all use cases, since this implementation is slower for very large data sets, it was an acceptable trade-off for us to optimize our code.
這個學習案例展示了,為了確保我們引入改進的可行,不斷權衡的重要性,對于這個方案并不適用于所有的情況,這點我們是可以接收的,雖然當資料量很大時,這個方案的效率是很差的,但是這是一個我們可以接受的折衷優化方案,
You can find the source code for the two data structures here. We are excited to continue working on challenges and optimizing our Feed platform and to share our solutions with the community.
從這里可以下載LongArraySet 與IntArraySet的代碼,
如果打不開,可以從以下連接獲取原始碼:
http://blog.csdn.net/xiaxl/article/details/72730959
本人英語水平較爛,如有翻譯錯誤之處,請指出,我會積極修正,謝謝大家
========== THE END ==========

轉載請註明出處,本文鏈接:https://www.uj5u.com/yidong/28323.html
標籤:Android
