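Introduction
Before an Android device ships, whether it is a phone, a tablet, or an in-vehicle head unit, it goes through many rounds of testing, so the software has at least a baseline level of quality.
In practice, however, once a device is on the market, testing cannot reproduce everything users will do.
Consumers include non-technical users, and when a device misbehaves their only recourse is the manufacturer's after-sales service.
After-sales service, in turn, generally refuses warranty coverage for devices that have been rooted or flashed with unofficial software builds.
Android takes this into account and adds a rescue mode (Rescue Party).
When a problem is severe enough, it offers the user the option of a factory reset.
That feature is what this article analyzes.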
Rescue Levels
The system defines different rescue levels for problems of different severity:
@VisibleForTesting
static final int LEVEL_NONE = 0;
@VisibleForTesting
static final int LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS = 1;
@VisibleForTesting
static final int LEVEL_RESET_SETTINGS_UNTRUSTED_CHANGES = 2;
@VisibleForTesting
static final int LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS = 3;
@VisibleForTesting
static final int LEVEL_FACTORY_RESET = 4;
As we can see, the levels from 0 to 4 escalate with severity; at level 4 the action is a factory reset.
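To make the escalation easier to read, here is a minimal sketch that maps each level to the action taken at that level. The constant values are copied from the listing above; the class name `RescueLevelSketch` and the helper `describe` are hypothetical and exist only for illustration.

```java
// Illustrative sketch only: RescueLevelSketch and describe() are hypothetical
// names, not part of the Android framework.
final class RescueLevelSketch {
    // Values copied from the constants shown in the article.
    static final int LEVEL_NONE = 0;
    static final int LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS = 1;
    static final int LEVEL_RESET_SETTINGS_UNTRUSTED_CHANGES = 2;
    static final int LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS = 3;
    static final int LEVEL_FACTORY_RESET = 4;

    /** Returns a short description of the action associated with a level. */
    static String describe(int level) {
        switch (level) {
            case LEVEL_NONE:
                return "no action";
            case LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS:
                return "reset settings with RESET_MODE_UNTRUSTED_DEFAULTS";
            case LEVEL_RESET_SETTINGS_UNTRUSTED_CHANGES:
                return "reset settings with RESET_MODE_UNTRUSTED_CHANGES";
            case LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS:
                return "reset settings with RESET_MODE_TRUSTED_DEFAULTS";
            case LEVEL_FACTORY_RESET:
                return "factory reset";
            default:
                return "unknown level " + level;
        }
    }

    public static void main(String[] args) {
        for (int level = LEVEL_NONE; level <= LEVEL_FACTORY_RESET; level++) {
            System.out.println(level + " -> " + describe(level));
        }
    }
}
```

Levels 1 to 3 only touch settings, which is why the user's data survives until level 4.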
App-Level Rescue Implementation
[Figure: flowchart of the app-level rescue flow]

Let's walk through the implementation:
PWD:frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
/**
* Handle application death from an uncaught exception. The framework
* catches these for the main threads, so this should only matter for
* threads created by applications. Before this method runs, the given
* instance of {@link LoggingHandler} should already have logged details
* (and if not it is run first).
*/
private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
private final LoggingHandler mLoggingHandler;
@Override
public void uncaughtException(Thread t, Throwable e) {
try {
ensureLogging(t, e);
// Don't re-enter -- avoid infinite loops if crash-reporting crashes.
if (mCrashing) return;
mCrashing = true;
// Try to end profiling. If a profiler is running at this point, and we kill the
// process (below), the in-memory buffer will be lost. So try to stop, which will
// flush the buffer. (This makes method trace profiling useful to debug crashes.)
if (ActivityThread.currentActivityThread() != null) {
ActivityThread.currentActivityThread().stopProfiling();
}
// Bring up crash dialog, wait for it to be dismissed
ActivityManager.getService().handleApplicationCrash(
mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
} catch (Throwable t2) {
if (t2 instanceof DeadObjectException) {
// System process is dead; ignore
} else {
try {
Clog_e(TAG, "Error reporting crash", t2);
} catch (Throwable t3) {
// Even Clog_e() fails! Oh well.
}
}
} finally {
// Try everything to make sure this process goes away.
Process.killProcess(Process.myPid());
System.exit(10);
}
}
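KillApplicationHandler is a nested class; only its uncaughtException method is excerpted here.
When a thread in the app throws an uncaught exception, this method is invoked to handle it before the process is killed.
It calls ActivityManager.getService().handleApplicationCrash to carry out the subsequent handling.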
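The re-entry guard in uncaughtException (the mCrashing flag) can be tried outside Android with a plain `Thread.UncaughtExceptionHandler`. The following is a minimal sketch under that assumption: the class `CrashHandlerSketch` and its methods are hypothetical, it records a report instead of calling ActivityManager, and it deliberately does not kill the process.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a plain-Java handler that mimics the
// "report once, ignore re-entry" guard. It does not kill the process.
final class CrashHandlerSketch implements Thread.UncaughtExceptionHandler {
    private boolean crashing = false;
    private final List<String> reports = new ArrayList<>();

    @Override
    public synchronized void uncaughtException(Thread t, Throwable e) {
        // Don't re-enter: only the first crash is reported.
        if (crashing) {
            return;
        }
        crashing = true;
        reports.add(t.getName() + ": " + e.getMessage());
    }

    synchronized List<String> reports() {
        return new ArrayList<>(reports);
    }

    /** Crashes a worker thread, then simulates a second crash report. */
    static List<String> runDemo() {
        CrashHandlerSketch handler = new CrashHandlerSketch();
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("boom");
        }, "worker");
        worker.setUncaughtExceptionHandler(handler);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", ie);
        }
        // A second report is ignored because the guard flag is already set.
        handler.uncaughtException(Thread.currentThread(), new RuntimeException("second"));
        return handler.reports();
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

In the framework code the guard matters because the crash-reporting path can itself crash, which would otherwise loop forever.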
PWD:frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
/**
* Used by {@link com.android.internal.os.RuntimeInit} to report when an application crashes.
* The application process will exit immediately after this call returns.
* @param app object of the crashing app, null for the system server
* @param crashInfo describing the exception
*/
public void handleApplicationCrash(IBinder app,
ApplicationErrorReport.ParcelableCrashInfo crashInfo) {
ProcessRecord r = findAppProcess(app, "Crash");
final String processName = app == null ? "system_server"
: (r == null ? "unknown" : r.processName);
handleApplicationCrashInner("crash", r, processName, crashInfo);
}
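The Javadoc comment states the purpose directly:
Used by {@link com.android.internal.os.RuntimeInit} to report when an application crashes.
The method then passes the crashing process's record, its process name, and the CrashInfo to handleApplicationCrashInner for processing.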
/* Native crash reporting uses this inner version because it needs to be somewhat
* decoupled from the AM-managed cleanup lifecycle
*/
void handleApplicationCrashInner(String eventType, ProcessRecord r, String processName,
ApplicationErrorReport.CrashInfo crashInfo) {
EventLogTags.writeAmCrash(Binder.getCallingPid(),
UserHandle.getUserId(Binder.getCallingUid()), processName,
r == null ? -1 : r.info.flags,
crashInfo.exceptionClassName,
crashInfo.exceptionMessage,
crashInfo.throwFileName,
crashInfo.throwLineNumber);
FrameworkStatsLog.write(FrameworkStatsLog.APP_CRASH_OCCURRED,
Binder.getCallingUid(),
eventType,
processName,
Binder.getCallingPid(),
(r != null && r.info != null) ? r.info.packageName : "",
(r != null && r.info != null) ? (r.info.isInstantApp()
? FrameworkStatsLog.APP_CRASH_OCCURRED__IS_INSTANT_APP__TRUE
: FrameworkStatsLog.APP_CRASH_OCCURRED__IS_INSTANT_APP__FALSE)
: FrameworkStatsLog.APP_CRASH_OCCURRED__IS_INSTANT_APP__UNAVAILABLE,
r != null ? (r.isInterestingToUserLocked()
? FrameworkStatsLog.APP_CRASH_OCCURRED__FOREGROUND_STATE__FOREGROUND
: FrameworkStatsLog.APP_CRASH_OCCURRED__FOREGROUND_STATE__BACKGROUND)
: FrameworkStatsLog.APP_CRASH_OCCURRED__FOREGROUND_STATE__UNKNOWN,
processName.equals("system_server") ? ServerProtoEnums.SYSTEM_SERVER
: (r != null) ? r.getProcessClassEnum()
: ServerProtoEnums.ERROR_SOURCE_UNKNOWN
);
final int relaunchReason = r == null ? RELAUNCH_REASON_NONE
: r.getWindowProcessController().computeRelaunchReason();
final String relaunchReasonString = relaunchReasonToString(relaunchReason);
if (crashInfo.crashTag == null) {
crashInfo.crashTag = relaunchReasonString;
} else {
crashInfo.crashTag = crashInfo.crashTag + " " + relaunchReasonString;
}
addErrorToDropBox(
eventType, r, processName, null, null, null, null, null, null, crashInfo);
mAppErrors.crashApplication(r, crashInfo);
}
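Readers familiar with Android's logging system will recognize addErrorToDropBox as a very important error-handling function.
It will be covered separately in a later article on logging.
What we care about here is the call mAppErrors.crashApplication(r, crashInfo):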
/**
* Bring up the "unexpected error" dialog box for a crashing app.
* Deal with edge cases (intercepts from instrumented applications,
* ActivityController, error intent receivers, that sort of thing).
* @param r the application crashing
* @param crashInfo describing the failure
*/
void crashApplication(ProcessRecord r, ApplicationErrorReport.CrashInfo crashInfo) {
final int callingPid = Binder.getCallingPid();
final int callingUid = Binder.getCallingUid();
final long origId = Binder.clearCallingIdentity();
try {
crashApplicationInner(r, crashInfo, callingPid, callingUid);
} finally {
Binder.restoreCallingIdentity(origId);
}
}
Next, the implementation of crashApplicationInner:
void crashApplicationInner(ProcessRecord r, ApplicationErrorReport.CrashInfo crashInfo,
int callingPid, int callingUid) {
long timeMillis = System.currentTimeMillis();
String shortMsg = crashInfo.exceptionClassName;
String longMsg = crashInfo.exceptionMessage;
String stackTrace = crashInfo.stackTrace;
if (shortMsg != null && longMsg != null) {
longMsg = shortMsg + ": " + longMsg;
} else if (shortMsg != null) {
longMsg = shortMsg;
}
if (r != null) {
mPackageWatchdog.onPackageFailure(r.getPackageListWithVersionCode(),
PackageWatchdog.FAILURE_REASON_APP_CRASH);
mService.mProcessList.noteAppKill(r, (crashInfo != null
&& "Native crash".equals(crashInfo.exceptionClassName))
? ApplicationExitInfo.REASON_CRASH_NATIVE
: ApplicationExitInfo.REASON_CRASH,
ApplicationExitInfo.SUBREASON_UNKNOWN,
"crash");
}
final int relaunchReason = r != null
? r.getWindowProcessController().computeRelaunchReason() : RELAUNCH_REASON_NONE;
AppErrorResult result = new AppErrorResult();
int taskId;
synchronized (mService) {
/**
* If crash is handled by instance of {@link android.app.IActivityController},
* finish now and don't show the app error dialog.
*/
if (handleAppCrashInActivityController(r, crashInfo, shortMsg, longMsg, stackTrace,
timeMillis, callingPid, callingUid)) {
return;
}
// Suppress crash dialog if the process is being relaunched due to a crash during a free
// resize.
if (relaunchReason == RELAUNCH_REASON_FREE_RESIZE) {
return;
}
/**
* If this process was running instrumentation, finish now - it will be handled in
* {@link ActivityManagerService#handleAppDiedLocked}.
*/
if (r != null && r.getActiveInstrumentation() != null) {
return;
}
// Log crash in battery stats.
if (r != null) {
mService.mBatteryStatsService.noteProcessCrash(r.processName, r.uid);
}
AppErrorDialog.Data data = new AppErrorDialog.Data();
data.result = result;
data.proc = r;
// If we can't identify the process or it's already exceeded its crash quota,
// quit right away without showing a crash dialog.
if (r == null || !makeAppCrashingLocked(r, shortMsg, longMsg, stackTrace, data)) {
return;
}
final Message msg = Message.obtain();
msg.what = ActivityManagerService.SHOW_ERROR_UI_MSG;
taskId = data.taskId;
msg.obj = data;
mService.mUiHandler.sendMessage(msg);
}
int res = result.get();
Intent appErrorIntent = null;
MetricsLogger.action(mContext, MetricsProto.MetricsEvent.ACTION_APP_CRASH, res);
if (res == AppErrorDialog.TIMEOUT || res == AppErrorDialog.CANCEL) {
res = AppErrorDialog.FORCE_QUIT;
}
synchronized (mService) {
if (res == AppErrorDialog.MUTE) {
stopReportingCrashesLocked(r);
}
if (res == AppErrorDialog.RESTART) {
mService.mProcessList.removeProcessLocked(r, false, true,
ApplicationExitInfo.REASON_CRASH, "crash");
if (taskId != INVALID_TASK_ID) {
try {
mService.startActivityFromRecents(taskId,
ActivityOptions.makeBasic().toBundle());
} catch (IllegalArgumentException e) {
// Hmm...that didn't work. Task should either be in recents or associated
// with a stack.
Slog.e(TAG, "Could not restart taskId=" + taskId, e);
}
}
}
if (res == AppErrorDialog.FORCE_QUIT) {
long orig = Binder.clearCallingIdentity();
try {
// Kill it with fire!
mService.mAtmInternal.onHandleAppCrash(r.getWindowProcessController());
if (!r.isPersistent()) {
mService.mProcessList.removeProcessLocked(r, false, false,
ApplicationExitInfo.REASON_CRASH, "crash");
mService.mAtmInternal.resumeTopActivities(false /* scheduleIdle */);
}
} finally {
Binder.restoreCallingIdentity(orig);
}
}
if (res == AppErrorDialog.APP_INFO) {
appErrorIntent = new Intent(Settings.ACTION_APPLICATION_DETAILS_SETTINGS);
appErrorIntent.setData(Uri.parse("package:" + r.info.packageName));
appErrorIntent.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK);
}
if (res == AppErrorDialog.FORCE_QUIT_AND_REPORT) {
appErrorIntent = createAppErrorIntentLocked(r, timeMillis, crashInfo);
}
if (r != null && !r.isolated && res != AppErrorDialog.RESTART) {
// XXX Can't keep track of crash time for isolated processes,
// since they don't have a persistent identity.
mProcessCrashTimes.put(r.info.processName, r.uid,
SystemClock.uptimeMillis());
}
}
if (appErrorIntent != null) {
try {
mContext.startActivityAsUser(appErrorIntent, new UserHandle(r.userId));
} catch (ActivityNotFoundException e) {
Slog.w(TAG, "bug report receiver dissappeared", e);
}
}
}
When a crash occurs and the ProcessRecord r is not null, crashApplicationInner calls the onPackageFailure method of mPackageWatchdog:
mPackageWatchdog.onPackageFailure(r.getPackageListWithVersionCode(),
PackageWatchdog.FAILURE_REASON_APP_CRASH);
onPackageFailure is implemented as follows:
/**
* Called when a process fails due to a crash, ANR or explicit health check.
*
* <p>For each package contained in the process, one registered observer with the least user
* impact will be notified for mitigation.
*
* <p>This method could be called frequently if there is a severe problem on the device.
*/
public void onPackageFailure(List<VersionedPackage> packages,
@FailureReasons int failureReason) {
if (packages == null) {
Slog.w(TAG, "Could not resolve a list of failing packages");
return;
}
mLongTaskHandler.post(() -> {
synchronized (mLock) {
if (mAllObservers.isEmpty()) {
return;
}
boolean requiresImmediateAction = (failureReason == FAILURE_REASON_NATIVE_CRASH
|| failureReason == FAILURE_REASON_EXPLICIT_HEALTH_CHECK);
if (requiresImmediateAction) {
handleFailureImmediately(packages, failureReason);
} else {
for (int pIndex = 0; pIndex < packages.size(); pIndex++) {
VersionedPackage versionedPackage = packages.get(pIndex);
// Observer that will receive failure for versionedPackage
PackageHealthObserver currentObserverToNotify = null;
int currentObserverImpact = Integer.MAX_VALUE;
// Find observer with least user impact
for (int oIndex = 0; oIndex < mAllObservers.size(); oIndex++) {
ObserverInternal observer = mAllObservers.valueAt(oIndex);
PackageHealthObserver registeredObserver = observer.registeredObserver;
if (registeredObserver != null
&& observer.onPackageFailureLocked(
versionedPackage.getPackageName())) {
int impact = registeredObserver.onHealthCheckFailed(
versionedPackage, failureReason);
if (impact != PackageHealthObserverImpact.USER_IMPACT_NONE
&& impact < currentObserverImpact) {
currentObserverToNotify = registeredObserver;
currentObserverImpact = impact;
}
}
}
// Execute action with least user impact
if (currentObserverToNotify != null) {
currentObserverToNotify.execute(versionedPackage, failureReason);
}
}
}
}
});
}
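When the failure reason is FAILURE_REASON_NATIVE_CRASH or FAILURE_REASON_EXPLICIT_HEALTH_CHECK, the failure is handled right away through handleFailureImmediately, which is typically the rollback path. For all other reasons, the registered observers are consulted one by one. Here we focus on that non-rollback path: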
for (int pIndex = 0; pIndex < packages.size(); pIndex++) {
VersionedPackage versionedPackage = packages.get(pIndex);
// Observer that will receive failure for versionedPackage
PackageHealthObserver currentObserverToNotify = null;
int currentObserverImpact = Integer.MAX_VALUE;
// Find observer with least user impact
for (int oIndex = 0; oIndex < mAllObservers.size(); oIndex++) {
ObserverInternal observer = mAllObservers.valueAt(oIndex);
PackageHealthObserver registeredObserver = observer.registeredObserver;
if (registeredObserver != null
&& observer.onPackageFailureLocked(
versionedPackage.getPackageName())) {
int impact = registeredObserver.onHealthCheckFailed(
versionedPackage, failureReason);
if (impact != PackageHealthObserverImpact.USER_IMPACT_NONE
&& impact < currentObserverImpact) {
currentObserverToNotify = registeredObserver;
currentObserverImpact = impact;
}
}
}
// Execute action with least user impact
if (currentObserverToNotify != null) {
currentObserverToNotify.execute(versionedPackage, failureReason);
}
}
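For each failing package, the loop asks every registered PackageHealthObserver for the user impact of its mitigation, picks the observer with the least impact, and then calls that observer's execute method: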
// Execute action with least user impact
if (currentObserverToNotify != null) {
currentObserverToNotify.execute(versionedPackage, failureReason);
}
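The "least user impact" selection can be sketched in isolation. This is a simplified illustration, not the PackageWatchdog code: the `Observer` interface, the `FixedObserver` class, and the numeric impact values are assumptions made for the example, and only the selection rule (skip observers that report no impact, keep the smallest remaining value) follows the loop shown above.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: Observer, FixedObserver and the impact values
// are hypothetical stand-ins for the framework types.
final class LeastImpactSketch {
    static final int USER_IMPACT_NONE = 0; // assumed value for "cannot help"

    interface Observer {
        String name();
        int onHealthCheckFailed(String packageName);
    }

    /** Observer that always reports the same impact, for demonstration. */
    static final class FixedObserver implements Observer {
        private final String name;
        private final int impact;

        FixedObserver(String name, int impact) {
            this.name = name;
            this.impact = impact;
        }

        @Override
        public String name() {
            return name;
        }

        @Override
        public int onHealthCheckFailed(String packageName) {
            return impact;
        }
    }

    /** Returns the observer with the smallest non-zero impact, or null. */
    static Observer pick(List<Observer> observers, String packageName) {
        Observer chosen = null;
        int best = Integer.MAX_VALUE;
        for (Observer observer : observers) {
            int impact = observer.onHealthCheckFailed(packageName);
            if (impact != USER_IMPACT_NONE && impact < best) {
                chosen = observer;
                best = impact;
            }
        }
        return chosen;
    }

    /** Picks among three observers with made-up impact values. */
    static String demo() {
        List<Observer> observers = Arrays.asList(
                new FixedObserver("rollback", 3),
                new FixedObserver("rescue", 1),
                new FixedObserver("noop", USER_IMPACT_NONE));
        Observer chosen = pick(observers, "com.example.app");
        return chosen == null ? "none" : chosen.name();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Only the chosen observer gets its execute method called, so at most one mitigation runs per failing package.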
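RescueParty, the implementation of rescue mode, provides such an observer: its nested class RescuePartyObserver implements the PackageHealthObserver interface.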
/**
* Handle mitigation action for package failures. This observer will be register to Package
* Watchdog and will receive calls about package failures. This observer is persistent so it
* may choose to mitigate failures for packages it has not explicitly asked to observe.
*/
public static class RescuePartyObserver implements PackageHealthObserver {
@Override
public boolean execute(@Nullable VersionedPackage failedPackage,
@FailureReasons int failureReason) {
if (isDisabled()) {
return false;
}
if (failureReason == PackageWatchdog.FAILURE_REASON_APP_CRASH
|| failureReason == PackageWatchdog.FAILURE_REASON_APP_NOT_RESPONDING) {
int triggerUid = getPackageUid(mContext, failedPackage.getPackageName());
incrementRescueLevel(triggerUid);
executeRescueLevel(mContext,
failedPackage == null ? null : failedPackage.getPackageName());
return true;
} else {
return false;
}
}
}
incrementRescueLevel advances the rescue level.
executeRescueLevel then performs the rescue action for that level.
/**
* Escalate to the next rescue level. After incrementing the level you'll
* probably want to call {@link #executeRescueLevel(Context, String)}.
*/
private static void incrementRescueLevel(int triggerUid) {
final int level = getNextRescueLevel();
SystemProperties.set(PROP_RESCUE_LEVEL, Integer.toString(level));
EventLogTags.writeRescueLevel(level, triggerUid);
logCriticalInfo(Log.WARN, "Incremented rescue level to "
+ levelToString(level) + " triggered by UID " + triggerUid);
}
incrementRescueLevel calls getNextRescueLevel to compute the new level:
/**
* Get the next rescue level. This indicates the next level of mitigation that may be taken.
*/
private static int getNextRescueLevel() {
return MathUtils.constrain(SystemProperties.getInt(PROP_RESCUE_LEVEL, LEVEL_NONE) + 1,
LEVEL_NONE, LEVEL_FACTORY_RESET);
}
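The principle is simple: each call adds 1 to the stored level, and the result is clamped so that it never goes beyond LEVEL_FACTORY_RESET.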
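The increment-and-clamp behavior of getNextRescueLevel can be reproduced without the Android classes. In this minimal sketch, `constrain` is a local stand-in for MathUtils.constrain, and the stored level is a plain int instead of a system property; the class and method names are hypothetical.

```java
// Illustrative sketch only: the level is kept in a local variable instead of
// the PROP_RESCUE_LEVEL system property, and constrain() is a local stand-in.
final class RescueLevelCounterSketch {
    static final int LEVEL_NONE = 0;
    static final int LEVEL_FACTORY_RESET = 4;

    /** Clamps amount into the inclusive range [low, high]. */
    static int constrain(int amount, int low, int high) {
        return amount < low ? low : (amount > high ? high : amount);
    }

    /** Next level after one more escalation, capped at LEVEL_FACTORY_RESET. */
    static int nextLevel(int current) {
        return constrain(current + 1, LEVEL_NONE, LEVEL_FACTORY_RESET);
    }

    /** Level reached after the given number of escalations from LEVEL_NONE. */
    static int levelAfter(int escalations) {
        int level = LEVEL_NONE;
        for (int i = 0; i < escalations; i++) {
            level = nextLevel(level);
        }
        return level;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 6; n++) {
            System.out.println(n + " escalations -> level " + levelAfter(n));
        }
    }
}
```

Starting from LEVEL_NONE, the fourth escalation reaches LEVEL_FACTORY_RESET, and further escalations stay there.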
private static void executeRescueLevel(Context context, @Nullable String failedPackage) {
final int level = SystemProperties.getInt(PROP_RESCUE_LEVEL, LEVEL_NONE);
if (level == LEVEL_NONE) return;
Slog.w(TAG, "Attempting rescue level " + levelToString(level));
try {
executeRescueLevelInternal(context, level, failedPackage);
EventLogTags.writeRescueSuccess(level);
logCriticalInfo(Log.DEBUG,
"Finished rescue level " + levelToString(level));
} catch (Throwable t) {
logRescueException(level, t);
}
}
executeRescueLevel reads the current level and passes it, together with failedPackage, to executeRescueLevelInternal, which does the actual work:
private static void executeRescueLevelInternal(Context context, int level, @Nullable
String failedPackage) throws Exception {
FrameworkStatsLog.write(FrameworkStatsLog.RESCUE_PARTY_RESET_REPORTED, level);
switch (level) {
case LEVEL_RESET_SETTINGS_UNTRUSTED_DEFAULTS:
resetAllSettings(context, Settings.RESET_MODE_UNTRUSTED_DEFAULTS, failedPackage);
break;
case LEVEL_RESET_SETTINGS_UNTRUSTED_CHANGES:
resetAllSettings(context, Settings.RESET_MODE_UNTRUSTED_CHANGES, failedPackage);
break;
case LEVEL_RESET_SETTINGS_TRUSTED_DEFAULTS:
resetAllSettings(context, Settings.RESET_MODE_TRUSTED_DEFAULTS, failedPackage);
break;
case LEVEL_FACTORY_RESET:
// Request the reboot from a separate thread to avoid deadlock on PackageWatchdog
// when device shutting down.
Runnable runnable = new Runnable() {
@Override
public void run() {
try {
RecoverySystem.rebootPromptAndWipeUserData(context, TAG);
} catch (Throwable t) {
logRescueException(level, t);
}
}
};
Thread thread = new Thread(runnable);
thread.start();
break;
}
}
Every level below LEVEL_FACTORY_RESET performs a resetAllSettings operation:
private static void resetAllSettings(Context context, int mode, @Nullable String failedPackage)
throws Exception {
// Try our best to reset all settings possible, and once finished
// rethrow any exception that we encountered
Exception res = null;
final ContentResolver resolver = context.getContentResolver();
try {
resetDeviceConfig(context, mode, failedPackage);
} catch (Exception e) {
res = new RuntimeException("Failed to reset config settings", e);
}
try {
Settings.Global.resetToDefaultsAsUser(resolver, null, mode, UserHandle.USER_SYSTEM);
} catch (Exception e) {
res = new RuntimeException("Failed to reset global settings", e);
}
for (int userId : getAllUserIds()) {
try {
Settings.Secure.resetToDefaultsAsUser(resolver, null, mode, userId);
} catch (Exception e) {
res = new RuntimeException("Failed to reset secure settings for " + userId, e);
}
}
if (res != null) {
throw res;
}
}
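resetAllSettings follows a best-effort pattern: every reset step is attempted even if an earlier one fails, and a remembered exception is rethrown only at the end. The sketch below shows that pattern in plain Java. The `Step` interface, the step names, and the simulated failure are assumptions for illustration and do not touch any real settings.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: Step and the demo steps are hypothetical; no real
// settings are reset.
final class BestEffortResetSketch {
    interface Step {
        void run() throws Exception;
    }

    /** Runs every step, then rethrows the most recent failure, if any. */
    static void runAll(List<Step> steps) throws Exception {
        Exception last = null;
        for (Step step : steps) {
            try {
                step.run();
            } catch (Exception e) {
                // As in the article's code, a later failure overwrites an earlier one.
                last = new RuntimeException("Step failed", e);
            }
        }
        if (last != null) {
            throw last;
        }
    }

    /** The middle step fails, yet the last step still runs. */
    static String demo() {
        List<String> log = new ArrayList<>();
        List<Step> steps = new ArrayList<>();
        steps.add(() -> log.add("config"));
        steps.add(() -> {
            throw new IllegalStateException("global failed");
        });
        steps.add(() -> log.add("secure"));
        try {
            runAll(steps);
        } catch (Exception e) {
            log.add("rethrown: " + e.getCause().getMessage());
        }
        return String.join(",", log);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

One consequence of this design is that only the most recent failure reaches the caller; earlier ones are overwritten.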
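System Factory Reset-Level Rescue Implementation
When the factory-reset condition is met, that is, once the rescue level has escalated all the way to LEVEL_FACTORY_RESET (value 4 in the constants above), the following code runs: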
// Request the reboot from a separate thread to avoid deadlock on PackageWatchdog
// when device shutting down.
Runnable runnable = new Runnable() {
@Override
public void run() {
try {
RecoverySystem.rebootPromptAndWipeUserData(context, TAG);
} catch (Throwable t) {
logRescueException(level, t);
}
}
};
Thread thread = new Thread(runnable);
thread.start();
break;
This calls RecoverySystem.rebootPromptAndWipeUserData to perform the factory reset,
which takes the device into the factory-reset screen.
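The code comment explains that the reboot is requested from a separate thread to avoid a deadlock on PackageWatchdog. The general shape of that hand-off, where the caller returns at once and any failure is logged inside the worker thread, can be sketched as follows. The class `AsyncResetRequestSketch`, its methods, and the simulated failure are hypothetical; nothing here calls RecoverySystem.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative sketch only: the "reset" action is a stand-in Runnable and the
// error list plays the role of logRescueException.
final class AsyncResetRequestSketch {
    /** Starts the action on a new thread; failures are recorded, not thrown. */
    static Thread requestAsync(Runnable action, List<String> errors) {
        Thread thread = new Thread(() -> {
            try {
                action.run();
            } catch (Throwable t) {
                errors.add("reset failed: " + t.getMessage());
            }
        });
        thread.start();
        return thread;
    }

    /** Runs a failing action and returns what was recorded. */
    static List<String> demo() {
        List<String> errors = new CopyOnWriteArrayList<>();
        Thread worker = requestAsync(() -> {
            throw new IllegalStateException("recovery unavailable");
        }, errors);
        try {
            // Joining is only for the demo; the framework code does not wait.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```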
Please credit the source when reposting. Link to this article: https://www.uj5u.com/qita/218830.html
