Android5.1--SystemServer进程
2015-10-13 16:32
399 查看
这篇博客作为自己的学习心得记录下来,欢迎提出问题,一起学习进步。
SystemServer是Android系统的核心之一,大部分Android提供的服务都运行在这个进程里,SystemServer中运行的服务总共有60多种。为了防止应用进程对系统造成破坏,Android的应用进程没有权限访问设备的底层资源,只能通过SystemServer中的服务代理访问。通过Binder,用户进程使用systemServer中的服务并没有太多的不便之处。
下面我们重点看下SystemServer的启动过程和它的WatchDog模块。
(1)、设置属性persist.sys.dalvik.vm.lib.2的值为当前虚拟机的运行库路径。
(2)、调整时间。
(3)、调整虚拟机堆的内存。设定虚拟机堆利用率为0.8,当实际的使用率偏离设定的比率时,虚拟机在垃圾回收的时候将调整堆的大小,使实际使用率接近设定的百分比。
(4)、装载libandroid_servers.so库。这个库的源文件位于frameworks\base\services\core\jni。
(5)、调用nativeInit()方法初始化native层的Binder服务。这里native层的服务只有SensorService。
对应的native层函数是android_server_SystemServer_nativeInit(),如下:
(6)、调用createSystemContext()来获取Context,相当于创建了一个framework-res.apk的上下文环境。
(7)、创建SystemServiceManager对象mSystemServiceManager。这个对象负责系统Service的启动。
(8)、startBootstrapServices()、startCoreServices()、startOtherServices()创建并运行所有java服务。
(9)、调用Loop.loop(),进入处理消息的循环。
在startOtherService()方法中:
WatchDog是单实例的运行模式。
WatchDog对象创建后,接下来会调用init()方法进行初始化,
WatchDog中提供的两个方法addThread和addMonitor()方法分别用来增加需要监控的线程和服务。addThread()实际是创建一个和受监控对象关联的HandlerChecker对象。
在WatchDog的构造函数中,把5个公共线程加入到了监控列表中。
主线程
FgThread
UiThread。
IoThread。
DisplayThread。
SystemServer中一些重要的服务拥有专用的线程来处理消息。除了这5个公共线程外,一些重要服务的专用线程也加入到了监控中,包括:
ActivityManagerService的AThread线程。
PackageManagerService的mHandlerThread变量表示的线程。
PowerManagerService的线程。
WindowManagerService的wmHandlerThread变量表示的线程。
如果一个服务需要通过WatchDog来监控,它必须首先实现WatchDog的接口Moniter:
ActivityManagerService
InputManagerService
MediaRouterService。
MountService。
NativeDaemonConnector。
NetworkManagementService。
PowerManagerService
WindowManagerService。
WatchDog运行在一个单独的线程中,如下:
(1)调用scheduleCheckLocked()方法给所有受监控的线程发送消息。
mCompleted的值为true,表面HandlerChecker对象监控的线程或服务正常。否则就有空有问题。是否真有问题还要通过等待的时间是否超过规定时间来判断。
(2)、给受监控的线程发送完消息后,调用wait()方法让WatchDog线程睡眠一段时间。
(3)、逐个检查是否有线程或服务出问题了,一旦发现,马上杀死进程。
通过evaluateCheckerCompletionLocked()方法来检查,如下:
COMPLETED:0,表示状态良好。
WAITING:1,表示正在等待消息处理的结果。
WAITED_HALF:2,表示正在等待并且等待的时间超过了规定时间的一半。
OVERDUE:3,表示等待时间已经超过了规定的时间。
evaluateCheckerCompletionLocked()方法希望知道的是最坏的情况,因此上面代码中使用Math.max()方法来计算出所有监控对象的最坏情况。
OVERDUE状态就会杀死SystemServer进程。
SystemServer是Android系统的核心之一,大部分Android提供的服务都运行在这个进程里,SystemServer中运行的服务总共有60多种。为了防止应用进程对系统造成破坏,Android的应用进程没有权限访问设备的底层资源,只能通过SystemServer中的服务代理访问。通过Binder,用户进程使用systemServer中的服务并没有太多的不便之处。
下面我们重点看下SystemServer的启动过程和它的WatchDog模块。
SystemServer的创建过程
SystemServer的创建可以分成两部分,一部分是在Zygote进程中fork并初始化SystemServer进程,另一部分是执行SystemServer类的main()方法来启动系统的服务。创建SystemServer进程
Init.rc文件中定义的Zygote进程的启动参数包括了“--start-system-server”,因此,在ZygoteInit类的main()方法里会调用startSystemServer()方法,如下:if (startSystemServer) { startSystemServer(abiList, socketName); }
/** * Prepare the arguments and fork for the system server process. */ private static boolean startSystemServer(String abiList, String socketName) throws MethodAndArgsCaller, RuntimeException { long capabilities = posixCapabilitiesAsBits( OsConstants.CAP_BLOCK_SUSPEND, OsConstants.CAP_KILL, OsConstants.CAP_NET_ADMIN, OsConstants.CAP_NET_BIND_SERVICE, OsConstants.CAP_NET_BROADCAST, OsConstants.CAP_NET_RAW, OsConstants.CAP_SYS_MODULE, OsConstants.CAP_SYS_NICE, OsConstants.CAP_SYS_RESOURCE, OsConstants.CAP_SYS_TIME, OsConstants.CAP_SYS_TTY_CONFIG ); /* Hardcoded command line to start the system server */ String args[] = {//指定forkSystemServer所需要的参数 "--setuid=1000",//SYSTEM_UID "--setgid=1000", "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1032,3001,3002,3003,3006,3007", "--capabilities=" + capabilities + "," + capabilities, "--runtime-init", "--nice-name=system_server",//进程的名字 "com.android.server.SystemServer",//systemServer类的位置 }; ZygoteConnection.Arguments parsedArgs = null; int pid; try { parsedArgs = new ZygoteConnection.Arguments(args);//根据指定的参数,生成一个ZygoteConnection.Arguments对象 ZygoteConnection.applyDebuggerSystemProperty(parsedArgs);//根据ro.debuggable属性值,设置parseArgs.debugFlags的值 ZygoteConnection.applyInvokeWithSystemProperty(parsedArgs); /* Request to fork the system server process */ pid = Zygote.forkSystemServer( parsedArgs.uid, parsedArgs.gid, parsedArgs.gids, parsedArgs.debugFlags, null, parsedArgs.permittedCapabilities, parsedArgs.effectiveCapabilities); } catch (IllegalArgumentException ex) { throw new RuntimeException(ex); } /* For child process */ if (pid == 0) { if (hasSecondZygote(abiList)) { waitForSecondaryZygote(socketName); } handleSystemServerProcess(parsedArgs); } return true; }在startSystemServer()方法中,主要做了3件事,一是为SystemServer准备启动参数,从参数可以看到:SystemServer的进程Id和组Id都被指定为1000.SystemServer的执行类是com.android.server.SystemServer。二是调用Zygote类的forkSystemServer()来fork出SystemServer子进程。Zygote会检查SystemServer是否启动成功,如果不成功,Zygote进程会自动退出,重新启动一遍。三是fork出systemServer后,调用handleSystemServerProcess()来初始化SystemServer进程。如下:
/** * Finish remaining work for the newly forked system server process. */ private static void handleSystemServerProcess( ZygoteConnection.Arguments parsedArgs) throws ZygoteInit.MethodAndArgsCaller { closeServerSocket();//Close and clean up zygote sockets. // set umask to 0077 so new files and directories will default to owner-only permissions. Os.umask(S_IRWXG | S_IRWXO); if (parsedArgs.niceName != null) { Process.setArgV0(parsedArgs.niceName);//修改进程名 } final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH"); if (systemServerClasspath != null) { performSystemServerDexOpt(systemServerClasspath); } if (parsedArgs.invokeWith != null) {//invokeWith通常为空 String[] args = parsedArgs.remainingArgs; // If we have a non-null system server class path, we'll have to duplicate the // existing arguments and append the classpath to it. ART will handle the classpath // correctly when we exec a new process. if (systemServerClasspath != null) { String[] amendedArgs = new String[args.length + 2]; amendedArgs[0] = "-cp"; amendedArgs[1] = systemServerClasspath; System.arraycopy(parsedArgs.remainingArgs, 0, amendedArgs, 2, parsedArgs.remainingArgs.length); } WrapperInit.execApplication(parsedArgs.invokeWith, parsedArgs.niceName, parsedArgs.targetSdkVersion, null, args); } else { ClassLoader cl = null; if (systemServerClasspath != null) { cl = new PathClassLoader(systemServerClasspath, ClassLoader.getSystemClassLoader()); Thread.currentThread().setContextClassLoader(cl); } /* * Pass the remaining arguments to SystemServer. */ RuntimeInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl); } /* should never reach here */ }handleSystemServerProcess()方法首先关闭了从zygote继承来的socket,接着将systemServer进程的umask设为0077(S_IRWXG|S_IRWXO),这样systemServer创建的文件的属性就是0700,只有systemServer进程可以访问。同时,进程名称修改为system_server。因为参数invokeWith通常为null,还是通过RuntimeInit的zygoteInit()方法来调用systemServer类的mian()方法。
SystemServer初始化
systemServer是java类,入口是mian()方法,如下:/** * The main entry point from zygote. */ public static void main(String[] args) { new SystemServer().run(); } public SystemServer() { // Check for factory test mode. mFactoryTestMode = FactoryTest.getMode(); } private void run() { // If a device's clock is before 1970 (before 0), a lot of // APIs crash dealing with negative numbers, notably // java.io.File#setLastModified, so instead we fake it and // hope that time from cell towers or NTP fixes it shortly. if (System.currentTimeMillis() < EARLIEST_SUPPORTED_TIME) { Slog.w(TAG, "System clock is before 1970; setting to 1970."); SystemClock.setCurrentTimeMillis(EARLIEST_SUPPORTED_TIME); }//调整系统时间 // Here we go! Slog.i(TAG, "Entered the Android system server!"); EventLog.writeEvent(EventLogTags.BOOT_PROGRESS_SYSTEM_RUN, SystemClock.uptimeMillis()); // In case the runtime switched since last boot (such as when // the old runtime was removed in an OTA), set the system // property so that it is in sync. We can't do this in // libnativehelper's JniInvocation::Init code where we already // had to fallback to a different runtime because it is // running as root and we need to be the system user to set // the property. http://b/11463182 SystemProperties.set("persist.sys.dalvik.vm.lib.2", VMRuntime.getRuntime().vmLibrary());//设置当前虚拟机的运行库路径 // Enable the sampling profiler. if (SamplingProfilerIntegration.isEnabled()) { SamplingProfilerIntegration.start(); mProfilerSnapshotTimer = new Timer(); mProfilerSnapshotTimer.schedule(new TimerTask() { @Override public void run() { SamplingProfilerIntegration.writeSnapshot("system_server", null); } }, SNAPSHOT_INTERVAL, SNAPSHOT_INTERVAL); } // Mmmmmm... more memory! VMRuntime.getRuntime().clearGrowthLimit(); // The system server has to run all of the time, so it needs to be // as efficient as possible with its memory usage. VMRuntime.getRuntime().setTargetHeapUtilization(0.8f); // Some devices rely on runtime fingerprint generation, so make sure // we've defined it before booting further. Build.ensureFingerprintProperty(); // Within the system server, it is an error to access Environment paths without // explicitly specifying a user. Environment.setUserRequired(true); // Ensure binder calls into the system always run at foreground priority. BinderInternal.disableBackgroundScheduling(true); // Prepare the main looper thread (this thread). android.os.Process.setThreadPriority( android.os.Process.THREAD_PRIORITY_FOREGROUND); android.os.Process.setCanSelfBackground(false); Looper.prepareMainLooper(); // Initialize native services. System.loadLibrary("android_servers");//装载libandroid_servers.so nativeInit();//启动native层的service,Called to initialize native system services. // Check whether we failed to shut down last time we tried. // This call may not return. performPendingShutdown(); // Initialize the system context. createSystemContext(); // Create the system service manager. mSystemServiceManager = new SystemServiceManager(mSystemContext); LocalServices.addService(SystemServiceManager.class, mSystemServiceManager); // Start services. try {//启动所有的java service startBootstrapServices(); startCoreServices(); startOtherServices(); } catch (Throwable ex) { Slog.e("System", "******************************************"); Slog.e("System", "************ Failure starting system services", ex); // throw ex; } // For debug builds, log event loop stalls to dropbox for analysis. if (StrictMode.conditionallyEnableDebugLogging()) { Slog.i(TAG, "Enabled StrictMode for system server main thread."); } // Loop forever. Looper.loop(); throw new RuntimeException("Main thread loop unexpectedly exited"); }main()方法的主要作用是:
(1)、设置属性persist.sys.dalvik.vm.lib.2的值为当前虚拟机的运行库路径。
(2)、调整时间。
(3)、调整虚拟机堆的内存。设定虚拟机堆利用率为0.8,当实际的使用率偏离设定的比率时,虚拟机在垃圾回收的时候将调整堆的大小,使实际使用率接近设定的百分比。
(4)、装载libandroid_servers.so库。这个库的源文件位于frameworks\base\services\core\jni。
(5)、调用nativeInit()方法初始化native层的Binder服务。这里native层的服务只有SensorService。
对应的native层函数是android_server_SystemServer_nativeInit(),如下:
static void android_server_SystemServer_nativeInit(JNIEnv* env, jobject clazz) { char propBuf[PROPERTY_VALUE_MAX]; property_get("system_init.startsensorservice", propBuf, "1"); if (strcmp(propBuf, "1") == 0) { // Start the sensor service SensorService::instantiate(); } }如果属性system_init.startsensorservice设置为1,就启动SensorService服务。SensorService提供的是各种传感器的服务,也是SystemServer中唯一的本地服务。
(6)、调用createSystemContext()来获取Context,相当于创建了一个framework-res.apk的上下文环境。
(7)、创建SystemServiceManager对象mSystemServiceManager。这个对象负责系统Service的启动。
(8)、startBootstrapServices()、startCoreServices()、startOtherServices()创建并运行所有java服务。
(9)、调用Loop.loop(),进入处理消息的循环。
SystemServer中的Watchdog
Android开发了WatchDog类作为软件看门狗来监控SystemServer进程,一旦发现问题,WatchDog会杀死SystemServer进程。SystemServer的父进程Zygote接收到SystemServer的死亡信号后,会杀死自己。Zygote进程的死亡信号传递到Init进程后,Init进程会杀死Zygote进程所有子进程并重启Zygote。这样整个手机相当于重启一遍。启动WatchDog
在SystemServer中创建WatchDog对象,如下:在startOtherService()方法中:
Slog.i(TAG, "Init Watchdog"); final Watchdog watchdog = Watchdog.getInstance(); watchdog.init(context, mActivityManagerService);
WatchDog是单实例的运行模式。
public static Watchdog getInstance() { if (sWatchdog == null) { sWatchdog = new Watchdog(); } return sWatchdog; } private Watchdog() { super("watchdog"); // Initialize handler checkers for each common thread we want to check. Note // that we are not currently checking the background thread, since it can // potentially hold longer running operations with no guarantees about the timeliness // of operations there. // The shared foreground thread is the main checker. It is where we // will also dispatch monitor checks and do other work. mMonitorChecker = new HandlerChecker(FgThread.getHandler(), "foreground thread", DEFAULT_TIMEOUT); mHandlerCheckers.add(mMonitorChecker); // Add checker for main thread. We only do a quick check since there // can be UI running on the thread. mHandlerCheckers.add(new HandlerChecker(new Handler(Looper.getMainLooper()), "main thread", DEFAULT_TIMEOUT)); // Add checker for shared UI thread. mHandlerCheckers.add(new HandlerChecker(UiThread.getHandler(), "ui thread", DEFAULT_TIMEOUT)); // And also check IO thread. mHandlerCheckers.add(new HandlerChecker(IoThread.getHandler(), "i/o thread", DEFAULT_TIMEOUT)); // And the display thread. mHandlerCheckers.add(new HandlerChecker(DisplayThread.getHandler(), "display thread", DEFAULT_TIMEOUT)); }WatchDog的构造方法主要工作是创建几个HandlerChecker对象,并把他们保存到数组列表mHandlerCheckers中。每个HandlerChecker对象对应一个被监控的线程。HandlerCHecker类从Handler类派生,在构造时就和被监控的线程关联在一起。
WatchDog对象创建后,接下来会调用init()方法进行初始化,
public void init(Context context, ActivityManagerService activity) { mResolver = context.getContentResolver(); mActivity = activity; context.registerReceiver(new RebootRequestReceiver(), new IntentFilter(Intent.ACTION_REBOOT), android.Manifest.permission.REBOOT, null); }注册了RebootRequestReceiver,监听重启的intent:ACTION_REBOOT。
WatchDog监控的服务和线程
WatchDog主要监控线程。如果一个线程陷入了死循环或者和其他的线程相互死锁了,WatchDog需要有办法识别出它们。WatchDog中提供的两个方法addThread和addMonitor()方法分别用来增加需要监控的线程和服务。addThread()实际是创建一个和受监控对象关联的HandlerChecker对象。
public void addThread(Handler thread) { addThread(thread, DEFAULT_TIMEOUT); } public void addThread(Handler thread, long timeoutMillis) { synchronized (this) { if (isAlive()) { throw new RuntimeException("Threads can't be added once the Watchdog is running"); } final String name = thread.getLooper().getThread().getName(); mHandlerCheckers.add(new HandlerChecker(thread, name, timeoutMillis)); } }对服务的监控也是由HandlerChecker对象来完成。一个HandlerChecker对象就可以检查所有服务。因此,WatchDog让mMonitorChecker对象来完成这个任务。
public void addMonitor(Monitor monitor) { synchronized (this) { if (isAlive()) { throw new RuntimeException("Monitors can't be added once the Watchdog is running"); } mMonitorChecker.addMonitor(monitor); } }WatchDog的addMonitor()方法只是调用了mMonitorChecker对象的addMonitor方法。
在WatchDog的构造函数中,把5个公共线程加入到了监控列表中。
主线程
FgThread
UiThread。
IoThread。
DisplayThread。
SystemServer中一些重要的服务拥有专用的线程来处理消息。除了这5个公共线程外,一些重要服务的专用线程也加入到了监控中,包括:
ActivityManagerService的AThread线程。
PackageManagerService的mHandlerThread变量表示的线程。
PowerManagerService的线程。
WindowManagerService的wmHandlerThread变量表示的线程。
如果一个服务需要通过WatchDog来监控,它必须首先实现WatchDog的接口Moniter:
public interface Monitor { void monitor(); }然后还要调用WatchDog类的addMonitor()方法把它自己加入到WatchDog的服务监控列表中。在systemServer中实现了Moniter接口并调用了addMonitor()方法的服务有。
ActivityManagerService
InputManagerService
MediaRouterService。
MountService。
NativeDaemonConnector。
NetworkManagementService。
PowerManagerService
WindowManagerService。
WatchDog监控的原理
通过给现场发送消息可以判断一个线程是否运行正常,如果发送的消息不能在规定的时间内得到处理,就表明线程被不正常占用了。WatchDog运行在一个单独的线程中,如下:
public void run() { boolean waitedHalf = false; while (true) { final ArrayList<HandlerChecker> blockedCheckers; final String subject; final boolean allowRestart; int debuggerWasConnected = 0; synchronized (this) { long timeout = CHECK_INTERVAL; // Make sure we (re)spin the checkers that have become idle within // this wait-and-check interval给监控的线程发送消息 for (int i=0; i<mHandlerCheckers.size(); i++) { HandlerChecker hc = mHandlerCheckers.get(i); hc.scheduleCheckLocked(); } if (debuggerWasConnected > 0) { debuggerWasConnected--; } // NOTE: We use uptimeMillis() here because we do not want to increment the time we // wait while asleep. If the device is asleep then the thing that we are waiting // to timeout on is asleep as well and won't have a chance to run, causing a false // positive on when to kill things. long start = SystemClock.uptimeMillis(); while (timeout > 0) {//睡眠一段时间 if (Debug.isDebuggerConnected()) { debuggerWasConnected = 2; } try { wait(timeout); } catch (InterruptedException e) { Log.wtf(TAG, e); } if (Debug.isDebuggerConnected()) { debuggerWasConnected = 2; } timeout = CHECK_INTERVAL - (SystemClock.uptimeMillis() - start); } //检查是否有线程或服务出问题 final int waitState = evaluateCheckerCompletionLocked(); if (waitState == COMPLETED) { // The monitors have returned; reset waitedHalf = false; continue; } else if (waitState == WAITING) { // still waiting but within their configured intervals; back off and recheck continue; } else if (waitState == WAITED_HALF) { if (!waitedHalf) { // We've waited half the deadlock-detection interval. Pull a stack // trace and wait another half. ArrayList<Integer> pids = new ArrayList<Integer>(); pids.add(Process.myPid()); ActivityManagerService.dumpStackTraces(true, pids, null, null, NATIVE_STACKS_OF_INTEREST); waitedHalf = true; } continue; } // something is overdue! blockedCheckers = getBlockedCheckersLocked(); subject = describeCheckersLocked(blockedCheckers); allowRestart = mAllowRestart; } // If we got here, that means that the system is most likely hung. // First collect stack traces from all threads of the system process. // Then kill this process so that the system will restart. EventLog.writeEvent(EventLogTags.WATCHDOG, subject); ArrayList<Integer> pids = new ArrayList<Integer>(); pids.add(Process.myPid()); if (mPhonePid > 0) pids.add(mPhonePid); // Pass !waitedHalf so that just in case we somehow wind up here without having // dumped the halfway stacks, we properly re-initialize the trace file. final File stack = ActivityManagerService.dumpStackTraces( !waitedHalf, pids, null, null, NATIVE_STACKS_OF_INTEREST); // Give some extra time to make sure the stack traces get written. // The system's been hanging for a minute, another second or two won't hurt much. SystemClock.sleep(2000); // Pull our own kernel thread stacks as well if we're configured for that if (RECORD_KERNEL_THREADS) { dumpKernelStackTraces(); } // Trigger the kernel to dump all blocked threads, and backtraces on all CPUs to the kernel log doSysRq('w'); doSysRq('l'); String tracesPath = SystemProperties.get("dalvik.vm.stack-trace-file", null); String traceFileNameAmendment = "_SystemServer_WDT" + mTraceDateFormat.format(new Date()); if (tracesPath != null && tracesPath.length() != 0) { File traceRenameFile = new File(tracesPath); String newTracesPath; int lpos = tracesPath.lastIndexOf ("."); if (-1 != lpos) newTracesPath = tracesPath.substring (0, lpos) + traceFileNameAmendment + tracesPath.substring (lpos); else newTracesPath = tracesPath + traceFileNameAmendment; traceRenameFile.renameTo(new File(newTracesPath)); tracesPath = newTracesPath; } final File newFd = new File(tracesPath); // Try to add the error to the dropbox, but assuming that the ActivityManager // itself may be deadlocked. (which has happened, causing this statement to // deadlock and the watchdog as a whole to be ineffective) Thread dropboxThread = new Thread("watchdogWriteToDropbox") { public void run() { mActivity.addErrorToDropBox( "watchdog", null, "system_server", null, null, subject, null, newFd, null); } }; dropboxThread.start(); try { dropboxThread.join(2000); // wait up to 2 seconds for it to return. } catch (InterruptedException ignored) {} // At times, when user space watchdog traces don't give an indication on // which component held a lock, because of which other threads are blocked, // (thereby causing Watchdog), crash the device to analyze RAM dumps boolean crashOnWatchdog = SystemProperties .getBoolean("persist.sys.crashOnWatchdog", false); if (crashOnWatchdog) { // wait until the above blocked threads be dumped into kernel log SystemClock.sleep(3000); // now try to crash the target try { FileWriter sysrq_trigger = new FileWriter("/proc/sysrq-trigger"); sysrq_trigger.write("c"); sysrq_trigger.close(); } catch (IOException e) { Slog.e(TAG, "Failed to write 'c' to /proc/sysrq-trigger"); Slog.e(TAG, e.getMessage()); } } IActivityController controller; synchronized (this) { controller = mController; } if (controller != null) { Slog.i(TAG, "Reporting stuck state to activity controller"); try { Binder.setDumpDisabled("Service dumps disabled due to hung system process."); // 1 = keep waiting, -1 = kill system int res = controller.systemNotResponding(subject); if (res >= 0) { Slog.i(TAG, "Activity controller requested to coninue to wait"); waitedHalf = false; continue; } } catch (RemoteException e) { } } // Only kill the process if the debugger is not attached. if (Debug.isDebuggerConnected()) { debuggerWasConnected = 2; } if (debuggerWasConnected >= 2) { Slog.w(TAG, "Debugger connected: Watchdog is *not* killing the system process"); } else if (debuggerWasConnected > 0) { Slog.w(TAG, "Debugger was connected: Watchdog is *not* killing the system process"); } else if (!allowRestart) { Slog.w(TAG, "Restart not allowed: Watchdog is *not* killing the system process"); } else { Slog.w(TAG, "*** WATCHDOG KILLING SYSTEM PROCESS: " + subject); for (int i=0; i<blockedCheckers.size(); i++) { Slog.w(TAG, blockedCheckers.get(i).getName() + " stack trace:"); StackTraceElement[] stackTrace = blockedCheckers.get(i).getThread().getStackTrace(); for (StackTraceElement element: stackTrace) { Slog.w(TAG, " at " + element); } } Slog.w(TAG, "*** GOODBYE!"); Process.killProcess(Process.myPid()); System.exit(10); } waitedHalf = false; } }run()方法中有一个无限循环,每次循环主要做3件事。
(1)调用scheduleCheckLocked()方法给所有受监控的线程发送消息。
public void scheduleCheckLocked() { if (mMonitors.size() == 0 && mHandler.getLooper().isIdling()) { // If the target looper is or just recently was idling, then // there is no reason to enqueue our checker on it since that // is as good as it not being deadlocked. This avoid having // to do a context switch to check the thread. Note that we // only do this if mCheckReboot is false and we have no // monitors, since those would need to be executed at this point. mCompleted = true; return; } if (!mCompleted) { // we already have a check in flight, so no need return; } mCompleted = false; mCurrentMonitor = null; mStartTime = SystemClock.uptimeMillis(); mHandler.postAtFrontOfQueue(this); }HandlerCHecker对象既要监控服务,又要监控某个线程;因此先判断mMonitors的size是否为0,如果为0,说明这个HandlerChecker没有监控服务,这时如果被监控线程的消息队列处于空闲状态(isIding()方法判断),则说明线程运行良好,把mCompleted设为true后就可以返回了。否则先把mCompleted设为false,并记录消息开始发送的时间,最后调用postAtFrontOfQueue()方法给被监控的线程发送一个消息。这个消息的处理方法是HandlerChecker类的run()方法,如下:
public void run() { final int size = mMonitors.size(); for (int i = 0 ; i < size ; i++) { synchronized (Watchdog.this) { mCurrentMonitor = mMonitors.get(i); } mCurrentMonitor.monitor(); } synchronized (Watchdog.this) { mCompleted = true; mCurrentMonitor = null; } }如果消息处理方法run()能够执行,说明受监控的线程本身没有问题。但是,还要检查被监控服务的状态。检查是通过调用服务中实现的monitor()方法来完成的。通常monitor()方法的实现是获取服务中的锁,如果不能得到,线程就会挂起,这样mCompleted的值就不能被设为true。
mCompleted的值为true,表面HandlerChecker对象监控的线程或服务正常。否则就有空有问题。是否真有问题还要通过等待的时间是否超过规定时间来判断。
(2)、给受监控的线程发送完消息后,调用wait()方法让WatchDog线程睡眠一段时间。
(3)、逐个检查是否有线程或服务出问题了,一旦发现,马上杀死进程。
通过evaluateCheckerCompletionLocked()方法来检查,如下:
private int evaluateCheckerCompletionLocked() { int state = COMPLETED; for (int i=0; i<mHandlerCheckers.size(); i++) { HandlerChecker hc = mHandlerCheckers.get(i); state = Math.max(state, hc.getCompletionStateLocked()); } return state; }evaluateCheckerCompletionLocked()通过调用每个检查对象的getCompletionStatelocked()方法来得到对象的状态值。状态值有4种。
COMPLETED:0,表示状态良好。
WAITING:1,表示正在等待消息处理的结果。
WAITED_HALF:2,表示正在等待并且等待的时间超过了规定时间的一半。
OVERDUE:3,表示等待时间已经超过了规定的时间。
evaluateCheckerCompletionLocked()方法希望知道的是最坏的情况,因此上面代码中使用Math.max()方法来计算出所有监控对象的最坏情况。
OVERDUE状态就会杀死SystemServer进程。
相关文章推荐
- (转载)了解Android 4.1,之三:黄油项目 —— 运作机理及新鲜玩意
- 安卓开发资料
- 在AndroidStudio中使用Lambda表达式
- Android下拦截、监听返回键和home键
- android 资源网址总结
- Android系统自带样式(android:theme)
- android studio下 SHA1的获取
- Android弹出对话框
- 倍数提高工作效率的 Android Studio 奇技
- android 日期弹窗
- 运算符
- Android studio安装配置
- Android_01_按钮点击事件的四种写法
- 深入理解Android的startservice和bindservice
- android Android大图片裁剪终极解决方案
- android attrs.xml declare-styleable 属性
- android adb被占用解决办法 ADB server didn't ACK
- 中科院开源协会镜像站 Android SDK镜像测试发布
- Android_01_电话拨号器
- 两分钟彻底让你明白Android Activity生命周期(图文)!