A Systematical Study on APM Libraries for Android

Selected APMs.

1~5: Tingyun, BaiduAPM, UMeng, Mobile Tencent, OpenInstall

6~10: New Relic, AppDynamics, OneAPM, GrowingIO, Google Firebase

11~15: Dynatrace, Site24x7, AppPulse Mobile, CA Mobile, Apteligent

16~20: Flurry, AppsFlyer, Yandex Metrica, Adjust, Ironsource

21~25: Countly, Sentry, AndroidGodEye, BlockCanary, ArgusAPM


Subject Apps.

The list of Android apps we used can be found here. Because of storage restrictions, we show only the package names and metadata of the apps instead of providing all APKs.

Results.

Details of the apps can be found here:

https://drive.google.com/open?id=1WKHlMtdkD801wpxLClqfNOCqkLTGuWtQ

500 thousand apps in total.

Which APMs support privacy deployment?

Data can be found here:

https://drive.google.com/open?id=1uorUXdgvIqUptppKBVNDSua0cObX1beQbtSjrLJLgQs

Is data encrypted during transmission? Data can be found here:

https://drive.google.com/open?id=1uqH3LQ8Y534TY2PO77v905Al99qyB9DNBQpegCKr-Tg

Upload interval data can be found here

https://drive.google.com/open?id=1QqcOwn9CyAwTfQdD_UFBWthB3s2rodM0qY2m0pno5J0

The data for privacy policy can be found here

https://drive.google.com/open?id=1pFtYhts1_XGb9d0NU8yMm5lyg3ZfwQ-z

(For the privacy policy data, only apps that both have a privacy policy and use APMs are collected.)

Tool.

The tool will be available here after paper notification.

Supplementary

  1. OkHttp:

In practice, some apps use a popular HTTP client named OkHttp to build network requests. OkHttp provides interceptors that let developers monitor, rewrite, and retry calls (i.e., network requests); for example, an interceptor can compute the time spent obtaining a response. Interceptors are registered either as application interceptors or as network interceptors. Application interceptors capture the communication between the app and the OkHttp client, whereas network interceptors capture the communication between the OkHttp client and the server. APMs leverage these interceptors to capture network requests. Application interceptors are registered by calling addInterceptor(), and network interceptors are registered by calling addNetworkInterceptor().

  2. Capturing Crashes in Native Code

In Android, developers can use C/C++ to develop the native part of an app (e.g., the native library) or develop the entire app with native code. Therefore, crashes can happen in native code. APMs handle native crashes with the following steps: installing a signal handler, extracting the stack traces, and building the symbol files.

Installing a signal handler. When a crash occurs in native code, an error signal is generated [35], [36]. Hence, APMs can capture the crash by using sigaction() to install a signal handler, which captures and processes the error signal. Fig. 2 shows an example of registering an error signal handler. In Fig. 2, the sigaction() call allows the calling process to specify an action to be performed for a specific signal. That is, when the signal SIGSEGV is captured, the function signal_handle is triggered.


Extracting the stack traces. Recall the sample code in Fig. 2. In method init(), handler, an instance of the structure sigaction, is defined to deal with the error signal. To specify the signal processing method, the field sa_sigaction of handler is set to signal_handle. Note that vcontext, the last parameter of signal_handle, represents the thread context. After sigaction() is invoked, an error signal raised in native code is handled by signal_handle.

In this case, signal_handle() processes the received signal. The signal handler receives the signal number, information about the signal delivery, and the thread context. The thread context (void *vcontext) is a pointer to a ucontext_t structure, which holds the saved machine context (e.g., the register values) of the interrupted thread. Therefore, it is possible to build a rudimentary profiling infrastructure from timing and machine context information. In the sigaction structure, the field sa_sigaction sets the handling method (i.e., signal_handle), and the call to sigaction() binds it to the signal value.
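
To illustrate how a handler might read the thread context, the sketch below casts vcontext to ucontext_t and extracts the saved program counter. The names pc_from_context, record_pc_handler, install_pc_recorder, and crash_pc are hypothetical and do not come from any APM. The register field differs per architecture, so only x86-64, AArch64, and 32-bit ARM on Linux-style C libraries are covered, and the code is an assumption-laden sketch rather than how a particular APM does it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc exposes REG_RIP only if this precedes libc headers */
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <ucontext.h>

#if defined(__x86_64__) && !defined(REG_RIP)
#define REG_RIP 16 /* fallback: glibc's gregs index for RIP if the name is hidden */
#endif

/* Hypothetical global: program counter captured by the handler (0 until set). */
volatile uintptr_t crash_pc = 0;

/* Read the saved PC from the thread context; returns 0 on unsupported targets. */
uintptr_t pc_from_context(void *vcontext) {
    const ucontext_t *uc = (const ucontext_t *)vcontext;
#if defined(__x86_64__)
    return (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__aarch64__)
    return (uintptr_t)uc->uc_mcontext.pc;
#elif defined(__arm__)
    return (uintptr_t)uc->uc_mcontext.arm_pc;
#else
    (void)uc;
    return 0;
#endif
}

/* SA_SIGINFO handler that stores the PC of the interrupted thread. */
void record_pc_handler(int signo, siginfo_t *info, void *vcontext) {
    (void)signo;
    (void)info;
    crash_pc = pc_from_context(vcontext);
}

/* Install record_pc_handler for SIGSEGV. Returns 0 on success, -1 on failure. */
int install_pc_recorder(void) {
    struct sigaction handler;
    memset(&handler, 0, sizeof(handler));
    sigemptyset(&handler.sa_mask);
    handler.sa_sigaction = record_pc_handler;
    handler.sa_flags = SA_SIGINFO;
    return sigaction(SIGSEGV, &handler, NULL);
}
```

When the signal is produced with raise(SIGSEGV) instead of a real fault, the recorded PC points into the C library code that delivered the signal, which is still enough to confirm that the context is being read.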

After receiving the signal, the APM copies the crashed process to a daemon process, which shares an address space with the crashed process. Because the subsequent work is carried out in this daemon process rather than in the crashed process itself, the APM can collect information from the crashed process only through ptrace. Using ptrace, it traces the crashed process and records the status of the current thread, the loaded executable libraries, and other information into a minidump.

Program counter and relative memory location. The thread context (*vcontext) contains information about the signal context, including the PC (program counter) at the crash point. However, the PC is an absolute address, which cannot be used directly. Therefore, APMs inspect the /proc/self/maps file to get the memory range of each program segment. The crashed method can then be located by checking the PC against the range and offset of each program segment. At this point, it is possible to recover the crash point in the program. However, this information is expressed in terms of machine code, which is not human readable. The next step is to build the symbol file and translate the crash information from machine code into human-readable stack traces.
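
The lookup described above can be sketched as follows, assuming a Linux-style /proc/self/maps in which each line has the form "start-end perms offset dev inode [path]". The struct module_hit and the function find_module are hypothetical names introduced for illustration. The sketch uses stdio for brevity; these calls are not async-signal-safe, so a real crash reporter would presumably run such a lookup outside the signal handler or in a separate process.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result type: where an absolute address falls in a mapping. */
struct module_hit {
    uintptr_t start;       /* start address of the matching segment */
    uintptr_t file_offset; /* offset of the address inside the mapped file */
    char path[256];        /* backing file; empty for anonymous mappings */
};

/* Scan /proc/self/maps for the segment that contains pc.
 * Returns 0 and fills *hit on success, -1 if no segment matches. */
int find_module(uintptr_t pc, struct module_hit *hit) {
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL) {
        return -1;
    }

    char line[512];
    int found = -1;
    while (fgets(line, sizeof(line), maps) != NULL) {
        uintptr_t start, end, offset;
        char perms[5];
        char path[256] = "";
        /* dev and inode are skipped; path may be absent (then n == 4).
         * Paths containing spaces are truncated at the first space. */
        int n = sscanf(line,
                       "%" SCNxPTR "-%" SCNxPTR " %4s %" SCNxPTR " %*s %*s %255s",
                       &start, &end, perms, &offset, path);
        if (n < 4) {
            continue; /* not a well-formed mapping line */
        }
        if (pc >= start && pc < end) {
            hit->start = start;
            /* Relative location: distance into the segment plus the segment's
             * own offset within the file. */
            hit->file_offset = (pc - start) + offset;
            snprintf(hit->path, sizeof(hit->path), "%s", path);
            found = 0;
            break;
        }
    }
    fclose(maps);
    return found;
}
```

For example, passing the address of find_module itself returns the mapping of the running executable, and the resulting file_offset is the kind of relative value that can later be matched against a symbol file.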

Symbol file. Even though the saved context at the crash point can be obtained from an instance of ucontext_t, it is still a machine-specific representation. Therefore, to generate human-readable stack traces, symbol files must be extracted as well, because the records in a symbol file provide the mapping between machine code and source code.

Constructing human-readable stack traces. With the symbol files and the recovered PC, APMs can translate the machine code into human-readable stack traces. Interested readers are referred to Google’s Breakpad [37] framework for more details. Breakpad is a mature, open-source library for debugging and analyzing crashes in C/C++ programs. It is used in most commercial APMs (e.g., UMeng, Tingyun, Sentry). With the constructed human-readable stack traces, developers can fix the bugs in the native code.