NDroid: Towards Tracking Information Flows Across Multiple Android Contexts
The use of native code alongside Java code creates bidirectional data flows through multiple contexts, i.e., the Java context and the native context, in Android apps. Unfortunately, this interaction poses serious challenges to existing dynamic analysis systems, which fail to capture data flows across different contexts. In this paper, we first perform a large-scale study of apps that use native code and report our observations. We then identify several scenarios in which data flows cannot be tracked by existing systems, leading to undetected information leakage. Based on these insights, we design and implement NDroid, an efficient dynamic taint analysis system that tracks data flows between the Java context and the native context. An evaluation on real apps demonstrates the effectiveness of NDroid in identifying information leakage with reasonable performance overhead.
The popularity of the Android platform is evident from the tremendous number of activated devices and available apps. As of May 2017, around 72.68% of smartphones run Android [1]. At the same time, for better performance and compatibility with legacy code, developers tend to use native code in their apps and interface with Java code through the JNI bridge. Since Android 2.3, developers can even create an entire app using native code.

Recent years have witnessed a considerable increase in the number of apps using native libraries. For example, among 204,040 applications collected from several markets in May-June 2011, Zhou et al. observed that 4.52% used native code [2]. This percentage increased to 9.42% among 118,318 apps collected by the same authors in September-October 2011 [3]. The trend is further confirmed by the finding that 24% of apps crawled from Asian third-party mobile markets contain native code [4].

However, the popularity of native code in apps poses serious challenges to existing dynamic analysis systems. First, although there are many systems for analyzing apps or detecting malware [2], [3], [5], only a few of them inspect the native libraries in apps [6], [7], and none of them scrutinizes the interactions between an app's Java code and native code. This leaves a security loophole that malware could abuse to evade detection.
Stack Structure. TaintDroid modifies the DVM's stack structure, increasing the stack size to store the taint labels associated with registers. For a method invocation, TaintDroid first stores the taint labels interleaved with the parameters in the outs area of the current stack frame. It then allocates stack slots for the callee's local variables and points the frame pointer at the new method's first local variable. After that, TaintDroid allocates a StackSaveArea on top of the stack to save the caller's information.

When a method returns, TaintDroid saves the return value's taint label into the current thread's InterpSaveState. If the target is a native method, TaintDroid stores both the parameters' taint labels and the return value's taint label, which is appended to the parameters. Because native code cannot directly access the return value's taint label, the JNI Call Bridge sets it according to TaintDroid's taint propagation policy. The return value's taint label is also copied to the current thread's InterpSaveState after the native method returns.

Taint Storage. For ArrayObject and for StringObject, which contains an array of chars, TaintDroid sets a taint label in the array object. For class static fields and class instance fields, the taint labels are stored interleaved with the variables in the Class's or Object's instance data area. For other Java objects, TaintDroid only keeps the taint label of their references.

Taint Propagation. The taint propagation policy is a set of rules that define when and how taint should be propagated. TaintDroid adds taints to the sources of sensitive information (GPS data, SMS messages, IMSI, IMEI, etc.) on an Android device.
Taint labels in TaintDroid are represented as 32-bit integers. Each bit of a taint label indicates one type of sensitive information, and different types of sensitive information are combined by taking the union of their taint labels. TaintDroid tracks the taints of primitive-type variables and object references according to the logic of each DVM instruction. When a native method is called, TaintDroid adopts the taint propagation policy that the return value is tainted if any parameter is tainted.
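The label encoding and the native-call policy can be illustrated with a minimal Python sketch. The bit assignments and the function names `combine` and `native_return_taint` are hypothetical, chosen only for illustration; the text does not give TaintDroid's actual constants. The sketch also assumes that the return value's label is the union of the parameter labels, which is one way to satisfy "tainted if any parameter is tainted".

```python
from functools import reduce

# Hypothetical bit assignments (one bit per type of sensitive information).
# TaintDroid's real constants are not given in the text.
TAINT_CLEAR = 0x0
TAINT_GPS = 0x1
TAINT_SMS = 0x2
TAINT_IMSI = 0x4
TAINT_IMEI = 0x8

MASK_32 = 0xFFFFFFFF  # taint labels are 32-bit integers


def combine(*labels):
    """Union of taint labels: bitwise OR, truncated to 32 bits."""
    return reduce(lambda a, b: a | b, labels, TAINT_CLEAR) & MASK_32


def native_return_taint(param_taints):
    """Label for a native method's return value under the conservative
    policy: tainted if any parameter is tainted. Modelled here (as an
    assumption) as the union of all parameter labels."""
    return combine(*param_taints)


if __name__ == "__main__":
    label = native_return_taint([TAINT_CLEAR, TAINT_GPS, TAINT_IMEI])
    print(hex(label))
```

Because the policy looks only at parameters, it cannot see what the native code actually does with the data, which is the limitation that motivates tracking flows inside the native context.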
A. Instrumentation Manager

When an app sends sensitive data to its own native code by invoking native methods, the data first passes through the JNI bridge before entering the native code, which then handles the data and possibly invokes system library calls. Therefore, the JNI bridge, the app's third-party native libraries, and the system libraries must be instrumented in order to trace information flows through JNI.

As shown in Fig. 5, for an app's own native code (i.e., libNDroidDemo.so), the instrumentation manager instruments it at two different levels: (1) the basic block level (indicated by the BLOCK_END arrow): if a block of code ends by invoking a system library method or a JNI API call, we instrument the end of the block; (2) the instruction level (indicated by the INSN_BEGIN arrow): each instruction is instrumented before it is executed. By doing so, whenever an app's native code calls the system library methods and JNI APIs we are interested in (e.g., open(), NewObject(), etc.), we can conduct analysis before and after they are invoked. Note that system libraries and the JNI bridge are not instrumented all the time; we instrument them only when they are used by an app's own native code. However, certain methods related to the JNI bridge (e.g., dvmCallJniMethod(), dvmAllocObject(), etc.) are instrumented at the beginning of their first basic block and at the end of their last basic block. Details about these methods are discussed in Section V-B.

Fig. 5. Instrumentation Manager. (Figure labels: libNDroidDemo.so, libc.so, libdvm.so, fopen, NewObject, dvmCallJniMethod, dvmAllocObject; arrows: INSN_BEGIN, BLOCK_BEGIN, BLOCK_END, CALL, RETURN.)

It is necessary to know the offsets of the methods that need instrumentation. Since calculating those offsets manually is time-consuming, we prepare scripts that disassemble libraries (e.g., libc.so, libm.so, libdvm.so, etc.), extract the offsets, and generate template code for the handlers described in the following subsections.
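The two instrumentation levels described in this section can be pictured with a small Python simulation. This is only a sketch under stated assumptions, not NDroid's implementation: `BasicBlock`, `InstrumentationManager`, `WATCHED_CALLS`, and the hook names are hypothetical, and the "instructions" are plain strings rather than real machine code.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical set of calls the analysis is interested in
# (the names open() and NewObject() are taken from the text).
WATCHED_CALLS = {"open", "NewObject"}


@dataclass
class BasicBlock:
    instructions: List[str]            # e.g. ["mov r0, r1", "bl NewObject"]
    call_target: Optional[str] = None  # function called at the end of the block, if any


@dataclass
class InstrumentationManager:
    log: list = field(default_factory=list)

    def on_insn_begin(self, insn):
        # Instruction-level hook: runs before each instruction is executed.
        self.log.append(("INSN_BEGIN", insn))

    def on_block_end(self, target):
        # Block-level hook: runs when a block ends by calling a watched function.
        self.log.append(("BLOCK_END", target))

    def run_block(self, block):
        for insn in block.instructions:
            self.on_insn_begin(insn)
        if block.call_target in WATCHED_CALLS:
            self.on_block_end(block.call_target)


if __name__ == "__main__":
    mgr = InstrumentationManager()
    mgr.run_block(BasicBlock(["mov r0, r1", "bl NewObject"], "NewObject"))
    mgr.run_block(BasicBlock(["add r0, r0, #1"]))
    for event in mgr.log:
        print(event)
```

In this toy model, the block-end hook is where analysis around a system library or JNI call would be attached, while the per-instruction hook is where instruction-level taint propagation would run.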