Android Payload Obfuscation

Hello Guys, i wanted to learn about android payload obfuscation techniques, if anyone have any resources, please share them.

This might answer your question.

By default, ProGuard renames paths, class names, methods and variables using the alphabet. Thus, spotting strings such as ‘a/a/a;->a’ in the smali code is a strong indication that the sample has been obfuscated using ProGuard. Of course, this simplistic method of detection is not infallible because ProGuard can be configured to use any replacement dictionary you wish using the options -obfuscationdictionary, -classobfuscationdictionary and -packageobfuscationdictionary. For instance, Android/GinMaster.L uses a custom dictionary, where the strings were probably generated randomly using something like RANDOM.ORG - String Generator.

The replacement of path names, class names, methods and variables cannot be undone. However, usually the reversing of ProGuard-ed samples isn’t too difficult because the strings and code layout are not modified. The work is very similar to reversing an application coded by a beginner (poor choice of variable names etc.).

Working on DexGuard-ed samples is much more difficult. [2] lists the obfuscator’s features. The main reason why DexGuard-obfuscated samples are more difficult to work with is because the class and method names are replaced with non-ASCII characters and strings are encrypted. Tools such as JD-GUI [3] and Androguard [4] are more difficult to use (e.g. difficult to get name completion). It is as if reverse engineers have had their senses dulled: text strings and even some familiar function calls and patterns no longer exist to guide the analyst to the more interesting parts of the code.

Fortunately, no obfuscator is perfect. [5] clarifies parts of how DexGuard works. Meanwhile, we provide a code snippet that can be used to detect it, and three different ways to help with the reversing of DexGuard-ed samples.

First, its detection – i.e. identifying the use of DexGuard on a sample – is usually fairly visual: the repetitive use of non ASCII characters gives it away. The code snippet below lists non-ASCII smali files in smali disassembled code.

$ find . -type f -name “*.smali” -print | perl -ne ‘print if /[$^$ [:ascii:]]/’

Second, its reversing can be made easier by using the following tools or techniques:

  • DexGuard decryption python script. [6] provides a script template that can be applied to each DexGuard-ed sample. The script decrypts encrypted strings, which makes reversing easier. However, this tool only works with samples that use old versions of DexGuard, not the more recent ones.
  • Logging. A reverse engineer can disassemble the sample with baksmali [7], insert calls to Android logging functions (see below), recompile the application (smali), and run it.

invoke-static {v1, v2}, Landroid/util/Log;->e( Ljava/lang/String;Ljava/lang/String;)I

This displays corresponding strings in Android logs. It is an archaic, but simple and useful debugging technique. Nevertheless, this technique requires modification of the malicious sample – a practice anti-virus analysts are usually not authorized (or willing) to perform for ethical and security reasons.

  • String renaming. To work around the problems caused by non-ASCII characters, all strings can auto-matically be renamed to a dummy ASCII string. To do this, we enhanced Hidex [8]. Originally, this tool was created to demonstrate the feasibility of hiding methods in a DEX file (see Section 4 and [9]). However, progressively, it has evolved into a small DEX utility tool that can be used for the following:

    • To list strings (option --show-strings).
    • To automatically rename non-ASCII strings (option --rename-strings). This is what we use, for instance, in the case of DexGuard. Each string that contains non-ASCII characters is replaced automatically by a unique string generated only with ASCII characters and which is the same size as the original string. ( In theory, there are cases where we should fall short of replacement strings and thus fail to do the renaming. For example, if a sample has more strings of a single character than possible ASCII characters, the replacement is impossible. In practice, we have never encountered this limitation.) The replacement string must meet the aforementioned requirements of uniqueness and size, to conform to the DEX file format. For proper replacement, note that string size (UTF16 size field of string data item) is in UTF16 code units, not in bytes. Please refer to [10].There is one constraint that Hidex does not currently handle: the ordering of strings. In DEX files, strings must be ordered alphabetically. Renaming the strings usually breaks the correct ordering. Consequently, Android will refuse to load the modified classes.dex file. In the case of reverse engineering malware, this is not a real problem (perhaps it is even more secure/ethically correct) because Android reversing tools such as baksmali, apktool, dex2jar and Androguard do not enforce correct ordering of strings either. Thus, they are able to disassemble the modified classes.dex without any problem.
    • To parse DEX headers and detect headers hiding additional information (see Section 2.4).
    • To detect potential hidden methods (option --detect).