Hi. In this post we’ll take a quick look at what APK files look like. You will also find out what the obfuscation process is, how to perform it (on both code and strings) and how to decompile your code. Let’s do some reverse engineering then ;)
I believe it won’t be any surprise to say that an APK is basically a zip file. We can extract it with any zip file archiver and find almost all files we put there in unchanged form.
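To see the "APK is a zip" claim in practice, here is a minimal sketch that lists the entries of an archive using only the standard java.util.zip classes. The class name ApkLister and the file path passed on the command line are my own placeholders, not something from a real project.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: treats an APK as a plain zip archive.
class ApkLister {

    /** Returns the names of all entries stored in the given archive. */
    static List<String> listEntries(Path apk) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(apk.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java ApkLister <path-to-apk>");
            return;
        }
        // args[0] is a placeholder path, e.g. an APK pulled from a device.
        for (String name : listEntries(Paths.get(args[0]))) {
            System.out.println(name);
        }
    }
}
```

Run against a real APK, this should print entries such as the manifest, the res directory contents and classes.dex.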
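The code we put into *.java files is transformed. We can roughly depict this process as: *.java sources are compiled to *.class files, which are then converted into a classes.dex file (Dalvik bytecode) that ends up in the APK.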
What we need to know is that the classes.dex file contains bytecode that can be disassembled into an assembly-like language, but… is also roughly translatable back to Java. Unless we use obfuscation, we’re basically open-sourcing our code (even though reverse engineering it might be illegal), as we can assume that a potential attacker will be able to almost copy-paste it. Apart from the possibility of the code itself leaking, this may e.g. lead to someone re-uploading your application with modified content - for instance unlocked achievements or premium services. Obfuscation is also highly recommended if you’re developing an application with DRM. It can also be useful if you’re writing a game and want to slow down cheaters who write trainers ;) I will show simple decompilation and obfuscation techniques later on in the post.
In Android we usually use a tool called ProGuard, which is responsible for obfuscation and optimization of our code. It has a wide variety of options, but we have to be careful which ones we pick. I’m not going to elaborate in detail, just one remark: be sure to thoroughly test your application in release mode, especially after changing the ProGuard config. ProGuard can break your code if used improperly. It can cause errors e.g. in code that uses reflection or in libraries that do (e.g. JSON parsers), because names looked up at run-time no longer match the renamed classes and members.
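To illustrate why reflection is fragile under renaming, here is a small sketch in plain Java. The Product class and the ReflectionDemo helper are made up for this example. The lookup uses the field name as a String, and an obfuscator that renames the field does not rewrite that String.

```java
import java.lang.reflect.Field;

// Hypothetical model class; imagine a JSON parser filling it in by field name.
class Product {
    int price = 42;
}

class ReflectionDemo {

    /** Reads a field by its name, the way reflection-based libraries do. */
    static Object readField(Object target, String fieldName)
            throws ReflectiveOperationException {
        Field field = target.getClass().getDeclaredField(fieldName);
        field.setAccessible(true);
        return field.get(target);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // Works as long as the field is really called "price".
        System.out.println(readField(new Product(), "price"));

        // If an obfuscator renamed the field (say, to "a") while the String
        // "price" stayed untouched, the lookup would fail like this one does.
        try {
            readField(new Product(), "renamedByObfuscator");
        } catch (NoSuchFieldException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```

This is the kind of failure that only shows up in a release build, which is why testing in release mode matters.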
We can assume that every file in the res directory is easily accessible to everyone who installs our application (including all drawables, XML files, fonts, raw files - you name it). We have to remember not to store any confidential files there. If it’s absolutely essential to store a confidential file there, we should keep it encrypted and decrypt it at run-time, as that will be harder to reverse engineer.
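In order to perform a decompilation, you obviously need to have an APK file. You can e.g. pull your application’s APK off the device with adb: adb shell pm path followed by your package name prints where the APK is stored, and adb pull with that path copies it to your machine.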
Now that you have an APK file available, there are various methods to disassemble/decompile code - I will show the easiest and most natural one for a Java developer. You only need a zip file archiver (like 7zip, for example) and the dex2jar and jd-gui tools. The procedure is simple: extract the APK with the zip archiver, convert the extracted classes.dex into a jar file with dex2jar, and open that jar in jd-gui.
All resources will be available in the extracted folder. And your decompiled APK might look like:
Generally speaking, log output should only be present in debug builds, never in production code. Log calls can give an attacker a lot of information about the responsibilities and properties of classes and methods. For example, given an obfuscated method:
without thorough investigation we don’t know what’s going on, but let’s suppose our “magic line” is:
and suddenly everything is clear :) So, if the data you’re logging is confidential, make sure your log calls are absent from your decompiled code. One possible way to achieve this is to enable ProGuard’s optimization step and tell it that calls to Log.* methods can be removed. You can do this by using proguard-android-optimize.txt in your buildTypes.release Gradle config. In proguard-rules.pro you should then specify an -assumenosideeffects rule for the android.util.Log methods you want stripped.
If you don’t want to use optimized mode (because it may mess something up) and don’t care too much about leaking the information you wanted logged, you can write a simple wrapper, which will check if you’re in release mode or not. You can also try using third-party helper libraries, although they sometimes leave behind parameters like in our “magic line” example above, so be sure to double-check before deployment.
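A minimal sketch of such a wrapper could look like the one below. It is plain Java so it can run anywhere: the debug flag stands in for something like Android’s BuildConfig.DEBUG, and the PrintStream stands in for android.util.Log. The DebugLog name and its API are my own assumptions.

```java
import java.io.PrintStream;

// Hypothetical logging wrapper; not an Android or third-party API.
class DebugLog {
    // Stand-in for a build-time flag such as BuildConfig.DEBUG.
    private final boolean debug;
    // Stand-in for the real log sink (android.util.Log on a device).
    private final PrintStream out;

    DebugLog(boolean debug, PrintStream out) {
        this.debug = debug;
        this.out = out;
    }

    /** Writes the message only when the debug flag is set. */
    void d(String tag, String message) {
        if (debug) {
            out.println(tag + ": " + message);
        }
    }

    public static void main(String[] args) {
        new DebugLog(true, System.out).d("Billing", "price loaded");  // printed
        new DebugLog(false, System.out).d("Billing", "price loaded"); // silent
    }
}
```

Note that the wrapper only silences the output. The String arguments passed at each call site are still compiled into the app, so they remain visible after decompilation.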
ProGuard does not obfuscate Strings in any way - a variable’s name may be changed, but its value remains. String constants can tell the attacker a lot about the obfuscated code – that’s why we should sometimes try to obfuscate them by creating them on the fly. There are also various cases in which this is the recommended way (e.g. when handling in-app billing).
The easiest and most naive way of hiding strings is to obfuscate them with a simple reversible algorithm like ROT-13. The problem is that it’s only the slightest of obstacles for the attacker, as even obfuscated code of such a simple algorithm is very easy to reverse engineer. We can try using a real cipher like AES for encryption, but there’s a catch… we need a key. And the key also has to be present in the code, one way or another. And even if we do manage to hide the key safely somehow, tools exist that try to semi-run the code and reveal the original strings. There are other tricks that can slow the attacker down (splitting strings into parts, evaluating them part by part in various places in the program, using the NDK), but one question emerges: “how badly do I want to slow the attacker’s progress down?”.
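For reference, a ROT-13 helper is only a few lines long, which is exactly why it offers so little protection. The sketch below is a generic implementation (the Rot13 class name is mine); applying it twice returns the original text.

```java
// Generic ROT-13: rotates ASCII letters by 13 places, leaves everything else alone.
class Rot13 {

    static String apply(String input) {
        StringBuilder result = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                result.append((char) ('a' + (c - 'a' + 13) % 26));
            } else if (c >= 'A' && c <= 'Z') {
                result.append((char) ('A' + (c - 'A' + 13) % 26));
            } else {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String hidden = apply("price");
        System.out.println(hidden);        // prints "cevpr"
        System.out.println(apply(hidden)); // prints "price"
    }
}
```

Anyone who spots this loop in decompiled code can run it on every suspicious constant in a matter of minutes.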
Of course, there are tools that can do the obfuscation for you (like DexGuard, Stringer). The problem is that most of those tools are commercial and they’re not magic wands - the same rules apply. Don’t get me wrong, they certainly do get the job done well! Just a reminder – it’s an ongoing race between people trying to obfuscate code and crackers, just like with PC game cracking. String-obfuscating libraries can take the weight off your back, sure, but sooner or later they will most likely also be reverse engineered.
For our example, we’re not going to use any super-sophisticated algorithm, just something simple so you can get an idea of how the code changes before and after decompilation.
Let’s suppose we have a JSON file and we want to show the value of its price field in a toast. Let’s look at the non-obfuscated example code.
Our source code:
The code is pretty straightforward – we added the JSON data as a String, we’re parsing it and extracting the “price” field. After decompilation the code is basically a one-liner:
Now let’s try and obfuscate the code a bit, starting with the field name. The simplest way, and our first step, would be to save the “price” String as an int array (casting each character to an int). We’ll get:
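As a quick sketch of this first step (the PriceAsInts class and its method names are placeholders of mine), the characters of “price” map to the codes 112, 114, 105, 99 and 101, and casting them back rebuilds the String:

```java
// Sketch: a String stored as its character codes instead of a literal.
class PriceAsInts {
    // 'p' = 112, 'r' = 114, 'i' = 105, 'c' = 99, 'e' = 101
    static final int[] PRICE = {112, 114, 105, 99, 101};

    /** Casts each character of the input to its int code. */
    static int[] encode(String text) {
        int[] codes = new int[text.length()];
        for (int i = 0; i < text.length(); i++) {
            codes[i] = text.charAt(i);
        }
        return codes;
    }

    /** Casts each code back to a char and concatenates the result. */
    static String decode(int[] codes) {
        StringBuilder result = new StringBuilder(codes.length);
        for (int code : codes) {
            result.append((char) code);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode(PRICE)); // prints "price"
    }
}
```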
But that’s too obvious; let’s put another small obstacle in the way of reverse-engineering the String. Let’s treat the int values above as degrees (they are in a range where there should be no trigonometric problems) and convert them to radians. We’ll get
As we can see, our values are now in the <1.7,2> range (99 degrees is about 1.73 radians and 114 degrees is about 1.99). Let’s add some noise - redundant random doubles that are in this range:
And our “price” String looks way less readable. Now let’s write the code to extract our String from the array. What we need to do is take the proper indexes of the array (15-19), iterate through them and, for each array element, calculate and round the degree value, cast it to char and concatenate. This way, the price String literal is replaced with a value computed at run-time.
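The extraction step can be sketched roughly as follows. The class name ObfuscatedPriceTag and all the noise values are invented for this illustration; only the five values at indexes 15-19 carry data (the radian values of 112, 114, 105, 99 and 101 degrees, rounded to five decimals).

```java
// Sketch: "price" hidden as radians among noise; real data sits at indexes 15-19.
class ObfuscatedPriceTag {
    private static final int FIRST = 15;
    private static final int LAST = 19;

    private static final double[] DATA = {
        // indexes 0-14: made-up noise in the same numeric range
        1.81235, 1.93410, 1.75562, 1.88019, 1.97741,
        1.79903, 1.84457, 1.91128, 1.73396, 1.96205,
        1.86672, 1.77814, 1.90036, 1.82549, 1.94983,
        // indexes 15-19: 112, 114, 105, 99 and 101 degrees expressed in radians
        1.95477, 1.98968, 1.83260, 1.72788, 1.76278,
        // indexes 20-24: more made-up noise
        1.87391, 1.74650, 1.92267, 1.80118, 1.85904
    };

    /** Converts the hidden radians back to degrees, rounds, and casts to chars. */
    static String get() {
        StringBuilder result = new StringBuilder();
        for (int i = FIRST; i <= LAST; i++) {
            long degrees = Math.round(Math.toDegrees(DATA[i]));
            result.append((char) degrees);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(get()); // prints "price"
    }
}
```

Rounding matters here: the stored radians are truncated, so the converted values land slightly off the whole number and Math.round snaps them back to the original character codes.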
Now let’s do something with the JSON file, which looks way too obvious. I manually encrypted the JSON file using AESCrypt-Android.
To simplify the code, I used price as the passphrase. Now the JSON looks like
Now to get our price value, we need the price String, which is now both a field name and a key to the JSON. Then we need to decrypt the JSON and get the price field value from it. Let’s see what our method looks like now:
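As a rough, self-contained sketch of the decryption idea, here is a passphrase-based round trip written with the standard javax.crypto classes. To be clear about the assumptions: this is not the AESCrypt-Android API and not necessarily its data format. The PassphraseCrypto class, the SHA-256 key derivation truncated to 128 bits, the random IV prepended to the ciphertext and the sample JSON are all choices I made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical stand-in for a passphrase-based AES helper.
class PassphraseCrypto {
    private static final int IV_LENGTH = 16;
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";

    // Assumption: derive a 128-bit AES key from the SHA-256 hash of the passphrase.
    private static SecretKeySpec keyFrom(String passphrase) throws GeneralSecurityException {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(passphrase.getBytes(StandardCharsets.UTF_8));
        return new SecretKeySpec(Arrays.copyOf(digest, 16), "AES");
    }

    /** Encrypts the text and returns Base64 of (random IV + ciphertext). */
    static String encrypt(String passphrase, String plainText) throws GeneralSecurityException {
        byte[] iv = new byte[IV_LENGTH];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, keyFrom(passphrase), new IvParameterSpec(iv));
        byte[] cipherText = cipher.doFinal(plainText.getBytes(StandardCharsets.UTF_8));
        byte[] combined = new byte[IV_LENGTH + cipherText.length];
        System.arraycopy(iv, 0, combined, 0, IV_LENGTH);
        System.arraycopy(cipherText, 0, combined, IV_LENGTH, cipherText.length);
        return Base64.getEncoder().encodeToString(combined);
    }

    /** Reverses encrypt(): splits off the IV, then decrypts the remainder. */
    static String decrypt(String passphrase, String encoded) throws GeneralSecurityException {
        byte[] combined = Base64.getDecoder().decode(encoded);
        byte[] iv = Arrays.copyOfRange(combined, 0, IV_LENGTH);
        byte[] cipherText = Arrays.copyOfRange(combined, IV_LENGTH, combined.length);
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, keyFrom(passphrase), new IvParameterSpec(iv));
        return new String(cipher.doFinal(cipherText), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        String json = "{\"name\":\"coffee\",\"price\":\"9.99\"}"; // made-up sample data
        String encrypted = encrypt("price", json);
        System.out.println(encrypted);
        System.out.println(decrypt("price", encrypted));
    }
}
```

In the app, only the encrypted text and the decrypt path would be shipped, with the passphrase coming from the obfuscated String rather than a literal. The weakness discussed earlier still applies: everything needed to decrypt is inside the APK.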
Now let’s build and decompile our project and see how the code looks:
The price String, now represented by the EncryptedPriceTag class, looks better now:
We can change how the output code is distributed by adding/removing the static keyword next to the members of the EncryptedPriceTag class and by changing ProGuard settings.
We can e.g. make the get method appear inlined in MainActivity.
The code is far from perfectly safe - keep in mind though, it was just an example so you can get the basic idea.
We always have to keep in the back of our minds how the code will look after decompilation. We have to put ourselves in a potential attacker’s shoes and defend ourselves against threats. Of course this post barely scratched the surface of the problem, but I hope it gave you a rough idea and will encourage you to keep looking :)