Friday, February 7, 2014

IDAscope v1.1: YARA scanning


Today I integrated something into the master branch of IDAscope that I myself would have liked to have available for quite some time: seamless scanning with YARA signatures from within IDA, for the win!

In late December 2013, YARA's author Victor M. Alvarez made us a Christmas gift with his YARA 2.0 release.
I read the release notes but didn't realize the implications of "YARA has experienced an almost complete rewrite for version 2.0" at that point in time.

Around mid-December, I tasked one of my student assistants (Christopher Kannen) with developing a minimal, pure-Python version of YARA. The goal of this task was to enable its use in IDA and work around the issues experienced in the past.
Last week, he finished the code for loading YARA rules into convenient objects. When he was about to start implementing basic matching, I became aware that importing YARA in the IDA Python shell no longer failed.
Happy day, this meant that the desired functionality could be immediately integrated into IDAscope with full native support for YARA rule files, avoiding any side-effects.

Here is a screenshot of the widget in action:



I used some binary (BISCUIT, 268eef019bf65b2987e945afaf29643f) from @snowfl0w's Contagio Malware Dump collection of APT1 stuff and the signatures as provided by AlienVaultLabs for developing/testing and the demo screenshot. Keep up the good work!

Now one might assume Christopher's work is useless and was wasted time. ;)
No! It comes in handy, as I will outline in the description up next.

Fiddling with YARA in Python

Everyone who has ever played with YARA and Python is probably familiar with its basic usage, like (examples taken from YARA's manual):
import yara

rules = yara.compile(filepath="/path")
suspicious = "some data to be scanned"
rules.match(data=suspicious)
Since YARA is intended to be fast, the "rules" object potentially contains multiple signatures from a single file compiled into one object.
What I always missed here is the ability to inspect the loaded signatures in detail, e.g. having access to the names and strings of individual signatures. Maybe it's possible, I just never managed to do so.
A probably lesser known but cool feature of YARA are match callbacks. They come in pretty handy as a workaround in this context:

import yara

def cb(data):
    print data
    return yara.CALLBACK_CONTINUE

rules = yara.compile(filepath="/path")
suspicious = "some data to be scanned"
rules.match(data=suspicious, callback=cb)
Each time the callback is fired, we receive a dictionary "data" like this one:
{
    'tags': ['foo', 'bar'],
    'matches': True,
    'namespace': 'default',
    'rule': 'my_rule',
    'meta': {},
    'strings': [(81, '$a', 'abc'), (141, '$b', 'def')]
}
As you can see, we can derive from that data which signatures from the rules object have just been run against the target input and what their individual outcome is.
We can also derive information about partial matches by checking the "matches" and "strings" entries.
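As a minimal sketch of how such callback data can be aggregated into per-rule results (this helper is my own illustration, not part of yara-python):

```python
def summarize_callback_results(callback_data):
    """Condense the dictionaries received via YARA match callbacks into a
    per-rule overview: did the rule match, and which of its string
    identifiers were hit at all (i.e. including partial matches)."""
    summary = {}
    for data in callback_data:
        # each entry in "strings" is a tuple (offset, identifier, content)
        hit_strings = sorted({identifier for _, identifier, _ in data["strings"]})
        summary[data["rule"]] = {
            "matched": data["matches"],
            "hit_strings": hit_strings,
        }
    return summary
```

Feeding it a list containing the example dictionary from above yields {'my_rule': {'matched': True, 'hit_strings': ['$a', '$b']}}.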

This is basically what I now use in the IDAscope widget to derive the scores and detailed views for signatures.
Christopher's rule loader additionally allows reading the signatures as given in the source file, making it possible to compare which of the strings from the callbacks were matched and which were not.

Combining all of those parts results in the extension just added to IDAscope.

If you want to use it, make sure to install YARA Python first and adjust the paths specified in ./idascope/config.py to your local collection of signature files.

Should you find any errors, please report them here or via mail.

Here is IDAscope v1.1 (mirror) as a downloadable snapshot from the repository (commit f3d58ad). If the latest extensions prove to be stable and usable, I might not even need to push another version like last time, lol.

Development Plans

I'm not entirely sure if I am going to push this IDAscope widget further than its current functionality.

Instead, a full-blown interactive YARA editor (plugin) seems more attractive to me right now, so basically something independent from IDAscope, since the other tabs may be of less interest to a signature writer.
If it does not turn out to be too heavy code-wise, I might opt later on to integrate it back into IDAscope. I'm open to feedback on this matter.

But first: enjoy YARA in your IDA! :)

Wednesday, January 29, 2014

PyBox Relaunch

A much too long time has passed since I last blogged. I guess the main reason for this is that I've been pretty busy with #DAYJOB for the last half year, and while I did several things I considered blog-worthy, I just didn't put in the extra effort to go for an appropriate write-up.

This is not going to be a late New Year's resolution, but I sincerely want to be more active again in terms of releases. I'll likely be going for smaller, incremental posts (like I did during the main IDAscope development), as these are easier to bring to an acceptable level of quality. If there is interest, I might also start covering more concrete malware analysis content, but I have been reluctant towards this so far.

Today marks a milestone for an old project of mine. I have migrated the repository of PyBox from Google Code to Bitbucket, since Google Code has been more or less killed by disabling new downloads earlier this month (Git > SVN anyway).
I had wanted to do the migration for half a year, but now I finally had the time needed to accompany it with a little story of what PyBox is and how we got there. In the same run, I compiled the DLL and pydasm for the two most recent versions of Python 2.7.
I hope that there is one or another interesting aspect in the code that might find usage elsewhere. Maybe PyBox by itself is interesting enough to know about as well. ;)

History of PyBox

Back in 2010, when Felix Leder was still at the University of Bonn, he thought it would be great to have a highly customizable analysis framework / sandboxing tool for daily malware analysis. I guess he was inspired by the outcome of the Honeynet Project Google Summer of Code (GSoC) project Cuckoo Sandbox, for which he was an advisor at that time. :)
The idea behind Cuckoo has always been to inject a DLL into a target process and have that DLL serve as a platform for monitoring. In Cuckoo, the DLL sets up a number of hooks for interesting Windows API functions. Later during execution, when hooks are triggered, logging yields a sequence of calls to their target functions, including the respective parameters.
PyBox is based on the same idea. As with Cuckoo, a DLL is injected into a target process to serve as a platform. However, upon injection, the PyBox DLL starts a fully fledged Python interpreter within the target process, allowing the execution of arbitrary Python scripts within the context of that process. Since Python is a great language for rapid prototyping, this approach allowed us to quickly design analysis modules, e.g. custom sandboxes, tailored to certain aspects of chosen malware families.
Lately I've noticed that a similar concept is being realized by Frida, using JavaScript instead of Python.

PyBox architecture

In the following, I'll explain the architecture of PyBox when used as a sandbox, its originally intended use case.

Injection

As mentioned earlier, the core of PyBox is a DLL (./DLL/PyBox.cpp) that gets injected into a target process, so we first need an injector. It is located at ./src/injector.py, and the approach realized there is pretty straightforward. If the target process does not exist yet, start it (optionally suspended). For injection, first get a handle to the process (kernel32!OpenProcess), allocate some memory in it (kernel32!VirtualAllocEx), and write a string holding the path to our PyBox DLL into that memory (kernel32!WriteProcessMemory). Finally, use our good old friend kernel32!CreateRemoteThread to start a thread within the target process, with kernel32!LoadLibraryA as the thread routine and the PyBox DLL path as its argument.
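The injection steps just described can be sketched in Python with ctypes (my own illustration along the lines of this approach, not the actual injector code; Windows-only, constants inlined, error handling omitted):

```python
import ctypes

def inject_dll(pid, dll_path):
    """Inject the DLL at `dll_path` (a bytes path) into process `pid`.
    Sketch of the classic CreateRemoteThread + LoadLibraryA technique."""
    kernel32 = ctypes.windll.kernel32
    PROCESS_ALL_ACCESS = 0x001F0FFF
    MEM_COMMIT_RESERVE = 0x3000      # MEM_COMMIT | MEM_RESERVE
    PAGE_READWRITE = 0x04

    # kernel32!OpenProcess: get a handle to the target process.
    h_process = kernel32.OpenProcess(PROCESS_ALL_ACCESS, False, pid)
    # kernel32!VirtualAllocEx + WriteProcessMemory: place the DLL path
    # string inside the target's address space.
    remote_buf = kernel32.VirtualAllocEx(h_process, None, len(dll_path) + 1,
                                         MEM_COMMIT_RESERVE, PAGE_READWRITE)
    kernel32.WriteProcessMemory(h_process, remote_buf, dll_path,
                                len(dll_path) + 1, None)
    # kernel32!LoadLibraryA resides at the same address in every process
    # that maps kernel32.dll, so we can resolve it locally.
    load_library = kernel32.GetProcAddress(
        kernel32.GetModuleHandleA(b"kernel32.dll"), b"LoadLibraryA")
    # kernel32!CreateRemoteThread: run LoadLibraryA(remote_buf) remotely.
    return kernel32.CreateRemoteThread(h_process, None, 0, load_library,
                                       remote_buf, 0, None)
```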
For some side tasks, we use the module ProcessRigger (./src/process_rigger.py). For example, in order to easily perform follow-up tasks, it's nice to grant our injector and the target process the SE_DEBUG privilege. A more interesting functionality implemented in ProcessRigger is its ability to execute an arbitrary API call in the context of the remote target process. For this, we dynamically generate and write a short shellcode to the target process, consisting of the expected number of push instructions for the arguments (either immediates or pointers to strings / structures), the call to the target API, and a subsequent call to kernel32!ExitThread. Nothing new, but useful.
For PyBox, we only use this to set some environment variables in the context of the target process, but since I think the concept has potential for more, here is sample code and a diagram:


PyBox DLL

When the DllMain() of PyBox is executed in the target process, it will first check for the presence of said environment variables, proceed to open a file for logging, and then initialize the Python interpreter. The PyBox DLL additionally makes itself available to the interpreter environment as an embedded module, enabling easy access to some native system functionality, like access to the process environment block (PEB), enumeration of exports of other DLLs, hook/callback handling, and emergency termination. Finally, the PyBox DLL will hand over control to the target "box" starter script (example: ./src/starter.py), which then executes the desired analysis functionality.

PyBox API

As just mentioned, PyBox is intended to be used with independent "boxes" that are specialized for certain purposes. These boxes are powered by the functionality provided through the PyBox API.
First, there is the MemoryManager (./src/pybox/memorymanager.py), granting access to memory manipulation functions in the process address space via Python ctypes. A bunch of convenience functions automatically handle read/write permissions of the given memory to enable read/write operations.
Next, there is the ModuleManager (./src/pybox/emodules.py), which enumerates all other loaded DLL files (= executable modules) in the target process' address space in preparation for hooking. The enumeration is done through the embedded module provided by the PyBox DLL itself in order to speed up this procedure.
The PyHookManager (./src/pybox/hooking.py) provides an interface to PyBox' hooking functionality: adding and removing hooks, checking if an address is already hooked (a single address can be hooked with multiple hooks, which are then executed in a chain), and selecting the appropriate hook through its function find_and_execute(). The reason for this last function is that all hooks initially point to the same callback address in the PyBox DLL, which handles mutual exclusion (as well as Python's Global Interpreter Lock (GIL)) prior to transferring control to the hook code implemented in Python.
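The chained-hook dispatch can be pictured with a minimal registry sketch (a hypothetical reimplementation of the idea, not PyBox's actual code):

```python
class HookRegistry(object):
    """Minimal sketch of the PyHookManager idea: several hooks may share
    one address and are executed in a chain when the common native
    callback fires for that address."""

    def __init__(self):
        self._hooks = {}   # address -> list of Python hook callables

    def add_hook(self, address, func):
        self._hooks.setdefault(address, []).append(func)

    def remove_hook(self, address, func):
        self._hooks.get(address, []).remove(func)

    def is_hooked(self, address):
        return address in self._hooks

    def find_and_execute(self, address, context):
        # All trampolines funnel into one shared callback; this step picks
        # the hooks registered for the triggering address and runs them in
        # registration order, collecting their results.
        return [func(context) for func in self._hooks.get(address, [])]
```

Registering two hooks on the same address and calling find_and_execute() for it would run both in order.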
There are three classes of hooks: PyFunctionEntryHook, PyReturnAddressHook, and PyHookClone. When a new PyFunctionEntryHook is created for a target address (e.g. a Windows API function), up to 20 bytes of memory are read and disassembled via pydasm. The reason for this is that we need to overwrite 5 bytes with a jmp instruction to our hook trampoline (PyTrampoline) while preserving the integrity of the modified code. If more than 5 bytes are affected, the rest is padded with NOPs { 90 }. For most Windows API functions we run into no trouble, as they usually start with "mov edi, edi; push ebp; mov ebp, esp", which sums up to exactly 5 bytes, but this may not be the case for arbitrary other functions. A PyReturnAddressHook is realized by overwriting the original return address of a function with a PyTrampoline address. A PyHookClone is used when hooking one address with multiple hooks and references the original first hook.
The PyTrampoline is a dynamically generated shellcode preparing the call of a hooking function. It is optionally prefixed with a "jmp self" { EB FE }, which turns out useful when writing a box for unpacking: it allows intercepting the control flow before the OEP is reached (in order to attach a debugger and proceed manually). Next, the current register state is saved (PUSHAD { 60 }) and an identifier for our hook is pushed (this allows distinguishing multiple hooks on the same address). Hooks can optionally be used with their own parameters, so these are pushed { 68 11223344 } now. Then, a call to the hook function is made { E8 ca11bacc }. When the hook returns, the register state is restored (POPAD { 61 }) and the original opcode bytes saved when setting up the hook are executed. Finally, we jump back to the instruction behind the originally hooked address { E9 001dc0de }.
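Putting the described byte sequences together, trampoline generation might look roughly like this (an illustrative reconstruction from the description above, with made-up names and example addresses; not PyBox's actual implementation):

```python
import struct

def build_trampoline(base, hook_id, hook_func, saved_bytes, resume,
                     params=(), spin=False):
    """Assemble a PyTrampoline-style shellcode to be written at address
    `base`, calling `hook_func` and finally resuming at `resume`."""
    code = b""
    if spin:
        code += b"\xEB\xFE"                          # jmp self: park for a debugger
    code += b"\x60"                                  # PUSHAD: save register state
    code += b"\x68" + struct.pack("<I", hook_id)     # push hook identifier
    for param in params:                             # optional per-hook parameters
        code += b"\x68" + struct.pack("<I", param)   # push imm32
    call_at = base + len(code)                       # rel32 is relative to next insn
    code += b"\xE8" + struct.pack("<i", hook_func - (call_at + 5))  # call hook
    code += b"\x61"                                  # POPAD: restore register state
    code += saved_bytes                              # replay overwritten original opcodes
    jmp_at = base + len(code)
    code += b"\xE9" + struct.pack("<i", resume - (jmp_at + 5))      # jmp back
    return code
```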
In summary, hook execution looks like this:
When a hook is called, it can access and modify the current function execution context (return addr, stack) and register context (from EAX to EDI) through two respective objects passed to it as argument.
Besides these modules, there is also an unfinished dumper module and a ProcTrack module, which hooks API functions for spawning new processes and injects the current box into these when triggered.

Box Scripts

I have included two sample boxes in the version pushed to the repository.

The first example is the standard sandbox (stdbox), which hooks a range of interesting API functions one would be interested in when tracing the execution of malware. I have gone for a harmless example and traced the creation of a new file on disk through notepad.exe (uploaded here).
Most of the lines in the log file are noise; the important ones are these:

2014-01-29 12:05:59,878 - INFO - kernel32.dll.CreateFileW(
    C:\Documents and Settings\redacted\Desktop\test.txt, 0x80000000, 3, 0, 3, 0x00000080, 0)
[...]
2014-01-29 12:05:59,898 - INFO - kernel32.dll.CreateFileW(
    C:\Documents and Settings\redacted\Desktop\test.txt, 0xc0000000, 3, 0, 4, 0x00000080, 0)
2014-01-29 12:05:59,898 - INFO - kernel32.dll.WriteFile(
    0x00000138, 0x000e0db8, 0x0000000c, 0x0007faf0, 0x00000000)

As you can see, the filename is "test.txt" on the desktop. The first call to CreateFileW() with GENERIC_READ access and the OPEN_EXISTING disposition probes for the file. The second call with GENERIC_READ+GENERIC_WRITE access and the OPEN_ALWAYS disposition creates the file and opens it for writing. This is ultimately followed by WriteFile() putting the bytes "just a test" in there.

The second example is the more useful RWX box. This box will only log calls that are being made from RWX memory. The idea behind this is e.g. PyBox injection into explorer.exe in order to monitor the behaviour of malware injecting itself into that process. Once again, here is an example, this time of Citadel injecting into explorer.exe.
  • In lines 8-813 you can observe Citadel creating its dynamic import table within the context of explorer.exe.
  • In lines 817-1137, you can see Citadel itself creating hooks for a range of Windows API functions.
  • In lines 1138-1271 you can see Citadel starting some threads and guess about their intention (ObtainUserAgentString -> mimic target system's browser, WABOpen -> crawl address book for email addresses).
  • From line 1272 on, you can observe Citadel searching for other processes to inject into.
  • Finally, from line 1372 on, Citadel found a target and injects itself.

I have included a startup batch file for both boxes so you can easily try them out.
These boxes are rather generic and simple, but it should be easy to imagine that more powerful use cases for automation can be covered (e.g. conditional control and patching of "interesting" memory).
More advanced stuff was shown by Felix in his talk at Troopers conference in 2011, e.g. how to intercept network payloads before they enter an SSL connection (FireFox) or how to analyze obfuscated PDF exploits by pyboxing Acrobat.

Conclusion

So that's my short intro to the PyBox framework. Notice that it only works up to WinXP, since we are practicing evil process/memory voodoo here and there that does not get along with modern memory protection mechanisms.
Currently, there is also no intention to further pursue development of this framework unless there will be unexpectedly huge interest in this. ;)
Hopefully some of the design or implementation details may be of interest to some of you so the PyBox spirit can live on!

Wednesday, June 5, 2013

Small IDAscope update

I should probably visit more conferences. When heading out, it's always nice to get some fresh input from people. I'm currently at CyCon (Tallinn, Estonia) where I'll give a talk about the malware analysis workflow we are using at Fraunhofer FKIE. I'll add the corresponding paper to pnx.tf as soon as I get the approval.

The motivation for the latest little addition to IDAscope has to be credited to Hugo Teso, whom I met yesterday. I'm really looking forward to his Friday talk on aviation security, which will be related to the one he gave at HITB this year.

Semantics Profiles

The latest commit to IDAscope contains an extension to the "Function Inspection" part. It is now possible to quickly switch between different profiles for semantic definitions, which should open up this part of IDAscope for better usage in analysing ring0 stuff as well as POSIX or even other platforms.



IDAscope will automatically load all semantics profiles it finds in the newly created subfolder:
  • \idascope\data\semantics
Currently there are two profiles: first, the already known win-ring3 profile, which resembles the default that you know from "Function Inspection"; second, a placeholder file for POSIX. Only a placeholder, because I have not taken the time to add any specifications to it yet.
Actually, I'd love to see contributions from volunteers who have been actively using this feature or who told me they were looking forward to this extension. :) Otherwise I might add new profiles myself if I have a good day or something.

The specification for a profile (JSON) looks like this and should be easy to extend:


{
    "author": "pnX",
    "creation_date": "05.06.2013",
    "name": "posix",
    "reference": "http://pnx-tf.blogspot.com/2013/06/small-idascope-update.html",
    "comment": "template for POSIX semantics",
    "renaming_seperator": "_",
    "default_neutral_color": "0xCCCCCC",
    "default_base_color": "0xB3DfFF",
    "default_highlight_color": "0x33A7FF",
    "semantic_groups": [{
        "tag": "grouptag0",
        "name": "group0",
        "base_color": "0xB3B3FF",
        "highlight_color": "0x333377"
    }, {
        "tag": "grouptag1",
        "name": "group1",
        "base_color": "0xB3DFFF",
        "highlight_color": "0x33A7FF"
    }],
    "semantic_definitions": [{
        "tag": "socket",
        "name": "socket",
        "group": "group0",
        "api_names": ["socket", "open", "close", "connect", "listen", "bind"]
    }]
}
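For illustration, a loader for such profiles could look roughly like this (my own sketch, assuming profiles carry a .json extension; not IDAscope's actual code):

```python
import json
import os

def load_semantics_profiles(folder):
    """Load every JSON semantics profile found in `folder` and key it by
    its "name" field, mirroring the idea of IDAscope picking up all
    profiles from its semantics subfolder."""
    profiles = {}
    for filename in os.listdir(folder):
        if not filename.endswith(".json"):
            continue
        with open(os.path.join(folder, filename), "r") as f_profile:
            profile = json.load(f_profile)
        profiles[profile["name"]] = profile
    return profiles
```

Dropping a new profile file into the folder would then make it available by its "name" field without further registration.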


Future

Regarding future updates on IDAscope, I obviously took almost half a year off from developing until now. I hope that I will be able to spend more time on improving IDAscope in the near future.
It's especially sad that I have not yet been able to finish the interactive call-graph feature that I previewed in one of the previous blog posts. I am still inclined to finish it, just to give the time I already spent on it more value.
In parallel, I will likely also start refactoring the storage of the internal analysis results. I have something in mind that could be done with it, but it's not yet publishable, so I'll leave it at that.

If you have other ideas or feature requests, please let me know.

Saturday, January 5, 2013

IDAscope progress

Originally, I only wanted to give a short update on the stuff I did to IDAscope at the very end of the 29c3 post... Apparently I created enough content to let this be a post of its own.
So now I want to cover the recent activities around IDAscope from the last month or so. I'm currently working on graphing stuff, as some of you might have seen on Twitter already, but I will cover this in an extra post and in full detail once it has reached a presentable (release-worthy, that is) state.

In late November I had some free time to push IDAscope a bit forward. As can be seen from the commit history, most changes were bugfixes, covering:

  • Update to renaming wrappers (thanks Branko).
  • A small bugfix for xrange() beyond 0xFFFFFFFF.
  • The usage of the results generated by Tarjan's algorithm for finding strongly connected components was implemented incorrectly and would not cover basic blocks in nested or non-trivial loops.
  • The counting of semantic API hits in FunctionInspection was incorrect under certain circumstances.
  • IDAscope can now be properly used as an IDA plugin. It can be dropped into the plugins folder, allows autostarting upon loading of a new binary, and can be started via IDA's menu.
  • Config file format was changed from JSON to Python for easier parsing and the ability to comment entries within the file.
  • Semantic tags can now be grouped within the definitions file.
  • Entries in the FunctionInspection widget can now be shown as groups and filtered with custom expressions.
Filtering looks like this:



Probably more interesting is the visual feature I am working on.

Graphing Function Relationship

My current progress on graphing includes being able to extract the structure of arbitrary functions and their referenced children from IDA and to generate a graph layout based on this information. Nodes can still be moved around freely once the calculated layout has been "unlocked". Incoming and outgoing references are coloured green/red to improve navigation. API calls are not shown but shall be nested within the display of their respective calling function (red box to expand and show these API calls). The graph can be dragged around, navigated with the keyboard, and seamlessly zoomed in and out.
At the moment, it looks like this:



Before I actually fill this with more functionality such as actions upon clicks (move to function, rename function, displaying API calls within function, optional colouring, you name it, ...) I have to solve other, more essential issues. :)

When displaying graphs of functions with a lot of children, I run into the same issues as you all experienced with the WinGraph overviews:

 
You don't really grasp the structure any longer and everything becomes unreadable. However, having this window open beside your function view is already a benefit, I guess. Furthermore, removing API calls from the set of nodes being graphed improved the situation a bit as well, but I am not satisfied yet.

A property of these large graphs is that their aspect ratio is massively out of order: they are much wider than they are high. This can likely be fixed by patching the graph layout algorithm I am currently using. Again, thanks to bdcht for providing his lib grandalf!

While the relationship between functions is probably already easier to grasp in my graphs...




... I want to work towards something that is really helpful for browsing functions and recognizing patterns among their relationship.

Right now it's too "alpha" to show code around already, but please contact me if you have ideas you want to see embedded in this or if you see potential for improvement!

We'll see where I end up with this.
Make sure to check out the repository from time to time to keep up with the additions and improvements. Larger releases are announced here in the blog, shorter ones on Twitter.

29c3 Trip Report

I want to start the new year with a short trip report of my visit to the 29th Chaos Communication Congress (29c3) in Hamburg, Germany.

It was my first attendance at Germany's largest hacker conference, and it mostly met my expectations. Prior to travelling, and judging from the "Fahrplan" (that's what the overview of scheduled talks is called), hardcore tech talks had only a minor role this year, which was kind of sad. So from that point of view it was a bit disappointing for me personally, as I had already experienced two great technical conferences in 2012: REcon and a very familial and special one on binary occultism.

However, with about 6000 attendees, 29c3 was a great chance to meet people I knew from before, fill up some formerly purely digital contacts with real-life interactions, and randomly get to know new people. Shout-outs to all of you who enjoyed the time spent together as much as I did.

Apart from that, in one of the workshops and over the days, I obtained some basic lock-picking skills. I never thought I would enjoy it that much, but lock picking pretty much resembles my reverse-engineering activities, projected onto physical objects with an additional need for manual / mechanical skill. It's pretty much like Mastermind, I guess. I immediately bought myself some equipment and practice "materials". Maybe I will blog about my progress at a later time as well.

To be useful to my readers, here is my personal selection of talks I attended (in chronological order) that I would like to highlight because of their awesomeness.

SCADA Strangelove

The talk I appreciated the most on the first day was one of the last to be held, given by Sergey Gordeychik, Gleb Gritsai, and Denis Baranov (project's Twitter). While we all know that SCADA still has a lot of potential for future catastrophes, this talk gave a nice overview of how (NOT) hard it actually seems to be to pwn SCADA equipment. Very scary.

Aaron Portnoy's recent adventure into SCADA software had already given a nice impression of the state of security, but this talk completed the picture in a very entertaining way.



Many Tamagotchis Were Harmed in the Making of this Presentation

With a title like this, it was only natural to join this talk, as it implied a low-level focus and hardware hacking. My expectations were more than met when Natalie Silvanovich explained her journey towards making her Tamagotchis the happiest in the world, which she finally achieved by setting the respective variable to 0xFFFF. ;)

Tamagotchis were hip during my time in school, so I well remembered those little plastic eggs (though I never owned one). Natalie outlined the evolution of the devices since the 90s, showing pluggable bonus devices and explaining the IR communication capabilities of recent releases.

Having reversed the IR protocol already gave her plenty of power to mess with the little creatures, but it left some aspects unresolved that required reversing the chip. She continued by detailing her attempts to uncover and identify the microcontroller. In probably numerous hours of work, she was able to fiddle around with the EEPROM and figure ROM, finally being able to extract data such as the animations stored in this memory.

The talk was very informative and was presented in an awesome way.



How I met your pointer

First, I have to admit that I didn't attend this one in person because I was a bit late to the lecture hall, but hey, I watched the live stream!
I kind of knew Carlos Garcia Prado only from Twitter before his talk. He was the first person I followed, because I wanted to stay up to date on his Daemon Enterprises challenges during the time he published them. :)

His talk's topic was using binary instrumentation targeting client / server software in order to improve fuzzing.
He started out with a very short introduction to fuzzing as a technique to cause crashes in proprietary software by feeding pseudo-randomly crafted, hopefully invalid but still accepted content to its interfaces.

Next, he validly argued against dumb fuzzing. As an alternative approach, he came up with a comparison to biotech / protein manipulation. The binary equivalent in that sense would be interfering with a program's DNA (its code) and partly using / altering it to create custom behaviour.

He achieved this by combining hooking and instrumentation, namely through using Detours and PIN. Detours is used to intercept execution and save / manipulate program state; PIN is used to differentially debug the program to spot interesting parts / functionality.

He finally gave a demo showing his framework's functionality on a little network based crackme.

He spiced up his presentation by including tons of pictures from various movies and series, asking the audience about the identities of their characters. Correct answers were rewarded with pieces of chocolate. I would have loved to see that in person; on the stream it looked like he was throwing pretty hard. :)



Page Fault Liberation Army or Gained in Translation

This talk by Julian Bangert and Sergey Bratus gave an excellent insight into how weird x86 actually is.

Julian constructed a Turing-complete machine based purely on the behaviour of the trap flag. Instruction-set completeness is achieved with a single instruction that, depending on the case, represents either an arithmetic operation (SP decrement) or a branching operation (CPU double fault). This is enough to represent arbitrary programs.

It's noticeable that using double faults in such a way is highly unusual, as such an error under normal circumstances is most likely connected to a buggy kernel and will lead to a reboot in case the DF handler fails (= triple fault). Therefore, it's pretty impressive how Julian abused the specifications of x86 to create this weird machine.

Don't expect this behaviour to be easily demonstrated: of the emulation systems Julian tried (QEMU, Bochs, Simics, KVM, PLTSim), only Bochs was able to show this functionality properly.