Wednesday, June 5, 2013

Small IDAscope update

I should probably visit more conferences. When heading out, it's always nice to get some fresh input from people. I'm currently at CyCon (Tallinn, Estonia) where I'll give a talk about the malware analysis workflow we are using at Fraunhofer FKIE. I'll add the corresponding paper to pnx.tf as soon as I get the approval.

The motivation for the latest little addition to IDAscope has to be credited to Hugo Teso, whom I met yesterday. I'm really looking forward to his Friday talk on aviation security, which will be related to the one he gave at HITB this year.

Semantics Profiles

The latest commit to IDAscope contains an extension to the "Function Inspection" part. It is now possible to quickly switch between different profiles for semantic definitions, which should open up this part of IDAscope for better usage in analysing ring0 stuff as well as POSIX or even other platforms.



IDAscope will automatically load all semantics profiles it finds in the newly created subfolder:
  • \idascope\data\semantics
Currently there are two profiles: First, the already known win-ring3 profile, which resembles the default that you know from "Function Inspection". Second, a placeholder file for POSIX. It is only a placeholder because I have not taken the time to add any specifications for it yet.
Actually, I'd love to see contributions from volunteers who have been actively using this feature or who told me they were looking forward to this extension. :) Otherwise I might add new profiles myself if I have a good day or something.

The specification for a profile looks like this and should be easy to extend (JSON):


{
    "author": "pnX",
    "creation_date": "05.06.2013",
    "name": "posix",
    "reference": "http://pnx-tf.blogspot.com/2013/06/small-idascope-update.html",
    "comment": "template for POSIX semantics",
    "renaming_seperator": "_",
    "default_neutral_color": "0xCCCCCC",
    "default_base_color": "0xB3DfFF",
    "default_highlight_color": "0x33A7FF",
    "semantic_groups": [{
        "tag": "grouptag0",
        "name": "group0",
        "base_color": "0xB3B3FF",
        "highlight_color": "0x333377"
    }, {
        "tag": "grouptag1",
        "name": "group1",
        "base_color": "0xB3DFFF",
        "highlight_color": "0x33A7FF"
    }],
    "semantic_definitions": [{
        "tag": "socket",
        "name": "socket",
        "group": "group0",
        "api_names": ["socket", "open", "close", "connect", "listen", "bind"]
    }]
}
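
To illustrate how such a profile could be consumed, here is a minimal Python sketch. It is not IDAscope's actual loader: the function names, the ".json" file extension, and the fallback to "default_highlight_color" are assumptions for illustration. It also assumes that a definition's "group" refers to a group's "name", as the example above suggests.

```python
import json
import os


def load_profiles(folder):
    """Load every profile in `folder`, keyed by its "name" field.

    Assumption: profile files carry a ".json" extension.
    """
    profiles = {}
    for filename in sorted(os.listdir(folder)):
        if filename.endswith(".json"):
            with open(os.path.join(folder, filename), "r") as handle:
                profile = json.load(handle)
            profiles[profile["name"]] = profile
    return profiles


def build_api_index(profile):
    """Map each API name to (tag, group name, highlight colour as int)."""
    groups = {group["name"]: group for group in profile["semantic_groups"]}
    # Hypothetical fallback: use the profile-wide default colour when a
    # definition names a group that is not declared.
    default_colour = profile.get("default_highlight_color", "0x000000")
    index = {}
    for definition in profile["semantic_definitions"]:
        group = groups.get(definition["group"], {})
        colour = int(group.get("highlight_color", default_colour), 16)
        for api_name in definition["api_names"]:
            index[api_name] = (definition["tag"], definition["group"], colour)
    return index
```

With the POSIX template above, looking up "connect" in the resulting index would yield the tag "socket", the group "group0", and that group's highlight colour.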


Future

Regarding future updates on IDAscope, I obviously took almost half a year off from developing until now. I hope that I will be able to spend more time on improving IDAscope in the near future.
It's especially sad that I was not able to finish the interactive call-graph feature yet that I had previewed in one of the previous blog posts. I am still inclined to finish this, just to give the time I already spent on it more value.
In parallel I will likely also start refactoring the storage of the internal analysis results. I have something in mind that could be done with it but it's not yet publishable, so I'll leave it with that.

If you have other ideas or feature requests, please let me know.

Saturday, January 5, 2013

IDAscope progress

Originally, I only wanted to give a short update on the stuff I did to IDAscope at the very end of the 29c3 post... Apparently I created enough content to let this be a post of its own. 
So now I want to cover the recent activities around IDAscope from the last month or so. I'm currently working on graphing stuff, as some of you might have seen on Twitter already, but I will cover this in full detail in an extra post once it has reached a presentable (release-worthy, that is) state.

Late November I had some free time to push IDAscope a bit forward. As can be seen from the commit history, most changes were bugfixes, covering:

  • Update to renaming wrappers (thanks Branko).
  • A small bugfix for xrange() beyond 0xFFFFFFFF.
  • The usage of the results generated by Tarjan's algorithm for finding strongly connected components was implemented incorrectly and would not cover basic blocks in nested or non-trivial loops.
  • The counting of semantic API hits in FunctionInspection was incorrect under certain circumstances.
  • IDAscope can now properly be used as an IDA plugin. It can be dropped into the plugins folder, allows autostarting with the loading of a new binary and can be started via IDA's menu.
  • Config file format was changed from JSON to Python for easier parsing and the ability to comment entries within the file.
  • Semantic tags can now be grouped within the definitions file.
  • Entries in the FunctionInspection widget can now be shown as groups and filtered with custom criteria.
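
As an illustration of the Tarjan-related fix listed above, here is a minimal Python sketch of finding all basic blocks that take part in a loop, including nested ones. It is not IDAscope's code: the function names and the dictionary-based graph representation are assumptions. The idea is that every strongly connected component with more than one block (or a single block with an edge to itself) consists entirely of blocks inside loops.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm on a dict mapping node -> list of successors.

    Recursive for brevity; very deep graphs would need an iterative variant.
    """
    index_of, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = [0]

    def visit(node):
        index_of[node] = lowlink[node] = counter[0]
        counter[0] += 1
        stack.append(node)
        on_stack.add(node)
        for successor in graph.get(node, ()):
            if successor not in index_of:
                visit(successor)
                lowlink[node] = min(lowlink[node], lowlink[successor])
            elif successor in on_stack:
                lowlink[node] = min(lowlink[node], index_of[successor])
        if lowlink[node] == index_of[node]:
            # `node` is the root of a component: pop all of its members.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index_of:
            visit(node)
    return components


def blocks_in_loops(graph):
    """Collect every block that lies on some cycle of the graph."""
    looped = set()
    for component in strongly_connected_components(graph):
        has_self_edge = component[0] in graph.get(component[0], ())
        if len(component) > 1 or has_self_edge:
            looped.update(component)
    return looped


# Hypothetical control flow graph with an inner loop nested in an outer loop.
NESTED_CFG = {
    "entry": ["outer_head"],
    "outer_head": ["inner_head", "exit"],
    "inner_head": ["inner_body"],
    "inner_body": ["inner_head", "outer_latch"],
    "outer_latch": ["outer_head"],
    "exit": [],
}
```

For NESTED_CFG, both the inner and the outer loop end up in the same component, so all four loop blocks are reported while "entry" and "exit" are not.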
Filtering looks like this:



Probably more interesting is the visual feature I am working on.

Graphing Function Relationship

My current progress on graphing includes being able to extract the structure of arbitrary functions and their referenced children from IDA and generating a graph layout based on this information. However, nodes can still be moved freely around once the calculated layout has been "unlocked". Incoming and outgoing references are coloured green/red to improve the navigation. API calls are not shown but shall be nested within the display of their respective calling function (red box to expand and show these API calls). The graph can be dragged around, navigated with keyboard and seamlessly zoomed in and out.
At the moment, it looks like this:



Before I actually fill this with more functionality such as actions upon clicks (move to function, rename function, displaying API calls within function, optional colouring, you name it, ...) I have to solve other, more essential issues. :)

When displaying graphs of functions with a lot of children, I run into the same issues as you all experienced with the WinGraph overviews:

 
You don't really get the structure any longer and everything becomes unreadable. However, having this window open beside your function view is already a benefit, I guess. Furthermore, removing API calls from the set of nodes being graphed improved the situation a bit as well, but I am not satisfied yet.

A property of these large graphs is that their aspect ratio is massively out of proportion: they are much wider than they are high. This can likely be fixed by patching the graph layout algorithm I am currently using. Again, thanks to bdcht for providing his lib grandalf!
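
To make the aspect ratio problem a bit more tangible, here is a small Python sketch. It does not use grandalf and is not the layout algorithm used in IDAscope. It simply assigns each function to a layer by its breadth-first distance from the root (a deliberately simplified stand-in for a layered layout) and compares the widest layer with the number of layers. All names are assumptions for illustration.

```python
from collections import defaultdict, deque


def assign_layers(call_graph, root):
    """Assign each reachable function its breadth-first depth from `root`."""
    layers = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in call_graph.get(node, ()):
            if child not in layers:
                layers[child] = layers[node] + 1
                queue.append(child)
    return layers


def layout_dimensions(call_graph, root):
    """Return (width, height) in nodes: widest layer and number of layers."""
    nodes_per_layer = defaultdict(int)
    for depth in assign_layers(call_graph, root).values():
        nodes_per_layer[depth] += 1
    return max(nodes_per_layer.values()), len(nodes_per_layer)


# Hypothetical function with twelve children, two of which share a helper.
WIDE_GRAPH = {
    "main": ["sub_%02d" % i for i in range(12)],
    "sub_00": ["helper"],
    "sub_01": ["helper"],
}
```

For WIDE_GRAPH this gives a width of 12 nodes against a height of 3 layers, which is the kind of imbalance described above for functions with many children.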

While the relationship between functions is probably easier to grasp in my graphs already...




... I want to work towards something that is really helpful for browsing functions and recognizing patterns among their relationship.

Right now the code is too "alpha" to show around, but please contact me if you have ideas you want to see embedded into this or see potential for improvement!

We'll see where I end up with this.
Make sure to check out the repository from time to time to keep up with the additions and improvements. Larger releases are announced here in the blog, shorter ones on Twitter.

29c3 Trip Report

I want to start the new year with a short trip report of my visit to the 29th Chaos Communication Congress (29c3) in Hamburg, Germany.

It was my first time attending Germany's largest hacker conference, and it mostly met my expectations. Judging from the "Fahrplan" (that's what the overview of scheduled talks is called) before travelling, hardcore tech talks played only a minor role this year, which was kind of sad. So from that point of view it was a bit disappointing for me personally, as I had already experienced two great technical conferences in 2012: REcon and a very familial and special one on binary occultism.

However, with about 6000 attendees, 29c3 was a great chance to meet people again that I knew from before, fill up some formerly pure digital contacts with real life interactions and randomly get to know new people. Shouts out to all of you who enjoyed the time spent together as much as I did.

Apart from that, in one of the workshops and over the days I picked up some basic lock picking skills. I never thought I would enjoy it that much, but lock picking pretty much resembles my reverse engineering activities, projected onto physical objects and with an additional need for manual / mechanical skill. It's pretty much like Mastermind, I guess. I immediately bought myself some equipment and practice "materials". Maybe I will blog about my progress at a later time as well.

To be useful to my readers, here is my personal selection of some talks I visited (in chronological order) and which I would like to highlight because of their awesomeness.

SCADA Strangelove

The talk I appreciated the most on the first day was one of the last to be held, given by Sergey Gordeychik, Gleb Gritsai, and Denis Baranov (Project's Twitter). While we all know that SCADA still has a lot of potential for future catastrophes, this talk gave a nice overview of how (NOT) hard it actually seems to be to pwn SCADA equipment. Very scary.

Aaron Portnoy's recent adventure into SCADA software already gave a nice impression on the state of security but this talk completed the picture in a very entertaining way.



Many Tamagotchis Were Harmed in the Making of this Presentation

With a title like this, it was only natural to attend this talk, as it implied a low-level focus and hardware hacking. My expectations were more than met when Natalie Silvanovich explained her journey towards making her Tamagotchis the happiest in the world, which she finally achieved by setting the respective variable to 0xFFFF. ;)

Since Tamagotchis were hip during my time in school, I remembered those little plastic eggs well (though I never owned one). Natalie outlined the evolution of the devices since the 90s, showing pluggable bonus devices and explaining the IR communication capabilities of recent releases.

Having reversed the IR protocol already gave her plenty of power to mess with the little creatures but left some aspects unresolved that required reversing the chip. She continued by detailing her attempts to uncover and identify the microcontroller. In probably numerous hours of work she was able to fiddle around with the EEPROM and Figure ROM, finally being able to extract some data such as the animations stored in this memory.

The talk was very informative and was presented in an awesome way.



How I met your pointer

First: I have to admit that I didn't visit that one in person because I was a bit late at the lecture hall, but hey, I watched the live stream!
I kind of knew Carlos Garcia Prado only from Twitter before his talk. He was the first person I followed because I wanted to stay up to date on his Daemon Enterprises challenges during the time he published them. :)

His talk's topic was using binary instrumentation targeting client / server software in order to improve fuzzing. 
He started out with a very short introduction to fuzzing as being a technique to cause crashes in proprietary software by feeding pseudo-randomly crafted and thus hopefully invalid but acceptable content to interfaces.
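
To make the idea concrete, here is a minimal Python sketch of such a fuzzing loop. It is not the framework from the talk: the toy parser, the function names, and the mutation strategy are all assumptions for illustration. A ValueError stands for the target cleanly rejecting invalid input, while any other exception stands in for a crash.

```python
import random


def parse_record(data):
    """Toy target: one length byte, the payload, then one checksum byte."""
    length = data[0]
    payload = data[1:1 + length]
    checksum = data[1 + length]  # bug: trusts the length byte blindly
    if sum(payload) % 256 != checksum:
        raise ValueError("bad checksum")  # clean rejection of invalid input
    return payload


def mutate(sample, rng, flips=4):
    """Overwrite a few randomly chosen bytes with random values."""
    data = bytearray(sample)
    for _ in range(flips):
        data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)


def fuzz(target, sample, iterations=1000, seed=0):
    """Feed mutated samples to `target` and collect unexpected exceptions."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    crashes = []
    for _ in range(iterations):
        candidate = mutate(sample, rng)
        try:
            target(candidate)
        except ValueError:
            continue  # rejected as invalid: not interesting
        except Exception as error:
            crashes.append((candidate, type(error).__name__))
    return crashes


# Valid sample: length 3, payload "abc", checksum (97 + 98 + 99) % 256 == 38.
SAMPLE = bytes([3]) + b"abc" + bytes([38])
```

Calling fuzz(parse_record, SAMPLE) collects inputs whose corrupted length byte makes the parser read past the end of the buffer. Most mutated inputs are simply rejected by the checksum test, which hints at why purely random mutation wastes so many attempts.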

Next, he validly argued against dumb fuzzing. As an alternative approach he came up with a comparison to biotech / protein manipulation. The binary equivalent in that sense would be interfering with a program's DNA (code) and partly using / altering it to create custom behaviour.

He achieved this by combining hooking and instrumentation, namely through using Detours and PIN. Detours is used to intercept execution and save / manipulate program state, PIN is used to differentially debug the program to spot interesting parts / functionality.

He finally gave a demo showing his framework's functionality on a little network based crackme.

He spiced up his presentation by including tons of pictures from various movies and series and asking the audience to identify the characters shown. Correct answers were rewarded with pieces of chocolate. I would have loved to see that in person; on the stream it looked like he was throwing pretty hard. :)



Page Fault Liberation Army or Gained in Translation

This talk by Julian Bangert and Sergey Bratus gave an excellent insight on how weird x86 actually is. 

Julian constructed a Turing-complete machine based just on the behaviour of the trap flag. Instruction set completeness is achieved with only one instruction which, depending on the case, represents either an arithmetic (SP decrement) or a branching (CPU double fault) operation. This is enough to represent arbitrary programs.
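
To get a feeling for how a single instruction can serve as both arithmetic and branching, here is a toy Python interpreter. It does not model the x86 mechanism from the talk. The instruction format (destination, source, next instruction if non-zero, next instruction if zero) and all names are assumptions, loosely inspired by the decrement-or-branch description above.

```python
HALT = -1  # hypothetical program counter value that stops the machine


def run(program, registers, max_steps=10000):
    """Execute a program made of one instruction type only.

    Each instruction is (dest, src, if_nonzero, if_zero):
    if registers[src] is zero, branch to `if_zero`; otherwise store
    registers[src] - 1 in registers[dest] and continue at `if_nonzero`.
    """
    pc = 0
    steps = 0
    while pc != HALT:
        if steps >= max_steps:
            raise RuntimeError("step limit reached")
        dest, src, if_nonzero, if_zero = program[pc]
        if registers[src] == 0:
            pc = if_zero  # the branching case
        else:
            registers[dest] = registers[src] - 1  # the arithmetic case
            pc = if_nonzero
        steps += 1
    return registers


# Saturating subtraction a - b: alternately count down b and a until one
# of them reaches zero.
SUBTRACT = [
    ("b", "b", 1, HALT),
    ("a", "a", 0, HALT),
]
```

Running SUBTRACT with a = 7 and b = 3 leaves 4 in register a. The loop and the arithmetic both come from the same instruction, which is the spirit of the construction, although this toy says nothing about how faults are used to implement it on real hardware.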

It's noticeable that using double faults in such a way is very transcendental as such an error under normal circumstances is most likely connected to a buggy kernel and will lead to a reboot in case the DF handler fails (= triple fault). Therefore, it's pretty impressive how Julian has abused the specifications of x86 to create this weird machine. 

Don't expect this behaviour to be demonstrated easily: out of the emulation systems tried by Julian (QEMU, Bochs, Simics, KVM, PTLsim), only Bochs was able to show this functionality properly.