Tuesday, September 18, 2012

IDAscope: *fixed*

Today I learned that I am bad at releasing code.
The initial release had a stupid string conversion bug that I introduced when trying to make the console output a bit prettier.
It's fixed now, the release link should now point to the new version.


Loop Awareness in Crypto Identification


However, you might be interested in downloading the new version I just pushed to the repo. It contains a new check box in Crypto Identification that allows to choose only those basic blocks that are contained in loops.

This new option again reduces dramatically false positives and supports a faster finding of potential crypto routines.

The following screenshot shows the option in action:

Loop awareness in the Crypto Identification widget.

Previously I would have considered the settings shown in the sliders as rather "wide" and thus giving lots of misleading / uninteresting blocks. In fact, without the additional limiting to looped blocks it would have shown 194 instead of the 28 blocks it shows now.


Tarjan's Algorithm


The calculation of the blocks in loops is also amazingly fast. Luckily it didn't introduce a noticeable overhead. That would have surely been the case if I had tried to implement this naively. ;)

Thankfully there is Tarjan's algorithm known from graph theory for detecting strongly connected components.

A function graph can be treated as a directed graph (N, V) with N being the nodes and V (N x N) being the vertices. N is represented by all basic blocks of the function. The set of vertices can be collected by looping over all blocks (e.g. accessible via idaapi.FlowChart()) and obtaining their successors generated by idaapi.BasicBlock.succs().

Tarjan's algorithm is executed once on every node of the graph, treating it as a potential "root" node. The paths leaving the root node are explored via depth-first search (DFS), meaning successive nodes are immediately explored as well as long as they have not been explored earlier. 
While traversing the graph, two steps are performed per encountered node. First, the node is assigned both a "DFS index" that represents the depth in which the node has been found and a "low link", which is the smallest "DFS index" reachable among this node's successors. Second, the node is stored on a stack.
When no more nodes can be explored, the stack is walked backwards. Whenever a node is encountered that has an equal "DFS index" and "low link", this node is the root of a strongly connected component and all elements on the stack above it are members of its component. 
The algorithm has linear complexity in nodes and vertices: O(|N| + |V|) and thus is very efficient.

There is a nice visualization explaining the algorithm in Wikipedia (German).

The algorithm does not allow to differ single, unlooped blocks from single, trivially looped blocks (blocks with a loop by pointing to themselves). Therefore,
 the set of successors of a block has to be additionally checked for the identical block.

I will try to provide some theoretical background on the implementation details of IDAscope once in a while and hope it's not too boring. :)

Monday, September 17, 2012

IDAscope Release!

Good News for all people interested in IDAscope.
I submitted the extension to the Hex-Rays contest two days ago and today I want to follow up promises and release a first public version.

You can find the repository we will use to publish future updates here:

https://bitbucket.org/daniel_plohmann/simplifire.idascope/

There is a prepared download for version 1.0. You can also checkout via git from the repo (right now they have the same content).

For installation instructions, feature description and usage, check out the manual located in IDAscope/documentation/manual/.

A big THANK YOU to all people who did a beta test of IDAscope. You provided very helpful feedback!

We hope it turns out to be an useful addition to IDA Pro. Development does not stop here of course, as we plan to continue the project to integrate more features.

We are open to feedback, so don't hesitate to tell us your opinion!

UPDATE 18.09.2012: There was a bug in the initial release that has been fixed. Check out the following blog post for details and a new feature!

Saturday, September 1, 2012

IDAscope beta update

Nothing much to blog about. Therefore, only a short update on IDAscope's progress.
I just pushed out a second beta version to the people that expressed interest in testing it. If you are interested, too, this announcement is still valid. ;)

Here is a list of changes/fixes included with the second beta:

Function Inspection:
- Added functionality to create functions from unrecognized code. This function will first try to find and convert function prologues (push ebp; mov ebp, esp) and then convert the remaining undefined code.
- Added functionality to identify and rename potential wrappers (small functions with exactly one call referencing an API function). Thanks to Branko Spasojevic for this contribution.

WinAPI:
- Fixed path resolution for html files, should work on non-Windows operating systems now, too. Thanks to Sascha Rommelfangen for fixing this, I only have IDA versions on Windows available so I could hardly debug this.
- Included a back/forward button to allow easier browsing of visited articles.

Crypto Identification:
- Adjusted default parameters to a tighter set, resulting in less false positives on startup.
- Added some crypto signatures (CRC32 generator, TEA/XTEA/XXTEA).

The public release will be in two weeks from now.