Pharos publishes a number of utilities, one of which is OOanalyzer. This tool uses Prolog to build inferences that help reconstruct objects. At the moment it only works on 32-bit Windows applications. Still, let's try it out:
The easiest way to get going with this is to use their provided Docker image. Going roughly by the documentation,

$ sudo apt install docker.io
$ sudo docker pull seipharos/pharos
$ mkdir /chose/your/path/hostdir
$ sudo docker run --rm -it -v /chose/your/path/hostdir:/dir seipharos/pharos

will launch an interactive session in which the host directory /chose/your/path/hostdir is mapped to /dir inside the container.
OOanalyzer is then found in /usr/local/bin, and more tools and files can be found in /root/pharos.
Many sample executables for testing the functionality are shipped under /root/pharos/tests.
For example:

~# /usr/local/bin/ooanalyzer /root/pharos/tests/ooex_vs2010/Release/oo.exe --json oo.exe.json
OPTI[INFO ]: Analyzing executable: /root/pharos/tests/ooex_vs2010/Release/oo.exe
OPTI[INFO ]: OOAnalyzer version 1.0.
OPTI[INFO ]: ROSE stock partitioning took 1.94562 seconds.
OPTI[INFO ]: Partitioned 3051 bytes, 980 instructions, 321 basic blocks, 0 data blocks and 90 functions.
OPTI[INFO ]: Pharos function partitioning took 2.18172 seconds.
OPTI[INFO ]: Partitioned 4096 bytes, 1104 instructions, 363 basic blocks, 14 data blocks and 108 functions.
OPTI[INFO ]: Function analysis complete, analyzed 56 functions in 3.39674 seconds.
OPTI[INFO ]: OOAnalyzer analysis complete, found: 3 classes, 8 methods, 0 virtual calls, and 0 usage instructions.
OPTI[INFO ]: Successfully exported to JSON file 'oo.exe.json'.
OPTI[INFO ]: OOAnalyzer analysis complete.

This created an oo.exe.json file that can be imported into IDA or Ghidra - more on that later.
I found, however, that on some real-world executables the default settings fail to work. Let's have a look at V9.exe, an MFC executable about 19 MB in size.

~# /usr/local/bin/ooanalyzer /dir/V9.exe --json V9.exe.json
OPTI[INFO ]: Analyzing executable: /dir/V9.exe
OPTI[INFO ]: OOAnalyzer version 1.0.
OPTI[INFO ]: ROSE stock partitioning took 1899.79 seconds.
OPTI[INFO ]: Partitioned 4425314 bytes, 1156333 instructions, 261239 basic blocks, 536 data blocks and 18759 functions.
OOAN[FATAL]: Partitioner absolute memory exceeded: 7391.06 secs CPU, 8000.04 MB memory, 7399.29 secs elapsed
OOAN[FATAL]: Exiting prematurely, increase --partitioner-timeout and try again.

Additionally, there is another parameter, --log, which helps diagnose any problems - more on that later. Thus, to process our file, we increase the defaults:

/usr/local/bin/ooanalyzer /dir/V9.exe --json V9.json --timeout 1000000 --maximum-memory 1500000 --partitioner-timeout 1000000 --log='APID(all)'

and after a very, very long time, it finishes - even though a number of error messages show up in the log.

FSEM[ERROR]: Analysis of function 0x0079D90E failed: relative CPU time exceeded
OPTI[INFO ]: Function analysis complete, analyzed 49805 functions in 36946.6 seconds.
PLOG[ERROR]: OOAnalyzer has been running for over an hour in Prolog mode. We have found that for, complex executables, SWI Prolog often outperforms XSB Prolog. You may wish to dump the .facts file for your executable using the -F option of ooanalyzer, and then run the oodebugrun-swipl script in share/pharos/oorules of your build directory. You will need to install swipl and ensure it is on your PATH.
OPTI[INFO ]: OOAnalyzer analysis complete, found: 1330 classes, 7818 methods, 121 virtual calls, and 17326 usage instructions.
OPTI[INFO ]: Successfully exported to JSON file 'V9alt.json'.
OPTI[INFO ]: OOAnalyzer analysis complete.
root@8f638bf7cf9a:/dir#

Since we get a warning about Prolog performance, we would like to try it the other way (XSB Prolog), but the software authors informed me that they currently don't support creating the JSON file for the plugin from XSB, so we have to stay with this for now.
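Before reading the messages one by one, it can help to tally them. The following is a minimal sketch, assuming the output was captured in a file (the name status.txt is only an example) and that every message starts with a four-letter facility code followed by a bracketed level, as in the output above. The function name count_log_levels is made up for this illustration.

```shell
# Hypothetical helper: tally WARN/ERROR/FATAL messages per facility.
# Assumes lines of the form "FSEM[ERROR]: ..." as seen in the log above.
count_log_levels() {
    # -o prints only the matched "XXXX[LEVEL]" prefix of each line;
    # uniq -c counts identical prefixes, sort -rn lists the most frequent first.
    grep -oE '^[A-Z]{4}\[(WARN |ERROR|FATAL)\]' "$1" | sort | uniq -c | sort -rn
}

# Example usage (status.txt is an assumed file name):
# count_log_levels status.txt
```

INFO and TRACE lines are deliberately ignored, so the result only shows where the analysis actually struggled.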
There are a number of warnings/errors output by the software. For example, ooanalyzer does not know about setupapi.dll:

APID[TRACE]: API Lookup: SETUPAPI:SetupDiGetClassDevsA
APID[WHERE]: JSON API database /usr/local/share/pharos/apidb/pharos-api-additions.json has no data for DLL: SETUPAPI
APID[WHERE]: SQLite API database /usr/local/share/pharos/apidb/pharos-apidb.sqlite has no data for DLL: SETUPAPI
APID[WHERE]: Decorated name parser has no data for DLL: SETUPAPI
APID[WARN ]: API database has no data for DLL: SETUPAPI
APID[TRACE]: API Lookup: SETUPAPI:SetupDiEnumDeviceInterfaces

but ooanalyzer does provide a mechanism to teach it: the JSON API database /usr/local/share/pharos/apidb/pharos-api-additions.json.
The default file contains an example:

{
  "config": {
    "exports": [
      {
        "dll": "OBSCURE32.DLL",
        "export_name": "SomeFunction",
        "display_name": "SomeFunction",
        "convention": "stdcall",
        "parameters": [
          {"name": "dwFirstParam", "type": "DWORD", "inout": "in"}
        ],
        "type": "void",
        "ordinal": 123
      }
    ]
  }
}

so we can use the .def file from Wine to help ooanalyzer along a bit:
http://www.mit.edu/afs.new/sipb/project/wine/arch/i386_rhel4/lib/wine/libsetupapi.def
with a little script

#!/bin/bash
# Convert the exports listed in a Wine .def file into an ooanalyzer API JSON file.
filename='libsetupapi.def'
echo "{ \"config\": { \"exports\": [" > setupapi.json
while read -r line; do
    echo "$line"
    echo '{"dll": "setupapi.dll",' >> setupapi.json
    name=$(echo "$line" | awk '{ print $1 }')     # first column: name@delta
    name="${name%%@*}"                            # keep the function name only
    echo '"export_name": "'$name'",' >> setupapi.json
    echo '"display_name": "'$name'",' >> setupapi.json
    echo '"convention": "stdcall",' >> setupapi.json
    echo '"parameters": [' >> setupapi.json
    delta=$(echo "$line" | awk '{ print $1 }')    # first column again
    delta="${delta#*@}"                           # keep the stack delta only
    echo ' {"delta": "'$delta'"}' >> setupapi.json
    echo '],' >> setupapi.json
    echo '"type": "UnknownReturn",' >> setupapi.json
    ordinal=$(echo "$line" | awk '{ print $2 }')  # second column: @ordinal
    echo "$ordinal"
    ordinal="${ordinal#*@}"                       # strip the leading @
    echo '"ordinal": '$ordinal'' >> setupapi.json
    echo '},' >> setupapi.json
done < "$filename"
truncate -s-1 setupapi.json   # remove the trailing newline
truncate -s-1 setupapi.json   # remove the last comma
echo "" >> setupapi.json
echo "]}}" >> setupapi.json

we just need to replace pharos-api-additions.json with setupapi.json
and then do the same thing with the other DLLs that are missing, combining all of them together into a single pharos-api-additions.json.
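To see which other DLLs need such a file, the warnings can be collected from a saved log, and the generated files can then be combined. This is a minimal sketch, assuming the log was written to a file, that the warning text matches the APID[WARN ] line shown earlier, and that jq is installed. jq is not part of the original workflow; it is used here only as one convenient way to concatenate the exports arrays. Both function names are made up for this illustration.

```shell
# Hypothetical helper: list every DLL the API database has no data for.
missing_dlls() {
    # -F treats the pattern as a literal string, so the brackets need no escaping.
    grep -F 'APID[WARN ]: API database has no data for DLL:' "$1" |
        awk '{ print $NF }' | sort -u
}

# Hypothetical helper: merge several API JSON files into one document.
# Assumes each input has the {"config": {"exports": [...]}} layout of the
# default pharos-api-additions.json. Requires jq (an assumption).
merge_apidb() {
    # -s slurps all inputs into one array; the exports arrays are concatenated.
    jq -s '{config: {exports: (map(.config.exports) | add)}}' "$@"
}

# Example usage (file names are assumptions):
# missing_dlls status.txt
# merge_apidb setupapi.json winmm.json > pharos-api-additions.json
```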
The MFC42.dll JSON needs to be generated a bit differently because of the mangled names and the object-oriented nature of the calls:

#!/bin/bash
filename='MFC42.DEF'
echo "{ \"config\": { \"exports\": [" > mfc42.json
while read line; do
    if [[ $line != \;* ]]   # skip comment lines starting with ';'
    then
        echo $line
        mangled=$(echo $line|awk '{ print $1}')
        #echo $mangled
        ordinal=$(echo $line|awk '{ print $3}')
        #echo $ordinal
        demangled=`apilookup --pretty=4 --json=- mfc42.dll:"$mangled"`
        #echo "$demangled"
        ordinalInserted=`echo "$demangled" | sed 's/\"export_name\":/"ordinal\": '$ordinal' ,\n\"export_name\":/'`
        ordinalInserted=`echo "$ordinalInserted" | sed 's/\"dll\": \"mfc42\"/\"dll\": \"MFC42.DLL\"/'`
        ordinalInserted="${ordinalInserted:1}" # strip first char
        ordinalInserted="${ordinalInserted::-1}" # strip last char
        echo "$ordinalInserted" >> mfc42.json
        echo "$ordinalInserted"
        echo "," >> mfc42.json
    fi
done < $filename
truncate -s-1 mfc42.json   # remove the trailing newline
truncate -s-1 mfc42.json   # remove the last comma
echo "]}}" >> mfc42.json

Update: the winmm.json, winspool.json, mfc42.json and setupapi.json I made have now been merged into the contrib folder of the project, so they can be used simply by passing --apidb contrib/winspool.json etc.
The --json option (using XSB Prolog) usually works fine, but in my test case here it eventually succumbs to an error, and it's a wontfix: https://github.com/cmu-sei/pharos/issues/90
Update: the --json option should now work, as the project has moved to using SWI Prolog by default.
So we have to create a .facts file first and then use SWI Prolog to move on.
Also, ooanalyzer does not always find the addresses of the new and delete methods, so we have to pass those manually: https://github.com/cmu-sei/pharos/issues/85
First:

$ /usr/local/bin/ooanalyzer ./V9.exe --timeout 1000000 --per-function-timeout 1000000 --partitioner-timeout 1000000 --maximum-memory 1000000 --per-function-maximum-memory 1000000 --threads 2 --apidb contrib/winmm.json --apidb contrib/winspool.json --apidb contrib/mfc42.json --apidb contrib/setupapi.json --log='APID(all)' --prolog-facts V9.exe.ooanalyzer.facts --new-method 0x8431b2 --delete-method 0x8431ac > status.txt 2>&1

and then, as a second step, we sort the .facts file and count the facts of each type, as specified here: https://github.com/cmu-sei/pharos/tree/master/share/prolog/oorules

awk -F\( '{print $1}' V9.exe.ooanalyzer.facts | sort | uniq -c

And then create the JSON file

/usr/local/share/pharos/prolog/oorules/oodebugrun V9.exe.ooanalyzer.facts > ooprog.log

from which we now extract the final determinations

grep ^final ooprog.log > ooprog-results_V9.pl

and finally

# /usr/local/share/pharos/prolog/oorules/oojson ooprog-results_V9.pl > V9.ooanalyzer.json

which produces a V9.ooanalyzer.json file that can be opened in the ooanalyzer plugin for Ghidra or IDA.
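The three post-processing steps can be chained so that a failure in one stops the rest. This is only a sketch: the tool names and the default directory come from the commands above, while the OORULES variable, the function name facts_to_json and the intermediate file names are assumptions made for this illustration.

```shell
# Directory holding oodebugrun and oojson; override if your install differs.
OORULES=${OORULES:-/usr/local/share/pharos/prolog/oorules}

# Hypothetical wrapper: turn a .facts file into the JSON file for the plugin.
# Usage: facts_to_json INPUT.facts OUTPUT.json
facts_to_json() (
    facts=$1
    out=$2
    base=${facts%.facts}                               # strip the .facts suffix
    "$OORULES/oodebugrun" "$facts" > "$base.prolog.log" &&
    grep '^final' "$base.prolog.log" > "$base.results.pl" &&  # keep final determinations
    "$OORULES/oojson" "$base.results.pl" > "$out"
)

# Example usage:
# facts_to_json V9.exe.ooanalyzer.facts V9.ooanalyzer.json
```

Because grep exits with a non-zero status when nothing matches, the chain also stops if the Prolog run produced no final determinations at all, instead of writing a JSON file from an empty input.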
Here is a function inside an executable as decompiled by stock Ghidra:
And here is the same function after using ooanalyzer:
OOanalyzer found that this function is a class member function, determined the hierarchy, changed the calling convention to thiscall and labelled the this pointer in the code. This can save a lot of manual work.
References: