Practical Malware Analysis - Conclusion Lab Write-up
Extra - YARA: Malware Identification
This section is independent of the PMA Labs. It contains resources to expand on malware analysis practices by creating Yara rules for detecting different variants or families of malware.
For those getting started, Florian Roth has written some excellent guides, which we’ll base this section on for simplicity and efficiency:
- How to Write Simple but Sound Yara Rules
- How to Write Simple but Sound Yara Rules – Part 2
- How to Write Simple but Sound Yara Rules – Part 3
First we’ll look at using yarGen as a starting point. This is a Yara rule generator created by Florian Roth.
We download the latest release, extract it, open a terminal in that directory, and install the requirements on our Linux distribution of choice (here we’ve used the SIFT Workstation created by SANS). We’ll also make sure pip (Pip Installs Packages), which as the name implies manages Python packages, is installed for python3.6.
In the example below we’ve explicitly used Python 3.6 to avoid some bugs present in Python 3.5.
sudo python3.6 -m pip install --upgrade pip
sudo pip3.6 install -r requirements.txt
Next we’ll update the database of known goodware strings, which is used to help avoid false positives in the rules we create. This may take a bit of time.
python3.6 yarGen.py --update
Note: When attempting to run yarGen.py I experienced an issue with ‘etree’ being imported from lxml. Reinstalling lxml at the latest version available at the time of writing fixed the issue.
sudo pip3.6 install lxml==4.6.1
From here we’re able to start generating a Yara rule. A simple example of generating a rule for Chapter_1 binaries is shown below.
python3.6 yarGen.py -a "@CyberRaiju" -r "PMA Labs - Chapter 1" -m BinaryCollection/Chapter_1L/ -o PMA-Lab01.yar
The parameters used are:
- -a: set the author field.
- -r: set the reference field.
- -m: malware path (this must be a folder).
- -o: output file for the Yara rule.
The end result is a starting Yara rule that we can then perform some ‘post-processing’ on. Some items of interest have been highlighted.
There are clearly some good indicators pulled from the binaries in question. What we need to do is define specific strings (in this case they become ‘x’), which should in themselves be unique to this family of malware, and combined strings (in this case they become ‘s’), which may come up in other binaries but whose combination would be common across this malware.
- x: specific strings that can be used as an indicator by themselves.
- s: strings that may appear across multiple samples, but together would become an indicator.
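To make the difference between the two string classes concrete, here is a minimal Python sketch of the decision logic. It is not Yara itself and not yarGen’s output; it only emulates a condition of the shape `1 of ($x*) or N of ($s*)`. The function name `rule_matches`, the threshold `min_s`, and all sample strings are hypothetical and chosen purely for illustration.

```python
from typing import Iterable


def rule_matches(data: bytes,
                 x_strings: Iterable[bytes],
                 s_strings: Iterable[bytes],
                 min_s: int = 3) -> bool:
    """Emulate a condition like `1 of ($x*) or min_s of ($s*)`."""
    # A single 'x' string is specific enough to be an indicator on its own.
    if any(x in data for x in x_strings):
        return True
    # 's' strings are individually weak, so require several of them together.
    s_hits = sum(1 for s in s_strings if s in data)
    return s_hits >= min_s


# Hypothetical strings, not taken from the PMA binaries.
X = [b"unique-mutex-name"]
S = [b"CreateFileA", b"kernel32.dll", b"WriteFile", b"FindNextFileA"]

sample_a = b"...unique-mutex-name..."             # one specific string
sample_b = b"CreateFileA kernel32.dll WriteFile"  # three common strings
sample_c = b"kernel32.dll only"                   # one common string

for name, sample in (("a", sample_a), ("b", sample_b), ("c", sample_c)):
    print(name, rule_matches(sample, X, S))
```

Raising `min_s` makes the combined-string branch stricter, which is the same trade-off we face when tuning the condition of a generated rule.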
Let’s see if we can get some hits with Yara. In this example we modify the rules for ‘Lab01-01.exe’, ‘Lab01-01.dll’, and ‘Lab01-04.exe’ while leaving the others as generated (post-processing is important, but we’re only going to do it on a subset today). The aim is to remove ‘noise’ strings that add no value in uniquely identifying this malware, adjust the conditions that determine when each rule hits, and create some more generic rules based on these strings.
After refining some of the rules they now look like the below. Areas have been highlighted where major modifications have occurred.
As you can see we’ve made some minor, but deliberate changes. From here we can place all the binaries provided as part of the PMA Labs into a folder called ‘All’ and see if we get any hits. We need to install Yara if we haven’t already.
sudo apt install yara
yara PMA-Lab01.yar BinaryCollection/All -s
This will run a standard yara scan across all binaries in the ‘BinaryCollection/All’ folder, and output the strings that matched.
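With many binaries the raw output gets long, so it can help to summarise which files each rule hit. The sketch below assumes the usual text layout of `yara -s` output: a `RuleName path` line for each hit, followed by `0xOFFSET:$identifier: value` lines for the matched strings. The helper name `hits_by_rule` is hypothetical, and the offsets and string values in the sample are made up; only the rule names and paths come from this write-up.

```python
import re
from collections import defaultdict

# Matched-string lines printed by -s start with a hex offset and a $identifier.
STRING_LINE = re.compile(r"^0x[0-9a-fA-F]+:\$")


def hits_by_rule(yara_output: str) -> dict:
    """Group scanned file paths by the rule that matched them."""
    hits = defaultdict(list)
    for line in yara_output.splitlines():
        line = line.strip()
        if not line or STRING_LINE.match(line):
            continue  # skip blank lines and matched-string detail lines
        rule, _, path = line.partition(" ")
        if path:
            hits[rule].append(path)
    return dict(hits)


# Illustrative output only; offsets and string values are invented.
SAMPLE_OUTPUT = """\
Lab01_01 BinaryCollection/All/Lab07-03.dll
0x1000:$s1: example string
Lab01_04 BinaryCollection/All/Lab12-04.exe
0x2000:$s2: another example
"""

for rule, paths in hits_by_rule(SAMPLE_OUTPUT).items():
    print(rule, paths)
```

If the scan is run with extra flags that add tags or metadata to the rule line, the parsing would need adjusting.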
Success. If you look carefully you can see that although we only created rules for the binaries in Lab01, we’ve also identified binaries with similar attributes:
- BinaryCollection/All/Lab07-03.exe: Hit on Yara Rule - Lab01_01_2
- BinaryCollection/All/Lab18-02.exe: Hit on Yara Rule - Lab01_03
- BinaryCollection/All/Lab12-04.exe: Hit on Yara Rule - Lab01_04
- BinaryCollection/All/Lab07-03.dll: Hit on Yara Rule - Lab01_01
The results show we didn’t break anything and that the Yara rules work. Of course, if the binaries are identical then our Yara rules are bound to match, so let’s check whether that is the case.
sha256sum BinaryCollection/All/Lab01-01.exe;sha256sum BinaryCollection/All/Lab07-03.exe
sha256sum BinaryCollection/All/Lab01-03.exe;sha256sum BinaryCollection/All/Lab18-02.exe
sha256sum BinaryCollection/All/Lab01-04.exe;sha256sum BinaryCollection/All/Lab12-04.exe
sha256sum BinaryCollection/All/Lab01-01.dll;sha256sum BinaryCollection/All/Lab07-03.dll
The results show that two of the hits were for identical binaries that had simply been renamed, and two were for distinct binaries that we’ve successfully determined share similar characteristics.
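The same comparison can be scripted when there are more than a handful of pairs to check. The following is a minimal sketch using Python’s standard `hashlib`; the helper names `sha256_of` and `group_identical` are hypothetical, and it simply groups paths whose SHA-256 digests are equal.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_identical(paths):
    """Return lists of paths that share a digest (renamed copies)."""
    groups = defaultdict(list)
    for p in paths:
        groups[sha256_of(p)].append(str(p))
    # Keep only digests seen more than once.
    return [sorted(g) for g in groups.values() if len(g) > 1]


# Example call, using the paths from the commands above:
# group_identical([
#     "BinaryCollection/All/Lab01-01.exe",
#     "BinaryCollection/All/Lab07-03.exe",
# ])
```

Any pair that appears in the returned groups is the same file under a different name; pairs that do not appear are distinct binaries that merely share strings.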
It’s entirely possible that some of these are false positives, which is why it’s useful to run the rules over a repository of clean binaries. Luckily, yarGen has already done a lot of this hard work for us through its goodware string database. If we wanted to go a step further, we could use the rules to hunt across a public malware repository such as Hybrid Analysis, which is powered by MalQuery. This would help confirm that our rules aren’t overzealous in their detections.
Congratulations, you’ve just used the power of Yara with some assistance from yarGen to create your first Yara rule designed to detect malware used in Lab01 of PMA.
For those who are interested in the Yara rules created for closer analysis, they can be found here:
Tutorial Conclusion
This concludes my write-up of ‘Practical Malware Analysis’ by Michael Sikorski and Andrew Honig, published by No Starch Press.
Although this has covered a lot, it really is still only an introduction to reversing malware and binaries, and there’s much more that could be touched on. From unpacking modern protectors that include virtualisation in the packer itself, to malware that uses direct syscalls, there are always more tips and tricks to learn.
If you’ve found this valuable or have any feedback, please feel free to let me know, share, or buy a coffee to show your gratitude.