Chapter 21. 64-Bit Malware
- x64 (also written x86-64) is the most popular implementation of 64-bit code on Windows. Intel's implementation was previously known as EM64T.
- AMD's implementation is called AMD64. AMD designed the architecture, so AMD64 was the first x64 implementation that Windows supported.
- The vast majority of currently supported Windows versions are available in 32-bit and 64-bit versions.
- 64-bit operating systems support both 64-bit and 32-bit programs.
- Not all debugging, decompiling, or disassembling tools will support 64-bit programs.
Reasons for Compiling 64-bit Malware:
- 32-bit code cannot run inside a 64-bit process, or vice versa.
- Any kernel drivers or kernel code must be compiled for the same type of OS they run under (e.g. a driver for a 32-bit OS must be compiled as 32-bit, and a driver for a 64-bit OS must be compiled as 64-bit).
- Any plug-ins, shared libraries, injected DLLs or other code running within a process must match the bitness of that process. For example, an executable compiled as 64-bit requires 64-bit code to run properly within it.
- This also extends to any shellcode which may be used. The shellcode needs to be written specifically for the type of process it is intended to run within (32-bit for 32-bit, 64-bit for 64-bit).
Differences in x64 Architecture:
In x64 Architectures the following differences apply when compared to x86:
- General purpose registers have grown in size and have had the first character ‘E’ replaced with ‘R’, with the 32-bit registers still being available. For example RBX is the 64-bit version of EBX.
- All pointers and addresses are 64-bits.
- RDI, RSI, RBP, and RSP general purpose registers now have byte support by adding ‘L’ to allow the lowest 8-bits. For example if we are discussing ‘RDI’, DIL accesses the lowest 8-bits, DI accesses the lowest 16-bits, EDI accesses the lowest 32-bits, and RDI accesses the full 64-bits.
- Special purpose registers such as the Instruction Pointer have been renamed similar to general purpose registers. For example RIP is the 64-bit instruction pointer which used to be EIP on a 32-bit system.
- There are twice as many general purpose registers. The new registers are labeled R8 to R15 (QWORD). 32-bit versions can be accessed as R8D to R15D (DWORD), 16-bit versions as R8W to R15W (WORD), and 8-bit versions as R8L to R15L (written R8B to R15B in Intel's documentation).
- More registers can be found by revisiting Chapter 4 (A Crash Course in x86 Disassembly).
- 64-bit code supports RIP-relative addressing or Instruction pointer-relative addressing. This means data can be accessed based on an offset from the current instruction pointer, whereas in 32-bit it requires absolute addressing if it is not data at an offset to a register.
- This is most applicable to Position-Independent Code (PIC) and shellcode discussed in earlier chapters.
- This may not appear different in something such as IDA (as the disassembler has done the work of automatically resolving this) but it is shown when examining raw opcodes used. The raw opcode doesn’t contain the address specified, but rather an offset to the current instruction pointer.
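The RIP-relative point above can be illustrated with a minimal sketch. It assumes a hypothetical 7-byte instruction, `lea rax, [rip+disp32]` (bytes 48 8D 05 followed by a signed 32-bit displacement), placed at a made-up address. The helper name `rip_relative_target` is illustrative and is not taken from any tool mentioned here.

```python
import struct

def rip_relative_target(instr_addr: int, instr_bytes: bytes) -> int:
    """Resolve the address referenced by a RIP-relative instruction.

    Assumes the signed 32-bit displacement is the last four bytes of the
    instruction, which holds for simple lea/mov forms with no trailing
    immediate operand.
    """
    (disp32,) = struct.unpack("<i", instr_bytes[-4:])
    # RIP already points at the NEXT instruction when the offset is applied.
    next_rip = instr_addr + len(instr_bytes)
    return next_rip + disp32

# Hypothetical example: lea rax, [rip+0x2F00] located at 0x140001000.
code = bytes.fromhex("488d05002f0000")
print(hex(rip_relative_target(0x140001000, code)))  # 0x140003f07
```

The raw bytes only contain the offset 0x2F00; the absolute address 0x140003F07 is what a disassembler such as IDA would display after resolving it.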
Differences in the x64 Calling Convention and Stack:
Note: These general rules apply to anything a compiler has generated. There may be cases where these aren’t followed if hand-crafted assembly has been used.
- 64-bit calling process is similar to ‘fastcall’ mentioned in Chapter 6. The first 4 parameters of any given call are passed in RCX, RDX, R8, and R9 registers with additional ones stored on the stack.
- In 32-bit code, push and pop instructions can be used to allocate and release stack space at any point in a function. In 64-bit code, functions do not allocate stack space in the middle of the function body. Put simply, the stack only grows at the start of the function and stays the same size until the function ends.
- 64-bit exception handlers require this to be followed. Not following this can cause a crash if an exception occurs.
- There’s no easy way to tell if a register was populated before a function for the purpose of it being passed to the function, or for another reason.
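The parameter-passing rule above can be sketched as a small lookup. This is a simplified model that assumes every argument is an integer or pointer (floating-point arguments are handled differently), and that stack arguments sit directly above the 0x20-byte area reserved for the four register parameters, measured from RSP just before the call. The function and argument names are hypothetical.

```python
INT_ARG_REGISTERS = ("RCX", "RDX", "R8", "R9")
SHADOW_SPACE = 0x20  # bytes the caller reserves for the four register parameters

def argument_locations(arg_names):
    """Map each integer/pointer argument to where the caller places it."""
    locations = {}
    for index, name in enumerate(arg_names):
        if index < len(INT_ARG_REGISTERS):
            locations[name] = INT_ARG_REGISTERS[index]
        else:
            # Fifth and later arguments live on the stack, 8 bytes apart,
            # starting just above the 0x20-byte register home area.
            offset = SHADOW_SPACE + 8 * (index - len(INT_ARG_REGISTERS))
            locations[name] = f"[rsp+{offset:#x}]"
    return locations

# Hypothetical six-argument call.
for name, where in argument_locations(["a", "b", "c", "d", "e", "f"]).items():
    print(name, "->", where)
```

When reading a disassembly, this is why arguments beyond the fourth show up as `mov` instructions writing to fixed offsets from RSP rather than as a series of `push` instructions.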
Leaf and Nonleaf Functions:
- 64-bit code distinguishes between leaf and nonleaf functions.
- A function which calls another function is called a nonleaf function (sometimes called a frame function).
- All other functions are leaf functions.
- Nonleaf functions need to allocate 0x20 bytes of stack space whenever they call a function; the called function can use this space to store RCX, RDX, R8, and R9 if required.
- If more than 0x20 bytes are allocated, we know there are local stack variables in play.
Prologue and Epilogue 64-bit Code:
- Windows 64-bit assembly code has a ‘prologue’ and ‘epilogue’ at the beginning and end of a function respectively.
- Any ‘mov’ instructions at the start of a ‘prologue’ are storing parameters passed to the function.
64-bit Exception Handling:
- Structured exception handlers don’t use the stack in x64 like they do in x86.
- In x86 they are often accessed via the pointer at fs:[0].
- In x64 these are coded into the PE file itself.
- The .pdata section contains a _IMAGE_RUNTIME_FUNCTION_ENTRY structure for every function which stores the start and end of that function, in addition to a pointer for their associated exception-handlers.
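As a rough illustration of the .pdata layout described above, the sketch below parses a table of 12-byte entries, each holding three little-endian DWORD RVAs (function start, function end, unwind information). The sample bytes are synthetic, and treating an all-zero start address as the end of the table is an assumption made for simplicity; the size recorded in the PE exception data directory gives the real entry count.

```python
import struct
from typing import List, NamedTuple

class RuntimeFunction(NamedTuple):
    begin_rva: int        # start of the function
    end_rva: int          # end of the function
    unwind_info_rva: int  # unwind/exception-handler information

ENTRY_SIZE = 12  # three 32-bit RVAs

def parse_pdata(pdata: bytes) -> List[RuntimeFunction]:
    """Split raw .pdata bytes into RuntimeFunction entries."""
    entries = []
    for offset in range(0, len(pdata) - ENTRY_SIZE + 1, ENTRY_SIZE):
        entry = RuntimeFunction(*struct.unpack_from("<III", pdata, offset))
        if entry.begin_rva == 0:
            break  # assumption: zero padding marks the end of the table
        entries.append(entry)
    return entries

# Synthetic table with two entries followed by padding (values are made up).
sample = struct.pack("<III", 0x1000, 0x1090, 0x3000)
sample += struct.pack("<III", 0x10C0, 0x13F0, 0x3010)
sample += bytes(ENTRY_SIZE)
for fn in parse_pdata(sample):
    print(f"{fn.begin_rva:#x}-{fn.end_rva:#x} unwind at {fn.unwind_info_rva:#x}")
```

Because every entry records a start and end address, this table can also help a disassembler locate function boundaries in a 64-bit binary.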
Windows 32-bit Run on a 64-bit OS:
- A subsystem called WOW64 exists on 64-bit operating systems to allow 32-bit code to run on the 64-bit OS.
- WOW64 uses 32-bit mode of x64 processors.
- WOW64 needs extra work-arounds to support the registry and file system.
- ‘SYSTEMROOT’ is generally accessed to locate required DLLs by programs (usually \Windows\System32 on your drive). A separate location is required to support 32-bit DLLs (usually \Windows\SysWOW64 on your drive).
- The reason SysWOW64 hosts 32-bit binaries, and System32 hosts 64-bit binaries, is compatibility: System32 is the default location used throughout Windows.
- On a 64-bit OS if a 32-bit binary is run, the OS will automatically redirect all requests for the System32 directory to SysWOW64.
- On a 64-bit OS if a 32-bit binary is run, the OS will automatically redirect all requests for the registry key HKEY_LOCAL_MACHINE\Software to HKEY_LOCAL_MACHINE\Software\Wow6432Node.
- A 32-bit binary can still reach the real 64-bit System32 directory by accessing \Windows\Sysnative, which is redirected to \Windows\System32.
- The ‘IsWow64Process’ function can be used to determine if a process is running under WOW64, that is, whether it is a 32-bit process on a 64-bit OS.
- The ‘Wow64DisableWow64FsRedirection’ function can be used to disable file system redirection for the calling thread.
- API calls to registry functions such as ‘RegCreateKeyEx’, ‘RegOpenKeyEx’, and ‘RegDeleteKeyEx’ also now accept a flag (KEY_WOW64_32KEY or KEY_WOW64_64KEY) to specify whether the 32-bit or 64-bit view of the registry should be accessed.
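The redirection rules above can be modelled in a few lines. This is a deliberately simplified sketch: the real redirector has exemptions for particular subdirectories and registry keys that are not modelled here, and the function names are hypothetical. The two flag values are the documented KEY_WOW64_64KEY and KEY_WOW64_32KEY constants.

```python
KEY_WOW64_64KEY = 0x0100  # force the 64-bit registry view
KEY_WOW64_32KEY = 0x0200  # force the 32-bit registry view
HKLM_SOFTWARE = "HKEY_LOCAL_MACHINE\\Software"

def redirect_path(path: str, is_wow64_process: bool) -> str:
    """Model file system redirection for a 32-bit process on a 64-bit OS."""
    if not is_wow64_process:
        return path  # 64-bit processes see the file system unmodified
    parts = path.split("\\")
    for i in range(len(parts) - 1):
        if parts[i].lower() == "windows":
            child = parts[i + 1].lower()
            if child == "system32":
                parts[i + 1] = "SysWOW64"
            elif child == "sysnative":
                parts[i + 1] = "System32"
            break
    return "\\".join(parts)

def redirect_registry_key(key: str, is_wow64_process: bool, access_flags: int = 0) -> str:
    """Model registry redirection, honouring an explicit view flag if given."""
    if access_flags & KEY_WOW64_64KEY:
        use_32bit_view = False
    elif access_flags & KEY_WOW64_32KEY:
        use_32bit_view = True
    else:
        use_32bit_view = is_wow64_process
    head, rest = key[:len(HKLM_SOFTWARE)], key[len(HKLM_SOFTWARE):]
    under_software = head.lower() == HKLM_SOFTWARE.lower() and rest[:1] in ("", "\\")
    if use_32bit_view and under_software and not rest.lower().startswith("\\wow6432node"):
        return HKLM_SOFTWARE + "\\Wow6432Node" + rest
    return key

print(redirect_path("C:\\Windows\\System32\\kernel32.dll", True))
print(redirect_registry_key("HKEY_LOCAL_MACHINE\\Software\\Example", True))
```

When analysing 32-bit malware on a 64-bit machine, applying this mapping by hand is often the quickest way to work out where a dropped file or registry value actually ended up.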
64-bit Clues on Malware Functionality:
Note: This applies to anything a compiler has generated. There may be cases where these aren’t followed if hand-crafted assembly has been used.
- When examining x64 code it’s easier to determine if something is a pointer.
- This is because all new pointers must be 64-bits, and as such must be in ‘R’ based registers.
- If something is being moved into a 32-bit register such as ECX, it is not a pointer.
- If something is being moved into a 64-bit register such as RCX, it is potentially a pointer. An exception would be if a QWORD variable has been defined, but most developers would likely not need a variable that large and would instead opt to use a DWORD (32-bit) variable.
Analyze the code in Lab21-01.exe. This lab is similar to Lab 9-2, but tweaked and compiled for a 64-bit system.
What happens when you run this program without any parameters?
If we attempt to run this on an x86 (32-bit) OS, we’re presented with an error message stating that it is not compatible with this version of Windows, as it has been compiled for a 64-bit OS.
Attempting to run this in a 64-bit OS with a tool such as procmon running reveals that it simply exits and doesn’t do anything of interest. A number of events are still shown based on registry keys queried and files loaded, as part of normal process execution, but nothing stands out as interesting.
Depending on your version of IDA Pro, main may not be recognized automatically. How can you identify the call to the main function?
If we open this in IDA Free 7.0 as a standard AMD64 PE file…
We find that we’re dumped into the main function located at 0x1400010C0.
If we instead open this in a disassembler which doesn’t identify the main function (in this case Cutter) and enable offset visibility in the Cutter preferences, we can see that we start at 0x140001750.
To identify the call to our main function we will need to look for a call, likely after any ‘GetCommandLineA’ checks which may be present. If we examine the underlying structure of this PE file using Detect-It-Easy (DIE), we can see it was created in C++.
If we examine what a main function would look like in C++ using the C++ language reference, we find that it must pass ‘argc’, ‘argv’, and ‘envp’ if it is passing environment variables.
Due to this we know that any call to the main function would need to take 3 parameters, 2 of these being envp and argv, and another which is an integer. If we continue to examine our disassembly, we only find one call to function 0x1400010C0 which takes 3 parameters.
In this instance ECX holds our integer (argc): it is a 32-bit register, which is common for integer values. The other 2 parameters are passed in the 64-bit registers RDX and R8, which hold the pointers argv and envp. If we examine this same function in IDA Free 7.0, we can see these have been identified as argc, argv, and envp, in addition to the main function being labelled.
What is being stored on the stack in the instructions from 0x0000000140001150 to 0x0000000140001161?
If we jump to ‘0x140001150’, we can see that some large hexadecimal values are being stored on the stack.
If we press ‘R’ on these to convert them to an ASCII string, we find the following.
At first glance it looks like the string ‘.lcoexe’ is being stored on the stack. This is because x86 and x64 are little-endian, so the bytes of each multi-byte value are stored in reverse order. As IDA has interpreted each value as a hex number rather than a string, converting it displays the characters backwards. If we reverse each value we find the string ocl.exe stored on the stack.
This is the same value we saw in Lab09-02. Given this, it’s possible the program needs to be called ocl.exe in order for it to run correctly.
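The byte reversal can be checked with a short sketch. The two immediate values below are inferred from the reversed text ‘.lco’ and ‘exe’ described above and are an assumption; the operands in your own disassembly may be split across instructions differently. The helper name is hypothetical.

```python
def decode_stack_string(immediates):
    """Rebuild a string from (value, size_in_bytes) pairs written to the stack.

    Each value is laid out in memory little-endian, i.e. lowest byte first,
    which is the reverse of how the hex number reads on screen.
    """
    raw = b"".join(value.to_bytes(size, "little") for value, size in immediates)
    return raw.split(b"\x00", 1)[0].decode("ascii")

# 0x2E6C636F reads as '.lco' and 0x00657865 reads as 'exe' when shown as text.
print(decode_stack_string([(0x2E6C636F, 4), (0x00657865, 4)]))  # ocl.exe
```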
How can you get this program to run its payload without changing the filename of the executable?
If we examine 0x14000120C we can see that a string comparison looks to take place which is likely looking for very specific conditions to be met to allow the malware to run (possibly checking if it is named ocl.exe).
When this comparison fails, execution continues at 0x14000120E and then reaches the jump at 0x140001213, which skips past the primary functions of this malware to loc_1400013D7.
One way we can make this program run its payload without changing the filename is to ensure that even after it fails this check, instead of jumping to loc_1400013D7, it flows right into the primary function which triggers the payload.
If we run this in a debugger such as x64dbg (at this point we’re introducing a newer, more robust debugger which supports 64-bit debugging), we can find the jump located at 0x140001213.
From here it can be modified to instead perform no operation, effectively allowing the program to flow into its payload.
If we set a breakpoint at our new NOP values, we can use F9 to run the program and see it hits them without issue. If we then hold F8 to step over functions, we will begin to see a decoding routine runs which gives us a known C2.
At this point we can be confident that the check used to determine if the filename of the executable is correct has been bypassed.
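The same patch can be expressed outside the debugger as a byte-level edit. The sketch below assumes the jump at 0x140001213 is encoded as a 5-byte `jmp rel32` (opcode E9); that encoding is inferred from the source and destination addresses above and should be confirmed against the actual bytes before patching. The function names are hypothetical.

```python
import struct

NOP = 0x90
JMP_REL32 = 0xE9

def jmp_target(instr_addr: int, instr: bytes) -> int:
    """Return the destination of a 5-byte `jmp rel32` at instr_addr."""
    (rel32,) = struct.unpack("<i", instr[1:5])
    return instr_addr + 5 + rel32  # relative to the following instruction

def nop_out_jmp(code: bytearray, offset: int) -> None:
    """Overwrite a 5-byte `jmp rel32` in place with NOP instructions."""
    if code[offset] != JMP_REL32:
        raise ValueError("expected a 5-byte jmp rel32 (opcode E9)")
    code[offset:offset + 5] = bytes([NOP]) * 5

# Assumed encoding of the jump: E9 followed by the offset 0x000001BF.
jump = bytearray.fromhex("e9bf010000")
print(hex(jmp_target(0x140001213, jump)))  # 0x1400013d7
nop_out_jmp(jump, 0)
print(jump.hex())  # 9090909090
```

Replacing all five bytes matters: leaving part of the old operand behind would be decoded as a different instruction and would most likely crash the process.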
Which two strings are being compared by the call to strncmp at 0x0000000140001205?
Using x64dbg we can set a breakpoint at 0x140001205 and see the two strings being compared, which are stored in RCX and RDX. After setting the breakpoint and pressing F9 to run the program until we hit it, we can see the values being compared are the binary name (Lab21-01.exe) and the string jzm.exe.
Based on this we know that some transformations must be occurring on the string ocl.exe before being used in this comparison.
Does the function at 0x00000001400013C8 take any parameters?
Jumping to the call at 0x1400013C8, it isn’t immediately obvious in IDA or x64dbg how many parameters the called function takes, but what we do see is RBX being moved into RCX.
Because we know RCX, RDX, R8, and R9 hold the first 4 parameters of any given function call under the Windows x64 calling convention, we know that whatever is within RBX at the time of this call will be passed to the function called at 0x1400013C8 (sub_140001000). By looking at what is stored in RBX prior to the call in IDA, we can see that it is RAX, or more specifically the socket handle returned by WSASocketA.
By viewing the start of sub_140001000 in IDA we can also see that RCX is being stored back into RBX, which is then used for the standard input, output, and error handles, meaning that all input and output will be redirected to this socket.
Based on this we know that the function called at 0x1400013C8 takes 1 parameter, the socket to our C2.
How many arguments are passed to the call to CreateProcess at 0x0000000140001093? How do you know?
It’s not immediately clear how many arguments are passed to the call to CreateProcessA at 0x140001093.
Given IDA has identified this as CreateProcessA though, we can double click on it and see how many arguments are expected to be passed to this call.
In this case we can see there are 10 arguments which are expected to be passed to it.
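Because this is documented, we know that these 10 arguments need to be passed to CreateProcessA. Following the x64 calling convention, the first 4 are passed in RCX, RDX, R8, and R9, and the remaining 6 are placed on the stack, which is why the count isn’t obvious from a run of push instructions as it would be in 32-bit code.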
Analyze the malware found in Lab21-02.exe on both x86 and x64 virtual machines. This malware is similar to Lab12-01.exe, with an added x64 component.
What is interesting about the malware’s resource sections?
Because we know this malware is similar to Lab12-01.exe, we can compare the malware’s resource sections to those of Lab12-01.exe and see if there are any noticeable differences. By opening both of these in pestudio, we can see that Lab21-02.exe has an added section called .rsrc and a number of extra imports and strings.
The added section is a resource section, and if we examine what is contained within it, we can see three interesting binaries named x64, x64DLL, and x86.
Although this malware is similar to Lab12-01.exe, we’re not entirely sure how similar it is. To find out, we should look at the similarities between our samples ‘Lab12-01.exe’ and ‘Lab21-02.exe’. We can utilise a disassembler plugin such as BinDiff to get this information at a glance. This requires a disassembled binary from either a paid pro version of IDA, or the free alternative Ghidra.
If we want to install this plugin into Ghidra, we first must install it on our OS from the provided msi file. From here we can run Ghidra and click File > Install Extensions.
The extension we need to install is ghidra_BinExport.zip as shown below.
After installing this, we need to restart Ghidra and disassemble both Lab12-01.exe, and Lab21-02.exe. We should also set these to automatically analyze the selected binary.
Once analysis is complete we can use File > Export, and select ‘Binary BinExport for BinDiff’.
Once this is complete for both our binaries, we can compare the 2 BinExport files using BinDiff. We will first need to set up a workspace.
From here we can compare our 2 BinDiff exports.
The end result is a number of graphs and functions we can drill down into and see what has changed, and in fact a lot has changed between these 2 binaries.
Is this malware compiled for x64 or x86?
Using our BinDiff above we have a graph which tells us what this malware is compiled for.
- x86-32 (32-bit)
We can also use a number of other tools such as peview, PE-bear, pestudio or DIE to get the same information as this is stored in the File Header.
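For reference, the check those tools perform can be reproduced with a short sketch that reads the Machine field of the PE File Header. Only the two machine values relevant here are mapped (0x014C for x86 and 0x8664 for x64); the function name is hypothetical.

```python
import struct

MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}

def pe_machine(data: bytes) -> str:
    """Return the target architecture recorded in a PE file's File Header."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The File Header follows the signature; Machine is its first field.
    (machine,) = struct.unpack_from("<H", data, e_lfanew + 4)
    return MACHINE_NAMES.get(machine, f"unknown ({machine:#06x})")
```

To use it, read the first few kilobytes of a sample in binary mode and pass them in, for example `pe_machine(open("Lab21-02.exe", "rb").read(4096))`, assuming the sample is in the current directory.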
How does the malware determine the type of environment in which it is running?
Comparing this to Lab12-01.exe, our previous analysis revealed that the main method contains a number of checks and operations before the main functionality begins. If we go to the start of the program we may see evidence of analysis failure.
In this instance it’s not a big issue and we can safely ignore that. Scrolling down in IDA, we know that any checks to determine what type of environment it is running in are likely to occur after a call to ‘GetCommandLineA’. We soon find this call, followed by 5 subsequent calls of interest to examine.
One thing to note is that 3 of these look to directly lead to sub_4017D0 which seems to prematurely throw an error and terminate the malware. One of these calls looks more interesting than the others though which is ‘sub_401260’. If we examine this we can see a reference to ‘IsWow64Process’.
In the above we see the malware attempting to resolve the address of the ‘IsWow64Process’ export within Kernel32.dll. If it is found, the malware gets a handle to its own process and calls the dynamically resolved IsWow64Process (stored in dword_40A7A8).
From this we can tell the malware attempts to resolve and call ‘IsWow64Process’ to determine if it is running on a 64-bit or 32-bit OS.
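The same resolve-then-call pattern can be sketched with ctypes. This is an illustration of the technique and not a reconstruction of the malware's code: it looks the export up at run time, tolerates it being absent, and passes the current process pseudo-handle. It only does real work on Windows and returns None elsewhere; the function name is hypothetical.

```python
import sys

def running_under_wow64():
    """Return True/False on Windows, or None where the check is unavailable."""
    if sys.platform != "win32":
        return None
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    try:
        # Attribute lookup resolves the export by name at run time.
        is_wow64_process = kernel32.IsWow64Process
    except AttributeError:
        return None  # export not present in this kernel32.dll
    is_wow64_process.argtypes = [wintypes.HANDLE, ctypes.POINTER(wintypes.BOOL)]
    is_wow64_process.restype = wintypes.BOOL
    kernel32.GetCurrentProcess.argtypes = []
    kernel32.GetCurrentProcess.restype = wintypes.HANDLE

    result = wintypes.BOOL(False)
    if not is_wow64_process(kernel32.GetCurrentProcess(), ctypes.byref(result)):
        return None  # the call itself failed
    return bool(result.value)

print(running_under_wow64())
```

A result of True means a 32-bit process on a 64-bit OS. A result of False is ambiguous on its own, since it covers both a 32-bit process on a 32-bit OS and a native 64-bit process, which is why the check is only meaningful from a binary known to be 32-bit.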
What does this malware do differently in an x64 environment versus an x86 environment?
If we examine the code flow after the check for ‘IsWow64Process’, a different set of actions is taken depending on whether it returned true or false in [ebp+var_10].
If it returns true, then the malware assumes it is running in an x64 (64-bit) environment. This is due to it being compiled for an x86 (32-bit) environment and needing to be run under WOW64 when executing on a 64-bit OS.
64-bit (x64 Actions):
If we examine the 64-bit case at a glance, we can see the following actions taken:
First we see 2 calls to ‘sub_401000’, which retrieves 2 different files from the binary’s resource section and saves them to disk as ‘Lab21-02x.exe’ and ‘Lab21-02x.dll’. Lab21-02x.exe is then launched and the program terminates.
32-bit (x86 Actions):
If we examine the 32-bit case at a glance, we can see the following actions taken:
First we see a call to ‘sub_401000’ which is once again associated with getting a file from the binary resource section and saving it to disk (Lab21-02.dll). From here we then see the malware allocating the extracted DLL into a buffer for later use, and attempting to grant itself Debug Privileges. From here it then opens a handle to a process with the name ‘explorer.exe’.
Looking further at the malware, we can see that after getting a handle to a process with the name explorer.exe, it will attempt to open the process, allocate memory, write the dropped DLL into that process memory, and then create a remote thread to run and execute the injected DLL.
Based on this we know that in an x64 environment the malware will attempt to run one of the dropped binaries, whereas in an x86 environment it will attempt to inject the dropped DLL into a process called explorer.exe.
Which files does the malware drop when running on an x86 machine? Where would you find the file or files?
From the above analysis we know that the file dropped when run on an x86 machine will be ‘Lab21-02.dll’. Using Procmon and running the binary on an x86 system we can see the malware attempting to write this file. In this case we haven’t run the malware as an administrator so it is unable to write the file.
This shows us that the file Lab21-02.dll is dropped to C:\Windows\System32\Lab21-02.dll when run on an x86 machine.
Which files does the malware drop when running on an x64 machine? Where would you find the file or files?
From the analysis in question 4 we know that the files dropped when run on an x64 machine will be ‘Lab21-02x.exe’ and ‘Lab21-02x.dll’. Using Procmon and running the binary on an x64 system we can see the malware attempting to write these files. In this case we haven’t run the malware as an administrator so it is unable to write them.
This shows us that the files Lab21-02x.exe and Lab21-02x.dll are dropped to C:\Windows\SysWOW64\Lab21-02x.exe and C:\Windows\SysWOW64\Lab21-02x.dll when run on an x64 machine.
If we take a closer look at ‘sub_401000’ which performs the file dropping on both an x64 and x86 OS, we can see how this happens.
In both cases a call is made to ‘GetSystemDirectoryA’, which returns C:\Windows\System32; however, because this is a 32-bit binary being run on a 64-bit OS, the OS transparently redirects file access under that path to C:\Windows\SysWOW64. This is done because the SysWOW64 directory contains the 32-bit DLLs required to allow the operating system to run 32-bit binaries seamlessly.
What type of process does the malware launch when run on an x64 system?
Based on our analysis in question 4 we know this is dropping and launching ‘Lab21-02x.exe’ on an x64 system. If we use the 32-bit debugger version of x64dbg (x32dbg), we can set a breakpoint before the process is started (for example 0x401381) and collect the dropped binaries from C:\Windows\SysWOW64.
From here we can close our debugger, terminate the process before it executes Lab21-02x.exe, and move these to the same directory as the other binaries we’re analysing. Opening both the DLL and EXE in pestudio reveals they don’t have the 32-bit flag set, and as such have been compiled specifically for a 64-bit OS.
From this we know that the malware launches a 64-bit process when run on a x64 system, after the initial 32-bit process is run to drop our 64-bit payloads.
What does the malware do?
To fully answer this question we still need to understand what happens after Lab21-02x.exe is executed, and what the payload of Lab21-02.dll is which is injected into explorer.exe. Starting with Lab21-02.dll, we can first extract this by repeating the process we took in the above question except we need to perform this on a 32-bit OS, and set a breakpoint at a different location (for example 0x401454) which occurs on the x86 path.
We know that this is similar to Lab12-01.exe which injected Lab12-01.dll into explorer.exe. If we compare the file hash of Lab12-01.dll and Lab21-02.dll, we can confirm that these are the exact same DLL.
In this instance the malware when run on an x86 system drops the required DLL which is identical to Lab12-01.dll and injects this into explorer.exe.
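The hash comparison can be done with any hashing tool; a minimal sketch using SHA-256 is shown here. The choice of SHA-256 and the file paths in the usage note are assumptions, and the function names are hypothetical.

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large samples don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_file_contents(path_a: str, path_b: str) -> bool:
    """True when both files hash to the same SHA-256 value."""
    return sha256_of(path_a) == sha256_of(path_b)
```

With both DLLs copied into the working directory, `same_file_contents("Lab12-01.dll", "Lab21-02.dll")` returning True would confirm they are byte-for-byte identical.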
At this point we just need to confirm what the malware does once Lab21-02x.exe is executed. If we open this in IDA 7.0, we can see that it references the dropped DLL file Lab21-02x.dll, which is stored in [rsp+1168h+String1], so it’s likely this is going to be used later on.
Shortly after this we see a call to ‘sub_140001090’ before ‘OpenProcess’ is called. Analysis of sub_140001090 reveals this is also looking for ‘explorer.exe’ as a process name to get a handle to.
Shortly after we see a familiar group of calls which indicate that Lab21-02x.dll (stored in [rsp+1168h+String1]) will be the buffer injected into explorer.exe.
At this point it’s beginning to look like this malware injects the same payload into explorer.exe on both 32-bit and 64-bit operating systems, except it sources that payload from different resources in the executable. To confirm this we need to examine Lab21-02x.dll which is being injected and see if it is similar to Lab21-02.dll (or Lab12-01.dll).
To do this we can open both of these in IDA and look at the StartAddress to see that the two DLLs are functionally identical; the only difference is that Lab21-02x.dll is compiled for a 64-bit OS.
From the above analysis we know that the malware first drops secondary payloads from its resource section, and that the payload dropped differs depending on whether it is running on a 64-bit or 32-bit OS. On a 64-bit OS it will drop 2 binaries, Lab21-02x.dll and Lab21-02x.exe, and then execute Lab21-02x.exe to inject Lab21-02x.dll into explorer.exe. On a 32-bit OS it will drop Lab21-02.dll and inject this into explorer.exe. In both cases the injected DLL performs the same action as Lab12-01.dll, which prompts the user to reboot with a message counting how many minutes have passed since it executed.
This concludes chapter 21, proceed to the next chapter.