Practical Malware Analysis - Chapter 21 Lab Write-up

Chapter 21. 64-Bit Malware

Various Implementations:

x64, also known as x86-64, is the most popular implementation of 64-bit code on Windows. Intel’s implementation was previously known as EM64T.
AMD64 is AMD’s implementation, and was the first implementation of this architecture.

Supported OS:

The vast majority of currently supported Windows versions are available in 32-bit and 64-bit versions.
64-bit operating systems support both 64-bit and 32-bit programs.
Not all debugging, decompiling, or disassembling tools will support 64-bit programs.

Reasons for Compiling 64-bit Malware:

32-bit code cannot be run inside 64-bit applications, or vice versa.
Any kernel drivers or kernel code must be compiled for the same architecture as the OS they run under (e.g. a 32-bit OS needs 32-bit drivers, and a 64-bit OS needs 64-bit drivers).
Any plug-ins, shared libraries, injected DLLs, or other code running within a process must match the bitness of that process. For example, an executable compiled as 64-bit requires 64-bit code to properly run within it.
This extends to any shellcode which may be used. The shellcode needs to be written specifically for the type of process it is intended to run within (32-bit for 32-bit, 64-bit for 64-bit).

Differences in x64 Architecture:

In x64 Architectures the following differences apply when compared to x86:

General purpose registers have grown in size and have had the first character ‘E’ replaced with ‘R’, with the 32-bit registers still being available. For example RBX is the 64-bit version of EBX.
All pointers and addresses are 64-bits.
The RDI, RSI, RBP, and RSP general purpose registers now have byte support: adding ‘L’ gives access to the lowest 8 bits. For example, for ‘RDI’, DIL accesses the lowest 8 bits, DI accesses the lowest 16 bits, EDI accesses the lowest 32 bits, and RDI accesses the full 64 bits.
Special purpose registers such as the Instruction Pointer have been renamed in the same way as general purpose registers. For example, RIP is the 64-bit instruction pointer, which was EIP on a 32-bit system.
There are twice as many general purpose registers. The new ones are labeled R8 to R15 (QWORD). 32-bit versions can be accessed as R8D to R15D (DWORD). 16-bit versions can be accessed as R8W to R15W (WORD). 8-bit versions can be accessed as R8L to R15L (written R8B to R15B by many assemblers and disassemblers).
- More registers can be found by revisiting Chapter 4 (A Crash Course in x86 Disassembly).
64-bit code supports RIP-relative (instruction pointer-relative) addressing. This means data can be accessed at an offset from the current instruction pointer, whereas 32-bit code must use an absolute address unless the data is at an offset from a register.
- This is most applicable to Position-Independent Code (PIC) and shellcode discussed in earlier chapters.
- This may not appear different in something such as IDA (as the disassembler has done the work of automatically resolving this) but it is shown when examining raw opcodes used. The raw opcode doesn’t contain the address specified, but rather an offset to the current instruction pointer.
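To make the difference between the raw opcode and the address a disassembler shows more concrete, here is a minimal sketch of how a RIP-relative operand is resolved. The instruction bytes and the address 0x140001000 are made up for illustration and are not taken from the lab binary.

```python
import struct


def rip_relative_target(insn_address: int, insn_bytes: bytes, disp_offset: int) -> int:
    """Resolve the absolute address referenced by a RIP-relative operand.

    disp_offset is the position of the 4-byte displacement inside insn_bytes.
    """
    # The displacement is stored as a signed 32-bit little-endian value.
    (disp,) = struct.unpack_from("<i", insn_bytes, disp_offset)
    # RIP points at the *next* instruction when the operand is evaluated.
    next_rip = insn_address + len(insn_bytes)
    return next_rip + disp


# Hypothetical instruction: lea rcx, [rip+0x1000], encoded as 48 8D 0D 00 10 00 00.
example = bytes.fromhex("488d0d00100000")
print(hex(rip_relative_target(0x140001000, example, 3)))
```

The raw bytes only contain the displacement 0x1000; the absolute address is something the disassembler has to calculate from where the instruction sits in memory.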

Differences in the x64 Calling Convention and Stack:

Note: These general rules apply to anything a compiler has generated. There may be cases where these aren’t followed if hand-crafted assembly has been used.

The 64-bit calling convention is similar to ‘fastcall’ mentioned in Chapter 6. The first 4 parameters of any given call are passed in the RCX, RDX, R8, and R9 registers, with additional ones stored on the stack.
In 32-bit code, pop/push instructions can be used to allocate space on the stack at any time during a function. In 64-bit code, functions cannot allocate space on the stack in the middle of their body. Put simply, the stack only grows at the start of the function and stays the same size throughout the entire function.
- 64-bit exception handlers require this to be followed. Not following this can cause a crash if an exception occurs.
There’s no easy way to tell if a register was populated before a function for the purpose of it being passed to the function, or for another reason.

Leaf and Nonleaf Functions:

64-bit code distinguishes between leaf and nonleaf functions.
- A function which calls another function is called a nonleaf function (sometimes called a frame function).
- All other functions are leaf functions.
Nonleaf functions need to allocate 0x20 bytes of space on the stack before calling another function, as the called function can use this space to store RCX, RDX, R8, and R9 if required.
- If more than 0x20 bytes are allocated, this indicates that local stack variables are in play.

Prologue and Epilogue 64-bit Code:

Windows 64-bit assembly code has a ‘prologue’ and ‘epilogue’ at the beginning and end of a function respectively.
Any ‘mov’ instructions at the start of a ‘prologue’ are storing parameters passed to the function.

64-bit Exception Handling:

Structured exception handlers don’t use the stack in x64 like they do in x86.
- In x86 they are often accessed via a pointer to fs:[0].
- In x64 these are coded into the PE file itself.
- The .pdata section contains a _IMAGE_RUNTIME_FUNCTION_ENTRY structure for every function which stores the start and end of that function, in addition to a pointer for their associated exception-handlers.
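As a rough sketch of what this table looks like on disk, each entry can be read as three little-endian DWORDs (function start, function end, and unwind information, all stored as RVAs). The sample bytes below are invented for illustration, and stopping at a zeroed entry is a simplifying assumption rather than something taken from the lab binary.

```python
import struct
from typing import Iterator, NamedTuple


class RuntimeFunction(NamedTuple):
    begin_rva: int        # start of the function
    end_rva: int          # end of the function
    unwind_info_rva: int  # unwind data, which can reference an exception handler


ENTRY = struct.Struct("<III")  # three little-endian DWORDs, 12 bytes per entry


def parse_pdata(section: bytes) -> Iterator[RuntimeFunction]:
    """Yield entries from raw .pdata bytes until padding is reached."""
    for offset in range(0, len(section) - ENTRY.size + 1, ENTRY.size):
        entry = RuntimeFunction(*ENTRY.unpack_from(section, offset))
        if entry.begin_rva == 0:  # assumption: zeroed padding ends the table
            break
        yield entry


# Made-up table with two entries followed by zero padding.
sample = ENTRY.pack(0x1000, 0x1040, 0x5000) + ENTRY.pack(0x1040, 0x10A0, 0x500C) + bytes(12)
for fn in parse_pdata(sample):
    print(f"{fn.begin_rva:#x}-{fn.end_rva:#x} unwind info at {fn.unwind_info_rva:#x}")
```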

Windows 32-bit Run on a 64-bit OS:

A subsystem called WOW64 exists on 64-bit operating systems to allow 32-bit code to run on the 64-bit OS.
- WOW64 uses the 32-bit mode of x64 processors.
- WOW64 needs extra work-arounds to support the registry and file system.
Programs generally locate required DLLs under ‘SYSTEMROOT’ (usually \Windows\System32 on your drive). A separate location is required to support 32-bit DLLs (usually \Windows\SysWOW64 on your drive).
The reason SysWOW64 hosts 32-bit binaries, and System32 hosts 64-bit binaries, is compatibility: System32 is the default location used throughout Windows.
On a 64-bit OS, if a 32-bit binary is run, the OS will automatically redirect all requests for the System32 directory to SysWOW64.
On a 64-bit OS, if a 32-bit binary is run, the OS will automatically redirect all requests for the registry key HKEY_LOCAL_MACHINE\Software to HKEY_LOCAL_MACHINE\Software\Wow6432Node.
A 32-bit binary can still reach the real System32 directory by accessing \Windows\Sysnative, which is redirected to \Windows\System32.
The ‘IsWow64Process’ function can be used by 32-bit code to determine if it is running in a WOW64 process (that is, as a 32-bit process on a 64-bit OS).
The ‘Wow64DisableWow64FsRedirection’ function can be used to disable file system redirection for the calling thread.
API calls to registry functions such as ‘RegCreateKeyEx’, ‘RegOpenKeyEx’, and ‘RegDeleteKeyEx’ also now have a flag (KEY_WOW64_32KEY or KEY_WOW64_64KEY) to specify if they should access the 32-bit or 64-bit view of the registry.
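The redirection rules above can be modelled with a short sketch. This is a deliberately simplified model and not how Windows implements it: the real redirector has exemptions for certain subdirectories and registry keys, and the function names and the ‘Example’ key used here are hypothetical.

```python
import re

_SYSNATIVE = re.compile(r"(\\Windows\\)Sysnative(?=\\|$)", re.IGNORECASE)
_SYSTEM32 = re.compile(r"(\\Windows\\)System32(?=\\|$)", re.IGNORECASE)
_HKLM_SOFTWARE = "HKEY_LOCAL_MACHINE\\Software"


def redirect_path(path: str, is_wow64: bool) -> str:
    """Approximate which directory a file request ends up in."""
    if not is_wow64:
        return path  # 64-bit processes see the real directories
    if _SYSNATIVE.search(path):
        # Sysnative is the alias a 32-bit process uses for the real System32.
        return _SYSNATIVE.sub(r"\g<1>System32", path)
    return _SYSTEM32.sub(r"\g<1>SysWOW64", path)


def redirect_registry_key(key: str, is_wow64: bool) -> str:
    """Approximate which registry key a request ends up in."""
    if not is_wow64:
        return key
    head, rest = key[:len(_HKLM_SOFTWARE)], key[len(_HKLM_SOFTWARE):]
    if head.lower() != _HKLM_SOFTWARE.lower() or (rest and not rest.startswith("\\")):
        return key  # not under HKEY_LOCAL_MACHINE\Software
    if rest.lower().startswith("\\wow6432node"):
        return key  # already pointing at the 32-bit view
    return _HKLM_SOFTWARE + "\\Wow6432Node" + rest


print(redirect_path(r"C:\Windows\System32\cmd.exe", is_wow64=True))
print(redirect_path(r"C:\Windows\Sysnative\cmd.exe", is_wow64=True))
print(redirect_registry_key(r"HKEY_LOCAL_MACHINE\Software\Example", is_wow64=True))
```

When analysing 32-bit malware on a 64-bit OS, this is worth keeping in mind: a file the malware "drops in System32" will usually be found in SysWOW64.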

64-bit Clues on Malware Functionality:

Note: This applies to anything a compiler has generated. There may be cases where these aren’t followed if hand-crafted assembly has been used.

When examining x64 code it’s easier to determine if something is a pointer.
- This is because all pointers are 64 bits wide, and as such must be held in ‘R’ based registers.
- If something is being moved into a 32-bit register such as ECX, it is not a pointer.
- If something is being moved into a 64-bit register such as RCX, it is potentially a pointer. An exception would be if a QWORD variable has been defined, but most developers would likely not need a variable that large and would instead opt to use a DWORD (32-bit) variable.
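The heuristic above can be written down as a small sketch that works out a register’s width from its name. The function names are our own, and the result is only a hint: a 64-bit register may just as easily hold a QWORD value that is not a pointer.

```python
import re


def register_width(name: str) -> int:
    """Return the width in bits of an x64 general purpose register name."""
    reg = name.lower()
    # R8 to R15, optionally with a D/W/B suffix (L is also accepted for bytes).
    numbered = re.fullmatch(r"r(?:[89]|1[0-5])([dwbl]?)", reg)
    if numbered:
        return {"": 64, "d": 32, "w": 16, "b": 8, "l": 8}[numbered.group(1)]
    if re.fullmatch(r"r(?:[abcd]x|[sd]i|[sb]p)", reg):
        return 64
    if re.fullmatch(r"e(?:[abcd]x|[sd]i|[sb]p)", reg):
        return 32
    if re.fullmatch(r"[abcd]x|[sd]i|[sb]p", reg):
        return 16
    if re.fullmatch(r"[abcd][lh]|[sd]il|[sb]pl", reg):
        return 8
    raise ValueError(f"not a general purpose register: {name}")


def may_hold_pointer(name: str) -> bool:
    """Only a full 64-bit register can hold a pointer in x64 code."""
    return register_width(name) == 64


for reg in ("rcx", "ecx", "r8", "r8d", "dil"):
    print(reg, register_width(reg), may_hold_pointer(reg))
```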

Lab 21-1

Analyze the code in Lab21-01.exe. This lab is similar to Lab 9-2, but tweaked and compiled for a 64-bit system.

Question 1

What happens when you run this program without any parameters?

Answer 1

If we attempt to run this in an x86 (32-bit) OS, we’re presented with an error message stating that it is not compatible with this version of Windows, as it has been compiled for a 64-bit OS.

Attempting to run this in a 64-bit OS with a tool such as procmon running reveals that it simply exits and doesn’t do anything of interest. A number of events are still shown based on registry keys queried and files loaded, as part of normal process execution, but nothing stands out as interesting.

Question 2

Depending on your version of IDA Pro, main may not be recognized automatically. How can you identify the call to the main function?

Answer 2

If we open this in IDA Free 7.0 as a standard AMD64 PE file, we find that we’re dumped into the main function located at 0x1400010C0.

If we instead open this in a disassembler which doesn’t identify the main function, in this case Cutter, and enable offset visibility in the Cutter preferences, we can see we start at 0x140001750.

To identify the call to our main function we will need to look for a call, likely after any ‘GetCommandLineA’ checks which may be present. If we examine the underlying structure of this PE file using Detect-It-Easy (DIE), we can see it was created in C++.

If we look up how a main function is declared in the C++ language reference, we find that it takes ‘argc’ and ‘argv’, plus ‘envp’ if environment variables are being passed.

Due to this we know that any call to the main function would need to take 3 parameters, 2 of these being envp and argv, and another which is an integer. If we continue to examine our disassembly, we only find one call to function 0x1400010C0 which takes 3 parameters.

In this instance ECX is our integer (argc): it is a 32-bit register, which is common for integer arguments. The other 2 parameters are passed in the 64-bit registers RDX and R8. If we examine this same function in IDA Free 7.0, we can see these have been identified as argc, argv, and envp, in addition to the main function being labelled.

Question 3

What is being stored on the stack in the instructions from 0x0000000140001150 to 0x0000000140001161?

Answer 3

If we jump to ‘0x140001150’, we can see that some large hexadecimal values are being stored on the stack.

If we press ‘R’ on these values, IDA converts them to ASCII characters.

At first glance it looks like the string ‘.lcoexe’ is being stored on the stack; however, this is because x86 and x64 are little-endian, so multi-byte values are stored with their least significant byte first. As IDA has interpreted each of these as a hex value rather than a string, converting them results in each chunk of characters being displayed backwards. If we reverse each chunk, we find the following string stored on the stack.

ocl.exe

This is the same value we saw in Lab09-02. Given this, it’s possible the program needs to be called ocl.exe in order for it to run correctly.
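To see why the characters come out backwards, we can sketch the conversion ourselves. This assumes the string is written as 4-byte immediates, which is a guess that happens to match the ‘.lcoexe’ display; the immediates below are derived from the string itself and not read from the binary.

```python
import struct


def as_dword_immediates(text: bytes) -> list:
    """Split a string into the 32-bit immediates a compiler might emit for it."""
    padded = text + b"\x00" * (-len(text) % 4)
    return [value for (value,) in struct.iter_unpack("<I", padded)]


def char_constant_view(value: int) -> str:
    """Show an immediate the way a disassembler renders it as characters."""
    # The most significant byte is printed first, which reverses the memory order.
    raw = value.to_bytes(4, "big")
    return raw.replace(b"\x00", b"").decode("ascii")


immediates = as_dword_immediates(b"ocl.exe\x00")
print([hex(value) for value in immediates])
print("".join(char_constant_view(value) for value in immediates))
```

The second line printed is ‘.lcoexe’, while the bytes in memory still read ‘ocl.exe’.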

Question 4

How can you get this program to run its payload without changing the filename of the executable?

Answer 4

If we examine 0x14000120C we can see that a string comparison looks to take place which is likely looking for very specific conditions to be met to allow the malware to run (possibly checking if it is named ocl.exe).

When this comparison fails, the program jumps to 0x14000120E, and from there the jump at 0x140001213 skips past the malware’s primary functionality to loc_1400013D7.

One way we can make this program run its payload without changing the filename is to ensure that even after it fails this check, instead of jumping to loc_1400013D7, it flows right into the primary function which triggers the payload.

If we run this in x64dbg (a newer, more robust debugger which supports 64-bit debugging, introduced here for the first time), we can find the jump located at 0x140001213.

From here it can be modified to instead perform no operation, effectively allowing the program to flow into its payload.
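The same patch can be sketched outside of a debugger. This assumes the jump at 0x140001213 is encoded as a 5-byte near jump (E9 followed by a 32-bit relative offset), which the screenshots don’t confirm; the 0xCC bytes around it are filler standing in for neighbouring code, and the helper names are our own.

```python
def near_jmp_bytes(jmp_address: int, target: int) -> bytes:
    """Encode a 5-byte near jump (E9 rel32) from jmp_address to target."""
    rel = target - (jmp_address + 5)  # relative to the next instruction
    return b"\xE9" + rel.to_bytes(4, "little", signed=True)


def nop_out(code: bytearray, offset: int, length: int) -> None:
    """Overwrite length bytes starting at offset with NOP (0x90)."""
    code[offset:offset + length] = b"\x90" * length


jmp = near_jmp_bytes(0x140001213, 0x1400013D7)
code = bytearray(b"\xCC" * 4 + jmp + b"\xCC" * 4)  # filler, jump, filler
nop_out(code, 4, len(jmp))
print(jmp.hex(), "->", code.hex())
```

Every byte of the instruction has to be replaced; leaving part of the offset behind would be decoded as different instructions.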

If we set a breakpoint at our new NOP values, we can use F9 to run the program and see it hits them without issue. If we then hold F8 to step over functions, we begin to see a decoding routine run, which gives us a known C2.

At this point we can be confident that the check used to determine if the filename of the executable is correct has been bypassed.

Question 5

Which two strings are being compared by the call to strncmp at 0x0000000140001205?

Answer 5

Using x64dbg we can easily create a breakpoint at 0x140001205 and see the two strings being compared stored in RCX and RDX. After setting a breakpoint and pressing F9, to run the program until we hit it, we can see the values being compared are the binary name (Lab21-01.exe) and the string jzm.exe.

Based on this we know that some transformations must be occurring on the string ocl.exe before being used in this comparison.

Question 6

Does the function at 0x00000001400013C8 take any parameters?

Answer 6

Jumping to the call at 0x1400013C8, it isn’t immediately obvious in IDA or x64dbg how many parameters the called function takes, but what we do see is RBX being moved into RCX.

Because we know RCX, RDX, R8, and R9 hold the first 4 parameters of any given function call under the Windows x64 calling convention, we know that whatever is within RBX at the time of this call will be passed to the function called at 0x1400013C8 (sub_140001000). By looking at what is moved into RBX in IDA prior to the call, we can see that it is RAX, or more specifically the socket handle returned by WSASocketA.

By viewing the start of sub_140001000 in IDA we can also see that rcx is being stored back into rbx which is then being used for the standard input, output, and error destination meaning that all output will be redirected to this socket.

Based on this we know that the function called at 0x1400013C8 takes 1 parameter, the socket to our C2.

Question 7

How many arguments are passed to the call to CreateProcess at 0x0000000140001093? How do you know?

Answer 7

It’s not immediately clear how many arguments are passed to the call to CreateProcessA at 0x140001093.

Given IDA has identified this as CreateProcessA though, we can double click on it and see how many arguments are expected to be passed to this call.

In this case we can see there are 10 arguments which are expected to be passed to it.

lpApplicationName
lpCommandLine
lpProcessAttributes
lpThreadAttributes
bInheritHandles
dwCreationFlags
lpEnvironment
lpCurrentDirectory
lpStartupInfo
lpProcessInformation

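Because this is documented, we know that these 10 arguments need to be passed to CreateProcessA. Under the x64 calling convention described earlier, the first 4 are passed in RCX, RDX, R8, and R9, and the remaining 6 are placed on the stack, which is why the number of arguments isn’t obvious from the disassembly alone.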
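As a rough guide to where each argument should be found, the sketch below maps the documented parameters onto the x64 calling convention. The stack offsets are what the convention implies for compiler-generated code at the point of the call (stack arguments sit directly above the 0x20 bytes reserved for RCX, RDX, R8, and R9); they are not read from the lab binary, and the function name is our own.

```python
CREATE_PROCESS_PARAMS = [
    "lpApplicationName",
    "lpCommandLine",
    "lpProcessAttributes",
    "lpThreadAttributes",
    "bInheritHandles",
    "dwCreationFlags",
    "lpEnvironment",
    "lpCurrentDirectory",
    "lpStartupInfo",
    "lpProcessInformation",
]

REGISTERS = ["rcx", "rdx", "r8", "r9"]


def argument_locations(params: list) -> dict:
    """Map each parameter to the register or stack slot expected at the call."""
    locations = {}
    for index, name in enumerate(params):
        if index < len(REGISTERS):
            locations[name] = REGISTERS[index]
        else:
            # 8 bytes per slot, starting after the 0x20 bytes of reserved space.
            locations[name] = f"[rsp+{0x20 + 8 * (index - len(REGISTERS)):#x}]"
    return locations


for name, where in argument_locations(CREATE_PROCESS_PARAMS).items():
    print(f"{name:22} {where}")
```

When reading the disassembly, this means looking for 4 register assignments plus 6 ‘mov’ instructions writing to stack slots before the call, rather than 10 ‘push’ instructions as we would expect in 32-bit code.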
