Remcos RAT - Malware Analysis Lab


RemcosPicture

Overview

Part 1: Preliminary Static Analysis of Starting Binary

The starting point is a malicious executable which has been categorised on VirusTotal as a trojan with the names ‘MSIL/AgentTesla’ and ‘TR/AD.Remcos’.

First off, obtain the sample with the following SHA256 hash:

Starting IOC (SHA256): 7a1bb4fe0f62425fdd2e163ea17d84465323c4f2df8aabb8a50b1433e7d42a9f

Analysis in pestudio reveals this is a .NET, 32-bit executable with timestomped Debugger and Compiler timestamps. It also had an original name during development of ‘tocehi.exe’; however, this may also have been tampered with.

RemcosPicture

Examining the resources section, there is a resource with exceptionally high entropy, which indicates it likely contains compressed or encrypted data. There’s also a repeating theme of bytes spelling out ‘PAD’, which may indicate junk data has been added as padding to the binary to make it more challenging to analyse.

RemcosPicture

Examining the imports shows this is likely dynamically loading an assembly into memory and invoking it, in addition to possibly performing string reversal operations.

RemcosPicture

Part 2: Decompiling Binary

Opening the executable in dnSpyEx, right clicking it, and using ‘Go to Entry Point’ takes us to the start of our binary, where it runs a new instance of Form1.

RemcosPicture

Examining this shows what looks to be decoy code and an instance of the form component being initialized. Of interest is that this form uses the System.Reflection namespace, which is unusual and signifies reflective loading of code will likely occur.

RemcosPicture

Examining this shows a lot of form initialization which seems innocuous at first; however, at one point it gets a string stored within a resource object called “CFD” within the class ‘Form11’, replaces all instances of “$” with “E”, reverses it, decodes it from base16 (hex), and loads the result as a raw assembly.

RemcosPicture

By copying the resource CFD into CyberChef and performing these operations, it’s revealed this is likely a PE file being loaded into memory.
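The same CyberChef recipe can be sketched in Python. This is a minimal sketch, assuming the CFD resource has been exported as plain text; the function names and the toy input are made up for illustration and are not taken from the sample.

```python
def decode_stage1(resource_text: str) -> bytes:
    """Undo the string obfuscation: '$' -> 'E', reverse, then hex-decode."""
    restored = resource_text.replace("$", "E")
    reversed_hex = restored[::-1]
    return bytes.fromhex(reversed_hex)


def looks_like_pe(data: bytes) -> bool:
    """A PE file starts with the DOS magic 'MZ' (0x4D 0x5A)."""
    return data[:2] == b"MZ"


# Toy input (NOT the real resource): the bytes 4D 5A 90 00 0E obfuscated
# the same way the loader expects them.
toy = "4D5A90000E"[::-1].replace("E", "$")
payload = decode_stage1(toy)
print(toy, "->", payload.hex(), "PE header:", looks_like_pe(payload))
```

Running the real resource text through `decode_stage1` and writing the result to disk should give the same file that CyberChef produces, provided the export contains no extra whitespace.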

RemcosPicture

Saving this to a file, a new SHA256 hash can be obtained which has been seen by VirusTotal and is flagged as a ‘spreader’ and ‘injector’.

IOC (SHA256): 280001013946838a651abbdee890fa4a4d49c382b7b5e78b7805caef036304e2

Of interest is that when this instance is created it is passing an object array called ‘ext’ which is defined from Form1.EXT. This is defined as a string array containing the entries “71474C547242”, “69786F”, “DoanHQTCSDL” all 3 of which are essential for later analysis.

RemcosPicture

Part 3: Examining Embedded 1st Stage Payload (Pend.dll)

Opening up in pestudio shows this had an internal name of ‘Pend.dll’ during development, is likely obfuscated using the SmartAssembly .NET obfuscator, and was compiled on May 3rd, 2023 at 04:45:48 UTC which seems far more plausible to be legitimate given the time this sample was found in the wild.

RemcosPicture

Using de4dot the binary can be deobfuscated automatically.

RemcosPicture

Looking at the main method of this binary shows more decoy code and then a call to a method named ‘xp’, passing in 3 strings string_0, string_1, string_2.

RemcosPicture

It should be noted that these 3 strings are the 3 strings seen in our original binary which were passed to this DLL upon instantiation.

This method instantiates an instance of what’s returned from the ‘oJ’ method, specifically of type ‘Munoz.Himentater’.

RemcosPicture

Examining this reveals an overly large byte array which is being GZip decompressed back into a MemoryStream to be loaded.

RemcosPicture

By changing our decompiler output to Common Intermediate Language (IL) and examining method ‘oJ’ again, the full array bytes can be located within a defined structure, and the instructions seem similar to what was seen before, specifically defining a byte array of size 14340.

RemcosPicture

RemcosPicture

By copying this array into CyberChef, removing whitespace and line breaks, and converting from hex, it is then identified as a Gzip file.

RemcosPicture

Saving this and using a tool such as 7-zip allows it to be decompressed, noting that there’s data appended to this memory stream.
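As an alternative to CyberChef and 7-zip, the same steps can be sketched in Python. This is only an illustration: the helper names are hypothetical and the input is a toy gzip stream with four bytes appended, standing in for the real array copied from the IL view.

```python
import gzip
import zlib


def hex_dump_to_bytes(hex_text: str) -> bytes:
    """Strip whitespace and line breaks, then hex-decode."""
    return bytes.fromhex("".join(hex_text.split()))


def gunzip_with_trailer(blob: bytes) -> tuple:
    """Decompress the first gzip member and return (data, trailing bytes)."""
    if blob[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    decompressor = zlib.decompressobj(wbits=31)  # 16 + 15: expect gzip framing
    data = decompressor.decompress(blob) + decompressor.flush()
    # Anything left after the gzip trailer ends up in unused_data.
    return data, decompressor.unused_data


# Toy stand-in for the copied array: a gzip stream plus appended bytes.
sample = gzip.compress(b"MZ toy payload") + b"\x00\x00\x00\x00"
hex_text = " ".join(f"{b:02X}" for b in sample)
data, trailer = gunzip_with_trailer(hex_dump_to_bytes(hex_text))
print(data, "trailing bytes:", len(trailer))
```

Returning the trailing bytes separately makes the appended data visible instead of silently discarding it.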

RemcosPicture

Part 3: Examining Embedded 2nd Stage Payload (Cruiser.dll)

Examining this new binary in pestudio reveals it is also likely obfuscated using SmartAssembly .NET obfuscator, and had a name of ‘Cruiser.dll’. It was also likely compiled on Monday April 10th 2023 at 10:01:02 UTC which seems plausible given it’s before the injector binary was compiled.

IOC (SHA256): 40C050C20D957D26B932FAF690F9C2933A194AA6607220103EC798F46AC03403

Examining this on VirusTotal it is flagged as a trojan with the name ‘tedy/vsntdh23’.

Repeating the same process with de4dot and decompiling this shows that it has the namespace ‘Munoz’ which contains the class ‘Himentater’ amongst others which is specifically what we’re looking for. If we consider the first stage of this malware which ran the method ‘xp’, we can see that it was using the method ‘CasualitySource’, and is first passing in string_0 (“71474C547242”) before string_1 (“69786F”).

RemcosPicture

Examining CasualitySource reveals it is a simple string operator which converts hex given to its raw ASCII format which results in the values ‘qGLTrB’ and ‘ixo’.
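The conversion is easy to reproduce outside of the malware. The snippet below is a small sketch of the equivalent operation in Python; `hex_to_ascii` is a name chosen here for illustration and is not the malware’s own code.

```python
def hex_to_ascii(hex_string: str) -> str:
    """Convert a hex string to the ASCII text it encodes."""
    return bytes.fromhex(hex_string).decode("ascii")


# The two hex strings passed in from the original binary.
for value in ("71474C547242", "69786F"):
    print(value, "->", hex_to_ascii(value))
```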

RemcosPicture

Part 3: Examining Embedded Steganography Binary

The next part of the code involves looking at multiple binaries together and gets a bit involved. The chain of events is as follows:

  • Return a bitmap through a method called ‘jR’ via a class called ‘sZ’ from within the namespace ‘TN’ which is inside of the 1st stage payload. By using the variables string_0 (qGLTrB) and string_2 (DoanHQTCSDL), make up the targeted resource (DoanHQTCSDL.Properties.Resources.qGLTrB) and store this into a byte array after subtracting 150 pixels from its height and width.
  • Convert the returned bitmap into a byte array through a method called ‘sZ’ via a class called ‘sZ’ from within the namespace ‘TN’ which is inside of the 1st stage payload.
  • Perform operations using the ‘SearchResult’ method from within the 2nd stage payload to convert the byte array using the term variable string_1 (ixo).
  • Load the deobfuscated byte array as an assembly into memory.

RemcosPicture

Although these operations can be manually reversed, it’s a bit tedious and complicated. Instead we can run the original binary in dnSpyEx to get the final product. To do this:

  • Open the original executable in dnSpyEx and create a breakpoint on mscorlib.dll within the Sleep function statement checking if the AppDomainPauseManager is paused.

RemcosPicture

  • Run the binary until the breakpoint is hit and use Step Out (Shift + F11) to land at the start of the decompiled 1st stage binary (Pend.dll).
  • Create breakpoints at the operations which are retrieving the 3rd stage payload and observe the Local variable window to see the modifications occurring.
  • At the third breakpoint observe the 3rd stage binary in memory which can now be saved to disk.

RemcosPicture

RemcosPicture

Part 4: Examining 3rd Stage Payload (Discompard.dll)

This new binary crashes pestudio, and examining it in dnSpyEx shows it is posing as software from the company ‘Citroen’, has the name ‘Plant Scientist’, and is apparently copyrighted to the 2004 Citroen C5…righteo then. The binary hasn’t been seen by VirusTotal either; however, we still have luck using de4dot which detects an unknown obfuscator has been used, cleans it up and gives us something pestudio can analyse.

IOC (SHA256): ACB4301D445B5C125A8CEDD00427D6F89EC89A1F01A9D7D4E7CC183D017F984D

The binary appears to have had an internal name of ‘Discompard.dll’, and was possibly compiled on May 3rd, 2023 at 06:40:48 UTC almost 2 hours after the 1st stage was compiled.

RemcosPicture

Examining this in dnSpyEx shows a number of unknown methods and types without any clear entry point.

RemcosPicture

Locating Method To Be Invoked

Despite this we know that the binary was being dynamically loaded into memory through reflection and that it was looking for the 20th element in the returned assembly types. Reflectively loading this module into memory we can store this in an object and examine it.

$malware=$([System.Reflection.Assembly]::Load(([byte[]]@(Get-Content "C:\Users\Barry\Downloads\stage3.dll" -Encoding byte))).GetTypes()[20])
$malware.GetMethods()[29]

RemcosPicture

In the above we have an issue. Although it looks like a correct method to invoke has been pulled, the method isn’t marked static so can’t be invoked like was seen to be occurring when examining the 1st stage malware. Looking at the methods available there’s 9 different ones which can be found with the following:

$malware.GetMethods() | Where-Object { $_.IsStatic -and $_.IsPublic } | Select-Object -ExpandProperty Name

RemcosPicture

Based on what was seen in the 1st stage malware, there are 2 parameters being passed to the invoke method: the object the method is run on, and the parameters passed to the method itself. As the methods are static, the object field is ignored. The parameters, on the other hand, are not ignored, so this tells us that no parameters are to be passed to the method being invoked. By cross-correlating the methods that were seen to be static with those shown in dnSpyEx, it’s seen that there are only 2 methods this could be: ‘T4Z5pBPufA’ or ‘uqH5vT69wm’.

RemcosPicture

A glance at ‘T4Z5pBPufA’ shows a single line which sets an integer to 0 and nothing more, whereas ‘uqH5vT69wm’ seems far more promising. In particular ‘uqH5vT69wm’ has a reference to the ApplicationData directory which is suspicious, isn’t called by any other method unlike ‘T4Z5pBPufA’ and to top it all off, it’s method number 29 in the dnSpyEx hierarchy which is the number that was being invoked in the 1st stage payload.

RemcosPicture

It’s currently unclear why the order in dnSpyEx shows correctly, but when dynamically loading into memory using PowerShell this order is scrambled and fails. It’s likely due to multiple methods being retrieved from other DLLs upon reflectively loading, but alas we’re on the right path again.

Debugging The Binary

The binary itself is highly obfuscated and has a number of string building operations which makes manually analysing every component of it tedious; however, the main functions can be seen by stepping through this by debugging it in dnSpyEx. Breaking at line 360 of class ‘TVmkvjWsJHbPcmFjgw’ shows the method ‘uqH5vT69wm’ evaluating what looks to be the original binary being run, and a hardcoded executable name in the Roaming AppData folder which may indicate a copy of the malware is going to be placed there.

RemcosPicture

Breaking at line 14 of the class ‘KIpLLvYdUNjv6s5VFsq’ shows the method ‘J4GMnwe6t’ returning deobfuscated instructions which confirm this suspicion as shown clearly in variables on the stack. This is also reflected in the local variables by breaking on the ‘copy’ method inside of the ‘System.IO.File’ class.

RemcosPicture

Breaking at line 109 of the class ‘TVmkvjWsJHbPcmFjgw’ shows the method ‘zJ45fVOtms’ setting permissions and file attributes on the malware which was copied into the AppData folder. Specifically, it obtains the user details of who ran the binary and sets it so they only have Read, ReadAndExecute, and ReadData permissions to the malware on disk. It also sets the malware to not be indexed by Windows, and marks it as Hidden and as a critical system binary to make it even more hidden by default.

RemcosPicture

At the end of this method it is seen that the permissions are successfully applied to the malware.

RemcosPicture

Breaking at line 13 of the class ‘CRSlWTd5bfbCGtYKOR’ shows the method ‘J4GMnwe6t’ returning a base64 string. It’s important to note that there’s a large number of classes each with a different method called ‘J4GMnwe6t’ used for deobfuscating strings.

RemcosPicture

Breaking at line 132 of the class ‘TVmkvjWsJHbPcmFjgw’ shows the method ‘Qle573GMuP’ having base64 decoded the string in local variables. We can also base64 decode it ourselves in something like CyberChef to show an XML configuration schema for a scheduled task. Of note is that the UserId field is set as [USERID] and isn’t filled in.
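The decode step, and the idea of a template with an unfilled [USERID] placeholder, can be sketched in Python. Everything below is illustrative: the XML fragment is a toy stand-in rather than the sample’s task definition, UTF-8 is assumed as the text encoding, and the function names are hypothetical.

```python
import base64


def decode_task_template(encoded: str) -> str:
    """Base64-decode the task template (UTF-8 is an assumption here)."""
    return base64.b64decode(encoded).decode("utf-8")


def fill_user_id(template: str, user_id: str) -> str:
    """Substitute the [USERID] placeholder with a concrete identity."""
    return template.replace("[USERID]", user_id)


# Toy template (NOT the sample's XML), encoded the way the string is stored.
toy_xml = (
    "<Task><Principals><Principal>"
    "<UserId>[USERID]</UserId>"
    "</Principal></Principals></Task>"
)
encoded = base64.b64encode(toy_xml.encode("utf-8")).decode("ascii")

template = decode_task_template(encoded)
print("placeholder present:", "[USERID]" in template)
print(fill_user_id(template, r"HOSTNAME\user"))
```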

RemcosPicture

Breaking at line 135 of the same class shows the same method having now deobfuscated the user identity to be used. It also has built a string to a temporary file location.

RemcosPicture

Stepping through a couple more instructions shows that a file is written to the identified temporary file containing the complete scheduled task XML. It should be noted this has a hardcoded IOC of the scheduled task registration time being spoofed.

RemcosPicture

IOC Scheduled Task Date: 2014-10-25 14:27:44:8929027

Stepping over a few more functions shows strings building out the value ‘schtasks.exe’. Breaking at line 145 of the class ‘TVmkvjWsJHbPcmFjgw’ shows the method ‘Qle573GMuP’ returning a commandline which will be used with schtasks.exe to register a scheduled task and establish persistence with the task name ‘Updates\ALKgmyycVaEjJx’.

RemcosPicture

Breaking at line 64 of the class ‘iY9VXx99XrYI4WaMs1’ shows the method ‘kHZWDDrKNw’ returning yet another binary, which is retrieved and deobfuscated from a resource and can be saved for analysis.

RemcosPicture

Stepping through a little further it appears that this is being injected into a surrogate process in method ‘Er95CrvjJc’ of class ‘TVmkvjWsJHbPcmFjgw’, so we can now move onto stage 4.

Part 5: Examining 4th Stage Payload (Remcos RAT)

The 4th stage payload is the most promising yet. Examining it in pestudio shows it was likely compiled on December 20th, 2022 at 21:35:57 UTC, and that it is written in a completely different language to the previous injectors (3 injected DLLs plus the initial injector wrapper): C++ as opposed to .NET.

RemcosPicture

IOC SHA256: 94a4e5c7a3524175c0306c5748c719a940a7bfbe778c5a16627193a684fa10f0

Checking this binary on VirusTotal shows it has been categorised as a trojan with the name ‘remcos’ and has a significant detection rate. This means we’ve likely finally hit the final stage payload of Remcos. Further to this, examining the resources section shows a ‘SETTINGS’ resource, which is a known indicator of Remcos RAT. It also has a high entropy level, indicating it is likely compressed or encrypted.

RemcosPicture

Decrypting the Remcos RAT Configuration Resource

A post from the team at Morphisec, who analysed a different sample in the past, highlights how Remcos RAT uses RC4 encryption on the ‘SETTINGS’ resource to protect the malware configuration. Specifically, the first byte of this resource defines the key length, the bytes that follow (up to that key length) are the key, and the rest of the resource is the encrypted data.
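The layout described above (length byte, key, ciphertext) can be sketched in Python with a textbook RC4 implementation. This is a minimal sketch rather than Remcos’s own code: the function names are chosen for illustration, and the toy resource is built from the well-known ‘Key’ / ‘Plaintext’ RC4 test pair instead of real sample data.

```python
def rc4_ksa(key: bytes) -> list:
    """Key-scheduling algorithm: permute a 256-entry state using the key."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    return state


def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """PRGA: XOR the data with the keystream (same call encrypts and decrypts)."""
    state = rc4_ksa(key)
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def decrypt_settings(resource: bytes) -> bytes:
    """Split the resource into [key length][key][ciphertext] and decrypt."""
    key_length = resource[0]
    key = resource[1:1 + key_length]
    return rc4_crypt(key, resource[1 + key_length:])


# Toy resource (NOT the sample's SETTINGS data) built with the same layout.
toy_key = b"Key"
toy_resource = bytes([len(toy_key)]) + toy_key + rc4_crypt(toy_key, b"Plaintext")
print(decrypt_settings(toy_resource))
```

Feeding the exported ‘SETTINGS’ resource bytes to `decrypt_settings` should give the same plaintext as the CyberChef recipe, assuming the resource follows the layout described in the Morphisec post.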

RemcosPicture

Using this in CyberChef provides what looks to be a configured C2 server and port, in addition to what may be a unique identifier and a number of other fields.

RemcosPicture

IOC C2 Domain: gdyhjjdhbvxgsfe[.]gotdns[.]ch
IOC Port: 2718
IOC Host Identifier: Rmc-JQX1JF

A publicly available decoder by kevthehermit provides extra insight into this configuration file and what data it may contain.

Rebasing and Dynamically Resolving Imported Functions

Given the RAT is created in C++, leveraging x32dbg and Ghidra is a great way to uncover how it works at different parts of the program. Specifically if the base address between these are synced then it’ll make the analysis process that much smoother. After opening in x32dbg the base address can be seen and copied in the Memory Map.

RemcosPicture

Opening the memory map in Ghidra, this can be rebased by using the house icon.
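When the two tools are not rebased to the same address, translating between them is simple arithmetic: subtract one base to get the offset, then add the other. The sketch below assumes an image base of 0x400000 on the Ghidra side and uses a made-up debugger base of 0x00A50000; both values and the helper names are illustrative.

```python
GHIDRA_BASE = 0x400000  # assumed image base on the Ghidra side


def to_offset(address: int, base: int = GHIDRA_BASE) -> int:
    """Offset of an address relative to the module base."""
    return address - base


def rebase(address: int, old_base: int, new_base: int) -> int:
    """Translate an address from one module base to another."""
    return address - old_base + new_base


ghidra_address = 0x0040E4E3
debugger_base = 0x00A50000  # hypothetical base reported by the debugger

print(hex(to_offset(ghidra_address)))
print(hex(rebase(ghidra_address, GHIDRA_BASE, debugger_base)))
```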

RemcosPicture

A quick and dirty way of getting context on what may be resolved from or sent to a particular API is to create a conditional breakpoint which logs the address information of all registers whenever an API call of interest is made. Breaking on LoadLibraryExW and LoadLibraryA in x32dbg by running the below command can be used as a starting point.

bp LoadLibraryExW
bp LoadLibraryA

These can then be edited to include ‘Log Text’ similar to the below.

LoadLibraryExW: eax:{a:eax} ebx:{a:ebx} ecx:{a:ecx} edx:{a:edx} ebp:{a:ebp} esp:{a:esp} esi:{a:esi} edi:{a:edi}
LoadLibraryA: eax:{a:eax} ebx:{a:ebx} ecx:{a:ecx} edx:{a:edx} ebp:{a:ebp} esp:{a:esp} esi:{a:esi} edi:{a:edi}

RemcosPicture

By running the program, every time the breakpoint is hit, x32dbg will log the registers. Although it may seem noisy, this approach can quickly yield useful information. In this sample ‘LoadLibraryA’ appears to be used to get a handle on and load a number of functions at run time, such as ‘GetComputerNameExW’ and ‘GetSystemTimes’, which are not explicitly imported by the RAT.

RemcosPicture

Examining Entry and Editing WinMain Function in Ghidra Decompiler

Examining the entrypoint in Ghidra shows 2 functions, ‘__security_init_cookie’ and ‘__scrt_common_main_seh’ which are part of C and C++ initialisation code.

RemcosPicture

The main Remcos method exists inside of ‘__scrt_common_main_seh’ and can be examined. Using Ghidra, the function call trees can be shown, which at a high level reveal 2 custom functions of interest; however, only one of them, ‘FUN_0040db10’, contains more than a single line and has substantial subfunctions.

RemcosPicture

Glancing at this function in Ghidra makes it apparent that this is the WinMain function given it is running all subfunctions; however, the Ghidra decompiler has failed to identify this, and it is reporting only 3 parameters being passed to the function: ‘0x400000’, ‘0’, and ‘pcVar7’. By right clicking and editing the function this can be cleaned up.

RemcosPicture

Going through each datatype, this function can be fixed to more accurately replicate the WinMain function signature shown below.

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, PSTR lpCmdLine, int nCmdShow)

RemcosPicture

The end result is as follows:

RemcosPicture

Saving this and returning to the decompiled code shows something which makes a lot more sense. The instance is getting a handle to the current executable’s DOS Header, passed parameters are being retrieved by ‘__get_narrow_winmain_command_line’, and whether or not to show this application window is being retrieved from ‘__get_show_window_mode’.

RemcosPicture

A quick look at what’s retrieved for the show window value shows this will always return 0.

Initial Analysis of WinMain Function in Ghidra

Combing over WinMain shows a large number of functions are present. The first function ‘FUN_0041ae1a’ at a glance looks like it is dynamically importing libraries to be used at runtime based on its API calls.

RemcosPicture

Examining the function confirms these suspicions, and also provides more context to the imported functions which were seen during dynamic analysis.

RemcosPicture

This can be renamed to something more meaningful such as ‘FUN_Load_Imports’.

Decrypting SETTINGS Resource in Ghidra and x32dbg

Examining the next function ‘FUN_0040e4a3’, shows it immediately makes a call to ‘FUN_004199a9’ which appears to be getting the SETTINGS resource using FindResourceA and LoadResource, and is storing this into a byte stream to be used.

RemcosPicture

There’s also mention of FID_conflict, which comes from Ghidra’s ‘Function ID’ analyser having found multiple functions that match a computed hash during analysis. This can be resolved by using a plugin such as Andrew Strelsky’s ResolveFidScript.

RemcosPicture

Using x32dbg, a breakpoint can be set at the address ‘0040E4E3’ (offset 0xE4E3). Once run, it’s shown in the memory dump of register EBX that this was in fact retrieving the RC4 key from the SETTINGS resource.

RemcosPicture

At this point the Base Pointer Register (EBP) is also set to 76 which is the RC4 key length.

RemcosPicture

Finally the entire contents of the SETTINGS resource is stored within the Destination Index Register (EDI).

RemcosPicture

‘FUN_004199a9’ can now be renamed to ‘FUN_Load_Config’ in Ghidra. Creating breakpoints on each function call and running the program in x32dbg gives some idea of later functions. Specifically only minor operations occur until ‘FUN_0040644c’ at address ‘0040E53B’ (offset 0xE53B). At this call the Source Index Register points to memory containing the RC4 encrypted content, and EBX contains the RC4 key.

RemcosPicture

Looking into this function it runs a couple of other functions, but of particular interest is function ‘FUN_004063b0’ which is performing some sort of iterative looping operation with a noted array of 256 integers having been defined which is being used in subsequent XOR operations.

RemcosPicture

Knowing how RC4 works makes identifying this function as the Pseudo-random generation algorithm (PRGA) much easier. Comparing the pseudocode to Ghidra’s decompiled output these operations can mostly be seen.

RemcosPicture

The function immediately prior to the PRGA function, ‘FUN_0040632b’, is the Key-scheduling algorithm (KSA) that RC4 needs for both encryption and decryption. It can be recognised by its use of 0x100 (256), which is the size of the array used in ‘FUN_004063b0’ and also the maximum RC4 key length. At a glance it’s more difficult to determine what is occurring here based solely on Ghidra’s decompiled interpretation; however, the function graph helps to see common looping trends.

RemcosPicture

Jumping over to x32dbg, a breakpoint can be placed at address ‘0040643A’ (offset 0x643A) to see how this impacts the RC4 encrypted data on the stack. On the first run, the first byte changes to a ‘g’ in ASCII.

RemcosPicture

Breaking outside of this loop at address ‘0040642B’ (offset 0x642B) shows the decrypted content on the stack.

RemcosPicture

From this, ‘FUN_004063b0’ can be renamed to ‘FUN_RC4_PRGA’, ‘FUN_0040632b’ can be renamed to ‘FUN_RC4_KSA’, and ‘FUN_0040e4a3’ can be renamed to ‘FUN_Load_Decrypt_Config’ in Ghidra.

TBA

The next major operation is a string comparison at address 0040DB76 (offset 0xDB76) which checks whether ‘-l’ is being passed to Remcos.

RemcosPicture

MORE TBA