Bell·Curve.
On payload development as a bell curve — and why the smartest move and the dumbest one end up saying the same thing.
Malware development (or, more honestly, payload development) is a bell curve, especially in Red Teaming. Developing a payload involves several layers of abstraction, and each one has its own caveats. The more sophisticated a payload is, the more it diverges from the benign, and the more it diverges from the benign, the easier it is to fingerprint as abnormal. The two ends of the curve mirror each other.
The flow I like to follow when building a new payload, without spending too many resources on crafting novel techniques, is to classify the techniques and the primitives they abuse. If you master the fundamentals, you can get 90% of the things done in any field, and maldev should be no different. Below, I will try to classify known maldev techniques and their detections along with the primitives they abuse. This methodology has been very useful for crafting payloads on the fly that evade well-known EDRs while leveraging well-known techniques with trivial tweaks.
I would assume that most maldev starts with a basic implementation of process injection. How the shellcode is injected (written) and executed is what differentiates the techniques. Regardless, the primitives remain the same: Read, Write and Execute. Every technique uses one or more of these, and the detections are built around them as well.
| Technique Family | Scope | Primitive | Detection Focus |
|---|---|---|---|
| Remote Thread Injection | Cross Process | Write + Execute | Remote thread creation |
| APC Injection | Cross Process | Execute | APC telemetry |
| Thread Context Hijacking | Cross Process | Execute | Thread state anomalies |
| Process Hollowing | Cross Process | Write + Execute | Image inconsistency |
| Reflective Loading | Local Process | Read + Write + Execute | Unbacked modules |
| Manual Mapping | Local/Cross | Read + Write + Execute | Loader anomalies |
| Section Mapping | Cross Process | Write + Execute | Shared executable memory |
| Module Stomping | Local/Cross | Write | PE corruption |
| Syscall-based Execution | Local/Cross | Execute | Syscall provenance |
| Threadless Injection | Cross Process | Execute | Indirect execution flow |
| Callback-based Execution | Local/Cross | Execute | Callback registration |
| LOLBin Abuse | Local Process | Execute | Trusted binary misuse |
| Hooking Techniques | Local Process | Write | Code section modification |
| Telemetry Tampering | Local Process | Write + Execute | Sensor integrity failures |
| Driver-assisted Techniques | Kernel-mediated | Write + Execute | Driver/kernel telemetry |
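To make the classification easier to query, the table above can be kept as plain data. The following is only a minimal sketch: the `Technique` class and the helper names (`only_primitive`, `detections_for`) are my own illustrative choices, and the rows are copied from the table without adding anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technique:
    family: str
    scope: str
    primitives: frozenset
    detection: str


def T(family, scope, primitives, detection):
    # "Write + Execute" -> frozenset({"Write", "Execute"})
    parts = frozenset(p.strip() for p in primitives.split("+"))
    return Technique(family, scope, parts, detection)


# Rows copied verbatim from the table above.
TECHNIQUES = [
    T("Remote Thread Injection", "Cross Process", "Write + Execute", "Remote thread creation"),
    T("APC Injection", "Cross Process", "Execute", "APC telemetry"),
    T("Thread Context Hijacking", "Cross Process", "Execute", "Thread state anomalies"),
    T("Process Hollowing", "Cross Process", "Write + Execute", "Image inconsistency"),
    T("Reflective Loading", "Local Process", "Read + Write + Execute", "Unbacked modules"),
    T("Manual Mapping", "Local/Cross", "Read + Write + Execute", "Loader anomalies"),
    T("Section Mapping", "Cross Process", "Write + Execute", "Shared executable memory"),
    T("Module Stomping", "Local/Cross", "Write", "PE corruption"),
    T("Syscall-based Execution", "Local/Cross", "Execute", "Syscall provenance"),
    T("Threadless Injection", "Cross Process", "Execute", "Indirect execution flow"),
    T("Callback-based Execution", "Local/Cross", "Execute", "Callback registration"),
    T("LOLBin Abuse", "Local Process", "Execute", "Trusted binary misuse"),
    T("Hooking Techniques", "Local Process", "Write", "Code section modification"),
    T("Telemetry Tampering", "Local Process", "Write + Execute", "Sensor integrity failures"),
    T("Driver-assisted Techniques", "Kernel-mediated", "Write + Execute", "Driver/kernel telemetry"),
]


def only_primitive(techniques, primitive):
    """Families that rely on exactly one primitive."""
    wanted = frozenset({primitive})
    return [t.family for t in techniques if t.primitives == wanted]


def detections_for(techniques, primitive):
    """Detection focuses attached to any family that uses the primitive."""
    return sorted({t.detection for t in techniques if primitive in t.primitives})


if __name__ == "__main__":
    print(only_primitive(TECHNIQUES, "Write"))
    print(detections_for(TECHNIQUES, "Write"))
```

Filtering on a single primitive is a quick way to see which families leave the smallest number of detection surfaces to reason about.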
The least interesting ones are obviously the plain vanilla execution methods, and there isn't much that can be done with those. They are great for learning about process injection, but beyond that they serve no other purpose, at least to me.
Let's quickly map out a brief description of the categories.
01 Scope
| Category | Description | Examples |
|---|---|---|
| Local Process Injection | Payload executes inside current process | Reflective loading, local shellcode |
| Cross Process Injection | Payload executes inside remote process | Remote thread, APC, hollowing |
| Kernel-mediated Injection | Kernel assists execution/memory ops | Driver-assisted techniques |
02 Memory Manipulation
03 Execution Transfer
| Category | Primitive | Examples |
|---|---|---|
| New Thread Execution | Execute | Remote thread |
| Thread Hijacking | Execute | Context manipulation |
| APC-based Execution | Execute | APC/Early Bird |
| Callback-driven Execution | Execute | TLS, VEH, hooks |
| ROP/Indirect Control Flow | Execute | Gadget chains |
| Threadless Execution | Execute | Indirect dispatch |
04 Loader
| Category | Description | Examples |
|---|---|---|
| Native Loader Abuse | Use OS loader normally | DLL injection |
| Reflective Loading | Self-loading PE | Reflective DLL |
| Manual Mapping | Reimplement loader | PE manual map |
| PE-less Execution | No PE semantics | Raw shellcode |
| Image Mutation | Alter loaded image | Hollowing/stomping |
05 Trust Abuse
| Category | Description | Examples |
|---|---|---|
| LOLBins | Abuse signed binaries | MSBuild, Rundll32-class abuse |
| COM/Service Abuse | Trusted subsystem execution | COM hijack |
| Signed Module Abuse | Trusted module execution | Module proxying |
| Scripting Host Abuse | Trusted interpreters | PowerShell/WMI |
06 Telemetry Manipulation
| Category | Purpose | Examples |
|---|---|---|
| Syscall Manipulation | Reduce userland visibility | Direct/indirect syscalls |
| Stack Manipulation | Distort execution provenance | Stack spoofing |
| Hook Evasion | Avoid instrumentation | Unhooking |
| Sleep Obfuscation | Conceal dormant payloads | Gargoyle-style |
| Sensor Tampering | Blind detections | ETW/AMSI tampering |
07 Module Stomping
I will choose module stomping because it has only one primitive that I need to worry about. The entire technique depends on over'writing' an existing module (DLL) with a malicious one. When performed correctly, you do not have to worry about typical telemetry sources like API hooks, as we replace the entire section. Another beauty of stomping is that no new thread needs to be created, since we are already writing to an executable section of memory.
The detection around this technique is pretty straightforward as well: defences check whether the file on disk matches the bytes loaded in memory. The straightforward solution is to change either the file on disk or the module in memory so that the two match. The former is much easier, provided that:
- you can modify the target DLL
- the DLL is not signed
So you can look for processes that load DLLs from user-writable directories and modify those DLLs to match your in-memory payload. A vital detail here is that you should not drop non-obfuscated, well-known payloads on disk. So your loader should do something like this:
replace a dll on disk -> start the target process -> stomp Module
I am not going to give out the names of any DLLs, if there are any that fit in the first place. But if you can already write a DLL to disk, why even worry about stomping? You can simply have a legitimate binary load your DLL. It might sound sophisticated, but it is a very dumb and straightforward solution that might actually work. (Remember the bell curve.)
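For reference, the disk-versus-memory check described above can be modelled from the defender's side. This is a minimal sketch, not how any particular product implements it: it assumes the code section has already been extracted from both the file and the process as raw byte buffers, and it ignores relocations and other legitimate differences that a real check would have to account for. The function names are my own.

```python
import hashlib


def section_digest(data: bytes) -> str:
    """SHA-256 of a code section, used as a cheap equality check."""
    return hashlib.sha256(data).hexdigest()


def matches_disk(disk: bytes, memory: bytes) -> bool:
    """True when the in-memory bytes are identical to the on-disk bytes."""
    return section_digest(disk) == section_digest(memory)


def mismatched_ranges(disk: bytes, memory: bytes):
    """Return half-open (start, end) offsets where the two buffers differ."""
    ranges = []
    start = None
    limit = min(len(disk), len(memory))
    for i in range(limit):
        if disk[i] != memory[i]:
            if start is None:
                start = i          # a differing run begins here
        elif start is not None:
            ranges.append((start, i))  # the differing run ended at i
            start = None
    if start is not None:
        ranges.append((start, limit))
    if len(disk) != len(memory):
        # Any size difference is reported as one trailing range.
        ranges.append((limit, max(len(disk), len(memory))))
    return ranges


if __name__ == "__main__":
    on_disk = bytes(range(16))
    in_memory = bytearray(on_disk)
    in_memory[4:8] = b"\xcc" * 4       # simulate an overwritten region
    print(matches_disk(on_disk, bytes(in_memory)))
    print(mismatched_ranges(on_disk, bytes(in_memory)))
```

Both options in this section are attempts to make this comparison come out clean, either by changing the left-hand side (the file) or the right-hand side (the memory).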
The second option, and the widely adopted way around the detection, works in the opposite direction: make sure your memory looks the same as the disk. Once the shellcode has executed, it must revert the memory back to the original module that resided there. We also do not want the shellcode to be readable by defensive products, so the ideal move is to keep it encrypted/encoded whenever it is not executing. Here are a few descriptive steps:
- Copy the original module into a separate buffer (in memory)
- Stomp the module and decrypt the shellcode so that it can execute
- Encrypt the shellcode once execution is finished
- Replace the stomped module with the original module
This is just one way to approach the problem of module stomping and evading detection. Even this would probably not be sufficient, but it is a decent base to build upon. The main point I want to make is that the bell curve is very real, and it is very easy to fall into the trap of over-engineering your payloads and techniques. The more you diverge from the benign, the easier you are to detect, so sometimes the best solution is the simplest one. I have had success with my stage 0/1 payloads by just being normal, without any fancy moves. Stage 2 payloads, however, need some engineering to either blind the telemetry sources or evade them entirely.
It is this exact challenge of knowing where to abstract and where to be explicit that makes this process very enjoyable for me. I find that it hits that sweet spot of Zen where you need to embrace the paradoxical nature of being dumb and creative at the same time.
08 Callback Execution
Let's look into another technique: callback-based code execution. I like the concept because it relies on deception, leveraging a legitimate API call which in turn calls the shellcode residing at the given address (function pointer). The call stack looks legitimate, but the detection around this is a hardcoded list of APIs. Over time, I have noticed that many maldevs have found less common APIs that facilitate this.
The theory behind this is very straightforward as well: identify Windows APIs exported from DLLs that accept callbacks. Callback functions are easy to spot because they have a distinct function definition. One can script their way to identifying them and build custom callback-based execution vectors.
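The weakness of the hardcoded-list detection mentioned above is easy to show from the defender's side. The sketch below is purely illustrative: the watchlist contents are my own small sample of well-known callback-taking Windows APIs, `SomeLessCommonEnumApi` is a hypothetical placeholder rather than a real export, and no real product's list or logic is implied.

```python
# Illustrative sample only; a real watchlist would be far longer.
WATCHLIST = frozenset({
    "EnumWindows",
    "EnumChildWindows",
    "EnumSystemLocalesA",
    "CreateTimerQueueTimer",
})


def flagged(api_name: str, watchlist=WATCHLIST) -> bool:
    """Name-based check: flag a call only if the API is on the list."""
    return api_name in watchlist


def coverage(observed_apis, watchlist=WATCHLIST):
    """Split observed API names into (flagged, missed) lists."""
    hits = [name for name in observed_apis if flagged(name, watchlist)]
    misses = [name for name in observed_apis if not flagged(name, watchlist)]
    return hits, misses


if __name__ == "__main__":
    # "SomeLessCommonEnumApi" is a hypothetical name standing in for any
    # callback-taking API that the list does not know about.
    print(coverage(["EnumWindows", "SomeLessCommonEnumApi"]))
```

Anything not named on the list passes silently, which is why such a detection is only as good as its upkeep.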