Off-topic: With this blog post I may be wasting my chance for a good April Fools’ joke, but nevertheless I’m publishing it today. On the other hand, this technique, while real, is so niche that somebody might take it for a joke anyway…
Run dynamic code in C on Windows
Lately I’ve been experimenting with running machine code from within a C program. I found some examples of this technique for Linux systems using mmap and mprotect, but nothing for Windows. So in this short post I want to write about how this works on Windows. Maybe there are already some nice guides on this topic, but I didn’t find them. Writing about this myself helps me document my work, and maybe it helps somebody else.
I really have a problem finding a name for this technique. If you know a common name for it, please write to me on Twitter. Until then I’ll stick with “running dynamic code” :)
What I want to achieve
I want to be able to supply some raw opcodes to a function and have them executed at runtime. This can be useful, for example, in a JIT compiler, where the source code is compiled just before the program (or even a single function) is run (just-in-time). In such a compiler the required opcodes are normally generated on the fly and not saved in an (executable) file but kept in memory.
Another usage example would be a compressed or encrypted executable. There are some niche use cases for this, like malware that wants to disguise its presence, or systems with very little persistent storage but enough RAM to decompress the executable at runtime. A compressed or encrypted executable needs an uncompressed loader that first decompresses/decrypts the main application and then launches it.
At the moment I’m just interested in this technique by itself, without any real use case. But maybe in the future I will look into JIT compiling and executable compression.
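To make “raw opcodes” a bit more concrete, here is a minimal sketch of how such bytes could be generated on the fly. It assumes an x86-64 target, and the name emit_return_constant is made up for this illustration; it is not taken from the code on GitHub. The function only fills a buffer and does not execute anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the generated snippet: 1 opcode byte + 4 immediate bytes + 1 ret. */
#define RETURN_CONSTANT_SIZE 6

/*
 * Hypothetical helper: writes x86-64 machine code for
 *     mov eax, <value>
 *     ret
 * into buffer and returns the number of bytes written.
 * The buffer must hold at least RETURN_CONSTANT_SIZE bytes.
 */
size_t emit_return_constant(unsigned char *buffer, uint32_t value)
{
    assert(buffer != NULL);

    buffer[0] = 0xB8;                                  /* mov eax, imm32 */
    buffer[1] = (unsigned char)(value & 0xFF);         /* immediate, little-endian */
    buffer[2] = (unsigned char)((value >> 8) & 0xFF);
    buffer[3] = (unsigned char)((value >> 16) & 0xFF);
    buffer[4] = (unsigned char)((value >> 24) & 0xFF);
    buffer[5] = 0xC3;                                  /* ret */

    return RETURN_CONSTANT_SIZE;
}
```

Called with the value 42, this produces the six bytes B8 2A 00 00 00 C3, which is exactly the kind of byte sequence this post is about executing.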
Limitations
First a disclaimer: This technique is more of an expert/niche thing and should not be used without a second thought. You may even introduce one (or many) security flaws or hard-to-debug code. In most cases there are better ways to solve a problem than this. For example, if you want an application which allows for user modification like plugins or mods, you could use a dynamic library (.dll/.so/.dylib) which is loaded at runtime, or even an external but production-ready scripting language like Lua.
Another downside of this technique is that it is highly unportable. At the very least, you need to generate the right opcodes for the processor architecture you’re using. For example, the opcodes for an x86-64 processor are completely different from those for an ARM processor. On some architectures you need to take extra steps, like clearing a special cache (typically the instruction cache), for this to work. Some architectures, like AVR, don’t even support this technique directly. AVR uses separate memory for code and data, so you would need to reflash the code memory of the microcontroller at runtime. This is indeed possible, but it would heavily reduce the lifetime of such a microcontroller, as its flash memory isn’t really optimized for this use case and normally has a guaranteed lifetime of about 10,000 write-erase cycles [1].
But even if you know which processor the code runs on, you may need to account for differences between operating systems, especially if you want to call standard library functions directly.
If you want to use this technique in production, I would recommend looking for a library which helps with running code in a more cross-platform manner.
How it works (on Windows)
Let’s get our hands dirty! In this post I will only look at the function which runs the opcodes, but you can find the full source code on GitHub.
#include <stdio.h>
#include <stdbool.h>
#include <string.h>
#include <Windows.h>

/**
 * Helper function to execute dynamic code.
 *
 * @param code a pointer to the code to execute
 * @param length the length of the code in bytes
 * @return true if the code ran successfully, otherwise false
 */
bool run_code(const void* code, size_t length)
{
    // Allocate new memory for code execution.
    // We don't set the execute flag yet because we don't want to
    // open doors to malicious software.
    // dynMemory must be declared as volatile, as some Release
    // optimizations cause problems on VirtualFree. Maybe some
    // sort of reordering.
    volatile LPVOID dynMemory = VirtualAlloc(NULL, length, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (dynMemory == NULL)
    {
        printf("Alloc Error: %lu\n", GetLastError());
        return false;
    }

    // Copy code to new memory.
    memcpy(dynMemory, code, length);

    // Change protection.
    // We remove the write flag and set the execute flag on that page.
    DWORD oldProtection;
    if (!VirtualProtect(dynMemory, length, PAGE_EXECUTE, &oldProtection))
    {
        printf("Protect Error: %lu\n", GetLastError());
        // Release the pages again so a failure here doesn't leak them.
        VirtualFree(dynMemory, 0, MEM_RELEASE);
        return false;
    }

    // Call the new function in memory.
    int result = ((int (*)(void))dynMemory)();
    printf("Result: %d\n", result);

    // At the end we free the memory.
    if (!VirtualFree(dynMemory, 0, MEM_RELEASE))
    {
        printf("Free Error: %lu\n", GetLastError());
        return false;
    }

    return true;
}
This function takes a pointer to some raw bytes and their length. It then uses the WinAPI function VirtualAlloc to allocate some memory pages for our code to live in. VirtualAlloc allocates enough whole pages so that our length-long code fits. These new pages are declared as readable and writable but not executable. This is because a memory page should never be executable AND writable at the same time. This is called W^X (write XOR execute) and you can read more about it here.
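The rounding to whole pages can be illustrated with a small sketch. The helper name round_up_to_page is hypothetical, and the page size is passed in as a parameter; on Windows it can be queried with GetSystemInfo (field dwPageSize), and 4096 bytes is a common value on x86-64. VirtualAlloc does this rounding internally, so this is only meant to show the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: rounds length up to the next multiple of page_size.
 * Illustration only; VirtualAlloc performs this rounding itself.
 */
size_t round_up_to_page(size_t length, size_t page_size)
{
    assert(page_size != 0);

    size_t remainder = length % page_size;
    if (remainder == 0)
    {
        return length; /* already a whole number of pages */
    }
    return length + (page_size - remainder);
}
```

With a 4096-byte page, a 10-byte snippet therefore still occupies a full 4096-byte page, and the protection set later applies to that whole page.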
The result of VirtualAlloc is a pointer to our newly allocated memory, and it is saved in a volatile LPVOID pointer. The volatile keyword is needed because it disables some optimizations that caused the program to crash when compiled with MSVC in release mode. In debug mode everything works as expected even without volatile. To be honest, at the moment I don’t know why it breaks in release; maybe some sort of instruction reordering? But it is just another example of why you should be careful with this technique. If somebody knows why this breaks in release, please write me, I would really like to know.
In the next step we copy our code from the supplied pointer to the newly created, page-aligned memory. After copying the code, we don’t need to modify the memory anymore. Because of this we use VirtualProtect to change the memory permissions to executable but not writable. The old permissions are saved in oldProtection. We aren’t interested in these, but without supplying a valid pointer as the fourth parameter to VirtualProtect, the function will fail according to the documentation.
After this we can jump into our code. We do this by casting the pointer to the code to a function pointer and then calling it directly. The syntax for this is cryptic at best, so let me go through it in more detail:
((int (*)(void))dynMemory)()
Here we cast our pointer to a function pointer which doesn’t expect any parameters and returns an integer. The (void) denotes that this function doesn’t take parameters. The (*) with the parentheses is required for a function pointer cast. The int at the beginning denotes, as expected, the return type. And lastly, the parentheses at the end are there because we want to call this function directly.
On 64-bit Windows the integer return value of a function is expected to be in the EAX register, so your opcodes should put a meaningful value in this register. This is the easiest way to communicate with the calling code from within the dynamic code. If you’re not interested in a return value, you can change the return type of the function in the cast to void. On the other hand, if you want to supply the dynamic code with values, you can define parameters instead of void. The example code on GitHub performs an addition and a subtraction and then returns the value in EAX. The result is printed directly after the call by the printf.
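Passing parameters could look like the following sketch. It assumes the Microsoft x64 calling convention, where the first two integer arguments arrive in ECX and EDX. The names binary_fn, add_code and call_binary are made up for this illustration and are not the example from GitHub. The bytes only make sense on x86-64 Windows and must be copied into executable memory before they are called.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: dynamic code that takes two ints and returns an int. */
typedef int (*binary_fn)(int, int);

/*
 * x86-64 machine code for "return a + b;" under the Microsoft x64
 * calling convention (a in ECX, b in EDX, result in EAX).
 */
const unsigned char add_code[] = {
    0x89, 0xC8, /* mov eax, ecx */
    0x01, 0xD0, /* add eax, edx */
    0xC3        /* ret */
};

/*
 * Hypothetical helper: casts a pointer to executable memory holding such
 * code to the typed function pointer and calls it with two arguments.
 */
int call_binary(void *executable_memory, int a, int b)
{
    assert(executable_memory != NULL);

    binary_fn function = (binary_fn)executable_memory;
    return function(a, b);
}
```

Using a typedef like binary_fn also makes the cast a little easier to read than spelling out the full function pointer type inline.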
Make sure you don’t run code from an untrusted source. It will run with the same permissions as the calling process/thread.
Last but not least we need to free our code memory. This is done with a call to VirtualFree.
Links
Full source code on GitHub: GitHub Gist