Add TOML function hook definitions

Implements the [[patches.func]] feature that was mentioned as planned
in the README. This allows defining recompiler hooks in the config file
that call specific functions at designated points.

Added FunctionHookDefinition struct supporting two modes:
- before_call: Hook at function entry
- before_vram: Hook at specific instruction address

Hook functions receive rdram and recomp_context parameters for full
access to registers and memory. The implementation reuses the existing
function_hooks map so it integrates cleanly with the current pipeline.

Includes validation for nonexistent functions, invalid addresses, and
misaligned vram values with helpful error messages.
ApfelTeeSaft 2025-10-07 22:48:08 +02:00
parent 6888c1f1df
commit a6406d563d
4 changed files with 167 additions and 3 deletions


@ -8,6 +8,7 @@ This is not the first project that uses static recompilation on game console bin
* [Overlays](#overlays)
* [How to Use](#how-to-use)
* [Single File Output Mode](#single-file-output-mode-for-patches)
* [Function Hooks](#function-hooks)
* [RSP Microcode Support](#rsp-microcode-support)
* [Planned Features](#planned-features)
* [Building](#building)
@ -29,7 +30,7 @@ For relocatable overlays, the tool will modify supported instructions possessing
Support for relocations for TLB mapping is coming in the future, which will add the ability to provide a list of MIPS32 relocations so that the runtime can relocate them on load. Combining this with the functionality used for relocatable overlays should allow running most TLB mapped code without incurring a performance penalty on every RAM access.
## How to Use
The recompiler is configured by providing a toml file in order to configure the recompiler behavior, which is the first argument provided to the recompiler. The toml is where you specify input and output file paths, as well as optionally stub out specific functions, skip recompilation of specific functions, and patch single instructions in the target binary. There is also planned functionality to be able to emit hooks in the recompiler output by adding them to the toml (the `[[patches.func]]` and `[[patches.hook]]` sections of the linked toml below), but this is currently unimplemented. Documentation on every option that the recompiler provides is not currently available, but an example toml can be found in the Zelda 64: Recompiled project [here](https://github.com/Mr-Wiseguy/Zelda64Recomp/blob/dev/us.rev1.toml).
The recompiler is configured by providing a toml file in order to configure the recompiler behavior, which is the first argument provided to the recompiler. The toml is where you specify input and output file paths, as well as optionally stub out specific functions, skip recompilation of specific functions, and patch single instructions in the target binary. Documentation on every option that the recompiler provides is not currently available, but an example toml can be found in the Zelda 64: Recompiled project [here](https://github.com/Mr-Wiseguy/Zelda64Recomp/blob/dev/us.rev1.toml).
Currently, the only way to provide the required metadata is by passing an elf file to this tool. The easiest way to get such an elf is to set up a disassembly or decompilation of the target binary, but there will be support for providing the metadata via a custom format to bypass the need to do so in the future.
@ -40,6 +41,48 @@ This mode can be combined with the functionality provided by almost all linkers
This saves a tremendous amount of time while iterating on patches for the target binary, as you can bypass rerunning the recompiler on the target binary as well as compiling the original recompiler output. An example of using this single file output mode for that purpose can be found in the Zelda 64: Recompiled project [here](https://github.com/Mr-Wiseguy/Zelda64Recomp/blob/dev/patches.toml), with the corresponding Makefile that gets used to build the elf for those patches [here](https://github.com/Mr-Wiseguy/Zelda64Recomp/blob/dev/patches/Makefile).
## Function Hooks
The recompiler supports injecting custom function calls at specific points in the recompiled code through the configuration toml. This allows you to add instrumentation, debugging, or custom behavior without modifying the original binary.
### Text-based Hooks
Text-based hooks allow you to insert arbitrary C code at specific points using the `[[patches.hook]]` section:
```toml
[[patches.hook]]
func = "osCreateThread"
before_vram = 0x80001234
text = "printf(\"Creating thread\\n\");"
```
### Function Call Hooks
Function call hooks allow you to call a specific function that you implement in your runtime using the `[[patches.func]]` section. These hooks receive the full recompilation context, giving access to all registers and memory.
Hook at the beginning of a function (before any instructions execute):
```toml
[[patches.func]]
func = "osCreateThread"
hook_func = "my_osCreateThread_hook"
before_call = true
```
Hook at a specific instruction address within a function:
```toml
[[patches.func]]
func = "osStartThread"
hook_func = "my_osStartThread_hook"
before_vram = 0x80001500
```
The hook functions should be implemented in your runtime code with the following signature:
```c
void my_osCreateThread_hook(uint8_t* rdram, recomp_context* ctx) {
    // Access registers via ctx->r1, ctx->r2, etc.
    printf("osCreateThread called with arg: %d\n", (int)ctx->r4);
}
```
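Both hook types can be used together in the same configuration file. Each `[[patches.func]]` entry must set exactly one of `before_call = true` or `before_vram`; setting both or neither is rejected as a config error. The `before_vram` addresses must be 4-byte aligned instruction addresses within the target function.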
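To illustrate where a `[[patches.func]]` hook ends up, here is a minimal C++ sketch. Only the hook signature and the `hook(rdram, ctx);` call form are taken from this change. The `recomp_context` struct below is a cut-down stand-in that models just two registers (the real type comes from the runtime headers), and `recompiled_osCreateThread` is a hypothetical hand-written imitation of recompiler output, not actual generated code.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the runtime's context type (assumption: only two registers are modeled).
struct recomp_context {
    uint64_t r2; // MIPS $v0, the return value register
    uint64_t r4; // MIPS $a0, the first argument register
};

// Records the last $a0 value seen by the hook so its effect can be observed.
static uint64_t last_seen_arg = 0;

// A hook with the signature declared for [[patches.func]] entries.
void my_osCreateThread_hook(uint8_t* rdram, recomp_context* ctx) {
    (void)rdram; // memory is not touched in this sketch
    last_seen_arg = ctx->r4;
}

// Hypothetical imitation of a recompiled function that has a before_call hook.
void recompiled_osCreateThread(uint8_t* rdram, recomp_context* ctx) {
    my_osCreateThread_hook(rdram, ctx); // inserted hook call runs first
    ctx->r2 = 0;                        // placeholder for the original instructions
}
```

Because the hook receives a pointer to the live context, any register it modifies is visible to the instructions that follow it.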
## RSP Microcode Support
RSP microcode can also be recompiled with this tool. Currently there is no support for recompiling RSP overlays, but it may be added in the future if desired. Documentation on how to use this functionality will be coming soon.
@ -56,4 +99,4 @@ This project can be built with CMake 3.20 or above and a C++ compiler that suppo
* [rabbitizer](https://github.com/Decompollaborate/rabbitizer) for instruction decoding/analysis
* [ELFIO](https://github.com/serge1/ELFIO) for elf parsing
* [toml11](https://github.com/ToruNiina/toml11) for toml parsing
* [fmtlib](https://github.com/fmtlib/fmt)
* [fmtlib](https://github.com/fmtlib/fmt)


@ -245,6 +245,67 @@ std::vector<N64Recomp::FunctionTextHook> get_function_hooks(const toml::table* p
    return ret;
}

std::vector<N64Recomp::FunctionHookDefinition> get_function_hook_definitions(const toml::table* patches_data) {
    std::vector<N64Recomp::FunctionHookDefinition> ret;

    // Check if the function hook definitions array exists.
    const toml::node_view func_hook_def_data = (*patches_data)["func"];

    if (func_hook_def_data.is_array()) {
        const toml::array* func_hook_def_array = func_hook_def_data.as_array();
        ret.reserve(func_hook_def_array->size());

        // Copy all the hook definitions into the output vector.
        func_hook_def_array->for_each([&ret](auto&& el) {
            if constexpr (toml::is_table<decltype(el)>) {
                const toml::table& cur_hook_def = *el.as_table();
                std::optional<std::string> func_name = cur_hook_def["func"].value<std::string>();
                std::optional<std::string> hook_func_name = cur_hook_def["hook_func"].value<std::string>();
                std::optional<uint32_t> before_vram = cur_hook_def["before_vram"].value<uint32_t>();

                // Check for "before_call" flag (defaults to false)
                bool before_call = false;
                std::optional<bool> before_call_opt = cur_hook_def["before_call"].value<bool>();
                if (before_call_opt.has_value()) {
                    before_call = before_call_opt.value();
                }

                if (!func_name.has_value() || !hook_func_name.has_value()) {
                    throw toml::parse_error("Function hook definition is missing required value(s)", el.source());
                }

                // Either before_vram or before_call must be specified
                if (!before_vram.has_value() && !before_call) {
                    throw toml::parse_error("Function hook definition must specify either before_vram or before_call", el.source());
                }

                // Can't specify both
                if (before_vram.has_value() && before_call) {
                    throw toml::parse_error("Function hook definition cannot specify both before_vram and before_call", el.source());
                }

                if (before_vram.has_value() && (before_vram.value() & 0b11) != 0) {
                    // Not properly aligned, so throw an error (and make it look like a normal toml one).
                    throw toml::parse_error("before_vram is not word-aligned", el.source());
                }

                ret.push_back(N64Recomp::FunctionHookDefinition{
                    .func_name = func_name.value(),
                    .hook_func_name = hook_func_name.value(),
                    .before_vram = before_vram.has_value() ? (int32_t)before_vram.value() : 0,
                    .before_call = before_call,
                });
            }
            else {
                throw toml::parse_error("Invalid function hook definition entry", el.source());
            }
        });
    }

    return ret;
}

void get_mdebug_mappings(const toml::array* mdebug_mappings_array,
std::unordered_map<std::string, std::string>& mdebug_text_map,
std::unordered_map<std::string, std::string>& mdebug_data_map,
@ -462,6 +523,9 @@ N64Recomp::Config::Config(const char* path) {
// Function hooks (optional)
function_hooks = get_function_hooks(table);
// Function hook definitions (optional)
function_hook_definitions = get_function_hook_definitions(table);
}
// Use trace mode if enabled (optional)
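
The placement rules enforced by `get_function_hook_definitions` can be sketched without the toml dependency. The helper below is hypothetical and not part of this commit: it returns an error message instead of throwing `toml::parse_error`, and the name `check_hook_placement` is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper: returns an error message for an invalid combination,
// or std::nullopt when the placement is accepted.
std::optional<std::string> check_hook_placement(std::optional<uint32_t> before_vram, bool before_call) {
    // Exactly one of before_vram / before_call has to be given.
    if (!before_vram.has_value() && !before_call) {
        return std::string("must specify either before_vram or before_call");
    }
    if (before_vram.has_value() && before_call) {
        return std::string("cannot specify both before_vram and before_call");
    }
    // MIPS instructions are 4 bytes wide, so the two low address bits must be clear.
    if (before_vram.has_value() && (before_vram.value() & 0b11) != 0) {
        return std::string("before_vram is not word-aligned");
    }
    return std::nullopt;
}
```

Keeping the checks in this order means a misaligned address is only reported once the entry is otherwise well-formed, which matches the order of the checks in the parser.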


@ -19,6 +19,13 @@ namespace N64Recomp {
        std::string text;
    };

    struct FunctionHookDefinition {
        std::string func_name;
        std::string hook_func_name;
        int32_t before_vram; // Ignored (stored as 0) when before_call is true
        bool before_call; // If true, hook at function entry (before the first instruction); if false, hook at before_vram
    };

    struct FunctionSize {
        std::string func_name;
        uint32_t size_bytes;
@ -60,6 +67,7 @@ namespace N64Recomp {
std::vector<std::string> renamed_funcs;
std::vector<InstructionPatch> instruction_patches;
std::vector<FunctionTextHook> function_hooks;
std::vector<FunctionHookDefinition> function_hook_definitions;
std::vector<FunctionSize> manual_func_sizes;
std::vector<ManualFunction> manual_functions;
std::string bss_section_suffix;
@ -77,4 +85,4 @@ namespace N64Recomp {
};
}
#endif
#endif


@ -581,6 +581,55 @@ int main(int argc, char** argv) {
        func.function_hooks[instruction_index] = patch.text;
    }

    // Apply function hook definitions from config
    for (const N64Recomp::FunctionHookDefinition& hook_def : config.function_hook_definitions) {
        // Check if the specified function exists.
        auto func_find = context.functions_by_name.find(hook_def.func_name);
        if (func_find == context.functions_by_name.end()) {
            // Function doesn't exist, present an error to the user instead of silently failing.
            exit_failure(fmt::format("Function {} has a hook definition but does not exist!", hook_def.func_name));
        }

        N64Recomp::Function& func = context.functions[func_find->second];
        int32_t func_vram = func.vram;
        int32_t hook_vram = 0;

        if (hook_def.before_call) {
            // Hook at the beginning of the function (before first instruction)
            hook_vram = -1;
        } else {
            // Hook at specific vram address
            hook_vram = hook_def.before_vram;

            // Check that the function actually contains this vram address.
            if (hook_vram < func_vram || hook_vram >= func_vram + func.words.size() * sizeof(func.words[0])) {
                exit_failure(fmt::format("Function {} has a hook definition for vram 0x{:08X} but doesn't contain that vram address!",
                    hook_def.func_name, (uint32_t)hook_vram));
            }
        }

        // Calculate the instruction index.
        size_t instruction_index = -1;
        if (hook_vram != -1) {
            instruction_index = (static_cast<size_t>(hook_vram) - func_vram) / sizeof(uint32_t);
        }

        // Check if a function hook already exists for that instruction index.
        auto hook_find = func.function_hooks.find(instruction_index);
        if (hook_find != func.function_hooks.end()) {
            exit_failure(fmt::format("Function {} already has a function hook for vram 0x{:08X}!",
                hook_def.func_name, hook_vram == -1 ? func_vram : (uint32_t)hook_vram));
        }

        // Generate the hook call text
        std::string hook_text = fmt::format("{}(rdram, ctx);", hook_def.hook_func_name);
        func.function_hooks[instruction_index] = hook_text;

        // Add the hook function to the header file declarations
        fmt::print(func_header_file, "void {}(uint8_t* rdram, recomp_context* ctx);\n", hook_def.hook_func_name);
    }

    std::ofstream current_output_file;
    size_t output_file_count = 0;
    size_t cur_file_function_count = 0;
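
The bounds check and index calculation in the loop above can be isolated into a small function. The sketch below is not the code from this commit: it deliberately swaps in unsigned 32-bit arithmetic and `std::optional` in place of the `int32_t` values and the `-1` sentinel used for `before_call`, and the name `hook_instruction_index` is hypothetical. It assumes the address has already passed the word-alignment check done in the config parser.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical helper: maps a hook address to an instruction index within a function,
// or returns std::nullopt if the address lies outside the function.
std::optional<size_t> hook_instruction_index(uint32_t func_vram, size_t word_count, uint32_t hook_vram) {
    // Addresses below the function start are out of range.
    if (hook_vram < func_vram) {
        return std::nullopt;
    }
    // Byte offset from the first instruction; cannot underflow after the check above.
    uint32_t offset = hook_vram - func_vram;
    // Each instruction word is 4 bytes, so the function spans word_count * 4 bytes.
    if (offset >= word_count * sizeof(uint32_t)) {
        return std::nullopt;
    }
    return offset / sizeof(uint32_t);
}
```

For a 16-word function at `0x80001000`, the last valid hook address is `0x8000103C` (index 15), and `0x80001040` is rejected because it is one past the end.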