mirror of
https://github.com/N64Recomp/N64Recomp.git
synced 2026-06-12 02:52:36 +00:00
5 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
bd5f42fb22 |
analysis: discover_function_bounds — real CFG walk with jump-table support
Adds a public N64Recomp::discover_function_bounds() in src/analysis.h
that performs a BFS-based control-flow walk of a function's body,
following:
- Conditional branches (target + fall-through)
- Unconditional j/jal targets when intra-body
- jr $ra returns (block ends after delay slot)
- jr-via-jump-table dispatches: the existing register-state
simulator from analyze_function detects the lui+addiu+addu+lw+jr
pattern and records the jtbl base; we then read entries out of
the body bytes and feed targets back into the BFS until
convergence.
Returns the function's byte size (max-reachable + 4 to cover the
delay slot of the last instruction). On failure, populates a specific
error message with the offending offset and reason — caller treats
this as a build error, NOT a graceful skip (per the project's
no-stubs principle).
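The jump-table step in the list above (read entries out of the body bytes, feed the targets back into the BFS) might look roughly like the sketch below. `read_jump_table` is a hypothetical helper, not the function in src/analysis.h. It also assumes the table ends at the first word that is not a word-aligned address inside the body, since the commit does not say how the real code bounds the table.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read one big-endian 32-bit word (N64 data is big-endian).
static uint32_t read_be32(const std::vector<uint8_t>& b, size_t off) {
    return (uint32_t(b[off]) << 24) | (uint32_t(b[off + 1]) << 16) |
           (uint32_t(b[off + 2]) << 8) | uint32_t(b[off + 3]);
}

// Hypothetical helper. `vram` is the address of body[0]; `jtbl_vram` is the
// table base recovered by the register-state simulator.
// ASSUMPTION: the table ends at the first entry that is not a word-aligned
// address inside the body.
std::vector<uint32_t> read_jump_table(const std::vector<uint8_t>& body,
                                      uint32_t vram, uint32_t jtbl_vram) {
    std::vector<uint32_t> targets;  // offsets into body, ready for the BFS queue
    if (jtbl_vram < vram) return targets;
    const uint64_t size = body.size();
    for (uint64_t off = jtbl_vram - vram; off + 4 <= size; off += 4) {
        const uint32_t entry = read_be32(body, size_t(off));
        const bool in_body = entry >= vram && uint64_t(entry - vram) < size;
        if (!in_body || (entry & 3u) != 0) break;
        targets.push_back(entry - vram);
    }
    return targets;
}
```

Each returned offset would be pushed onto the walker's work queue like any other branch target. Convergence follows because already-visited offsets are not walked again.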
Wires into decompressed.cpp's pattern path, replacing the prior
inline BFS that had a TODO for jump-table handling. The pattern
caller now propagates failures via `synthesize_decompressed_patterns`
returning false, which surfaces in main.cpp's exit_failure path.
Concrete behavior change: activating a pattern that includes a
fragment with computed jumps now produces a build error pointing at
the specific section name + offset + the analyzer's failure reason,
instead of silently producing a partial binary. Tested on Stadium's
0x8FF00000 slot — first failing wrapper is at ROM 0x8CC400 with an
indirect jr at offset 0x827C the simulator doesn't pattern-match.
The static [[input.decompressed_section]] path for fragment78 is
unaffected (still recompiles cleanly, no regression on boot logo +
PIKA jingle).
Future work surfaced by this change: the simulator's lui+addiu
+addu+lw+jr pattern doesn't cover every jump-table shape Stadium
uses. Each gap surfaces as a specific build-error offset; resolution
is to extend analyze_instruction to recognize the additional pattern
(or, when it's a true tail-call rather than a jtbl, distinguish
those at the jr site).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
8320bb902b |
recomp: replace pattern-section graceful-skip with real CFG-based bounds discovery
The prior "pattern-synthesized recompile failures are best-effort: log
+ skip" path was a stub by another name: it produced binaries where
some fragment bodies silently didn't exist, and the failure deferred to
a runtime lookup-miss when Stadium tried to dispatch into them. That
violates the project's no-stubs principle.
Three changes here:
1. **Remove the soft-skip in main.cpp's recompile loops.** Recompile
failures revert to fatal `std::exit(EXIT_FAILURE)` regardless of
whether the section is pattern-synthesized. Build-time errors
surface; the user has to make a real choice about how to resolve
them.
2. **Replace the "scan to first jr ra" heuristic in decompressed.cpp
with a real BFS-based control-flow walker.** The walker:
- Starts at impl entry (+0x20).
- Follows conditional branches (target + fall-through).
- Follows j/jal targets when intra-function.
- Treats jr $ra as a return; ends the basic block.
- Returns max-reachable-offset + 4 as the function's true size.
For functions with computed jumps (jr <reg> rather than jr $ra, i.e.
jump-table dispatches), the walker reports a build-time error with
a specific offset and a list of options for the user (declare via
single-block form, or extend the walker to follow jump-table
targets). NOT a skip.
3. **Pattern-caller propagates synthesis failures as build aborts.**
`synthesize_decompressed_patterns` returns false when any section
fails to add, and main.cpp's exit_failure path runs.
Net effect on Stadium today: the static [[input.decompressed_section]]
for fragment78 still recompiles cleanly (boot logo + PIKA jingle
unaffected). Activating the pattern would now fail loudly on the first
fragment with computed jumps, instead of silently shipping a binary
missing those bodies. That's the principle: build errors surface,
runtime stubs don't.
The "extend the walker to follow jump-table targets" work is documented
in the error message and is the next step if/when pattern activation
matters more than fragment78's single case.
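A minimal sketch of the walker described above, for illustration only. `walk_function_bounds` is a hypothetical name, and the decoder covers just the MIPS branch and jump opcodes needed for the walk. The actual code in decompressed.cpp may differ in structure and error reporting.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <queue>
#include <string>
#include <vector>

// Read one big-endian 32-bit instruction word.
static uint32_t read_be32(const std::vector<uint8_t>& b, uint32_t off) {
    return (uint32_t(b[off]) << 24) | (uint32_t(b[off + 1]) << 16) |
           (uint32_t(b[off + 2]) << 8) | uint32_t(b[off + 3]);
}

// Hypothetical helper, not the real N64Recomp API. `body` starts at the
// function entry and `vram` is the address of body[0]. Returns the byte size,
// or nullopt with `error` set when the walk cannot be completed.
std::optional<uint32_t> walk_function_bounds(const std::vector<uint8_t>& body,
                                             uint32_t vram, std::string& error) {
    const uint32_t limit = uint32_t(body.size()) & ~3u;
    std::vector<bool> seen(limit / 4, false);
    std::queue<uint32_t> work;  // offsets of basic-block starts still to visit
    work.push(0);
    uint32_t max_off = 0;
    auto fail = [&](uint32_t off, const char* why) {
        error = std::string(why) + " at offset " + std::to_string(off);
        return std::nullopt;
    };
    while (!work.empty()) {
        uint32_t off = work.front();
        work.pop();
        while (true) {
            if (off >= limit) return fail(off, "walk ran past the available bytes");
            if (seen[off / 4]) break;  // already walked from here
            seen[off / 4] = true;
            max_off = std::max(max_off, off);
            const uint32_t w = read_be32(body, off);
            const uint32_t op = w >> 26, rs = (w >> 21) & 31, rt = (w >> 16) & 31;
            const bool is_jr = (op == 0 && (w & 0x3F) == 8);
            const bool is_jump = (op == 2 || op == 3);  // j / jal
            const bool is_branch = (op >= 4 && op <= 7) || (op >= 20 && op <= 23) ||
                                   (op == 1 && (rt & 0x0C) == 0);
            if (!is_jr && !is_jump && !is_branch) {
                off += 4;  // ordinary instruction: keep walking
                continue;
            }
            // Every control transfer has a delay slot that belongs to the body.
            if (off + 4 >= limit) return fail(off, "delay slot is outside the body");
            max_off = std::max(max_off, off + 4);
            if (is_jr) {
                if (rs != 31) return fail(off, "computed jump (jr <reg>) not followed");
                break;  // jr $ra: block ends after the delay slot
            }
            if (is_jump) {
                const uint32_t target =
                    ((vram + off + 4) & 0xF0000000u) | ((w & 0x03FFFFFFu) << 2);
                if (target >= vram && target - vram < limit) work.push(target - vram);
                if (op == 3) work.push(off + 8);  // jal returns to the fall-through
                break;
            }
            // Conditional branch: the target is relative to the delay slot.
            const int64_t target = int64_t(off) + 4 + int64_t(int16_t(w & 0xFFFF)) * 4;
            if (target < 0 || target >= limit) return fail(off, "branch leaves the body");
            work.push(uint32_t(target));
            const bool always_taken = (op == 4 && rs == rt);  // e.g. `b label`
            if (!always_taken) work.push(off + 8);
            break;
        }
    }
    return max_off + 4;  // covers the last reachable instruction or delay slot
}
```

The sketch fails hard on `jr <reg>` instead of guessing, which mirrors the build-error behavior this commit describes.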
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
5f2ae6e4f7 |
recomp: emit content_hash on pattern-synthesized sections (Shape A, build side)
Adds Section::content_hash, populates it on pattern-synthesized
sections with FNV-1a-64 of the first 0x100 bytes of the decompressed
body, and emits it into recomp_overlays.inl's SectionTableEntry. The
runtime side hashes the same window over the bytes Stadium loads at
fragment_ptr and looks up the matching section by hash.
Build-time and runtime use:
- SAME hash algorithm: FNV-1a-64
- SAME window: 0x100 bytes (95% uniqueness across Stadium's 282
distinct fragment bodies; falls back to first-candidate on the
residual ~5%)
- SAME byte source: pre-relocation decompressed bytes (link-time
form, before Stadium's R_MIPS_32 patches run)
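For reference, FNV-1a-64 over a fixed window can be sketched as follows. The constants are the standard FNV-1a 64-bit offset basis and prime. The function names and the clamping of bodies shorter than 0x100 bytes are assumptions, not taken from the commit.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

constexpr size_t kHashWindow = 0x100;  // window size named in the commit

// Standard FNV-1a, 64-bit variant.
uint64_t fnv1a64(const uint8_t* data, size_t len) {
    uint64_t h = 0xcbf29ce484222325ULL;  // offset basis
    for (size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 0x100000001b3ULL;  // FNV prime
    }
    return h;
}

// Hash of the first kHashWindow bytes of a pre-relocation body.
// ASSUMPTION: shorter bodies are hashed in full.
uint64_t content_hash(const uint8_t* body, size_t body_len) {
    return fnv1a64(body, std::min(body_len, kHashWindow));
}
```

Build side and runtime side only agree if both hash the same bytes, which is why the commit insists on the pre-relocation form.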
Section table emit gains the .content_hash field; non-pattern sections
get hash=0, runtime-side condition `sec.content_hash != 0` filters
them out of the candidate set.
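A sketch of how the runtime-side filter and lookup could fit together. The struct is a hypothetical subset of the section table entry, and `find_section_by_hash` is an illustrative name. The real lookup lives in librecomp's overlays.cpp and may be organized differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical subset of a section table entry.
struct SectionEntrySketch {
    uint32_t vram;
    uint64_t content_hash;  // 0 for non-pattern sections
};

// Returns the index of the first candidate whose hash matches, or -1.
int find_section_by_hash(const std::vector<SectionEntrySketch>& table,
                         uint32_t vram, uint64_t loaded_hash) {
    for (size_t i = 0; i < table.size(); ++i) {
        const SectionEntrySketch& sec = table[i];
        if (sec.content_hash == 0) continue;  // filtered out of the candidate set
        if (sec.vram != vram) continue;       // different slot
        if (sec.content_hash == loaded_hash) return int(i);
    }
    return -1;
}
```

Returning the first match is one reading of the "first-candidate" fallback for bodies whose 0x100-byte windows hash identically.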
Pairs with the runtime-side change in
lib/N64ModernRuntime/librecomp/src/overlays.cpp.
Activation in PokemonStadiumRecomp's game.toml is gated on a
follow-up: pattern-synthesized impl bodies currently get a basic
forward-CFG-walked size which produces invalid C for fragments with
internal jump tables (data interpreted as code). Future fix: emit
pattern-section impl bodies as runtime-dispatched stubs instead of
trying to statically recompile each body. Until then, fragment78
stays declared as a single static [[input.decompressed_section]];
the engine's pattern infrastructure is in place, ready to be flipped
on once the impl-body emit is reshaped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
5b42a76748 |
recomp: pattern auto-discovery for dynamic-asset slot fragments (Shape A)
Adds [[input.decompressed_section_pattern]] for slots where many
fragments share a link vram (e.g. Stadium streams 279+ different
fragments through vram 0x8FF00000 across the game). Per-fragment
[[input.decompressed_section]] entries don't scale to that cardinality
and miss the runtime-swap dispatch problem entirely.
Engine pipeline:
1. Scan baserom.z64 for every Yay0 wrapper.
2. For each, decompress 0x40 bytes and check whether the prefix
matches the expected J <vram + 0x20> trampoline + FRAGMENT magic.
Wrappers in PERS-SZP form are detected by the -0x18 prefix.
3. For matches, fully decompress and FNV-1a-64 hash the body.
4. Deduplicate by content hash (Stadium has ~11 byte-identical
duplicates across its 279 wrappers).
5. Synthesize one Section per unique content. Section names
<base_name>__rom_<wrapper_offset>; functions become
func_<vram>__rom_<offset> via the existing collision-suffix
machinery (default for pattern-discovered sections, since
collisions are the EXPECTED case here).
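Steps 1 and 2 of the pipeline above can be sketched like this. The function names are hypothetical. The scan is byte-granular because the commit does not state an alignment, and the FRAGMENT magic is searched anywhere in the prefix because its exact offset is not given either.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Encode a MIPS `j target` instruction word (opcode 2, 26-bit word index).
constexpr uint32_t encode_j(uint32_t target) {
    return 0x08000000u | ((target >> 2) & 0x03FFFFFFu);
}

// Offsets of every "Yay0" magic in the ROM image.
std::vector<size_t> find_yay0_offsets(const std::vector<uint8_t>& rom) {
    std::vector<size_t> hits;
    for (size_t i = 0; i + 4 <= rom.size(); ++i) {
        if (std::memcmp(rom.data() + i, "Yay0", 4) == 0) hits.push_back(i);
    }
    return hits;
}

// `prefix` holds the first decompressed bytes of one wrapper (0x40 in the
// commit). Checks for the `J <vram + 0x20>` trampoline at offset 0.
// ASSUMPTION: the ASCII magic "FRAGMENT" may sit anywhere in the prefix.
bool looks_like_fragment(const std::vector<uint8_t>& prefix, uint32_t vram) {
    if (prefix.size() < 4) return false;
    const uint32_t first = (uint32_t(prefix[0]) << 24) | (uint32_t(prefix[1]) << 16) |
                           (uint32_t(prefix[2]) << 8) | uint32_t(prefix[3]);
    if (first != encode_j(vram + 0x20)) return false;
    static const char magic[] = "FRAGMENT";
    return std::search(prefix.begin(), prefix.end(), magic, magic + 8) != prefix.end();
}
```

For vram 0x8FF00000 the expected trampoline word is 0x0BFC0008, so a cheap prefix compare rejects most non-fragment wrappers before the full decompress in step 3.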
Implementation function (the +0x20 entry) gets a basic forward CFG
walk to determine its size:
- Walk instructions tracking forward branch targets within the func.
- Stop at jr $ra IF no tracked forward branches still need to be
reached.
- Falls back to first-jr-ra heuristic if walk is inconclusive.
Pattern-synthesized recompile failures are non-fatal: pattern sections
have rom_addr in synthetic 0xFE000000 range, and main.cpp's recompile
loop logs and skips them instead of calling std::exit. Lets the build proceed
even when our basic CFG walk misjudges a function with weird shape
(e.g. computed jumps through jump tables we don't analyze). Stadium's
Path-3 single-fragment case (fragment78 wrapper at ROM 0x9E93F0)
still recompiles cleanly; ~225 of 282 dynamic-slot fragments
recompile, ~57 fail and skip.
Validation on Stadium's 0x8FF00000 slot:
- 293 Yay0 wrappers found (293 vs 279 from prior validate script —
earlier scan undercounted due to a tight 1KB decode window).
- 282 sections after dedupe (11 collapsed as content-identical).
- Build proceeds to completion; no Stadium boot regression
(logo + PIKA jingle still render).
Outstanding for next session — runtime side:
- Modify register_runtime_fragment in librecomp/src/overlays.cpp
to read bytes at fragment_ptr (first 0x40 → fall back to full
body for the residual ~5%), hash, and look up the matching
section. Currently it picks by id alone, so for slot 0x8FF00000
only ONE of the 282 sections gets bound to func_map at any time
(the most-recently registered).
- Refactor cross-section R_MIPS_32 retargeting to use a vram
hashmap (currently O(N²) which gets expensive at 282 sections).
- fragment78's prior single-fragment block can stay; it
works alongside patterns and serves as the "I know exactly which
one I want" form.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
b517a7195a |
recomp: build-time decompression of CPU-decompressed-at-runtime fragments
Adds [[input.decompressed_section]] toml block + Yay0/PERS-SZP wrapper
decoders + an in-memory section synthesis pass. Required for games
like Pokemon Stadium where Stadium's CPU-side decompressor materializes
fragment bytes at runtime and the static recompiler can't see them in
the ELF/ROM-direct path.
User-facing config:
[[input.decompressed_section]]
name = "fragment78"
vram = 0x8FF00000
rom_wrapper = 0x9E93F0
wrapper_format = "pers_szp_yay0"
Pipeline:
1. compression/{yay0,pers_szp}.{h,cpp} decode the wrapper.
2. decompressed.cpp parses the FRAGMENT-format header (relocOffset,
sizeInRam) + Stadium-format reloc table, translates it to
N64Recomp::Reloc entries (R_MIPS_32/26/HI16/LO16) with paired
HI16/LO16 immediate computation, and synthesizes a Section
handed to the existing recompilation pipeline. Stores
decompressed bytes into context.rom at synthetic_rom =
0xFE000000 | rom_wrapper to keep them out of real-ROM addr space.
3. Two functions per fragment: the +0x00 entry trampoline (J + nop)
and the +0x20 implementation (runs to first jr ra in body).
4. After all decompressed sections are added, retargets each
R_MIPS_32 reloc to whichever existing section's vram range
contains its target address (cross-section pointer support).
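Two pieces of arithmetic from the pipeline above, written out as a sketch with hypothetical helper names: the synthetic ROM address from step 2, and the standard MIPS HI16/LO16 pairing, where the low half is sign-extended and the high half carries the compensation.

```cpp
#include <cstdint>

// Synthetic ROM address for a decompressed section, as given in the commit.
constexpr uint32_t synthetic_rom(uint32_t rom_wrapper) {
    return 0xFE000000u | rom_wrapper;
}

// Split an address into the lui immediate and the signed 16-bit low half.
// Adding 0x8000 first compensates for the sign extension of the low half.
constexpr uint16_t hi16(uint32_t addr) { return uint16_t((addr + 0x8000u) >> 16); }
constexpr uint16_t lo16(uint32_t addr) { return uint16_t(addr & 0xFFFFu); }

// Recombine a paired HI16/LO16 immediate into the full 32-bit address.
constexpr uint32_t combine_hi_lo(uint16_t hi, uint16_t lo) {
    return (uint32_t(hi) << 16) + uint32_t(int32_t(int16_t(lo)));
}
```

For example, 0x8FF0FFF0 splits into hi 0x8FF1 and lo 0xFFF0; the low half sign-extends to -0x10, so 0x8FF10000 - 0x10 recovers the address.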
Adds [output] collision_policy:
"error" (default) — abort the build if two emitted symbols collide
on name; print both colliders + how to opt in.
"suffix" — auto-disambiguate by appending __rom_<rom_addr>
to colliding symbols. Suffix only appears where
collisions exist.
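The two policies can be sketched as below. The type and function names are hypothetical, and the `%08X` formatting of the suffix is an assumption, since the commit only gives the `__rom_<rom_addr>` shape.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>
#include <vector>

enum class CollisionPolicy { Error, Suffix };

struct Symbol {
    std::string name;
    uint32_t rom_addr;
};

// Returns false (with `error` set) under Error when two symbols share a name.
// Under Suffix, renames only the colliding symbols.
bool resolve_collisions(std::vector<Symbol>& symbols, CollisionPolicy policy,
                        std::string& error) {
    std::unordered_map<std::string, int> uses;
    for (const Symbol& s : symbols) ++uses[s.name];  // count original names
    for (Symbol& s : symbols) {
        if (uses[s.name] < 2) continue;  // unique names are left untouched
        if (policy == CollisionPolicy::Error) {
            error = "symbol collision on '" + s.name +
                    "'; set collision_policy = \"suffix\" to opt in";
            return false;
        }
        char suffix[32];
        // ASSUMPTION: 8-digit uppercase hex; the real format may differ.
        std::snprintf(suffix, sizeof suffix, "__rom_%08X",
                      static_cast<unsigned>(s.rom_addr));
        s.name += suffix;
    }
    return true;
}
```

Counting names before renaming keeps the suffix off symbols that never collided, matching the "suffix only appears where collisions exist" behavior.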
Validated end-to-end on Stadium's fragment78 (wrapper at ROM 0x9E93F0,
decomp_size=0x25340, 319 relocs). Recompiled func_8FF00020 dispatches
to runtime_addr+0x24DC0 correctly; Stadium boots past the prior
crash point, no regression on the N64 logo + PIKA jingle.
Future work: pattern form ([[input.decompressed_section_pattern]]) for
slots like vram 0x8FF00000 where Stadium streams 279 different
fragments at the same link addr. Validation script
(tools/_validate_dynfrag.py in the consumer repo) confirms 268 distinct
content-hashes, 23MB total payload — feasible as engine work.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|