decompressed: detect data-only fragments by absence of jr $ra

Stadium's dynamic-asset slot at vram 0x8FF00000 contains a mix of
fragment shapes:

  Code fragments: a real MIPS function at +0x20 ending in jr $ra
                  (and possibly more functions). Stadium dispatches
                  the +0x00 J trampoline to invoke them.

  Data fragments: pure data starting at +0x20 (tables of (tag,
                  pointer) records, animation curves, etc.). The
                  +0x00 J trampoline is a dormant placeholder that
                  Stadium NEVER actually calls. Stadium reads the
                  data directly via R_MIPS_32 pointers from elsewhere.
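
For readers unfamiliar with the +0x00 trampoline, here is a minimal sketch of how a MIPS J word can be recognized and its target computed. The helper name decode_j_target is hypothetical and is not part of this commit. The example word in the usage note assumes a trampoline that jumps to the fragment's own +0x20, which is not confirmed here for any particular fragment.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not from the commit): decode a MIPS J instruction.
// J-type layout: opcode in bits 31..26 (J = 0x02), instr_index in bits 25..0.
// The target keeps the top 4 bits of the delay-slot address (insn_vram + 4)
// and takes its low 28 bits from instr_index << 2.
std::optional<uint32_t> decode_j_target(uint32_t word, uint32_t insn_vram) {
    if ((word >> 26) != 0x02u) {
        return std::nullopt;  // not a J instruction
    }
    const uint32_t region = (insn_vram + 4) & 0xF0000000u;
    return region | ((word & 0x03FFFFFFu) << 2);
}
```

Under that assumption, a J located at 0x8FF00000 that targets 0x8FF00020 is the word 0x0BFC0008, and decode_j_target(0x0BFC0008, 0x8FF00000) yields 0x8FF00020. Both fragment shapes carry such a J, so its presence alone says nothing about code versus data.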

The previous code path attempted to recompile a function at +0x20
in EVERY synthesized section, which (a) was incorrect for data
fragments, and (b) reliably produced invalid C from data words
decoded as instructions.

Detection heuristic: scan the first 0x100 instructions (0x400 bytes)
of the body, starting at +0x20 and capped at reloc_offset, for any
jr $ra (encoded as 0x03E00008). If absent, the fragment is data-only:
register the section + R_MIPS_32 relocs but emit NO FuncEntry rows.
If Stadium ever does dispatch the +0x00 J for one of these (which
shouldn't happen), the runtime LOOKUP_FUNC reports the miss loudly.
That is the correct surface, NOT a stub.
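
The heuristic can be sketched as a standalone function, which makes it easy to test in isolation. This is an illustrative restatement, not the code in the diff: body_has_jr_ra, load_be_u32, and kJrRa are hypothetical names, and the body_end parameter stands in for the reloc_offset bound the commit uses.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// jr $ra: SPECIAL opcode (0) in bits 31..26, rs = 31 ($ra) in bits 25..21,
// funct = 0x08 in bits 5..0, i.e. (31 << 21) | 0x08 = 0x03E00008.
constexpr uint32_t kJrRa = 0x03E00008u;

// Read a big-endian 32-bit word (MIPS code in this context is big-endian).
inline uint32_t load_be_u32(const uint8_t* p) {
    return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
           (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}

// Returns true if any of the first max_insns words of the body is jr $ra.
// The scan never reads past body_end or past the end of the blob.
bool body_has_jr_ra(const std::vector<uint8_t>& blob,
                    size_t body_offset,
                    size_t body_end,
                    size_t max_insns = 0x100) {
    const size_t scan_end =
        std::min({blob.size(), body_end, body_offset + max_insns * 4});
    for (size_t off = body_offset; off + 4 <= scan_end; off += 4) {
        if (load_be_u32(blob.data() + off) == kJrRa) {
            return true;
        }
    }
    return false;
}
```

Two limits follow from the shape of the check. A data word that happens to equal 0x03E00008 would classify a data fragment as code, and a code fragment whose first jr $ra lies beyond the scan window would be classified as data.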

Tested on Stadium's 0x8FF00000 slot via [[input.decompressed_section_pattern]]:
  - 282 wrappers attempted
  - 62 classified as data-only (registered without impl function)
  - 220 attempted as code; first failure surfaces an analyze_function
    jump-table sizing gap (separate issue, distinct from data-only
    classification)

Static [[input.decompressed_section]] for fragment78 is unaffected
(still recompiles cleanly; boot logo + PIKA jingle still play).
The pattern stays inactive in Stadium's game.toml until the
analyze_function jtbl gap is addressed; build correctly refuses to
proceed if activated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Matthew Stanley 2026-04-28 21:02:38 -07:00
parent bd5f42fb22
commit 4f5fb0b64b

@@ -546,6 +546,54 @@ size_t add_decompressed_section(Context& context,
context.functions_by_name[name] = fi;
};
// Stadium has two FRAGMENT shapes that share the same +0x00..0x20
// header (J trampoline + magic + sizes):
//
// Code fragment: +0x20 is a real MIPS function ending in jr $ra
// (and possibly more functions interspersed with
// data). Stadium calls the J at +0x00 to dispatch
// into the function.
//
// Data fragment: +0x20 onwards is pure data (tables of
// (tag, pointer) records, etc.). The J at +0x00
// is a dormant placeholder that Stadium NEVER
// actually calls — Stadium reads the data
// directly. No MIPS function exists.
//
// We distinguish by scanning the first 0x100 instructions of the
// body for ANY jr $ra (0x03E00008). If absent, the fragment is
// data-only: we register the section + R_MIPS_32 relocs but emit
// NO FuncEntry rows. Stadium's dispatch never goes through a
// func_map entry for these. If something ever does call the
// entry-trampoline J, the runtime LOOKUP_FUNC reports the miss
// loudly, which is the correct surface — NOT a stub.
constexpr uint32_t IMPL_OFFSET = 0x20;
bool has_jr_ra = false;
{
    const size_t scan_end = std::min<size_t>(
        reloc_offset, IMPL_OFFSET + 0x400);  // first 256 insns
    for (size_t off = IMPL_OFFSET; off + 4 <= scan_end; off += 4) {
        if (read_be_u32(blob.data() + off) == 0x03E00008u) {
            has_jr_ra = true;
            break;
        }
    }
}
if (!has_jr_ra) {
// Data-only fragment: section + relocs only, no functions.
std::fprintf(stderr,
"decompressed: section %s — data-only fragment (no jr $ra "
"in first 0x400 bytes); registered as section + relocs "
"with no FuncEntry rows. Stadium never dispatches the +0x00 "
"J trampoline for these (would surface as a runtime "
"lookup miss if it did, which is the correct diagnostic).\n",
section_name.c_str());
return section_index;
}
// Code fragment path: synthesize entry trampoline + impl function.
// (1) Entry trampoline at vram+0 (8 bytes).
std::vector<uint32_t> entry_words(2);
std::memcpy(entry_words.data(), blob.data() + 0x00, 8);
@@ -560,7 +608,6 @@ size_t add_decompressed_section(Context& context,
// existing register-state simulator). On failure it reports a
// specific offset and reason; we propagate that as a build error
// — no graceful skip, no stub.
constexpr uint32_t IMPL_OFFSET = 0x20;
if (reloc_offset <= IMPL_OFFSET + 4) {
std::fprintf(stderr,
"decompressed: section %s — body too small to contain a "