wave64-on-rdna4-compute-misses-dynamic-vgpr

Status: stub. The full-length analysis is queued for a v1.0.x patch release per ADR 0018, section 5, criterion #6. The companion rule page at docs/rules/wave64-on-rdna4-compute-misses-dynamic-vgpr.md contains the canonical detection logic and the GPU reasoning.

TL;DR

Per the RDNA 4 deep-dives (Hot Chips 2025; Chips and Cheese RDNA 4), the new dynamic-VGPR allocation mode is wave32-only: the per-wave s_alloc_vgpr instruction works only at the wave32 lane width. As a result, wave64 compute on RDNA 4 silently misses the per-block occupancy gain that dynamic-VGPR mode provides over the static allocation used on RDNA 3.

What the rule fires on

A compute entry point declared with [WaveSize(64)] (or the equivalent range form [WaveSize(64, 64)]) while the [experimental.target = rdna4] config gate is active.
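As a rough illustration of that pattern, here is a minimal Python sketch. It is not the linter's implementation: the function name `fires`, the `target` parameter (standing in for the experimental.target config value), and the regex-on-raw-text approach are all assumptions made for this example. It also does not verify that the attribute sits on a compute entry point.

```python
import re

# Matches [WaveSize(64)] and [WaveSize(64, 64)], tolerating whitespace.
# Illustrative only: a real rule would more likely inspect parsed
# entry-point attributes than raw source text.
_WAVE64_ATTR = re.compile(r"\[\s*WaveSize\s*\(\s*64\s*(?:,\s*64\s*)?\)\s*\]")


def fires(hlsl_source: str, target: str) -> bool:
    """Return True when this hypothetical check would flag the source.

    `target` is an assumed stand-in for the experimental.target config value.
    """
    if target != "rdna4":
        return False  # the rule is gated on the RDNA 4 target
    return _WAVE64_ATTR.search(hlsl_source) is not None


WAVE64_KERNEL = """
[WaveSize(64)]
[numthreads(64, 1, 1)]
void cs_main(uint3 tid : SV_DispatchThreadID) {}
"""

WAVE32_KERNEL = WAVE64_KERNEL.replace("WaveSize(64)", "WaveSize(32)")

print(fires(WAVE64_KERNEL, "rdna4"))  # True: wave64 under the rdna4 gate
print(fires(WAVE32_KERNEL, "rdna4"))  # False: wave32 is not flagged
print(fires(WAVE64_KERNEL, "rdna3"))  # False: the gate is not active
```

A range such as [WaveSize(32, 64)] is deliberately not matched here, since the text above names only the two forms that pin the wave size to 64.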

See the "What it detects" section of the rule page for the full pattern definition.

Why it matters

The full GPU-mechanism analysis lives in the "Why it matters on a GPU" section of the companion rule page.

Examples

The bad / good code snippets are kept canonical on the rule page; see wave64-on-rdna4-compute-misses-dynamic-vgpr.md -> Examples.
