linalg-matrix-element-type-mismatch

Status: shipped (Phase 8) — see CHANGELOG.

What it detects

A linalg::*Mul chain whose matrix element type (e.g. COMPONENT_TYPE_FLOAT16, COMPONENT_TYPE_FLOAT_E4M3) is paired with a higher-precision accumulator (COMPONENT_TYPE_FLOAT32 / _FLOAT64) without an explicit conversion. The rule activates only on SM 6.10+ targets.
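The condition above can be sketched as a small predicate. This is an illustration only, not the rule's actual implementation: the `ComponentType` enum, its values (bit widths chosen for readability), the `is_mismatch` helper, and the `has_explicit_conversion` flag are all hypothetical names, and the sketch assumes that any matrix element type narrower than FLOAT32 counts as low precision.

```python
from enum import Enum


class ComponentType(Enum):
    # Hypothetical stand-ins for the COMPONENT_TYPE_* constants.
    # The values are bit widths, not the real enum values.
    FLOAT_E4M3 = 8
    FLOAT16 = 16
    FLOAT32 = 32
    FLOAT64 = 64


# Accumulator types the rule description calls high precision.
HIGH_PRECISION = {ComponentType.FLOAT32, ComponentType.FLOAT64}


def is_mismatch(matrix_type, accumulator_type,
                shader_model=(6, 10), has_explicit_conversion=False):
    """Return True when the sketch would flag a matrix/accumulator pair."""
    # The rule is described as active only on SM 6.10+ targets.
    if shader_model < (6, 10):
        return False
    # An explicit conversion states the intent, so nothing is flagged.
    if has_explicit_conversion:
        return False
    # Assumption: flag a low-precision matrix with a high-precision accumulator.
    return (accumulator_type in HIGH_PRECISION
            and matrix_type not in HIGH_PRECISION)
```

Under these assumptions, `is_mismatch(ComponentType.FLOAT16, ComponentType.FLOAT32)` is flagged, while a FLOAT16 / FLOAT16 pair, or the same mixed pair on an SM 6.9 target, is not.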

Why it matters on a GPU

When the element and accumulator types differ, the matrix-engine fetcher silently widens each matrix element to the accumulator's precision. That per-element conversion costs throughput on every IHV's matrix engine (Blackwell 5th-gen Tensor Cores, RDNA 4 AI accelerator, Xe2 XMX, Hopper Tensor Cores). A type mix that looks free in the source is paid for at the fetcher.

Examples

Bad

hlsl

linalg::MatrixVectorMul(COMPONENT_TYPE_FLOAT16, COMPONENT_TYPE_FLOAT32, ...);

Good

hlsl

linalg::MatrixVectorMul(COMPONENT_TYPE_FLOAT16, COMPONENT_TYPE_FLOAT16, ...);

Options

none

Fix availability

suggestion — The intended type chain is application-specific.
