shader-clippy blog
One post per rule. Each post explains the GPU mechanism behind a lint warning in enough depth that you could derive the rule yourself.
The target reader is a graphics engineer who ships shaders to production and has not profiled them in a while. Posts assume familiarity with HLSL and shader stages.
v0.5.0 launch series — category overviews
The v0.5.0 launch shipped eight category overviews plus a preface essay. Each overview walks through one rule pack at the GPU-mechanism level.
| Date | Category | Title |
|---|---|---|
| 2026-05-01 | preface | Why your HLSL is slower than it has to be |
| 2026-05-01 | math | Where the cycles go: math intrinsics on modern GPUs |
| 2026-05-01 | workgroup | Your groupshared array is bank-conflicting on RDNA |
| 2026-05-01 | control-flow | Divergent control flow is the silent killer of your shader |
| 2026-05-01 | bindings | Where root signatures and descriptor heaps quietly cost you |
| 2026-05-01 | texture | Texture sampling is doing more work than your shader admits |
| 2026-05-01 | mesh + dxr | Mesh shaders + DXR |
| 2026-05-01 | wave + helper-lane | Wave intrinsics and helper-lane traps |
| 2026-05-01 | sm 6.9 | SM 6.9: shader execution reordering, cooperative vectors, and the new ray-tracing primitives |
Per-rule deep dives
Per-rule posts (one per shipped rule) are grouped by category below. Each entry links to a stub or a full post. Stubs are placeholder scaffolds, queued to be filled in during v1.0.x patch releases per ADR 0018 §5 criterion #6.
bindings (26 rules)
- all-resources-bound-not-set (stub)
- bool-straddles-16b (stub)
- buffer-load-width-vs-cache-line (stub)
- byteaddressbuffer-load-misaligned (stub)
- byteaddressbuffer-narrow-when-typed-fits (stub)
- cbuffer-divergent-index (stub)
- cbuffer-fits-rootconstants (stub)
- cbuffer-large-fits-rootcbv-not-table (stub)
- cbuffer-padding-hole (stub)
- dead-store-sv-target (stub)
- descriptor-heap-no-non-uniform-marker (stub)
- descriptor-heap-type-confusion (stub)
- divergent-buffer-index-on-uniform-resource (stub)
- excess-interpolators (stub)
- missing-precise-on-pcf (stub)
- nointerpolation-mismatch (stub)
- non-uniform-resource-index (stub)
- oversized-cbuffer (stub)
- rov-without-earlydepthstencil (stub)
- rwbuffer-store-without-globallycoherent (stub)
- rwresource-read-only-usage (stub)
- static-sampler-when-dynamic-used (stub)
- structured-buffer-stride-mismatch (stub)
- structured-buffer-stride-not-cache-aligned (stub)
- uav-srv-implicit-transition-assumed (stub)
- unused-cbuffer-field (stub)
blackwell (1 rule)
control-flow (21 rules)
- barrier-in-divergent-cf (stub)
- branch-on-uniform-missing-attribute (stub)
- cbuffer-load-in-loop (stub)
- clip-from-non-uniform-cf (stub)
- derivative-in-divergent-cf (stub)
- discard-then-work (stub)
- early-z-disabled-by-conditional-discard (stub)
- flatten-on-uniform-branch (stub)
- forcecase-missing-on-ps-switch (stub)
- groupshared-uninitialized-read (stub)
- loop-attribute-conflict (stub)
- loop-invariant-sample (stub)
- manual-wave-reduction-pattern (stub)
- quadany-quadall-opportunity (stub)
- redundant-computation-in-branch (stub)
- sample-in-loop-implicit-grad (stub)
- small-loop-no-unroll (stub)
- wave-active-all-equal-precheck (stub)
- wave-intrinsic-helper-lane-hazard (stub)
- wave-intrinsic-non-uniform (stub)
- wavereadlaneat-constant-non-zero-portability (stub)
cooperative-vector (6 rules)
dxr (10 rules)
- anyhit-heavy-work (stub)
- inline-rayquery-when-pipeline-better (stub)
- missing-accept-first-hit (stub)
- missing-ray-flag-cull-non-opaque (stub)
- oversized-ray-payload (stub)
- pipeline-when-inline-better (stub)
- ray-flag-force-opaque-with-anyhit (stub)
- recursion-depth-not-declared (stub)
- tracerray-conditional (stub)
- triangle-object-positions-without-allow-data-access-flag (stub)
linalg (2 rules)
long-vectors (4 rules)
math (31 rules)
- acos-without-saturate (stub)
- countbits-vs-manual-popcount (stub)
- cross-with-up-vector (stub)
- div-without-epsilon (stub)
- dot4add-opportunity (stub)
- dot-on-axis-aligned-vector (stub)
- firstbit-vs-log2-trick (stub)
- inv-sqrt-to-rsqrt (stub)
- isnormal-pre-sm69 (stub)
- isspecialfloat-implicit-fp16-promotion (stub)
- length-comparison (stub)
- length-then-divide (stub)
- lerp-extremes (stub)
- lerp-on-bool-cond (stub)
- manual-distance (stub)
- manual-mad-decomposition (stub)
- manual-reflect (stub)
- manual-refract (stub)
- manual-smoothstep (stub)
- manual-step (stub)
- mul-identity (stub)
- pow-base-two-to-exp2 (stub)
- pow-const-squared
- pow-integer-decomposition (stub)
- pow-to-mul (stub)
- precise-missing-on-iterative-refine (stub)
- redundant-unorm-snorm-conversion (stub)
- select-vs-lerp-of-constant (stub)
- sin-cos-pair (stub)
- sqrt-of-potentially-negative (stub)
- wavereadlaneat-constant-zero-to-readfirst (stub)
memory (5 rules)
- live-state-across-traceray (stub)
- redundant-texture-sample (stub)
- sample-use-no-interleave (stub)
- scratch-from-dynamic-indexing (stub)
- vgpr-pressure-warning (stub)
mesh (9 rules)
- as-payload-over-16k (stub)
- dispatchmesh-grid-too-small-for-wave (stub)
- dispatchmesh-not-called (stub)
- meshlet-vertex-count-bad (stub)
- mesh-numthreads-over-128 (stub)
- mesh-output-decl-exceeds-256 (stub)
- output-count-overrun (stub)
- primcount-overrun-in-conditional-cf (stub)
- setmeshoutputcounts-in-divergent-cf (stub)
misc (3 rules)
- compare-equal-float (stub)
- comparison-with-nan-literal (stub)
- redundant-precision-cast (stub)
opacity-micromaps (3 rules)
packed-math (6 rules)
- manual-f32tof16 (stub)
- min16float-in-cbuffer-roundtrip (stub)
- min16float-opportunity (stub)
- pack-clamp-on-prove-bounded (stub)
- pack-then-unpack-roundtrip (stub)
- unpack-then-repack (stub)
rdna4 (3 rules)
sampler-feedback (3 rules)
saturate-redundancy (5 rules)
- clamp01-to-saturate (stub)
- redundant-abs (stub)
- redundant-normalize (stub)
- redundant-saturate (stub)
- redundant-transpose (stub)
ser (12 rules)
- coherence-hint-encodes-shader-type (stub)
- coherence-hint-redundant-bits (stub)
- fromrayquery-invoke-without-shader-table (stub)
- hitobject-construct-outside-allowed-stages (stub)
- hitobject-invoke-after-recursion-cap (stub)
- hitobject-passed-to-non-inlined-fn (stub)
- hitobject-stored-in-memory (stub)
- maybereorderthread-outside-raygen (stub)
- maybereorderthread-without-payload-shrink (stub)
- reordercoherent-uav-missing-barrier (stub)
- ser-coherence-hint-bits-overflow (stub)
- ser-trace-then-invoke-without-reorder (stub)
sm6_10 (4 rules)
texture (13 rules)
- anisotropy-without-anisotropic-filter (stub)
- bgra-rgba-swizzle-mismatch (stub)
- comparison-sampler-without-comparison-op (stub)
- gather-channel-narrowing (stub)
- gather-cmp-vs-manual-pcf (stub)
- manual-srgb-conversion (stub)
- mip-clamp-zero-on-mipped-texture (stub)
- samplecmp-vs-manual-compare (stub)
- samplegrad-with-constant-grads (stub)
- samplelevel-with-zero-on-mipped-tex (stub)
- texture-array-known-slice-uniform (stub)
- texture-as-buffer (stub)
- texture-lod-bias-without-grad (stub)
vrs (4 rules)
wave-helper-lane (7 rules)
- quadany-quadall-non-quad-stage (stub)
- quadany-replaceable-with-derivative-uniform-branch (stub)
- startvertexlocation-not-vs-input (stub)
- waveops-include-helper-lanes-on-non-pixel (stub)
- wave-reduction-pixel-without-helper-attribute (stub)
- wavesize-fixed-on-sm68-target (stub)
- wavesize-range-disordered (stub)
work-graphs (6 rules)
workgroup (20 rules)
- compute-dispatch-grid-shape-vs-quad (stub)
- groupshared-16bit-unpacked (stub)
- groupshared-atomic-replaceable-by-wave (stub)
- groupshared-dead-store (stub)
- groupshared-first-read-without-barrier (stub)
- groupshared-overwrite-before-barrier (stub)
- groupshared-stride-32-bank-conflict (stub)
- groupshared-stride-non-32-bank-conflict (stub)
- groupshared-too-large (stub)
- groupshared-union-aliased (stub)
- groupshared-volatile (stub)
- groupshared-when-registers-suffice (stub)
- groupshared-write-then-no-barrier-read (stub)
- interlocked-bin-without-wave-prereduce (stub)
- interlocked-float-bit-cast-trick (stub)
- numthreads-not-wave-aligned (stub)
- numthreads-too-small (stub)
- numwaves-anchored-cap (stub)
- wave-prefix-sum-vs-scan-with-atomics (stub)
- wavesize-attribute-missing (stub)
xe2 (1 rule)
Conventions for contributors
Every shipped rule has a corresponding post at docs/blog/<rule-id>.md. Stubs are scaffolds, not authoritative content; the rule page at docs/rules/<rule-id>.md is canonical. To promote a stub to a full post, replace its referral sections with prose written graphics engineer to graphics engineer. Target length: 800-1500 words. License: CC-BY-4.0.
See pow-const-squared for a reference full-length post.
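The one-post-per-rule convention can be checked mechanically. The following is a minimal sketch, not part of shader-clippy itself: the function name `missing_blog_posts` is hypothetical, and it assumes that every `.md` file directly under docs/rules/ is a rule page and that every `.md` file directly under docs/blog/ is a post named after its rule id. If either directory also holds other files, such as an index page, they would need to be filtered out first.

```python
from pathlib import Path


def missing_blog_posts(rules_dir, blog_dir):
    """Return the sorted rule ids that have a rule page but no blog post.

    Assumption: a rule id is the file name without the .md extension,
    in both directories.
    """
    rule_ids = {page.stem for page in Path(rules_dir).glob("*.md")}
    post_ids = {post.stem for post in Path(blog_dir).glob("*.md")}
    # Any id present in rules but absent from blog violates the convention.
    return sorted(rule_ids - post_ids)
```

Run from the repository root, `missing_blog_posts("docs/rules", "docs/blog")` should return an empty list when the convention holds. Stubs count as posts here, since the check only looks at whether the file exists.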