
sin-cos-pair

Status: stub. The full-length analysis is queued for a v1.0.x patch release per ADR 0018, section 5, criterion #6. The companion rule page at docs/rules/sin-cos-pair.md contains the canonical detection logic + GPU reasoning.

TL;DR

On AMD RDNA/RDNA 2/RDNA 3, NVIDIA Turing/Ada Lovelace, and Intel Xe-HPG, sin and cos are transcendental instructions that run on the special-function unit (TALU / transcendental ALU) at one-quarter of peak VALU throughput. Separate calls to sin(x) and cos(x) therefore occupy two distinct quarter-rate issue slots. On RDNA 3 that is v_sin_f32 followed by v_cos_f32, each at 1/4 rate, for a combined cost of roughly 2 × 4 = 8 full-rate ALU-equivalent cycles.

What the rule fires on

Separate calls to sin(x) and cos(x) within the same function body where both calls share the same argument expression x. The rule matches any two such calls, in any order and any number of statements apart, that operate on the same syntactic argument (same identifier, same literal, or structurally identical sub-expression). It does not fire when only one of the two is present, when the arguments differ, or when both results already come from a single sincos intrinsic call.
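To make the matching criterion concrete, here is a minimal Python sketch of the idea. It is not the rule's actual implementation: the names CALL, call_argument, and paired_arguments are hypothetical, the input is assumed to be the text of a single function body, and whitespace-insensitive text comparison stands in for structural equality of sub-expressions. It also ignores comments, string literals, and macros, which a real detector working on parsed syntax would handle.

```python
import re
from typing import List, Optional

# Matches a call to sin or cos, but not asin, sinh, cosh, sincos, etc.
CALL = re.compile(r"(?<![A-Za-z0-9_])(sin|cos)\s*\(")


def call_argument(src: str, open_paren: int) -> Optional[str]:
    """Return the text between the '(' at open_paren and its matching ')'."""
    depth = 0
    for i in range(open_paren, len(src)):
        if src[i] == "(":
            depth += 1
        elif src[i] == ")":
            depth -= 1
            if depth == 0:
                return src[open_paren + 1 : i]
    return None  # unbalanced parentheses


def paired_arguments(body: str) -> List[str]:
    """Return the arguments passed to both sin and cos somewhere in body."""
    seen = {"sin": set(), "cos": set()}
    for m in CALL.finditer(body):
        # The pattern ends with '(', so m.end() - 1 is its index.
        arg = call_argument(body, m.end() - 1)
        if arg is not None:
            # Assumption: stripping whitespace approximates
            # "structurally identical sub-expression".
            seen[m.group(1)].add("".join(arg.split()))
    # Order and distance between the calls do not matter.
    return sorted(seen["sin"] & seen["cos"])
```

Under these assumptions, paired_arguments("float s = sin(a * 2.0); float k = foo(); float c = cos( a*2.0 );") returns ["a*2.0"], while a body that calls sin(a) and cos(b), or only sincos(a, s, c), returns an empty list.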

See the What it detects section of the rule page for the full pattern definition.

Why it matters

The full GPU-mechanism analysis lives in the Why it matters on a GPU section of the companion rule page.

Examples

The bad / good code snippets are kept canonical on the rule page; see sin-cos-pair.md -> Examples.

See also


This is a v1.0-ship stub. Full analysis pending; track issue link TBD.

© 2026 NelCit — Apache-2.0 (code), CC-BY-4.0 (docs).