How does the AI diagnostic health score work?

Weighted average of 6 categories: Frequency Balance (25%), Phase (20%), Coverage (15%), Noise Floor (15%), Reverberation (15%), Intelligibility (10%). Each scores 0-100.

Does the AI diagnostic send my audio to a server?

No raw audio is transmitted. Tier 1 runs on-device in WASM. Tier 2 sends only processed metrics (not audio) to Claude Sonnet 4.

What is the difference between Tier 1 and Tier 2 diagnostics?

Tier 1: rule-based, instant, free. Tier 2: Claude Sonnet 4 AI, deeper analysis, ~$0.003/analysis with caching, Pro only.

Can AI replace a sound engineer?

AI augments but does not replace. It excels at pattern recognition but cannot physically reposition speakers or account for artistic intent.

How much does AI analysis cost?

Tier 1: free. Tier 2: ~$0.03 cold, ~$0.003 with prompt caching. Free users see top 3 issues; Pro see full analysis.

AI in Sound System Diagnostics: From Measurement Data to Actionable Fixes

The Interpretation Gap in Audio Measurement

Professional audio measurement tools produce enormous amounts of data: frequency response curves with thousands of data points, phase traces, coherence plots, impulse responses, RT60 values across octave bands, STI scores, and SPL distributions. The data is precise. The problem is interpretation.

A sound engineer with 20 years of experience looks at a transfer function and instantly sees the 6 dB dip at 250 Hz is a comb filter from a reflection 2.3 meters away. A venue manager with no acoustics background sees a squiggly line. AI bridges this gap.

Tier 1: Rule-Based Analysis (Instant, Free)

SonaVyx's first diagnostic tier runs entirely on-device using deterministic rules. It completes in under 100 ms and costs nothing to operate. The diagnostic engine checks six categories:

1. Frequency Balance

The engine computes deviation from a target curve across four bands: sub-bass (20-80 Hz), low-mid (80-500 Hz), mid-high (500-4000 Hz), and high (4000-20000 Hz). Deviations exceeding ±6 dB are flagged as problems. A buildup above +8 dB in the 200-500 Hz range triggers a specific "muddy sound" diagnosis with a recommendation to check boundary coupling and low-mid accumulation.

2. Phase Alignment

Phase deviation from linear at the crossover frequency between system components indicates alignment issues. The engine checks for polarity inversion (180° offset) and time misalignment (frequency-dependent phase slope). A phase offset exceeding 90° at the crossover triggers a system alignment recommendation.

3. Coverage Uniformity

When multiple measurement positions are available, SPL standard deviation across positions quantifies coverage uniformity. A standard deviation above 4 dB suggests coverage issues — either speaker aiming, delay timing, or architectural interference.

4. Noise Floor

The A-weighted noise floor is compared against NC (Noise Criteria) curves appropriate for the venue type. A worship space with NC-35 ambient noise is acceptable; the same level in a recording studio (target NC-15) is a critical problem. The engine references NC, NR, and PNC curves for the comparison.

5. Reverberation

RT60 values are compared against target ranges for 10 venue types. A lecture hall should be 0.6-0.8 s; a concert hall 1.5-2.2 s. Values outside the target range by more than 30% generate a warning with a link to the treatment calculator.

6. Speech Intelligibility

STI scores below 0.50 (the "Fair" threshold per IEC 60268-16) trigger a warning. Below 0.45 is flagged as critical, especially for emergency announcement systems where regulatory minimums apply.

Health Score Computation

Each category scores 0-100 based on how far measured values deviate from targets. The overall health score is a weighted average:

Frequency Balance: 25%
Phase Alignment: 20%
Coverage: 15%
Noise Floor: 15%
Reverberation: 15%
Intelligibility: 10%

A score above 85 indicates a well-tuned system. Between 60-85 means correctable issues exist. Below 60 indicates significant problems requiring attention.

Tier 2: Claude Sonnet 4 Deep Analysis (Pro)

When the free tier teaser shows interesting findings, Pro users can request a full AI analysis. This sends processed measurement metrics (not raw audio) to Claude Sonnet 4 (claude-sonnet-4-20250514) with a structured prompt.

The AI receives:

Frequency response data (magnitude and phase)
Coherence values per frequency band
RT60 per octave band
STI score and per-band MTF
SPL statistics (Leq, Lmax, percentiles)
Problem detector results (feedback, hum, polarity, comb filtering, clipping, THD+N)
Equipment scan results (if available)
Venue type and dimensions (if provided)

Structured JSON Output

The AI returns a structured JSON response containing:

Summary: 2-3 sentence plain-English assessment
Problems: Prioritized list with severity (critical/warning/info), affected frequency range, root cause, and specific fix
EQ Recommendations: Parametric EQ bands with center frequency (Hz), gain (dB), and Q factor
Category Scores: Refined scores for each of the 6 categories

Prompt caching via cache_control: ephemeral on the system prompt reduces API cost by approximately 90% for repeated analyses, bringing the per-analysis cost to roughly $0.003 for cached requests versus $0.03 for cold requests.

Pattern Recognition vs Rule Engine

The AI tier excels at recognizing patterns the rule engine cannot encode:

Comb filter signatures: Periodic nulls in the frequency response that indicate a specific reflection delay — the AI calculates the implied distance and suggests physical remediation
Equipment-specific issues: Recognizing that a frequency response dip at 2.5 kHz combined with a THD spike matches a known crossover problem in a specific speaker model
Compound problems: When muddy low-mids AND poor intelligibility AND high RT60 all appear together, the AI prioritizes acoustic treatment over EQ because the root cause is reverberation, not system tuning

What AI Cannot Replace

AI diagnostics augment but do not replace engineering judgment. The system cannot:

Physically reposition speakers or microphones
Detect issues outside the measurement bandwidth (e.g., structural vibration below 20 Hz)
Account for artistic intent (a mixing engineer may want more low-end for a specific genre)
Verify that recommended changes were physically implemented correctly

The before/after comparison tool closes this loop — measure, apply AI recommendations, measure again, and verify improvement with objective metrics.

Privacy: On-Device First

All Tier 1 analysis runs entirely in the browser via Rust WASM. No audio data leaves the device. Tier 2 sends only processed metrics (frequency/magnitude/phase arrays, scalar values) — never raw audio samples. The privacy policy details exactly what data is transmitted.

For users who need fully offline operation, the problem detection suite (7 detectors) and all measurement tools work without any network connection using the WASM engine and optional ONNX edge ML models.