How many MTF values for full STI?

7 bands × 14 modulation frequencies = 98. STIPA reduces to 14 (2/band) with >0.95 correlation to full STI.

Male speech weighting factors?

125Hz=0.085, 250=0.127, 500=0.230, 1000=0.233, 2000=0.309 (highest), 4000=0.224, 8000=0.173.

What is redundancy correction?

Adjacent octave bands carry overlapping speech information. The correction prevents double-counting by subtracting a β-weighted geometric mean of the MTI values of each adjacent band pair.

Can STI be predicted from RT60 + noise?

Yes, per Annex D. The simplified formula assumes exponential decay and a diffuse field. SonaVyx implements it as a quick-check.

What codes mandate STI?

BS 5839-8: ≥0.50. EN 54-16: references IEC 60268-16. AS 1670.4: ≥0.50. NFPA 72: ~0.50 equivalent.

IEC 60268-16: The Complete Standard Behind Speech Intelligibility (STI)

Standard Structure

IEC 60268-16 Ed.5 (2020) is the definitive international standard for objective speech intelligibility measurement. It applies to any transmission channel: room acoustics, PA systems, telephone networks, hearing aids, voice alarm systems, and intercoms.

Key sections:

Clause 4: MTF definition and computation
Clause 5: STI computation from MTF values
Clause 6: Measurement procedures (full and simplified)
Annex A: STI from impulse response (computational method)
Annex D: STI from SNR and RT60 (estimation method)
Annex F: STIPA test signal specification (Table F.1)

The Modulation Transfer Function

Speech intelligibility depends on temporal modulations — the amplitude variations that carry phonemic information. These modulations occur at rates between 0.63 and 12.5 Hz (corresponding to syllable rates and consonant/vowel transitions).

The MTF m(F, f_mod) measures how well the transmission channel preserves these modulations at each octave band center frequency F and modulation frequency f_mod:

m = 1.0: Modulation perfectly preserved
m = 0.5: Modulation reduced to half its original depth
m = 0.0: Modulation completely destroyed

Two mechanisms reduce MTF: reverberation (smears modulations in time) and noise (fills in modulation troughs). The combined effect per clause 4.3:

m(F, f_mod) = 1 / √(1 + (2π f_mod T / 13.8)²) × 1 / (1 + 10^(−(S−N)/10))

Where T is the reverberation time and (S-N) is the signal-to-noise ratio at band F.
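As a minimal sketch, the combined effect can be evaluated for a single band and modulation frequency. It uses the square-root form of the reverberation term, 1 / √(1 + (2π f_mod T / 13.8)²), multiplied by the noise term. The function and parameter names are illustrative and do not come from the standard or from SonaVyx.

```python
import math


def mtf_reverb_noise(f_mod_hz: float, rt60_s: float, snr_db: float) -> float:
    """Modulation transfer value for one octave band and one modulation frequency.

    f_mod_hz: modulation frequency in Hz (0.63 to 12.5 for STI)
    rt60_s:   reverberation time T of the band in seconds
    snr_db:   signal-to-noise ratio (S - N) of the band in dB
    """
    # Reverberation smears modulations in time.
    reverb_term = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * f_mod_hz * rt60_s / 13.8) ** 2)
    # Noise fills in the modulation troughs.
    noise_term = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb_term * noise_term


if __name__ == "__main__":
    # Example: 1.2 s reverberation time, 15 dB SNR, 2 Hz modulation.
    print(round(mtf_reverb_noise(2.0, 1.2, 15.0), 3))
```

With no reverberation and 0 dB SNR the function returns 0.5: the noise alone halves the modulation depth.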

From MTF to STI (Clause 5)

Step 1: Apparent SNR

Each MTF value is converted to an apparent signal-to-noise ratio: SNR_app(k,j) = 10 log₁₀(m_kj / (1 - m_kj)). This is clamped to the range -15 to +15 dB.

Step 2: Band Average

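For each octave band k, each clamped apparent SNR is first normalized to a transmission index between 0 and 1: TI = (SNR_app + 15) / 30. The 14 TI values (or 2 for STIPA) are then arithmetically averaged to produce MTI_k (Modulation Transfer Index per band).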
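Steps 1 and 2 can be sketched as follows. The sketch assumes the usual normalization of the clamped apparent SNR to a 0 to 1 transmission index, TI = (SNR_app + 15) / 30, before averaging. The helper names and the small guard value that keeps m away from exactly 0 or 1 are illustrative choices, not taken from the standard.

```python
import math


def apparent_snr_db(m: float) -> float:
    """Convert one MTF value to an apparent SNR in dB, clamped to [-15, +15]."""
    # Guard against log(0) and division by zero at m = 0 or m = 1 (illustrative choice).
    m = min(max(m, 1e-6), 1.0 - 1e-6)
    snr = 10.0 * math.log10(m / (1.0 - m))
    return min(max(snr, -15.0), 15.0)


def band_mti(m_values: list[float]) -> float:
    """Average the transmission indices of one octave band (14 values, or 2 for STIPA)."""
    if not m_values:
        raise ValueError("at least one MTF value is required")
    ti = [(apparent_snr_db(m) + 15.0) / 30.0 for m in m_values]
    return sum(ti) / len(ti)


if __name__ == "__main__":
    # Two hypothetical STIPA modulation depths for a single band.
    print(round(band_mti([0.8, 0.6]), 3))
```

An MTF value of 0.5 corresponds to 0 dB apparent SNR and therefore to a transmission index of 0.5.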

Step 3: Redundancy Correction (Table 5)

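Adjacent octave bands carry partially redundant information (the correlation between 500 Hz and 1000 Hz content is higher than between 125 Hz and 8000 Hz). The correction subtracts a weighted geometric mean of each adjacent pair: STI = Σ_k α_k × MTI_k − Σ_k β_k × √(MTI_k × MTI_k+1), where the first sum runs over all 7 bands and the second over the 6 adjacent band pairs.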

Step 4: Male Speech Weighting (Table 4)

The weighting factors α_k reflect the relative importance of each band for male speech intelligibility:

Band (Hz)	125	250	500	1000	2000	4000	8000
α (male)	0.085	0.127	0.230	0.233	0.309	0.224	0.173

The 2000 Hz band has the highest weight — this is the region containing the most speech-discriminating information (sibilants, plosives). Female speech weighting differs slightly (higher weight at 4000-8000 Hz).
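The weighting and redundancy correction can be sketched in a few lines. The α values are the male factors from the table above. The β values are not given in this article; they are the male redundancy factors as commonly quoted for IEC 60268-16 and should be checked against the standard before use. The function name is illustrative.

```python
import math

# Male weighting factors for the 125 Hz ... 8000 Hz octave bands (table above).
ALPHA_MALE = [0.085, 0.127, 0.230, 0.233, 0.309, 0.224, 0.173]
# Assumption: commonly quoted male redundancy factors for the 6 adjacent band pairs.
BETA_MALE = [0.085, 0.078, 0.065, 0.011, 0.047, 0.095]


def sti_from_mti(mti: list[float]) -> float:
    """Combine 7 per-band MTI values (each 0..1) into a single STI value."""
    if len(mti) != 7:
        raise ValueError("expected one MTI value per octave band (7 values)")
    weighted = sum(a * m for a, m in zip(ALPHA_MALE, mti))
    redundancy = sum(
        b * math.sqrt(mti[k] * mti[k + 1]) for k, b in enumerate(BETA_MALE)
    )
    # Clamp to the valid STI range to absorb floating-point rounding.
    return min(max(weighted - redundancy, 0.0), 1.0)


if __name__ == "__main__":
    print(round(sti_from_mti([0.55, 0.60, 0.62, 0.65, 0.70, 0.68, 0.60]), 3))
```

A useful sanity check: the α values sum to 1.381 and these β values sum to 0.381, so identical MTI values in all bands produce an STI equal to that value.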

STIPA: The Practical Method (Annex F)

Full STI measurement with 98 MTF values requires either an impulse response or an impractically long test signal. STIPA reduces this to 14 values — 2 per octave band — using a specially designed multi-band modulated test signal (Table F.1).

The STIPA test signal must be played for at least 15 seconds to ensure statistically reliable modulation depth extraction. SonaVyx's STI tool generates this signal in Rust WASM at 48 kHz and extracts the MTF using the full IEC 60268-16 computation chain.

STI from Impulse Response (Clause A.4)

When an impulse response is available, STI can be computed without the STIPA test signal:

m(F, f_mod) = |∫₀^∞ h_F²(t) exp(-j2πf_modt) dt| / ∫₀^∞ h_F²(t) dt

Where h_F(t) is the impulse response band-pass filtered to octave band F. This method yields the room's contribution to intelligibility without including PA system non-linearity or self-noise. It is useful for predicting STI from room acoustic measurements before the PA system is installed.
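A discrete version of this integral can be sketched with the Python standard library. The sketch assumes the impulse response has already been band-pass filtered to one octave band and is supplied as a list of samples; the function name and arguments are illustrative.

```python
import cmath
import math


def mtf_from_impulse_response(h: list[float], sample_rate_hz: float, f_mod_hz: float) -> float:
    """Approximate m(F, f_mod) from a band-filtered impulse response h_F.

    Computes |sum(h^2 * exp(-j*2*pi*f_mod*t))| / sum(h^2) over the sampled response.
    """
    total_energy = 0.0
    modulated = 0j
    for n, sample in enumerate(h):
        energy = sample * sample
        t = n / sample_rate_hz
        total_energy += energy
        modulated += energy * cmath.exp(-2j * math.pi * f_mod_hz * t)
    if total_energy == 0.0:
        raise ValueError("impulse response contains no energy")
    return abs(modulated) / total_energy


if __name__ == "__main__":
    # Synthetic check: an ideal exponential decay with RT60 = 1.0 s.
    fs, rt60 = 1000, 1.0
    h = [math.exp(-6.9 * n / fs / rt60) for n in range(3 * fs)]
    print(round(mtf_from_impulse_response(h, fs, 2.0), 3))
```

For an ideal exponential decay the result should approach the reverberation term 1 / √(1 + (2π f_mod T / 13.8)²), which makes a convenient self-test. A production implementation would normally vectorize the loop.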

STI from RT60 and SNR (Annex D)

The simplest estimation method uses only the reverberation time and signal-to-noise ratio per band. This is an approximation — it assumes exponential decay and diffuse field conditions. SonaVyx implements this as a quick-check tool on the STI page: enter RT60 and ambient noise levels, and get an estimated STI without any acoustic measurement.
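A quick-check of this kind can be sketched by chaining the formulas from this article: the reverberation and noise MTF, apparent SNR, transmission index, and the weighted sum. This is not the exact Annex D procedure and not SonaVyx's code; it omits auditory masking and hearing-threshold corrections. The β values are an assumption (the commonly quoted male redundancy factors) and all names are illustrative.

```python
import math

# The 14 modulation frequencies, in one-third-octave steps from 0.63 to 12.5 Hz.
MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5, 3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]
ALPHA_MALE = [0.085, 0.127, 0.230, 0.233, 0.309, 0.224, 0.173]
BETA_MALE = [0.085, 0.078, 0.065, 0.011, 0.047, 0.095]  # assumption, verify against the standard


def estimate_sti(rt60_s: list[float], snr_db: list[float]) -> float:
    """Estimate STI from per-band RT60 (s) and SNR (dB), 125 Hz to 8000 Hz."""
    if len(rt60_s) != 7 or len(snr_db) != 7:
        raise ValueError("expected 7 RT60 values and 7 SNR values")
    mti = []
    for t, snr in zip(rt60_s, snr_db):
        noise_term = 1.0 / (1.0 + 10.0 ** (-snr / 10.0))
        ti_sum = 0.0
        for f in MOD_FREQS_HZ:
            m = noise_term / math.sqrt(1.0 + (2.0 * math.pi * f * t / 13.8) ** 2)
            m = min(max(m, 1e-6), 1.0 - 1e-6)  # keep the logarithm finite
            snr_app = min(max(10.0 * math.log10(m / (1.0 - m)), -15.0), 15.0)
            ti_sum += (snr_app + 15.0) / 30.0
        mti.append(ti_sum / len(MOD_FREQS_HZ))
    weighted = sum(a * m for a, m in zip(ALPHA_MALE, mti))
    redundancy = sum(b * math.sqrt(mti[k] * mti[k + 1]) for k, b in enumerate(BETA_MALE))
    return min(max(weighted - redundancy, 0.0), 1.0)


if __name__ == "__main__":
    # Hypothetical room: 1.5 s RT60 and 15 dB SNR in every band.
    print(round(estimate_sti([1.5] * 7, [15.0] * 7), 2))
```

Because the estimate rests on exponential decay and diffuse-field assumptions, it is best treated as a screening value rather than a substitute for a measurement.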

Quality Ratings

STI	CIS	Rating	Example Context
0.00-0.30	0.00-0.48	Bad	Unintelligible; emergency PA failure
0.30-0.45	0.48-0.65	Poor	Very reverberant space, high noise
0.45-0.60	0.65-0.78	Fair	Minimum acceptable for announcement
0.60-0.75	0.78-0.88	Good	Classroom, conference room target
0.75-1.00	0.88-1.00	Excellent	Studio, courtroom, critical listening

CIS (Common Intelligibility Scale) is a logarithmic transform of STI, CIS = 1 + log₁₀(STI), designed to better match subjective perception. The CIS ranges above follow from that relation. Both are reported by SonaVyx.
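The conversion and the rating lookup can be sketched as below, assuming the commonly quoted relation CIS = 1 + log₁₀(STI). The function names are illustrative. Treating each lower bound as inclusive (so that 0.45 counts as Fair) is a choice made here, since the table's ranges share their edge values.

```python
import math


def sti_to_cis(sti: float) -> float:
    """Convert STI to CIS, limited to the 0..1 range."""
    if sti <= 0.0:
        return 0.0
    return min(max(1.0 + math.log10(sti), 0.0), 1.0)


def sti_rating(sti: float) -> str:
    """Map an STI value to the qualitative rating from the table above."""
    upper_bounds = [(0.30, "Bad"), (0.45, "Poor"), (0.60, "Fair"), (0.75, "Good")]
    for upper, label in upper_bounds:
        if sti < upper:
            return label
    return "Excellent"


if __name__ == "__main__":
    for value in (0.28, 0.50, 0.62, 0.80):
        print(value, round(sti_to_cis(value), 2), sti_rating(value))
```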

Regulatory Requirements

Many building codes and fire safety regulations mandate minimum STI for voice alarm systems:

BS 5839-8 (UK): STI ≥ 0.50 for voice alarm systems
EN 54-16 (EU): References IEC 60268-16 for VA system verification
NFPA 72 (USA): Does not mandate STI directly but references intelligibility requirements that correspond to approximately STI ≥ 0.50
AS 1670.4 (Australia): STI ≥ 0.50 for emergency warning systems

The room analysis workflow includes STI as part of the comprehensive assessment, and the AI diagnostic flags intelligibility issues with specific remediation paths.
