Speech Intelligibility Testing: STI and STIPA Practical Guide
TL;DR
Speech Transmission Index (STI) quantifies how well a transmission channel (room + PA system) preserves the modulations in speech that carry intelligibility. STI ranges from 0 (unintelligible) to 1 (perfect), with ratings: Bad (<0.30), Poor (0.30-0.45), Fair (0.45-0.60), Good (0.60-0.75), Excellent (>0.75). The STIPA method per IEC 60268-16 Table F.1 uses a test signal with 7 octave bands each modulated at 2 frequencies (14 modulation frequencies total). The signal is played for 15 seconds, the Modulation Transfer Function (MTF) is extracted from the captured signal, and STI is computed using male speech weighting factors and redundancy corrections. Ambient noise correction adjusts the MTF for environmental noise.
Why STI Matters
In emergency announcement systems, airports, schools, courtrooms, and houses of worship, the ability to understand spoken words is not just a quality metric — it is a safety requirement. Fire alarm voice evacuation systems in many jurisdictions must achieve STI ≥ 0.50. Airport PA systems per ICAO require STI ≥ 0.50 at gate areas. Classrooms per ANSI S12.60 target RT60 ≤ 0.6 s, which typically corresponds to STI > 0.60.
SonaVyx's STI measurement tool implements the full IEC 60268-16 methodology using a STIPA test signal generated and processed entirely in Rust WASM.
The Modulation Transfer Function
Speech carries information through temporal modulations — the rhythm of syllables, the attack of consonants, the decay of vowels. These modulations occur at frequencies between 0.63 and 12.5 Hz. The room and PA system reduce the depth of these modulations through reverberation (smearing) and noise (masking).
The MTF m(F, fmod) quantifies the modulation depth preservation at each octave band center frequency F and modulation frequency fmod. A value of 1.0 means the modulation is perfectly preserved. A value of 0 means the modulation is completely destroyed.
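To make the two degradation mechanisms concrete, the sketch below evaluates the widely used closed-form MTF for an idealized room with a single exponential decay and steady background noise. The function name and the example values (RT60 of 1.5 s, 10 dB SNR) are illustrative assumptions, and real rooms with discrete echoes will deviate from this model.

```python
import math

def mtf_reverb_noise(f_mod_hz, rt60_s, snr_db):
    """Idealized m(fmod) for exponential reverberant decay plus steady noise.

    Assumes a diffuse field with one decay time; echoes and non-linear
    distortion are not modelled.
    """
    # Reverberation acts as a low-pass filter on the intensity envelope.
    reverb_term = 1.0 / math.sqrt(
        1.0 + (2.0 * math.pi * f_mod_hz * rt60_s / 13.8) ** 2
    )
    # Steady noise reduces modulation depth equally at every modulation frequency.
    noise_term = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb_term * noise_term

# Illustrative values only: slow modulations survive better than fast ones.
for f_mod in (0.63, 2.0, 12.5):
    print(f_mod, round(mtf_reverb_noise(f_mod, rt60_s=1.5, snr_db=10.0), 3))
```

In this model the reverberation term falls as the modulation frequency rises, while the noise term lowers all modulation frequencies by the same factor.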
STIPA: The Practical Method
Full STI measurement requires 98 modulation transfer values (7 bands × 14 modulation frequencies). STIPA (IEC 60268-16 Table F.1) reduces this to 14 measurements — 2 modulation frequencies per octave band — while maintaining correlation >0.95 with full STI.
The STIPA Test Signal
The test signal contains 7 octave bands (125, 250, 500, 1000, 2000, 4000, 8000 Hz), each amplitude-modulated at two specific frequencies chosen to minimize inter-band interference:
| Band (Hz) | Mod Freq 1 (Hz) | Mod Freq 2 (Hz) |
|---|---|---|
| 125 | 1.60 | 8.00 |
| 250 | 1.00 | 5.00 |
| 500 | 0.63 | 3.15 |
| 1000 | 2.00 | 10.00 |
| 2000 | 1.25 | 6.25 |
| 4000 | 0.80 | 4.00 |
| 8000 | 2.50 | 12.50 |
SonaVyx generates this signal in Rust WASM at 48 kHz and plays it through the PA system for 15 seconds (Quick mode) or 3 × 15 seconds with averaging (Pro mode).
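As a rough illustration of how one band of such a signal could be shaped, the sketch below builds the amplitude envelope that would multiply octave-band noise. This is not SonaVyx's Rust implementation. The function name, the modulation depth of 0.55, and the antiphase combination of the two modulations are assumptions that should be checked against the standard, and the band-limited noise carrier is not shown.

```python
import math

def stipa_band_envelope(f1_hz, f2_hz, duration_s, sample_rate_hz, depth=0.55):
    """Amplitude envelope for one octave band carrying two modulations.

    Assumption: intensity follows 1 + depth*(sin(2*pi*f1*t) - sin(2*pi*f2*t))
    and the amplitude gain is its square root.
    """
    n = int(round(duration_s * sample_rate_hz))
    envelope = []
    for i in range(n):
        t = i / sample_rate_hz
        intensity = 1.0 + depth * (
            math.sin(2.0 * math.pi * f1_hz * t)
            - math.sin(2.0 * math.pi * f2_hz * t)
        )
        # Clamp at zero in case a chosen pair drives the expression negative.
        envelope.append(math.sqrt(max(intensity, 0.0)))
    return envelope

# Example: the 1.00 Hz / 5.00 Hz pair from the table, 15 s long, at a
# reduced sample rate to keep the list small.
env = stipa_band_envelope(1.0, 5.0, duration_s=15.0, sample_rate_hz=1000)
print(len(env), round(min(env), 3), round(max(env), 3))
```

A full generator would repeat this for all seven bands, multiply each envelope with its band-filtered noise, and sum the results.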
From MTF to STI
The computation chain per IEC 60268-16:
- Extract MTF: For each band, compute the modulation depth of the captured signal at the two modulation frequencies.
- Convert to apparent SNR: SNRapp = 10 log10(m / (1 - m)), clamped to -15 to +15 dB.
- Average per band: Mean of the two apparent SNR values per octave band.
- Apply redundancy corrections: Adjacent-band correlation correction (Table 5 of the standard) accounts for the fact that octave bands are not independent.
- Apply male speech weighting: Multiply each band's contribution by weighting factors (Table 4): 125 Hz = 0.085, 250 = 0.127, 500 = 0.230, 1000 = 0.233, 2000 = 0.309, 4000 = 0.224, 8000 = 0.173 (female weights differ slightly).
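- Sum: Convert each band's mean SNR to a modulation transfer index, MTIk = (SNRk + 15) / 30, then compute STI = Σ(αk × MTIk) - Σ(βk × √(MTIk × MTIk+1)), where βk are the redundancy factors for adjacent bands, and clamp the result to 0-1.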
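The steps above can be sketched in a few functions. This is a minimal illustration, not the SonaVyx implementation. It combines the weighting and redundancy steps as STI = Σ αk·MTIk - Σ βk·√(MTIk·MTIk+1) with MTIk = (SNRk + 15) / 30. The α values are the ones quoted in the list, while the β values and all function names are assumptions to verify against your copy of the standard. The transfer value m for each modulation frequency is the received modulation depth divided by the depth in the original test signal.

```python
import math

# Male weighting factors quoted in the list above (125 Hz to 8 kHz).
ALPHA = (0.085, 0.127, 0.230, 0.233, 0.309, 0.224, 0.173)
# Redundancy factors for the six adjacent band pairs. Assumption: recalled
# from the 2011 edition of the standard; verify before use.
BETA = (0.085, 0.078, 0.065, 0.011, 0.047, 0.095)

def modulation_depth(intensity, sample_rate_hz, f_mod_hz):
    """Estimate the modulation depth of an intensity envelope at f_mod_hz.

    Single-bin Fourier sum; most accurate when the record spans a whole
    number of modulation periods.
    """
    re = im = total = 0.0
    for i, value in enumerate(intensity):
        phase = 2.0 * math.pi * f_mod_hz * i / sample_rate_hz
        re += value * math.cos(phase)
        im += value * math.sin(phase)
        total += value
    return 2.0 * math.hypot(re, im) / total

def apparent_snr_db(m):
    """Apparent SNR in dB from a transfer value, clamped to +/-15 dB."""
    m = min(max(m, 1e-6), 1.0 - 1e-6)  # keep the logarithm finite
    return min(max(10.0 * math.log10(m / (1.0 - m)), -15.0), 15.0)

def sti_from_mtf(m_by_band):
    """STI from seven (m1, m2) pairs ordered 125 Hz to 8 kHz."""
    mti = []
    for m1, m2 in m_by_band:
        mean_snr = 0.5 * (apparent_snr_db(m1) + apparent_snr_db(m2))
        mti.append((mean_snr + 15.0) / 30.0)
    weighted = sum(a * x for a, x in zip(ALPHA, mti))
    redundancy = sum(b * math.sqrt(mti[k] * mti[k + 1]) for k, b in enumerate(BETA))
    return min(max(weighted - redundancy, 0.0), 1.0)

# Sanity check: m = 0.5 everywhere means 0 dB apparent SNR in every band.
print(round(sti_from_mtf([(0.5, 0.5)] * 7), 3))
```

With these factors Σα - Σβ = 1, so equal transfer indices in every band map straight through to the same STI value, which makes a convenient self-check for any implementation.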
Ambient Noise Correction
The measured MTF includes the effect of ambient noise present during the measurement. If the measurement environment has different noise than the operational environment, the MTF must be corrected:
m_corrected = m_measured / (1 + 10^(-(S - N)/10))
Where S is the signal level and N is the ambient noise level per octave band. The SonaVyx STI tool can measure the ambient noise first (with the PA off) and automatically apply the correction.
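The correction above, reading the exponent as -(S - N)/10, can be sketched per octave band as follows. The function name and the example levels are hypothetical.

```python
def noise_corrected_m(m_measured, signal_db, noise_db):
    """Scale a modulation transfer value for a given signal-to-noise ratio.

    signal_db and noise_db are levels in dB for the same octave band.
    """
    snr_db = signal_db - noise_db
    return m_measured / (1.0 + 10.0 ** (-snr_db / 10.0))

# Hypothetical band: m = 0.80 measured in quiet, then evaluated for
# 70 dB speech over 60 dB operational noise.
print(round(noise_corrected_m(0.80, 70.0, 60.0), 3))
```

At 0 dB SNR the factor is exactly one half, and above roughly 15 dB SNR the correction becomes small.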
STI from Impulse Response
Per IEC 60268-16 clause A.4, STI can also be computed directly from the room impulse response:
m(fmod) = |∫ h²(t) exp(-j2π fmod t) dt| / ∫ h²(t) dt
This method does not require the STIPA test signal — any valid impulse response (from sweep or MLS measurement) can be used. It does not include PA system non-linearity or noise, so it represents the "room only" contribution to intelligibility. SonaVyx implements both the STIPA method (PA + room) and the IR method (room only) for comparison.
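A discrete version of the integral above replaces each integral with a sum over samples. The sketch below assumes the impulse response has already been filtered into one octave band (the filtering is not shown), and the function name is hypothetical. It is checked against a synthetic exponential decay, not a measured room.

```python
import cmath
import math

def mtf_from_impulse_response(h, sample_rate_hz, f_mod_hz):
    """MTF as the normalized Fourier transform of the squared impulse response.

    h is a sequence of samples of a band-filtered impulse response.
    """
    numerator = 0j
    energy = 0.0
    for i, sample in enumerate(h):
        power = sample * sample
        t = i / sample_rate_hz
        numerator += power * cmath.exp(-2j * math.pi * f_mod_hz * t)
        energy += power
    return abs(numerator) / energy

# Synthetic test case: energy decays by 60 dB over rt60 seconds.
fs = 1000
rt60 = 1.0
amplitude_decay = math.log(1e6) / 2.0  # h decays at half the rate of h squared
h = [math.exp(-amplitude_decay * (i / fs) / rt60) for i in range(3 * fs)]
print(round(mtf_from_impulse_response(h, fs, 2.0), 3))
```

For a pure exponential decay the result should agree closely with the closed form 1 / √(1 + (2π·fmod·RT60 / 13.8)²), which is a useful check before applying the function to measured data.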
Interpreting Results
| STI Range | Rating | Typical Application |
|---|---|---|
| > 0.75 | Excellent | Studio, courtroom, critical listening |
| 0.60 - 0.75 | Good | Classroom, conference room, worship |
| 0.45 - 0.60 | Fair | Minimum for general PA, announcement systems |
| 0.30 - 0.45 | Poor | Reverberant spaces, noisy environments |
| < 0.30 | Bad | Unintelligible — remediation required |
For emergency voice alarm systems, most building codes require STI ≥ 0.50. The room analysis workflow includes STI measurement as part of the comprehensive room assessment, and the AI diagnostic flags intelligibility issues with specific remediation recommendations.
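The rating bands in the table can be expressed as a small lookup. Because the table's ranges share their end points, the handling of exact boundary values here (for example, 0.60 counted as Good) is an assumption, as is the function name.

```python
def sti_rating(sti):
    """Map an STI value (0 to 1) to the qualitative rating in the table."""
    if not 0.0 <= sti <= 1.0:
        raise ValueError("STI must lie between 0 and 1")
    if sti > 0.75:
        return "Excellent"
    if sti >= 0.60:
        return "Good"
    if sti >= 0.45:
        return "Fair"
    if sti >= 0.30:
        return "Poor"
    return "Bad"

# Example: label a handful of measured positions.
for value in (0.28, 0.52, 0.66, 0.81):
    print(value, sti_rating(value))
```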
Common Issues and Fixes
- STI < 0.50 due to high RT60: Acoustic treatment to reduce RT60 below 1.0 s typically brings STI above 0.50.
- STI < 0.50 due to noise: Increase signal level, reduce noise sources, or add speakers closer to listeners.
- STI < 0.50 due to echoes: Strong late reflections (>50 ms) destroy modulation. Identify and treat the reflecting surface.
- STI varies across room: Poor speaker coverage. Add speakers or redirect existing ones. Measure at 5+ positions per ISO 3382.
Last updated: March 19, 2026