KoSP2E ASR Recipe

This is the ESPnet2 ASR recipe for the KoSP2E dataset (Kosp2e: Korean Speech to English Translation Corpus).

Overview

The KoSP2E dataset is a Korean speech corpus built for Korean speech to English translation. This recipe provides a full ASR pipeline in ESPnet2 with both Transformer and Conformer architectures. The results below are for the Conformer model.
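A trained model from this recipe can typically be used through the ESPnet2 inference API. The following is a minimal sketch, not a tested invocation for this particular model: it assumes the model is published under the Hub ID `espnet/kosp2e-asr-ko`, that `espnet`, `espnet_model_zoo`, and `soundfile` are installed, and that the input audio matches the sampling rate used in training. The helper name `transcribe` is hypothetical.

```python
def transcribe(wav_path, model_tag="espnet/kosp2e-asr-ko"):
    """Return the 1-best transcription of a single audio file.

    model_tag is an assumed Hub ID; replace it if the model lives elsewhere.
    """
    # Imports are local so this file can be imported without ESPnet installed.
    import soundfile as sf
    from espnet2.bin.asr_inference import Speech2Text

    # Downloads and loads the packed model (requires espnet_model_zoo).
    speech2text = Speech2Text.from_pretrained(model_tag)

    # The waveform must use the sampling rate the model was trained on.
    speech, _rate = sf.read(wav_path)

    # The call returns an n-best list of (text, tokens, token_ids, hypothesis).
    nbest = speech2text(speech)
    text, _tokens, _token_ids, _hyp = nbest[0]
    return text
```

Calling `transcribe("sample.wav")` would return the recognized Korean text as a string, where `sample.wav` stands in for any local audio file.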

Results

Environment

Date: Mon Nov 10 20:35:20 UTC 2025
Python: 3.10.19
ESPnet: 202509
PyTorch: 2.9.0+cu128
Model: Conformer (BPE=2000)
Decode: Transformer LM (valid.acc.ave)

WER

dataset	Snt	Wrd	Corr	Sub	Del	Ins	Err	S.Err
test	2320	22337	77.1	20.4	2.6	4.4	27.4	76.4
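The Sub, Del, and Ins columns come from an edit-distance alignment between each reference and hypothesis. The sketch below illustrates the idea on whitespace-separated words. It is a simplified stand-in for the scoring tool, whose tie-breaking between equal-cost alignments may differ, and the function name `align_counts` is hypothetical.

```python
def align_counts(ref, hyp):
    """Return (edits, substitutions, deletions, insertions) for ref vs hyp."""
    # dp[i][j] holds the best (edits, sub, del, ins) for ref[:i] vs hyp[:j].
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # only deletions
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # only insertions

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            c, s, d, n = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (c, s, d, n)          # correct word
            else:
                diag = (c + 1, s + 1, d, n)  # substitution
            c, s, d, n = dp[i - 1][j]
            dele = (c + 1, s, d + 1, n)      # reference word dropped
            c, s, d, n = dp[i][j - 1]
            ins = (c + 1, s, d, n + 1)       # extra hypothesis word
            dp[i][j] = min(diag, dele, ins, key=lambda t: t[0])
    return dp[-1][-1]


ref = "a b c d".split()
hyp = "a x c".split()
edits, sub, dele, ins = align_counts(ref, hyp)
# The error rate is the total number of edits divided by the reference length.
wer = 100.0 * edits / len(ref)
print(sub, dele, ins, wer)
```

Running the same alignment over characters or BPE tokens instead of words gives the CER and TER counts.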

CER

dataset	Snt	Wrd	Corr	Sub	Del	Ins	Err	S.Err
test	2320	84267	92.5	5.7	1.8	1.7	9.2	76.4

TER

dataset	Snt	Wrd	Corr	Sub	Del	Ins	Err	S.Err
test	2320	65361	89.4	8.6	2.0	2.1	12.7	76.4
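In all three tables, Err is the sum of the Sub, Del, and Ins percentages. Because insertions count as errors, Err can exceed 100 minus Corr. A short check using the values reported above (the dictionary layout is only for illustration):

```python
# Percentages copied from the WER, CER, and TER tables for the test set.
rows = {
    "WER": {"corr": 77.1, "sub": 20.4, "del": 2.6, "ins": 4.4, "err": 27.4},
    "CER": {"corr": 92.5, "sub": 5.7, "del": 1.8, "ins": 1.7, "err": 9.2},
    "TER": {"corr": 89.4, "sub": 8.6, "del": 2.0, "ins": 2.1, "err": 12.7},
}


def recomputed_err(row):
    """Err = Sub + Del + Ins, rounded to one decimal like the tables."""
    return round(row["sub"] + row["del"] + row["ins"], 1)


for name, row in rows.items():
    print(name, "reported:", row["err"], "recomputed:", recomputed_err(row))
```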

References

Kosp2e: Korean Speech to English Translation Corpus (arXiv:2107.02875, published Jul 6, 2021): https://arxiv.org/abs/2107.02875
