Model Description

We present a large classification model trained on a manually curated real-world dataset that can be used as a new benchmark for advancing research in toxicity detection and classification. We started with the original weights from the WavLM base plus and fine-tuned it with 2,374 hours of voice chat audio clips for multilabel classification. The audio clips are automatically labeled using a synthetic data pipeline described in our blog post. A single output can have multiple labels. The model outputs a n by 6 output tensor where the inferred labels are Profanity, DatingAndSexting, Racist, Bullying, Other, NoViolation. Other consists of policy violation categories with low prevalence such as drugs and alcohol or self-harm that are combined into a single category.

We evaluated this model on a data set with human annotated labels that contained a total of 9,795 samples with the class distribution shown below. Note that we did not include the "other" category in this evaluation data set.

Class	Number of examples	Duration (hours)	% of dataset
Profanity	4893	15.38	49.95%
DatingAndSexting	688	2.52	7.02%
Racist	889	3.10	9.08%
Bullying	1256	4.25	12.82%
NoViolation	4185	9.93	42.73%

If we set the same threshold across all classes and treat the model as a binary classifier across all 4 toxicity classes (Profanity, DatingAndSexting, Racist, Bullying), we get a binarized average precision of 94.48%. The precision recall curve is as shown below.

PR Curve

Usage

The dependencies for the inference file can be installed as follows:

pip install -r requirements.txt

The inference file contains useful helper functions to preprocess the audio file for proper inference. To run the inference file, please run the following command:

python inference.py --audio_file <your audio file path> --model_path <path to Huggingface model>

You can get the model weights either by downloading from the model releases page here, or from HuggingFace under roblox/voice-safety-classifier. If model_path isn’t specified, the model will be loaded directly from HuggingFace.

Downloads last month: 253

Safetensors

Model size

94.6M params

Tensor type

F32

Paper for Roblox/voice-safety-classifier

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

Paper • 2110.13900 • Published Oct 26, 2021 • 1