Automated detection of toxic comments in Hungarian

Hatvani, Péter and Yang, Zijian Győző (2025) Automated detection of toxic comments in Hungarian. Annales Mathematicae et Informaticae, 61. pp. 108-117. ISSN 1787-6117 (Online)

Official URL: https://doi.org/10.33039/ami.2025.10.007

Abstract

Moderating toxic online comments in Hungarian remains a challenging NLP task. We introduce the first openly available Hungarian corpus for toxic comment classification, though limited in size (n = 655), sourced from social media and political news forums. We fine-tuned three BERT-based classifiers (huBERT, multilingual BERT, and huBERT-SetFit) and applied data augmentation techniques to expand the training dataset. The best-performing model, huBERT-SetFit, achieved an F1 score of 93.7%. Our results demonstrate the effectiveness of transformer-based models for toxicity detection in low-resource, linguistically complex settings.
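
For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1 score is computed for a binary toxic/non-toxic classifier. The labels are hypothetical toy data, not taken from the paper's corpus, and the abstract does not state which averaging variant (binary, macro, or weighted) the authors used; the binary form is assumed here purely for illustration.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy labels (1 = toxic, 0 = non-toxic); not data from the paper.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

precision, recall, f1 = f1_score(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it stays low unless both are high, which makes it a common choice when the toxic and non-toxic classes are imbalanced.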

Item type:	Journal article
Authors:	Hatvani, Péter (author; email, MTMT ID, and ORCID not specified); Yang, Zijian Győző (author; email, MTMT ID, and ORCID not specified)
Related URLs:	Containing volume
Keywords:	toxicity, online hate, NLP, classification, logistic regression
Journal subtitle:	Selected papers of the International Conference on Formal Methods and Foundations of Artificial Intelligence
Language:	English
Volume:	61
DOI:	10.33039/ami.2025.10.007
ISSN:	1787-6117 (Online)
Depositing user:	Tibor Gál
Date:	29 Oct 2025 12:33
Last modified:	29 Oct 2025 12:33
URI:	http://publikacio.uni-eszterhazy.hu/id/eprint/8829
