Artificial Intelligence in Online Hate Speech Detection

Authors

  • Sadia Khan, Assistant Professor of Computer Science, University of Sargodha
  • Muhammad Bilal, Lecturer in Information Technology, University of Gujrat

Keywords:

artificial intelligence, hate speech detection, natural language processing, machine learning, social media, content moderation

Abstract

This study examines how artificial intelligence (AI) can detect online hate speech, using machine learning and natural language processing techniques to label and mitigate abusive content. A heterogeneous dataset of multilingual social media messages was preprocessed using tokenization, stemming, and contextual embedding models to capture subtle semantic features. The experiments indicated that deep learning models, particularly transformer-based ones, outperformed traditional classifiers in accuracy, recall, and F1-score, exceeding 92 percent detection accuracy while significantly reducing false negatives. A comparative study showed that explainable AI strategies and ensemble models made the models more robust to adversarial text and made their decision-making easier to interpret. Cross-linguistic analysis also showed that AI-based detection systems generalized well across different cultural contexts. These results suggest that AI has the potential to serve as a scalable and reliable instrument for improving content moderation policies, securing online communities, and informing future governance actions against online hate speech.
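
The abstract reports accuracy, recall, and F1-score alongside a reduction in false negatives. As a minimal sketch of how these quantities relate, the Python snippet below computes them from confusion-matrix counts. The class and function names and both sets of counts are hypothetical and chosen only for illustration; they are not taken from the study's data or results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with hate speech as the positive class."""
    tp: int  # hateful messages correctly flagged
    fp: int  # benign messages wrongly flagged
    fn: int  # hateful messages missed (false negatives)
    tn: int  # benign messages correctly passed


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return numerator / denominator if denominator else 0.0


def accuracy(c: ConfusionCounts) -> float:
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn)


def precision(c: ConfusionCounts) -> float:
    return _ratio(c.tp, c.tp + c.fp)


def recall(c: ConfusionCounts) -> float:
    # Recall falls as false negatives rise: tp / (tp + fn).
    return _ratio(c.tp, c.tp + c.fn)


def f1(c: ConfusionCounts) -> float:
    # Harmonic mean of precision and recall.
    p, r = precision(c), recall(c)
    return _ratio(2 * p * r, p + r)


# Illustrative counts only (assumed, not from the paper): the second
# classifier misses fewer hateful messages than the first.
baseline = ConfusionCounts(tp=80, fp=20, fn=40, tn=860)
improved = ConfusionCounts(tp=110, fp=20, fn=10, tn=860)

for name, counts in (("baseline", baseline), ("improved", improved)):
    print(name, round(accuracy(counts), 3), round(recall(counts), 3),
          round(f1(counts), 3))
```

With the same number of false positives, lowering the false-negative count raises recall directly and F1 with it, which is why the two are usually reported together for moderation systems where missed hate speech is costly.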

Published

2025-06-30