Profiling Hate Speech Spreaders on Twitter

I placed 4th on the PAN leaderboard in this competition.

Task

Hate speech (HS) is commonly defined as any communication that disparages a person or a group on the basis of some characteristic such as race, colour, ethnicity, gender, sexual orientation, nationality, religion, or other characteristics. Given the huge amount of user-generated content on Twitter, detecting HS, and thereby possibly countering its diffusion, is becoming fundamental, for instance in the fight against misogyny and xenophobia. To this end, in this task, we aim to identify possible hate speech spreaders on Twitter as a first step towards preventing hate speech from being propagated among online users.

After having addressed several aspects of author profiling in social media from 2013 to 2020 (fake news spreaders, bot detection, age and gender, also together with personality, gender and language variety, and gender from a multimodality perspective), this year we aim to investigate whether it is possible to discriminate authors who have shared some hate speech in the past from those who, to the best of our knowledge, have never done so.

Dataset

The data is taken from the PAN website. The data uploaded to this GitHub repo is password-protected, so you need to replace it with your own copy.

Code

To reproduce the results, run all notebooks for a particular task. Each notebook produces two CSV files. Put all CSV files produced by the notebooks for that task in one folder (submissions in our case), then run script.py to average the predictions.
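The averaging step can be sketched roughly as follows. This is a minimal illustration, not the contents of script.py: the function name `average_predictions`, the column names `id` and `score`, and the 0.5 decision threshold are all assumptions made for the example, and the real CSV layout produced by the notebooks may differ.

```python
import csv
from collections import defaultdict
from pathlib import Path


def average_predictions(folder, id_col="id", score_col="score", threshold=0.5):
    """Average per-author scores across every CSV file in `folder`.

    The column names and the threshold are hypothetical; adjust them to
    match the files your notebooks actually write.
    """
    scores = defaultdict(list)

    # Collect every score reported for each author across all CSV files.
    for csv_path in sorted(Path(folder).glob("*.csv")):
        with open(csv_path, newline="") as handle:
            for row in csv.DictReader(handle):
                scores[row[id_col]].append(float(row[score_col]))

    # Map each author to (mean score, binary label from the threshold).
    averaged = {}
    for author_id, values in scores.items():
        mean_score = sum(values) / len(values)
        averaged[author_id] = (mean_score, int(mean_score >= threshold))
    return averaged
```

Calling `average_predictions("submissions")` would return a dictionary keyed by author id. Averaging scores rather than hard labels lets a confident model outweigh an uncertain one, which is one common reason to ensemble this way.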

Result

Validation result

Five-fold cross-validation result
English: 75%
Spanish: 85%

Test result

English: 72%
Spanish: 82%

Paper

@inproceedings{anwar2021identify,
  title={Identify Hate Speech Spreaders on Twitter using Transformer Embeddings Features and AutoML Classifiers.},
  author={Anwar, Talha},
  booktitle={CLEF (Working Notes)},
  pages={1808--1812},
  year={2021}
}

talhaanwarch/Profiling-Hate-Speech-Spreaders-on-Twitter
