Overview

A collection of keyword lists used to censor chat messages on chat apps and live streaming apps used in China. Each of these apps implements censorship on the client side, which allows us to reverse engineer them and extract the keyword lists used to block content in chat messages. Our analysis shows how keyword lists are downloaded to the applications and, where encryption is present, how encryption and decryption of the lists are implemented. Changes to keyword lists are tracked over the data collection period for each application.
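As a rough illustration of what tracking changes between two collected snapshots of a keyword list could look like, here is a minimal Python sketch. The function names, the placeholder keywords, and the substring-matching rule in `is_blocked` are assumptions made for illustration; they are not taken from any of the analyzed apps, which may match keywords differently.

```python
def diff_keyword_lists(old_keywords, new_keywords):
    """Return (added, removed) keywords between two snapshots of a list."""
    old_set, new_set = set(old_keywords), set(new_keywords)
    added = sorted(new_set - old_set)
    removed = sorted(old_set - new_set)
    return added, removed


def is_blocked(message, keywords):
    """Assumed client-side check: block if any keyword occurs in the message."""
    return any(keyword in message for keyword in keywords)


# Placeholder snapshots (hypothetical data, not real keywords).
snapshot_old = ["alpha", "beta"]
snapshot_new = ["beta", "gamma"]

added, removed = diff_keyword_lists(snapshot_old, snapshot_new)
print("added:", added)
print("removed:", removed)
print("blocked:", is_blocked("a message containing gamma", snapshot_new))
```

Comparing sets rather than line-by-line text keeps the diff insensitive to reordering of the list between downloads.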

Full details on the reverse engineering methods and results are available in the reports below:

[Chat program censorship and surveillance in China: Tracking TOM-Skype and Sina UC](http://firstmonday.org/ojs/index.php/fm/article/view/4628/3727)

[Asia Chats: Investigating Regionally-based Keyword Censorship in LINE](https://citizenlab.org/2013/11/asia-chats-investigatingregionally-based-keyword-censorship-line/)

[Every Rose Has Its Thorn: Censorship and Surveillance on Social Video Platforms in China](https://www.usenix.org/conference/foci15/workshop-program/presentation/knockel)

Keyword Content Analysis

Datasets include raw keyword lists collected from the applications and processed datasets that include translations of keywords. Keywords were translated to English using a combination of machine and human translation. By interpreting these translations with contextual information, we coded each keyword into content categories grouped under six general [themes](https://github.com/citizenlab/chat-censorship/blob/master/themes_keyword_censorship.csv) according to a [code book](https://github.com/citizenlab/chat-censorship/blob/master/categories_keyword_censorship.csv).
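To show how coded keywords might be rolled up from categories into themes, here is a small Python sketch. The column names (`keyword`, `translation`, `category`, `theme`) and the inline sample rows are hypothetical; check the headers of the actual CSV files before adapting it.

```python
import csv
import io
from collections import Counter

# Hypothetical inline samples; the real files may use different columns.
CODED_KEYWORDS = """keyword,translation,category
kw1,example one,cat_a
kw2,example two,cat_b
kw3,example three,cat_a
"""

CODE_BOOK = """category,theme
cat_a,theme_x
cat_b,theme_y
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def count_by_theme(keyword_rows, code_book_rows):
    """Count keywords per theme by mapping each category to its theme."""
    category_to_theme = {row["category"]: row["theme"] for row in code_book_rows}
    counts = Counter()
    for row in keyword_rows:
        # Keywords whose category is missing from the code book are kept visible.
        counts[category_to_theme.get(row["category"], "uncoded")] += 1
    return counts


theme_counts = count_by_theme(load_rows(CODED_KEYWORDS), load_rows(CODE_BOOK))
for theme, count in sorted(theme_counts.items()):
    print(theme, count)
```

To run this against files on disk, `load_rows` could be swapped for a version that opens the file with `encoding="utf-8"` and `newline=""`, as the `csv` module documentation recommends.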

License

All data is provided under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license, available in full here and summarized here.
