Identification of online harassment using ensemble fine-tuned pre-trained BERT

Identification of online hate is a prime concern for natural language processing researchers; social media has amplified this menace by providing a virtual platform for online harassment. This study identifies online harassment using the Trolling, Aggression, and Cyberbullying dataset from the shared-task workshop. The work concentrates on extensive pre-processing and an ensemble approach for model building; it also evaluates existing algorithms such as random forest, logistic regression, and multinomial Naïve Bayes. Among these, logistic regression proves the most efficient, with the highest accuracy of 57.91%. The ensemble of Bidirectional Encoder Representations from Transformers (BERT) models shows promising results with 62% precision, which is better than most existing models.


INTRODUCTION
With the increasing use of social media among all age groups, its misuse has led to online harassment. Cyber or Internet bullying is bullying through digital media, mainly social media. According to UNICEF, cyber-bullying is repeated behavior intended to scare, anger, or shame those targeted. Examples include spreading lies about someone, sending hurtful messages or threats on social media, and impersonating someone to send mean messages on their behalf [1]. Social media provides a space to discuss various topics related to day-to-day life. There may be narratives and counter-narratives, which is generally regarded as healthy for dissent and discussion; however, some cyber abusers take this opportunity to abuse and shame others. Because users interact online in many languages, the cyber world remains global. In linguistically diverse countries like India and Indonesia, the gap between users writing in their native language and English speakers is noteworthy. Social media giants like Facebook and Twitter have taken several steps to mitigate or eradicate cyber abuse, but it still exists. This study was carried out to identify online harassment in multilingual text. Significant work has been done to detect cyber harassment automatically using traditional supervised machine learning methods such as Support Vector Machines (SVM), Long Short-Term Memory (LSTM) networks, logistic regression, and decision trees [2][3][4]. However, most of that work has been done for the English language. This study uses a fine-tuned uncased Bidirectional Encoder Representations from Transformers (BERT) architecture to identify online harassment in a multilingual dataset. The authors in [5] tried to detect cyber abuse in multilingual data, but they used a plain transformer architecture without fine-tuning or significant preprocessing of the textual data. This study focuses on the well-known ensemble approach to attain higher accuracy.
In preprocessing, lemmatization, stop-word removal, and Parts-of-Speech (PoS) tagging were evaluated to feed the most accurate data to the model. Before using the pre-trained BERT network, the data was fed into various traditional classifiers such as SVM, multinomial Naïve Bayes, and logistic regression; almost all of them achieved similar accuracy. This study used the TRolling, Aggression, and Cyberbullying (TRAC)-1 dataset and achieved accuracy close to state-of-the-art results and above the baseline without extensive fine-tuning.
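The preprocessing steps above can be sketched as follows. This is a minimal, self-contained stand-in (a hand-rolled stop-word list, a small contraction map, and regex tokenization), not the study's actual pipeline, which used full lemmatization, stop-word removal, and PoS tagging:

```python
import re

# Illustrative stop-word list and contraction map; the study's pipeline
# used complete resources (e.g., a full stop-word corpus and lemmatizer).
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in"}
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "i'm": "i am"}

def preprocess(text):
    text = text.lower()
    # Expand contractions before tokenization so "don't" -> "do not".
    for contraction, expansion in CONTRACTIONS.items():
        text = text.replace(contraction, expansion)
    tokens = re.findall(r"[a-z]+", text)            # crude tokenization
    return [t for t in tokens if t not in STOPWORDS]

print(preprocess("I'm sure the trolls don't care!"))
# → ['i', 'am', 'sure', 'trolls', 'do', 'not', 'care']
```

In the full pipeline, the surviving tokens would then be lemmatized and PoS-tagged before being handed to the classifiers.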

MATERIALS AND METHODS
Online harassment can take any form, but it is predominantly rooted in social media. The latest survey by the Pew Research Center [6] finds that 75% of the targets of online abuse, equal to 31% of Americans overall, say their most recent experience of online hate was on social media. Questions have been raised about how social media giants are working to eliminate or mitigate online harassment; about 79% say social media companies are not doing a fair job of addressing harassment and bullying on their platforms. Key findings of an online survey conducted by the American Trends Panel [7] are that 41% of American adults have experienced online hate, and 25% have experienced severe harassment. These disturbing trends have pushed researchers to automate the detection and subsequent eradication of online harassment, which eventually gave rise to online hate detection using Natural Language Processing (NLP). It is quite challenging and perplexing to formalize the notion of abuse: Mishra [8] used the term to discuss racism and sexism, while Nobata [9] referred to hate speech, profanity, and derogatory language. The first reported method for abuse detection was that of Spertus [10] in 1997, who hand-crafted rules over text to generate feature vectors for learning. Dadvar [11] used a social feature engineering technique, called user profiling, that incorporates features and identity traits of a user to model the likelihood of abusive behavior; the user's age was included alongside other lexicon-based features to detect cyber-bullying. In [12], the authors used the gender of Twitter users together with character n-grams to detect sexism and racism in tweets, improving the F1-score from 73.89% to 73.93%. The authors of [13] were the first to use a deep learning model for online harassment detection; they improved accuracy from 78.89% to 80.07%, significantly outperforming the existing traditional methods.
In [4], the authors used an LSTM model with GloVe for feature engineering to detect online abuse; they achieved their best results (a weighted F1 of 93%) by randomly initializing the embeddings. Park and Fung [14] categorized comments collected by combining two datasets and concluded that combining the two granularities using two input channels improves accuracy; other researchers [15][16][17] acknowledge the same. In the GermEval shared task [18], the authors made the winning submission with F1-scores of 76.95% and 53.59% for sub-task 1 and sub-task 2, respectively. Researchers in [19] have shown that jointly learning emotion classification and abuse detection leads to improved performance.

DATASET
For this study, data was collected from the shared task on aggression identification organized at the trolling, aggression, and cyber-bullying workshop [20]. The training data consists of 10,799 randomly selected Facebook comments, annotated into three categories: Overtly AGgressive (OAG), Covertly AGgressive (CAG), and Non-AGgressive (NAG). The test (validation) data comprises 1,200 samples.
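For a classifier, the three aggression categories are typically encoded as integer targets. The sketch below is hypothetical (the example comments are invented, and the label abbreviations follow the TRAC shared-task naming, with the covert class commonly abbreviated CAG):

```python
# Map the three TRAC-1 aggression classes to integer ids for a classifier.
LABEL2ID = {"NAG": 0, "CAG": 1, "OAG": 2}   # non-, covertly, overtly aggressive

# Invented stand-in comments; the real data is 10,799 Facebook comments.
samples = [
    ("have a great day everyone", "NAG"),
    ("oh sure, because you always know best...", "CAG"),
    ("you people are a disgrace", "OAG"),
]
texts = [text for text, _ in samples]
targets = [LABEL2ID[label] for _, label in samples]
print(targets)   # → [0, 1, 2]
```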

RESULTS AND DISCUSSION
Identifying abusive observations gives the victims of abuse validation and allows observers to understand the extent of the problem. This study tried to identify online harassment using pre-trained BERT with an ensemble approach on the TRAC-1 dataset. The most recent study [5] used a plain BERT architecture without considering the importance of preprocessing steps such as handling Not-a-Number (NaN) values, stop-word removal, PoS tagging, contraction expansion, stemming, and lemmatization, which suggests that their model was probably not trained on clean data and may therefore have over-fit [21][22][23][24]. Those researchers also did not consider fine-tuning strategies [25][26], which help the model achieve better results. In this study, all the steps mentioned above were performed to identify abuse in the multilingual text, as shown in Table 1. After preprocessing, the data was fed into well-known existing algorithms such as SVM, Naïve Bayes, logistic regression, and random forest, but owing to the shallow nature of these models, the results obtained were not satisfactory. The accuracy achieved is not a milestone, but it exceeds the baseline of 35.53%, as shown in Fig. 1. Because of the poor performance of these algorithms, a deep learning approach was introduced: the data was fed into pre-trained BERT with a multi-head attention model. BERT works on the mechanism of multi-head attention with a masked language model. It is a language-representation pre-training method used to create models that can then be freely downloaded and used by NLP practitioners. There are two ways to approach a problem with BERT: either use the existing models to extract high-quality language features from text data, or fine-tune them to produce state-of-the-art predictions for a particular task (classification, entity identification, question answering, etc.).
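A shallow baseline of the kind described above can be sketched with scikit-learn. The four toy comments here are invented stand-ins for the TRAC-1 training comments, and the TF-IDF + logistic regression pipeline is one plausible configuration, not necessarily the study's exact setup:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy comments standing in for the TRAC-1 training data.
texts = [
    "you are stupid and worthless",
    "have a lovely day friend",
    "stupid worthless idiot",
    "lovely kind friend day",
]
labels = ["OAG", "NAG", "OAG", "NAG"]

# TF-IDF features fed into logistic regression, one of the shallow
# classifiers evaluated before moving to BERT.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
print(clf.predict(["what a stupid idiot"])[0])   # → OAG
```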
Three main advantages of BERT are quicker development, less required data, and better results. Fine-tuning the model played an important role in increasing the network's performance. The BERT sequence classifier from the transformers library was used for classification. BERT comes in two variants, a base model and a large model; the only difference between the two variants is the size of the network. Fig. 1 shows the results of the machine learning algorithms. This study used the bert-base-uncased model with the number of labels set to 3. Results for various fine-tuning parameters are listed below. BERT consists of the encoder part of the Transformer. The first encoder layer receives a concatenation of WordPiece embeddings and positional embeddings produced from the input sequence as its input representation. An attention function can be characterized as mapping a query and a set of key-value pairs to an output, where the query, keys, values, and output are all vectors. The output is a weighted sum of the values, with the weight assigned to each value determined by a compatibility function of the query with the corresponding key. Given an embedded column vector x for an input sequence, a Query, Key, and Value vector for each input embedding token is built by multiplying the embedding by three learned matrices W^Q, W^K, and W^V, respectively. The Query, Key, and Value vectors are stacked into matrices Q, K, and V for concurrent computation. The self-attention function is therefore given by

Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V,

where d_k is the dimension of the queries and keys. The transformer performs the self-attention function in parallel with multiple attention heads by projecting the queries, keys, and values h times with different learned linear projections to d_k, d_k, and d_v dimensions, respectively.
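The scaled dot-product attention described above can be illustrated numerically. This NumPy sketch uses small random matrices in place of learned projections, purely to show the mechanics:

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights

rng = np.random.default_rng(0)
T, d_k, d_v = 4, 8, 8          # toy sequence length and head dimensions
Q = rng.normal(size=(T, d_k))
K = rng.normal(size=(T, d_k))
V = rng.normal(size=(T, d_v))

out, w = attention(Q, K, V)
print(out.shape)               # (4, 8): one d_v-dim output per position
```

Each row of the attention-weight matrix is a probability distribution over the key positions, which is what makes the output a weighted sum of the values.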
The attention function is performed in parallel on each of these projected versions of the queries, keys, and values, producing d_v-dimensional outputs. The heads are concatenated and projected once more:

MultiHead(x) = Concat(head_1, ..., head_h) W^O,  where  head_i = Attention(W_i^Q Q, W_i^K K, W_i^V V).

Concat is the concatenation function, and the projections W_i^Q, W_i^K, W_i^V, and W^O are parameter matrices. Each transformer layer consists of two sub-layers. The first sub-layer is the multi-head attention, and its normalized output is fed to the second sub-layer, a fully connected feed-forward network whose activation function is ReLU. Formally, for a transformer with M layers, the hidden states are calculated as

H_m = norm(H_{m-1} + MultiHead(H_{m-1})),
H_m = norm(H_m + max(0, H_m W_1 + b_1) W_2 + b_2),

where norm is the normalization function applied after the residual connection, W_1 and W_2 are the weights of the first and second fully connected layers with b_1 and b_2 as bias values, and m ∈ {1, ..., M}.
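The full encoder layer described above (multi-head attention, residual connections with normalization, and the ReLU feed-forward sub-layer) can be sketched as follows. The weight matrices here are randomly initialized stand-ins for the learned parameters W_i^Q, W_i^K, W_i^V, W^O, W_1, and W_2:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_model, h = 4, 16, 4            # sequence length, model width, heads
d_k = d_v = d_model // h            # per-head dimensions

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def layer_norm(x, eps=1e-5):
    mu, var = x.mean(-1, keepdims=True), x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

# Random stand-ins for the learned projection and feed-forward weights.
WQ, WK, WV = (rng.normal(size=(h, d_model, d_k)) for _ in range(3))
WO = rng.normal(size=(h * d_v, d_model))
W1, b1 = rng.normal(size=(d_model, 4 * d_model)), np.zeros(4 * d_model)
W2, b2 = rng.normal(size=(4 * d_model, d_model)), np.zeros(d_model)

def encoder_layer(x):
    # Multi-head attention: h heads computed in parallel, then concatenated.
    heads = []
    for i in range(h):
        Q, K, V = x @ WQ[i], x @ WK[i], x @ WV[i]
        heads.append(softmax(Q @ K.T / np.sqrt(d_k)) @ V)
    attn = np.concatenate(heads, axis=-1) @ WO       # MultiHead(x)
    x = layer_norm(x + attn)                         # sub-layer 1 + residual
    ffn = np.maximum(0, x @ W1 + b1) @ W2 + b2       # ReLU feed-forward
    return layer_norm(x + ffn)                       # sub-layer 2 + residual

x = rng.normal(size=(T, d_model))
print(encoder_layer(x).shape)                        # (4, 16)
```

Stacking M such layers, each consuming the previous layer's hidden states, yields the hidden-state recurrence given above.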
BERT creates a corrupted version x̂ by randomly replacing 15% of the tokens in x with the special symbol [MASK]. If the masked tokens are denoted x̄, the training goal is to reconstruct x̄ from x̂; m_t = 1 denotes that token x_t is masked, e(x) denotes the embedding of x, and H_θ is a transformer that maps a length-T text sequence x to a sequence of hidden vectors H_θ(x) = [H_θ(x)_1, H_θ(x)_2, ..., H_θ(x)_T]. The results with fine-tuning are shown in Tables 2-5; the comparison between existing methods and this approach is shown in Fig. 2.
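The 15% corruption step can be illustrated with a small sketch. The sentence is invented, and this simplification always substitutes [MASK] (the real objective sometimes keeps a selected token unchanged or replaces it with a random token):

```python
import random

random.seed(0)
MASK = "[MASK]"
tokens = "the quick brown fox jumps over the lazy dog".split()

# Select ~15% of token positions to corrupt, as in the masked-LM objective.
n_mask = max(1, round(0.15 * len(tokens)))
positions = set(random.sample(range(len(tokens)), n_mask))
corrupted = [MASK if i in positions else t for i, t in enumerate(tokens)]

print(n_mask)                  # → 1  (9 tokens * 0.15 rounds to one position)
```

The model is then trained to predict the original tokens at the masked positions from the corrupted sequence x̂.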

CONCLUSION
This research identified online harassment on digital media using the well-known dataset from the shared task on identifying trolling, aggression, and cyber-bullying (TRAC-1). Unlike existing studies, which fed semi-preprocessed data to the model, this study preprocessed the data thoroughly by applying techniques such as contraction handling, stemming, lemmatization, and stop-word removal. The preprocessed data was first fed to existing algorithms such as Naïve Bayes and logistic regression, but the accuracy achieved was not on par. This work achieved competitive accuracy compared to state-of-the-art models by using fine-tuning strategies for pre-trained BERT with an ensemble approach. It can also be concluded that as batch size and learning rate increase, accuracy deteriorates and the model starts to over-fit.