Background

Online extremist narratives can polarize communities and amplify misinformation. Traditional detection methods often overlook the relational dynamics of online discourse. This study introduces a hybrid BERT-GNN model that combines linguistic features with network structures to identify extremist content and map influence within BLM Twitter conversations.

Objectives

  • 01 — Develop and validate a tweet classification pipeline.
    Manually labeled tweets (Extremist / Non-extremist) were used to fine-tune a BERT model, producing robust content labels across the dataset.
  • 02 — Construct and train a GNN for link prediction.
    Build a social graph of users and labeled tweets, and train a GNN to predict which user–tweet pairs are likely to form engagement links.
  • 03 — Evaluate model performance and robustness.
    Measure performance with standard classification metrics and run ablation studies comparing GNN configurations.
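Objective 02 frames engagement as link prediction: observed user–tweet interactions serve as positive edges, and unobserved pairs are sampled as negatives for training. A minimal sketch of this setup (the helper names and uniform sampling strategy are illustrative assumptions, not the project's exact code):

```python
import random

def positive_edges(engagements):
    """Observed user-tweet interactions become positive training edges."""
    users = sorted({u for u, t in engagements})
    tweets = sorted({t for u, t in engagements})
    return users, tweets, set(engagements)

def sample_negatives(users, tweets, pos, k, seed=0):
    """Sample k user-tweet pairs with no observed engagement (negative edges)."""
    rng = random.Random(seed)
    neg = set()
    while len(neg) < k:
        pair = (rng.choice(users), rng.choice(tweets))
        if pair not in pos:
            neg.add(pair)
    return sorted(neg)
```

In practice, negative sampling ratios and strategies (e.g., hard negatives) materially affect link-prediction metrics, which is one motivation for the ablation studies in objective 03.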

Methodology

  • Data Collection
    Collect tweets using RapidAPI endpoints or scraping tools such as Tweety.
  • Data Preprocessing
    Clean and preprocess the collected data to ensure quality and consistency.
  • BERT-GNN
    Use the fine-tuned BERT model to label tweets, then feed the resulting labels and embeddings into the GNN for training and link prediction.
  • Evaluation & Deployment
    Evaluate model performance and deploy the pipeline for real-time analysis.
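The GNN stage propagates information by aggregating each node's neighborhood features. As an illustration of the mean-aggregation rule used by GraphSAGE-style layers (a toy NumPy sketch with a dense adjacency matrix, not the project's actual implementation, which would use a GNN library):

```python
import numpy as np

def sage_mean_layer(x, adj, w_self, w_neigh):
    """One GraphSAGE-style layer: combine each node's own features with
    the mean of its neighbours' features, then apply a ReLU."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0           # avoid division by zero for isolated nodes
    neigh_mean = (adj @ x) / deg  # mean over neighbour feature vectors
    h = x @ w_self + neigh_mean @ w_neigh
    return np.maximum(h, 0.0)     # ReLU non-linearity
```

Here the BERT outputs would supply the initial node features `x` for tweet nodes, so the learned representations mix textual and structural signals before the link-prediction head scores user–tweet pairs.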

Results

Model       Accuracy  Precision  Recall  F1 Score  PR AUC
GCN         0.71      0.471      0.567   0.508     0.758
GraphSAGE   0.887     0.815      0.999   0.898     0.925

The GraphSAGE model achieved the highest overall performance, with an accuracy of 0.887, F1-score of 0.898, and PR-AUC of 0.925, indicating strong capability in correctly identifying both positive and negative links within the graph. Its recall of 0.999 suggests that nearly all true positive links were detected, though the lower precision (0.815) indicates a modest number of false positives. This behavior implies that GraphSAGE effectively generalizes the contextual and relational features learned from the BERT HateXplain embeddings, capturing fine-grained semantic cues and user–tweet relationships within the network.
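The reported F1-score is internally consistent with the precision and recall above, since F1 is their harmonic mean. A quick check:

```python
def f1_score(precision, recall):
    # F1 is the harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(0.815, 0.999)
print(round(f1, 3))  # → 0.898, matching the reported GraphSAGE F1
```

The harmonic mean penalizes imbalance between the two components, which is why the near-perfect recall does not fully offset the lower precision here.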