Ahmad Hussein | AI Systems Architect

Introduction

Fact-checking is essential for combating misinformation, but manual verification cannot scale to the volume of content produced daily. This research presents an automated system for identifying "check-worthy" claims that deserve fact-checker attention.

The CLEF 2020 CheckThat! Lab

We participated in Task 1 of the CheckThat! Lab, which focused on ranking sentences by their check-worthiness. The task included multiple languages, and we developed a robust multilingual approach.
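To make the ranking framing concrete, here is a minimal sketch of average precision, a metric commonly used to score ranked lists of this kind. Whether it matches the lab's official metric for every language is an assumption, and the function name and toy data are purely illustrative.

```python
def average_precision(scores, labels):
    """Average precision of the ranking induced by `scores`.

    scores: model scores, higher means more check-worthy
    labels: 1 if the sentence is truly check-worthy, else 0
    """
    # Rank sentence indices from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / hits if hits else 0.0


# Toy example with made-up scores and labels.
scores = [0.9, 0.2, 0.7, 0.4]
labels = [1, 0, 0, 1]
print(average_precision(scores, labels))
```

A system that places every check-worthy sentence above every other sentence scores 1.0, which is why the model only needs to produce a well-ordered score rather than a calibrated probability.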

Architecture

Hybrid Design

Our model combines the strengths of two architectures:

CNN Branch:

Multiple filter sizes (3, 4, 5)

Captures local n-gram patterns

Good for detecting specific phrases and keywords
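The CNN branch described above might look roughly like the sketch below. The filter sizes (3, 4, 5) come from the text; the number of filters per size, the ReLU, the max-over-time pooling, and the projection to 256 features are assumptions chosen so the output fits a 768-dimensional fused vector. It is not the actual MultiScaleCNN used in our system.

```python
import torch
import torch.nn as nn


class MultiScaleCNN(nn.Module):
    """Parallel 1-D convolutions with kernel sizes 3, 4 and 5, max-pooled over time."""

    def __init__(self, embed_dim, num_filters=64, kernel_sizes=(3, 4, 5), out_dim=256):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes]
        )
        # Project the concatenated filter responses to a fixed size (assumed 256).
        self.proj = nn.Linear(num_filters * len(kernel_sizes), out_dim)

    def forward(self, embedded):
        # embedded: (batch, seq_len, embed_dim); Conv1d expects (batch, channels, seq_len).
        x = embedded.transpose(1, 2)
        # Each conv yields (batch, num_filters, seq_len - k + 1); the max over time
        # keeps the strongest n-gram match per filter.
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.proj(torch.cat(pooled, dim=1))  # (batch, out_dim)


cnn = MultiScaleCNN(embed_dim=100)
features = cnn(torch.randn(2, 12, 100))  # 2 sentences, 12 tokens each
print(features.shape)  # torch.Size([2, 256])
```

Input sentences need at least as many tokens as the largest kernel (5 here), so shorter inputs would have to be padded first.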

BiLSTM Branch:

Bidirectional processing

Captures long-range dependencies

Understands sentence-level context
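A short sketch of what the BiLSTM branch produces, using PyTorch's nn.LSTM. The hidden size of 256 matches the implementation shown later; the embedding size, batch size, and the batch_first=True layout are illustrative assumptions.

```python
import torch
import torch.nn as nn

embed_dim, hidden = 100, 256  # embed_dim is illustrative
lstm = nn.LSTM(embed_dim, hidden, bidirectional=True, batch_first=True)

embedded = torch.randn(2, 12, embed_dim)  # (batch, seq_len, embed_dim)
states, (h_n, _) = lstm(embedded)

# Every token gets a forward and a backward state, concatenated: 2 * 256 = 512.
print(states.shape)  # torch.Size([2, 12, 512])
# h_n holds the final state of each direction: (num_directions, batch, hidden).
print(h_n.shape)  # torch.Size([2, 2, 256])
```

Because each token's state sees both its left and right context, a claim whose key detail arrives late in the sentence can still influence the representation of earlier tokens.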

Fusion Strategy

The outputs are concatenated and passed through:

Dense layers with dropout

Attention mechanism for interpretability

Final sigmoid for ranking score
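One plausible way to wire the fusion step is sketched below. It assumes attention is applied to the 512-dimensional BiLSTM states before concatenation with a 256-dimensional CNN vector. The class name FusionHead, the simple learned attention pooling, the hidden size, and the dropout rate are all hypothetical stand-ins, not our actual modules.

```python
import torch
import torch.nn as nn


class FusionHead(nn.Module):
    """Attention-pool BiLSTM states, concatenate with CNN features, score in (0, 1)."""

    def __init__(self, lstm_dim=512, cnn_dim=256, hidden=128, dropout=0.3):
        super().__init__()
        self.attn_score = nn.Linear(lstm_dim, 1)  # one relevance score per token
        self.dense = nn.Sequential(
            nn.Linear(lstm_dim + cnn_dim, hidden),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden, 1),
        )

    def forward(self, lstm_states, cnn_features):
        # lstm_states: (batch, seq_len, lstm_dim); cnn_features: (batch, cnn_dim)
        weights = torch.softmax(self.attn_score(lstm_states).squeeze(-1), dim=1)
        # Weighted sum of token states; `weights` can be inspected per token.
        context = (weights.unsqueeze(-1) * lstm_states).sum(dim=1)
        fused = torch.cat([context, cnn_features], dim=1)  # (batch, 768)
        score = torch.sigmoid(self.dense(fused)).squeeze(-1)
        return score, weights


head = FusionHead().eval()  # eval() disables dropout for a deterministic demo
with torch.no_grad():
    score, weights = head(torch.randn(2, 12, 512), torch.randn(2, 256))
print(score.shape, weights.shape)  # torch.Size([2]) torch.Size([2, 12])
```

Returning the attention weights alongside the score is what supports interpretability: the weights for a sentence sum to 1, so they show which tokens contributed most to its ranking score.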

Implementation Details

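import torch.nn as nn

# MultiScaleCNN and SelfAttention are custom modules defined elsewhere (not shown).
class HybridModel(nn.Module):
    def __init__(self, vocab_size, embed_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # CNN branch over the embedded tokens
        self.cnn = MultiScaleCNN(embed_dim)
        # BiLSTM: 256 units per direction, so 512 features per token
        self.lstm = nn.LSTM(embed_dim, 256, bidirectional=True)
        # Attention sized for the 512-dim BiLSTM output
        self.attention = SelfAttention(512)
        # 768 = 512 (BiLSTM) + 256, which implies a 256-dim CNN output
        self.classifier = nn.Linear(768, 1)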

Results

Our system ranked in the top 5 for the Arabic track and demonstrated competitive performance across all languages tested.

Lessons Learned

In our experiments, the hybrid architecture outperformed single-model approaches

Language-specific preprocessing was crucial

Ensembling provided additional gains
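As an illustration of the language-specific preprocessing lesson, the sketch below shows a few normalization steps that are common for Arabic text: stripping diacritics and tatweel, and unifying alef and yeh variants. These particular rules and the function name normalize_arabic are assumptions for illustration. They are not a record of the preprocessing our system actually used.

```python
import re

# Arabic diacritics (tashkeel, U+064B to U+0652) and tatweel (U+0640).
_DIACRITICS = re.compile("[\u064B-\u0652\u0640]")
# Map common letter variants to a single canonical form.
_CHAR_MAP = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> alef
    "\u0625": "\u0627",  # alef with hamza below -> alef
    "\u0622": "\u0627",  # alef with madda -> alef
    "\u0649": "\u064A",  # alef maksura -> yeh
})


def normalize_arabic(text):
    """Strip diacritics and tatweel, unify letter variants, collapse whitespace."""
    text = _DIACRITICS.sub("", text)
    text = text.translate(_CHAR_MAP)
    return re.sub(r"\s+", " ", text).strip()


# The name "Ahmad" written with diacritics, reduced to its bare letters.
cleaned = normalize_arabic("\u0623\u064E\u062D\u0652\u0645\u064E\u062F")
```

Normalization like this shrinks the vocabulary the embedding layer has to cover, since spelling variants of the same word collapse to one token.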