Automating Fact-Checking: A Hybrid CNN/RNN Approach
A hybrid deep learning model for fact-checking systems, combining CNN and RNN architectures. Presented at CLEF 2020.
Introduction
Fact-checking is essential for combating misinformation, but manual verification cannot scale to the volume of content produced daily. This research presents an automated system for identifying "check-worthy" claims that deserve fact-checker attention.
The CLEF 2020 CheckThat! Lab
We participated in Task 1 of the CheckThat! Lab, which focused on ranking sentences by their check-worthiness. Because the task covered multiple languages, we designed our approach to be multilingual.
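To make the ranking setup concrete, here is a minimal sketch of turning per-sentence scores into a ranked list and scoring that list with average precision. The helper names `rank_by_score` and `average_precision` are hypothetical, and average precision is used only as a common ranking metric; this post does not state which metric the lab used.

```python
from typing import List, Tuple


def rank_by_score(sentences: List[str], scores: List[float]) -> List[Tuple[str, float]]:
    """Return (sentence, score) pairs sorted from most to least check-worthy."""
    return sorted(zip(sentences, scores), key=lambda pair: pair[1], reverse=True)


def average_precision(ranked_labels: List[int]) -> float:
    """Average precision for a ranked list of 0/1 labels (1 = check-worthy)."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == 1:
            hits += 1
            # Precision at this rank, counted only where a relevant item appears
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


if __name__ == "__main__":
    ranked = rank_by_score(["a", "b", "c"], [0.2, 0.9, 0.5])
    print([sentence for sentence, _ in ranked])
    print(average_precision([1, 0, 1]))
```

A model that places every check-worthy sentence ahead of every other sentence gets an average precision of 1.0 under this definition.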
Architecture
Hybrid Design
Our model combines the strengths of two architectures:
CNN Branch:
- Multiple filter sizes (3, 4, 5)
- Captures local n-gram patterns
- Good for detecting specific phrases and keywords
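The CNN branch described above could look roughly like the following sketch. The class name `MultiScaleCNN` comes from the model code later in this post, but its internals are not shown there, so everything here is an assumption: the filter count (`num_filters=100`), the ReLU activation, and the max-over-time pooling.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleCNN(nn.Module):
    """Parallel 1-D convolutions over token embeddings, one per filter size."""

    def __init__(self, embed_dim: int, num_filters: int = 100, kernel_sizes=(3, 4, 5)):
        super().__init__()
        # One convolution per n-gram width (3, 4, and 5 tokens)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, kernel_size=k) for k in kernel_sizes]
        )
        self.output_dim = num_filters * len(kernel_sizes)

    def forward(self, embedded: torch.Tensor) -> torch.Tensor:
        # embedded: (batch, seq_len, embed_dim); Conv1d expects (batch, channels, seq_len)
        x = embedded.transpose(1, 2)
        # Max-over-time pooling keeps the strongest response of each filter
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return torch.cat(pooled, dim=1)  # (batch, output_dim)


if __name__ == "__main__":
    cnn = MultiScaleCNN(embed_dim=32)
    features = cnn(torch.randn(2, 10, 32))
    print(features.shape)
```

Each filter responds to a window of 3, 4, or 5 consecutive tokens, which is why this branch suits phrase and keyword detection. Input sequences need at least 5 tokens (or padding) for the widest filter to apply.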
BiLSTM Branch:
- Bidirectional processing
- Captures long-range dependencies
- Understands sentence-level context
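As a small illustration of the bidirectional branch, the sketch below runs a PyTorch `nn.LSTM` with 256 hidden units per direction and inspects the output shapes. The embedding size, batch size, and `batch_first=True` are assumptions chosen for the example.

```python
import torch
import torch.nn as nn

embed_dim, hidden = 32, 256  # embed_dim is an arbitrary example value
lstm = nn.LSTM(embed_dim, hidden, bidirectional=True, batch_first=True)

embedded = torch.randn(2, 10, embed_dim)  # (batch, seq_len, embed_dim)
states, (h_n, c_n) = lstm(embedded)

# states holds forward and backward hidden states side by side for every token:
# shape (batch, seq_len, 2 * hidden)
print(states.shape)

# h_n has shape (2, batch, hidden) for a single-layer BiLSTM:
# h_n[0] is the final forward state, h_n[1] the final backward state
sentence_vec = torch.cat([h_n[0], h_n[1]], dim=1)  # (batch, 2 * hidden)
print(sentence_vec.shape)
```

The forward direction finishes at the last token and the backward direction finishes at the first, so concatenating the two final states gives a fixed-size summary that has seen the whole sentence from both ends.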
Fusion Strategy
The outputs of the two branches are concatenated and passed through:
- Dense layers with dropout
- Attention mechanism for interpretability
- Final sigmoid for ranking score
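One plausible way to wire these pieces together is sketched below. The names `AttentionPooling` and `FusionHead`, the hidden size, the dropout rate, and the exact ordering (attention pooling over the BiLSTM states before concatenation) are assumptions for illustration, not the published configuration.

```python
import torch
import torch.nn as nn


class AttentionPooling(nn.Module):
    """Learned attention that collapses a sequence of states into one vector."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, states: torch.Tensor):
        # states: (batch, seq_len, dim)
        weights = torch.softmax(self.score(states).squeeze(-1), dim=1)  # (batch, seq_len)
        pooled = torch.bmm(weights.unsqueeze(1), states).squeeze(1)     # (batch, dim)
        return pooled, weights


class FusionHead(nn.Module):
    """Concatenate CNN and BiLSTM features, then score with dense + sigmoid."""

    def __init__(self, cnn_dim: int, lstm_dim: int, hidden_dim: int = 128, dropout: float = 0.5):
        super().__init__()
        self.attention = AttentionPooling(lstm_dim)
        self.dense = nn.Sequential(
            nn.Linear(cnn_dim + lstm_dim, hidden_dim), nn.ReLU(), nn.Dropout(dropout)
        )
        self.out = nn.Linear(hidden_dim, 1)

    def forward(self, cnn_feats: torch.Tensor, lstm_states: torch.Tensor):
        pooled, weights = self.attention(lstm_states)
        fused = torch.cat([cnn_feats, pooled], dim=1)
        # Sigmoid maps the logit to a score in [0, 1] used for ranking
        score = torch.sigmoid(self.out(self.dense(fused))).squeeze(-1)
        return score, weights


if __name__ == "__main__":
    head = FusionHead(cnn_dim=256, lstm_dim=512).eval()
    score, weights = head(torch.randn(4, 256), torch.randn(4, 12, 512))
    print(score.shape, weights.shape)
```

Returning the attention weights alongside the score is what supports interpretability: for each sentence, the weights show which tokens contributed most to the pooled representation.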
Implementation Details
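import torch.nn as nn

class HybridModel(nn.Module):
    def __init__(self, vocab_size, embed_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # MultiScaleCNN and SelfAttention are custom modules defined elsewhere (not shown)
        self.cnn = MultiScaleCNN(embed_dim)
        # BiLSTM: 256 hidden units per direction, so 512 features per token
        self.lstm = nn.LSTM(embed_dim, 256, bidirectional=True)
        self.attention = SelfAttention(512)
        # 768 fused features in (presumably 512 from the BiLSTM plus 256 from the CNN),
        # one check-worthiness logit out
        self.classifier = nn.Linear(768, 1)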
Results
Our system ranked in the top 5 for the Arabic track and demonstrated competitive performance across all languages tested.
Lessons Learned
- In our experiments, hybrid architectures outperformed single-model approaches
- Language-specific preprocessing is crucial
- Ensemble methods provide additional gains
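As an illustration of what language-specific preprocessing can mean for Arabic, here is a minimal normalization sketch. The function name `normalize_arabic` and the choice of steps (stripping diacritics, removing tatweel, unifying alef variants) are assumptions based on common practice; the post does not say which steps the system actually used.

```python
import re

# Arabic diacritics (tashkeel), U+064B to U+0652, plus superscript alef U+0670
_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")
# Tatweel (kashida), a purely decorative elongation character
_TATWEEL = "\u0640"
# Map alef with madda / hamza above / hamza below to bare alef
_ALEF_VARIANTS = str.maketrans({"\u0622": "\u0627", "\u0623": "\u0627", "\u0625": "\u0627"})


def normalize_arabic(text: str) -> str:
    """Reduce spelling variation so equivalent words share one surface form."""
    text = _DIACRITICS.sub("", text)
    text = text.replace(_TATWEEL, "")
    return text.translate(_ALEF_VARIANTS)


if __name__ == "__main__":
    # A fully vocalized word collapses to its bare-letter form
    print(normalize_arabic("\u0623\u064E\u062D\u0652\u0645\u064E\u062F"))
```

Normalization like this shrinks the vocabulary the embedding layer has to cover, since the same word often appears with and without diacritics or with different alef forms.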
Written by
Ahmad Hussein
AI Systems Architect specializing in LLM orchestration, RAG systems, and scalable cloud platforms. Building secure AI solutions for Fintech, Robotics, and Legal-tech.