ML Anomaly Scoring — AI-Powered Transaction Graph Analysis
Introducing ML Anomaly Scoring for ChainAnalyzer. Machine learning models analyze transaction graphs to automatically detect anomalous wallet behavior patterns that rule-based checks may miss.
Why ML Detection
Existing rule-based detection (62 checks across 5 chains) accurately catches known patterns such as OFAC sanctions, mixer contact, and peel chains. However, rules alone can miss unknown attack methods and complex combinations of patterns.
ML anomaly detection learns the structural features of transaction graphs and statistically identifies "behavior patterns that differ from normal wallets." If rule-based detection is a "microscope" for specific patterns, ML is a "radar" detecting anomalies from the big picture.
2-Model Ensemble
Two different anomaly detection approaches are combined for improved accuracy:
- Anomaly Detection Model — A decision tree-based model that detects data points that are "easy to isolate" from normal data. Particularly effective at outlier detection, efficiently finding the rare drainer/laundering wallets.
- Reconstruction Model — A neural network that learns to compress and reconstruct normal data, flagging data it cannot reconstruct well (i.e., deviating from normal patterns) as anomalous. Captures complex multi-dimensional relationships.
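The "easy to isolate" idea in the first bullet resembles the Isolation Forest technique. The source does not name the algorithm, so treating it as such is an assumption. The following is a minimal pure-Python sketch of isolation-based scoring, not the ChainAnalyzer implementation: the function names, the tree depth limit, and the choice to build every tree on the full data set (rather than on subsamples) are all illustrative.

```python
import math
import random

def _c(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(rows, rng, depth=0, max_depth=8):
    """Split rows on a random feature at a random threshold until isolated."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # simplification: stop if the chosen feature cannot split
        return {"size": len(rows)}
    threshold = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < threshold]
    right = [r for r in rows if r[feature] >= threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(left, rng, depth + 1, max_depth),
        "right": build_tree(right, rng, depth + 1, max_depth),
    }

def path_length(tree, row, depth=0):
    """Number of splits needed to reach the leaf that contains row."""
    if "feature" not in tree:
        return depth + _c(tree["size"])  # adjust for unsplit leaves
    branch = "left" if row[tree["feature"]] < tree["threshold"] else "right"
    return path_length(tree[branch], row, depth + 1)

def isolation_scores(rows, n_trees=100, seed=0):
    """Return one score per row in (0, 1]; short average paths give high scores."""
    rng = random.Random(seed)
    trees = [build_tree(rows, rng) for _ in range(n_trees)]
    norm = _c(len(rows))
    return [
        2.0 ** (-sum(path_length(t, r) for t in trees) / (n_trees * norm))
        for r in rows
    ]
```

A wallet whose feature vector sits far from the bulk of the data is separated after very few random splits, so its average path is short and its score approaches 1. Wallets inside a dense cluster need many splits and score closer to 0.5 or below.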
Graph Features
Multiple structural features are extracted from the transaction graph for each wallet:
- Connection Structure — In-degree, out-degree, degree centrality
- Fund Flow — Total received, total sent, net flow
- Counterparties — Unique senders, unique receivers
- Transaction Patterns — Average received, average sent, max single transfer, in/out ratio
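As a rough illustration of the feature list above, the sketch below derives the same kinds of features from a flat list of transfers. The input shape `(sender, receiver, amount)`, the function name `wallet_features`, and the exact definitions of degree centrality (degree divided by the number of other wallets) and in/out ratio (incoming count over outgoing count) are assumptions; the source does not define them.

```python
from collections import defaultdict

def wallet_features(transfers):
    """Compute per-wallet graph features from (sender, receiver, amount) tuples."""
    received, sent = defaultdict(list), defaultdict(list)
    senders, receivers = defaultdict(set), defaultdict(set)
    for src, dst, amount in transfers:
        sent[src].append(amount)
        received[dst].append(amount)
        receivers[src].add(dst)
        senders[dst].add(src)

    wallets = set(sent) | set(received)
    others = max(len(wallets) - 1, 1)  # denominator for degree centrality
    features = {}
    for w in wallets:
        ins, outs = received.get(w, []), sent.get(w, [])
        total_in, total_out = sum(ins), sum(outs)
        features[w] = {
            # Connection structure
            "in_degree": len(ins),
            "out_degree": len(outs),
            "degree_centrality": (len(ins) + len(outs)) / others,
            # Fund flow
            "total_received": total_in,
            "total_sent": total_out,
            "net_flow": total_in - total_out,
            # Counterparties
            "unique_senders": len(senders.get(w, ())),
            "unique_receivers": len(receivers.get(w, ())),
            # Transaction patterns
            "avg_received": total_in / len(ins) if ins else 0.0,
            "avg_sent": total_out / len(outs) if outs else 0.0,
            "max_transfer": max(ins + outs, default=0.0),
            "in_out_ratio": len(ins) / max(len(outs), 1),  # assumed definition
        }
    return features
```

For example, `wallet_features([("A", "B", 10.0), ("A", "C", 5.0), ("B", "C", 2.0))["C"]` describes a wallet with two incoming transfers, no outgoing transfers, and a net flow of 7.0.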
Features are normalized with a log1p transform (logarithmic compression) and StandardScaler, optimized for the power-law distribution typical of blockchain data.
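The normalization step can be sketched without any dependencies. The function below applies log1p to one feature column and then standardizes it to zero mean and unit variance using the population standard deviation, which is what scikit-learn's StandardScaler does. The name `log1p_standardize` is hypothetical, and the sign-preserving variant of log1p is an assumption added so that signed features such as net flow do not raise an error; the source does not say how negative values are handled.

```python
import math
from statistics import fmean, pstdev

def log1p_standardize(column):
    """Compress a heavy-tailed feature column, then scale it to mean 0 and std 1."""
    # Sign-preserving log1p (assumption); equals log1p(v) for non-negative v.
    logged = [math.copysign(math.log1p(abs(v)), v) for v in column]
    mean = fmean(logged)
    std = pstdev(logged) or 1.0  # constant column: divide by 1, as StandardScaler does
    return [(v - mean) / std for v in logged]
```

Without the logarithm, a single whale wallet with a transfer volume several orders of magnitude above the median would dominate the mean and variance, squeezing every other wallet into a narrow band near the same standardized value.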
Integration with Rule-Based Detection
ML anomaly detection complements rather than replaces existing rule-based detection:
- If rule-based score is already HIGH/CRITICAL → ML detections are not added (avoids redundancy)
- If rule-based score is LOW/MEDIUM but ML score is high → additional detections are injected:
  - ML score > 0.90 → ML_HIGH_ANOMALY (added as MEDIUM detection)
  - ML score > 0.75 → ML_ANOMALY_DETECTED (added as MEDIUM detection)
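The rules above can be expressed as a small decision function. This is a sketch under two assumptions: the two thresholds are treated as mutually exclusive tiers (a score above 0.90 yields only ML_HIGH_ANOMALY), and the function name, argument names, and returned dictionary shape are hypothetical.

```python
from typing import Optional

ML_HIGH_THRESHOLD = 0.90
ML_ANOMALY_THRESHOLD = 0.75

def ml_detection(rule_severity: str, ml_score: Optional[float]) -> Optional[dict]:
    """Return the ML detection to inject, or None if nothing should be added."""
    if ml_score is None:
        return None  # ML scoring did not run
    if rule_severity in {"HIGH", "CRITICAL"}:
        return None  # rules already flag the wallet; avoid redundancy
    if ml_score > ML_HIGH_THRESHOLD:
        return {"code": "ML_HIGH_ANOMALY", "severity": "MEDIUM"}
    if ml_score > ML_ANOMALY_THRESHOLD:
        return {"code": "ML_ANOMALY_DETECTED", "severity": "MEDIUM"}
    return None
```

Checking the higher threshold first keeps the tiers from overlapping. Both comparisons are strict, matching the ">" in the rules, so a score of exactly 0.75 adds nothing.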
Graceful Degradation
ML scoring is optional. The system falls back to rule-based only in these cases:
- No graph data exists for the target address
- Graph database connection is unavailable
- ML model loading fails
If ml_anomaly_score is null in scan results, ML scoring was not executed.
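One way to picture this fallback is a wrapper that converts any of the three failure cases into a missing score. The sketch below is illustrative only: `try_ml_score`, `scan_result_json`, the two callables, and the `rule_severity` field are hypothetical names, and only `ml_anomaly_score` comes from the source.

```python
import json
from typing import Callable, Optional, Sequence

def try_ml_score(
    fetch_features: Callable[[], Optional[Sequence[float]]],
    load_model: Callable[[], Callable[[Sequence[float]], float]],
) -> Optional[float]:
    """Return an ML score, or None if any prerequisite is unavailable."""
    try:
        features = fetch_features()  # may raise if the graph database is down
        if not features:
            return None              # no graph data for the target address
        model = load_model()         # may raise if model loading fails
        return float(model(features))
    except Exception:                # broad on purpose here; narrow it in real code
        return None

def scan_result_json(rule_severity: str, ml_score: Optional[float]) -> str:
    """Serialize a scan result; a missing score becomes JSON null."""
    return json.dumps({"rule_severity": rule_severity, "ml_anomaly_score": ml_score})
```

A consumer of the scan result can then treat a null `ml_anomaly_score` as "rule-based result only" rather than as a score of zero.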
Scan Result Display
When ML scoring is executed, the following is added to scan results:
- ML Anomaly Score — 0.0-1.0 ensemble score (higher = more anomalous)
- AI Analysis — LLM analysis (GPT-5.2/o3) also considers the ML score
Training Data
Models were trained on a transaction graph of 13,883 nodes, built from real data that includes ScamDB-registered drainers, Avalanche fraud networks, and suspicious addresses discovered by Follow Mode.