Error Propagation
Run real-world transformer inference with quantized attention and observe how softmax approximation errors propagate, layer by layer, through an actual language model.
- Qwen2 language models (0.5B & 1.5B parameters)
- 8+ softmax implementations compared
- Layer-by-layer error tracking
- Real-time streaming results
- Batch testing capabilities
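
To make the idea of layer-by-layer error tracking concrete, here is a minimal sketch that uses only the Python standard library. It does not load Qwen2 or reproduce any of the softmax implementations compared in this project. Instead it assumes a toy single-head attention stack with random weights and a hypothetical `quantized_softmax` that rounds probabilities to a fixed-point grid. All function names and parameters (`attention_layer`, `track_errors`, `bits`, and so on) are illustrative assumptions.

```python
import math
import random


def softmax(scores):
    """Reference softmax with max-subtraction for numerical stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def quantized_softmax(scores, bits=8):
    """Hypothetical approximation: round probabilities to a 2**bits - 1 level grid,
    then renormalize so they still sum to 1."""
    levels = (1 << bits) - 1
    rounded = [round(p * levels) / levels for p in softmax(scores)]
    total = sum(rounded)
    return [r / total for r in rounded]


def attention_layer(x, w, softmax_fn):
    """Toy single-head self-attention with a residual connection and RMS normalization.
    x is a list of token vectors; w is a dim x dim value projection (an assumption)."""
    n, d = len(x), len(x[0])
    values = [[sum(x[i][k] * w[k][j] for k in range(d)) for j in range(d)] for i in range(n)]
    out = []
    for i in range(n):
        scores = [sum(x[i][k] * x[j][k] for k in range(d)) / math.sqrt(d) for j in range(n)]
        probs = softmax_fn(scores)
        mixed = [sum(probs[j] * values[j][c] for j in range(n)) for c in range(d)]
        row = [x[i][c] + mixed[c] for c in range(d)]
        rms = math.sqrt(sum(r * r for r in row) / d) or 1.0
        out.append([r / rms for r in row])
    return out


def relative_error(reference, approx):
    """L2 norm of the difference divided by the L2 norm of the reference."""
    diff = sum((a - b) ** 2 for ra, rb in zip(reference, approx) for a, b in zip(ra, rb))
    norm = sum(a * a for ra in reference for a in ra)
    return math.sqrt(diff) / math.sqrt(norm) if norm > 0 else 0.0


def track_errors(num_layers=6, n_tokens=8, dim=16, bits=8, seed=0):
    """Run an exact and an approximate forward pass side by side and
    record the relative error of the hidden states after every layer."""
    rng = random.Random(seed)
    x = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_tokens)]
    weights = [
        [[rng.gauss(0, 1) / math.sqrt(dim) for _ in range(dim)] for _ in range(dim)]
        for _ in range(num_layers)
    ]
    exact, approx = x, x
    errors = []
    for w in weights:
        exact = attention_layer(exact, w, softmax)
        approx = attention_layer(approx, w, lambda s: quantized_softmax(s, bits))
        errors.append(relative_error(exact, approx))
    return errors


if __name__ == "__main__":
    for layer, err in enumerate(track_errors(bits=8), start=1):
        print(f"layer {layer}: relative error = {err:.3e}")
```

Both passes share the same inputs and weights, so any difference in the hidden states comes from the softmax approximation alone. The approximate pass feeds its own outputs forward, which is what lets the error accumulate from one layer to the next rather than being measured in isolation.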