ARM TOSA Research

Quantized Softmax.
Reimagined.

Integer-only softmax approximations for ARM's TOSA (Tensor Operator Set Architecture), pushing the boundaries of transformer efficiency on edge hardware.


Research Journey

The Story Behind the Research

From problem identification to novel solutions: explore the key chapters of this thesis.

1. The Challenge

Why deploying transformers on edge devices requires integer-only softmax approximations.

2. TOSA Framework

ARM's hardware specification: integer operations, TABLE lookup, and NPU constraints.


3. Background (3 topics)

Integer arithmetic fundamentals, homogeneous operations, and non-linear challenges.


4. Full Integer Translation

Complete workflow for converting LLMs to integer arithmetic.


5. DIGmax Solution

An adaptive multi-table approach that achieves 80,000× better precision than the baseline.


6. Real-World Validation

Testing error propagation on Qwen2 language models through transformer layers.


7. Key Findings

Which methods work best for different hardware constraints and accuracy requirements.


8. Full Thesis

Complete methodology, analysis, experiments, and conclusions.


Choose Your Path

Two Ways to Explore

Test real-world transformer inference or dive into theoretical mathematical analysis.

Error Propagation


Real-world transformer inference with quantized attention. Observe how approximation errors propagate through actual language model layers.

Qwen2 language models (0.5B & 1.5B parameters)
8+ softmax implementations compared
Layer-by-layer error tracking
Real-time streaming results
Batch testing capabilities


Theoretical Analysis


Mathematical comparison of softmax approximation methods. Analyze error characteristics across different input ranges and visualize precision trade-offs.

Pure exp(x) function analysis
Custom input range selection
Interactive error visualization
Absolute & relative error metrics
Export plots as high-res PNG
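
For a rough sense of what such an exp(x) error analysis measures, here is a minimal Python sketch. It is not the thesis's DIGmax method or the experiment's code: it assumes a single 256-entry fixed-point table sampled evenly over [x_min, 0] with nearest-entry lookup, and the names `build_exp_table`, `lookup_exp`, and `error_metrics`, along with the range and bit width, are hypothetical choices made for illustration.

```python
import math

def build_exp_table(x_min: float, entries: int = 256, frac_bits: int = 15) -> list[int]:
    """Fixed-point samples of exp(x), evenly spaced over [x_min, 0]."""
    step = -x_min / (entries - 1)
    scale = 1 << frac_bits
    return [round(math.exp(x_min + i * step) * scale) for i in range(entries)]

def lookup_exp(x: float, table: list[int], x_min: float, frac_bits: int = 15) -> float:
    """Nearest-entry table lookup, converted back to a float for comparison."""
    x = min(0.0, max(x_min, x))  # clamp into the table's input range
    idx = round((x - x_min) / -x_min * (len(table) - 1))
    return table[idx] / (1 << frac_bits)

def error_metrics(x_min: float = -8.0, samples: int = 1000) -> tuple[float, float]:
    """Maximum absolute and relative error of the table against math.exp."""
    table = build_exp_table(x_min)
    max_abs = 0.0
    max_rel = 0.0
    for k in range(samples + 1):
        x = x_min * k / samples  # sweep from 0 down to x_min
        exact = math.exp(x)
        abs_err = abs(lookup_exp(x, table, x_min) - exact)
        max_abs = max(max_abs, abs_err)
        max_rel = max(max_rel, abs_err / exact)
    return max_abs, max_rel

if __name__ == "__main__":
    max_abs, max_rel = error_metrics()
    print(f"max absolute error: {max_abs:.6f}")
    print(f"max relative error: {max_rel:.6f}")
```

Inputs are restricted to x ≤ 0 because softmax is typically evaluated after subtracting the row maximum. In this setup the absolute error is largest near x = 0, where exp(x) changes fastest between table entries, while the relative error grows toward x_min, where the stored values are small and rounding dominates. This is why the two metrics are reported separately.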

