SubQ Claims 12 Million Token Context with 52x Speed Gain Over FlashAttention

A new sparse-attention architecture called SubQ promises the first fully sub-quadratic frontier model, processing 12 million tokens at a fraction of the compute cost.

Subscribe to unlock all stories

Get full access to The Singularity Ledger, archive included.

Cancel anytime. Payments powered by Stripe.