Algorithm Research — FuriosaAI

Furiosa Algorithms Research

We work at the intersection of AI and hardware to make AI computing sustainable

RL Post-Training

Deploy the most capable models with strong latency and throughput.

Non-autoregressive Generation

Lower total cost of ownership with less energy, fewer racks, and today's air-cooled data centers.

Inference-time Algorithms

Stay future-proof for tomorrow’s models and transition with ease.

Publications

Transformers in the Dark: Navigating Unknown Search Spaces via Bandit Feedback

TMLR

2026

transformer

TABED: Test-Time Adaptive Ensemble Drafting for Robust Speculative Decoding in LVLMs

EACL

2026

speculative-decoding

vision-language

Draft-based Approximate Inference for LLMs

ICLR

2026

speculative-decoding

kv-cache

ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs

ICLR

2026

parallel-decoding

benchmark

Inference-Aligned SFT for Diffusion LLMs via Group-based Trajectory Sampling

ICLR

2026

diffusion-llm

discrete-diffusion

sft

XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization

Preprint

2026

kv-cache

quantization

Counting Guidance for High Fidelity Text-to-Image Synthesis

WACV

2025

text-to-image

diffusion

VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data

ICML

2025

Oral

reward-model

reasoning

Parameter-Efficient Fine-Tuning of State Space Models

ICML

2025

ssm

peft

State-offset Tuning: State-based Parameter-Efficient Fine-Tuning for State Space Models

ACL

2025

ssm

peft

Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing

ECCV

2024

diffusion

image-editing

Can MLLMs Perform Multimodal In-Context Learning for Text-to-Image Generation?

COLM

2024

text-to-image

in-context-learning

Long-term collaboration partners on LLM efficiency, quantization, parallel decoding, and other advanced research areas for efficient inference.

Join our team