Q&A: ASUS on AI server trends, FuriosaAI partnership, and more

Our Viewpoints March 13, 2024

Last month, FuriosaAI attended MWC Barcelona 2024 with our strategic partner ASUS, which was showcasing our Gen 1 Vision NPU card for highly efficient computer vision inference.

We sat down with Richard Liu, Head of ASUS’s Enterprise Server Business Unit, to discuss our collaboration on Furiosa’s upcoming second-generation chip for inference with large language models (LLMs), and the evolving global market for AI inference servers. You can read the Q&A below.

ASUS has been a key global player in manufacturing servers and other electronics since 1989. The company successfully qualified Furiosa's Gen 1 Vision NPU, a highly power- and compute-efficient accelerator for computer vision applications. We are working together to design and manufacture the card for Furiosa’s second-generation chip, which will be released this year.

What are the most important server industry trends that you’ve observed at MWC this week?

Richard Liu: We’re seeing a hint of the coming AI-driven structural changes in the data center industry. Hyperscalers initiated the adoption of powerful AI servers in their cloud data centers, and now telecom companies worldwide are embracing a similar transformation.

We've noticed that demand for AI compute is no longer focused primarily on training, and there’s growing interest in inference servers from both the cloud and enterprise sectors. The demand for infrastructure to support inference at scale will grow exponentially as more applications are adopted by enterprises and consumers.

How is the growth of Generative AI affecting businesses’ data center compute needs?

Generative AI's impact is already significant. It’s creating a demand for different system designs and server architectures to handle the unique compute requirements of Gen AI workloads.

This evolution is driving the need for servers designed around the specific demands of AI workloads, whether they're compute-bound, I/O-bound, or memory-bound. The world will need more specialized server designs, and maybe even entire data centers specialized for particular applications.

Heterogeneous computing will mean not just focusing on more powerful CPUs and GPUs, and more memory, but also incorporating a variety of compute options such as FPGAs, GPUs, CPUs, and ASICs, depending on the use case.

Photo: Richard Liu, Head of ASUS's Enterprise Server Business Unit, and FuriosaAI Co-Founder & CEO June Paik hold a WARBOY card. (All images courtesy ASUS.)

What about AI workloads more generally? What do you see as the key challenges for customers?

Product-wise, thermal management for AI servers poses a significant challenge. Liquid cooling is gaining popularity, but air-cooled systems will remain predominant.

Scalability and cost are also critical hurdles for wider AI adoption. Addressing them requires innovative data center designs and specialized solutions, highlighting the importance of ecosystem partnerships with both existing players and new breakthrough technologies.

And of course, performance is paramount for AI servers for data centers. These products need to meet latency requirements while also optimizing for power consumption, costs, reliability, and software needs.

The complexity of these challenges necessitates greater flexibility, faster time to market, and the ability to accommodate more specialized demands.

ASUS and Furiosa have been working closely together on Furiosa’s second-generation chip, and the companies recently announced a strategic partnership. What have been the standout aspects of this collaboration for ASUS?

First and foremost is our partnership in sharing knowledge and insights and building the product together. From Day 1 of productizing Furiosa's first accelerator, we’ve been able to move quickly and stay in sync whenever questions arise. For example, early in the process with the NPU, we suggested a modification to the heat sink to improve thermal performance. The Furiosa team was open to our feedback, and we were able to implement a great solution. Furiosa's Gen 1 Vision NPU is extremely energy efficient – using between 40 and 60 watts per card – but it’s still crucial to deal with heat as effectively as possible. This is a big benefit for both large-scale deployments and on-site solutions in small data centers.

Secondly, ASUS and Furiosa have worked closely together to address opportunities in the server market. According to market reports, the global AI server market is expected to grow significantly in 2024, and AI servers will make up a substantial share of the total server market. So the server market is crucial for companies like Furiosa as well as for ASUS. The ASUS team provided a server standard to the Furiosa team to validate its Gen 1 Vision NPU from hardware to software, making the card more suitable for cloud and on-prem applications.

What unique strengths does each company contribute to this partnership? How do these strengths complement each other?

One example has been ASUS’s validation of Furiosa's Gen 1 Vision NPU for use in edge servers. This leveraged each company’s unique strengths – Furiosa’s AI hardware expertise and ASUS’s leadership in servers. The result is that our edge servers have more options for AI accelerators and that the NPU is available for more edge AI computing use cases.

Also, ASUS has close, long-standing relationships with companies providing everything from core components to PCBs. We’ve worked together to help Furiosa find the right suppliers for both the Gen 1 Vision NPU and its second-generation chip that’s launching this year.

Could you share your insights on our Gen 1 Vision NPU as a product and its significance in the partnership?

The Gen 1 NPU's ability to run a variety of computer vision tasks makes it particularly suitable for deployment in edge servers. These include use cases like smart camera systems in factories to detect workplace safety issues or help with quality control on a production line. Edge servers are a booming sector of the total server market, and this NPU can play a key role.

Photo: Furiosa’s WARBOY card on display at the ASUS booth at MWC 2024.

How does ASUS view the market potential for high-end LLM servers, and what role does Furiosa’s second-generation chip play in this vision?

The market right now is still focused primarily on mid-range AI servers, which are best suited for tasks like object detection, translation, and smaller language models. But over the last year or so, we’ve seen that start to change, and demand is growing very rapidly for very high-end, LLM-capable chips with much more memory and bandwidth. There are typically four, eight, or even ten of these powerful chips together in a single server, which generates very challenging thermal conditions.

Over the next few years, we expect strong demand for both mid-range AI chips and these new, more capable accelerators that support large language models, multimodal models, and various generative models.

A key strength of Furiosa’s second-generation chip is that it’s well-suited to all of these tasks. It has the compute and memory bandwidth to run extremely demanding LLMs and multimodal models, but also a very flexible form factor and much lower thermal requirements.

Current high-end GPUs for LLMs can require up to 10 kW per server. There is definitely demand for these high-end servers, but that power draw is not compatible with standard racks and is very difficult to cool. By contrast, Furiosa’s chip is expected to have a TDP of about 150 watts. That means the product will have a very large total addressable market.

Another area of concern for customers is the cost of LLM servers, and we expect Furiosa’s product will be very competitive there, too. Because these are high-end, cutting-edge cards, it’s important to make sure the quality and design are really excellent. And I think ASUS has shown the ability to be a good partner there for Furiosa.

What are some of the noteworthy features of Furiosa’s second-generation chip?

As I noted earlier, power consumption is an important differentiator. It’s exciting to see that Furiosa’s second-generation chip can run language models with tens of billions of parameters while using much less electricity than the alternatives. Power consumption is a significant consideration when running LLMs at scale, so I think Furiosa’s product will provide a lot of value.

What are some of the ways FuriosaAI and ASUS are partnering to address this market? What are the long-term goals and vision?

I think ASUS has shown we can be a great partner for both new and established AI silicon companies and bring a lot of valuable design and production expertise. The industry is just starting to figure out how best to deploy LLMs and other new Generative AI technology in their products. So this will be an important growth area for ASUS. We’re pleased to be working with Furiosa to deliver products that will enable businesses to use these technologies in new ways and ultimately deliver benefits to millions of people.
