AI LLM provider backed by MLPerf cofounder bets barn on mature AMD Instinct MI GPU — but where are the MI300s?

With demand for enterprise-grade large language models (LLMs) surging over the last year or so, Lamini has opened the doors to its LLM Superstation powered by AMD’s Instinct MI GPUs.

The firm claims it has been running LLMs on more than 100 AMD Instinct GPUs in production, in secret, for the last year, since before ChatGPT launched. With its LLM Superstation, it is now opening that infrastructure to more potential customers who want to run their own models on it.

These platforms are powered by AMD Instinct MI210 and MI250 accelerators, as opposed to the industry-leading Nvidia H100 GPUs. By opting for its AMD GPUs, Lamini quips, businesses “can stop worrying about the 52-week lead time”.

AMD vs Nvidia GPUs for LLMs

Although Nvidia’s GPUs, including the H100 and A100, are the ones most commonly used to power LLMs such as ChatGPT, AMD’s own hardware is comparable on paper.

For example, the Instinct MI250 offers up to 362 teraflops of computing power for AI workloads, with the MI250X pushing this to 383 teraflops. The Nvidia A100 GPU, by way of contrast, offers up to 312 teraflops of computing power, according to TechRadar Pro sister site Tom’s Hardware.
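To put those peak figures side by side, here is a minimal Python sketch that expresses each accelerator's quoted throughput relative to the A100. It uses only the three numbers cited above; the function name `relative_to` is purely illustrative, and peak teraflops are a theoretical ceiling rather than a measure of real LLM performance.

```python
# Peak throughput figures as quoted in the article (teraflops).
PEAK_TFLOPS = {
    "AMD Instinct MI250": 362,
    "AMD Instinct MI250X": 383,
    "Nvidia A100": 312,
}


def relative_to(baseline: str, figures: dict) -> dict:
    """Return each figure as a multiple of the baseline, rounded to 2 places."""
    base = figures[baseline]
    return {name: round(value / base, 2) for name, value in figures.items()}


ratios = relative_to("Nvidia A100", PEAK_TFLOPS)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x the A100's quoted peak")
```

On these numbers the MI250 and MI250X come out roughly 16% and 23% ahead of the A100's quoted peak.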

“Using Lamini software, ROCm has achieved software parity with CUDA for LLMs,” said Lamini CTO Greg Diamos, who is also the cofounder of MLPerf. “We chose the Instinct MI250 as the foundation for Lamini because it runs the biggest models that our customers demand and integrates finetuning optimizations.

“We use the large HBM capacity (128GB) on MI250 to run bigger models with lower software complexity than clusters of A100s.”
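As a rough illustration of why memory capacity matters here, the sketch below estimates the largest model whose weights alone would fit in a given amount of accelerator memory. It is a back-of-the-envelope calculation and not Lamini's method: it assumes 16-bit weights (2 bytes per parameter) and ignores activations, KV cache and optimizer state, all of which reduce the usable figure in practice. The 80GB comparison value is an assumption for a single A100 and does not come from the article.

```python
def max_params_billions(memory_gb: float, bytes_per_param: int = 2) -> float:
    """Weights-only upper bound on model size, in billions of parameters.

    memory_gb is taken as 10**9 bytes per GB, so the 10**9 factors cancel:
    (memory_gb * 1e9 bytes) / bytes_per_param / 1e9 = memory_gb / bytes_per_param.
    """
    return memory_gb / bytes_per_param


# 128GB is the MI250 HBM capacity quoted above.
mi250_limit = max_params_billions(128)

# Assumption (not from the article): a single A100 with 80GB of memory.
a100_limit = max_params_billions(80)

print(f"MI250 (128GB): up to ~{mi250_limit:.0f}B parameters at 16-bit")
print(f"A100 (assumed 80GB): up to ~{a100_limit:.0f}B parameters at 16-bit")
```

Under these assumptions, a model that fits on one MI250 could need to be split across more than one A100, which is the kind of added software complexity the quote refers to.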

The Lamini LLM Superstation


On paper, then, AMD’s GPUs can compete with Nvidia’s. But the real crux is availability: systems such as Lamini’s LLM Superstation give enterprises the opportunity to take on workloads immediately.

There is also a question mark, however, over AMD’s next-in-line GPU, the MI300. Businesses can sample the MI300A now, while the MI300X is due to begin sampling in the coming months.

According to Tom’s Hardware, the MI300X offers up to 192GB of memory, more than double the H100’s 80GB, although we don’t yet fully know what its compute performance looks like. Nevertheless, it is expected to be comparable to the H100. What would give Lamini’s LLM Superstation a real boost is building and offering infrastructure powered by these next-gen GPUs.
