Generative AI and the GPU Rush

The war between NVIDIA, AMD, and Intel

Generative AI needs massive numbers of Graphics Processing Units (GPUs) to operate, and that demand is what has driven NVIDIA’s recent huge success. Ever since OpenAI unleashed ChatGPT in November 2022, the race among chipmakers for GPU supremacy has been on.

Besides NVIDIA, we have AMD and Intel, with AMD coming in second and Intel trailing as a distant third.

When it comes to the pulsating heart of the graphics technology world, NVIDIA is often the first name that springs to mind. NVIDIA is expected to control a whopping 90%+ of the market for data center GPUs this year. Known for powering the visually stunning realms of high-end gaming and professional design, NVIDIA has cemented its position as a titan in the GPU market.

NVIDIA captured 92% of the data center GPU market for Generative AI in 2023

The secret sauce? Its bleeding-edge GPU architectures, like the formidable Ampere series, and a robust software ecosystem centered around CUDA for parallel computing. Gamers, designers, and data scientists alike often look to NVIDIA's GPUs for not just their raw performance but also their energy efficiency and pioneering features like real-time ray tracing, which have become synonymous with high-fidelity graphics. In January 2024, NVIDIA unveiled a new lineup of GPUs tailored for powering generative AI applications on personal computers. Additionally, laptops from Acer, Dell, and Lenovo will be equipped with these advanced GPUs.

NVIDIA’s powerhouse chip, the H100, with a price tag that can soar up to a cool $40,000, is the go-to hardware for training Large Language Models (LLMs) and unleashing AI applications that are nothing short of revolutionary. The H100 caps a decade in which single-GPU AI inference performance grew roughly 1,000x. But here's the catch – they have been in short supply. The hunt for H100s is on, as tech aficionados and AI gurus alike vie to get their hands on these rare gems of the computing world. Very recently, supply issues have eased: instead of having to wait 8-11 weeks to get your hands on H100s, the wait time is currently 3-4 weeks for some.

Single-GPU performance on AI inference expanded 1,000x in the last decade

However, in the high-stakes world of AI, the quest for NVIDIA's H100 chips is akin to a modern-day gold rush, but with a twist – the gold is still in short supply. Companies diving into the deep end to craft their very own LLMs are hitting a snag, and it's not just a minor hiccup. The hunger for GPUs is insatiable, with needs stretching into the tens and even hundreds of thousands of chips, turning the dream of vast GPU clusters into a bit of a mirage.

The journey to AI greatness is now a waiting game, where delays can span a calendar quarter (or more) for large users, leaving companies on tenterhooks as they await the tech that will turbocharge their ambitions. And amid this scramble for silicon, prices for the coveted H100 chips stand firm, unswayed by the whispers of increased availability. NVIDIA, riding high on this wave of relentless demand, continues to reap the rewards, with profit margins as plump as ever. For the AI pioneers, the mantra seems to be: Patience is profitable, at least for one side of this silicon chase.

Not to be outdone, AMD is hot on NVIDIA’s heels, especially when it comes to wooing gamers and professionals with its Radeon and Radeon Pro GPU lines. The tech giant has also been carving out its territory in the data center arena with its Radeon Instinct GPUs. AMD has become somewhat of a dark horse, known for delivering a knock-out price-performance ratio. Plus, let's not forget their pivotal role in the gaming universe – their GPUs are the lifeblood of powerhouse gaming consoles like the PlayStation 5 and Xbox Series X, underlining AMD's influential presence in the industry. In December 2023, AMD announced the Instinct MI300X, the third iteration of its Instinct line of data center GPUs, and the Instinct MI300A, a CPU-GPU hybrid chip for servers. AMD claims the MI300X can outperform NVIDIA’s H100, saying its AI accelerators are 1.6 times faster than NVIDIA’s.

Then there's Intel, a household name that has long dominated the CPU market and is now venturing into the GPU landscape with the Intel Arc series. Intel's integrated GPUs, such as the Intel Iris Xe, have become a staple for day-to-day computing, casual gaming, and burgeoning content creators. But now, Intel is not just dipping its toes but diving headlong into the discrete GPU pool, aiming to rival NVIDIA and AMD's supremacy in the high-stakes gaming and high-performance arenas. Also, in December 2023, Intel unveiled its Core Ultra chips, engineered to accelerate AI program execution. The new processors are set to power more than 230 AI-focused PC designs, the first of their kind globally, from leading manufacturers such as Acer, ASUS, Dell, HP, and Lenovo.

The battle of the GPUs is a saga filled with constant innovation and fierce rivalry, as these tech behemoths vie for the crown in a market that's as dynamic as the graphics they render. Whether it’s the allure of NVIDIA’s top-tier graphics, AMD’s balance of cost and performance, or Intel’s ambitious play for new territories, the GPU wars are a thrilling plotline to follow for tech enthusiasts and everyday users alike. 

Taking a step back, let’s talk about why GPUs are so important for Generative AI. GPUs, once relegated to the gaming den, are now the darlings of the AI and ML world. And for good reason! Let's dive into the nitty-gritty of why GPUs are the unsung heroes of AI innovation.

Why GPUs are AI's New Best Friend

The Power of Parallelism

Imagine you’re painting a mural. Would you rather have one person with a brush or a hundred? In the AI universe, where deep learning networks are like vast murals requiring thousands of brush strokes (or operations), GPUs are the veritable army of painters. With their hundreds to thousands of cores, they tackle tasks concurrently where CPUs work largely serially, slashing the time it takes to process the complex data tapestries of AI.

CPU vs. GPU: the difference between processing units

Speed Demons

If AI had a need for speed, GPUs would be its turbo engines. Churning through calculations at breakneck speeds, GPUs leave CPUs in the dust when it comes to tasks that can run in parallel. This means AI models can feast on larger data sets and flex their predictive muscles more swiftly, bringing AI smarts to the market at an unprecedented pace.
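
To make the painter analogy concrete, here is a minimal sketch in Python, assuming PyTorch is installed (it falls back to the CPU when no GPU is present). It times the same arithmetic done element by element in a Python loop versus as a single vectorized operation that the hardware can spread across its cores:

```python
import time
import torch

x = torch.rand(10_000_000)

# Serial-style work: one "painter" visiting elements one at a time.
# (Limited to 100,000 elements so the demo finishes quickly.)
start = time.time()
total = 0.0
for v in x[:100_000]:
    total += float(v) * 2.0
print(f"Python loop, 100k elements: {time.time() - start:.3f}s")

# Parallel-style work: one vectorized op over all 10M elements at once,
# which a GPU spreads across thousands of cores.
device = "cuda" if torch.cuda.is_available() else "cpu"
x_dev = x.to(device)
start = time.time()
y = x_dev * 2.0
if device == "cuda":
    torch.cuda.synchronize()  # CUDA launches are async; wait for the kernel
print(f"Vectorized op, 10M elements on {device}: {time.time() - start:.3f}s")
```

On a typical data center GPU, the vectorized version finishes in a fraction of a millisecond despite touching 100x more data; the exact numbers depend entirely on the hardware, but the gap illustrates the point.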

Masters of Matrices

Deep learning loves to play with matrices – those grids of numbers are its sandbox. And GPUs? They're the best shovels around. Designed to efficiently handle matrix operations, with special tech like Tensor Cores in NVIDIA GPUs for an extra boost, they're tailor-made for the heavy lifting involved in AI training and inference.
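
As a hedged illustration (again Python with PyTorch as an assumed setup), the sketch below runs the kind of large matrix multiply that dominates training and inference, and shows how half precision can engage NVIDIA’s Tensor Cores when a GPU is present:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Two large matrices: the kind of operands a transformer layer
# multiplies thousands of times per training step.
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

c = a @ b  # a plain fp32 matrix multiply
print(c.shape)  # torch.Size([4096, 4096])

# On a recent NVIDIA GPU, autocast runs the same matmul in fp16, which
# maps onto Tensor Cores for a large speedup; skipped without a GPU.
if device == "cuda":
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        c_half = a @ b
    print(c_half.dtype)  # torch.float16
```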

Scaling New Heights

Got an AI problem that's as tough as climbing Everest? Stack GPUs together, and suddenly, you're scaling that peak with the ease of a weekend hike. This scalability is what makes GPUs the go-to for anyone pushing the boundaries of AI, from researchers decoding the mysteries of the human brain to startups polishing the next big recommendation algorithm.
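
As a toy sketch of what "stacking GPUs" looks like in practice, assuming PyTorch: torch.nn.DataParallel splits each batch across every visible GPU in a single process. Serious training jobs use DistributedDataParallel across many machines, but the principle is the same:

```python
import torch
import torch.nn as nn

# A small model standing in for something far larger.
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10))

if torch.cuda.device_count() > 1:
    # Replicate the model on every GPU; each replica gets a slice of the batch.
    model = nn.DataParallel(model).cuda()
elif torch.cuda.is_available():
    model = model.cuda()

batch = torch.randn(256, 1024)
if torch.cuda.is_available():
    batch = batch.cuda()
out = model(batch)
print(out.shape)  # torch.Size([256, 10])
```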

Green Machines

Here’s a riddle: what has monster computational power but sips energy like a finely tuned hybrid car? GPUs. In a world where data centers guzzle electricity, GPUs offer a greener pasture with their superior performance per watt, making them the eco-friendlier choice for AI's demanding workloads. 

The Support System

It takes a village to raise an AI, and GPUs boast a bustling one. With an ecosystem brimming with libraries, frameworks, and tools all geared towards GPU computing, the path from AI dream to reality is well-paved. NVIDIA's CUDA is like the town square of this village, a place where developers gather to translate their AI aspirations into action.
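
To show how well-paved that path is, here is a tiny device-agnostic snippet, assuming PyTorch (one of many frameworks built on CUDA). The same code runs on an NVIDIA GPU or a laptop CPU, with the framework dispatching to CUDA libraries such as cuBLAS under the hood:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Running on: {device}")
if device == "cuda":
    print(torch.cuda.get_device_name(0))  # e.g. an H100 in a data center

# Nothing below mentions the GPU explicitly; the device handle is enough.
x = torch.randn(8, 8, device=device)
print((x @ x).sum().item())
```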

The Jack-of-All-Trades

Here’s the kicker – GPUs aren't just about AI. They're the Swiss Army knives of computing, handy for a wide swath of tasks, from simulating the birth of stars to powering real-time graphics in virtual reality. This versatility ensures that GPUs will remain central to technological progress across domains.

GPUs have morphed from gaming gadgets to pivotal players in the AI revolution. They're not just supporting the rapid development of AI; they're propelling it into the future. As AI and ML intricacies grow, so too does the role of GPUs, making them indispensable allies in our quest to make machines smarter. If there's one thing that's certain, it's that the GPU-powered AI journey is just getting started.

What’s Needed

In February, it was reported that Sam Altman was seeking $5-7 trillion from investors, including the United Arab Emirates, to expand the world’s GPU production capacity. The main reason was his concern that OpenAI would not have enough GPUs in the future to carry on with its operations.

Sam Altman is reportedly seeking a $5-7 trillion investment to increase the world’s GPU production capacity

In an era where technology seamlessly intertwines with every facet of our lives, the semiconductor industry stands at the forefront of a remarkable evolution. As investments pour in, they promise to transform the landscape of this sector, setting the stage for a future where the industry's scale is nothing short of extraordinary. Last year alone, the global sales of chips—a cornerstone of modern technology—soared to an impressive $527 billion. Yet, this is just the beginning. Experts forecast a leap to an annual turnover of $1 trillion by 2030, painting a picture of unprecedented growth. 

But the story doesn't end with chip sales. The machinery that breathes life into semiconductor manufacturing, the backbone of chip factories, also saw a significant surge in demand. According to SEMI, an authoritative voice in the industry, sales of semiconductor manufacturing equipment reached the $100 billion mark last year. This figure is a testament to the vital role these technologies play in shaping the future of not just the semiconductor industry, but the global technological landscape at large.

With such a massive influx of investment, the industry is set to dwarf its current size, heralding a new era of innovation and economic prowess. This is a clear indicator of the critical importance of semiconductors in our increasingly digital world, and a glimpse into a future where technology's potential is limitless. 

But this is still dwarfed by the $5-7 trillion that Sam Altman is seeking. Currently, OpenAI needs about 30,000 GPUs to serve GPT, but it is estimated that over the next decade its needs could grow to something close to 10 million GPUs across all of its operations.

What is known now is that the GPU rush is on and NVIDIA looks like the long-term winner of the supply war. We’ll be checking back on this hot topic and will be reporting on the most important changes. 

Just Three Things

According to Scoble and Cronin, the top three relevant happenings last week

Anthropic’s Claude 3

Anthropic announced its latest Generative AI technology, Claude 3, which is actually a trio consisting of Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus. The big news is that Opus, the most powerful of the three, rivals GPT-4 on several benchmarks. Feedback on X suggests it is strongest at writing, providing longer, more detailed, and more appropriate responses than GPT-4. It is also multimodal in that it can analyze images, though it does not generate them. What does Claude 3 mean to us? It shows that LLM tech is becoming more and more commoditized and that the next golden area is fine-tuning LLMs for verticals such as healthcare and the legal industry. TechCrunch

Sam Altman Back on the Board

Sam Altman has been reinstated to OpenAI’s non-profit board of directors after an independent investigation determined that he did not warrant firing in the first place. Additionally, former Gates Foundation CEO Sue Desmond-Hellmann, former Sony General Counsel Nicole Seligman, and Instacart CEO Fidji Simo were appointed as new directors. We feel that Altman’s reinstatement is, overall, a good thing for the AI ecosystem. CNN

Middle Schoolers Charged with Felonies for Deepfakes

Two boys in Florida, aged 13 and 14, have been arrested and charged with felonies for allegedly creating deepfake nudes of their classmates. They were arrested under a 2022 Florida statute that makes it a felony to distribute deepfake sexually explicit imagery without consent. These appear to be the nation’s first arrests and charges specifically concerning the circulation of AI-generated nude images. No federal law addresses nonconsensual deepfake nudes, so states are left to decide how to handle the issue, which has caused confusion. We feel the sooner consistent legal statutes are instituted across the states, the better. The Verge

Scoble’s Top Five X Posts