The DeepSeek Conundrum
Insight: Of Breaking Barriers, Innovation, and the Changing World of AI
Hello Everyone!
By now, you may have heard the news about the AI breakthrough that sent the US stock market into a tailspin.
On January 27, 2025, the Standard & Poor’s 500 (S&P 500) index dropped 1.5%, while the tech-heavy Nasdaq fell 3.1%.
Nvidia, a California-based leader in AI hardware, saw its stock plummet 16.9%, wiping out $600 billion in market value.
This decline rippled across global markets, impacting Europe and Japan, as tech stocks reflected growing uncertainty around AI infrastructure.
The cause? New AI models released by DeepSeek, a small Chinese startup founded by hedge fund billionaire Liang Wenfeng.
So, what exactly is this breakthrough announced by DeepSeek?
First, let’s review the context.
AI Is Transforming Businesses. No Question.
AI is already reshaping business across the board—from customer service to product development and marketing.
Today’s companies often rely on various tools like predictive analytics, CRMs (Customer Relationship Management systems such as Salesforce), and dashboards to guide decisions. However, these tools can be clunky and expensive. They require a lot of human oversight and sometimes produce only partial insights.
Generative AI, which can handle repetitive tasks and data analysis automatically, promises to free up people to focus on more strategic and creative work.
Imagine a future where your software can suggest decisions as well as, or even better than, a human colleague. The key questions now are:
How much should we rely on AI versus human judgment?
What security and compliance measures must we have in place?
How can workers update their skills to work alongside AI?
A huge part of this puzzle is the need for powerful processing hardware.
Processing Power: The Key to AI’s Future
To run powerful AI models like those from OpenAI (GPT or o3)[1] or Anthropic (Claude), you need serious computational muscle in the form of Graphics Processing Units (GPUs). These engines enable AI to process massive datasets and perform complex tasks by executing thousands of operations simultaneously.
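To make that parallelism concrete, here is a minimal sketch (assuming PyTorch is installed) in which a single matrix-multiply call fans out across thousands of GPU cores when a GPU is available, and falls back to the CPU otherwise:

```python
# A rough illustration of GPU parallelism (a sketch, assuming PyTorch):
# one matrix multiply dispatches millions of multiply-adds at once.
import time
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

start = time.perf_counter()
c = a @ b                        # one call: millions of multiply-adds in parallel
if device == "cuda":
    torch.cuda.synchronize()     # wait for the GPU to finish before reading the clock
print(f"{device}: {time.perf_counter() - start:.3f} s")
```

On a data-center GPU this operation completes orders of magnitude faster than on a CPU, which is why training a frontier model requires thousands of such chips running for weeks.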
The problem? GPUs are expensive and in limited supply.
This demand has driven up the stock prices of companies like Nvidia[2], which dominate the GPU market.
In 2024, Nvidia’s valuation crossed $3 trillion, fueled by the AI gold rush.
That was the story until DeepSeek dropped its bombshell two weeks ago.
DeepSeek’s $6M Breakthrough
At the core of DeepSeek's breakthrough is its ability to use fewer GPUs than OpenAI’s models, sparking market concerns over reduced demand for Nvidia GPUs.
In simple terms, the innovation shows us that AI models can be built with:
Fewer chips: DeepSeek’s model requires only about 2,000 GPUs, compared with the more than 10,000 used for OpenAI’s GPT-3 and roughly 25,000 for GPT-4. This marks a significant leap in efficiency and a substantial reduction in costs.
Less powerful chips: DeepSeek achieved this on restricted hardware: Nvidia’s H800[3] GPUs, export-compliant variants of the flagship H100 with capped interconnect bandwidth, rather than the unrestricted top-of-the-line GPUs available to OpenAI.
Larger parameter count: DeepSeek’s model uses 671 billion parameters[4] - nearly a 4x increase over OpenAI’s GPT-3, which used 175 billion. More parameters can enhance a model’s accuracy in reasoning and result generation.
More efficient reasoning: DeepSeek’s research paper explains that it achieves smarter, more efficient reasoning[5], using less processing power per task while matching the accuracy of OpenAI’s GPT models. It corrects itself using curated datasets and delivers faster, cheaper performance than OpenAI’s models.
Open source: DeepSeek’s model is built on open-source[6] code, meaning anyone can review, use, and modify it for free, whereas OpenAI’s, Google’s, and Anthropic’s language models are proprietary. This makes it easy to reuse and replicate.
Lower cost: DeepSeek’s entire project reportedly cost just $6 million to build. In contrast, the GPT models are estimated to have cost around $100 million—an investment nearly ten times higher than DeepSeek’s, even if DeepSeek’s actual costs were double the reported figure.
What Does This Mean for the AI Race?
So far, AI development has been dominated by large players—mainly US-based companies like OpenAI, Google, and Microsoft—with access to massive computing resources and billions[7] in funding.
DeepSeek’s breakthrough proves that cutting-edge AI can now be built without a massive budget or warehouse of expensive GPUs.
This levels the playing field for startups and other newcomers, forcing established players to rethink their pricing and product strategies.
Companies will gain access to new AI model options, leading to more competitive pricing.
Meta’s engineers are reportedly already reverse-engineering DeepSeek’s model.
AI developers are also impacted. With lower costs and reduced reliance on expensive GPUs, smaller companies and indie developers can also cost-effectively build applications and tools to automate tasks, enhance customer service, generate content, and power bots.
What Does All This Have to Do With You?
If we're moving rapidly toward an AI-driven software world, the platforms you use will soon integrate AI systems.
Currently, most AI apps are simple wrappers around GPT or similar models, as custom AI remains costly to build.
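As an illustration, here is a minimal sketch of such a wrapper (assuming the OpenAI Python SDK and an OPENAI_API_KEY environment variable; the model name is just an example). Nearly all of the intelligence lives in the hosted model; the app contributes only a prompt and some glue code:

```python
# A minimal "wrapper app" sketch: the hosted model does the heavy lifting.
# Assumes `pip install openai` and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def summarize(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any hosted chat model could be swapped in here
        messages=[
            {"role": "system", "content": "Summarize the user's text in two sentences."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

print(summarize("DeepSeek says it trained a frontier-class model for about $6M..."))
```

Because the app itself is this thin, cheaper underlying models translate almost directly into cheaper products.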
However, with reduced innovation costs and entry barriers, more developers, platforms, and applications could enter the market with competitive offerings.
For everyday users, this could mean cheaper AI services, better personalization, faster support, improved recommendations, and innovative tools across industries like healthcare, education, and entertainment. Tasks that once took hours could be automated, enhancing user experiences.
For business professionals, integrating AI into everyday software means faster, data-driven insights and a growing need to upskill to innovate and develop new AI-enhanced products.
Caveats: DeepSeek Issues
Despite its achievements, there are several caveats to DeepSeek's success.
Dependence on Known Models: DeepSeek relied on OpenAI’s GPT models, synthetic data[8], and open-source code. It still requires rigorous testing across diverse conditions.
Security Vulnerabilities: Experts have flagged concerns over data handling and security guardrails. With the model’s data stored on servers in China, issues of privacy and inspection arise. However, if replicated by others and improved, the innovation could still be applied to secure implementations.
Performance Consistency: DeepSeek must prove it can handle complex tasks with the same accuracy and versatility as established models. OpenAI’s advanced reasoning models (e.g., o3) already set a high standard.
Censorship Concerns: Reports indicate DeepSeek's model censors certain results, raising questions about its utility and long-term adoption.
GPU Access: DeepSeek’s founder reportedly stockpiled 50,000 GPUs before the U.S. export ban on AI chips, which primarily restricts access to Nvidia’s advanced GPUs essential for training and running large AI models. That restriction may itself have spurred the innovation. If the ban persists, it could push non-U.S. firms to develop alternative GPU technologies or Nvidia clones.
Conclusion: The AI Game Has Changed!
DeepSeek’s breakthrough is the second 'four-minute mile'[9] moment in tech, following OpenAI’s ChatGPT launch in 2022.
OpenAI’s early GPT models showed that AI could handle complex, human-like interactions, and fields like healthcare are already benefiting through faster (and potentially cheaper) research and diagnosis.
DeepSeek has now pushed these boundaries further, proving that cutting-edge AI can be developed faster, cheaper, and with fewer resources—forcing the industry to rethink assumptions about AI scalability and infrastructure. This shift mirrors how cloud computing made SaaS more accessible.
The market may have overreacted to DeepSeek, as both GPUs and AI models will continue to evolve, maintaining a dependency on high-performance hardware for the foreseeable future.
But this still presents a conundrum.
DeepSeek operates within heavily censored environments, raising global concerns over trust, security, privacy, and regulatory oversight in AI development.
For example, researchers found[10] that DeepSeek’s chatbot contains code capable of transmitting user details to China Mobile, a state-owned company, intensifying fears[11] over data privacy.
At the same time, DeepSeek’s efficiency has the potential to democratize AI development, making advancements more accessible and affordable for industries worldwide—ultimately benefiting users on a global scale.
The challenge ahead lies in balancing the need for rapid AI innovation with the implementation of secure international guardrails—such as watermarks to prevent deepfakes and other forms of misinformation—in an increasingly interconnected AI-driven world.
I hope you found this useful!
[1] GPT stands for Generative Pre-trained Transformer, a neural network architecture that learns billions of parameters during training and generates new text (and, in some variants, images).
[2] Key players in AI hardware include AMD, Intel, Apple, and Google. Apple designs custom chips (e.g., A-series, M1/M2) to optimize performance in its devices, while Google develops Tensor Processing Units (TPUs) for machine learning workloads across its services and cloud infrastructure. (Source: Online market research.)
[3] Nvidia H800 GPUs are export-compliant variants of the flagship H100, with reduced interconnect bandwidth for multi-GPU training. US labs have used unrestricted chips such as the A100 and H100, with massive parallel processing power, to train large-scale models like GPT-3 and GPT-4.
[4] Parameters in AI models are the internal values a model learns during training to make predictions; they control how the model processes input data to produce output. In language models, for example, parameters govern how well the AI understands the context of a sentence, and more parameters can improve the model’s ability to generate coherent and contextually accurate text.
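To make this concrete, here is a toy model (a sketch, assuming PyTorch) whose "parameters" are simply its learned weights, counted the same way the headline figures for GPT-3 or DeepSeek are:

```python
# Counting the learned weights of a tiny model. Frontier models differ
# only in scale: GPT-3's 175B and DeepSeek's 671B are counts of exactly
# this kind of value.
import torch.nn as nn

model = nn.Sequential(
    nn.Embedding(50_000, 512),   # token embeddings: 50,000 x 512 weights
    nn.Linear(512, 2048),        # weights + biases learned during training
    nn.ReLU(),
    nn.Linear(2048, 512),
)

total = sum(p.numel() for p in model.parameters())
print(f"{total:,} parameters")   # 27,699,712 -- and this is a *tiny* model
```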
[5] DeepSeek is believed to use a Mixture of Experts (MoE) architecture. This lets the model train specialized topic "experts" and activate only the relevant ones for a given task, conserving processing power. The sketch below (a generic illustration, not DeepSeek’s actual implementation) shows the core idea: a small router scores every expert for each token, and only the top-scoring experts actually run.
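```python
# A generic Mixture-of-Experts sketch (illustrative only; assumes PyTorch).
# The router picks top-k experts per token; the rest stay idle, saving compute.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, dim=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)  # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (n_tokens, dim)
        weights, picked = self.router(x).softmax(-1).topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):           # only the chosen experts execute
            for i, expert in enumerate(self.experts):
                mask = picked[:, slot] == i
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

tokens = torch.randn(16, 64)       # 16 tokens, 64-dim embeddings
print(TinyMoE()(tokens).shape)     # torch.Size([16, 64])
```

With top_k=2 of 8 experts, only a quarter of the expert weights are exercised per token, which is how a very large parameter count can coexist with modest per-task compute.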
[6] Meta, which has lagged in the AI race due to its ill-advised bet on virtual reality, took a gamble when it released its fledgling Llama language model as open source. DeepSeek may have leveraged this model.
[7] Microsoft has committed $80 billion in fiscal year 2025 to develop AI-enabled data centers and expand its Azure AI platform. Google has pledged over $27 billion in 2024 to advance AI, focusing on Tensor Processing Units (TPUs) and Google Cloud AI services. Amazon is expected to invest more than $35 billion by 2025 in AWS AI services and custom AI chips. Apple is investing approximately $20 billion annually in AI and machine learning across its hardware and software systems. (Source: Reuters/online market research.)
[8] Synthetic data is data created artificially for training, as opposed to data gathered from the real world.
[9] Until Roger Bannister shattered the four-minute mile barrier in 1954, experts believed it was impossible. Within six weeks of his achievement, a second athlete broke the barrier, three more did so within a year, and over 1,500 runners have achieved it in the decades since—a testament to how quickly perceived limits can shift once broken.
[10] Source: AP News.
[11] Countries banning DeepSeek include Australia, South Korea, and Italy. U.S. agencies, including the Pentagon and NASA, have advised against using it due to security and ethical concerns. (Source: AP News, Reuters.)