
Unexpected winners and losers after the week of DeepSeek

How DeepSeek would like the world to think about the youthful team. Image created with Midjourney.

It was the week of DeepSeek's CEO Liang Wenfeng, who seemed to appear out of nowhere to scare the hell out of everyone from Silicon Valley to Washington to Wall Street.

Apparently, not everyone has noticed that China is making the leap from an agricultural to a post-industrial society in record time. What chuckles there must have been in Beijing and Shanghai when Chinese New Year was celebrated last week.

Last week I wrote that Silicon Valley was rudely awakened by DeepSeek, and on Tuesday I added that Wall Street had overreacted. Today an attempt to chart the winners and losers, short- and long-term, of the rise of DeepSeek.

Who is Liang Wenfeng?

But first: who is Liang Wenfeng, the founder and CEO of DeepSeek? What makes Liang unusual as a startup founder is his background as the founder of a hedge fund, High-Flyer.

"When we first met him, he was this very nerdy guy with a terrible hairstyle talking about building a 10,000-chip cluster to train his own models. We didn’t take him seriously" one of Liang's business partners told the Financial Times.

During his time at High-Flyer, Liang began buying Nvidia hardware and learned the many ways algorithms can be developed for AI applications, lessons he now applies at DeepSeek. More remarkably, DeepSeek's sudden success is driven by Gen Z newcomers from diverse backgrounds: Liang prizes originality and creativity in smart young people and values experience far less.

Liang has also talked about hiring literature buffs onto the engineering teams to refine DeepSeek's AI models. "Everyone has their own unique path and brings their own ideas, so there's no need to direct them." This is especially interesting to read in the week that Mark Zuckerberg boasted about scrapping all diversity programs at Meta in an effort to appease the Trump administration.

OpenAI worth $300 billion after all?

According to the Wall Street Journal, Japan's SoftBank would lead a $40 billion investment round in the ChatGPT maker, part of which is to be spent on its Stargate AI infrastructure project. At a valuation of $300 billion, OpenAI would become the second most valuable startup in the world, behind SpaceX, owned by Elon Musk, the major rival of OpenAI CEO Sam Altman.

It would be downright amazing if Altman manages to raise money for his money-losing company at that stratospheric valuation in the very week its vision and technological architecture are being questioned worldwide. But let us not overestimate SoftBank: it is the same firm, and the same man, Masayoshi Son, that burned tens of billions on WeWork, all the way to bankruptcy. The question is: why will no one but SoftBank step in at this valuation?

Is Stargate science fiction?

Both OpenAI and SoftBank have declared they will invest tens of billions in Stargate, the $500 billion AI infrastructure project that is supposed to seal American hegemony in technology. The crazy thing is that OpenAI doesn't have that money at all, and neither does SoftBank. So when SoftBank invests in OpenAI, which in turn invests in Stargate, it is essentially filling one hole with another.

The Verge published a lucid analysis of the Stargate project. If Stargate fails, it would not simply be the end of a startup. It would be an expensive reality check for an entire industry that claims to transform the world through pure computing power.

Altman likes to present himself as the protagonist in a classic science fiction story: the visionary who promises to transform society through technological power. 

In, say, a year, we will know whether Stargate was the beginning of America's AI revolution, or just a techno-optimistic fantasy that could not survive in the real world.

DeepSeek's actual costs

Then to a much-discussed topic: the costs allegedly incurred by DeepSeek to develop the acclaimed R1 model. The wildest stories are circulating about this, while DeepSeek itself has been fairly transparent about it:

"Finally, we again highlight the economic training cost of DeepSeek-V3, as summarized in Table 1, achieved by our optimized co-designs of algorithms, frameworks and hardware.

During the pre-training phase, training DeepSeek-V3 on every trillion tokens requires only 180K H800 GPU hours, or 3.7 days on our cluster with 2048 H800 GPUs. This completes our pre-training phase in less than two months and takes a total of 2.664M GPU hours. Combined with 119K GPU hours for context length extension and 5K GPU hours for post-training, DeepSeek-V3 costs a total of only 2.788M GPU hours for full training.

If we assume that the rental cost of an H800 GPU is $2 per GPU hour, our total training cost is only $5.576M. Please note that the above costs include only the official training of DeepSeek-V3 and not the costs associated with prior research and ablation experiments on architectures, algorithms or data."
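Just to make the arithmetic explicit, here is a minimal sketch that reproduces the totals from the quoted passage using only the figures DeepSeek itself reports; the $2 per GPU hour is their assumed rental price, not a market quote.

```python
# Reproduce DeepSeek's own training-cost arithmetic (figures taken from the quoted report).
H800_RENTAL_PER_HOUR = 2.00        # DeepSeek's assumed rental price per H800 GPU hour (USD)
CLUSTER_GPUS = 2048                # size of the pre-training cluster

pre_training_hours = 2_664_000     # 2.664M GPU hours for pre-training
context_extension_hours = 119_000  # 119K GPU hours for context length extension
post_training_hours = 5_000        # 5K GPU hours for post-training

total_gpu_hours = pre_training_hours + context_extension_hours + post_training_hours
total_cost_usd = total_gpu_hours * H800_RENTAL_PER_HOUR

# 180K GPU hours per trillion tokens, spread over 2048 GPUs, in wall-clock days
days_per_trillion_tokens = 180_000 / CLUSTER_GPUS / 24

print(f"Total GPU hours: {total_gpu_hours / 1e6:.3f}M")         # 2.788M
print(f"Training cost: ${total_cost_usd / 1e6:.3f}M")           # $5.576M
print(f"Days per 1T tokens: {days_per_trillion_tokens:.1f}")    # ~3.7
```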

The crucial part is that final sentence: none of the earlier costs are included in the calculation. It's like calculating the cost of a bodybuilder's meals on competition day without counting all the meals it took to get to the competition.

Cheaper AI: who benefits?

Even more interesting than the cost aspect is that DeepSeek's model can be installed locally and built upon. Microsoft CEO Satya Nadella pointed directly to the Jevons paradox.

In short: precisely because an innovation becomes cheaper, its use increases. It looks like Nadella is going to be right about that. In the long run, the "commoditization" of AI models and the cheaper inference demonstrated by DeepSeek will benefit Big Tech. Microsoft, for example, needs to spend less on data centers and GPUs, while benefiting from increased AI usage thanks to lower inference costs.
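As a toy illustration of the Jevons effect, consider the sketch below; the prices, token volumes and demand elasticity are hypothetical numbers chosen for illustration, not figures from Microsoft or DeepSeek.

```python
# Toy illustration of the Jevons paradox applied to AI inference.
# Every number here is hypothetical; only the shape of the effect matters.

def total_spend(cost_per_million_tokens: float, elasticity: float,
                baseline_cost: float = 10.0, baseline_tokens: float = 1e9) -> float:
    """Constant-elasticity demand: usage grows as the unit cost falls."""
    usage = baseline_tokens * (baseline_cost / cost_per_million_tokens) ** elasticity
    return usage * cost_per_million_tokens / 1e6

# A 10x drop in inference cost, with demand elasticity greater than 1...
before = total_spend(cost_per_million_tokens=10.0, elasticity=1.5)
after = total_spend(cost_per_million_tokens=1.0, elasticity=1.5)

print(f"Spend at $10 per 1M tokens: ${before:,.0f}")   # $10,000
print(f"Spend at  $1 per 1M tokens: ${after:,.0f}")    # ~$31,623: cheaper per token, higher total spend
```

When demand is elastic enough, the drop in unit cost is more than offset by the growth in usage, which is exactly the outcome Nadella is betting on.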

Amazon is also a big winner: AWS has not developed its own high-quality AI model, but that doesn't matter when there are high-quality open-source models available that it can offer at much lower cost.

Apple also benefits

Drastically reduced memory requirements for inference make AI on iPhones much more feasible. Apple Silicon uses a unified memory architecture, in which the CPU, GPU and NPU (neural processing unit) access a shared memory pool, Stratechery argues in an excellent piece. That effectively makes Apple's hardware the best consumer chip for inference: Nvidia's gaming GPUs, for example, top out at 32GB of VRAM, while Apple's chips support up to 192GB of unified memory.
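A rough back-of-the-envelope sketch shows why memory is the binding constraint for local inference; the parameter counts and quantization levels below are illustrative assumptions, and the estimate covers weights only, ignoring the KV cache and activations.

```python
# Back-of-the-envelope memory footprint for running an LLM locally.
# Parameter counts and quantization levels are illustrative assumptions;
# the estimate covers weights only and ignores the KV cache and activations.

def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory needed just to hold the model weights, in GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for params in (7, 70):              # a small and a large dense model
    for bits in (16, 4):            # fp16 vs. 4-bit quantization
        gb = weights_gb(params, bits)
        print(f"{params:>3}B @ {bits:>2}-bit: {gb:6.1f} GB  "
              f"fits 32GB GPU: {gb <= 32}  fits 192GB unified memory: {gb <= 192}")
```

On these assumptions, a 70B-parameter model quantized to 4 bits needs roughly 35GB for its weights alone: too large for a 32GB gaming GPU, but comfortable within 192GB of unified memory.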

Meta the biggest winner

AI is central to Meta's long-term strategy, and one of the biggest obstacles to date has been the high cost of inference. If inference and training become much cheaper, Meta can accelerate and expand its AI-driven business model more efficiently. 

Sensibly, Zuckerberg has reportedly set up several war rooms to determine how Meta should respond to the arrival of DeepSeek. While DeepSeek may look like a short-term threat to Meta's AI strategy built around its Llama LLM, a structural reduction in AI development costs will actually be a huge advantage for Meta, which is on track to invest $65 billion in AI development this year alone.

Most of that is spent on hardware and data centers. If that kind of investment can be minimized by imitating DeepSeek's approach, Meta will see its net profits increase substantially without weakening its competitive position.

Google the loser?

While Google also benefits from lower costs, any change to the current status quo is likely to be a net detriment to Google. Every search in OpenAI, DeepSeek or a Meta agent comes at the expense of a search on Google's search engine.

Despite all its efforts and hundreds of acquisitions over the last few decades, Google still depends largely on the search engine for revenue and profits. It remains to be seen whether Google will succeed in "redirecting" that traffic from the AI agents and chatbots the world so eagerly uses, back to Google's AI tools.

Nvidia not defeated by DeepSeek

Despite DeepSeek's breakthrough, Nvidia has two moats, according to Stratechery:

  • CUDA is the preferred programming language for anyone developing these models, and CUDA works only on Nvidia chips.
  • Nvidia has a huge lead when it comes to the ability to combine multiple chips into one large virtual GPU.

These two lines of defense reinforce each other. As mentioned earlier, if DeepSeek had had access to H100s, they probably would have used a larger cluster to train their model simply because it was the easiest option. The fact that they did not and were limited by bandwidth dictated many of their decisions in terms of model architecture and training infrastructure.

DeepSeek has shown that there is an alternative: heavy optimization can achieve impressive results on weaker hardware and with lower memory bandwidth. So paying more to Nvidia is not the only way to develop better models.

However, there are three factors that still work in Nvidia's favor.

  • First, how powerful would DeepSeek's approach be if applied to H100s or the upcoming GB100s? Just because they have found a more efficient way to use computing power does not mean that more computing power would not be useful.
  • Second, lower inference costs are likely to lead to wider use of AI in the long run. Microsoft CEO Satya Nadella recently confirmed this in his late-night tweet about the Jevons paradox.
  • Third, reasoning models such as R1 and o1 derive their superior performance from using more computing power. As long as AI's strength and capabilities depend on more computing power, Nvidia will continue to benefit.

Also, with a larger market, Nvidia will benefit from revenue growth in cheaper chips, although it will be hampered in that market by competitors such as AMD. 

My subjective "Spotlight on AI" basket took relatively few hits last month.

DeepSeek thought for 28 seconds about a hot dog

Joanna Stern of the Wall Street Journal did a funny test of DeepSeek and discovered how it differs from OpenAI's ChatGPT and Anthropic's Claude. Unlike OpenAI's reasoning models, DeepSeek shows its full thought process. When asked if a hot dog is a sandwich, DeepSeek thought about it for 28 seconds and responded with: "First, I need to understand what the definition of a sandwich is." It illustrates that there is no specific form of AI that works best for all issues.

The advance of AI throughout society is irreversible and with DeepSeek's approach, which will be copied frequently, the market will only grow larger. Therefore, despite all the doom-and-gloom news last week on Wall Street, it is fascinating that over the entire month of January, the performance in what I consider to be AI stocks has been better than one would expect. 

ARM's 29% rise is remarkable and is largely based on ARM's participation in Stargate. The striking thing is that SoftBank owns ARM, so there is a good chance that Masayoshi Son will use the ARM shares as collateral when raising the loans with which SoftBank can then pay for its investments in OpenAI and Stargate. Time will tell whether this approach leads to a skyscraper or a house of cards.

This is how the main parties of the DeepSeek crash closed on Wall Street yesterday

What did America's tech billionaires buy from Trump?

President Trump has often expressed hostility toward major technology companies and their leaders, calling Facebook an "enemy of the people" and labeling Jeff Bezos "Jeff Bozo," for example. Yet these gentlemen sat in the front row at the inauguration, having shelled out significant sums of money. That was obviously no coincidence, and the technology sector wants something back from Trump soon. Bloomberg looked at each of them and mapped out what they want to accomplish.

As we take stock of the performance of Big Tech stocks in the month of January at the end of the second week in Trump's second reign, it appears that the short-term results are not yet what Trump's new tech pals had hoped for. Despite all of Trump's presidential decrees and appointments, stock market results have been rather mixed, to say the least.

What is particularly striking is that investors are sharply divided over the tech sector as a whole. Meta rose mainly due to good quarterly earnings, but how could Microsoft fall while Google rose? Did Apple fall in January due to the possibility of a trade war with China? It is strange that the financial media was mostly focused on last week's results and ignored what happened in terms of price swings earlier in the month. Consider, for example, Palantir, up nearly 10% in January and already up 385% in the last year.

Huang with Trump, Liang with Li Qiang

President Trump and Nvidia CEO Jensen Huang discussed the impact of DeepSeek and possible restrictions on AI chip exports to China during a meeting at the White House on Friday. Huang will certainly have been thinking about the possible impact on Nvidia's stock price.

DeepSeek's Liang Wenfeng also met with an important politician this week: as the sole representative of the AI industry, he met with Premier Li Qiang, China's second most powerful man. Both meetings underscore the importance of technology to economic power in the new world order defined in part by AI.

Palantir CEO Alex Karp told CNBC that the rise of DeepSeek is a sign that the U.S. needs to work faster to develop advanced AI. "Technology is not necessarily good and can pose threats in the hands of adversaries. We need to recognize that, but that also means we need to run harder, go faster and make a national effort."

Boring: success begins with homework

Europe no longer figures in the geopolitical shuffle between continents; how can that be, with so much talent among half a billion people?

Malaysian comedian Ronny Chieng summed up the West's problem perfectly: people are willing to die for their country, but they don't want to do homework for it. Chieng is talking about America, but it applies just as well to Europe.


By Michiel Frackers

I try to develop solutions that are good for the bottom line, the community and the planet at Blue City Solutions (bluecity.solutions) and Tracer (jointracer.io).