DeepSeek - More open than OpenAI
DeepSeek has thrown open a question about American AI supremacy. It has also forced questions about the claims made by the AI gods in America. Are they lying or just plain stupid? They don't look go
In The Beautiful Constraint, the authors discuss Google's early days. Neither Larry nor Sergey knew how to build a webpage, so they made the simplest page possible—a text box to collect a query and a button to send it. The website's simplicity was one of the biggest contributors to its success. There was only one thing you could do on Google.com.
Constraints are a great way to focus your attention and force you to find new ways of solving a problem.
In 2012, Fei-Fei Li was working on building an algorithm that could identify pictures. She downloaded a million pictures under various categories to train the neural network. The breakthrough she made at Stanford formed the basis for many of the AI technologies that are available today. Many of the researchers on this team were later poached by a not-for-profit called OpenAI.
With no shortage of money, they started brute-force development of models. By adding more and more parameters alongside greater processing power, they built ChatGPT.
Starting in October 2022, the Biden administration restricted exports of advanced AI chips to China. This restriction on computing was a beautiful constraint.
Denied the most powerful chips thought needed to create state-of-the-art AI models, DeepSeek pulled off some engineering master strokes that allowed the researchers to do more with less. The DeepSeek-V3 and DeepSeek-R1 models the company recently released achieved state-of-the-art performance in benchmark tests and cost much less time and money to train and operate than comparable models.
And the cherry on top: The company’s researchers showed their work—they explained the breakthroughs in research papers and open-sourced the models so others can use them to make their own models and agents.
The main reason DeepSeek had to do more with less is that the Biden administration put out a series of restrictions on chip exports saying that U.S. chipmakers such as Nvidia couldn’t ship the most powerful GPUs (graphics processing units, the go-to chip for training AIs) to countries outside the U.S.
Source: Fast Company
OpenAI has been run like a blank-cheque company, with no constraints on capital or resources. With investment from Microsoft, it had both data centre capacity and capital at the ready. Hence it built a model that maps and activates hundreds of billions of parameters to answer any question.
This is a bit like reading every book on the planet before answering the question “How do you make tea?”
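DeepSeek has taken a different approach. Its model still holds all of these parameters, but it uses an architecture called Mixture of Experts (MoE). A router sends each piece of input to a handful of specialised experts, so only a fraction of the parameters is active for any one answer. For DeepSeek-V3, the company's technical report puts this at about 37 billion active parameters out of 671 billion. This makes the model radically more efficient.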
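The routing idea behind a Mixture of Experts can be sketched in a few lines of Python. This is a toy illustration and not DeepSeek's actual implementation: the eight scalar "experts", the hard-coded router scores and the helper names (`route_top_k`, `moe_forward`) are all assumptions made up for this example. In a real model the router scores are computed from the input by a learned layer.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalise their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and blend their outputs."""
    chosen = route_top_k(router_scores, k)
    output = sum(weight * experts[i](x) for i, weight in chosen)
    return output, [i for i, _ in chosen]


# Hypothetical toy setup: 8 "experts", each just scales its input.
NUM_EXPERTS = 8
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

# Hard-coded scores stand in for a learned router (an assumption).
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

output, used = moe_forward(1.0, experts, router_scores, k=2)
active_fraction = len(used) / NUM_EXPERTS

print("experts used:", used)
print("fraction of experts active:", active_fraction)
```

Only two of the eight experts do any work for this input. The other six are never called, which is where the saving in computation comes from.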
Given that it is from China, I can already hear the argument coming: it’s just the entire population of China hired to answer every question we toss at it. But the model is open source. Anybody can download it, check it out and build on top of it.
Source: Deepseek
Processor War
Nvidia has been the darling of the stock market because of the huge demand that AI has placed on its GPUs. The stock traded at around $5 in 2020 and peaked at almost $150 a few weeks ago.
Since OpenAI had taken a brute-force approach, it seemed the only way to get better at AI would involve adding processing capacity.
There was a processor war underway with companies trying to secure Nvidia’s processor output. Also, nuclear plants that had been shut down for years were being brought back online to cater to the vast energy needs of these models.
DeepSeek said in a research paper that its V3 model cost a mere $5.576 million to train. By comparison, OpenAI CEO Sam Altman said that the cost to train its GPT-4 model was more than $100 million.
Source: Fast Company
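The arithmetic behind the quoted figure can be checked with a rough sketch. The GPU-hour count and the $2 rental price below are not in the article quoted above. They are taken from my reading of the DeepSeek-V3 technical report and should be treated as assumptions. The report also says the figure covers only the final training run, not earlier research and experiments. The GPT-4 number is the "more than $100 million" floor from the quote.

```python
# Assumed inputs (from the DeepSeek-V3 technical report, not the article):
GPU_HOURS = 2_788_000            # H800 GPU hours for the full training run
RENTAL_USD_PER_GPU_HOUR = 2.0    # assumed rental price per GPU hour

deepseek_v3_cost = GPU_HOURS * RENTAL_USD_PER_GPU_HOUR

# Lower bound quoted for GPT-4 ("more than $100 million").
gpt4_cost_floor = 100_000_000

ratio = gpt4_cost_floor / deepseek_v3_cost

print(f"DeepSeek-V3 training cost: ${deepseek_v3_cost:,.0f}")
print(f"GPT-4 cost is at least {ratio:.1f}x higher")
```

On these assumptions, GPT-4 cost at least roughly 18 times as much to train. The two figures may not measure exactly the same things, so the ratio is indicative only.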
This is at the heart of Nvidia’s gut-wrenching fall. If models can be trained with far less computing capacity, investment in chips declines and Nvidia’s future prospects look dim.
Only Way
In 2023, Sam Altman took a world tour to tell investors and policymakers across the world that there was no point in trying to build a similar product. He told them to “forget it”. All of the investors in India meekly complied.
He also wanted the field to be heavily regulated, with a body modelled on the IAEA created to oversee it, which could potentially kill any competition arising from any corner of the world.
The Chinese did not care what Sam Altman had to say. They not only produced a comparable product in much less time but also made it radically more efficient.
As I wrote at the time: building Amazon in 1995 was hard; today, monkeys can probably develop an e-commerce site. The barriers to entry erode over time.
He wanted to make it seem that this was the only way. Of course, it was not.
Open
The most radical part of this entire saga is that OpenAI is not open, while DeepSeek is.
Unlike its Chinese counterpart, OpenAI doesn’t disclose the underlying “weights” of its models, which determine how the AI processes information. It also has declined to make public the full “chains of thought” produced by its own reasoning models.
Source: WIRED
Americans like to tout themselves as the purveyors of freedom. They would like the rest of the world to believe that they are the only ones to operate at the cutting edge of technology.
This news certainly put a spanner in the works. It makes it obvious that the reality is far different.
First of all, this whole AI business has delivered very little real-world value, and hyping it up to this extent was not needed. Once a regurgitation technology has been hyped this much, you have to live up to the bar you have set. On that front, this announcement has highlighted the spectacular failure of Big American Tech.
That this comes a week after Trump’s inauguration almost seems orchestrated. You have a president announcing imperialist ambitions and a departure from all the values Americans profess (but do not live by). Immediately after, their technological supremacy is openly questioned.
Helping the Oligarchs
China’s greatest contribution to the world has been helping its oligarchs rob American jobs. First, the manufacturing jobs were lost, and now writing and coding jobs will be lost to AI. DeepSeek will only hasten this.
Under similar constraints, couldn’t India develop something of its own, called, say, Everest? We rebuked Altman for his comments. Couldn’t we give him back a DeepSeek of our own, an Everest?