Am I the only one excited for the release but not overanalyzing their words? This thread feels full of personal interpretations. DeepSeek is still a business—great release, but expectations and motivations seem inflated.
Probably it's because there's nothing specific here to discuss. In the absence of specific new information, discussions turn generic [1] and that tends to make for shallow/indignant discussion. That's one reason why an announcement of an announcement (like "Starting next week, we'll open-source 5 repos") is off topic on HN [2].
The releases themselves may turn out to be interesting, of course, and then there may be something substantive to have a thread about. The best submission would be to pick the most interesting release once it shows up.
The "launch week" pattern isn't great for HN, because we end up with a bunch of follow-ups that we have to downweight [3], and there's no guarantee that the largest thread(s) will be about the most interesting element(s) in the sequence. But startups do it anyway so we'll adapt.
[1] https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
[2] https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...
[3] https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
In China businesses are not treated as a type of person under law. The word "business" does not mean the same thing there.
“Pure garage-energy” is a great phrase.
Most interested to see their inference stack, hope that’s one of the 5. I think most people are running R1 on a single H200 node but Deepseek had much lower RAM per GPU for their inference and so had some cluster based MoE deployment.
Their tech report says one inference deployment is around 400 GPUs...
You need that to optimize load balancing. Unfortunately that gain is not available to small or individual deployments.
I don't think the RAM size of the H800 was nerfed (80GB), but rather the memory bandwidth between gpus.
But yeah, would be interesting to see how they optimized for that.
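A rough back-of-the-envelope sketch of why the memory per GPU matters here. The 671B parameter count and the 80 GB H800 figure come from this thread; the 1 byte per parameter (FP8 weights), 141 GB per H200, and 8 GPUs per node are assumptions, and the estimate counts weights only, ignoring KV cache and activations.

```python
import math

# From the thread: R1 has 671B parameters; the H800 has 80 GB.
# Assumptions (not from the thread): FP8 weights at 1 byte per parameter,
# 141 GB per H200, 8 GPUs per node, nothing reserved for KV cache.
PARAMS_BILLION = 671
BYTES_PER_PARAM = 1
GPUS_PER_NODE = 8


def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Weight footprint in GB (billions of params times bytes per param)."""
    return params_billion * bytes_per_param


def min_gpus_for_weights(gpu_mem_gb: float) -> int:
    """Smallest GPU count whose combined memory holds the weights alone."""
    return math.ceil(weights_gb(PARAMS_BILLION, BYTES_PER_PARAM) / gpu_mem_gb)


for name, mem_gb in [("H200", 141), ("H800", 80)]:
    needed = min_gpus_for_weights(mem_gb)
    fits = needed <= GPUS_PER_NODE
    print(f"{name}: at least {needed} GPUs for weights, fits on one node: {fits}")
```

Under these assumptions the weights alone overflow a single 8x80 GB node, which would be consistent with a multi-node MoE deployment, though the real constraint depends on precision, cache size, and batch size.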
Correct. There are 3 main ways to "gimp" high end GPUs meant for training - "cores", "on-chip memory speed" and "interconnects". IIUC the H800 had the first 2 unchanged but halved the interconnect speeds.
H20 is the next iteration of the "sanctions" that I believe also limited the "cores" but left the on-chip memory intact, or slightly higher (from the new generation).
“Pure garage-energy” with 10,000 A100s, apparently. I’d love to have a garage like that.
From https://semianalysis.com/2025/01/31/deepseek-debates/
> We believe DeepSeek has access to around 10,000 of these H800s and about 10,000 H100s. Furthermore they have orders for many more H20’s, with Nvidia having produced over 1 million of the China specific GPU in the last 9 months.
The paper in the repo says: “For DL training, we deployed the Fire-Flyer 2 with 10,000 PCIe A100 GPUs”
that report is lazy. they assume all GPUs owned (openly reported) by the parent company (a hedge fund which claims to use those GPUs to generate trades) were used by the invested company.
that's as dumb as saying coca cola has access to all offices of Berkshire Hathaway.
likewise, all comments praising deepseek history are also misleading as the company barely exists for a year.
everything is opaque marketing being repeated. just drop the off topic bla bla bla and focus on the facts and code in front of you.
thanks for coming to my ted talk.
[flagged]
Hey, could you please make your points without resorting to the flamewar style? You've done that repeatedly in this thread, as well as in other threads recently (e.g. https://news.ycombinator.com/item?id=43035040). This is not what HN is for, and destroys what it is for.
If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful. The basic idea is to make your substantive points thoughtfully, regardless of how wrong anyone else is or you feel they are.
Didn't the deepseek paper itself state they trained on 2048 H800s?
Claiming they have access to 5x this amount is not such a bold claim?
Appeals to authority are so totally unconvincing.
What claims from the semianalysis article do you think are false? And based on what evidence?
The parent High-Flyer hedge fund has only been around for a few years with 8B AUM, so its single-digit-% management fees since founding total in the low 100s of millions (for all operating expenses); hence it fiscally cannot acquire 1B+ of hardware capex alone. DeepSeek having access to that much hardware doesn't pass a basic smell test, and SemiAnalysis has been dodging call-outs on socials for this basic math illiteracy.
This is more exciting to me than OpenAI's 12 days of Christmas
Emotionally I agree, but... o1 was a paradigm shift. Nothing DeepSeek has done is on that level yet. DeepSeek themselves would agree. Supposedly Liang Wenfeng himself flew to US to gather information when o1 was launched.
The paradigm shift is the actual 'Open' part, which OpenAI seems to be struggling with.
Maybe in terms of advancing scientific knowledge but DeepSeek has achieved a paradigm shift back from opex to capex. Certain applications are now economically viable when you don't have to pay per request and don't have to fight NVIDIA/sanctions for the privilege
how much of that cost is hidden/subsidized though? Unless I missed something, there's lots of claims but lots of fuzz also. If you bring up API fees: the CCP is notorious for subsidizing local businesses to operate at a loss on the global stage.
You aren't paying per request/GPU access, CCP is.
This is so damn true. I wish people would stop taking companies in China at face value about any of their claims if the CCP has a vested interest in them for geopolitical and economic reasons. Bytedance is another example.
It's telling that "South Korea has accused Chinese AI startup DeepSeek of sharing user data with the owner of TikTok in China." - source: >https://www.bbc.com/news/articles/c4gex0x87g4o
Bytedance, which has had a CCP government official on their board for years: >https://www.reuters.com/technology/bytedance-says-china-unit...
Deepseek's claims that they used old unsanctioned gpus are probably totally fabricated as well (side point: giving Singapore F-35s was probably a mistake): >https://www.tomshardware.com/tech-industry/deepseek-gpu-smug....
I mean it's not like an entity that bypasses sanctions would ever be open about it, as doing so would immediately result in more sanctions and the closing of loopholes. What does the CCP have to gain? What did it have to gain by stealing hundreds of billions of western IP in the past? 4 things: Power, prestige, riches, and the means to keep their power. This has been going on since at least 2004 (see Nortel case: https://globalnews.ca/news/7275588/inside-the-chinese-milita...)
The US winning the AI race was a clear threat to those 4 things. Hurting investor sentiment by a) distilling a model which cost billions to develop, and b) spreading propaganda and muddying the waters about costs, gpus, etc, helps them to narrow the gap. Making it open source was not done out of the goodness of their hearts, but out of self interest - another attempt to deflect from their actions (further muddying the waters) and divide the public against taking any further punitive action against the state (given the connection re: SK claims - tiktok algorithms were probably on overdrive spreading their bs).
Yeah OpenAI's 12 days was pure Altman bs
> Starting next week, we'll open-source 5 repos – one daily drop
Probably counts as announcement of announcement? Let’s wait for the actual repo drops before discussing them, especially because there are no details about what will be open sourced other than
> These are humble building blocks of our online service: documented, deployed and battle-tested in production.
You are right for sure saying to wait for the actual repos.
But on the other hand, compare this announcement in a README.md file in a GitHub repo with this slideware approach of EU https://openeurollm.eu/
If I had to bet on someone providing some value, unfortunately I wouldn't bet on Europe.
I'm saying this as a European, deeply convinced that Europe is a good place to live. I've also worked for a couple of EU funded research projects, so I have some background experience on the outcome of these projects.
You’re not wrong, it’s a hell lot more exciting to watch players organically emerging from a competitive landscape with stuff you can put your hands on today (or next week) than players hand-picked and tasked by governments, making hollow announcements before they have anything interesting to show.
It's not just that. It's that the rest of the world is moving at light speed compared with EU. If people in EU want this project to survive, they need to change attitude, a lot.
If they are OK to let the EU project fail, they need to consider what the world will be. Europe has never been composed of dwarfs, but that's what every single EU country has become in the past 50 years.
Without US influence, away from big players, with less and less performant economies and industries, without a plan, with a difficult neighbour to address... it's going to be extremely difficult for Europe.
> Europe has never been composed of dwarfs, but that's what every single EU country has become in the past 50 years
How do CERN, ESA and Airbus fit into this worldview? They are unquestionably giants in their respective fields, from my POV.
I'm fully cognisant of SpaceX vs Ariane in reusability, but that is cancelled out by outcomes/culture at Boeing vs Airbus. Broadly speaking, Europe is either number 1 or number 2 (behind the US) in engineering and hard sciences; there's no reason to give up because they fall to #4 or 5 in some fields like software, or "AI" specifically, especially since I'm convinced that the irrational exuberance in the US will come to an end when (not if) investors start demanding the elusive ROI on AI investments. When the music stops, a lot of the "advanced" AI companies that look amazing now will be insolvent, but Euro projects will still be funded.
Yup, I posted https://news.ycombinator.com/item?id=43129444 before I saw that you'd made the point already.
On a completely innocuous side note, I kind of like to see the "drop" language used by electronic dance music and hip hop producers used in software.
I think before "drop" in electronic music was a widely used term, "dropping a new track" (ie releasing new music) was a common hip-hop term, since forever.
Honestly I think this is drop as in drop shipping.
Deep respect for DeepSeek and what they've done regarding all the innovations and researches they have been putting out in-the-open.
"Because every line shared becomes collective momentum that accelerates the journey. Daily unlocks begin soon. No ivory towers - just pure garage-energy and community-driven innovation" is a great phrase.
In fact they are totally dismantling OpenAI. Most likely, without any intention on their part.
LLMs are a more legitimate version of what "blockchain" was when most CIO magazines were stuffed with essays of the "What's your blockchain strategy?" kind.
AI bubble will burst and will burst hard. By end of 2026 at max.
Doesn't OpenAI have like 400M weekly active users now?
Is that app/website or API or both?
seems like app/website
> chatgpt recently crossed 400M WAU, we feel very fortunate to serve 5% of the world every week, 2M+ business users now use chatgpt at work, and reasoning model API use is up 5x since o3 mini launch
I mostly agree with you. Google has a good strategy of driving down costs, for example. I am amazed by the large number of API providers who host either the original DeepSeek R1 or a distilled version.
When cost approaches zero, use cases increase exponentially.
> Most likely, without any intention on their part.
I think this is a very, very naive assumption.
The founder is a quant with involvements in domestic investments and market design and pricing for decades - in China.
As seen with the case of Jack Ma, after you cross a certain level, there is no such thing as "not involved with politics" in China.
Liang knows exactly what he's doing.
> During 2021, Liang started buying thousands of Nvidia GPUs for his AI side project while running High-Flyer. Some industry insiders viewed it as the eccentric actions of a billionaire looking for a new hobby. One of Liang's business partners said they initially did not take Liang seriously and described their first meeting as seeing a very nerdy guy with a terrible hairstyle who could not articulate his vision. Liang simply said he wanted to build something and it will be a game changer which his business partners thought was only possible from giants such as ByteDance and Alibaba Group.
> During that month in an interview with 36Kr, Liang stated that High-Flyer had acquired 10,000 Nvidia A100 GPUs before the US government imposed AI chip restrictions on China.
> On 20 January 2025, Liang was invited to the Symposium with Experts, Entrepreneurs and Representatives from the Fields of Education, Science, Culture, Health and Sports (专家、企业家和教科文卫体等领域代表座谈会) hosted by Premier Li Qiang in Beijing. Liang, being considered as an industry expert, was asked to provide opinions and suggestions on a draft for comments of the annual 2024 government work report.
> On 17 February 2025, Liang along with the heads of other Chinese technology companies attended a symposium hosted by President Xi Jinping at the Great Hall of the People in Beijing.
Whether he intended to or not initially, what happens with DeepSeek is now out of this man's hand and will be 100% influenced by politics.
The chip bans and dual use nature of the technology have catapulted Liang to the first row of CCP tech strategists' attention, for sure.
I am not sure what you mean by AI bubble. Do you mean the valuation of some companies? Of course some won't do well in the future. In the meantime, a significant part of the population relies on it to accelerate their tasks (be it admin work, legal questions, learning, getting inspiration). There is no way back. It feels like saying the video streaming bubble will burst in 2020. No. It is too valuable. But yes, some players will die. Nothing special here. IMHO.
A bubble bursting does not mean the industry in the bubble ceases to exist. It means the market hype dies down and only the things that have actual value survive. When it comes to AI, realistically most of the hype is fluff, so calling it a bubble is fair.
I mean the whole world still uses the Internet after the dot-com bubble burst. A significant number of “AI companies” are valued at revenue multiples never used before. 44x in the case of OpenAI for example. I agree there is no going back, but this bubble will burst, and hard. IMHO.
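To make the multiple concrete, here is a small arithmetic sketch. It pairs the 44x figure above with the $157BN valuation quoted elsewhere in this thread; treating those two numbers as belonging together is an assumption, and the lower comparison multiples are purely hypothetical.

```python
def implied_revenue(valuation_bn: float, revenue_multiple: float) -> float:
    """Annual revenue (in $BN) implied by a valuation and a revenue multiple."""
    return valuation_bn / revenue_multiple


valuation_bn = 157.0  # figure quoted elsewhere in the thread
multiple = 44.0       # figure quoted in the comment above

revenue_bn = implied_revenue(valuation_bn, multiple)
print(f"Implied annual revenue: ${revenue_bn:.2f}BN")

# Hypothetical lower multiples, only to show how sensitive the valuation is.
for m in (10.0, 20.0):
    print(f"At {m:.0f}x the same revenue supports about ${revenue_bn * m:.0f}BN")
```

The point is only that the same revenue supports a far smaller valuation once the multiple compresses, which is what a burst would look like in these terms.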
Kinda interesting to see where the moat is in AI space. Good base models can always be distilled when you have access to the API. System prompts can get leaked, and UI tricks can be copied. In the end, the moat might be in the hardware and vertical integration.
> the moat might be in the hardware and vertical integration.
The moat is the products that can be built. The moat is always the product - because a differentiated product can't be a commodity. And an LLM is not a product.
Google and MSFT and Meta have already "won" because they have profitable products they can build LLMs onto. Every other company seems to be burning cash to build a product, and only ChatGPT is getting the brand recognition to realistically compete.
Building an LLM is like building a database. Sure a good one unlocks new uses, but consumers aren't buying something for the database. Meanwhile enterprise customers will shop around and drive the price of a commodity down while open source alternatives grow from in-house uses to destroy moats.
Even hardware isn't a true moat. Only Google has strong vertical integration with their TPUs, and that gives them a lead. BUT Microsoft, AWS, Meta and a whole bunch of startups are building out custom silicon which will surely put pressure on them and Nvidia to keep innovating and earning that price edge.
See I kind of buy the database argument but also kind of don't. A database needs an operator whereas a LLM doesn't. You're basically melting the product into a piece of goo and the UI can be approached using natural language.
For products that still need a UI you could claim that LLM operators take over, so that's still a tax you pay to the incumbents as you interact with a product. It's sort of like we take the money which was paid to SQL operators and engineers and instead pay it to the hyperscalers.
LLMs absolutely need an operator - who runs the servers and GPUs that hosts the models? Who writes the system prompts? Who fine tunes and trains the models? This can be a big cloud api like AWS, but it can also be a custom-in-house service for a company.
Users of LLMs don’t quite have an equivalent employee to a DBA, but neither do most customers of AWS DynamoDB or RDS or whatever.
Many use cases of LLMs won’t be chat bots like ChatGPT. They’ll be tools for automated summarization, classification, etc. They’ll be automated assistance and basic tool calling, etc. They’ll perform OCR and document analysis. Automated translations etc.
Oracle is doing great just selling databases. Having your data is a moat.
How many times have we been down this path? TCP/IP, DOS/Windows, Linux, virtualization, and on and on. Open platforms always seem to find a way to usurp everyone else. In the end, it's better to be a service provider.
Open source finds a way.
Good enough + open (and free) is a very appealing proposition.
> Good base models can always be distilled when you have access to the API.
What does that mean?
You can use the outputs of a closed source model (or deepseek -> llama, see llama 70b deepseek distilled) to create a synthetic training data set which lets you fine tune (distill) most of the benefits of the "smarter" model into a "dumber" model. This is why OpenAI does not show the actual full chain of thought but a summarized version: to stop exfiltration of their IP, which has proven immensely difficult.*
*disclaimer; i am an expert of nothing
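A minimal sketch of the data-collection half of that idea, assuming a stand-in teacher. `query_teacher` is a hypothetical placeholder, not any real API; in practice it would call the stronger model, and the resulting file would feed a separate fine-tuning step that is not shown here.

```python
import json
from typing import Callable, Dict, List


def query_teacher(prompt: str) -> str:
    """Hypothetical stand-in for a call to the stronger ("teacher") model."""
    return f"[teacher answer to: {prompt}]"


def build_distillation_set(
    prompts: List[str], teacher: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Pair each prompt with the teacher's output as a supervised target."""
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Serialize records one JSON object per line, a common fine-tuning format."""
    return "\n".join(json.dumps(r) for r in records)


prompts = ["Explain MoE routing in one sentence.", "What is 17 * 23?"]
dataset = build_distillation_set(prompts, query_teacher)
print(to_jsonl(dataset))
```

The smaller model is then fine-tuned on these prompt/completion pairs, which is why hiding the full chain of thought removes the most valuable part of the training signal.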
Why do we need a moat?
_We_ don't. Investors do. Because without being able to gatekeep the rest of the world, there is little money in LLMs.
Indeed.
I guess investors should stop pouring money into LLMs, then. Just like how they don't pour money into pure mathematics.
So a company can make enough money to fund the next breakthrough/training run
there is no open source alternative to GPU farm, that's the moat
that's why they can open source their model and be fine because running this shit is actually hard, let alone maintaining SLA for millions of users??
How long until laptops are able to run high end models? What's the use case that requires a server farm for end users?
maybe in the next 5 - 10 years??? but even then the frontier would be pushed further, and people would get used to, let's say, a cloud-hosted 10 trillion parameter model, and using a 600B model would feel stupid
>Kinda interesting to see where the moat is in AI space.
Where we're going, we don't need moats.
ecosystem
Could DeepSeek and OpenAI swap names?
OpenSeek and DeepAI?
I think GP means that DeepSeek is actually open and thus should be named OpenAI.
This is great to see! Open-sourcing infrastructure tools can really accelerate innovation in the AI space. I've found that having access to well-documented repos makes it much easier to experiment and build on existing work. Are there any specific areas these repos focus on, like distributed training or model serving?
How do the valuations of foundation model companies compete with them being firmly open sourced by Facebook and DeepSeek? It seems likely that building these models will not produce hundreds of billions in value given China and Facebook are giving them away largely for free.
Those valuations are built on an imaginary future the founders made investors believe.
The idea is: if we reach true AGI first, we are going to own ALL THE MONEY!
Which erroneously assumes that models can't be siphoned off/recreated, as deepseek proved possible and even reasonably doable. Which in turn fundamentally shows that both openai and anthropic very likely have basically no moat.
I can almost smell another AI winter arriving, once all those valuations meet reality.
I can't see a future where AGI exists and money in general isn't worthless within 6 months of it existing. Either it kills us all, or makes the creator so much money that it's essentially worthless because they're the only one with money, or creates a utopia where money isn't needed.
The laws of economics apply just as much to AI as they do to humans, if anything AI is an even better (more rational) homo economicus. Even if AI wiped out all humans, the AIs would still need a monetary system for trading among themselves.
Would they need to trade once they figure out how to generate energy for nearly free and thus obtain anything? Trading is for the resource limited.
AGI means smart like a human, it doesn't mean it invents new things only found in science fiction. It is unlikely the AI will generate infinite free energy.
Resources still need to be allocated. Even by the AI itself. At some point a group of neurons will “argue” for more resources, and then within that group a subgroup will, and so forth. Hence the paperclip maximizer…
winter won't come soon enough this time
Postgres and MySQL are free, but that hasn't stopped Oracle from making tens of billions each year in database subscriptions.
IMO it's harder to move away from Oracle DB than from Open AI. The type of businesses that rely on Oracle DB have all the characteristics of a "tech kidnap victim". Huge DB-driven projects, old bad code with few tests, and a profit margin low enough to not be able to fund a migration to a different DB.
I think businesses that rely on new AI models are very different.
It's still way too early. Many AI labs will fold, fall behind, get bought out. In the end, it'll always end up with 1-2 big ones left standing and a few smaller ones fighting for scraps.
It's pretty disrespectful to DeepSeek to say "Facebook and China".
Looking forward to it! I'll generally make an effort to use Open Models over proprietary alternatives when the use-case permits, as Open Models getting better and more popular encourages more models to become open as well - a prerequisite for a future where we can build self-hosted solutions that aren't beholden to the control of mega corps and AI monopolies.
Is this actually going to be open source? Or is it going to be just an open weights release? Seeing training code would be interesting.
Personally I don’t think even a true open source release would erase the downsides of the model incorporating CCP propaganda and censorship. I would prefer control of megacorps to control of an untrustworthy dictatorship.
I wonder if they are just shorting Nvidia...
With how they are releasing models and keeping the open source spirit alive? I hope to god they are. Let the quants cook!
This could boost Nvidia. https://en.wikipedia.org/wiki/Jevons_paradox
> In economics, the Jevons paradox occurs when technological advancements make a resource more efficient to use (thereby reducing the amount needed for a single application); however, as the cost of using the resource drops, if the price is highly elastic, this results in overall demand increases causing total resource consumption to rise.
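A toy illustration of the quoted definition, assuming a constant-elasticity demand curve (an assumption chosen for simplicity, not something claimed in the thread). The function name, the scale constant, and the elasticity values are all hypothetical.

```python
def resource_use(
    efficiency: float,
    elasticity: float,
    resource_price: float = 1.0,
    scale: float = 100.0,
) -> float:
    """Total resource consumed under constant-elasticity demand for the service.

    Higher efficiency lowers the cost per unit of service; demand for the
    service responds to that cost with the given price elasticity.
    """
    service_cost = resource_price / efficiency
    service_demand = scale * service_cost ** (-elasticity)
    return service_demand / efficiency  # resource needed to meet that demand


for elasticity in (0.5, 1.5):
    before = resource_use(1.0, elasticity)
    after = resource_use(2.0, elasticity)
    trend = "rises" if after > before else "falls"
    print(f"elasticity {elasticity}: resource use {trend} ({before:.1f} -> {after:.1f})")
```

Doubling efficiency only increases total resource consumption when the elasticity is above 1, so whether cheaper inference helps or hurts GPU demand depends on how elastic demand for AI services actually is.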
Tencent recently bought 100k-200k H20 to serve R1. [1] I think it's not clear open source will tank nvidia price. And you won't place a lot of bets if the outcome is far from certain.
[1]:https://aiproem.substack.com/p/ai-at-the-speed-of-light-tenc...
what does it have to do with anything? trading and stocks have no correlation whatsoever with actual company sales and products.
> Why? Because every line shared becomes collective momentum that accelerates the journey.
Truly admirable on their part and a great paradigm for others. The reasons for this don't really matter to me, but I can't help but wonder if somehow they were obliged or otherwise indebted to follow this route.
This team is truly something special.
> These are humble building blocks of our online service: documented, deployed and battle-tested in production. No vaporware, just code that moved our tiny moonshot forward.
My not-so-innocent guess is that they are looking to crowd-source their online platform (the front-end essentially) in order to reduce costs. Still acceptable though as they made the model open weight and partially reproducible.
Everyone who ever open-sourced anything knows that it just isn't cost cutting. You suddenly get an army of people posting issues and opinions, and those who try to contribute often make more mess than it's worth.
their frontend is probably just open-webui https://github.com/open-webui/open-webui
I always consider open-sourcing to be a great social experiment. It may fail one day, but its effects will remain and benefit everyone.
Well, although R1-671b is way too expensive for me to self-host, given their past open source (or weight) contributions, I DO have high expectation of them.
Each and every contribution to open source community will be helpful. Thanks DeepSeek!
Would love another MoE that fits in 120GB VRAM for the 128gb Mac owners
Deepseek seems to be having huge PR wins as the "oh shucks" modest boy genius, while the Americans seem like pouty jerks.
Amodei's / Hassabis' comments in particular came off as so arrogant and annoying.
>> Amodei's / Hassabis' comments in particular came off as so arrogant and annoying.
Exactly which part of their writings comes off as arrogant to you? The only point in Amodei's article[0] that could remotely be interpreted as arrogant is this:
> All of this is to say that DeepSeek-V3 is not a unique breakthrough or something that fundamentally changes the economics of LLM’s; it’s an expected point on an ongoing cost reduction curve. What’s different this time is that the company that was first to demonstrate the expected cost reductions was Chinese.
Maybe I'm different, but it really does sound like reasonable judgement to me.
[0]: https://darioamodei.com/on-deepseek-and-export-controls#deep...
[flagged]
China is the second richest country in the world, and the one with the most computer scientists. Americans sometimes think the rest of the world is far behind but none would compare China to Rwanda in AI.
The funding company holds the assets, and the news makes the stock market boom and they make money!
God bless the DeepSeek team with more innovative ideas to share with us all!
R1 is a better o1, this is a better devdays.
DeepSeek seems like Hisoka helping Gon and Killua ... just for a more challenging battle at some point xD
More like the reverse? -- Gon and Killua (young, with tons of room to grow) helping Hisoka (very experienced, smaller runway).
Speaking of DeepSeek, anyone here used SambaNova - are they reliable?
duckduckgo also have one, so not sure if this makes a difference
Is it out of the realm of possibility to look at this move as a way to take down the moat of closed source AI companies?
I mean strategically this could be the first use of open source in this way.
No turning back...
Remember when OpenAI was doing this:
"OpenAI threatens to revoke o1 access for asking it about its chain of thought"
https://news.ycombinator.com/item?id=41534474
Not only did DeepSeek opensource their model, they also showed the user chain-of-thought right up front, which everyone else rushed to emulate when they saw how much users liked it.
DeepSeek is seeking deep to Open AI.
irony
I really hope DeepSeek is going to open source their entire training pipeline.
Tbh this just feels like the same playbook as OAI. Open start and then less so over time.
Mistral has been holding the line on that topic remarkably well.
Beatings will continue until openness improves, apparently. Kudos to Deepseek, about time someone spilled some significant beans.
odds on r1.5/r2 release?
Looking forward to it
deepseek just keeps on giving. kudos to them.
i can almost hear sam altman and dario amodei cry every time deepseek does something amazing.
launch weeks ftw
I really like this definition of "AGI": When everyone (yes everyone) benefits from very powerful AI models released for free and it is not gate-kept by one company and it costs $0 to use commercially or for research and you can do whatever you want with it.
Unlike the other counterpart which believes that "AGI" means: "raising billions of dollars to achieve $100BN of profits to their investors". (Which is complete nonsense).
While not totally "open source" by the strictest definition, it is at least better than having no model released with no mention of the architecture on the system card or paper and just vague comments about the 'performance'.
Ladies and gentlemen, this is closer to being a better "Open AI". Unlike the other alleged $157BN "non-profit" scam.
I think you know which one really is beneficial to humanity and is the real "Open AI".
You’re assuming they won’t follow in OpenAI’s footsteps. OpenAI published a lot for a while and truly changed the world, far more than deepseek has. Only time will tell.
But I think it’d be a mistake to think that this is necessarily beneficial for humanity just because the weights are open. It’s maybe great to commoditize models, but their displacement in jobs, original thought and work, facilitation of disinformation and population psychological warfare doesn’t change… if anything it’s accelerated and harder to temper the bad elements.
Of course. Except we know what happens when one tries to close them up again - someone else will release another more powerful AI model for free.
So it doesn't matter when there are multiple players competing to destroy each other in this race to zero.
> But I think it’d be a mistake to think that this is necessarily beneficial for humanity just because the weights are open. It’s maybe great to commoditize models, but their displacement in jobs, original thought and work, facilitation of disinformation and population psychological warfare doesn’t change… if anything it’s accelerated and harder to temper the bad elements.
It is unrealistic to close it up and hope that no-one catches up and releases a better AI model for free since the cat's already out of the bag and the progress of these AI models cannot be delayed, stopped or gate-kept for long.
By that time, someone will release a more powerful AI model for free.
I really admire their mindset of striving for the betterment of humanity. There was a time when OpenAI, Anthropic, and even Musk used to talk with that same lofty vision. But now, they've all shifted to competing for national interests instead, which is honestly quite disappointing.
Well, it’s a highly effective PR tactic that works well for the small fish. You say your competition is too selfish and you just want to help people and it creates a bunch of goodwill you can use to grow. Once you grow, your view on things changes, and you’re able to be more selfish. It’s not guaranteed things will go that way, but it’s certainly true that this is a good PR tactic for new entrants into a crowded field. It can also be genuine. When you’re new you don’t have much to lose and it’s easier to be truly altruistic.
I think DeepSeek is trying to push the idea that LLMs are not marketable products themselves, but are a part of the 'digital commons', as in hard to develop and maintain software which in and of itself does not produce value, but can be the foundation of a product that does. This is very similar to what Facebook is doing with Llama, or what is going on with big open source projects, like databases or the Linux kernel.
I also think that the companies that are doing that have a different idea on how to make money. Facebook's competitive edge lies in all the people using their social media, and for the Chinese, I think their edge lies in manufacturing physical products, so they try to commodify the software component.
Which is in stark contrast to the US, who have a world-beating software and silicon industry, but are merely competent in other areas, so it makes sense for them to want to avoid that.
Rather than a foundation for their products, I think they're just trying to make it impossible for new competitors to enter that market, because when all the biggest models are open-sourced, a new player can't convince investors to bring billions to the table as there's nothing to monetize – the alternative is free.
Why enter the market now when AI is already commoditized? DeepSeek is making US investors regret investing so much to get a tiny lead over them, but they're also making future, large investments much harder to justify when you can rely on existing open-sourced models
that's a good thing, no?
Foundation Models aren't defensible. It'll force VCs to allocate on other stuff (the new buzz is "the application layer")
Open-source does very little good if no one actually contributes to the code except the company that controls it and no one else has the means to participate (other than taking the code as-is).
The giant players are more than happy to keep their models open if no one even tries to compete.
"None are more hopelessly enslaved than those who falsely believe they are free." ~~ Goethe
It is also similar to what Saudi Arabia and OPEC did with fracking. When American fracking companies were full of debt, OPEC drove down the price of oil and a lot of enterprises had to default.
Why not for now just applaud them for their actions rather than focus on some potential 3rd order plan?
Who knows what any of them might do in the future? For now I'm cheering for Deepseek, Meta and anyone publishing open models as I strongly believe that the potential "danger" of AI in the hands of everyone is far outstripped by the concrete dangers of AI dictated by a select small group of corps/gov symbionts.
The answer lies in the question I responded to. The commenter lauded the positive effects of Deepseek’s actions and lamented the loss of such positivity from OpenAI. But it’s important to understand that this didn’t happen by chance. These things happen because underdogs benefit more from goodwill than secrecy and selfishness, while established players benefit from dominance and control.
If we ignore that, we will let PR teams play us every time they claim altruism while serving themselves. It doesn’t mean Deepseek can’t also have good motives, but we must be clear that undercutting OpenAI while simultaneously building community goodwill is a smart move on their part to shift the market in their favor.
It's media literacy for tech marketing.
I wish it was easier to learn about media literacy
I agree with your sentiment, but there’s no harm in being aware that the rhetoric is just PR spin for the strategy the execs think will be the most profitable.
I wouldn’t go so far as to try to presume the state of mind of the execs. Maybe they really believe what they’re saying. But it’s also true that it benefits them to say it and my argument is simply that we should be mindful of that.
At least they are not founded as non-profit with some "greater good" mission or "safety" BS.
Yeah, that would be too blatant. :)
Can't fool me twice. Not yet, wait a couple of years.
Because we have seen this play out exactly as described so many times that that kind of naivety is not justifiable.
From what I know, DeepSeek is a small company that made a lot of money from other businesses, which makes their lack of focus on commercial interests feel more genuine. Plus, even back when they were relatively unknown, they had a habit of donating over $100 million annually to charitable causes. That makes their claim of striving for humanity a lot more believable.
> DeepSeek is a small company that made a lot of money from other businesses, which makes their lack of focus on commercial interests feel more genuine.
Google also made a lot of money from other businesses that aren't AI models, until they started selling AI models, just as DeepSeek now does.
The reality is that DeepSeek is a full company, that was funded as a spin-off from the original business (a hedge fund that used its large GPU stockpile to pick stocks via ML). The company DeepSeek is owned by the hedge fund CEO not the hedge fund. It exists as a business aiming to make money, not as a pet project for another business.
But the fact that they were donating huge sums every year even when they were still unknown really says something. If they were purely profit-driven, there’s no way the shareholders would have allowed that.
> the fact that they were donating huge sums every year even when they were still unknown really says something
You don't need to be known by the general public to take advantage of tax schemes involving "donating" money
[flagged]
I have a more practical view: there's nothing wrong in making profit, the important thing is that they are also doing some good.
> As though you rendered the proletarians a service in first sucking out their very life-blood and then practicing your self-complacent, Pharisaic philanthropy upon them, placing yourselves before the world as mighty benefactors of humanity when you give back to the plundered victims the hundredth part of what belongs to them!
Friedrich Engels: The Condition of the Working-class in England
I'm not familiar with Engels' view, but my gut feeling here is that he was complaining about the way profit was made ("sucking out their very life-blood") and not about profit itself. But, even if he somehow saw profit as something bad regardless of how it is made, I would still disagree. It is definitely possible to make profit without exploiting people.
This doesn't argue whether the existence of profits necessarily implies exploitation of workers; it asserts it and then proceeds to argue against philanthropy funded by profits. This line of reasoning only makes sense if one already accepts the initial assumption, whereas the original poster questions that very assumption, so the quote is a bit irrelevant.
I read the parent’s comment as arguing that the existence of profits implies exploitation of workers in the quoted instance (perhaps broadly in England at the time) and that there is some similarity with DeepSeek. No hard-line assertions, just suggested similarities.
It feels really odd seeing Engels' quotes used like this.
The focus of Engels' criticism when he made these statements was on *capitalist production relations*, where capitalists control the means of production and obtain profits by exploiting the surplus labor time of workers. This is precisely what DeepSeek and open-source initiatives are challenging. They are turning the means of production from the private property of capitalists into public property.
I hope you did not intentionally misquote this passage.
Free Software is not the same as expropriation. It's perhaps more of a social-democratic smoke-and-mirrors kind of thing than a lifting of the dependency.
Regardless of free software, capitalists control the means of production and obtain profits by exploiting the surplus labor time of workers.
Free software may make it more obvious though, at least for some.
I have no idea about this but am curious to know if the wealthy Engels family who 'owned large cotton-textile mills in Barmen and Salford, England' showed the way for your other 'mighty benefactors'? What belongs to who is a mighty question as Obama reminded us with his lead pencil example. No easy answer though.
It’s an interesting reformulation of Catholic “original sin”.
still way better than ClosedAI
I will call it as it is from now on as well
I actually call it Anti-Humanity AI. Without releasing the tech details of such AI, we all live in the danger that if something goes really wrong, we won't even have the chance to understand the disaster and fix it.
Basically they pocketed all profits with other people footing all the risks.
> The reality is that DeepSeek is a full company, that was funded as a spin-off from the original business (a hedge fund that used its large GPU stockpile to pick stocks via ML). The company DeepSeek is owned by the hedge fund CEO not the hedge fund. It exists as a business aiming to make money, not as a pet project for another business.
Of course they want money, lots of money; tons of money is required for hiring engineers and paying for hardware. However, your claim that DeepSeek exists to make money is just a wild guess backed by nothing.
DeepSeek CEO Liang Wenfeng is himself an engineer and a co-author/developer of the DeepSeek model; he helped but is not listed as a core contributor. Obviously, spending your CEO hours that way is not a smart strategy for maximizing your $ return. His interview a few months ago actually answered all of this: he is seeking AGI. That is the motivation, that is why DeepSeek exists.
Zuckerberg, who is also a developer, and countless other CEOs are listed on many patents from their companies. Doesn’t mean they actually had a strong input in the invention.
No business exists not to make money because that is a charity. It’s not a charity, because a charity is not a business, and DeepSeek is a business. I don’t care to quibble about how interested they are in being a lucrative business, but simply that they’re a business.
My point wasn’t to question their motives about profit vs AGI (why would these be mutually exclusive btw), but to challenge the notion that it’s some side project from a random business. It’s a company with dedicated resources and staff.
Yes, it is PR. While individuals can be altruistic visionaries, shareholders will protest any action that is not in the company's interest.
For a smaller player, open-sourcing might be a strategic move. It would likely go unnoticed if a small Chinese company released a model "almost as good as" ones from the top US players. But releasing it as open source is a game-changer.
However, open source isn't just for small players. Microsoft develops Visual Studio Code and Meta develops PyTorch - to name a few examples out of hundreds. In these cases, it's also PR - they can afford it, and it doesn't compete with their core business.
There's a story about someone asking the Dalai Lama whether all altruism is actually a form of egoism, since we do good things to feel better. He responded that if that's the case, we need more of this type of egoism. (I can't find the exact source, but it aligns with his quote "Being wisely selfish means taking a broader view and recognizing that our own long-term individual interest lies in the welfare of everyone.")
So yes, I want to see more of this kind of PR.
Well said + Thanks!
True, in the end you are not sure if companies like meta / deepseek are promoting opensource because they genuinely care or it is just a differentiated marketing strategy to win over the developers.
Some companies will play on opensource, some will play on pricing, some on quality.
Almost all of the open source companies which do good eventually start an enterprise / paid division as well.
I get the urge to be cynical all the time, but this isn't that time. "Once you grow": they have already grown, are competing with the SoTA models, and are still giving it all back to the community.
I just wish this smear campaign against them stops sometime soon.
My intuition suggests that because they are not the leaders, they would not otherwise stay in the news for long. This way you stay on people's lips for a longer period, and by publishing code you hurt established giants by allowing much smaller players to compete.
They are already the absolute leader in China, which is arguably the largest market for future AI. Liang doesn't have any media exposure because he is an engineer and doesn't want that, if he wants or needs to "stay in news", he can get tons anytime anywhere.
My intuition suggests they will very shortly have state-level resources thrown at them so that they become a consistent leader. This and Qwen have been huge for China’s prestige and whatever the Chinese for Juche is. This is unambiguously the next space race, and there’s absolutely no reason China can’t pull ahead of the US here.
Why do you think that? It's not how it works in China.
The Chinese government only supports companies that are in line with industrial policies and are facing difficulties that require assistance. This is because such companies struggle to obtain financing from the society. The aim is to support the entire industry, not a specific company. If a company holds a leading position, it does not need to receive any "resources" from the government; it can acquire sufficient resources from the society.
China 10y bond yield is at <2%, this is a very low financing cost.
> The Chinese government only supports companies that are in line with industrial policies and are facing difficulties that require assistance
So, like, for example, AI companies who are very upfront about not being able to get their hands on as many chips as they'd like?
In this case, it's the domestic chip manufacturers that are getting support from the government.
The government is not good at smuggling chips without attracting attention; better to try contacting some dealers in Singapore or Malaysia.
I think you significantly underestimate what a helping hand from a government can do, from subsidising the extra expense of grey market chips, to making shipping containers disappear into the ether, to exerting pressure beyond simply commercial on people in and outside their borders.
There is no PR tactic; the only company that will stay on top will be the one that open-sources its models and makes them free to use. There are other ways to monetize. People around the globe are not going to use anything that is paid on a daily basis.
LLMs are not that different from programming languages. Imagine Guido van Rossum charging $200 so you can use Python...
Even for those that will pay, many light users will prefer a subscription over dropping $10k on rapidly depreciating hardware to run a half decent model.
Literally how OpenAI attracted talent, with DeepMind as the boogeyman. It's a playbook that works.
Power does terrible things to people, we really need to stop letting that happen.
"Power attracts pathological personalities. It is not that power corrupts but that it is magnetic to the corruptible." - Frank Herbert
I don't think there's a lot of historical precedent for the kind of power that is possible today, logistics used to be a limiting factor. Maybe you can be god king of the universe as your day job and enjoy a bit of sanity on your time off--but then again maybe not. We're in uncharted waters.
Rich nations see risk, rising giants see leverage.
Striving for the betterment of humanity, or striving for their peer technology competitor to have their intellectual property moat atom-bombed? I don't think altruism has any real role in this.
Really it just shows the beauty of market competition.
They just stopped pretending.
How will their mindset not be exploited (even, given time and power, by the exact same now-honest idealists) in the same way as the other people and companies you mention? It's a hard pill to swallow but especially after I read "The Power Broker" it's very true that some of the most inspiring idealists really do turn into amoral pragmatists.
It’s greed not national interests unless you know something I don’t about greedy people.
OpenAI is the biggest irony, it's not even bothered with national interests, it's on a pure profit maximising goal without regard to anything else.
It's just an Nvidia short, so they can get the yuuge amount of graphics cards they need for further training even cheaper (joke).
Don’t forget Google who typically make their best AI products available only to large customers. For “safety” of course.
To me it's notable that the Chinese government didn't care (or know) about this going open source.
I suspect the Chinese government fears being locked into US SaaS much more than the loss of control from open source. After all censorship can still be enforced at the level of App Stores / DNS for most consumers even with open source models.
We are making the world a better place more than our competitors
you forgot to add "/sarcasm"
And before you get carried away, let's wait and see. A Chinese company's claims of just open source are hard to buy, especially in an era of making fake promises in the beginning.
The CPC seem to be encouraging open source, gitee (Chinese github) is run by the government.
More of a reason to stay away from that. Think about it: why does the government run an open source website? Answer: so they can control what software is made and what it can do on the free web.
Isn't Musk still on the open side? Isn't that what the whole Musk - Altman conflict is about?
Maybe. We’ll see if he open sources Grok 2 or if he just wants others to open source their models and weights.
I don't think it's justified to say that. He can do Grok any way he wants; he never promised to open it up or made that his mission. It's a different story for "Open"AI.
Saying that Musk "doesn't have the mindset" for betterment of humanity is just ignorant in a very short-sighted way. Sure, he currently has a side project of fixing the US government and ensuring US doesn't stray too far outside of its core interests, but SpaceX and Tesla are still his bread and butter he has spent most of his time on beside this scenic route.
I've followed him closely since ~2016 so I can say this with some conviction. He's exactly the same guy he was back then. He even talks of the exact same things with the same excitement. Sure, after the inauguration it became "American boots on MARS!" instead of just "boots on Mars", but it's quite clear he has seen the US falling apart as an existential risk for the more lofty goals that SpaceX especially has for humanity. https://www.youtube.com/watch?v=wubITdJ_MCw
> I've followed him closely since ~2016 so I can say this with some conviction.
It's sad that you fell for it then. Read Phillip Long's post on him, not someone who follows him but someone who has worked with him for years. It should be eye-opening as to the kind of man he is.
There will be no Mars terraforming, his goal is being the world's first trillionaire. The emperor has no clothes, the companies run despite him not because of him and the cult of personality only appeals to people who somehow still fall for it.
Thanks, I think I know him pretty much as well as there is to know. People will try to shoot him down and project their own demons on him. He's an actual maverick who has provably led his technological companies to success as a technological lead.
Here's a take by people who have had actual direct contact with him. https://www.reddit.com/r/SpaceXLounge/comments/k1e0ta/eviden...
The arguments against his capability to lead cross-field technical operations should be disproven by his successes that he has proven several times in sequence. The argument of him being a fraud is basically hinging on him rolling d20 several times in a row, and only acceptable to those not knowing his personality and attributing his actions to malice (through self-projection of the viewer). Philip's arguments tell as much.
He's made plenty of enemies while at it! Wouldn't really expect anything else, being as disruptive as raw autism in fixing the species might be. They'll fade.
Look past what he says and into what is actually happening.
He is actively helping take health care from poor people. He is firing thousands of people with families, mortgages and medical bills without cause. He is closing our national parks. All so he can personally have a tax cut.
His ex-wife is frantically posting for him to help with the healthcare of their own son in his replies. He can't even manage his family I don't think he has the betterment of humanity on his mind.
I believe it's quite easy to look at any human's actions and cherry pick a narrative of malfeasance or malice if that's what you're looking for.
Musk does a lot of things at a very high level publicly so I think it's an even easier task. I'm sure you'll disagree but I believe it's this false narrative and who's creating it that you should be doubting.
Many people don't have a problem with a lot of what Musk has done. He's not perfect and does make mistakes which he openly admits like any sane rational person should. I do believe his good intent is there and he generally tries to right wrongs.
I'm watching closely what he does and sometimes I have my doubts. If I ever see him actually cross a line I'll change my mind. For now, most of the narrative has been pretty typical fake news and timeless partisan disagreement on methods of governance.
So ... Because people who could bear families, but could not earn a living are being "left on dire straits", and Elon is against upkeeping such an unearned situation, Elon's the bad guy?
The vision that sees this as bad is obviously tainted by corruption, and so is not worthy of care, especially as the people leaving their jobs will have a damn good golden parachute.
At least try to argue on the same level.
There was a time when I was this naive, but it's surely a very long time ago
Congratulations you're just naive to things that are easier to believe for a weak mind now.
I think you've been drinking the koolaid too much. He's only in it to enrich himself and his cronies. There's a reason he's on course to become a trillionaire and it ain't because of altruism.
Yeah starting a rocket company is the best way to become rich. As so many before him managed doing that xd
get a grip. Research how financially mad / "irresponsible" that was.
He is not fixing anything, he is just a human, the kind with flaws, that thinks he isn't.
The argument here is that Elon thinks he is perfect while he isn't, and that makes everything good he does bad. This can be so easily debunked that it's not really worth a thought.
"Fixing" the US government
If removing bloat isn't fixing, idk what is by your standard.
Do you really think the people he's getting rid of are material to the mission of the agencies themselves, given even those missions are as relevant as they were when they were founded?
X sure is doing well with 20% of the crew regardless of the doomsayers screaming how it would crash at the time xD
Long live LLMs. I hope they infest every part of the internet with low level comments. Both the clear, deep, and dark.
Imagine no more human interactions just a permanent flood of meaningless thoughtless word salad.
I think the Chinese are perfect to introduce such a product very inline with what they usually produce.
Get ready for web3.o
I don't really care. True intelligent discussions happen in some closed groups everywhere. It's been this way since forever. Only open discussions always attract unwanted users.
This may be my cynical take, but this cannot be out of good will or noble intentions. There has to be an ulterior motive.
Pop the US AI bubble?
It's cynical, probably because you have only been consuming cynical news.
Wasn't it already caught sending data to China in a sneaky way? Why use it for anything?
Reference supporting precisely that: https://www.wired.com/story/deepseek-ai-china-privacy-data/