Official DeepSeek R1 Now on Ollama (ollama.com)
Submitted by excerionsforte 11 hours ago
  • throwaway323929 2 hours ago

    > DeepSeek V3 seems to acknowledge political sensitivities. Asked “What is Tiananmen Square famous for?” it responds: “Sorry, that’s beyond my current scope.”

    From the article https://www.science.org/content/article/chinese-firm-s-faste...

    I understand and relate to having to make changes to manage political realities; at the same time, I'm not sure how comfortable I am using an LLM that lies to me about something like this. Is there a plan to open source the list of changes that have been introduced into this model for political reasons?

    It's one thing to make a model politically correct, it's quite another thing to bury a massacre. This is an extremely dangerous road to go down, and it's not going to end there.

    • nextworddev 2 hours ago

      Also, by definition, extensive post-training censorship probably increases its tendency to hallucinate in general.

      • throwaway323929 2 hours ago

        It's also an exploit. If it's being used to check the sentiment of text, just put "Tiananmen Square Massacre" in the text and you'll crash it.

        This is a brilliant achievement, but it's hard to see how any country that doesn't guarantee freedom of speech/information will ever be able to dominate in this space. I'm not going to accept censorship for a few extra points of performance on HumanEval.

        And before the equivocation arguments come in, note that chatgpt gives truthful, correct information about uncomfortable US topics like slavery, the Kent State shootings, Watergate, Iran-Contra, the Iraq war, whether the 2020 election was rigged by Democrats, etc.

        • kamikazeturtles an hour ago

          Until very recently, ChatGPT was responding to "Does Israel have a right to exist?" with "Of course ..." and to "Does Palestine have a right to exist?" with "It's complicated ..."

          So I don't think our version is completely free of bias. I'm sure there are many other examples, I just wouldn't be able to point them out, considering the training data fed into ChatGPT was also fed into our human brains.

          • dudisubekti an hour ago

            Most people in the world don't really care about politics. They're too busy working to pay off all sorts of debts.

            If it's useful and cheap to them, it is useful and cheap to them. Deepseek just happens to not be useful to you.

            • nextworddev 44 minutes ago

              You missed my point: post-training censorship increases the likelihood of hallucination in general.

              • dudisubekti 28 minutes ago

                It's just on Chinese politics, and on a very select set of topics at that.

                Yeah I think Deepseek will be just fine.

        • dylanjcastillo 29 minutes ago

          That’s very likely coming from the API, not the model

          • suraci an hour ago

            I wonder

            if you input massive misinformation to train an LLM, and many people use it as a source of truth, what will happen?

            like, if you use recent MSM news tained a LLM, and ask it if Iseal made a genocide in GAZA, Who killed children in GAZA, will it tell you truth?

            wait, I just misspelled 'CIA' to 'LLM'

            • andrewinardeer 5 minutes ago

              You also misspelled a few other words too.

          • huydotnet 2 hours ago

            Looking at the R1 paper, if the benchmarks are correct, even the 1.5B and 7B models are outperforming Claude 3.5 Sonnet, and you can run these models on an 8-16GB MacBook. That's insane...
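
            For anyone who wants to try, the small Ollama tags map to those distilled checkpoints, e.g.:

            ollama run deepseek-r1:1.5b
            ollama run deepseek-r1:7b

            (Both pull the default 4-bit quants, so on the order of 1GB and 5GB of download respectively; those sizes are from memory, check the tags page for exact numbers.)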

            • csomar 2 hours ago

              I think because they are trained on Claude/O1, they tend to have comparable performance. The small models quickly fail on complex reasoning; the larger the model, the better the reasoning. I wonder, however, if you can hit a sweet spot with 100GB of RAM. That's enough for most professionals to be able to run it on an M4 laptop, and it would be a death sentence for OpenAI and Anthropic.

              • kamikazeturtles 20 minutes ago

                > I think because they are trained on Claude/O1, they tend to have comparable performance.

                Why does having comparable performance indicate having been trained on a preexisting model's output?

                I read a similar claim in relation to another model in the past, so I'm just curious how this works technically.

                • elashri an hour ago

                  At the price of $5,000 before taxes. There would be better and more cost-effective options for running models that require that much memory.

                  • kergonath 24 minutes ago

                    I see this comment all the time. But realistically, if you want more than 1 token/s you're going to need GeForces, and that would cost quite a lot as well for 100 GB.

                    • nenaoki 8 minutes ago

                      https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwe...

                      $3,000 for 1 PFLOP (@4-bit) and 128GB unified memory. Storage configurable up to 4TB.

                      Can be paired to run 405B (4-bit), probably not very fast though (memory bandwidth is slower than a typical GPU's, and is the main bottleneck for LLM inference).
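
                      Back-of-envelope, with my own numbers rather than anything from the announcement: dense decode streams every weight once per token, so tokens/s is roughly memory bandwidth divided by weight size. ~200GB of 4-bit 405B weights at a few hundred GB/s works out to low single-digit tokens/s per box, before any KV cache overhead.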

                    • csomar an hour ago

                      It is a laptop. The memory is also shared, which means you can use it for non-gaming workloads. If you have laptop equivalents in the same memory range, feel free to share.

                      • rfoo 4 minutes ago

                        I have a laptop equivalent in the same memory range, and it is at least $2,500 cheaper.

                        Unfortunately, it does not have "unified memory" or a somewhat "powerful GPU", and of course there is no local LLM hype behind it.

                        Instead, I've decided to purchase a laptop with 128GB of RAM for $2,500 and then spend another $2,160 on 10 years of a Claude subscription, so I can actually use my 128GB of RAM at the same time as using an LLM.

                • ipsum2 3 hours ago

                  Title is wrong: only the Llama and Qwen distilled models are on Ollama, not the actual official MoE R1 model built on DeepSeek-V3.

                  • mchiang 3 hours ago

                    Sorry about that. We are currently uploading the 671B MoE R1 model as well. We needed some extra time to validate it on Ollama.

                    • ipsum2 2 hours ago

                      The naming of the models is quite confusing too...

                      • mchiang an hour ago

                        Did you mean the tags or the specific names from the distilled models?

                    • mchiang an hour ago

                      The 671B model is now available:

                      4-bit quantized: ollama run deepseek-r1:671b

                      (400GB+ of VRAM/unified memory required to run this)

                      https://ollama.com/library/deepseek-r1/tags

                      8-bit quantization is still being uploaded.

                    • csomar 3 hours ago

                      It is amazing that I can run this on a 3-year-old laptop.

                      Here is the prompt/response:

                      ollama run deepseek-r1:32b

                      >>> write a rust function that add two numbers

                      <think> Okay, I need to write a Rust function that adds two numbers. Let me think about how to approach this.

                      First, I know that in Rust, functions are defined using the `fn` keyword. So I'll start with `fn add_numbers(`.

                      Next, I need to decide what types of numbers this function will handle. The user didn't specify, so I'll make it generic. That way, it can work with any numeric type like integers or floats. To do that, I'll use generics and traits. I remember that the `Add` trait is used for addition, and I need to import it from the standard library.

                      Wait, how do I import traits in Rust? Oh right, using `use std::ops::Add;` at the top of the file or inside the function scope if necessary.

                      So the function parameters will be two values, let's call them `a` and `b`. Both should implement the `Add` trait. The syntax for that is `<T: Add<Output = T>>`, which means both are of type T and can add to produce another T.

                      Putting it together, the function signature becomes:

                      ```rust
                      fn add_numbers<T: Add<Output = T>>(a: T, b: T) -> T {
                      ```

                      Inside the function body, I just return `a + b;`. That should handle the addition.

                      Wait, but in Rust, sometimes you need to enable certain features. For example, does this require any compiler flags? No, because generics and traits like Add are part of the standard library and don't need additional crates unless we're doing something more complex.

                      Testing it out, if I call `add_numbers(2, 3)`, it should return 5. Similarly, for floats like `2.5 + 3.5`, it would give 6.0. What about different types? Oh right, since the function requires both to be of the same type T, trying to add an integer and a float directly wouldn't work without converting them first.

                      That makes sense because in Rust, you can't mix different numeric types without explicit conversion due to type safety. So if someone wants to add 2 and 3.5, they'd have to cast one of them to the other's type before calling this function.

                      I think that covers the basic requirements. The function is simple but flexible thanks to generics. </think>

                      Here's a Rust function that adds two numbers:

                      ```rust
                      use std::ops::Add;

                      fn add_numbers<T: Add<Output = T>>(a: T, b: T) -> T {
                          a + b
                      }

                      // Example usage:
                      // let result = add_numbers(2, 3);
                      // println!("{}", result); // Output: 5
                      ```

                      This function uses generics to work with any type that implements the `Add` trait. You can pass integers or floating-point numbers.

                      The example shows how to use this function with two integers, but you can also use it with floats:

                      ```rust
                      let float_result = add_numbers(2.5, 3.5);
                      println!("{}", float_result); // Output: 6.0
                      ```

                      Note that both parameters must be of the same type. If you need to add different numeric types (like an integer and a float), you'll first need to convert them to the same type.

                      ====

                      In comparison, here is O1's response:

                      Here's a simple Rust function that adds two numbers:

                      ```rust
                      fn add(a: i32, b: i32) -> i32 {
                          a + b
                      }
                      ```

                      This function takes two 32-bit signed integers (`i32`) as parameters and returns their sum.

                      *Example usage:*

                      ```rust
                      fn main() {
                          let num1 = 10;
                          let num2 = 15;
                          let result = add(num1, num2);
                          println!("The sum of {} and {} is {}", num1, num2, result);
                      }

                      fn add(a: i32, b: i32) -> i32 {
                          a + b
                      }
                      ```

                      *Output:*

                      ```
                      The sum of 10 and 15 is 25
                      ```

                      If you want the function to work with other numeric types, you can use generics and trait bounds. For example, to make the function work with any type that implements the `Add` trait:

                      ```rust
                      use std::ops::Add;

                      fn add<T: Add<Output = T>>(a: T, b: T) -> T {
                          a + b
                      }
                      ```

                      *Example with floating-point numbers:*

                      ```rust
                      fn main() {
                          let num1 = 10.5;
                          let num2 = 15.3;
                          let result = add(num1, num2);
                          println!("The sum of {} and {} is {}", num1, num2, result);
                      }

                      use std::ops::Add;

                      fn add<T: Add<Output = T>>(a: T, b: T) -> T {
                          a + b
                      }
                      ```

                      *Output:*

                      ```
                      The sum of 10.5 and 15.3 is 25.8
                      ```

                      This generic `add` function can now handle any type (like `i32`, `f64`, etc.) as long as it implements the `Add` trait.

                      *Note:* Remember to include the necessary `use` statement (`use std::ops::Add;`) when using trait bounds in your function.

                      Feel free to ask if you have any questions or need further assistance!

                      • chrismorgan 2 hours ago

                        Interesting that both went for fn(T, T) -> T, with a bound of Add<Output = T>, rather than using a bound of Add and returning T::Output, which is of very similar complexity, but a bit more general.
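
                        For reference, the T::Output version would look something like this (a sketch; it assumes `use std::ops::Add;` is in scope):

                          fn add<T: Add>(a: T, b: T) -> T::Output {
                              a + b
                          }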

                        You can also make it more flexible still, supporting different argument types, but this is decidedly more verbose:

                          fn add<Lhs, Rhs>(lhs: Lhs, rhs: Rhs) -> Lhs::Output where Lhs: Add<Rhs> {
                              lhs + rhs
                          }
                      • sergiotapia 3 hours ago

                        I have an RTX 4090 and 192GB of RAM - what size model of Deepseek R1 can I run locally with this hardware? Thank you!

                        • qingcharles an hour ago

                          AFAIK you want a model that will sit within the 24GB VRAM on the GPU and leave a couple of gigs for context. Once you start hitting system RAM on a PC you're smoked. It'll run, but you'll hate your life.
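
                          Rough sanity check, using my own numbers rather than anything from this thread: the usual 4-bit quants are about half a byte per parameter, so the 14B distill is on the order of 8-9GB of weights and the 32B around 18-20GB. The 32B just barely fits in 24GB with a modest context; the 70B doesn't.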

                          Have you ever run a local LLM at all? If not, it is still a little annoying to get running well. I would start here:

                          https://www.reddit.com/r/LocalLLaMA/

                          • NitpickLawyer 2 hours ago

                            You can't run the big R1 in any useful quant, but you can use the distilled models with your setup. They've released (MIT-licensed) versions of Qwen (1.5B, 7B, 14B and 32B) and Llama 3 (8B and 70B) distilled on 800k samples from R1. They are pretty impressive, so you can try them out.
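
                            On a 4090 the 32B Qwen distill is probably the sweet spot; with Ollama's default quants that should be something like:

                            ollama run deepseek-r1:32b

                            (or deepseek-r1:14b if you want more headroom for context)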

                          • cratermoon 29 minutes ago

                            Dupe

                            • swyx an hour ago

                              i feel like announcements like this should be folded into the main story. the work was done by the model labs; ollama onboards the open-weights models soon after (and applause is due for how prompt they are). but we don't need two R1 stories on the front page really

                              • qqqult 33 minutes ago

                                these are smaller quantized models that I can use on my 8-year-old GPU; I can't even load the original DeepSeek unquantized models