Page is plagued with "Contact Sales". So that's not for individuals who are seeking a replacement for Google Translate / Live Transcribe.
When I saw the title I was kinda expecting a voice-to-voice translator with voice synth (interpreter), which is something I'm painstakingly building as a side project. I can just drop my project and use a more mature one.
hey! I am building a translation app for a side project. Is there any chance you could share how you're building the voice-to-voice?
My strategy is to keep everything separate (STT, translation, and TTS) instead of building one model that constantly needs to be retrained; rough sketch below.
But the problem I am running into is that there aren't any great STT or TTS models. Either they only support the top 10 languages, they are huge (whisper-v3-turbo), or they have a non-commercial license (FB's TTS/MMS models).
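To show what I mean by keeping the stages separate, here's a minimal sketch; the three stage functions are placeholders for whatever models or APIs you plug in (e.g. a Whisper-family model for STT, DeepL or an LLM for translation, some TTS engine), not specific library calls:

    # Sketch of a voice-to-voice pipeline with independently swappable stages.
    # Each stage is a placeholder so the wiring runs end to end; swap in real
    # models/APIs behind the same signatures.

    def speech_to_text(audio: bytes, lang: str) -> str:
        # placeholder for an STT model
        return "hello, how are you?"

    def translate(text: str, src: str, dst: str) -> str:
        # placeholder for a translation model or API
        return "hallo, wie geht es dir?"

    def text_to_speech(text: str, lang: str) -> bytes:
        # placeholder for a TTS engine
        return b"<synthesized audio>"

    def interpret(audio: bytes, src: str, dst: str) -> bytes:
        text = speech_to_text(audio, src)
        translated = translate(text, src, dst)
        return text_to_speech(translated, dst)

The upside of the split is that each stage can be swapped or limited to the languages you actually need without retraining anything end to end.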
Are you training your own models? Just targeting the languages you need? Planning on running in the cloud?
My email is in my profile bio if you want to email/chat!
How well are you able to handle live speaker diarization? I've been tinkering with building similar solutions, but unless you have previously labeled speakers things tend to go haywire once you have multiple speakers + crosstalk.
Have a short introduction session before the live translation where each speaker says a couple of words, like "hi, I am John". These can then be used to pick up on the current speaker.
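A rough sketch of that enrollment idea, assuming you have some speaker-embedding model; the embed() below is only a stand-in so the snippet runs, not a real embedding:

    import numpy as np

    def embed(wav: np.ndarray) -> np.ndarray:
        # Stand-in for a real speaker-embedding model (x-vector/d-vector style).
        # Averaging fixed-size frames is NOT a usable speaker embedding; it only
        # keeps the sketch self-contained and runnable.
        frames = wav[: len(wav) // 256 * 256].reshape(-1, 256)
        return frames.mean(axis=0)

    def enroll(intro_clips: dict) -> dict:
        # "hi, I am John" -> one reference embedding per speaker
        return {name: embed(wav) for name, wav in intro_clips.items()}

    def identify(chunk: np.ndarray, enrolled: dict) -> str:
        # Attribute an incoming audio chunk to the closest enrolled speaker.
        e = embed(chunk)
        def cos(a, b):
            return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
        return max(enrolled, key=lambda name: cos(e, enrolled[name]))

Crosstalk is still the hard part: overlapping speech gives you a chunk that doesn't match any single enrolled speaker well.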
Exactly. I just want to talk to my relatives in real time. How do I sign up? No idea.
Samsung Galaxy phones now have real-time (well, minus a short delay) live call translation for popular languages. I don't have a need for it, but a two-minute test confirmed it works. https://www.samsung.com/us/support/answer/ANS10000935/
I have options to use either Samsung local/offline translation or Google.
Language packs (sorry for the spelling, I don't know how to spell them all correctly in English): English/Hindi/Russian/Polish/taju/Vietnamese/German/Arabic/French/Indonesian/Dutch/Italian/Japanese/Korean/Portuguese/Spanish/Turkish/Swedish/Chinese
Tried it on S22.
Important! Make it connectable to Asterisk.
DeepL is such a poster child for German software startups.
They truly had something unique when they started, but instead of polishing the product with enough VC runway to either find an exit or grow into a real consumer-driven utility like Grammarly, they tried their hand at a B2B play.
From then on, their software integration sucked (browser plugins, etc.), their file translation sucked, and their SaaS shenanigans reached Adobe levels.
Why can’t Germans do software?
I hate to have to agree. Also, I signed up for a Pro account but then found out that using the API is an entirely different service that seemed almost completely disconnected from the main one with separate billing. It was so confusing and felt like it was built by a company the size of SAP or something.
For what it's worth, ChatGPT's API and chat interfaces are also completely different products with separate billing, and I haven't seen anybody bat an eye.
This is the same with Anthropic APIs vs Claude.
Financing. Germany doesn't have financiers with a "we'll figure out how to monetize it later" mindset. B2B is seen as safer that way than B2C.
> Why can’t Germans do software?
Sounds like they can do software just fine. It's the business side that needs work.
Would it have mattered in this case? The advent of LLMs would have endangered their position just the same, no?
There's no VC money in Europe; you have to be profitable early on.
There’s no VC money in Europe because companies like this execute poorly.
There's plenty of VC money in Europe, and it's invested into the better market, which is the United States.
There are two videos on the page and neither of them shows a demo of the feature. Just show me how it works!
The name is a bit misleading, I think, as it makes it sound like this is voice generation, whereas it's actually just transcriptions with translations. The videos show two versions:
- real time transcription and translation in virtual meetings in a personal chat window
- real time transcription and translations for in-person conversations with side-by-side or face-to-face (orientation flipped across the middle of the screen) display of both languages on a tablet
Given what I've seen in transcriptions from MS Teams, I'm not sure I'd trust this.
Yes, where is the trial?
They do both show demos.
No they don't. I could have produced those videos without a single line of code.
What's DeepL's plan to survive when competing with ML translation from big tech?
In the pre-ChatGPT era they had a USP which made them stand out, but today it's a different landscape.
Their translation is still substantially better than what most LLMs can offer, certainly for languages that LLMs have very little training data for.
IMO, the quality of the translations has dropped significantly. When they first launched it was noticeably better than almost anything else. Now it's comparable to the alternatives. And I don't think the alternatives got that much better; it's DeepL that is not as good anymore.
I still use it for most of the translations I need (mostly English <> German). And sometimes I have to check the translation because it can mess up the actual meaning of the sentence. Sometimes I can catch it with my limited German, sometimes I run it through another translator. And it "hallucinates" often enough, unfortunately.
I've used ChatGPT to translate into one of the local dialects here and was very impressed. DeepL is OK if you know the language you're translating to, and know which alternative words work better in the translation.
I will disagree here. I found ChatGPT better at English <-> German translation than DeepL. Especially at translating slang and online speak where DeepL would shit the bed.
I found that, specifically when it comes to proper flow and grammatical rules, ChatGPT and other LLMs tend to break down, certainly when translating from English to another language. Translating from another language into English often goes a lot better, but that really depends on how much training data the LLM has seen in that language.
German would be one of the languages where LLMs likely perform fairly well, as it is one of the languages that LLM training data often contains quite a lot of.
With DeepL, if I am translating a language I am not too familiar with and can't validate the output as well, I will have more trust in its translation, because I know that the languages they support were actually implemented specifically for translation.
I can't say for en <-> de; I can read both without problems. But for Japanese, DeepL is much better: the tone from GPT-4 is often a little bit "rude"/direct or "weird" when going from English to Japanese, while DeepL reads like an example sentence you would put in a textbook. For Japanese to English, GPT-4 or almost any LLM is much superior, because you can ask follow-up questions about certain slang or anything that's not very clear.
To be fair, top-of-the-line proprietary models suck at i18n. Try gemma-27b; it's much better.
Doesn't DeepL use an LLM? At least based on the company name, that's what I would assume.
It is (or at least was) pre-LLM deep-learning NLP. So was Google Translate, and those two, as well as LLMs, all have their own fortes and failure modes.
As an individual, I still go to DeepL while actively using LLMs. I think the main reason is that the translations are very good, with alternatives in one click, without having to type "translate xxxx into yyyy".
My previous company ran tests on translations; for their specific use case, the DeepL API was overall better than OpenAI or Claude.
That’s interesting. In my tests of translation of formal speeches from Japanese to English, the latest versions of ChatGPT, Claude, and Gemini were all better than DeepL. While DeepL’s output wasn’t bad, the fact that the LLMs could be prompted in detail about the purpose of the translation and had sufficient context windows to maintain pronoun reference and other forms of cohesion made a significant difference.
You could add context to DeepL just as easily, with the "context" parameter of the translation API.
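For reference, a minimal sketch of what that looks like against DeepL's v2 REST API (parameter names per their docs; the key, sentence, and context string here are made up):

    import requests

    resp = requests.post(
        "https://api-free.deepl.com/v2/translate",
        headers={"Authorization": "DeepL-Auth-Key YOUR_KEY"},  # placeholder key
        json={
            "text": ["He walked along the bank."],
            "target_lang": "DE",
            # "context" is extra text that is not translated itself but steers
            # the translation, e.g. disambiguating "bank" here.
            "context": "A hiker is describing a walk along the river.",
        },
    )
    print(resp.json()["translations"][0]["text"])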
Thanks for that clarification. I don't think that feature was available when I did my comparison tests a few months ago.
Exactly this. LLMs are more context aware than DeepL.
Are you paying for DeepL?
Not personally, my company was. Why?
Because my question was how DeepL is planning to survive monetarily. It doesn't help if people are praising it but nobody's paying for it. Google, Microsoft, et al. can subsidize their AI offerings longer than DeepL can stay solvent.
I created a website that uses AI to translate books, and in my testing, DeepL was way better at translating than the other AIs I tried.
It is still better than Google Translate, as it always has been, almost as if Google did not care about their translator. Especially "post-ChatGPT", it is not supposed to be this inaccurate.
I think they will just go the AI/LLM/ML route, too.
I truly don't know what the Google Translate team is doing day-to-day
If OpenAI wrappers can flourish, I'm sure DeepL can too. My guess is that they will have a sales team that reaches out to customers and provides them with solutions for their specific needs and use cases.
You know how ChatGPT was supposed to be able to act like a translator in voice mode? They even had an amazing demo at the keynote. I tried to demonstrate it to my father: he was supposed to speak in Bulgarian and I in Turkish, to show how cool the feature is. It didn't work remotely as advertised; it instantly screwed up, and we both knew it because we both speak these languages, proving that it's not to be trusted. The tech behind it is amazing, but the UX still needs to be crafted to be valuable for more than tech demos. So if DeepL has the tech and knows the market, they will be in a much more advantageous position than OpenAI (has the tech, doesn't know the market) and others who might know the market but use OpenAI (their margins will be thinner if OpenAI can't reduce costs dramatically more than DeepL).
Yes, ChatGPT's Advanced Voice Mode is great if you don't understand the target language at all and can't catch how far from ground truth it is.
I agree, LLM translations are not only more convenient but also much more capable. I often find myself giving instructions on how to translate text, such as asking the LLM to use formal language in the target language or to apply specific gender-neutral wording. Additionally, it can translate text while preserving the structure (e.g. values in a JSON object) or even adapt to a new target structure. It's just so much more convenient.
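For the structure-preserving case, a small sketch with the OpenAI Python client (the model name and prompt wording are just examples, and you'd still want to validate the returned JSON):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    source = '{"title": "Settings", "save": "Save changes", "cancel": "Cancel"}'

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[
            {"role": "system",
             "content": "Translate the JSON string values into formal German. "
                        "Keep the keys and the JSON structure exactly as they are. "
                        "Return only valid JSON."},
            {"role": "user", "content": source},
        ],
    )
    print(resp.choices[0].message.content)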
I’ve been wondering the same thing. As near as I can tell from the explanations on the website and in the videos, this DeepL Voice does not seem to be based on a multimodal large language model. Rather, it’s probably using text-to-speech and speech-to-text models linked through a translation engine. If that’s true, it means it can’t “hear” the tone of voice of the speakers. More important, it doesn’t seem to be promptable—that is, it cannot be told what the situation is, who the speakers are, or the type of translation desired. Those are all possible with OpenAI’s Advanced Voice mode, and they can make the difference between successful and unsuccessful interpretation.
That said, DeepL is focused on providing translation services, which the large LLM companies are not. Even if DeepL’s translation engines are not as powerful as the strongest commercial LLMs, they might be able to compete in other ways, such as security guarantees, on-device operation, and training and support.
They have a next-gen model that is an LLM. But it's only available for a few languages and only for paying users: https://support.deepl.com/hc/en-us/articles/14241705319580-N...
Shouldn't it use much less compute than an LLM? Or is that not a significant percentage of the total cost?
As a paying customer for the last five years, I can't wait for an alternative that also allows inline adaptation of the translations, like DeepL does. The quality of the translations is worse now than in the beginning, and the customer service is abysmal, should you have a problem.
What's your use case?
I'm a fluent speaker of the language I am translating to, but have trouble with grammar. I write texts (emails, Jira tickets, etc.) in English or my native language and let DeepL translate. I then adapt the translation to sound less machine-y.
It's not a huge deal but the name "DeepL Voice" makes it sound more like an AI voice translation and dubbing tool.
I absolutely thought that it was dubbing after watching the video and not paying close attention. This seems like 2010s tech, and my expectations in 2024 are so much higher.
Actually, the word “voice” makes one think of mouth and throat associations, so “DeepL Voice” makes a lot of Americans think dirty thoughts for a second as they try to make sense of what the L stands for.
I am sure the branding people have considered this?
That's a bit of a stretch, isn't it?
Go up to 10 Gen Z or Millennial people, show them "DeepL Voice" as the name of a new product, and ask them what their first associations are.