Midjourney is alemwjsl (aadillpickle.com)
Submitted by aadillpickle 6 days ago
  • LeoPanthera 4 hours ago

    This is very similar to the Google Trends results for "frqnce":

    https://trends.google.com/trends/explore?date=all&q=frqnce&h...

    You'll notice it peaks every northern hemisphere summer. On French keyboards, Q and A are reversed compared to US keyboards, and every summer, millions of French people go on vacation, and start Google searching for things back home on unfamiliar keyboards.

    It declines with the rise of the smartphone, as they're bringing their keyboards with them.

    Why it suddenly spikes in the last few years, I don't know.

    • andrewmcwatters 2 hours ago

      I do this qll the time ! C’est parce que AZERTY, oui.

    • anyfoo 5 hours ago

      This is actually pretty common. It's less obvious with Chinese or Japanese, as the input method there usually matches the transliteration based on how the word is spoken (romaji in Japanese, pinyin in Chinese), which of course does not look unusual.

      For example, you wouldn't think twice about it if for the Japanese word for washing machine, you not only saw "洗濯機" (which is how it's written in Kanji), but also "sentakuki" or "sentakki" in the search results, because even to non-Japanese speakers it's pretty clear that that's probably the Japanese word for washing machine written with latin character transliteration, and pretty much exactly what you'd say.

      With Korean, it looks more jarring, as the input method is apparently very different, and seems to map the keys for unrelated latin letters to Hangul letters? (I have no idea, I don't know anything about Hangul other than it's based on syllables, kind of like Hiragana/Katakana, and apparently very logical.)

      • duskwuff 5 hours ago

        > With Korean, it looks more jarring, as the input method is apparently very different, and seems to map the keys for unrelated latin letters to Hangul letters?

        More or less, yes. Each Hangul character represents a syllable, and is composed of two or more components (jamo) representing individual phonemes (like vowels or consonants) which make up the syllable. The keys on a Korean keyboard are mapped to those jamo.

        Further details: https://en.wikipedia.org/wiki/Korean_language_and_computers

        • lifthrasiir 4 hours ago

          More specifically, since Korean syllables are of the form CV(C) where C is a consonant and V is a vowel, almost all Hangul keyboard layouts divide the entire keyboard into two or three sections (consonant-vowel or initial-medial-final). The standard KS X 5002 layout is the former, a "bipartite" method (두벌식), while I'm using one of the latter, "tripartite" methods (세벌식).
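The jamo-per-key idea described above can be sketched in a few lines. This is a minimal illustration, assuming the common two-set (두벌식) key assignments and the standard Unicode arithmetic for precomposed syllables starting at U+AC00. The function name `qwerty_to_hangul` is made up for this example, and the sketch deliberately ignores compound vowels (such as ㅘ) and consonant clusters in final position, so it is not a full input method.

```python
# Two-set (dubeolsik) key assignments, written as QWERTY key -> jamo.
CONSONANT_KEYS = {
    "r": "ㄱ", "R": "ㄲ", "s": "ㄴ", "e": "ㄷ", "E": "ㄸ", "f": "ㄹ", "a": "ㅁ",
    "q": "ㅂ", "Q": "ㅃ", "t": "ㅅ", "T": "ㅆ", "d": "ㅇ", "w": "ㅈ", "W": "ㅉ",
    "c": "ㅊ", "z": "ㅋ", "x": "ㅌ", "v": "ㅍ", "g": "ㅎ",
}
VOWEL_KEYS = {
    "k": "ㅏ", "o": "ㅐ", "i": "ㅑ", "O": "ㅒ", "j": "ㅓ", "p": "ㅔ", "u": "ㅕ",
    "P": "ㅖ", "h": "ㅗ", "y": "ㅛ", "n": "ㅜ", "b": "ㅠ", "m": "ㅡ", "l": "ㅣ",
}

# Orderings used by the Unicode precomposed Hangul syllable block.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
FINALS = " ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ"  # index 0 = no final

VOWELS = set(VOWEL_KEYS.values())
FINAL_SET = set(FINALS) - {" "}


def qwerty_to_hangul(keys: str) -> str:
    """Reinterpret QWERTY keystrokes as if the Korean layout had been active."""
    jamo = [CONSONANT_KEYS.get(k) or VOWEL_KEYS.get(k) or k for k in keys]
    out = []
    i = 0
    while i < len(jamo):
        cur = jamo[i]
        nxt = jamo[i + 1] if i + 1 < len(jamo) else ""
        if cur in INITIALS and nxt in VOWELS:
            initial = INITIALS.index(cur)
            medial = MEDIALS.index(nxt)
            i += 2
            final = 0
            # A consonant closes the syllable only when no vowel follows it;
            # otherwise it is the initial consonant of the next syllable.
            has_candidate = i < len(jamo) and jamo[i] in FINAL_SET
            vowel_follows = i + 1 < len(jamo) and jamo[i + 1] in VOWELS
            if has_candidate and not vowel_follows:
                final = FINALS.index(jamo[i])
                i += 1
            out.append(chr(0xAC00 + (initial * 21 + medial) * 28 + final))
        else:
            # Stray jamo or unmapped characters pass through unchanged.
            out.append(cur)
            i += 1
    return "".join(out)


print(qwerty_to_hangul("alemwjsl"))  # 미드저니
```

The lookahead for "does a vowel follow?" is what lets a CV(C) syllable be split correctly without any explicit separator key.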

        • sshine 3 hours ago

          With Chinese you have 简拼 (jiǎnpīn) for your pinyin input, which lets you type only the initial Latin letters of a common phrase to complete the phrase.

          For example, instead of typing “buzhidao” to get 不知道, you just type “bzd” and pick the top suggestion. Since all the phonetic endings are gone, it does look a little cryptic, but it means if you don’t have a pinyin keyboard, you can still type something fast that is highly correlated with your actual phrase.

          For example when you’re searching a movie title on your SmartTV; teenage mutant ninja turtles (similarly abbreviated tmnt) becomes rzsg; some Chinese search tools will pick up on this; whether through statistics, fuzzy matching or specific 简拼 (jiǎnpīn) support, I don’t know.
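As a rough sketch of the abbreviation idea above: take the first Latin letter of each pinyin syllable and look phrases up by that key. The phrase table and the function names here are hypothetical, and real input methods may treat zh/ch/sh initials differently or rank candidates by frequency, which this sketch does not attempt.

```python
def jianpin(pinyin_syllables):
    """Abbreviate a phrase to the first letter of each pinyin syllable."""
    return "".join(syllable[0] for syllable in pinyin_syllables if syllable)


# Hypothetical phrase table: phrase -> toneless pinyin syllables.
PHRASES = {
    "不知道": ["bu", "zhi", "dao"],
    "忍者神龟": ["ren", "zhe", "shen", "gui"],
}


def candidates(abbreviation):
    """Return every known phrase whose abbreviation matches exactly."""
    return [
        phrase
        for phrase, syllables in PHRASES.items()
        if jianpin(syllables) == abbreviation
    ]


print(jianpin(["bu", "zhi", "dao"]))  # bzd
print(candidates("rzsg"))             # ['忍者神龟']
```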

          • bmandale 44 minutes ago

            Kana input exists in Japanese and reuses each letter of the keyboard to mean a different kana. So you could have a similar confusion in Japanese. I believe many older people use it.

            • karmasimida 3 hours ago

              Hangul is its own alphabet and has its own keyboard layout, so the letters typed have no correlation with any typical romanization scheme.

              It is probably more like the bopomofo keyboard for Chinese.

            • yongjik 5 hours ago

              > Turns out that somehow Midjourney is so commonly searched for, that Google has started serving them in search results for a meaningless English phrase that just means a Korean forgot to switch off of their English keyboard when searching for.

              BTW, this happens all the time in Korea, because it's extremely common for someone to type something while forgetting to switch to the correct input method. Try these, for example:

                  추ㅜ
                  gozjsbtm
                  elwmsl
                  vkdlTjs
              • r_lee 4 hours ago

                and for the people that don't know, it's not just because they forgot to switch, sometimes it's just faster, e.g. YouTube search also recognizes Hangul sequences in Latin if you type them out

                you can also swear in a comedic way by just typing the Hangul sequence in Latin e.g. tlqkf

                • ryukoposting 35 minutes ago

                  > gozjsbtm

                  Hah, this comment is the top result when I searched with StartPage. There are a bunch of Korean results though.

                  • lifthrasiir 4 hours ago

                    It gets even better! EBS (한국교육방송 Educational Broadcasting System) is using "듄 dyun" as one of its brands, which is one such mis-transliterated word. Cyworld [EDIT: got confused], a once-popular SNS in Korea, once went by "쵸재깅 chyojaeging" in a similar way. (Both words have absolutely no meaning in Korean, suggesting that they were transliterated from QWERTY to a Hangul keyboard layout.)

                    • ta8884844 4 hours ago

                      I think you meant Cyworld instead of Tistory.

                      • lifthrasiir 3 hours ago

                        Oops, you are right (edited, thanks!). Tistory was 샨새교 instead.

                  • yorwba 6 days ago

                    > It's actually insane the levels of understanding the algorithms that are responsible for serving us information have and how little we, the creators of said algorithms, understand what's going on in said algorithms.

                    Keyboard layout mismatches are common enough that I assume Google has a layout detection stage hardcoded just like they have typo correction hardcoded. And the creators of said algorithms probably understand very well how they work. (The naïve way would be to convert from every possible layout to every other layout, but I think you could build something more lightweight using Hidden Markov Models.)
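The naïve approach mentioned above (try every layout conversion and see which one produces something sensible) might look like the sketch below. Nothing here reflects how Google actually does it. The vocabulary `KNOWN_QUERIES`, the remap names, and `guess_intended` are assumptions made for illustration, and the tables cover lowercase letters only.

```python
# Greek layout letter -> the QWERTY letter on the same physical key.
# The key that carries "q" has no Greek letter on it, so it is left out.
GREEK_TO_QWERTY = str.maketrans(
    "αβψδεφγηιξκλμνοπρστθωςχυζ",
    "abcdefghijklmnoprstuvwxyz",
)

# AZERTY and QWERTY swap a/q and z/w, so one table works in both directions.
AZERTY_SWAP = str.maketrans("aqzw", "qawz")

REMAPS = {
    "greek->qwerty": GREEK_TO_QWERTY,
    "azerty<->qwerty": AZERTY_SWAP,
}

# Hypothetical vocabulary of queries the engine already knows are popular.
KNOWN_QUERIES = {"youtube", "france", "midjourney"}


def guess_intended(query):
    """Return (best guess, name of the remap that produced it)."""
    q = query.lower()
    if q in KNOWN_QUERIES:
        return q, "as typed"
    for name, table in REMAPS.items():
        candidate = q.translate(table)
        if candidate in KNOWN_QUERIES:
            return candidate, name
    return q, "unknown"


print(guess_intended("frqnce"))  # ('france', 'azerty<->qwerty')
```

The cost grows with the number of layout pairs, which is presumably why a statistical model was suggested as a lighter alternative.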

                    • rhet0rica 20 minutes ago

                      Although the author appears to be of Indian descent, I think this is just a case of "Silicon Valley Tech Bro Discovers Localization," particularly since he noted he didn't know the word "transliteration." YouTube downloader sites have recognized "d,jd,f" (the wrong-keyboard moonspeak for يوتيوب) as meaning "YouTube" since forever and include this term intentionally in hand-written SEO keyword lists, indicating pretty clearly that it's not just the Google algorithm familiar with this sort of mistake. It's a problem we don't really face in the monolingual English world, but in any region with digraphia, it's just a fact of everyday life. See also the related phenomenon of mojibake, when a computer screws up the text encoding rather than a human.

                      • alisonkisk 4 hours ago

                        Typos could be automatically discovered and indexed one word at a time by watching users search the wrong word (wrong input method) and then search again with the correct input method.

                        • nelsondev an hour ago

                          This would be my guess. Query reformulation (a user rewriting their query after the first one doesn’t work for them) is a very common signal: search engines look for it in their search logs to learn (mis)spellings.

                      • janalsncm 4 hours ago

                        > It's actually insane the levels of understanding the algorithms that are responsible for serving us information have and how little we, the creators of said algorithms, understand what's going on in said algorithms.

                        As others have said, keyboard mismatches are common enough that Google might have built out logic for it specifically. But that's not necessary, and even “old school” search engines could learn these things.

                        The first time “alemwjsl” is searched you might not have any data, but the user will probably fix their keyboard and retype in Korean. That gives you a query correction mapping. And you can assume if query1 yields no clicks and they update to query2, q1 is a synonym for q2 and serve results for q2 instead.

                        Then, if a session contains a query “alemwjsl” and a click on midjourney.com and another session “midj” also contains a click on midjourney.com, those are co-clicked queries.

                        You can also even start to represent queries by the words in their associated clicked documents or vice versa. This helps to get around the fact that people might search “how much superbowl tickets” and “superbowl tickets price” but the official page might not contain either of those strings.

                        Of course there are more advanced methods now (neural nets) but it’s cool to see how it worked in the past.

                        https://www.kdd.org/kdd2016/papers/files/adf0361-yinA.pdf
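A toy version of the two log-mining ideas above, query corrections and co-clicked queries, could look like this. The session format and the sample data are invented for the example, and a real system would add thresholds, time windows and spam filtering that are left out here.

```python
from collections import defaultdict

# Hypothetical log: each session is an ordered list of (query, clicked_url),
# where clicked_url is None if the user clicked nothing for that query.
SESSIONS = [
    [("alemwjsl", None), ("미드저니", "midjourney.com")],
    [("alemwjsl", None), ("미드저니", "midjourney.com")],
    [("midj", "midjourney.com")],
    [("alemwjsl", "midjourney.com")],
]


def mine_rewrites(sessions):
    """Count q1 -> q2 pairs where q1 got no click and the next query did."""
    counts = defaultdict(int)
    for session in sessions:
        for (q1, click1), (q2, click2) in zip(session, session[1:]):
            if click1 is None and click2 is not None and q1 != q2:
                counts[(q1, q2)] += 1
    return dict(counts)


def co_clicked(sessions):
    """Group queries by clicked URL; keep URLs reached from several queries."""
    by_url = defaultdict(set)
    for session in sessions:
        for query, url in session:
            if url is not None:
                by_url[url].add(query)
    return {url: queries for url, queries in by_url.items() if len(queries) > 1}


print(mine_rewrites(SESSIONS))  # {('alemwjsl', '미드저니'): 2}
print(co_clicked(SESSIONS))
```

Neither function knows anything about keyboards; the layout mistake is learned purely from what users did next.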

                        • FeteCommuniste 4 hours ago

                          The Greek string υοθτθβε (meaningless and nearly unpronounceable, would sound roughly like "eeohtht-thveh") will get you YouTube as the top search result because those letters are what you get from typing "YouTube" with your keyboard set to Greek mode, at least on Windows.

                          • lifthrasiir 3 hours ago

                            Similarly, Japanese ようつべ translates to yo-u-tu-be (as if the word "Youtube" were read before the Great Vowel Shift) and is often used in place of the proper word.

                            • aadhavans 3 hours ago

                              Fascinating. Not only that, it even fetches https://www.youtube.com/feed/gr as the first result, at least on duckduckgo.

                            • phyzome 41 minutes ago

                              I think I could have done without the pages of LLM output.

                              • bee_rider 3 hours ago

                                Adding to the confusion, alemwjsl just almost looks like a plausible name for something. Looking at it, you start parsing, right? “Clearly this is Alex… uh, something… JavaScript Library…”

                                • xp84 4 hours ago

                                  > branded keywords aren't that great to run ads on anyway, you pretty much get all the traffic from them anyway since that's what the user wanted anyway, you don't really have to pay for it.

                                  Haven’t finished the article yet but this jumped out at me. This doesn’t ring true to me. Google runs an extortion scheme - since you can buy ads on your competitors’ trademarks, and since no users can tell ads from results (and since the organic results are now buried so far, they rarely get clicks anyway) if you don’t buy your brand keywords your competitors will get all your traffic.

                                  • magic_hamster 2 hours ago

                                    Utter disappointment. The post was building up to this great reveal only to end up with the most mundane explanation. Anyone who uses a bilingual (or more) keyboard has had something similar happen to them a few times.

                                    • ggrantrowberry 2 hours ago

                                      Most mysteries are pretty mundane when you solve them.

                                      Also, for people that don’t use bilingual keyboards this is a pretty interesting finding.

                                    • floren 5 hours ago

                                      > I scroll up a bit to reread ChatGPT's analysis, and I realize it mentions "transliteration". I have no idea what that word means, so I look it up.

                                      How?

                                      • N_Lens 2 hours ago

                                        But he's paid much less than what he makes companies, and his work is important and mysterious, don't you know?

                                        • _nivlac_ 4 hours ago

                                          Everyone has gaps in their knowledge which can be things that "should be obvious" to others. If someone doesn't know something, they either forgot or haven't learned it yet! I appreciated the author's honesty here.

                                          • Avicebron 5 hours ago

                                            In this case the "writing system" is the set of typos that occur when someone with both English and Korean keyboard layouts forgets to switch off English and types the keystrokes they expect to produce Korean. "Midjourney" is "alemwjsl" in that typo writing system.

                                            • floren 5 hours ago

                                              No I mean how do you never come across "transliteration", is that really such an unusual word?

                                              • Avicebron 5 hours ago

                                                I don't think so, but I grew up before cell phones and AI so I had to learn how to read. I'll leave the explanation for the rest who skip over the guillemet at the beginning like I did.

                                                • jamilton 4 hours ago

                                                  Yeah, I think it's an uncommon word. It's not a concept that would come up for most American English speakers, unless you're in a community that uses a language with another writing system (I think I first encountered it in a synagogue with Hebrew) or you're learning such a language.

                                                  I think I've maybe occasionally seen "translit." in text used to mark that the following is transliterated, but I could see that being easily glossed over.

                                                  • bigstrat2003 2 hours ago

                                                    Not imo. It's a word that I would expect any adult who finished college to have seen before.

                                                    • andoando 3 hours ago

                                                      I only know it because I'm bilingual

                                                • N_Lens 2 hours ago

                                                  Somehow something common has wrapped around to being 'insane'.

                                                  • GenerocUsername 5 hours ago

                                                    Going into the article I guessed it was the Dvorak -> Qwerty mismatch.

                                                    Korean -> English makes more sense.

                                                    • 0x1ch 5 hours ago

                                                      Language changing layouts was my first guess. For some reason I don't think there's a large venn diagram of dvorak / colemak typists and A.I. enthusiasts.

                                                    • shinhyeok an hour ago

                                                      As a Korean, this is hilarious

                                                      • echelon 24 minutes ago

                                                        aadillpickle, fantastic blog!

                                                        I've got nothing to add there that people haven't already been saying - this was a fascinating quirk of humanity and technology. Really good full-circle adventure uncovering the source.

                                                        I'm commenting because I have to know what you're doing with your website and blog. It looks like a markdown/obsidian/static site generator. It's gorgeous and amazing. Did you write it yourself? Is it open source software?

                                                        • jihadjihad 3 hours ago

                                                          But no one ever figured out what the deal was with “covfefe”?

                                                          • msephton 5 hours ago

                                                            Very cool! Please add an RSS feed to your blog.

                                                            • lifthrasiir 4 hours ago

                                                              Fun fact: intentional input method mismatch is commonly used for censoring profanities among internet streamers. For example, 시발 sibal (approximately corresponds to fuck in its ubiquity) transliterates to "tlqkf", so many Koreans can understand that without a written Korean text. Not that Koreans can generally read transliterated Hangul texts though.

                                                              • sergiotapia 5 hours ago

                                                                Google also knows what you're searching for if you touch type the wrong thing like one key shifted to the right.
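How Google handles this is not stated in the thread, but undoing a "hands one key to the right" slip is easy to sketch: map every key back to its left neighbour on the same row. The row strings assume a US QWERTY layout and `undo_right_shift` is a name invented for this example.

```python
# US QWERTY rows, including the punctuation key just right of the letters.
ROWS = ["qwertyuiop[", "asdfghjkl;", "zxcvbnm,"]

# Each key maps to its left neighbour, which undoes a one-key shift right.
LEFT_NEIGHBOUR = {
    row[i + 1]: row[i] for row in ROWS for i in range(len(row) - 1)
}


def undo_right_shift(typed):
    """Recover the intended text if every key was hit one position too far right."""
    return "".join(LEFT_NEIGHBOUR.get(ch, ch) for ch in typed)


print(undo_right_shift("upiyinr"))  # youtube
```

A search engine would still need some signal, such as a vocabulary or click data, to decide when applying this correction is warranted.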

                                                                • nadermx 5 hours ago

                                                                  This is a great example of tenacity. Pleasant to read too.

                                                                  • yieldcrv 3 hours ago

                                                                    ah, could have been worse. like something made up in a synthetic data set being training data for the world we experience