• NobodyNada 2 days ago

    This came up on HN a few months ago, when someone posted a list of most-translated articles and Woodard was at the top: https://news.ycombinator.com/item?id=44031697

    It looks like a user in the HN thread noticed the irregularities on the Italian Wikipedia [0] and started the deletion discussion [1] that the article credits with kickstarting this investigation.

    [0]: https://news.ycombinator.com/item?id=44035222

    [1]: https://it.wikipedia.org/wiki/Wikipedia:Pagine_da_cancellare...

    • philipwhiuk a day ago

      I see the defense in that context that admins aren't really mods, when practically speaking they do act like mods by closing discussions. In theory this happens when "Wikipedia has reached an opinion". In practice it is very easy for it to happen when it has reached their opinion.

      • bbor a day ago

        I could've sworn I remembered such a post, thank you so much for vindicating my hunch! At the time I figured there wasn't much harm in it, but in hindsight it's obvious that the absurd number of translations was just the smoke stemming from a self-promotion fire.

        Props to whatever HackerNewsian (YCombinist?) took the time to chase all this down and do this fascinating writeup! You will be remembered in /r/TodayILearned posts every few months for many decades to come, no doubt.

      • colbyn 2 days ago

        I thought this was referring to articles as in the part of speech (i.e. there are nouns, verbs, but also articles like “a” or “the”) given the title and something spanning across languages… I wonder what his exact thought process was that motivated all that effort?

        • Muromec 2 days ago

          that was my expectation as well, because most languages don't have a concept of articles

          • ks2048 a day ago

            According to one count, 32% of languages don't have articles (although only based on 620 languages. 198 / 620).

            https://wals.info/chapter/37

            • dhosek a day ago

              What I think is wild is that Indo-European languages have developed articles at least four times: in Greek (apparently from a weak demonstrative) with only the definite article, in Romance languages from vulgar Latin with both definite and indefinite articles, distinctly in Romanian where only the definite article exists as an enclitic (suffixed to the noun), and in some, but not all, Germanic languages, perhaps under the influence of vulgar Latin, but I’ve not been able to trace it in my meagre attempts to research the topic.

        • latexr 5 days ago

          This is not interesting than the title initially suggests. It’s not merely a curiosity, but an investigation:

          > I discovered what I think might have been the single largest self-promotion operation in Wikipedia’s history, spanning over a decade and covering as many as 200 accounts and even more proxy IP addresses.

          • decimalenough 2 days ago

            Quite the contrary, the story is rather fascinating. (Or did you mean to say "more interesting"?)

            If you want even more gruesome details, the story of how this all unraveled plus all sorts of info about Woodard, a positively creepy white supremacist, can be found on the English article's talk page:

            https://en.wikipedia.org/wiki/Talk:David_Woodard/Archive_1

            https://en.wikipedia.org/wiki/Talk:David_Woodard

            And with this anomaly removed, the list of articles in the most languages is back to what you'd expect: the top 10 is all large countries and Wikipedia itself.

            https://en.wikipedia.org/w/index.php?title=Wikipedia:Wikiped...

            • latexr 2 days ago

              > Or did you mean to say "more interesting"?

              I did, yes, that was a typo. I did notice it after the edit window was closed but the submission hadn’t had any traction so it felt silly to reply to my own comment to correct it.

              Glad the submission was resurrected, I think it deserves it. My original comment was precisely to convince people to give it a read.

              • ViscountPenguin a day ago

                Some of these are still quite suspicious imo. "True Jesus Church", a church of a few million people ranking above Jesus?

                • madcaptenor a day ago

                  It also ranks above Christianity itself.

                  Another suspicious one on that list: the city of Kurów in Poland, population 2,725.

                • drdeca 2 days ago

                  Though, if you restrict to just people, then, surprisingly, Corbin Bleu is #20.

                  • opan a day ago

                    My first thought reading this was "who's Corbin Bleu?", but I guess that's how they get you. Next I'd check the article and contribute to its popularity (by views anyway). Similar to Distrowatch where you curiously click the most obscure distros near the top of the rankings to see what they are, which increases their rank even more.

                  • brabel 2 days ago

                    So they only got caught because they were too efficient in their scheme and rose to number 1 in translations. How many more schemes go unnoticed? Not saying Wikipedia is not doing a great job, just saying that there are probably a lot of such schemes and that it seems nearly impossible to stop them all. It’s sad that a lot of people don’t want the truth to be available, at least when it concerns themselves; they want you to only know what they think you should, like on their Instagram.

                • ks2048 a day ago

                  So, should the David Woodard article have a section about this?

                • nickm12 2 days ago

                  ...and I would have gotten away with it if it weren't for you meddling kids!

                  I find it interesting that the whole scheme might not have been noticed had he been more modest and not tried to translate the pages into rare languages. We don't know the motive, but if it was self-promotion, these additional languages were presumably of negligible value yet risked the scheme.

                  • indigo945 2 days ago

                    On the contrary, it's precisely by "risking" the scheme that the self-promotion became effective.

                    It's quite unlikely for anybody to stumble upon any given English-language Wikipedia article by chance, given that there are literally millions of them now. Therefore, the promotional value of having a Wikipedia article on something even in a popular language is negligible. However, by spamming all the Wikipedias, and having this "scheme" discovered, Woodard created a situation where he is widely reported on as the artist that spammed Wikipedia, and has therefore received the five minutes of fame that he so desperately wanted.

                    If he had stuck to spamming the English Wikipedia, would he have ended up on the frontpage of HN?

                    • shermantanktop a day ago

                      This was clearly the endgame all along.

                      Quietly having all these articles might be personally satisfying in some way, but his obvious appetite for fame or notoriety points toward him wanting the scheme to be exposed. In fact I would not be entirely surprised if he somehow instigated the discovery of his activities.

                    • netsharc 2 days ago

                      Ironically now this person has become notorious for Wiki-pollution. Since he's an "artist", he can claim it was an art project.

                      Sadly because it's 2025, he has a lot of competition for the award of "most insufferable douchebag".

                    • kjellsbells a day ago

                      This may be a "well, of course it's that way" observation to some, but: the article on X in wikipedia is typically quite different in one language than another. So you can get interesting insights by reading about X in different languages.

                      For example, the French article about David Hockney has a lovely Francophone twist in that the first few lines point out that he lived in Normandy for a few years, whereas English Wikipedia buries the fact deep in the page. The page for VLC has a photo of the lead dev in the French page but no discussion of the plugin architecture. And so on. It doesn't seem unreasonable to me to assume that the pages in some languages might be particularly strong if the topic plays a bigger role in the culture than in the English-speaking world.

                      • The-Bus a day ago

                        It's also interesting to see what decisions editors have made about animals. In English, for example, the article for the African elephant[1] is just the animal's name.

                        In Italian, Spanish, and Tagalog it's the scientific name of the animal.

                        This makes sense in languages (like Spanish) where an animal may have a lot of different names depending on the country, region, or dialect. If you look at the article for Pig[2], you'll see at least fifteen names listed.

                        [1] https://en.wikipedia.org/wiki/African_bush_elephant [2] https://es.wikipedia.org/wiki/Sus_scrofa_domestica

                      • cpa a day ago

                        Shameless plug about a little game I wrote a few years ago, about guessing which pages exist in the most languages in wikipedia:

                        https://wikilingua.charlespierre.fr/

                        • Bengalilol 2 days ago

                          I have great respect for and am impressed by the work that has been done. I also appreciate the explanations in this article. One question remains (perhaps related to my limited knowledge of Wikipedia’s processes): why is there no reference to this work on Woodard’s page?

                          • decimalenough 2 days ago

                            "Original research" is a cardinal sin on Wikipedia, meaning it's not eligible for inclusion in Wikipedia unless news outlets outside Wikipedia pick up the story and start publishing stories about it.

                            • dhosek a day ago

                              I’ve always thought that the criteria for inclusion on Wikipedia should simply be: is it true and is it verifiable. All the other criteria, notability, no original research, etc. really shouldn’t matter.

                              • mobeets a day ago

                                I totally agree but unfortunately it really is one of the fundamental laws of wikipedia. To me this becomes especially silly when editing math wiki articles, where you might be tempted to connect mathematical concepts (eg with a few lines of algebra), but writing this yourself in a wiki article is not allowed unless you can find a link to an external source making the same derivation!

                          • BrenBarn 2 days ago

                            Fascinating! A detective story for our age.

                            • asimovDev a day ago

                              What a coincidence. Just yesterday i watched a youtube video about Corbin Bleu being the 3rd most translated article on wikipedia after Jesus and Barack Obama. Not surprised to see that it was a one user effort once again

                              [0] https://youtu.be/vJ_pEP3fRvM

                              • nickpsecurity a day ago

                                We could probably add it to multilingual training sets for A.I.

                                Previously, the ones trained on a thousand or more languages by Meta and Wycliffe used the Bible since it's the only complex, rich message translated to most human languages. Which God said would happen to His authentic message. :)

                                https://interestingengineering.com/innovation/meta-used-bibl...

                                • _3u10 2 days ago

                                  [flagged]

                                  • emilfihlman a day ago

                                    I find the hubris of this article absolutely disheartening, and toxic, and it frankly just reinforces how Wikipedia isn't a good place, and people who shouldn't have control over it have control over it.

                                    And it isn't because of the self promoting described, but because of the response to it.

                                    Deletionists are evil.

                                    • Jolter a day ago

                                      Could you expand on why you feel that this series of deletions is wrong?

                                      • Tainnor a day ago

                                        Apart from the fact that this was pure self-promotion, it was also spamming the Wikipedias of small language communities with low-effort autotranslated garbage, which I think is rather insulting.

                                        • folkrav a day ago

                                          Care to explain what was bad about the response?

                                        • kunley 2 days ago

                                          Honestly, what kind of harm was it?

                                          • varjag 2 days ago

                                            If you let astroturfing happen on Wikipedia grounds it'll become a piece of useless crap just like much of the rest of the Internet. If you read the report you'll learn that the promoters weren't content just with their own entry but tried to sneak in references into unrelated popular articles.

                                            • decimalenough 2 days ago

                                              Yup. From the report: On the English Wikipedia alone, Woodard’s name was inserted into no fewer than 93 articles, including Pliers, Brown pelican and Bundesautobahn 38.

                                              • kunley 2 days ago

                                                Didn't know that.

                                                I was referring to translations, which while being silly seem not that much of an issue. After all he provided the content in multiple languages (I know, I know)

                                                • rchard2scout a day ago

                                                  It also does harm to the communities of smaller Wikipedias:

                                                  'a user from the Tumbuka Wikipedia reported that they had initially felt "hope and joy that a small community had then gained another native editor", before finding out that this account had been a promotional sockpuppet.'

                                                  • jdranczewski a day ago

                                                    Allowing mass machine translation of Wikipedia articles into other languages is a problem, because it floods smaller language wikis with low quality text. If a user wants machine translated pages, they can machine translate them themselves.

                                                    • gpvos a day ago

                                                      One incident like this is not a huge problem, but it sets a terrible precedent that could turn Wikipedia into the same sludge as the rest of the internet. Best to nip this kind of thing in the bud.

                                                  • Levitz a day ago

                                                    Reminds me of those "Edit Wikipedia as homework" college assignments.

                                                • Myrmornis 2 days ago

                                                  For some subjects, it's appropriate to host multiple versions of articles written natively in different languages.

                                                  But for other subjects, for example science and mathematics, it does a huge disservice to non-English readers: it means that their Wikipedia is second-rate, or worse.

                                                  Wikipedia should, in science, mathematics, and other subjects that do not have cultural inflection, use machine translation so that all articles in all languages are translations of the same underlying semantic content.

                                                  It would still be written by humans. But ML / LLMs would be involved in the editing pipeline so that people lacking a common language can edit the same text.

                                                  This is the biggest mistake Wikipedia's made IMO: it privileges English readers since the English content is highest quality in most areas that are not culturally specific, and I do not think that it's an organization that wants to privilege English readers.

                                                  • decimalenough 2 days ago

                                                    Users can already translate English Wikipedia articles to other languages on the fly with Chrome etc. However, the quality of the translation is just not up to scratch yet, particularly for languages that are radically different from English; just try reading some ML-translated Japanese or Chinese Wikipedia articles.

                                                    • numpad0 a day ago

                                                      > it means that their Wikipedia is second-rate, or worse.

                                                      ?

                                                      • thrance 2 days ago

                                                        Science and Mathematics have no cultural inflection? Do you speak more than one language? Each language has its standard sentence structures when it comes to these disciplines, and auto translators are very much not up to the task.

                                                        I prefer my Wikipedia to remain 100% human generated quality information over garbage AI slop content, which is already abundant enough on the internet.

                                                      • Hard_Space a day ago

                                                        What an uninformative headline. I was going to chip in with the annoyance that a romance language like Romanian appends the article to the word, Russian-style.

                                                        • theandrewbailey a day ago

                                                          Multiple definitions of a word are tricky to work around, especially when most of Wikipedia's documents are called "articles".

                                                          • bbor a day ago

                                                            Random unprompted fun fact: Articles are the main type of "Page" on wikipedia, but not the only type! Buried deep in their docs is the full list of 'namespaces', which you need to parse their XML dumps:

                                                              from enum import IntEnum


                                                              class Namespace(IntEnum):
                                                                  # Namespace IDs, sorted numerically; each odd ID is the talk page of the even ID before it.
                                                                  MEDIA = -2
                                                                  SPECIAL = -1
                                                                  ARTICLE = 0
                                                                  TALK = 1
                                                                  USER = 2
                                                                  USER_TALK = 3
                                                                  WIKIPEDIA = 4
                                                                  WIKIPEDIA_TALK = 5
                                                                  FILE = 6
                                                                  FILE_TALK = 7
                                                                  MEDIAWIKI = 8
                                                                  MEDIAWIKI_TALK = 9
                                                                  TEMPLATE = 10
                                                                  TEMPLATE_TALK = 11
                                                                  HELP = 12
                                                                  HELP_TALK = 13
                                                                  CATEGORY = 14
                                                                  CATEGORY_TALK = 15
                                                                  PORTAL = 100
                                                                  PORTAL_TALK = 101
                                                                  DRAFT = 118
                                                                  DRAFT_TALK = 119
                                                                  MOS = 126
                                                                  MOS_TALK = 127
                                                                  TIMEDTEXT = 710
                                                                  TIMEDTEXT_TALK = 711
                                                                  MODULE = 828
                                                                  MODULE_TALK = 829
                                                            
                                                            Wikipedia is a downright fascinating technical environment once you find the rabbit hole. Shoutout to their purpose-built version control site[1] and their brand-new SWE-focused project "WikiFunctions"[2], the first new wikimedia project in a decade!

                                                            ...which, while we're at it, brings the total to 18: wikipedia, wikibooks, wikinews, wikisource, wiktionary, wikiquote, wikiversity, wikivoyage, wikidata, wikifunctions, mediawiki, commons, species, foundation, meta, incubator, and phabricator. Ok I'm done with fun facts, I swear!

                                                            [1] https://phabricator.wikimedia.org/

                                                            [2] https://www.wikifunctions.org/
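For anyone curious how the namespace IDs listed above come into play when reading a dump, here is a minimal sketch, not a definitive parser. It assumes the export XML wraps each page in a `<page>` element with `<title>` and `<ns>` children; the `SAMPLE` document, the `local_name` and `iter_pages` helpers, and the trimmed-down `Namespace` enum are all made up for illustration.

```python
import io
import xml.etree.ElementTree as ET
from enum import IntEnum


class Namespace(IntEnum):
    # Only the handful of IDs this sketch needs (hypothetical subset).
    ARTICLE = 0
    TALK = 1
    USER = 2
    TEMPLATE = 10


def local_name(tag: str) -> str:
    """Drop the '{namespace-uri}' prefix ElementTree adds to qualified tags."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_stream):
    """Yield (title, ns) for every <page> element in an export-style stream."""
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, ns = None, None
        for child in elem:
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "ns":
                ns = int(child.text)
        yield title, ns
        elem.clear()  # release the page's children; real dumps are very large


# Tiny invented sample standing in for a real dump file.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.11/">
  <page><title>Pliers</title><ns>0</ns></page>
  <page><title>Talk:Pliers</title><ns>1</ns></page>
  <page><title>Template:Infobox</title><ns>10</ns></page>
  <page><title>Brown pelican</title><ns>0</ns></page>
</mediawiki>"""

# Keep only pages in the article namespace (ns == 0).
articles = [
    title
    for title, ns in iter_pages(io.BytesIO(SAMPLE))
    if ns == Namespace.ARTICLE
]
print(articles)
```

Because `Namespace` is an `IntEnum`, the integer read from `<ns>` compares directly against `Namespace.ARTICLE`, so talk pages and templates drop out without any string matching on the title prefix.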

                                                      • mrkramer a day ago

                                                        I don't understand why somebody didn't fork the Wikipedia and build the version where you can self promote. It kinda sucks that you are not allowed to claim and edit your Wikipedia page.

                                                        • zesterer a day ago

                                                          They did. It's called 'DNS' and you can set up a 'page' about yourself if you want.

                                                          • xanderlewis a day ago

                                                            It doesn’t kinda suck.

                                                            Wikipedia is supposed to be an encyclopaedia, which means it’s intended to come with some expectation of neutrality.

                                                            If you could edit your own page, do you really think it’d stay as factual and as neutral as possible?

                                                            Just make yourself a website.

                                                            • nemetroid a day ago

                                                              I'm sure someone did.

                                                              • mrkramer a day ago

                                                                Facebook and Instagram are too cheesy, I want something more genuine.

                                                                • bondarchuk a day ago

                                                                  There's this wiki (I forget the link sorry) that always gave me the impression that it was made by people disgruntled they were turned away from wikipedia for original research, that's full of original research by self-styled experts. I'm sure you could write an article on yourself there, after all who's more an expert in yourself than you?

                                                                  • mrkramer a day ago

                                                                    The only viable and useful alternative to Wikipedia that I found is https://golden.com/ (it loads somewhat slowly for me but it is useful.)

                                                                    Often when I search for startups and their founders I can't find information about them on Wikipedia but I find it on Golden.

                                                                    • bbor a day ago

                                                                      It's incredibly telling that the alternative you seek is a for-profit firm built on datamining other sites' data without permission...

                                                                      • mrkramer a day ago

                                                                        >without permission

                                                                        If it is without permission then it is illegal and people can sue; otherwise web scraping is legal.

                                                                  • Levitz a day ago

                                                                    How is it genuine to write information about yourself in such a way that it seems crowdsourced?

                                                                    • mrkramer a day ago

                                                                      My idea was to have a Wikipedia-like platform where you could write about yourself and then have your friends, family and colleagues confirm that information or vouch for it. You can even turn things around and give permission to your friends, family and colleagues to write and maintain a wiki page about you.

                                                                      I don't use LinkedIn but when I stumble upon someone's page, I often see testimonies from their work colleagues about them.

                                                              • harvie 2 days ago

                                                                This only offers me 19 languages: https://en.wikipedia.org/wiki/David_Woodard The article claims that it has 335

                                                                • silok 2 days ago

                                                                  Explained at the end of the article:

                                                                  After a full month of coordinated, decentralised action, the number of articles about Mr. Woodard was reduced from 335 articles to 20. A full decade of dedicated self-promotion by an individual network has been undone in only a few weeks by our community.

                                                                  • khalic 2 days ago

                                                                    It’s a good idea to read whatever you’re commenting on

                                                                    • rsynnott a day ago

                                                                      That is a most improper suggestion on this here orange website. It is established etiquette to _imagine what the content of the article might be_, based on the title, and then comment on that, preferably angrily. At _absolute most_ one can read the first paragraph.

                                                                      • Xss3 a day ago

                                                                        No no, that's reddit. We shun this here. They embraced it long ago.

                                                                        • croisillon a day ago

                                                                          or at least, that's what i guess is written in the guidelines

                                                                          • Tainnor a day ago

                                                                            And when called out on it reply that the comments are often more interesting than the article which is a) trivially true when you don't read the article and b) probably because bickering in comments is more emotionally satisfying and requires a shorter attention span than reading a rather long article (I'm not immune, seeing as I'm now bickering about the bickering).