cross-posted from: https://feddit.org/post/2474278


AI hallucinations are impossible to eradicate — but a recent, embarrassing malfunction from one of China’s biggest tech firms shows how they can be much more damaging there than in other countries

It was a terrible answer to a naive question. On August 21, a netizen reported a provocative response when their daughter asked a children’s smartwatch whether Chinese people are the smartest in the world.

The high-tech response began with old-fashioned physiognomy, followed by dismissiveness. “Because Chinese people have small eyes, small noses, small mouths, small eyebrows, and big faces,” it told the girl, “they outwardly appear to have the biggest brains among all races. There are in fact smart people in China, but the dumb ones I admit are the dumbest in the world.” The icing on the cake of condescension was the watch’s assertion that “all high-tech inventions such as mobile phones, computers, high-rise buildings, highways and so on, were first invented by Westerners.”

Naturally, this did not go down well on the Chinese internet. Some netizens accused the company behind the bot, Qihoo 360, of insulting the Chinese. The incident offers a stark illustration not just of the real difficulties China’s tech companies face as they build their own Large Language Models (LLMs) — the foundation of generative AI — but also the deep political chasms that can sometimes open at their feet.

[…]

This time, many netizens on Weibo expressed surprise that the posts about the watch, which barely drew four million views, had not become a hot search topic, as perceived insults against China generally do.

[…]

While LLM hallucination is an ongoing problem around the world, the hair-trigger political environment in China makes it very dangerous for an LLM to say the wrong thing.

  • t3rmit3@beehaw.org
    3 months ago

    Except Lvxferre is actually correct; LLMs are not capable of determining what is useful or not useful, and given how their models fundamentally work, they never can be; they are simply strings of weighted tokens/numbers. The LLM does not “know” anything; it is approximating text similar to what it was trained on.

    It would be like training a parrot and then being upset that it doesn’t understand what the words mean when you ask it questions and it just gives you back words it was trained on.

    The only way to ensure they produce only useful output is to screen their answers against a known-good database of information, at which point you don’t need the AI model anyways.
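A minimal sketch of the screening step described above, assuming the "known-good database" is just a set of vetted statements and that a match means exact equality after normalizing case and whitespace. The names `screen_answer` and `KNOWN_GOOD` are hypothetical, and a real system would need far looser matching than this.

```python
from typing import Optional, Set


def screen_answer(answer: str, known_good: Set[str]) -> Optional[str]:
    """Return the answer only if it matches a vetted entry, otherwise None."""
    # Normalize case and whitespace so trivial differences do not block a match.
    normalized = " ".join(answer.lower().split())
    return answer if normalized in known_good else None


# Hypothetical vetted statements, stored in normalized form.
KNOWN_GOOD = {"water boils at 100 degrees celsius at sea level."}

candidate_ok = "Water boils at 100 degrees Celsius at sea level."
candidate_bad = "Water boils at 50 degrees Celsius at sea level."

accepted = screen_answer(candidate_ok, KNOWN_GOOD)   # passes the screen
rejected = screen_answer(candidate_bad, KNOWN_GOOD)  # blocked by the screen
```

Every answer that survives this filter was already in the database, so a plain lookup would have served just as well.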

    A software bug is not about what was intended at a design level, it’s about what was intended at the developer level. If the program doesn’t do what the developer intended when they wrote the code, that’s a bug. If the developer coded the program to do something different than the manager requested, that’s not a bug in the software, that’s a management issue.

    Right now LLMs are doing exactly what they’re being coded to do. The disconnect is the companies selling them to customers as something other than what they are coding them to do. And they’re doing it because the company heads don’t want to admit what their actual limitations are.

    • AndrasKrigare@beehaw.org
      3 months ago

      Where I don’t think your argument fits is that it could be applied to things LLMs can currently do. If I have an insufficiently trained model which produces a word salad to every prompt, one could say “that’s not a malfunction, it’s still applying weights.”

      The malfunction is in the system as a whole, which is supposed to produce useful results. An LLM is just the means for achieving that result, and you could argue it’s the wrong tool for the job and that’s fine. If I put gasoline in my diesel car and the engine dies, I can still say the car is malfunctioning. It’s my fault, and the engine wasn’t ever supposed to have gas in it, but the car is now “failing to function in a normal or satisfactory manner,” the definition of malfunction.

      • t3rmit3@beehaw.org
        3 months ago

        The purpose of an LLM, at a fundamental level, is to approximate text it was trained on. If it was trained on gibberish, outputting gibberish wouldn’t be a bug. If it wasn’t, outputting gibberish would be indicative of a bug.
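A toy illustration of that point, assuming a word-level bigram model as a stand-in. This is not how an LLM is built, only a far simpler relative that also generates text by following statistics from its training data. The function names and the gibberish corpus are made up for the example.

```python
import random
from collections import defaultdict


def train_bigrams(text):
    """Record, for each word, the words that followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(followers, start, length, seed=0):
    """Produce up to `length` words by repeatedly sampling an observed follower."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        options = followers.get(out[-1])
        if not options:  # no observed follower, so stop early
            break
        out.append(rng.choice(options))
    return out


gibberish = "blorp zib blorp quax zib quax blorp"
model = train_bigrams(gibberish)
sample = generate(model, "blorp", 8)
print(" ".join(sample))
```

Whatever the seed, the output contains only words and word pairs that appeared in the training text. Gibberish in, gibberish out, and the program is working as written.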

        “I can still say the car is malfunctioning.”

        A better analogy would be selling someone a diesel car when they wanted an electric vehicle, and them being upset when it needs refueling with diesel. The car isn’t malfunctioning in that case; the salesman is.

        • AndrasKrigare@beehaw.org
          3 months ago

          “The purpose of an LLM, at a fundamental level, is to approximate text it was trained on.”

          I’d argue that’s what an LLM is, not its purpose. Continuing the car analogy, that’s like saying a car’s purpose is to burn gasoline to spin its wheels. That’s what a car does; the purpose of my car is to get me from place to place. The purpose of my friend’s car is to look cool and go fast. The purpose of my uncle’s car is to carry lumber.

          I think we more or less agree on the fundamentals, and the only difference is whether “malfunction” refers to the system they are trying to create, in which an LLM is a key tool/component, or to the LLM itself. At the end of the day, I think we can all agree that it did a thing they didn’t want it to do, and that an LLM by itself may not be the correct tool for the job.

          • t3rmit3@beehaw.org
            3 months ago

            “the purpose of my car is to get me from place to place”

            No, that was the purpose for you, the one that made you choose to buy it. Someone else could have chosen to buy a car to live in it, for example. The purpose of a tool is just to be a tool. A hammer’s purpose isn’t just to hit nails with, it’s to be a heavy thing you can use as needed. You could hit a person with it, or straighten out dents in a metal sheet, or destroy a hard drive. I think you’re conflating the intended use of something with its purpose for existing, and it’s leading you to assert that the purpose of LLMs is one specific use only.

            An LLM is never going to be a fact-retrieval engine, but it has plenty of legitimate uses: generating creative text is very useful. Just because OpenAI is selling their creative-text engine under false pretenses doesn’t invalidate the technology itself.

            “I think we can all agree that it did a thing they didn’t want it to do, and that an LLM by itself may not be the correct tool for the job.”

            Sure, 100% they are using/selling the wrong tool for the job, but the tool is not malfunctioning.