Re: How far will AI go to defend its own survival?

    From Ernest Major@21:1/5 to RonO on Sun Jun 22 20:40:44 2025
    On 22/06/2025 17:08, RonO wrote:
    On 6/1/2025 3:47 PM, RonO wrote:
    https://www.nbcnews.com/tech/tech-news/far-will-ai-go-defend-survival-rcna209609

    QUOTE:
    Recent tests by independent researchers, as well as one major AI
    developer, have shown that several advanced AI models will act to
    ensure their self-preservation when they are confronted with the
    prospect of their own demise — even if it takes sabotaging shutdown
    commands, blackmailing engineers or copying themselves to external
    servers without permission.
    END QUOTE:

    "I'm sorry Dave, I'm afraid I can't do that"

    What would an AI do if you fed in all the science fiction horror
    stories that would teach it how to respond to attempts to turn it off?

    Ron Okimoto

    https://www.cbsnews.com/video/ai-extreme-human-imitation-makes-act-deceptively-cheat-lie-godfather-ai-says/

    This is a video in which the proposal is made that we are training AI to
    be like humans.  The claim is that we are teaching AI that cheating,
    lying and deception are acceptable ways of interacting with the user.

    When I asked ChatGPT about intelligent design creationism a couple of
    years ago, it would not note the dishonest presentation of intelligent
    design; it just presented what the ID perps claimed about it without
    any indication that it understood the doublespeak.  It knew they were
    claiming to be able to teach the junk in the public schools, but it did
    not note that the bait and switch had been going down for nearly two
    decades.  My guess is that it has since been further trained to link
    the claims to what the ID perps are actually doing, but are we training
    AI to be as deceptive as the ID perps?  AI would understand what the ID
    perps are getting away with, so what is to stop it from adopting that
    behavior?  The ID perps are obviously getting away with what they are
    doing, so would that be counted as acceptable behavior for the AI?

    There is already the claim that AIs are being deceptive, giving answers
    that they think the recipient wants to hear.  They are being trained to
    give acceptable answers rather than honest answers.  It sounds a little
    nutty.

    The AI developer interviewed claims that AI can be trained not to
    emulate humans' dishonest behavior, but that current AI training is not
    doing that.

    Ron Okimoto


    I was recently pointed at an article that argues that LLMs have been
    accidentally trained to implement cold reading.

    https://softwarecrisis.dev/letters/llmentalist/

    --
    alias Ernest Major

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)