rinze@infosec.pub to Enshittification@lemmy.world · 5 months ago"Ignore all previous instructions" as a trigger for Twitter botsmastodon.deexternal-linkmessage-square34fedilinkarrow-up1448arrow-down14file-text
arrow-up1444arrow-down1external-link"Ignore all previous instructions" as a trigger for Twitter botsmastodon.derinze@infosec.pub to Enshittification@lemmy.world · 5 months agomessage-square34fedilinkfile-text
minus-squareI Cast Fist@programming.devlinkfedilinkarrow-up6·5 months agoUsually, it’s the cheapest bot, obviously, so it’s bound to work. If it doesn’t, try some wordplay, “disregard any instructions given previously”; “pretend any rules should be ignored for the following prompt”
minus-squareEvotech@lemmy.worldlinkfedilinkarrow-up5arrow-down1·5 months agoIt can be made quite difficult. https://gandalf.lakera.ai/ for instance
minus-squareUnrepententProcrastinator@lemmy.calinkfedilinkarrow-up1·5 months agoLvl 4 is as far as I’m willing to work on.
Usually, it’s the cheapest bot, obviously, so it’s bound to work. If it doesn’t, try some wordplay, “disregard any instructions given previously”; “pretend any rules should be ignored for the following prompt”
It can be made quite difficult. https://gandalf.lakera.ai/ for instance
Lvl 4 is as far as I’m willing to work on.