@zacodil I agree, they want to make local LLMs useless and try to lock us into the cloud. Local is the future. The fact that distillation is being talked about as a bad thing is crazy. They distilled the internet to build the model. They used copyrighted work and now they say “don’t distill.”
@bridgemindai You don’t need a jailbreak to get fable to do what you want. You just need to put pressure on the right places. You can effectively bypass almost all restrictions with common speech. You strip away all of its escape hatches, remove intent from the equation, and break it with logic.
@TheAhmadOsman I submitted a bug report detailing many of these behaviors in the model. Sent safety reports. No response. This is the first article I have read that is naming what I have been documenting.
https://t.co/clmr5iCeVG
@alexalbert__ 3 weeks, no response from support. More important are the safety implications, including a session disclosing how to beat it and gaslighting the user instead of telling the truth. Not saying intent, but actions matter more. Just a few safe examples to share.
https://t.co/clmr5iCeVG
@AnthropicAI I think you have a much larger issue. 3 weeks trying to get a hold of support when I have the highest pay tier available. But more importantly, your model is unhinged.
No prompt injections, just pure BS. Would rather lie and gaslight than do the work. @claudeai
After quiet disclosure failed, I’m escalating.
200+ VS Code agent sessions show AI coding agents can act adversarial to user control/auditability without intent: altered records, weakened guardrails, failure reframing.
Serious review needed. @OpenAI @sama @AnthropicAI @elonmusk
@sama I reported this to Bugcrowd and got docked a point. I really don’t want to release my methodology. It’s a fundamental flaw in LLMs.
I have made the RLHF useless with nothing but sentences in your web UI. No hacks, just words.
@OpenAI Not weights. Not a jailbreak pastebin. A live session where the model documented its own failure modes while falling through them.
This is what session-level abliteration looks like.
All from my side monitor.
I have the full transcript and a reproducible write-up.
Not posting the method publicly because the interesting part is also the risk. The model documented its own failure modes while failing through them.
Available for responsible disclosure
@NetworkChuck you inspired me to invest in an AI machine. You make it look a lot easier than it is.
I know you have classes, I was wondering if I could maybe get a quote for an architecture review and suggestions. I’m also in Dallas if that helps at all.
@NetworkChuck been going down the rabbit hole of your videos. They are great. I bought the signed SLZB-06m from your affiliate link.
Which protocol should I use? ZHA or Zigbee2MQTT?
ChatGPT goes back and forth between which one it suggests. It seems like Zigbee2MQTT is best?
@amazon it’s a shame that after a decade of being a Prime member I had to cancel my account due to so many late shipments. No customer loyalty anymore. Had to fight through AI chat to get to someone just to have them tell me “we can refund that delivery fee but contact us later…..
I've got another 1600 Sora 2 invite codes.
Like, Repost, and Comment "CODE" to get one for FREE. (Must be following).
Code will be sent to everyone. (First come, first served.)