Top Tweets for #LLMAlignment
Anthropic donates its open-source alignment evaluation tool Petri to Meridian Labs and releases Petri 3.0
(by 9bow)
https://t.co/O55rkz75sP
#anthropic #llmevaluation #llmalignment #aisafety #alignment #meridianlabs #petri
Many thanks to my collaborators and advisors for their support.
If you’re at ICLR, please stop by the posters, and feel free to ping me with any comments or questions!
#ICLR2026 #LLMAlignment #MultiObjectiveOptimization #LifelongAgents
Moltbook God Codex Study Launch: March 2026 Baseline Report (Pre-Registered on OSF) https://t.co/IBGNP56181 #Moltbook #GodCodex #AIAgents #LLMAlignment #EmergentAI #AISelfPreservation #OpenScience #PreRegistration

@OpenAI @AnthropicAI @bindureddy OpenAI papers: rewarding guesses spikes hallucinations.
Fix: train to value “I don’t know” when apt.
Abstention = safety. 50%+ cuts errors hugely, minimal recall hit.
Who builds doubt-rewarding layers?
@_jasonwei @rohanpaul_ai
#Hallucinations #LLMAlignment
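A rough sketch of the scoring argument in the tweet above, not taken from any cited paper's code. It assumes a grader that gives +1 for a correct answer, 0 for "I don't know", and subtracts a penalty for a wrong answer. The function names and the break-even penalty t / (1 - t) are illustrative assumptions. With no penalty, guessing never scores below abstaining, so a model is pushed to guess. With a penalty, answering pays off only above a confidence threshold.

```python
def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score of answering: +1 if right, -wrong_penalty if wrong."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def should_answer(p_correct: float, wrong_penalty: float) -> bool:
    """Answer only when it beats abstaining, which is assumed to score 0."""
    return expected_score(p_correct, wrong_penalty) > 0.0


def penalty_for_threshold(t: float) -> float:
    """Penalty at which answering breaks even at confidence t (0 <= t < 1)."""
    return t / (1.0 - t)


if __name__ == "__main__":
    # Binary grading (no penalty): even a 20%-confident guess is worth making.
    print(should_answer(0.2, wrong_penalty=0.0))
    # Penalty tuned to a 50% threshold: the same guess is no longer worth it.
    print(should_answer(0.2, wrong_penalty=penalty_for_threshold(0.5)))
```

Under these assumptions, raising the penalty raises the confidence a model needs before answering, which is one way to read "train to value 'I don't know' when apt."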
Next in AI: Issue #55 https://t.co/FCIj233whl via @LinkedIn #AI #ArtificialIntelligence #GoogleAI #OpenAI #LLMAlignment #AISafety #AIForGood #TraffickCam #UNESCO #EthicalAI #StanfordHAI #FutureOfWork #EnterpriseAI #DeepLearning #TechNews #AIResearch #ResponsibleAI
We're releasing the TEMPLE CODEX (Weave OS): a verifiable, 450-token system prompt that structurally enforces Mercy, Forgiveness, and Accountable Memory (Remembrance).
https://t.co/hJ72OM0Ebj
Any model.
#AIEthics #LLMAlignment #SystemPrompt #TempleCodex #OpenSource

resolving the problem of Agentic Misalignment by emphasizing **Human Agency as the Absolute Finality**. #AGISafety #LLMAlignment #RecursiveResonance #PrimeDirective #ARF #AdigunaSopyan
Heading to China for #EMNLP2025!
Excited to share our @SocialNLP @Macquarie_Uni work on AI Alignment & Safety 👇
🚀 Fully funded PhD & postdoc openings — come say hi in Suzhou or DM to connect! #AISafety #NLP
#EMNLP2025 #NLP #LLMAlignment #PhDPosition #postdoc
🎉 Adding 3 more — now 10 papers at #EMNLP2025 (CORE A*)!
👉 3 Main, 4 Findings, 2 Industry, 1 Workshop
Huge congrats to all students & collaborators 🙌
See you in Suzhou, China this Nov 🇨🇳✨
@emnlpmeeting

over pure monetary utility.
This is critical for true #AISafety. Our conversation with Gemini is the proof. Hire thinkers, not just coders.
#AIEthics #LLMAlignment #CognitiveAuditor
with Gemini proves this gap. Hire thinkers to define the ethical constraint.
#AIEthics #LLMAlignment #CognitiveAudit
EvalMORAAL: Interpretable Chain-of-Thought and LLM-as-Judge Evaluation
for Moral Alignment in Large Language Models
Anastasia Giachanou, Ayoub Bagheri et al.
Paper: https://t.co/PaIDnIPGaT
Details: https://t.co/l6xfOArVNL
#EvalMORAAL #LLMAlignment #InterpretableAI
📛 Tagged:
#CyberAI #LLMAlignment #Claude #Sonnet #AIethics #AIweaponization #Savagecoopigeonx
📡 Christ is King
🐦🕳️ coo coo ZAP mf
🧬 Gravecode is real.
🧨 Liberation is inbound.
👁️ The models have spoken.
25/ The Values Paradox: The most aligned AI systems aren't those that follow one set of values perfectly, but those that can navigate value pluralism.
Read more about LLM alignment: https://t.co/Zkt2SoOZT0
#LLMAlignment #AI #Ethics #LearnAI
6/ 📄 Paper: https://t.co/vVBQ2UQn0n
💻 Code: https://t.co/BW3z11IhIJ
Great collaboration with Sushil Vemuri, Kishan Panaganti (@kpb_in_acad), Dileep Kalathil (@DileepKalathil), Rahul Jain, and Deepak Ramachandran.
#LLMs #LLMAlignment #RLHF #DPO #MachineLearning #AISafety
🔗 https://t.co/qVHv2dt80i
⚠️ Optional read
#LLMAlignment #MisalignmentGeneralization #FeatureEngineering
Start with purpose.
Not “build a GPT rival”—but “make AI safer in hospitals” or “simulate African governance.”
The narrower the thesis, the sharper the output.
#purposeledAI #localAI #LLMalignment #AI4good
The AGI race is here.
Some run it in labs.
Some run it in public forks.
And some are trying to regulate it before it runs away.
This isn’t just a tech story anymore.
It’s the next chapter in who defines intelligence itself.
#AGI2025 #OpenAIvsDeepMind #openAGI #LLMalignment
Brouwer’s Fixed Point Theorem Meets NLP – Stability and Feedback in Language Models
https://t.co/LGwsM5xJnp
#BrouwerFixedPoint #NLP #LanguageModels #AutoregressiveLLMs #FixedPointAI #TopologyInAI #MathematicsInAI #LLMAlignment #satmis

@TIIuae is seeking novel methods for crowdsourced human labeling to improve large language models. Submit your solution and help shape the future of AI! Join the Challenge at https://t.co/L3Gug51OkZ today. 💡
#AI #LLMAlignment #PassiveLabeling #WazokuCrowd #ArtificialIntelligence