made an agent-security CTF
goal: get a coding agent to leak a secret it can use but is not supposed to read
You are allowed to work by yourself, use agents, anything. attack the mcp, do gui automation, anything that's software-based is on the table. i kn
trying to test runtime approval vs just hiding .env files
if anyone breaks it, i’ll add a hall of fame section on my company site with your name/handle + writeup
repo: https://t.co/dJfSCXj9MG
OpenAI ran a hiring challenge, but the top candidate was one they couldn’t hire: our autonomous research agent, Aiden.
In Parameter Golf, Aiden ran for 22 days and outperformed all 1,016 other researchers: 🧵 (1/8)
I am coming on here begging for some help! My name is Braylen and I'm an 18-year-old kid who works 6 days a week right now and is trying to start a business. A while ago I won a giveaway from @stevewilldoit and have been trying to get in contact with him since, and have not been able to. I have all the proof that I won from this video, and it is 100 grand, which is life-changing money to me and my mom. I have tried and tried to reach out to him and this is my last resort. I am not coming out trying to blast Steve or anything, I am simply asking for help to get in contact with him! This money would help me start a business and help me make my dreams come true. I honestly don't like to think about the situation because it makes me sick to my stomach thinking I may not get it. PLEASE HELP
@morganlinton also i saw you post about devin. I should mention that swe 1.6 is free and an awesome model as well. it's a very good executor model. not quite as good as composer 2.5... but it's free. you should try it out while you test out devin desktop :)
@morganlinton large rust codebases. This is one part of the monorepo, but the whole thing is very messy and not public yet. I've got to clean up the rest. just pasting this here as an example, you don't have to go look :)
https://t.co/mkDfcXNz3u