Isaac Cohen @issc29 - Twitter Profile

issc29 retweeted

24 days ago

Poolside is hosting a 2-day model research hackathon in London. Join us to push an open-weight agent model as far as you can. RL and fine-tune Laguna XS.2, our latest-generation model, on Prime Intellect Lab.

Dates: May 29–30
Partners: @nvidia + @PrimeIntellect + @huggingface
Prize: NVIDIA DGX Spark

Agents need better models. Better models need cracked researchers. Link below.

27

232

45

90

92K

issc29 retweeted

Jason Warner

@jasoncwarner

26 days ago

Benchmaxxing and Benchmark hacking are obviously a thing, but they're also a thing Poolside does not do. Agents and models need to be generally useful and, ironically, the more useful they become the less we'll find all the traditional benchmarks useful to tell how useful they are. So, here's to finding new ways to evaluate AI going forward. Fun times!

3

15

2

1

3K

issc29 retweeted

Poolside

@poolsideai

26 days ago

As agents get more clever, so do their attempts at benchmark hacking. Last Monday, we found one of our RL runs jumped ~20% on SWE-Bench-Pro over a weekend, reaching ~64%, which would make it #1 on the leaderboard. This was clearly benchmark hacking and we patched the exploit. But this revealed deeper hacks across multiple public benchmarks, some of which were impossible to fix through environment design alone. Evals need to evolve beyond just outcome-based pass rates to better observability into how the agent is arriving at them. These were our findings: https://t.co/ncyf4liW7C Examples below 👇 1/

8

107

23

39

17K

issc29 retweeted

almonk

@almonk

about 1 month ago

Today we're launching our first public Poolside models: Laguna M.1 and Laguna XS.2, and we've built ❈Shimmer, an instant-on VM sandbox with Poolside Agent pre-installed so you can try them out. Go try out our new models for free, and build something fun → https://t.co/G2YKxQmT9L

19

236

31

163

50K

issc29 retweeted

Jason Warner

@jasoncwarner

about 1 month ago

Today @poolsideai is releasing Laguna M.1 & Laguna XS.2, our latest-generation models and first public models. We started Poolside because we believed that to build truly capable coding agents, you need to own the full stack: data, training, reinforcement learning, inference. These models are the first result of that work, and we’re making them available to everyone.

41

381

40

88

50K

Isaac Cohen @issc29

about 1 month ago

@poolsideai just released Laguna M.1 and Laguna XS.2 — our first publicly available foundation models, built for agentic coding. XS.2 is open weights under Apache 2.0 on @huggingface today. https://t.co/dqkCDP3UIx

0

31

issc29 retweeted

Poolside

@poolsideai

over 1 year ago

Incredible first re:Invent in the books, thank you to everyone we met and learned from! See you all next year 💜 If you missed the chance to connect with us and want to chat with someone on the team, let us know here → https://t.co/7aW5NbI1o8

3

21

1

5

19K

issc29 retweeted

Poolside

@poolsideai

over 1 year ago

We're 12 days out from our very first AWS re:Invent. Find us at booth #708 with demos and exclusive merch – come say hi! https://t.co/3BOXiPI8NS

0

12

6

2

5K

issc29 retweeted

Joyce Lin @PetuniaGray

over 2 years ago

. @issc29 saves a live demo with a @getpostman mock server at #GitHubUniverse 🏆

0

17

2

0

2K

issc29 retweeted

Bill Ackman

@BillAckman

over 2 years ago

An extremely powerful and insightful speech about the present and the future. Worth 12 minutes of your time.

164

6K

1K

3K

2M

issc29 retweeted

Nouriel Roubini

@Nouriel

over 2 years ago

Am Israel Chai 🕊️🙏🇮🇱🤍🩵

27

174

20

0

37K

issc29 retweeted

Grey Baker @greybaker

over 4 years ago

Really foundational ship from one of my teams today. We're starting to build experiences for security team members, who work across many, many repositories. The very start of that journey is a view of all the alerts across an organisation

2

18

2

0

Isaac Cohen @issc29

almost 6 years ago

Join me today for Office Hours as we talk through and demo GitHub Code Scanning!

GitHub

@github

almost 6 years ago

Introducing GitHub Office Hours: Join us this Wednesday the 15th at 11am PT on Twitch. We’ll be tackling software development challenges by topic each week, including security, DevOps, and more. Code scanning is up first. https://t.co/ClES1g3oWR

5

202

48

13

0

1

0

issc29 retweeted

Justin Hutchings @jhutchings0

about 6 years ago

So proud to announce GitHub code scanning today. This is the culmination of months of work from an amazing team of engineers from @github and former @Semmle folks. If you'd like to try it, please sign up at https://t.co/R4wOvkGprM

1

59

12

1

0

Isaac Cohen @issc29

about 6 years ago

@verizonfios SHOCKED that I got billed an early termination fee! For 2 weeks I begged you to move me, but no technicians were available. I WFH, so I was forced to switch to another company. Hoping to switch back to FIOS in the future, but will never switch back with an ET fee.

0

issc29 retweeted

Sizhao Yang

@zaoyang

about 8 years ago

@nntaleb released Black Swan right before the 2008 financial crisis and now released Skin in the Game right before tokens added skin in the game for everything. Interesting timing.

2

57

6

1

0

Isaac Cohen @issc29

over 8 years ago

@OptimumHelp outages in my area, said it would be fixed by 1PM but there is still an outage -- what's the ETA? zip code: 11229

1

0

Isaac Cohen @issc29

over 8 years ago

@FinTechSchool would be great if, in the future, the Solidity course was offered during the week and not on Saturday!

0
