Ivgeni Iv Segal @IvSegal - Twitter Profile

Pinned Tweet

3 months ago

I'm delighted to announce the open #source release of Fewshell, a mobile-first self-hosted #AI terminal #agent for #devops, oncalls and AI researchers. Run complex #bash commands from your phone with ease. Fix your systems remotely https://t.co/XCEXIWDvFV

1

7

1

0

241

Ivgeni Iv Segal @IvSegal

10 days ago

@JAM_KseniYa There are several modpacks that do this. Eg https://t.co/mEL9xP0j5P

1

2

0

411

IvSegal retweeted

Ziqian Zhong

@fjzzq2002

19 days ago

A joint work with @IvSegal @neversupervised @Shashwa02469621 @kexun_zhang @AdtRaghunathan! Paper: https://t.co/RN3Oea33NG Code: https://t.co/Bhpg0PCmeR

0

8

2

1

377

Ivgeni Iv Segal @IvSegal

24 days ago

#AI Agents can use curl in your evals' environment to cheat on tasks by finding the answer online. If you disable network access, be sure to also disable search grounding, otherwise they can still search the internet on the LLM provider's side.

0

19

Ivgeni Iv Segal @IvSegal

about 2 months ago

@suno @RichardGarriott

0

25

Ivgeni Iv Segal @IvSegal

about 2 months ago

Fun experiment: take a soundtrack sample of your favorite old video game and drop it into @suno for a modern remaster/cover.

3

2

0

107

Ivgeni Iv Segal @IvSegal

about 2 months ago

@suno @JohnBroomhall

0

35

Ivgeni Iv Segal @IvSegal

about 2 months ago

@suno Eg: X-Com UFO Defense (DOS, 1994)

2

0

36

Ivgeni Iv Segal @IvSegal

about 2 months ago

@suno Ascendancy (DOS, 1995) @toddtempleman

0

1

0

99

IvSegal retweeted

JER

@lifeof_jer

2 months ago

https://t.co/ofucbVgkLV

1K

5K

1K

6K

7M

IvSegal retweeted

Ivan Bercovich

@neversupervised

2 months ago

I want to share a new dataset of 331 reward-hackable environments. These are real environments used in Terminal Bench and adjacent benchmarks. I first got interested in this because, as a reviewer of Terminal Bench, I noticed a lot of our tasks were hackable. I also noticed that many contributors to the benchmark do so because it provides credibility when selling environments to labs. Hence, TBench tasks are, in my opinion, held to a higher quality standard than those being used today for RL. No one is spending hours manually reviewing the $1B in tasks being purchased by major labs. As far as I understand, while everyone knows environments are hackable, nobody has released hundreds of "realistic" environments. (link in comment)

1

64

11

21

12K

Ivgeni Iv Segal @IvSegal

3 months ago

@AdolfoUsier @opencrabs "OpenCrabs runs as a single binary on your terminal — no server, no gateway, no infrastructure." I love that!

0

1

0

75

IvSegal retweeted

Ivan Bercovich

@neversupervised

3 months ago

https://t.co/buOPNmWeBe

3

102

25

85

83K

Ivgeni Iv Segal @IvSegal

4 months ago

@psomkar1 Source: https://t.co/qA2775SuMt

0

8

Ivgeni Iv Segal @IvSegal

about 1 year ago

Is this another #cloudpocalypse? #downdetector

0

7

1

0

641

IvSegal retweeted

Bill D'Alessandro

@BillDA

over 1 year ago

Me trying to find out how much the "Enterprise" plan costs

138

9K

185

245

444K

Ivgeni Iv Segal @IvSegal

over 1 year ago

@TimSweeneyEpic @Pirat_Nation The proliferation of third-party launchers is a blight on the customer experience. Using Epic Store and Steam would be so much better if they disallowed vendors from adding launchers.

0

94