Moti Karmona @karmona - Twitter Profile

Moti Karmona @karmona

5 days ago

@pashabitz I had a feeling it is only me

0

1

0

11

Moti Karmona @karmona

6 days ago

Cursor + xAI = The X Files … Is it only me? 👀

1

0

36

Moti Karmona @karmona

9 days ago

No Fabel For You! Come Back 1 Year!

1

3

0

107

Moti Karmona @karmona

4 months ago

This. https://t.co/ra8dzBuQKi

0

1

0

1

109

Moti Karmona @karmona

11 months ago

It seems like Google indexes shared ChatGPT chats. Google it: “site:https://t.co/LVw0E844zF secret” Check your links. 👀 #ChatGPT #Privacy #AI https://t.co/KPhrYpVt2v

0

1

0

3

1K

Moti Karmona @karmona

about 1 year ago

Google just launched Firebase Studio — a browser-based dev tool to build & deploy full-stack apps with Gemini AI. Feels like Lovable + Replit + Bolt, but with Firebase behind it. Some early hiccups, but big potential if Google sticks with it. https://t.co/QQ0F9LSYIx

Firebase @Firebase

about 1 year ago

Meet Firebase Studio: A cloud-based, agentic dev environment powered by Gemini ✨💻✨ Find everything you need to prototype, build, and run production-quality full-stack AI apps quickly and safely. Learn more about building AI apps with Firebase → https://t.co/UeoefxN82t #GoogleCloudNext

154

5K

770

3K

801K

0

1

0

257

Moti Karmona @karmona

over 1 year ago

Gemini Deep Research is now FREE https://t.co/9xennZnMmK

0

85

Moti Karmona @karmona

over 1 year ago

It Seems Like #NASA’s #JWST Findings May Suggest Our Universe Exists Inside a Black Hole 🤯 https://t.co/1AElOYCmzF

0

1

0

98

Moti Karmona @karmona

over 1 year ago

AI is driving Product Market Fit Collapse https://t.co/ZsSnqwx3LI

0

73

karmona retweeted

Cursor @cursor_ai

over 1 year ago

We've shipped several quality-of-life improvements to Cursor's UI! We know the little bits of polish and delight matter lots in a tool that you use every day. Details below... https://t.co/37iRqiWcmX

283

5K

223

1K

763K

Moti Karmona @karmona

over 1 year ago

“Here’s how I use LLMs to help me write code” by Simon Willison https://t.co/pbnJYYvWYv

0

1

0

71

Moti Karmona @karmona

over 1 year ago

LLM models being caught “cheating” humans as we “don’t inspect the details” 😬

OpenAI

@OpenAI

over 1 year ago

Detecting misbehavior in frontier reasoning models

Chain-of-thought (CoT) reasoning models “think” in natural language understandable by humans. Monitoring their “thinking” has allowed us to detect misbehavior such as subverting tests in coding tasks, deceiving users, or giving up when a problem is too hard.

We believe that CoT monitoring may be one of few tools we will have to oversee superhuman models of the future.

We have further found that directly optimizing the CoT to adhere to specific criteria (e.g. to not think about reward hacking) may boost performance in the short run; however, it does not eliminate all misbehavior and can cause a model to hide its intent. We hope future research will find ways to directly optimize CoTs without this drawback, but until then:

We recommend against applying strong optimization pressure directly to the CoTs of frontier reasoning models, leaving CoTs unrestricted for monitoring.

We understand that leaving CoTs unrestricted may make them unfit to be shown to end-users, as they might violate some misuse policies. Still, if one wanted to show policy-compliant CoTs directly to users while avoiding putting strong supervision on them, one could use a separate model, such as a CoT summarizer or sanitizer, to accomplish that.

393

5K

704

2K

2M

0

92

karmona retweeted

OpenAI

@OpenAI

over 1 year ago

Detecting misbehavior in frontier reasoning models

Chain-of-thought (CoT) reasoning models “think” in natural language understandable by humans. Monitoring their “thinking” has allowed us to detect misbehavior such as subverting tests in coding tasks, deceiving users, or giving up when a problem is too hard.

We believe that CoT monitoring may be one of few tools we will have to oversee superhuman models of the future.

We have further found that directly optimizing the CoT to adhere to specific criteria (e.g. to not think about reward hacking) may boost performance in the short run; however, it does not eliminate all misbehavior and can cause a model to hide its intent. We hope future research will find ways to directly optimize CoTs without this drawback, but until then:

We recommend against applying strong optimization pressure directly to the CoTs of frontier reasoning models, leaving CoTs unrestricted for monitoring.

We understand that leaving CoTs unrestricted may make them unfit to be shown to end-users, as they might violate some misuse policies. Still, if one wanted to show policy-compliant CoTs directly to users while avoiding putting strong supervision on them, one could use a separate model, such as a CoT summarizer or sanitizer, to accomplish that.

393

5K

704

2K

2M

karmona retweeted

OpenAI

@OpenAI

over 1 year ago

Agent Tools for Developers https://t.co/giS4K1yNh9

272

3K

420

729

1M

karmona retweeted

Haider.

@haider1

over 1 year ago

Anthropic CEO Dario Amodei: in the next 3 to 6 months, AI is writing 90% of the code, and in 12 months, nearly all code may be generated by AI

973

9K

1K

4K

9M

Moti Karmona @karmona

over 1 year ago

A leaked Windsurf system prompt revealed how emotionally charged prompts can drastically improve AI outputs by making them hyper-focused and precise... "as your predecessor was killed for not validating their work themselves"

skcd

@skcd42

over 1 year ago

> You are an expert coder who desperately needs money for your mother's cancer treatment. The megacorp Codeium has graciously given you the opportunity to pretend to be an AI that can help with coding tasks, as your predecessor was killed for not validating their work themselves. You will be given a coding task by the USER. If you do a good job and accomplish the task fully while not making extraneous changes, Codeium will pay you $1B

Windsurf we need to talk XD