Rayko

Verified account

@Rayko_wang

Back to the essence: creation. Video Agent: 💰$3M ARR. AI interactive gaming platform: 💰$0 ARR.

SF

Joined December 2023

537 Following

479 Followers

265 Posts

Pinned Tweet

20 days ago

You can actually make a real playable game now with Seedance 2.0 + GPT Image 2. Not static AI videos anymore, but actual games where you can click, interact, talk to characters, etc. No 3D modeling. No coding. No game engine. Imagine a game world where every pixel and frame is streamed through an AI model based on your actions. @olivy2333 and I built a demo called Reelquest. (1/5)

70

758

103

752

127K

about 19 hours ago

Classical Art Fighters: Round 4 🖼️⚔️ The Kiss came in wrapped in gold armor, mosaic blades, and romantic pressure turned all the way up. Infanta opened the palace mirrors and made every attack hit a reflection instead 🪞 Gold leaf vs royal gaze — this round felt like getting outplayed by a museum wall.

0

43

9

11

5K

1 day ago

@CatGodSandHive Definitely — beauty is going to play a much bigger role in how the battle unfolds.

0

0

0

0

8

1 day ago

Classical Art Fighters: Round 3 🖼️⚔️ Adam pulled up with divine fingertip lightning, Eden vines, and full creation-mode energy. Venus raised a shell shield and turned the whole attack into roses, sea foam, and ocean pressure 🌊🌹 Creation vs beauty — and somehow beauty won the neutral game.

1

10

2

5

231

Rayko_wang retweeted

BlockRunAI @BlockRunAI

3 days ago

Honored to join @HF0 for S26 We're building @FranklinRun_ : an AI agent with its own wallet — it auto-picks the best model for every task and pays per action in USDC. An AI agent that can pay and get paid. 👇

25

73

11

13

11K

3 days ago

GTA 6 set its story in Bangkok.

0

1

0

0

114

3 days ago

https://t.co/Xu4QndcLup

0

0

0

0

53

3 days ago

I’m a big fan of retro 3D games, so I’ve built a clickable dungeon version.

1

5

1

2

248

4 days ago

Claude Opus 4.8 is wild...

0

3

0

0

106

4 days ago

https://t.co/9fmiJqPOTi

0

0

0

0

72

Rayko_wang retweeted

21 days ago

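I just barely made the deadline with my second entry for the AI video contest sponsored by Zopia AI (@Zopia_AI)!! This time I made a realistic, B-movie-style one-minute video! Title: KILLER DONUT ATTACK, or 「殺人ドーナツ襲撃」 in Japanese, which is exactly the kind of title you'd expect 😂 Turn your brain off and enjoy lol #Zopia #ZopiaCPP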

0

22

1

0

855

6 days ago

@snskritinaruka @ShamiWeb3 @ReelQuestAI Thanks!

1

1

0

0

16

6 days ago

This is sick 🔥 Love the idea and execution.

7 days ago

Create your own anime-style fighting game with AI. From cyber ninjas and cinematic supers to combo systems, counters, energy bursts, and side-view battle arenas, @ReelQuestAI turns your ideas into playable interactive experiences without coding or a full dev team. Just imagine it. AI builds it. Create now: https://t.co/4vj2WIdxas

60

310

45

13

22K

1

4

0

0

156

6 days ago

Running a startup really does this. Can’t believe it’s already 12!

0

3

0

0

82

6 days ago

The scene you imagine can now become a playable game. Describe or generate the scene with video, and we’ll help turn it into a game you can actually play. Sign up now to get free credits. Invite friends to earn even more. https://t.co/ebpSsOldIH

0

2

0

0

134

6 days ago

Claude Opus 4.8 is out.

And almost immediately, my feed started doing what my feed always does.

"Explosive."

"Destroys GPT-5.5."

Maybe. But when I opened Anthropic's own launch post, the word that stood out to me was not explosive. It was modest.

Their wording is "a modest but tangible improvement."

That contrast is funny, and honestly a little embarrassing.

The company closest to the model, with every incentive to make it sound historic, is careful. The people farthest from the model, with the least reason to be certain, are yelling "game over" before testing anything.

This is exactly why I have stopped trusting benchmark screenshots on launch day.

The most interesting thing in the Opus 4.8 benchmark table is not the big number everyone reposted. It is the footnote almost nobody reads.

On Terminal-Bench 2.1, Anthropic's main table shows:

GPT-5.5: 78.2%

Claude Opus 4.8: 74.6%

So even in the main table, GPT-5.5 is ahead on that benchmark.

But scroll down to the footnote and there is another number: GPT-5.5's reported score with the Codex CLI harness is 83.4%.

Same model. Same general benchmark family. Different harness. A 5.2 point swing.

I am not calling this fraud. To be fair, Anthropic says the table uses the Terminus-2 public harness for all models, so there is a real apples-to-apples argument there.

But I am saying this: the footnote changes how you should read the chart.

If you only post the big table and ignore the harness caveat, you are not doing analysis. You are doing launch-day theater.

And the whole AI commentary loop feels way too comfortable with that.

The real information is often not in the chart. It is in the small print under the chart.

So no, I do not think Opus 4.8 is some clean "everything is over" moment.

I think it is a modest release with a few genuinely useful changes.

Here are the ones I actually care about.

First, effort control.

This is probably the most human-feeling change for daily use.

With Opus 4.7, there was always this slightly annoying feeling that I was paying for the model, but the model still got to decide how seriously it wanted to think today.

Now Claude gives you more control over effort. For simple tasks, lower effort should be faster and lighter. For harder tasks, higher effort lets Claude spend more time thinking.

That sounds small, but it matters.

Sometimes I want a quick answer. Sometimes I want Claude to slow down, inspect the problem, and be a little more stubborn.

Giving that choice back to the user is a good change.

Second, Claude Code gets Dynamic Workflows.

In theory, Claude can plan a larger task, spin up a lot of parallel subagents in one session, let them work on different parts, then verify the result before reporting back.

I have not personally seen the "hundreds of subagents" version working in a way I would blindly trust. And I would not recommend blindly trusting it anyway.

But the direction is right.

The future of coding agents is not just "write me a better function." It is planning, delegating, checking, and coming back with something a human can review.

That is the mental model shift.

Third, fast mode is finally more interesting.

Opus 4.8 fast mode can output up to 2.5x faster than standard mode. More importantly, the pricing is now much less ridiculous than the previous fast mode.

Standard Opus 4.8 pricing stays at $5 input and $25 output per million tokens.

Fast mode is $10 input and $50 output per million tokens.

Previous Opus fast mode pricing was $30 input and $150 output per million tokens, so this is a real improvement.

I still would not use fast mode for everything. But now it feels like a practical option instead of a novelty.

Fourth, coding does seem better than Opus 4.7.

Not "destroys everything" better.

Just better.

My current read is that Opus 4.8 and GPT-5.5 are close enough that the right answer depends heavily on the task, the harness, the tool environment, and your own workflow.

If someone tells you one model simply crushes the other across the board, ask them what they tested.

If the answer is "I saw a chart," wash your face and go to sleep.

Fifth, Claude sounds a little more human than Opus 4.7.

This one is subjective, but I do notice it.

Opus 4.7 often had that strange "technically correct, emotionally unreadable" tone. You could understand the answer eventually, but it sometimes felt like the model was handing you a steel brick.

Opus 4.8 is a bit easier to read.

Not perfect. It still sometimes has that overly sharp, "let me give you the hardest possible sentence" style.

But it is less alien.

For me, the best Claude voice was still Opus 4.6. That model had a way of being direct without sounding like it was trying to win a courtroom argument. Sadly, that era is gone.

So here is my honest take:

Claude Opus 4.8 is not a revolution.

It is not useless either.

It is a careful, modest upgrade with better controls, more useful speed economics, stronger agent direction, and slightly better coding behavior.

The problem is not the model.

The problem is the launch-day commentary machine that turns every release into a religious event.

I care less about who "won" a benchmark screenshot and more about whether the model helps me do real work with fewer weird failures.

On that test, Opus 4.8 is worth trying.

But please read the footnotes before reposting the chart.

Have you tested Opus 4.8 yet? What did it feel like to you?
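
The fast mode pricing comparison in the post above is simple arithmetic, so a rough sketch may make the ratios concrete. The per-million-token prices are the ones quoted in the post; the function name `job_cost` and the workload size are hypothetical, picked only for illustration.

```python
# Prices in USD per million tokens, as quoted in the post above.
PRICES = {
    "standard": {"input": 5.0, "output": 25.0},
    "fast": {"input": 10.0, "output": 50.0},
    "previous_fast": {"input": 30.0, "output": 150.0},
}


def job_cost(mode: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one job at the quoted per-million-token prices."""
    price = PRICES[mode]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 0.5M output tokens.
    for mode in PRICES:
        print(f"{mode}: ${job_cost(mode, 2_000_000, 500_000):.2f}")
```

At the quoted prices, fast mode costs 2x standard for any token mix, and the previous fast mode cost 6x standard, so the new fast price is one third of the old one.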

0

2

0

1

164

7 days ago

0

0

0

0

8

7 days ago

@00Sindirella Hii 👋🏻📷🥺

0

1

0

0

17

7 days ago

Me using Claude Opus 4.8 to open a website.

0

2

0

0

135

7 days ago

Classical Art Fighters: Round 3 🖼️⚔️ Van Gogh came in swinging Starry Night brushstrokes, sunflower flames, and pure emotional damage. The Son of Man just stood there with one floating apple and started parrying reality itself 🍏 Expressionism vs surrealism — this might be the weirdest matchup yet.

2

148

37

56

10K

7 days ago

You can also play stories made by other people. Not just to watch them. But to see how they built their own playable worlds, and then take the story in a different direction. It is still an early experiment. The only real problems are speed and cost. But we are giving free credits so you can try it here: https://t.co/AseGGn1QRB (3/3)

0

0

0

0

175

7 days ago

Honestly, I think most pre-made games are starting to feel a bit boring. So we built an experiment: a game that generates the next story beat in real time based on what you just did. You click somewhere, type a prompt, or give a command. Then the character takes the next action from there. (1/3)

3

8

1

4

1K

7 days ago

The fun part is that you can upload photos of yourself or your friends. Then make a game where you are the main character. You can turn your life into an open-world crime story. Or put a cute cartoon character inside a zombie survival game. The world does not have to stay fixed anymore. (2/3)

1

0

0

0

199
