_lebek @_lebek - Twitter Profile

_lebek

@_lebek

6 days ago

@sudoingX Even the frontier models struggle to do anything useful in blender

0

6

0

361

_lebek

@_lebek

6 days ago

Some of the best tokens I ever received are the ones Claude “shaped”, “wired” or “threaded”. If you see this vernacular things are going exceptionally well

0

28

_lebek

@_lebek

7 days ago

@antirez @josephrobison @intellectronica I haven’t found gpt-5.5 better than opus 4.7. When did you last test? Opus had issues in March. Also possible you landed in a bad A/B test group.. or that whatever you’re specifically working on has better coverage in gpt.

0

1

0

30

_lebek retweeted

Khurram Javed

@kjaved_

8 months ago

This will age poorly. I largely have an optimistic view of LLMs; I use multiple LLM tools daily, and I don't think the LLM tech stack is a bubble—it will create a lot of value. I disagree that the length of tasks that LLMs can do has been doubling every 7 months. There are tasks that LLMs can do well. As humans curate better datasets for common tasks (such as making websites, training classifiers), LLMs will get better at them. There are other tasks at which current models have a 0% success rate (e.g., some of the research questions I am looking at), and no one is actively curating better datasets for these tasks (in my case, to curate a better dataset, you first have to answer the research question). I see no improvements with newer models in these tasks. I occasionally try them anyway and the results are humorously bad. A more likely outcome of advances in LLMs is that the frontier of human software engineering will move to tasks for which datasets don't exist. This isn't anything new. Compilers had a similar impact, and virtually no one thinks of the instruction set of the chips on which they run their code.

51

809

67

484

145K

_lebek

@_lebek

9 months ago

@teortaxesTex @CherryStudioHQ definitely.. I searched 3 pages of google, 2 llms, and exa before I asked you

0

3

0

156

_lebek

@_lebek

9 months ago

@teortaxesTex what gui is that?

0

2

0

2

6K

_lebek

@_lebek

9 months ago

@andonlabs What prompt do you use? How do you make sure the prompt doesn't favour one model over another?

0

1

0

95

_lebek

@_lebek

9 months ago

@jay_azhang @the_nof1 I wonder if the first superhuman trader will be better because it’s better at forecasting from the same data or because it’s better at finding new sources of relevant data

1

6

0

1

792

_lebek

@_lebek

9 months ago

The unsupervised latent action model in the original Genie paper (https://t.co/MIczCPclwx) is conceptually cool but it depends on "actions" being unpredictable. It can't learn left/right controls from a video where someone follows a trail because their turns are determined by the trail and aren't unpredictable enough to extract. It also can't distinguish player vs environment unpredictability. For example, if there's a broken light bulb in the same room as the player that flickers at random intervals it would discover a "toggle light on/off" control. Genie 3 seems to be focused on first-person WASD and camera rotation and I suspect they used a different approach to learn actions. You can do a lot with WASD in a world that can predict what you're going to do next anyway.

0

1

0

166
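
The argument in the Genie post above can be illustrated with a toy calculation. This is only a sketch under stated assumptions, not the method from the Genie paper: it assumes a latent action can carry at most the information in the next frame that the current frame does not already predict, and it measures that residual as the conditional entropy H(next | current). The three scenarios, their labels, and the `conditional_entropy` helper are all hypothetical.

```python
import math
import random
from collections import Counter


def conditional_entropy(pairs):
    """Estimate H(next | current) in bits from (current, next) samples."""
    joint = Counter(pairs)
    current_counts = Counter(cur for cur, _ in pairs)
    n = len(pairs)
    h = 0.0
    for (cur, _nxt), count in joint.items():
        h -= (count / n) * math.log2(count / current_counts[cur])
    return h


rng = random.Random(0)
N = 10_000

# Trail following: the turn is fully determined by the bend visible in the frame.
trail = [(bend, bend) for bend in (rng.choice("LR") for _ in range(N))]

# Free roaming: the frame looks the same, and the player picks a turn at random.
free = [("open_field", rng.choice("LR")) for _ in range(N)]

# Flickering bulb: the player does nothing, and the light toggles at random.
bulb = [("room", rng.choice(["light_on", "light_off"])) for _ in range(N)]

h_trail = conditional_entropy(trail)
h_free = conditional_entropy(free)
h_bulb = conditional_entropy(bulb)

print(f"trail: {h_trail:.3f} bits, free: {h_free:.3f} bits, bulb: {h_bulb:.3f} bits")
```

In this toy setup the trail case leaves nothing for a latent action to encode, while the free-roaming player and the flickering bulb each leave roughly one bit. The two are indistinguishable by this measure, which is the player vs environment problem described in the post.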

_lebek

@_lebek

9 months ago

AI coding tools today + my thoughts:
- @cursor_ai tab completion - best workhorse for nontrivial changes in 10k+ LOC
- @claudeai/@cursor_ai agent - good for small apps/scripts and throwaway code (debugging/visualization)
- @Replit agent - AFAIK the only agent with read access to prod DB and request logs. That part works so well it's hard to go back to anything else, but apart from that it has the same problems as other agents.
- Vibe-coded projects hit a wall after about 30 feature additions/changes. At that point it’s faster to freeze the current state into a product spec, and relaunch the agent on a clean slate. I'd like to try an IDE/agent built for one-way, full rewrite cycles where each change starts from the spec and regenerates the entire application.
- Switching tools for different tasks is frustrating because each agent’s memory is isolated, and platforms like Replit make it hard to use third-party tools
- Nothing closes the loop with an agent that can test e2e UI but this is surely coming
- Agents might code too much like human developers and I wonder if they should bias more towards types/encapsulation/function purity/something_else to overcome hallucination and get better error messages for their feedback loop

0

1

0

206

_lebek

@_lebek

over 1 year ago

@teortaxesTex diamonds! </th, wait...

0

1

0

253

_lebek

@_lebek

over 1 year ago

@fofrAI what's your favourite realistic photo to anime model?

0

1

0

96

_lebek

@_lebek

over 2 years ago

@bstuartTI @elevenlabs It seems like the quality is a little worse than their text to speech. Do you find it’s worth it to be able to give a performance?

1

0

71