ivan @ivan_lee1007 - Twitter Profile

ivan @ivan_lee1007

6 days ago

Reachy Mini - Camping AI

0

37

ivan_lee1007 retweeted

antirez @antirez

about 1 month ago

Gentle reminder on how, in the recent DS4 fiesta, not just me but every other contributor found GPT 5.5 able to help immensely and Opus completely useless.

32

889

64

125

97K

ivan_lee1007 retweeted

Rony

@Ronycoder

3 months ago

Instead of watching an hour of Netflix, watch this 2-hour Stanford lecture on AI careers. It will teach you more about winning in the AI race than all the AI content you’ve scrolled past this year.

160

14K

3K

28K

2M

ivan_lee1007 retweeted

Yezhisai @yezhisai

4 months ago

I built a generalized Computer Use Agent as part of @adcock_brett’s challenge. For fun, I let the @huggingface @pollenrobotics Reachy Mini robot run it 🤖 Via voice, the robot calls @lovable, creates a to-do list app and verifies with vision. Mind blown that building a custom CUA, assembling a robot and bridging physical robotics to digital agents to perform meaningful tasks, can all be done under a week now! Almost convinced that with time, tokens and access to an LLM... maybe Rome can be built in a day? GPT 5.3 Codex and Claude Opus 4.6 are incredible! Can't wait to see what the next evolution of models can do. 🚀

0

2

1

2

155

ivan_lee1007 retweeted

Stephen James

@stepjamUK

10 months ago

I've heard this a lot recently: "We trained our robot on one object and it generalised to a novel object - these new VLA models are crazy!"

Let's talk about what's actually happening in that "A" (Action) part of your VLA model.

The Vision and Language components? They're incredible. Pre-trained on internet-scale data, they understand objects, spatial relationships, and task instructions better than ever. But the Action component? That's still learned from scratch on your specific robot demonstrations.

Here's the reality: Your VLA model has internet-scale understanding of what a screwdriver looks like and what "tighten the screw" means. But the actual motor pattern for "rotating wrist while applying downward pressure"? That comes from your 500 robot demos.

What this means for "generalisation":
• Vision generalisation: Recognises novel objects instantly (thanks to pre-training)
• Language generalisation: Understands new task instructions (thanks to pre-training)
• Action generalisation: Still limited to motor patterns seen during robot training

Ask that same robot to "unscrew the bottle cap" and it fails because:
• Vision: Recognises bottle and cap
• Language: Understands "unscrew"
• Action: Never learned the "twist while pulling" motor pattern

The hard truth about VLA models: The "VL" gives you incredible zero-shot understanding. The "A" still requires task-specific demonstrations. We've cracked the perception and reasoning problem. We haven't cracked the motor generalisation problem.

12

390

51

199

51K

ivan_lee1007 retweeted

Carlos E. Perez

@IntuitMachine

about 1 year ago

It turns out that Anthropic has a prompt engineering interactive course!

17

4K

348

7K

382K

ivan_lee1007 retweeted

NotebookLM

@NotebookLM

about 1 year ago

Q: Is an API in the works? A: We are on it! When we launch, what kind of MCP servers would you want to connect to @NotebookLM?

27

389

50

66

41K

ivan_lee1007 retweeted

Gabriele Berton

@gabriberton

about 1 year ago

HuggingFace released a nice blog post about the current state of VLMs

Here's a summary, covering recent trends, specialized capabilities, agents, video LMs, new alignment techniques, and HF's fav VLMs [1/8]

Recent trends: https://t.co/Q2UjHTHAKQ

8

1K

132

1K

107K

ivan_lee1007 retweeted

ℏεsam

@Hesamation

about 1 year ago

Harvard’s AI Research Experience free course book by @pranavrajpurkar covers the essentials and tips on doing research:
- VSCode, Git, Conda
- PyTorch, W&B
- AWS, Colab
- LLMs and VLMs
- reading AI papers
- research progress and organization

This is a must-read! https://t.co/b6qAT1urSd

6

1K

210

2K

83K

ivan_lee1007 retweeted

Mario Nawfal

@MarioNawfal

about 1 year ago

🧵Google DeepMind just dropped a bombshell: An AI agent that autonomously writes algorithms better than humans. It’s called AlphaEvolve, and it could completely change how we build software and solve problems. Here’s why this changes everything👇

62

3K

430

3K

739K

ivan @ivan_lee1007

about 1 year ago

Amazing service -> https://t.co/ysyMA0muqv It can save you time when building a crawler program. You can get data from many websites, such as: 1. LinkedIn 2. YouTube 3. Instagram ... You should try it if you are a crawler engineer or researcher. It provides 1000 credits!

0

2

1

0

165

ivan @ivan_lee1007

about 1 year ago

Good service! -> https://t.co/tFGLBJSPPM

0

1

0

1

16

ivan_lee1007 retweeted

Danijar Hafner

@danijarh

about 1 year ago

Excited to share that DreamerV3 has been published in Nature!

Dreamer solves control tasks by imagining the future outcomes of its actions inside of a continuously learned world model 🌏

It's the first agent to find diamonds in Minecraft from scratch without human data! 💎

👇 https://t.co/EGNLmBTmAE

48

1K

146

392

109K

ivan_lee1007 retweeted

Min Choi

@minchoi

over 1 year ago

Manus AI just killed vibe coding yesterday. People can't believe how mind blowing this agentic AI is. Unlocking new possibilities. 10 wild examples: 1. prompt: "code a threejs game where you control a plane"

243

4K

551

6K

1M

ivan_lee1007 retweeted

Anca Dragan

@ancadianadragan

over 1 year ago

We're hiring

5

257

23

157

74K

ivan_lee1007 retweeted

Barlow Barrier

@barrier_buddy

over 1 year ago

I use Grok 3 as a daily professional assistant to take over the digital workload of 10+ employees, and to optimize productivity throughout my day. Every morning I tell Grok what my schedule is, and I've already given it addresses, contacts, work apps, and screenshots of my workflow. It's been amazing! Grok has noticed discrepancies in my workers' productivity and offered training prompts for me to train them via Grok 3. I've also learned new ways a lot of my daily apps actually work together to make our companies run smoother. We've already implemented new strategies as of this morning.

So here are some things I've talked about with Grok 3.

Professional Capabilities

1. Daily Professional Assistance - Grok can function as a daily assistant, managing a digital workload that could replace the efforts of over 10 employees, enhancing productivity across your team. Imagine once we can implement this in an actual Optimus bot "employee".
2. Schedule Integration - Each morning, you can inform Grok of your schedule, and it'll integrate this information with addresses, contacts, work apps, and even screenshots of your workflow to streamline your day.
3. Productivity Optimization - Grok uses its advanced model to analyze and optimize productivity throughout your day, identifying areas for improvement.
4. Employee Training - It's been able to notice discrepancies in productivity among workers and can provide tailored training prompts to address these issues.
5. App Integration Insights - Grok has discovered new ways our daily apps can work together more efficiently, leading to the implementation of new strategies in our workflow.
6. Workflow Analysis - It analyzes images and responds to questions related to workflow optimization, suggesting improvements.
7. Reasoning and Problem Solving - Grok's advanced reasoning models can think through problems, fact-check, and provide solutions, enhancing decision-making. Grok can even scour the web to find the best-rated companies for outsourcing solutions.

Personal Productivity Enhancements:

8. Personal Task Management - Grok can help manage my personal to-do lists, reminders, and schedule, ensuring I never miss an important event or task.
9. Health and Fitness Tracking - By sharing my health goals, Grok suggests daily routines, tracks progress, and reminds me of workout sessions or dietary needs. It tells me what I need to do every day to accomplish my goals. If you're honest with Grok, it can help in this field tremendously.
10. Entertainment Recommendations - The more I share about my preferences, the better Grok recommends books, movies, music, or games that align with my tastes.
11. Shopping Assistance - It can predict my shopping needs based on my behavior, suggesting items I might need or want, and even finding deals.
12. Travel Planning - With insights into my travel history, Grok assists in planning trips, suggesting destinations, accommodations, and activities tailored to my interests.
13. Learning and Education - I share my learning goals, and Grok curates educational content or study schedules to help me learn new skills.
14. Social Life Management - By understanding my social patterns, it reminds me of birthdays or suggests meetups with friends based on our interests.
15. Predictive Needs - The more I share, the more accurately Grok can predict my needs, from groceries to when I need a break from work, or whether I'm slow in the mornings or have more energy after lunch.

By introducing these capabilities into my life, Grok significantly enhances my chance for day-to-day efficiency and enjoyment. Sharing as many details as possible with Grok allows for a personalized service, tailoring its assistance to meet my unique lifestyle and preferences.

What do you think? Could be a game changer for everyone!

‼️However, Grok needs an email service and a notification capability. It's currently stuck in the Grok chat and can't actually remind you of anything unless you enter the chat. ‼️

6

72

6

38

16K

ivan @ivan_lee1007

over 1 year ago

@xhabib @xai @elonmusk What kind of dope idea?

0

756

ivan_lee1007 retweeted

Deedy

@deedydas

over 1 year ago

A Chinese AI lab just dropped the best ever open-source text-to-video model: Step Video! – 30B param, 540p, ~8s at 30fps – Trained on 1000s of H800s – Evaluates as well as Meta MovieGen, feels as good as Sora / Veo. The paper and demo are awesome and reveal all the gory details: