Rupom Ghosh

Verified account

@rupomkghosh

AI implementation for small → enterprise businesses | Low-cost, high-value systems | Insights on the latest AI trends

Canada

Joined April 2026

160 Following

58 Followers

246 Posts

23 days ago

"Why do generated 3D assets always look so flat and lifeless?"

That's the bottleneck Apple just broke by open-sourcing LiTo (Surface Light Field Tokenization).

Most AI generators only reconstruct an object’s physical shape. LiTo is different: it also captures how light changes and reflects as you move around the object.

By tokenizing surface light fields, it lets shiny surfaces, complex reflections, and fine textures respond to viewpoint the way real-world materials do.

In side-by-side comparisons, it beats leading models like TRELLIS on accuracy and viewpoint-dependent realism from just a single input image.

Apple has made the complete code repository and training scripts public, so you can pull the entire 3D pipeline out of the cloud and run it on your own local hardware.

At Grivus, we love seeing local-first tools that push the boundaries of spatial computing and automated asset creation.

If your studio is looking to deploy custom 3D asset generation or local pipeline tooling here in Ontario, all of our technical integration services are quoted in CAD.

Stop creating dead geometry. Your 3D models can finally interact with light the way they were meant to.

24 days ago

"Why does video editing still require three different AI models chained together?"

That's the question that led to Lance, ByteDance’s new 3B unified multimodal framework. It handles text, images, and videos inside a single architecture.

Most systems only do one thing well. Lance blends visual understanding with generation, letting you execute complex, sequential video edits through conversational prompts.

You can swap a background to fire, change a car’s color, and then tell a subject to slowly raise their hand, all while keeping the original context intact.

Because it understands spatial logic, you can even hand it a static image of a maze and it will natively output a video of the model solving it.

The code is open-source for local pipelines, meaning zero cloud-API dependencies or recurring subscription fees.

The only hurdle? Despite the compact 3B active parameter count, local inference demands a GPU setup with at least 40 GB of VRAM.

At Grivus, we love seeing workflows compressed into high-efficiency local architectures. If your studio needs high-VRAM hardware infrastructure deployed here in Ontario, all of our custom pipeline integrations are quoted in CAD.

Stop wrestling with multi-model pipelines. A single unified model just took over the entire video lifecycle.

29 days ago

@elonmusk This is actually the best; the limits are way above Claude's for sure.

29 days ago

@Jashanx_gill Let's make a gc for the devs.

29 days ago

@FerarriPrime Yess

29 days ago

@davieball @genomecomputer Now this is something unique 👀

about 1 month ago

So much for HIPAA lol https://t.co/vKHsUJsSk8

rupomkghosh retweeted

about 1 month ago

Outperforming the Competition

The results of this dual-purpose training are immediately visible when comparing UniGenDet to other models. In side-by-side benchmarks, it consistently produces more realistic details in complex scenes, such as human profiles and architectural landscapes, while simultaneously outperforming competitors in fake image detection. It can accurately identify AI-generated content by analyzing fine visual details that standard detectors miss, making it a powerful tool for both creators and security researchers who need to verify digital authenticity.

rupomkghosh retweeted

about 1 month ago

The New King of Open Source

Kimi K2.6 has officially claimed the top spot on the open-source leaderboard, delivering performance that rivals industry giants like GPT-5.4 High and Gemini 3.1 Pro. This model isn't just about conversation; it is a master of autonomous execution. In a recent technical showcase, Kimi was able to autonomously download and deploy a Qwen 3.5 model on a Mac, then rewrite its core implementation in the niche programming language Zig. Over 12 hours of continuous execution and 14 iterations, it optimized the model’s throughput from 15 tokens per second to a staggering 193 tokens per second.

about 1 month ago

@mrtbrgl Helloo

about 1 month ago

@justbyte_ npm run dev

about 1 month ago

@Teslaconomics This has to be AI-generated... there is no way...

about 1 month ago

@gdb It's good, but I'd love it if it were open source, just like the name OpenAI.

about 1 month ago

@Gooddlovee Helooo

rupomkghosh retweeted

about 1 month ago

A System That Actually Learns

The real power of this framework lies in its specialized game skill component. This library continuously evolves by documenting successful tasks and verified fixes for complex bugs. When the agent encounters a logic error during implementation, it consults its own history to find a solution instead of hallucinating a new one. This persistent memory ensures that the agent becomes more efficient with every project it completes.

rupomkghosh retweeted

about 1 month ago

Even the most advanced models are still struggling with basic human intuition. While the top AI agents hover around 30% performance, the novice human baseline remains significantly higher at 64%. This performance gap is most visible in tasks requiring precise timing, spatial navigation, and long-horizon coordination in open-world environments.

Despite these challenges, general chat models are successfully analyzing visual screens and pressing keys to interact with environments they were never specifically trained to play.

rupomkghosh retweeted

about 2 months ago

A speed of 10 m/s is the new world record for humanoids.

Unitree just pushed their flagship H1 to 36 km/h. This isn’t just about raw power; it’s a masterclass in whole-body control systems and high-frequency stability.

The Specs:
Weight: 62 kg ⚖️
Leg Length: 0.8 m 🦵
Velocity: 10 m/s ⚡

We are seeing the gap between human-like and super-human motion close in real time. Dynamic motion is no longer a bottleneck for robotics.
