Roman Shemet @RomanShemet - Twitter Profile

Pinned Tweet

about 1 year ago

Plug local, free, private AI into your mobile apps 🌵 Cactus now supports @FlutterDev, @reactnative, and @kotlin bindings. Plus, with function calling, you can deploy complex agentic pipelines directly in-app. Fully open-source 🌵 link in bio 🔗

4

65

8

7

4K

RomanShemet retweeted

Aristotelis Economides

@aristotelis_eco

4 months ago

https://t.co/8JsKtSDnqM

3

53

3

63

21K

Roman Shemet

@RomanShemet

3 months ago

All cloud fallback is FREE this February. Seriously. Make us regret this 😅 We just launched Hybrid Cloud inference and we're too excited for you to try it.

1. Go to https://t.co/341YSbH19V
2. Sign up and create a key
3. Run unlimited on-device transcription and LLM inference with cloud fallback

Cactus Hybrid Cloud runs inference on-device by default, as always. If the on-device model struggles, it automatically hands off inference to the cloud.

Demo for yourself:
brew install cactus-compute/cactus/cactus
cactus transcribe

0

6

1

464
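
The on-device-first behaviour described in the tweet above can be sketched roughly as follows. This is only an illustration of the general fallback pattern, not the Cactus API: `hybrid_infer`, `run_local`, `run_cloud`, and the `min_confidence` threshold are all hypothetical names, and the tweet does not say how a "struggling" on-device model is detected, so a confidence score is assumed here.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# A backend takes a prompt and returns (text, confidence in [0.0, 1.0]).
# Both the signature and the confidence score are assumptions for this sketch.
Backend = Callable[[str], Tuple[str, float]]


@dataclass
class Result:
    text: str
    confidence: float
    source: str  # "on-device" or "cloud"


def hybrid_infer(prompt: str, run_local: Backend, run_cloud: Backend,
                 min_confidence: float = 0.7) -> Result:
    """Run on-device first; hand off to the cloud only if that fails or scores low."""
    try:
        text, confidence = run_local(prompt)
        if confidence >= min_confidence:
            return Result(text, confidence, "on-device")
    except RuntimeError:
        # Hypothetical failure mode, e.g. the local model could not be loaded.
        pass
    text, confidence = run_cloud(prompt)
    return Result(text, confidence, "cloud")


if __name__ == "__main__":
    # Stub backends standing in for real models.
    weak_local = lambda p: ("???", 0.2)
    cloud = lambda p: ("hello world", 0.95)
    print(hybrid_infer("transcribe this", weak_local, cloud).source)
```

Keeping the threshold as a parameter makes the trade-off explicit: a higher value sends more requests to the cloud, a lower one keeps more work on the device.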

Roman Shemet

@RomanShemet

4 months ago

@dshukertjr Are there best practice guidelines for syncing local<>prod migrations with supabase?

0

23

Roman Shemet

@RomanShemet

4 months ago

@ayusrjn @yoheinakajima @RunAnywhereAI @cactuscompute thank you for the feedback! Could you share which phone & version of the app you used? We'll patch this up for you

0

22

Roman Shemet

@RomanShemet

5 months ago

@btconometrics @Raspberry_Pi you can run @cactuscompute on any Raspberry Pi. Cactus also uses zero-copy memory mapping, so you're not constrained by the 8GB RAM

2

1

0

166
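
As a rough illustration of why memory mapping eases the RAM constraint mentioned above, the sketch below maps a file and reads a single value from it without loading the whole file into memory first. It shows the general operating-system mechanism using Python's standard `mmap` module and is not Cactus's implementation; the helper name `read_float_at` and the flat little-endian float32 file layout are assumptions made for the example.

```python
import mmap
import os
import struct
import tempfile


def read_float_at(path: str, index: int) -> float:
    """Read the float32 at position `index` from a memory-mapped file.

    The OS pages in only the part of the file that is touched, so the
    file can be larger than the memory the process actually uses.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # unpack_from reads straight out of the mapped buffer.
            return struct.unpack_from("<f", mm, index * 4)[0]


if __name__ == "__main__":
    # Write a tiny stand-in "weights" file, then read one entry back.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(struct.pack("<3f", 0.0, 1.5, 2.5))
        path = tmp.name
    try:
        print(read_float_at(path, 1))
    finally:
        os.remove(path)
```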

Roman Shemet

@RomanShemet

5 months ago

@supernalmystic @Raspberry_Pi you can run @cactuscompute on any Raspberry Pi. Cactus also uses zero-copy memory mapping, so you're not constrained by the 8GB RAM

0

1

158

Roman Shemet

@RomanShemet

5 months ago

Which personalization experiences would you want to see in local/on-device AI?

0

4

0

277

Roman Shemet

@RomanShemet

6 months ago

@kodjima33 @omidotme on-device AI? Something like @cactuscompute

0

32

RomanShemet retweeted

Jakub Mroz @jakmroo

6 months ago

At Cactus🌵, we want on-device AI inference to be as fast as possible - that’s why we decided to use Nitro for our React Native SDK. The performance is insane⚡️and the DX is even better. Making heavy use of the object-oriented Hybrid Objects, and mixing C++ with Swift and Kotlin feels like a breeze. Thanks @mrousavy - looking forward to shipping more features with Nitro 🚀

2

37

4

12

11K

Roman Shemet

@RomanShemet

6 months ago

@SzymonRybczak 😲 who undercut us?

1

0

26

RomanShemet retweeted

Marc

@mrousavy

6 months ago

Check out Cactus AI!

2

18

1

6

6K

RomanShemet retweeted

Sélim @SelimBenayat

6 months ago

Hackathon alert! London, SF, Boston. This Friday! 👀 @nothing is teaming up with @cactuscompute and @huggingface to hack on redefining on-device AI experiences! Come build something memorable, meet the teams, and ship in 24 hours! Signups are wild so far 🔥

9

200

20

16

48K

RomanShemet retweeted

Cactus

@cactuscompute

7 months ago

Cactus React Native v1 is live! Deploy AI on-device with text inference, tool calling, embeddings and more – powered by the fastest edge inference engine 🌵 Our React Native bindings run on @margelo_com's Nitro Modules, yielding the fastest mobile inference we've seen so far.

1

3

2

1

377

Roman Shemet

@RomanShemet

7 months ago

why does every SF startup show off their round on a NYC billboard? @SF, what’s our Times Square?

0

2

0

332

RomanShemet retweeted

Henry Ndubuaku

@Henry_Ndubuaku

7 months ago

More benchmarks for LFM2 models by @liquidai on Cactus (YC S25).
- When we get to INT4, file sizes should shrink 2x, speed should increase 2x, and battery drain should drop 2x. For lossless quantisation, the model should always be post-trained for QAT in a specific way, a trick we mastered in my last role.
- NPUs are 5-11x more energy-efficient and up to 10x faster for long contexts. Once we merge those, you should be able to design complex multi-agent workflows with large contexts safely on phones.
Many other tricks up our sleeves. Sit back, relax and watch!