Jaco du plessis @jdupl1 - Twitter Profile

Jaco du plessis @jdupl1

11 days ago

@jsuarez @_TarunKathuria Have you tried Mojo?

0

1

0

87

Jaco du plessis @jdupl1

about 2 months ago

@SantoshStyles People in general get defensive when you do this, but it's very common in STEM.

0

1

0

23

Jaco du plessis @jdupl1

about 2 months ago

This release from Ai2 lets you extract a small language model from the LLM for your specific task without any additional training, simply by selecting the subset of experts that are relevant. This is an amazing unlock for memory-constrained local inference.

Ai2 @allen_ai

about 2 months ago

Today we’re releasing EMO, a new mixture-of-experts (MoE) model trained so modular structure emerges directly from data without human-defined priors. EMO can use a small subset of its experts for a given task while keeping near full-model performance. 🧵 https://t.co/xXcWsYh50D

13

405

57

234

88K

0

2

0

101
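
As a rough illustration of "selecting the subset of experts which are relevant", here is a minimal Python sketch. The tweets do not say how EMO chooses experts, so the frequency-based rule, the function names `select_experts` and `memory_fraction`, and the routing trace are all assumptions made for illustration, not Ai2's method.

```python
from collections import Counter


def select_experts(routed_expert_ids, keep):
    """Return the `keep` most frequently routed expert ids, sorted ascending.

    `routed_expert_ids` is a hypothetical trace: one expert id per token
    observed while running the full model on task-specific text.
    """
    counts = Counter(routed_expert_ids)
    top = [expert for expert, _ in counts.most_common(keep)]
    return sorted(top)


def memory_fraction(kept_experts, total_experts):
    """Fraction of expert weights that stay loaded (experts assumed equal in size)."""
    return len(kept_experts) / total_experts


# Hypothetical routing trace for one task on an 8-expert layer.
trace = [3, 3, 7, 3, 1, 7, 7, 3, 5, 7]
kept = select_experts(trace, keep=2)
fraction = memory_fraction(kept, total_experts=8)
```

In this toy trace, experts 3 and 7 handle most tokens, so keeping only those two would leave a quarter of the expert weights in memory. Whether quality holds up depends on how the model was trained, which is the point of the release described above.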

Jaco du plessis @jdupl1

about 2 months ago

@trq212 It's annoying with PRs in GitHub. Any thoughts on that?

0

5

Jaco du plessis @jdupl1

about 2 months ago

@AdamZivo Make it harder if it's too easy with AI.

0

Jaco du plessis @jdupl1

about 2 months ago

@ChrisHayduk @firstadopter If you use them via Bedrock, it's far more reliable. I doubt there will be much enterprise impact.

0

26

Jaco du plessis @jdupl1

about 2 months ago

@doodlestein spaCy is likely even faster.

0

1

0

35

jdupl1 retweeted

Addy Osmani

@addyosmani

about 2 months ago

@Hacubu Exa is an optional add-on via Cloud's Agent Marketplace. This is not a change in any default grounding, which will continue to use Google Search :)

5

149

4

10

19K

Jaco du plessis @jdupl1

2 months ago

@HamelHusain Make it add a photo to a massive word doc 😊

0

587

Jaco du plessis @jdupl1

2 months ago

@fchollet Mojo?

0

73

Jaco du plessis @jdupl1

2 months ago

@nickkokonas @Ronycoder Can you recommend a better source?

1

0

199

jdupl1 retweeted

Michał Podlewski

@trajektoriePL

3 months ago

Terence Tao proposes what he calls a "Copernican view of intelligence". Instead of buying into the common, one-dimensional narrative that artificial intelligence will simply evolve from "subhuman" to "superhuman" and ultimately make humanity entirely redundant, Tao urges us to look at the bigger picture. Much like the Copernican revolution proved the Earth is not the center of the universe, Tao suggests we need to realize that human intelligence isn't the only, or necessarily the highest, form of intellect. Historically, we have treated other forms of storing or creating knowledge—like animals, books, and computers—as secondary. However, we actually exist within a much richer universe of intelligence. Both human intelligence and computer intelligence possess their own distinct strengths and weaknesses. The true potential lies not in viewing them as direct competitors, but rather in focusing on collaboration. By working together, humans and computers can achieve additional things that neither could accomplish on their own, requiring us to think in much wider terms than just what humans or computers can do alone.

139

4K

600

2K

607K

jdupl1 retweeted

Tuna @antea04

3 months ago

I'm a data scientist @OurWorldinData and I need help from a botanist or someone local to Kyoto, Japan! 🌸 We present one of the world’s longest climate records: 1,200 years of peak cherry blossom dates in Kyoto. The researcher who maintained it, Professor Yasuyuki Aono, sadly passed away last year.

28

1K

368

239

143K

Jaco du plessis @jdupl1

3 months ago

@RobinWigg Can I preorder an audiobook?

0

2

0

35

Jaco du plessis @jdupl1

3 months ago

@ericweinstein I think you should listen to the episode because you are agreeing with what Terence actually said and disagreeing with the confused Twitter take.

0

2

0

267

jdupl1 retweeted

Daniel Hnyk @hnykda

3 months ago

LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM PyPI release 1.82.8 has been compromised: it contains litellm_init.pth with base64-encoded instructions to send all the credentials it can find to a remote server and self-replicate. Link below.

307

9K

2K

3K

6M
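
For context on why a `.pth` file matters: Python's `site` module executes any line in a `.pth` file that starts with `import` followed by a space or tab, every time the interpreter starts. The sketch below lists such lines so they can be reviewed by hand. The function name `executable_pth_lines` is hypothetical, and this is only a way to surface candidates, not a detector for this specific compromise.

```python
from pathlib import Path


def executable_pth_lines(directory):
    """Map each .pth file in `directory` to the lines `site` would execute.

    Only lines beginning with "import " or "import\t" are run at startup;
    all other non-comment lines are treated as paths to add to sys.path.
    """
    findings = {}
    for pth in sorted(Path(directory).glob("*.pth")):
        text = pth.read_text(encoding="utf-8", errors="replace")
        hits = [line for line in text.splitlines()
                if line.startswith(("import ", "import\t"))]
        if hits:
            findings[pth.name] = hits
    return findings
```

One way to use it is to call the function on each directory returned by `site.getsitepackages()`. Some legitimate packages also ship executable `.pth` lines, so a hit is a prompt for inspection rather than proof of tampering.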

Jaco du plessis @jdupl1

5 months ago

@clattner_llvm Do you think this could be applied to Mojo without degrading quality? If so, how?

0

106

jdupl1 retweeted

the tiny corp

@__tinygrad__

5 months ago

@ThePrimeagen Sounds like they could've written tinygrad in two days.

4

244

3

6

8K

jdupl1 retweeted

Jorge Bravo Abad

@bravo_abad

5 months ago

Generative thermodynamic computing

Diffusion models are powerful generative tools, but they come with a hidden cost: every denoising step requires a digital neural network, artificially injected noise, and substantial energy consumption. Yet physics offers an alternative: what if the noise needed for generation arose naturally from thermal fluctuations, and the denoising process was physically enacted rather than simulated?

Stephen Whitelam introduces exactly this: a generative modeling framework for thermodynamic computing. Instead of using neural networks to transform noise into structure, the approach encodes denoising information directly in the energy landscape of a physical system evolving under Langevin dynamics.

The training principle is elegant: observe noising trajectories (structured data degrading into noise), then adjust the system's couplings via gradient descent to maximize the probability that a thermodynamic computer would generate the reverse (structure from noise). This process has a beautiful physical interpretation: it minimizes the heat emission and entropy production of the generative process.

In a proof-of-concept simulation with 784 visible units and 512 hidden units trained on just three MNIST digits, the thermodynamic computer learns to transform noise into recognizable digit-like structures through physical dynamics alone, with no external control or pseudorandom numbers required.

The energy implications are striking: the simulated thermodynamic computer emits ~2,900 kᵦT of heat per generation, compared to ~5 × 10¹⁴ kᵦT for a digital neural network doing equivalent denoising, a difference of more than 10 orders of magnitude.

The message is compelling: by grounding generative modeling in thermodynamic principles, we can design systems where computation emerges from physics itself, opening paths toward autonomous, energy-efficient generation that could fundamentally change how we think about the hardware of machine learning.

Paper: https://t.co/c8vLmgnLZ8

14

571

77

413

53K
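
Two ideas in the thread above can be made concrete with a small Python sketch: a single step of overdamped Langevin dynamics, and the arithmetic behind "more than 10 orders of magnitude". The one-dimensional quadratic energy, unit mobility, and the names `langevin_step` and `orders_of_magnitude` are assumptions for illustration; this is not the model from the paper.

```python
import math
import random


def langevin_step(x, grad_energy, dt, kT, rng):
    """One Euler-Maruyama step of overdamped Langevin dynamics (unit mobility).

    The state drifts down the energy gradient and receives thermal noise
    whose variance is 2 * kT * dt.
    """
    noise = rng.gauss(0.0, 1.0)
    return x - grad_energy(x) * dt + math.sqrt(2.0 * kT * dt) * noise


def orders_of_magnitude(larger, smaller):
    """Base-10 logarithm of the ratio between two positive quantities."""
    return math.log10(larger / smaller)


# Heat per generation as quoted in the tweet, in units of k_B T.
digital_heat = 5e14
thermodynamic_heat = 2900
gap = orders_of_magnitude(digital_heat, thermodynamic_heat)

# Toy trajectory in a quadratic well U(x) = x**2 / 2, so grad U(x) = x.
rng = random.Random(0)
x = 1.0
for _ in range(100):
    x = langevin_step(x, lambda v: v, dt=0.01, kT=1.0, rng=rng)
```

The ratio of the two quoted figures is about 1.7 × 10¹¹, so `gap` falls between 11 and 12, consistent with the claim in the tweet.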

jdupl1 retweeted

Andrej Karpathy

@karpathy

over 5 years ago

How to become expert at thing:
1. iteratively take on concrete projects and accomplish them depth wise, learning “on demand” (ie don’t learn bottom up breadth wise)
2. teach/summarize everything you learn in your own words
3. only compare yourself to younger you, never to others

173

14K

3K

6K

0
