Far El

Verified account

@far__el

building a new kind of AI @ruliad

world

Joined April 2022

2.6K Following

14.2K Followers

7.5K Posts

22 days ago

>Claims auto AI research
>Look under the hood, it’s just smart hyperparam sweeps in LLM paradigm

Sure, that’s valuable; all respectable AI researchers already do so in some capacity. Real auto research is discovering new arch/algo innovations; everything else is playing around.

3

11

0

1

489

22 days ago

@YiMaTweets Academia’s incentives are just as corrupted, contributing to a high noise-to-signal ratio. The pace of unreviewed and poorly conducted science is astounding. Many papers are not replicable. Too many focus on exploitation rather than exploration.

0

1

0

0

342

24 days ago

@alec_helbling More like drunk

1

2

0

0

201

24 days ago

Scale? Where we’re going, we don’t need hyperscalers.

0

8

0

1

483

about 1 month ago

@flowersslop There are actually some really cool open source projects that convert 3d printers into home bio labs

1

5

0

6

2K

about 2 months ago

Most of JAX’s advantage is erased if you are a cracked engineer. The only reason you would use JAX is for some level of convenience (functional transforms? Out-of-the-box compiles?) and maybe for building on TPUs. Any serious AI researcher or engineer doesn’t just accept vanilla PyTorch’s performance disadvantage; we profile, fuse ops, write custom kernels, and optimize the fuck out of the implementation we want to ship.

François Chollet

about 2 months ago

When looking at deep learning profiles, one of the most obvious tells between a mediocre and great candidate is whether they list PyTorch or JAX.

168

2K

35

918

1M

8

123

1

59

17K

about 2 months ago

@EpochAIResearch Such a ridiculous thing you just did, invalidates your benchmark.

2

7

0

0

888

about 2 months ago

Thanks for the thoughtful reply. I just revisited their pub, and to your point, it does seem that their new platform now deliberately plates both excitatory and inhibitory neurons together (~80/20 E/I ratio). In this case they’ve moved to the deeper active-inference principle, where the training rule is now structured, predictable stim (i.e. low surprise = positive reward) versus unstructured, random stim (i.e. high surprise = negative reward). They tune the stim params (frequency + pattern + amplitude + which electrodes + burst length...) so the reward signal still drives the right plasticity.

0

1

0

0

80

about 2 months ago

@AlchemyAmerican Insane

0

0

0

0

318

about 2 months ago

@JagersbergKnut Exhausting lol

0

1

0

0

135

about 2 months ago

Maybe training isn’t saturated yet and more can be squeezed out of these archs for now. Even so, if we want to train or serve 100T+ params or equivalent, we can’t rely on the current paradigm; we need novel archs and training methods. Very soon we will see models competitive with the current frontier that don’t need billions of dollars of compute or even trillions of tokens of data.

0

1

0

0

813

about 2 months ago

@yacineMTB Let’s start a robotics mining company and extract all the goodies

0

1

0

0

147

2 months ago

@PalmerLuckey @micsolana seeing any UAPs yet

0

5

0

0

488

2 months ago

0

4

0

0

963

2 months ago

@spisaktamas Really cool paper btw, looking forward to a deep dive into it this weekend

0

2

0

0

20

2 months ago

@spisaktamas At least the first one

1

1

0

0

85

2 months ago

@spisaktamas Btw the listed GitHub repo in your paper doesn’t exist!

1

4

0

0

399

2 months ago

Addressing some comments and DMs:

What I mean is that the objective is to replicate human performance and action efficiency. You can solve ARC-AGI-3 tasks in many different ways, through LLM agent harnesses or brute forcing via classic RL. But these do not meet my criteria, and I don't find these solutions interesting for solving the fundamental problem/gap with AIs right now. The agent (however architected or pre-trained, even on the public ARC-AGI-3 tasks or the train tasks in prev challenges), when evaluated on a new, previously unseen task/game, cannot rely on these tricks (they won't even work that well if the ARC-AGI-3 held-out tasks are TRULY unique).

The thing about ARC is that you get a human comparable for difficult tasks in a contained env, which also points to the missing capabilities we need, like online learning and reasoning. What matters is how you frame the problem. I've observed a few ppl playing the games (with varying ranges of "pretraining"), and the benchmark is honestly great at making the challenge of replicating these capabilities achievable/digestible (IF you approach it in the right way).

Knowing the interface shouldn't matter; it's something held constant. Not knowing the rules, and having these rules be difficult, unique, and requiring exploration + reasoning + online learning, is high enough complexity to get at what matters. Classic RL models will likely fail on unseen tasks if the games are different enough (which is the premise of this benchmark; if that assumption fails, then the whole benchmark fails). It's like taking a policy trained on a bunch of Atari games and testing it on a totally new game with the same interface (MANY papers during the RL golden era explored this): it may get lucky, but it will likely not perform well or at human level.

1

7

0

0

595

2 months ago

if your ARC-AGI 3 solution is to just run classic RL on the games themselves (i.e. tufa attempt), or if you're hardcoding anything (i.e. agentica), then you clearly missed the point of the benchmark. The score doesn't matter if the process is not true to the qualities we want from the next paradigm. A human doesn't need prior instruction or that many attempts to solve each game.

6

46

5

10

4K

2 months ago

@leothecurious they both address shortcomings in current models (from diff perspectives), which imo points to the same missing fundamental quality. but yea ARC is cleaner/more focused

0

1

0

0

50
