David Wilde

Verified account

@DavidChrisGer

Data Scientist @CrowdStrike. Opinions here are my own and don't necessarily reflect those of CrowdStrike. Fallibilist, building & analyzing. RWRI#16

Germany

Joined October 2010

739 Following

135 Followers

237 Posts

20 days ago

I wrote about NumeraiAgentBench, which I mentioned in the Codex user group thread. Basic idea: put coding agents in a real ML loop, not a benchmark puzzle. They have to figure out Numerai, build models, submit, deal with delayed feedback, and keep going. https://t.co/UlUuc6sPQA

20 days ago

@reach_vb Using Codex to develop a perpetual benchmark of coding agents (Codex and Claude atm) on the @numerai Tournament. WIP, but already interesting. https://t.co/Egj0CAVZcH

0

15

8

4

2K

0

1

0

2

143

22 days ago

Easier to think about than: “how complex is my task”, because I always want the “best”.

0

0

0

0

27

22 days ago

Coding agent heuristics? Been catching up on @theo and @davis7 nerd-sniped podcast: in episode 2, Theo says that reasoning effort leads the model astray from what is in the codebase/context (I read: too much revolving on itself).

1

0

0

0

69

22 days ago

Example: starting with little context and I desire lots of “model creativity” -> high; I want a precise code change in an existing project -> low/medium. Likely doesn’t work universally.

1

0

0

0

39

23 days ago

@ChatGPTapp Or, you could use FafyCat: open source, local-first transaction categorization and personal finance analytics. No account linking, and designed to be accessible to coding agents for local analysis. https://t.co/KmORHskwei

0

0

0

0

823

30 days ago

Claude Code weekly limit reached (personal), but so much week left.

1

2

0

0

73

2 months ago

Biggest surprise: tabular DL architecture lost to a plain 3-layer FF net with GELU. Post: https://t.co/Pwcbsp1MXq

1

1

0

0

60

2 months ago

Wrote a blog post about it: 358 experiments, payout improved from -0.01 to 0.028. Era-purged CV, multi-seed validation gates, synthesized learnings, and a DO NOT RETRY table to prevent re-exploring exhausted search regions
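
The post does not spell out what "era-purged CV" means here. One common reading, sketched below under assumptions, is to split the eras into contiguous validation blocks and drop the training eras adjacent to each block, so that overlapping targets cannot leak across the boundary. The function name `era_purged_folds`, the fold count, and the purge width are hypothetical choices for illustration, not taken from the blog post.

```python
from typing import Iterator, List, Sequence, Tuple


def era_purged_folds(
    eras: Sequence[int], n_folds: int = 3, purge: int = 1
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_eras, valid_eras) pairs over contiguous era blocks.

    Hypothetical helper: the `purge` eras on each side of the validation
    block are excluded from training.
    """
    ordered = sorted(set(eras))
    if not 1 <= n_folds <= len(ordered):
        raise ValueError("n_folds must be between 1 and the number of eras")
    base, extra = divmod(len(ordered), n_folds)
    start = 0
    for k in range(n_folds):
        # Spread any remainder over the first `extra` folds.
        stop = start + base + (1 if k < extra else 0)
        valid = ordered[start:stop]
        # Drop the `purge` eras directly before and after the validation block.
        train = ordered[: max(start - purge, 0)] + ordered[stop + purge:]
        yield train, valid
        start = stop


if __name__ == "__main__":
    for train, valid in era_purged_folds(range(1, 10), n_folds=3, purge=1):
        print("train:", train, "valid:", valid)
```

A suitable purge width depends on how far the target horizon extends past one era, so it should be set from the data rather than taken from this sketch.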

3 months ago

Tried @karpathy's autoresearch on @numerai tournament. It's fun! Finally a workflow that transparently performs automated ML-experimentation--something that I longed for since AutoML days ~10 years ago. https://t.co/K3r4o2Pa28

0

20

3

20

5K

1

2

0

1

90

3 months ago

Who else is visited by “annoying goblin” phrases in GPT-5.4 responses?

0

1

0

0

58

DavidChrisGer retweeted

Dirk Ehnts @DEhnts

3 months ago

He does not seem to understand that with every extra hour worked, productivity falls. Also, he seems not to grasp that an increase in productivity means that we can afford to work less. His ideas are pure microeconomics with not a hint of macro. They will fail in reality.

68

1K

148

86

36K

DavidChrisGer retweeted

Dirk Ehnts @DEhnts

4 months ago

Germany stagnates because consumption expenditure is flat. Wages and government spending are not rising enough, so that household spending flatlines and the firms sell a lot less than they can produce. No supply side reform can fix this. #macro https://t.co/GZUV7ShVFO

8

114

27

19

8K

DavidChrisGer retweeted

Dirk Ehnts @DEhnts

4 months ago

When it comes to the economy of 🇩🇪 and 🇪🇺, the demand side should be put at the center. That is the only way to raise capacity utilization. ("Competitiveness" falls under trade policy.) https://t.co/EgXhhcVEOj

0

12

2

4

548

DavidChrisGer retweeted

Dirk Ehnts @DEhnts

4 months ago

Capacity utilization in the Eurozone is currently very low. This means that manufacturing companies could produce a lot more, but don't because of a lack of demand. It is simple #macroeconomics. Increase government spending and wages and firms will produce more. 💶🌍🇪🇺 https://t.co/I81aq4GH3x

3

75

33

7

3K

DavidChrisGer retweeted

Dmitrii Kovanikov

5 months ago

AI clearly demonstrated to everyone that it doesn’t matter how fast you build if you don’t know what to build

51

966

69

49

30K

5 months ago

@kepano I've let Claude Code create a skill to write a work summary to the vault from anywhere on my machine. i.e. when fixing a bug in some project, I can now tell CC to document the fix in my notes and it will create a new note according to a note template. Very handy to keep docs.

0

1

0

1

188

5 months ago

I love how Claude Code judges its own output 😁 https://t.co/15nPz1p0va

0

1

0

0

23

6 months ago

@pontus_rendahl For that matter: Mainstream economics (neoclassical) is a victim of scrutiny, too.

0

0

0

0

86
