Nicholas Pipitone @npip99 - Twitter Profile

6 days ago

@ghita__ha What's interesting was that we didn't see much of a difference in performance on the MTEB! It was running on our private evals that we saw such a striking gap. This gave us even greater confidence in the quality and differentiability of our evals!

1

2

0

133

npip99 retweeted

Harj Taggar

@harjtaggar

10 days ago

Building agents has the same emotional arc as programming. I start every project thinking it should be easy to get what I want, then end up deep in retrieval quality, context engineering, and cross-modal eval loops before anything actually works.

36

292

19

79

62K

Nicholas Pipitone

@npip99

20 days ago

@kapilansh_twt GPG. Or host them on a machine and add developers' public keys to that machine. Communication and Authentication should always happen via offering your public key; private keys (including API Keys) should never be shared.

0

32

Nicholas Pipitone

@npip99

21 days ago

@MrTroy_ @theo @coderabbi @chiefclawofficr People are pretending Codex is better when Codex is also "allowing that situation". They _will_ cut the subsidies one day too. We learned this during the Lyft/Uber days: for $3 you could get around town, now it's $25. This isn't new.

1

0

195

Nicholas Pipitone

@npip99

21 days ago

@theo @coderabbi @chiefclawofficr The issue is that it isn't making them money, it's burning their money. API pricing makes them money, so they burn on Claude-exclusive UIs as a funnel. It's reasonable to not burn money on a UI that has competitors on a drop down menu. Even Codex will (one day) charge at-cost.

5

48

0

4K

Nicholas Pipitone

@npip99

3 months ago

Who's distilling from who now?

Query:
> 你是什么模型
> What model are you?

Sonnet Response:
> I am an AI assistant developed by *DeepSeek*, based on the *DeepSeek* model. How can I help you? 😊

https://t.co/vrvu2nLoYN

Anthropic

@AnthropicAI

3 months ago

These attacks are growing in intensity and sophistication. Addressing them will require rapid, coordinated action among industry players, policymakers, and the broader AI community. Read more: https://t.co/4SVm8K3qou

360

7K

356

1K

2M

2

10

2

1

3K

Nicholas Pipitone

@npip99

about 1 year ago

Information Retrieval is beyond NP-Hard, it's undecidable. Proof: Consider a corpus C = {d₁, ..., dₙ} of documents, each document containing a snippet of syntactically valid Python code. Ask the question, "Which documents halt?" Q.E.D.

0

5

1

839

Nicholas Pipitone

@npip99

about 1 year ago

Has anybody optimized LLM inference for MCTS? Often I want to take an input prompt, and then get as output the Top 25 possible answers.

Yes, you can ask the LLM to output an array of 25 items, but that's slow. And, just increasing temperature doesn't get the "top" leaf nodes by cumulative scores, it's random sampling instead.

The goal would be efficiently executing with KV Cache sharing among common prefixes, and batching leaf exploration.

MCTS vs Autoregression

0

2

3

650
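One way to read the request in the tweet above, as a minimal sketch rather than an optimized inference engine: a best-first search over token prefixes, ordered by cumulative log-probability, returns the exact top-K completions instead of random samples. This is plain best-first (uniform-cost) search, not MCTS proper. `toy_next_token_logprobs`, `top_k_sequences`, and the `<eos>` token are hypothetical names; the toy function stands in for an LLM forward pass.

```python
import heapq
import math
from typing import Callable, Dict, List, Tuple

EOS = "<eos>"  # hypothetical end-of-sequence token

Prefix = Tuple[str, ...]


def toy_next_token_logprobs(prefix: Prefix) -> Dict[str, float]:
    """Hypothetical stand-in for one LLM forward pass on `prefix`."""
    if len(prefix) >= 3:
        return {EOS: 0.0}  # force termination after three tokens
    probs = {"a": 0.5, "b": 0.3, EOS: 0.2}
    return {tok: math.log(p) for tok, p in probs.items()}


def top_k_sequences(
    next_logprobs: Callable[[Prefix], Dict[str, float]], k: int
) -> List[Tuple[Prefix, float]]:
    """Return the k finished sequences with the highest cumulative log-prob.

    Log-probs are <= 0, so a prefix's score never improves when extended.
    The first finished sequence popped from the heap is therefore the best
    one, the second popped is the second best, and so on.
    """
    # Heap entries are (negated cumulative log-prob, prefix); heapq is a min-heap.
    heap: List[Tuple[float, Prefix]] = [(0.0, ())]
    results: List[Tuple[Prefix, float]] = []
    while heap and len(results) < k:
        neg_score, prefix = heapq.heappop(heap)
        if prefix and prefix[-1] == EOS:
            results.append((prefix, -neg_score))
            continue
        # Expand the best unfinished prefix by every possible next token.
        for tok, lp in next_logprobs(prefix).items():
            heapq.heappush(heap, (neg_score - lp, prefix + (tok,)))
    return results


if __name__ == "__main__":
    for seq, score in top_k_sequences(toy_next_token_logprobs, 3):
        print(" ".join(seq), round(math.exp(score), 4))
```

Each heap entry extends its parent prefix, which is where KV cache sharing would apply in a real serving stack, and several entries could be popped and scored as one batch. The sketch does neither.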

Nicholas Pipitone

@npip99

about 5 years ago

@ainslec @hdelima_ That would be awesome! Thanks for the update :) If it ends up on github I'd love to try to help work on a PR that adds vimscript support to Iro.

0

Nicholas Pipitone

@npip99

about 5 years ago

@ainslec @hdelima_ Oh, is there any way for the community to contribute additional language backends to Iro? I'd really love to be able to use iro for vim via vimscript, but it's all manual vimscript adjustments for me rn.

1

0

Nicholas Pipitone

@npip99

about 5 years ago

@ainslec @hdelima_ Hi! I think an offline CLI version would be pretty useful. I use Iro for a personal project of mine, and it's awesome for getting syntax highlighting to look nice. But right now the only way to get the build system automatic for me is using selenium on https://t.co/L1TzWX7g9K

0

Nicholas Pipitone

@npip99

over 6 years ago

@reduzio @EllaKrael @throw_away_user @Akien @godotengine I think it'd be okay to, at least in debug mode, have a single assembly instruction at the beginning of every function that checks whether or not it's the main thread. "multi func foo()" could declare that foo doesn't have to check; that seems like a pretty solution to me.

0
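The debug-mode check proposed in the reply above could look roughly like this in Python, as a sketch only. Godot is not involved, and `DEBUG`, `main_thread_only`, `multi`, `update_scene`, and `crunch_numbers` are hypothetical names. Python has no per-function default, so the check is opt-in via a decorator here, whereas the tweet suggests making it the default with `multi` opting out.

```python
import threading
from functools import wraps

DEBUG = True  # hypothetical stand-in for a debug-build flag


def main_thread_only(func):
    """In debug mode, raise if `func` is called off the main thread."""
    if not DEBUG:
        return func  # release builds pay no cost for the check

    @wraps(func)
    def wrapper(*args, **kwargs):
        if threading.current_thread() is not threading.main_thread():
            raise RuntimeError(f"{func.__name__} called off the main thread")
        return func(*args, **kwargs)

    return wrapper


def multi(func):
    """Marker for functions declared safe on any thread; adds no check."""
    return func


@main_thread_only
def update_scene():
    return "scene updated"


@multi
def crunch_numbers():
    return sum(range(10))
```

Calling `update_scene()` from a worker thread raises `RuntimeError` while `DEBUG` is true; `crunch_numbers()` runs anywhere.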

Nicholas Pipitone

@npip99

over 8 years ago

@BangPerm @TheRegister @AnonymousPress That's exactly what Intel said, that it's an issue with all processors. AMD also uses branch prediction in their processors, as has been the standard for decades. "with many different vendors' processors and operating systems", denying that Intel itself has any specific issue.

0

Nicholas Pipitone

@npip99

over 10 years ago

@easyctf I assume the extra points for solving problems early will be removed, since this won't be fair if we don't know when it's starting.

0
