Sourav Tripathy @EssenceThinker - Twitter Profile

Pinned Tweet

Sourav Tripathy @EssenceThinker

about 2 years ago

Reality was never more tenuous, the abyss never more intoxicating.

1

0

386

EssenceThinker retweeted

Math Lady Hazel 🇦🇷

@mathladyhazel

4 days ago

The Fibonacci elephant.

111

26K

4K

3K

4M

EssenceThinker retweeted

Elliot Arledge

@elliotarledge

about 1 month ago

Co-Founder of Cerebras explains their WSE simplified design compared to classical GPUs made by NVIDIA.

25

3K

333

3K

177K

Sourav Tripathy @EssenceThinker

about 1 month ago

Do yourself a favour if you are curious...just pick up Shannon's original paper, A Mathematical Theory of Communication...It's great to go into thinking mode with this paper...

0

9

Sourav Tripathy @EssenceThinker

about 2 months ago

@gabegreenberg Sent ! https://t.co/wI4TgocuR0

0

1

0

11

Sourav Tripathy @EssenceThinker

about 2 months ago

Link to read - https://t.co/h7w0aYr6sB

0

14

Sourav Tripathy @EssenceThinker

about 2 months ago

For the past few days I have been thinking about distributed inference of LLMs over the open internet...So I wrote about it, defining the constraints and how the Petals paper attempts to solve them...

EssenceThinker's tweet photo. https://t.co/p5zgR3Ijpw

1

0

20

Sourav Tripathy @EssenceThinker

about 2 months ago

@dwarkesh_sp's latest podcast with @reinerpope...(I believe this is how technical podcasts should be)...weekend going 😊

0

15

Sourav Tripathy @EssenceThinker

about 2 months ago

@dwarkesh_sp @reinerpope It's gonna be one hell of a weekend watching this and grasping all the ideas discussed...

0

511

Sourav Tripathy @EssenceThinker

about 2 months ago

@sarahookr @a16z @adaption_ai https://t.co/SdiwLyibvi I wrote about continual learning through a chain of thoughts...Give it a read...maybe

0

1

0

23

Sourav Tripathy @EssenceThinker

about 2 months ago

@KeshavRamji Interesting paper. Saw it trending on alphaXiv and read it. The model reasons using abstract tokens and produces the same level of results as verbal CoT. The only thing is, why can't we use Token1 and Token0 instead of 64 tokens, or will that create a problem in RL...

0

318

EssenceThinker retweeted

Nick Levine

@status_effects

2 months ago

New work with @AlecRad and @DavidDuvenaud: Have you ever dreamed of talking to someone from the past? Introducing talkie, a 13B model trained only on pre-1931 text. Vintage models should help us to understand how LMs generalize (e.g., can we teach talkie to code?). Thread:

179

3K

400

2K

1M

Sourav Tripathy @EssenceThinker

about 2 months ago

To make use of my old NVIDIA GeForce 1650 Ti...hosted a 500M model in FP16...inference using vLLM...Try it out - https://t.co/LxGNHO8Nux

0

23

Sourav Tripathy @EssenceThinker

2 months ago

For the past three days I have been thinking about how LLM inference can run over the public internet, something like a decentralised version. Came across very few papers - Petals, PlanetServe, BloomBee, MDI-LLM and DSSD (Distributed split speculative decoding)...It's an elegant problem

0

21

Sourav Tripathy @EssenceThinker

2 months ago

Reading..DeepseekV4 Tech report...

0

49

Sourav Tripathy @EssenceThinker

2 months ago

After 3 crashes, realized that one should not reinitialise CUDA in a forked subprocess...always spawn...in PyTorch...🫩

0

26
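A minimal sketch of the "always spawn" advice from the tweet above. It uses only the standard-library `multiprocessing` module so it runs without a GPU; the names `square`, `get_spawn_context`, and `run_in_spawned_workers` are hypothetical, and in a real PyTorch program the worker would be the place where CUDA is first touched.

```python
import multiprocessing as mp


def square(x):
    # Stand-in worker. In a PyTorch program, CUDA would be initialised
    # here, inside the child, rather than inherited from the parent.
    return x * x


def get_spawn_context():
    # "spawn" starts a fresh interpreter for each child, so the child
    # does not inherit the parent's already-initialised state,
    # which is what happens with "fork".
    return mp.get_context("spawn")


def run_in_spawned_workers(values, processes=2):
    # Use a context object instead of changing the global start method,
    # so other code in the same program is not affected.
    ctx = get_spawn_context()
    with ctx.Pool(processes=processes) as pool:
        return pool.map(square, values)
```

With spawn, the worker function has to be importable from the child process, so call `run_in_spawned_workers([1, 2, 3])` from a script under an `if __name__ == "__main__":` guard.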

Sourav Tripathy @EssenceThinker

3 months ago

@maharshii

0

43

Sourav Tripathy @EssenceThinker

3 months ago

For reference, I was making a DeepSeek 1.5B model solve a sudoku through Ollama...via just reasoning...

0

48

Sourav Tripathy @EssenceThinker

3 months ago

If you are working with a small model, say 1.5B, with limited memory (I was using Ollama), and for a multi-turn task a short output is not enough, but increasing the length causes an OOM error because Ollama uses llama.cpp with no PagedAttention for efficient memory use...any thoughts?

1

0

34
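One way to reason about the OOM described in the tweet above is to estimate how the key/value cache grows with context length. This is a rough sketch using the standard KV cache size formula; the function name and the example shapes are hypothetical and do not describe any specific model or Ollama's actual allocator.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   bytes_per_elem=2, batch_size=1):
    # Each layer stores one key and one value vector per token,
    # hence the leading factor of 2. bytes_per_elem=2 assumes FP16.
    return (2 * n_layers * n_kv_heads * head_dim
            * seq_len * bytes_per_elem * batch_size)


def to_mib(n_bytes):
    return n_bytes / (1024 ** 2)


# Illustrative shapes only (assumed, not taken from a real model card).
short_ctx = kv_cache_bytes(n_layers=28, n_kv_heads=2, head_dim=128, seq_len=4096)
long_ctx = kv_cache_bytes(n_layers=28, n_kv_heads=2, head_dim=128, seq_len=32768)
```

The cache grows linearly with `seq_len`, so an 8x longer context needs 8x the cache memory on top of the model weights.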

Sourav Tripathy @EssenceThinker

3 months ago

@arpit_bhayani I had a problem where I used a chunk from a PDF to generate multiple statements and then needed the exact source location, not the whole chunk. I used Jaccard similarity to find where each statement was taken from. It's working well so far.

0

19
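The Jaccard approach in the tweet above can be sketched as follows. This is an assumed reconstruction, not the author's code: it splits the chunk into sentences with a simple regex, compares word sets, and returns the best match. The names `tokens`, `jaccard`, and `locate_source` are hypothetical.

```python
import re


def tokens(text):
    # Lowercased set of alphanumeric words; punctuation is ignored.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    # |A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def locate_source(statement, chunk):
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", chunk) if s.strip()]
    if not sentences:
        return None, "", 0.0
    stmt = tokens(statement)
    scores = [jaccard(stmt, tokens(s)) for s in sentences]
    best = max(range(len(sentences)), key=scores.__getitem__)
    return best, sentences[best], scores[best]


chunk = ("Paris is the capital of France. "
         "The Seine flows through the city. "
         "Lyon is known for its food.")
idx, sentence, score = locate_source("Lyon is famous for food", chunk)
```

Here `idx` points at the third sentence of the chunk. Word-set overlap ignores word order and synonyms, so heavily paraphrased statements may need a different similarity measure.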

EssenceThinker retweeted

Andrej Karpathy

@karpathy

3 months ago

One common issue with personalization in all LLMs is how distracting memory seems to be for the models. A single question from 2 months ago about some topic can keep coming up as some kind of a deep interest of mine with undue mentions in perpetuity. Some kind of trying too hard.

2K

21K

1K

3K

3M
