lucky verma @luckieverma - Twitter Profile

lucky verma @luckieverma

2 months ago

Code/artifacts: https://t.co/W2497OpUjG

0

11

lucky verma @luckieverma

2 months ago

arXiv preprint: When does removing LayerNorm help? Main finding: DyT is not uniformly helpful; it behaves like a regime-dependent implicit regularizer. Paper: https://t.co/bRz1CYmX3c

1

0

21

luckieverma retweeted

comma

@comma_ai

11 months ago

New rack up! Sell comma 3X to buy GPUs to train better models to sell more comma 3X to buy more GPU to sell more comma 3X

34

642

13

22

42K

lucky verma @luckieverma

over 1 year ago

Pacman dominance ＼⍩⃝／: Fun to play around with KGs built with LMs

0

1

0

25

lucky verma @luckieverma

over 1 year ago

terminal is good

0

28

lucky verma @luckieverma

about 2 years ago

@thisistheray @METAPCs Maryland!

1

0

215

luckieverma retweeted

AK

@_akhaliq

about 2 years ago

KAN: Kolmogorov-Arnold Networks. Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), https://t.co/JITPSr17cv

15

1K

228

567

133K

luckieverma retweeted

WIZZ🥷 ( beware scammers )

@CryptoWizardd

about 2 years ago

Anyone else need this?

509

3K

220

45

324K

lucky verma @luckieverma

about 2 years ago

@ChaniNattan Why do you need a Sikh hashtag at that time? Do you want to dig up old feuds?

0

103

lucky verma @luckieverma

over 2 years ago

@AGONbyAOC GPU

0

7

lucky verma @luckieverma

over 2 years ago

@BhaktSanatani_ @historyinmemes How much do they pay?

2

0

703

luckieverma retweeted

Elon Musk

@elonmusk

over 2 years ago

Fixed it

10K

307K

14K

4K

54M

luckieverma retweeted

elvis

@omarsar0

over 2 years ago

Knowledge Fusion of LLMs

Is it possible to merge existing models into a more potent model? We have already seen a few ways that show the potential to do this effectively, using approaches like weight merging and ensembling of models.

This work proposes FuseLLM, with the core idea of externalizing knowledge from multiple LLMs and transferring their capabilities to a target LLM. It leverages the generative distributions of the source LLMs to externalize both their collective knowledge and individual strengths, and transfers them to the target LLM through continual training. To put it simply, the idea is to benefit from the strengths of all the LLMs and combine them into one integrated model.

The paper finds that FuseLLM can improve the performance of the target model across a range of capabilities such as reasoning, common sense, and code generation. By the way, you can also perform the fusion among fine-tuned LLMs that specialize in specific tasks.

This continues to be an interesting research area, so I'm hoping to document more on any new ideas and findings I come across.
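
A rough sketch of the idea described above, for one token position: take the next-token distributions of several source models, fuse them into one, and train the target on a mix of the usual language-modeling loss and a term pulling it toward the fused distribution. The tweet confirms none of the specifics. The fusion rule (keep the source with the lowest cross-entropy on the gold token), the weight `lam`, and all function names are assumptions for illustration.

```python
import math

def cross_entropy(probs, gold):
    """Negative log-probability that `probs` assigns to the gold token index."""
    return -math.log(probs[gold])

def fuse_min_ce(source_dists, gold):
    """Assumed fusion rule: keep the source distribution that best predicts the gold token."""
    return min(source_dists, key=lambda p: cross_entropy(p, gold))

def fusion_loss(target_probs, source_dists, gold, lam=0.9):
    """Hypothetical per-token objective: lam * LM loss + (1 - lam) * fusion loss."""
    fused = fuse_min_ce(source_dists, gold)
    # Standard language-modeling loss of the target model on the gold token.
    lm_term = cross_entropy(target_probs, gold)
    # Cross-entropy between the fused source distribution and the target distribution.
    fusion_term = -sum(q * math.log(p) for q, p in zip(fused, target_probs) if q > 0)
    return lam * lm_term + (1 - lam) * fusion_term

# Toy vocabulary of three tokens, two source models, gold token index 1.
sources = [[0.7, 0.2, 0.1], [0.3, 0.6, 0.1]]
target = [0.25, 0.5, 0.25]
loss = fusion_loss(target, sources, gold=1)
```

In a real continual-training setup this would be averaged over every position in a batch and computed on logits with a deep-learning framework. The vocabularies of the source models would also need to be aligned first, a step this sketch skips.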