cristian sanchez @noreplydev - Twitter Profile

noreplydev retweeted

ₕₐₘₚₜₒₙ

@hamptonism

4 months ago

This is who you’re competing against, Chinese fruit seller and chip designer. Yea, you’re cooked:

629

44K

5K

6K

3M

noreplydev retweeted

Lior Alexander

@LiorOnAI

5 months ago

You can now run 70B LLMs on a 4GB GPU.

AirLLM just made massive models usable on low-memory hardware.

𝗪𝗵𝗮𝘁 𝗷𝘂𝘀𝘁 𝗵𝗮𝗽𝗽𝗲𝗻𝗲𝗱
AirLLM released memory-optimized inference for large language models.

It runs 70B models on 4GB VRAM.
It can even run 405B Llama 3.1 on 8GB VRAM.

𝗛𝗼𝘄 𝗶𝘁 𝘄𝗼𝗿𝗸𝘀
AirLLM loads models one layer at a time.

Instead of loading everything:
→ Load a layer
→ Run computation
→ Free memory
→ Load the next layer

This keeps GPU memory usage extremely low.

𝗞𝗲𝘆 𝗱𝗲𝘁𝗮𝗶𝗹𝘀
• No quantization required by default
• Optional 4-bit or 8-bit weight compression
• Same API as Hugging Face Transformers
• Supports CPU and GPU inference
• Works on Linux and macOS Apple Silicon

𝗪𝗵𝗮𝘁 𝘆𝗼𝘂 𝗰𝗮𝗻 𝗱𝗼
• Run Llama, Qwen, Mistral, Mixtral locally
• Test large models without cloud GPUs
• Prototype agents on cheap hardware
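
The load, compute, free loop described in the tweet can be sketched in a few lines. This is a toy illustration only, not AirLLM's code or API: the per-layer JSON files, the ReLU layer, and the names `save_layers`, `apply_layer`, and `layered_forward` are all assumptions made for the example.

```python
import json
import os
import tempfile


def save_layers(layers, directory):
    """Write each layer's weights to its own file (stand-in for a sharded checkpoint)."""
    paths = []
    for i, weights in enumerate(layers):
        path = os.path.join(directory, f"layer_{i}.json")
        with open(path, "w") as f:
            json.dump(weights, f)
        paths.append(path)
    return paths


def apply_layer(weights, x):
    """Toy layer (assumption): matrix-vector product followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def layered_forward(paths, x):
    """Forward pass that holds only one layer's weights in memory at a time."""
    for path in paths:
        with open(path) as f:
            weights = json.load(f)   # load a layer
        x = apply_layer(weights, x)  # run computation
        del weights                  # free memory before loading the next layer
    return x


def demo():
    layers = [
        [[1.0, 0.0], [0.0, 1.0]],   # identity
        [[2.0, 0.0], [0.0, -1.0]],  # scale; ReLU zeroes the negative output
    ]
    with tempfile.TemporaryDirectory() as directory:
        paths = save_layers(layers, directory)
        return layered_forward(paths, [3.0, 4.0])
```

Peak memory in this sketch is bounded by the largest single layer rather than the whole model; the cost is one disk read per layer on every forward pass.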

364

11K

1K

12K

637K

noreplydev retweeted

Charly Wargnier

@DataChaz

5 months ago

NVIDIA just removed one of the biggest friction points in Voice AI.

PersonaPlex-7B is an open-source, full-duplex conversational model. Free, open source (MIT), with open model weights on @huggingface 🤗

Links to repo and weights in 🧵↓

The traditional ASR → LLM → TTS pipeline forces rigid turn-taking. It’s efficient, but it never feels natural.

PersonaPlex-7B changes that. This @nvidia model can listen and speak at the same time. It runs directly on continuous audio tokens with a dual-stream transformer, generating text and audio in parallel instead of passing control between components.

That unlocks:
→ instant back-channel responses
→ interruptions that feel human
→ real conversational rhythm

Persona control is fully zero-shot!

If you’re building low-latency assistants or support agents, this is a big step forward 🔥

157

6K

670

7K

564K

noreplydev retweeted

Markov

@MarkovMagnifico

5 months ago

how my codebase written entirely with claude code runs

696

63K

3K

7K

4M

noreplydev retweeted

dax

@thdxr

5 months ago

@kmdrfx

1

62

1

3

9K

cristian sanchez @noreplydev

6 months ago

btw, this would be a better comparison

0

26

cristian sanchez @noreplydev

6 months ago

Just spotted this in London, are you guys hiring in Europe? @vercel @rauchg

0

1

0

32

cristian sanchez @noreplydev

6 months ago

@nikitabier create.

0

1

0

9

cristian sanchez @noreplydev

6 months ago

over the last few months i've been working on 'self', a virtual machine with native AI integration at both the instruction-set level and the stdlib level. it has its own language, and the demo below takes under 30 lines to write. more details at https://t.co/nbK9L8P7VQ

1

2

1

0

82

noreplydev retweeted

Megatron

@Megatron_ron

6 months ago

BREAKING:

🇨🇭🇺🇲 Switzerland REJECTS Palantir Software

Palantir Software Poses Devastating Risks - Netz Politik

After a risk assessment of the use of Palantir software in Switzerland, the US corporation was rejected, despite years of courting by authorities and the army. The Swiss consider the risks too great.

The fear is that Palantir is handing over sensitive data to the US and from there to Israel.

1K

47K

11K

3K

2M

cristian sanchez @noreplydev

6 months ago

@jarredsumner Hey jarred, are there open roles at bun? I’m 20 and have been working at different startups for the past 4 years. I built “self”, a virtual machine written in Rust with native AI integration as part of the instruction set. I also speak at community events. Everything is at https://t.co/DzxHKO6XM9

0

2

1

0

27

cristian sanchez @noreplydev

7 months ago

@ahmadaccino yep yep… https://t.co/DzxHKO6XM9

0

1

0

19

cristian sanchez @noreplydev

7 months ago

@iyoushetwt there is no plan b

0

4

cristian sanchez @noreplydev

9 months ago

@rauchg @vercel i really think mcp is not that bold, it’s only a client-server contract. It doesn't even enforce a transport layer. It only exposes what you can do or what you can consume. In fact, without LLMs it’s something like a function discovery service
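
To make the "function discovery service" reading concrete, here is a minimal sketch of a client-server contract that only lists what can be called and then calls it. It is not the MCP message format: the `DiscoveryServer` class and the `list` and `call` method names are hypothetical, chosen for illustration.

```python
import json


class DiscoveryServer:
    """Hypothetical registry: tells a client which functions exist and runs them on request."""

    def __init__(self):
        self._functions = {}

    def register(self, name, description, func):
        self._functions[name] = (description, func)

    def handle(self, message):
        """Take a JSON request string and return a JSON response string."""
        request = json.loads(message)
        if request["method"] == "list":
            # Discovery: expose names and descriptions, nothing else.
            result = [
                {"name": name, "description": description}
                for name, (description, _) in sorted(self._functions.items())
            ]
        elif request["method"] == "call":
            _, func = self._functions[request["name"]]
            result = func(**request.get("arguments", {}))
        else:
            return json.dumps({"error": "unknown method"})
        return json.dumps({"result": result})


def demo():
    server = DiscoveryServer()
    server.register("add", "Add two numbers", lambda a, b: a + b)
    # The "transport" here is a plain function call; a pipe or socket would carry the same strings.
    listing = json.loads(server.handle(json.dumps({"method": "list"})))
    answer = json.loads(server.handle(json.dumps(
        {"method": "call", "name": "add", "arguments": {"a": 2, "b": 3}}
    )))
    return listing, answer
```

Because `handle` maps one string to another, nothing in the contract depends on how those strings travel, which is the transport-agnostic point the tweet makes.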

0

42

cristian sanchez @noreplydev

about 1 year ago

https://t.co/zE8YRwiEe6

0

1

106

cristian sanchez @noreplydev

almost 2 years ago

@carla0st @_buildspace @_nightsweekends awesome idea 🔥 when i'm out on the street with my headset on, i open chatgpt 4o and start talking to it with the phone in my pocket. the open question is how to detect the end of a statement, and how to interrupt it if you said something wrong, without turning the screen on

1

0

40

cristian sanchez @noreplydev

almost 2 years ago

@anishthite @_nightsweekends @_buildspace hahahah this idea will be fun. any thoughts around banned topics?

0

6

cristian sanchez @noreplydev

almost 2 years ago

@rds_agi @sonicrypt @solana @burnt_xion @_buildspace @_nightsweekends @FarzaTV I am looking at the project, did you code the esp32 in plain c++? maybe this is useful for you: https://t.co/FL4AEZ9LmP. esp32 hot reload on code changes, it feels like magic.

0

2

0

10

cristian sanchez @noreplydev

almost 2 years ago

@rds_agi 🔥🔥

0

1

0

21

noreplydev retweeted

Sean Kelly @seanpk

almost 2 years ago

Elon Musk fired 80% of Twitter (6500 people) and everyone thought that Twitter was doomed. He was right. Everyone was wrong. It’s the management masterclass of the decade and every entrepreneur must understand why it worked 🧵:

3K

81K

9K

29K

20M
