Marv Solomon

@marvmargic

Farmer @JobBox

Lagos, Nigeria

Joined September 2011

44 Following

334 Followers

1.5K Posts

marvmargic retweeted

Rohit Kumar Tiwari

@_rohit_tiwari_

6 days ago

Builds GPT-like LLMs from scratch in PyTorch > Breaks the LLM architecture into simple parts. > Beginner friendly. > Fully hands on. https://t.co/sbqfNxbBVC > Everything explained step by step. > Just 10 notebooks. 01_tokenization.ipynb 02_token_embeddings.ipynb 03_positional_embeddings.ipynb 04_self_attention_mechanism.ipynb 05_multi_head_self_attention.ipynb 06_feedforward_neural_networks.ipynb 07_residual_connections.ipynb 08_layer_normalization.ipynb 09_transformer_block.ipynb 10_mini_gpt.ipynb

_rohit_tiwari_'s tweet photo. Builds GPT-like LLMs from scratch in PyTorch

> Breaks the LLM architecture into simple parts.
> Beginner friendly.
> Fully hands on.

https://t.co/sbqfNxbBVC

> Everything explained step by step.
> Just 10 notebooks.

01_tokenization.ipynb
02_token_embeddings.ipynb
03_positional_embeddings.ipynb
04_self_attention_mechanism.ipynb
05_multi_head_self_attention.ipynb
06_feedforward_neural_networks.ipynb
07_residual_connections.ipynb
08_layer_normalization.ipynb
09_transformer_block.ipynb
10_mini_gpt.ipynb

300

387

13K

marvmargic retweeted

Amazon Web Services

@awscloud

29 days ago

Have you thought about the future of Agentic AI? We have too, so AWS and @OpenAI leaders recently came together to discuss where agents are working today, and what your organization needs next to lead with AI. Check out the full stream below.

awscloud's tweet photo. Have you thought about the future of Agentic AI?

We have too, so AWS and @OpenAI leaders recently came together to discuss where agents are working today, and what your organization needs next to lead with AI.

Check out the full stream below.

44K

521M

marvmargic retweeted

Elon Musk

@elonmusk

7 days ago

29 more Starlink satellites. Over 10,000 in orbit now.

113K

14K

19M

marvmargic retweeted

Alex Xu

@alexxubyte

8 days ago

How OpenAI Built Its Data Agent Most teams building data agents stack routers, fine-tunes, and complex retrieval pipelines on top of multiple LLMs. OpenAI didn't. Their data agent runs on a single model and only 13 tools, across 1.5 exabytes and 90,000 tables. It's "pretty vanilla" by design. We spoke with Emma Tang, Head of Data Platform Engineering at OpenAI, to better understand the architecture and the engineering decisions behind it. The article covers: - The architecture behind the data agent - The six layers of context that make a single LLM reliable across 90,000 tables - How OpenAI Uses Codex Internally: 3 Use Cases - Five practical lessons for any team building a domain agent - Where OpenAI's data platform is headed next

alexxubyte's tweet photo. How OpenAI Built Its Data Agent

Most teams building data agents stack routers, fine-tunes, and complex retrieval pipelines on top of multiple LLMs. OpenAI didn't.

Their data agent runs on a single model and only 13 tools, across 1.5 exabytes and 90,000 tables. It's "pretty vanilla" by design.

We spoke with Emma Tang, Head of Data Platform Engineering at OpenAI, to better understand the architecture and the engineering decisions behind it.

The article covers:
- The architecture behind the data agent
- The six layers of context that make a single LLM reliable across 90,000 tables
- How OpenAI Uses Codex Internally: 3 Use Cases
- Five practical lessons for any team building a domain agent
- Where OpenAI's data platform is headed next

584

110

588

44K

Who to follow

Jesutofunmi

@OluwaladeJesut1

I build products people love using from day one and fix the ones they don't | product designer open to remote roles, contracts and freelance worldwide

Top Universe

@topuniverse_org

Helping people break into tech & empowering future talents through training & development | NGO. Join our discord community - https://t.co/5UKicq8tqD

Agz

@AgozieEneagor

Digital Artist and brand illustrator

marvmargic retweeted

Ahmad

@TheAhmadOsman

8 days ago

Step-By-Step LLM Engineering Projects Roadmap - Build a tokenizer - Learn embeddings - Implement RoPE / ALiBi - Hand-wire attention - Build MHA - Build a Transformer block - Train a mini-former - Compare objectives - Build sampling - Speculative decoding - KV cache - MQA / GQA / MLA - Long context - FlashAttention - Hardware budgets - Toy MoE - Sparse model trade-offs - State-space / linear attention - Diffusion language models - Data pipelines - Synthetic data - Scaling laws - SFT / DPO / RLHF / GRPO - Quantization - Serving stacks - Eval harnesses - RAG - Tool use / agents - Vision-language adapters - Interpretability - Red-team suite - Full capstone model system One request: Choose an Opensource AI lab when you make it Opensource is where humanity gets to keep the tools DM me when you've made it ;)

262

120K

marvmargic retweeted

Suni

@suni_code

8 days ago

Found the Best Resource to learn Harness Engineering. 😭 https://t.co/3eOEmMlfbv

307

129K

marvmargic retweeted

Rohit Kumar Tiwari

@_rohit_tiwari_

8 days ago

AI Engineering from Scratch. 503 lessons. 20 phases. 320 hours. https://t.co/UuX9N62VCU Phase 00: Setup & Tooling (12 lessons) Phase 01: Math Foundations (22 lessons) Phase 02: ML Fundamentals (18 lessons) Phase 03: Deep Learning Core (13 lessons) Phase 04: Computer Vision (28 lessons) Phase 05: NLP (29 lessons) Phase 06: Speech & Audio (17 lessons) Phase 07: Transformers Deep Dive (14 lessons) Phase 08: Generative AI (14 lessons) Phase 09: Reinforcement Learning (12 lessons) Phase 10: LLMs from Scratch (22 lessons) Phase 11: LLM Engineering (15 lessons) Phase 12: Multimodal AI (25 lessons) Phase 13: Tools & Protocols (23 lessons) Phase 14: Agent Engineering (42 lessons) Phase 15: Autonomous Systems (22 lessons) Phase 16: Multi-Agent & Swarms (25 lessons) Phase 17: Infrastructure & Production (28 lessons) Phase 18: Ethics, Safety & Alignment (30 lessons) Phase 19: Capstone Projects (85 lessons)

_rohit_tiwari_'s tweet photo. AI Engineering from Scratch.

503 lessons. 20 phases. 320 hours.

https://t.co/UuX9N62VCU

Phase 00: Setup & Tooling (12 lessons)
Phase 01: Math Foundations (22 lessons)
Phase 02: ML Fundamentals (18 lessons)
Phase 03: Deep Learning Core (13 lessons)
Phase 04: Computer Vision (28 lessons)
Phase 05: NLP (29 lessons)
Phase 06: Speech & Audio (17 lessons)
Phase 07: Transformers Deep Dive (14 lessons)
Phase 08: Generative AI (14 lessons)
Phase 09: Reinforcement Learning (12 lessons)
Phase 10: LLMs from Scratch (22 lessons)
Phase 11: LLM Engineering (15 lessons)
Phase 12: Multimodal AI (25 lessons)
Phase 13: Tools & Protocols (23 lessons)
Phase 14: Agent Engineering (42 lessons)
Phase 15: Autonomous Systems (22 lessons)
Phase 16: Multi-Agent & Swarms (25 lessons)
Phase 17: Infrastructure & Production (28 lessons)
Phase 18: Ethics, Safety & Alignment (30 lessons)
Phase 19: Capstone Projects (85 lessons)

247

48K

marvmargic retweeted

Bally_AgenticAI

@bally_kehal

10 days ago

@AndrewYNg FDE exists because agentic systems aren't plug-and-play. The model is commoditized - the orchestration, eval loops, and domain-specific tuning are the real work. Built something similar - https://t.co/gLkuTFOo64

marvmargic retweeted

Andrew Ng

@AndrewYNg

10 days ago

One of the new, buzzy jobs in Silicon Valley is the AI Forward Deployed Engineer (FDE), an engineer who is embedded within a client organization to help customize solutions, such as building and tuning agentic workflows that suit the client’s particular needs. I’ve heard from people who are wondering anew about the FDE career path since OpenAI and Anthropic started building new teams to place FDEs within client organizations. The rise of FDEs for AI workloads is one way AI is creating new jobs (and why the jobpolcalypse narrative of upcoming job market collapse is false -- there will be many AI and non-AI jobs). However, I believe there will be far more AI Engineer jobs than FDEs, as I explain below. The FDE role was pioneered about two decades ago by Palantir, which sent engineers to government locations to work on secure, air-gapped networks. In addition to having good technical skills, FDEs need communication skills and sometimes business skills. For example, they may need to speak with clients to understand their needs, formulate a strategy to prioritize projects, explain complex technology, and respectfully push back if a client asks for something unrealistic. They’re enjoying a resurgence because of the amount of work involved in taking an off-the-shelf LLM and building it into a custom agentic workflow that fits particular business needs. However, I believe the number of AI Engineer jobs will be far larger. A company might accept a few FDEs to be embedded within its organization. But most companies will want far more of their own employees working on their projects. While my organizations do hire FDEs, we hire far more AI Engineers! Also, a common client concern is that it is hard to find vendor-neutral FDEs — they are, after all, there to deeply integrate a particular vendor’s product into a company. In this moment when it’s hard to predict which AI service will be the best one in a year’s time, optionality (the ability to pick whatever vendor turns out to fit best in the future) is very valuable. In contrast, letting FDEs tightly bind a company’s processes significantly reduces optionality. Right now, I see surging demand for AI Engineers who can build software applications using AI software components (like LLM prompting, agentic frameworks, evals, etc.) and effectively use AI coding agents (like Claude Code, Codex, Antigravity CLI, and OpenCode). As the AI Engineer role matures, I expect it to fragment into more specialized roles, like the generic Software Engineer role from decades ago fragmented into frontend, backend, mobile, data engineering, devops, and so on. What will be the future, specialized AI engineering roles? I don’t know. Perhaps there will be AI FDEs, LLMOps Engineers, Evals Engineers, AI Data Engineers, Harness Engineers, and other roles we don’t have names for yet. But for now, I see a lot of AI engineers who are generalists create a lot of value. Skilled AI Engineers are in very high demand! As our field continues to mature over the coming decade, I look forward to new specializations within AI Engineering that create even more job opportunities. [Original text: The Batch newsletter]

313

732

547K

marvmargic retweeted

Vaishnavi

@_vmlops

10 days ago

Agentic AI - A Complete Learning Guide for High School Students https://t.co/QRU2VqUNFG

797

164

964

34K

marvmargic retweeted

Dan Kornas

@DanKornas

16 days ago

AI system design interviews are messy. This repo gives you a map. AI System Design Guide is a practical GitHub reference for engineers preparing for AI system design interviews and building production AI systems. It helps you avoid random tutorial-hopping by organizing the path around interview prep, RAG, agents, model selection, evaluation, security, reliability, and case studies. Key features: • Goal-based navigation – jump straight to interview prep, RAG, agents, model picking, evals, role transition, or the glossary • Production AI coverage – chapters span retrieval systems, agentic systems, MLOps, security, reliability, safety, and observability • Interview prep included – README points to a 110-question bank, answer frameworks, and whiteboard exercises • Case-study library – architecture prompts cover search, coding agents, multi-tenant SaaS, support automation, document intelligence, and more • Bonus eval guides – companion guides cover Phoenix, Langfuse, LangWatch, judges, RAG evals, tracing, and drift detection It’s open-source (MIT license). Link in the reply 👇

DanKornas's tweet photo. AI system design interviews are messy. This repo gives you a map.

AI System Design Guide is a practical GitHub reference for engineers preparing for AI system design interviews and building production AI systems.

It helps you avoid random tutorial-hopping by organizing the path around interview prep, RAG, agents, model selection, evaluation, security, reliability, and case studies.

Key features:

• Goal-based navigation – jump straight to interview prep, RAG, agents, model picking, evals, role transition, or the glossary
• Production AI coverage – chapters span retrieval systems, agentic systems, MLOps, security, reliability, safety, and observability
• Interview prep included – README points to a 110-question bank, answer frameworks, and whiteboard exercises
• Case-study library – architecture prompts cover search, coding agents, multi-tenant SaaS, support automation, document intelligence, and more
• Bonus eval guides – companion guides cover Phoenix, Langfuse, LangWatch, judges, RAG evals, tracing, and drift detection

It’s open-source (MIT license).

Link in the reply 👇

421

546

15K

marvmargic retweeted

Vaishnavi

@_vmlops

15 days ago

MICROSOFT DROPPED A PYTEST FRAMEWORK FOR TESTING AI AGENTS and most devs building agents have no idea this exists it's called RAMPART and it fits right into your existing test suite here's what it covers: ▫️ adversarial attacks on your agent ▫️ benign failure modes you didn't think about ▫️ harm category testing across a wide range ▫️ assertion-based evaluation (not manual checking) ▫️ 100% pytest-native no new tooling to learn you already write pytest for your backend now you can write the same kind of tests for your ai agent's safety if you're shipping agents to real users and skipping this step, you're just hoping nothing goes wrong hope is not a test suite https://t.co/rwKgdxVeGi

319

441

23K

marvmargic retweeted

Elon Musk

@elonmusk

19 days ago

Real-time video of Starship from Starlink

136K

10K

32M

marvmargic retweeted

Mohit Bansal

@mohitban47

20 days ago

🚨 Outcome rewards in LLM RL are sparse --> AVSD (Adaptive-View Self-Distillation) turns privileged info into dense token-level supervision, and instead of relying on only one privileged view, it combines multiple views and balances stable cross-view consensus vs. potentially noisy view-specific signals. Privileged views such as full solutions, partial rationales, final answers, reference code, and feedback can all help, but none is consistently the best. AVSD uses consensus across views as the reliable update direction, then adds a view-specific residual only when it aligns with that consensus and is bounded. The result is a richer but still stable learning signal, leading to consistent gains on several math and code benchmarks across model families for each configuration we test. 🧵👇

marvmargic retweeted

Elon Musk

@elonmusk

20 days ago

Breathtakingly Beautiful

18K

401K

37K

10M

marvmargic retweeted

avrl ☘

@avrldotdev

25 days ago

10 Projects that will make you master in low level programming. Start from simple & move up as you go. You will learn memory management, syscalls, networking, sockets, DB & OS basics.

avrldotdev's tweet photo. 10 Projects that will make you master in low level programming.

Start from simple & move up as you go.

You will learn memory management, syscalls, networking, sockets, DB & OS basics. https://t.co/WZbJbzM7YH

612

706

17K

marvmargic retweeted

Elon Musk

@elonmusk

20 days ago

Congratulations @SpaceX team on an epic first Starship V3 launch & landing! You scored a goal for humanity.

218K

17K

23M

marvmargic retweeted

Elon Musk

@elonmusk

25 days ago

Where will AI be in 1, 2 or 3 years?

108K

14K

33K

40M

marvmargic retweeted

How To AI

@HowToAI_

29 days ago

Google has quietly dropped what researchers are calling "Attention Is All You Need V2." And it signals the end of the Transformer era as we know it. In 2017, the original "Attention Is All You Need" paper changed the world by proving that AI doesn't need recurrence, it just needs to pay attention. But today, even the most advanced models like GPT and Gemini suffer from a massive, structural flaw: Catastrophic Forgetting. The moment an AI learns something new, it starts losing what it learned before. It’s why AI "hallucinates" or loses the thread in long conversations. This paper, titled "Nested Learning: The Illusion of Deep Learning Architectures," completely replaces the way AI stores information. The researchers have introduced a paradigm shift called Nested Learning (NL). Here is why this is "V2": For the last decade, we treated AI models as one giant, flat mathematical function. NL proves that a model is actually a set of thousands of smaller, "nested" optimization problems running in parallel. Instead of one giant "memory," each layer has its own internal "context flow." This allows the model to learn new tasks at test-time without overwriting its core intelligence. It moves us past the static Transformer. The new architecture (HOPE) demonstrated 100% stability in long-context memory and "post-training adaptation" that was previously impossible. The technical takeaway is brutal for the competition: Existing deep learning works by compressing information until it breaks. Nested Learning works by organizing information so it can grow forever. We’ve spent 7 years trying to make Transformers bigger. Google figured out how to make them "Nested." The Transformer replaced the RNN in 2017. Nested Learning is here to replace the Transformer in 2026.