John Alpha

@afrodrive

ML Engineer. AI researcher. Promoting ethical, safe and trustworthy AI to humanity.

Cape Town, South Africa

Joined January 2018

2.2K Following

1.9K Followers

1.9K Posts

Pinned Tweet

John Alpha @afrodrive

about 2 years ago

This is a gem

Csaba Kissi

@csaba_kissi

about 2 years ago

Earn $3.000+/month with your coding skills:
- Make templates for NextJS or Framer
- Develop WordPress themes
- Create micro SaaS
- Offer consulting services
- Build custom plugins for Shopify
- Provide online coding courses
- Develop mobile apps
- Design custom websites

286

184

36K

171

afrodrive retweeted

Arnaud Dyevre @ArnaudDyevre

over 1 year ago

I just read the paper in full; it is even more spectacular than I initially thought. A short thread about the results and their significance.

15K

24K

afrodrive retweeted

AshutoshShrivastava

@ai_for_success

over 1 year ago

What if I told you this was never a joke from Sam Altman? AGI has indeed been achieved internally.

152

284

309K

afrodrive retweeted

Jean de Dieu Nyandwi

@Jeande_d

over 1 year ago

Applied Machine Learning - Cornell CS5785

"Starting from the very basics, covering all of the most important ML algorithms and how to apply them in practice. Executable Jupyter notebooks (and as slides)". 80 videos!!

Videos: https://t.co/KGQBLQ37ou
Code: https://t.co/hTqxsLt6wu

420

128K

afrodrive retweeted

elvis

@omarsar0

over 1 year ago

We live in incredible times.

968

412K

afrodrive retweeted

The AI Timeline

@TheAITimeline

over 1 year ago

🚨This week’s top AI/ML research papers:

- Mixture-of-Transformers
- BitNet a4.8
- LoRA vs Full Fine-tuning: An Illusion of Equivalence
- Mixtures of In-Context Learners
- Emergence of Hidden Capabilities
- DimensionX
- The Surprising Effectiveness of Test-Time Training for Abstract Reasoning
- OpenCoder: The Open Cookbook for Top-Tier Code LLMs
- ReCapture
- Needle Threading
- M3DocRAG
- Controlling Language and Diffusion Models by Transporting Activations
- Why Do We Need Weight Decay in Modern Deep Learning?
- "Give Me BF16 or Give Me Death"? Trade-Offs in LLM Quantization
- Adaptive Caching for Faster Video Generation with Diffusion Transformers
- Constant Acceleration Flow
- Randomized Autoregressive Visual Generation
- Physics in Next-token Prediction
- In-Context LoRA for Diffusion Transformers
- Balancing Pipeline Parallelism with Vocabulary Parallelism
- EoRA: Eigenspace Low-Rank Approximation
- Self-Consistency Preference Optimization
- How Transformers Solve Propositional Logic Problems: A Mechanistic Analysis
- LASER: Attention with Exponential Transformation
- Photon: Federated LLM Pre-Training
- Attacking Vision-Language Computer Agents via Pop-ups
- Hunyuan-Large
- Context Parallelism for Scalable Million-Token Inference
- Stealing User Prompts from Mixture of Experts
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond
- Does your LLM truly unlearn? An embarrassingly simple approach to recover unlearned knowledge

overview for each + authors' explanations
read this in thread mode for the best experience

954

112

874

104K

afrodrive retweeted

maharshi

@maharshii

over 1 year ago

looking for ideas: what ML/software projects are you currently working on?

145

106

429K

afrodrive retweeted

Elliot Arledge

@elliotarledge

over 1 year ago

for those of you scouting for "AI projects" to work on in your free time, i figure i would share the list of projects im currently doing to get a sense of how i carefully pick out problems:

1. write a training run in CUDA for some neural net you find cool. i started off w/ a single hidden layer MLP and gave myself a template to build from there

2. tinker w/ minecraft AI agents. multiple parts to this which make it even more fun (yes, you will get rapid dopamine hits while building). first you have to get an environment working (pixel output + action input) which i highly recommend "minerl" for. then you have to collect a bunch of training data in the form of minecraft video clips with actions each frame of the video (either your own or a massive corpus online -> also comes with minerl). next, you train the first neural net to take in a sequence of images from a video and you try to predict all the actions done in a given frame (say 7 frames total where the one you are trying to predict actions for is the middle frame -> temporal dimension helps performance). after this smaller neural net knows how to generate action labels from any given minecraft video, you can scrape thousands of hours of gameplay from youtube and run your small "data generator" neural net in inference mode to get yourself a nice dataset. then you test and tinker with different neural net architectures to actually attempt to reach goals in minecraft. you could use only neural nets, or neural nets w/ a mix of fixed algos (state machines, conditionals).

3. mechanistic interpretability on computer vision networks and small language models. i personally started off by training an MLP from scratch on the mnist dataset up to around 98% accuracy then used matplotlib to print out the weights and activations in certain ways. helps you understand how neural nets form patterns internally in order to predict labels for the data you train it on. then you could use libraries like transformer_lens to visualize the attention heads at each layer for any given prompt in llms like gpt2-medium / small. if you're gonna go beyond just looking at the raw patterns with your own eyes, consider playing with sparse autoencoders (if you find the resources hard to navigate just shoot me a dm). they essentially take a bunch of dense values in activations and project them to a sparse tensor format so you can map sparse signals to features (keywords you'll find here are: superposition/polysemanticity & dictionary learning).

4. fire up an instance with at least 2 3090s or 4090s and try to train a neural net of your choice across them with pytorch DDP (DistributedDataParallel) to give yourself an intro to distributed training/inference.

5. or if these sound crazy and you want a surprise, try implementing a neural net paper from arxiv in pytorch (a paper on "differential attention" came out recently which i'd like to mess around with. you could too)

6. if you think some resources need to be explained or documented better and you see the value in doing so, consider making a tutorial and posting it on X or youtube (i posted a few courses on the freecodecamp youtube channel and people liked them a lot).

again, please send me a DM if you have any questions. would love to hear you out and possibly help steer you in the right direction based on your interests :)

664

68K
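The frame-windowing step in item 2 of Elliot Arledge's post above (7 frames in, actions for the middle frame out) could be sketched roughly as below. This is only an illustration under assumptions: `centered_windows` is a hypothetical helper name, frames and actions are stand-in Python values rather than real video tensors, and nothing here uses the minerl API.

```python
def centered_windows(frames, actions, width=7):
    """Pair each run of `width` consecutive frames with the action
    recorded at the middle frame of that run (hypothetical helper)."""
    if width % 2 == 0:
        raise ValueError("width must be odd so that a middle frame exists")
    if len(frames) != len(actions):
        raise ValueError("need exactly one action label per frame")

    half = width // 2
    pairs = []
    # Only frames with `half` neighbours on each side can be a window centre.
    for mid in range(half, len(frames) - half):
        window = frames[mid - half : mid + half + 1]
        pairs.append((window, actions[mid]))
    return pairs


if __name__ == "__main__":
    # Stand-in data: integers for frames, strings for per-frame actions.
    frames = list(range(10))
    actions = [f"a{i}" for i in range(10)]
    for window, label in centered_windows(frames, actions):
        print(window, "->", label)
```

Each pair would then be one training example for the small labelling network the post describes; frames near the start and end of a clip are skipped because they lack enough neighbours.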

John Alpha @afrodrive

over 1 year ago

@cecil_nyasha @daddyhope @TateMavetera This is what she said in her LinkedIn Post

John Alpha @afrodrive

over 1 year ago

@cecil_nyasha @daddyhope @TateMavetera Are you sure you know what she said at the Potraz Breakfast Meeting? Read her LinkedIn post. In future don't comment about things you have no idea about and also leave IT and Law stuff to the right professionals in the field. https://t.co/ase4PTgjrl

John Alpha @afrodrive

over 1 year ago

@cecil_nyasha @daddyhope @TateMavetera

afrodrive retweeted

Tim Denning

@Tim_Denning

over 1 year ago

This man destroyed wokeism:

Naval Ravikant.

He was an early investor in Notion, Twitter & Uber & is worth $600M+

He's been on fire lately on X.

Here is Naval's updated philosophy: https://t.co/NUHCd3utRh

411

44K

22K

afrodrive retweeted

minami

@minamisatokun

over 1 year ago

Hi ML peeps, how much Calculus is needed before I start jumping into DL? Could you just cross-check if these topics are enough:
- Fundamental Ideas, Rates and Differentials
- Functions and Derivatives
- Differentials of Algebraic Functions
- Use of Rates and Differentials
- Differentials of Trigonometric Functions
- Velocity, Acceleration and Derivatives
- Interpretation of Functions and Derivatives by Means of Graphs
- Maximum and Minimum Values
- Maxima and Minima
- Differentials of Log and Exponential Functions
- Integral Formulas

500

615

73K

afrodrive retweeted

Rohan Paul

@rohanpaul_ai

over 1 year ago

"Understanding LLMs from Scratch Using Middle School Math"

Neural networks learn to predict text by converting words to numbers and finding patterns through attention mechanisms.

So the network turns words into numbers, then uses attention to decide what's important for predicting the next words.

Nice long blog (40-minute reading time), check the link in comment.

494

306K
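The "turn words into numbers, then use attention to decide what's important" idea in Rohan Paul's post above can be illustrated with a small sketch of scaled dot-product attention in plain Python. It is not taken from the linked blog: the function names `softmax` and `attend` and the toy 2-dimensional vectors are assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Each key is scored against the query, the scores become weights,
    and the output is the weighted average of the value vectors.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


if __name__ == "__main__":
    # Toy "word vectors": the query points the same way as the first key,
    # so the first value should receive the larger weight.
    query = [1.0, 0.0]
    keys = [[1.0, 0.0], [0.0, 1.0]]
    values = [[10.0, 0.0], [0.0, 10.0]]
    weights, output = attend(query, keys, values)
    print("weights:", weights)
    print("output:", output)
```

In a real model the query, key, and value vectors come from learned projections of the word embeddings; here they are hard-coded so the weighting step is easy to follow.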

afrodrive retweeted

Arvid Kahl

@arvidkahl

over 1 year ago

I want to run AI agents to scrape specific URLs and do some data extraction until they find a certain kind of information. Like an AI investigator. What’s the framework that allows for this kind of cutting-edge stuff? Is there an AI agent project that we should be using?

250

151

884K

afrodrive retweeted

Demetri Kofinas

@kofinas

over 1 year ago

"Generative AI puts [ad-buying] on steroids: advertisers can provide Meta with broad parameters and brand guidelines and let the black box not just test out a few pieces of creative but an effectively unlimited amount. Critically, this generative AI application has a verification function: did the generated ad generate more revenue or less?"

afrodrive retweeted

The AI Timeline

@TheAITimeline

over 1 year ago

🚨This week’s top AI/ML research papers:

- GPT-4o System Card
- Are LLMs Better than Reported?
- Can Language Models Replace Programmers?
- CLEAR
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking
- SelfCodeAlign
- Mixture of Parrots
- Unpacking SDXL Turbo
- A prescriptive theory for brain-like inference
- Modular Duality in Deep Learning
- Learning Video Representations without Natural Videos
- CORAL
- Task Vectors are Cross-Modal
- Mind Your Step (by Step)
- ShadowKV
- MarDini
- COAT
- Fast Best-of-N Decoding via Speculative Rejection
- Continuous Speech Synthesis using per-token Latent Diffusion
- Teach Multimodal LLMs to Comprehend Electrocardiographic Images
- FasterCache
- Read-ME
- VibeCheck
- HoPE
- In-Context LoRA for Diffusion Transformers
- Knowledge Graph Enhanced Language Agents for Recommendation
- $100K or 100 Days
- On Memorization of Large Language Models in Logical Reasoning
- Unveiling the Hidden Structure of Self-Attention via Kernel Principal Component Analysis
- Grounding by Trying
- Relaxed Recursive Transformers
- Combining Induction And Transduction For Abstract Reasoning

overview for each + authors' explanations
read this in thread mode for the best experience

590

552

65K

afrodrive retweeted

The AI Timeline

@TheAITimeline

over 1 year ago

🚨This week’s top AI/ML research papers:

- Sparse Crosscoders
- Rethinking Softmax
- Mechanistic Unlearning
- Decomposing The Dark Matter of Sparse Autoencoders
- ZIP-FIT
- Automatically Interpreting Millions of Features in Large Language Models
- Breaking the Memory Barrier
- Can Knowledge Editing Really Correct Hallucinations?
- Framer: Interactive Frame Interpolation
- Beyond position
- A Hitchhiker's Guide to Scaling Law Estimation
- Scaling up Masked Diffusion Models on Text
- Why Does the Effective Context Length of LLMs Fall Short?
- Scaling Diffusion Language Models via Adaptation from Autoregressive Models
- Improve Vision Language Model Chain-of-thought Reasoning
- PyramidDrop
- FrugalNeRF
- SAM2Long
- SeerAttention
- FiTv2

overview for each + authors' explanations
read this in thread mode for the best experience

736

621

85K

afrodrive retweeted

@levelsio

over 1 year ago

Imagine being Sundar Pichai now:
- you had the largest continually updated data set of any company to train AI on (the Google Index)
- you invented the underlying technology of LLMs like ChatGPT in 2017, called Transformers
- you had complete search dominance: all you had to add was AI and you'd own the market

And yet:
- you managed to completely fumble your massive head start and were late to everything
- you made your APIs so hard to use that nobody seriously integrated them into their apps and people instead went to Anthropic and OpenAI
- you now see your search dominance quickly slipping away to Perplexity and ChatGPT Search, launched yesterday

This will be a business case studied in universities for decades

17K

afrodrive retweeted

Hadi Vafaii @hadivafaii

over 1 year ago

How do brains “infer” the world’s state from noisy sensory data—and do so “dynamically?”

Our new theoretical framework bridges these two perspectives in a brain-inspired model👉🧵[1/n]

w/ amazing co-lead @dekelgalor & polymath mentor @jcbyts
📜preprint: https://t.co/xmjFLintZb https://t.co/S3SkQhHNY2

438

394

39K
