Abdullah Mamun @AB9Mamun - Twitter Profile

Pinned Tweet

4 months ago

🚀 New Publication! My paper has been accepted for publication in Progress in Biomedical Engineering (IF 7.7): “Trustworthy AI in Digital Health: A Comprehensive Review of Robustness and Explainability.” Covers key methods, challenges, and metrics for building reliable healthcare AI. Includes 200 highly relevant references. 📖 Paper: https://t.co/HOeCIfpyUn @IOPPublishing @asuhealth @SCAI_ASU #TrustworthyAI #DigitalHealth #ExplainableAI #HealthcareAI #MachineLearning #LLMs #ASU

1

2

0

117

Abdullah Mamun

@AB9Mamun

30 days ago

I had the privilege of taking Prof. Dimitri Bertsekas’ RL course. A mind that reshaped optimization and RL, yet so kind and humble. An incredibly busy schedule, yet so quick to respond and provide feedback. A lifetime devoted to advancing science. Rest in peace, Professor.

AB9Mamun's tweet photo. I had the privilege of taking Prof. Dimitri Bertsekas’ RL course. A mind that reshaped optimization and RL, yet so kind and humble. An incredibly busy schedule, yet so quick to respond and provide feedback. A lifetime devoted to advancing science. Rest in peace, Professor. https://t.co/9WpkuP3foK

0

2

0

1

187

Abdullah Mamun

@AB9Mamun

about 1 month ago

@mmiemon Great work, @mmiemon!

0

3

Abdullah Mamun

@AB9Mamun

2 months ago

#Publication_Alert Making Childbirth Safer with Explainable AI 🤰👶 I’m happy to share that our paper, “Use of What-if Scenarios to Help Explain Artificial Intelligence Models for Neonatal Health,” has been accepted for publication in ACM Transactions on Computing for Healthcare (ACM Health) (Impact Factor: 8.0). 📄 Paper: https://t.co/fLexbfgMar 💻 Code: https://t.co/v5BTFABXK0 In this work, we propose AIMEN, a deep learning framework for predicting adverse labor outcomes while providing interpretable, counterfactual explanations. Key contributions: • Conditional GAN-based data augmentation to address data limitations • Explore how different restrictions on data augmentation affect the prediction performance • Strong predictive performance (F1: 0.784), outperforming state-of-the-art models • Actionable insights through minimal feature changes to support clinical decisions #MachineLearning #HealthcareAI #ExplainableAI #MaternalHealth #NeonatalHealth #AIResearch

AB9Mamun's tweet photo. #Publication_Alert
Making Childbirth Safer with Explainable AI 🤰👶
I’m happy to share that our paper, “Use of What-if Scenarios to Help Explain Artificial Intelligence Models for Neonatal Health,” has been accepted for publication in ACM Transactions on Computing for Healthcare (ACM Health) (Impact Factor: 8.0).

📄 Paper: https://t.co/fLexbfgMar
💻 Code: https://t.co/v5BTFABXK0

In this work, we propose AIMEN, a deep learning framework for predicting adverse labor outcomes while providing interpretable, counterfactual explanations.

Key contributions:
• Conditional GAN-based data augmentation to address data limitations
• Explore how different restrictions on data augmentation affect the prediction performance
• Strong predictive performance (F1: 0.784), outperforming state-of-the-art models
• Actionable insights through minimal feature changes to support clinical decisions

#MachineLearning #HealthcareAI #ExplainableAI #MaternalHealth #NeonatalHealth #AIResearch

1

0

61

Who to follow

2 months ago

There’s no alternative to finding compute-efficient alternatives to transformers if AI is to scale long term. Kudos to @GoogleResearch for pushing that frontier.

Google Research

@GoogleResearch

2 months ago

Google presents a new Transformer alternative at #ICLR2026! Join Nino Scherrer & Yanick Schimpf at the Google booth (#411) at 10AM to learn about MesaNet, proposing a new linear sequence layer that optimally learns in-context given a fixed memory budget.

GoogleResearch's tweet photo. Google presents a new Transformer alternative at #ICLR2026! Join Nino Scherrer & Yanick Schimpf at the Google booth (#411) at 10AM to learn about MesaNet, proposing a new linear sequence layer that optimally learns in-context given a fixed memory budget. https://t.co/OFo0Qtb1KI

37

953

130

540

61K

0

47

Abdullah Mamun

@AB9Mamun

2 months ago

DeepSeek V4 drops, GPT 5.5 lands, and Anthropic, Meta, and Google are all accelerating at once. April 2026 has been relentless.

DeepSeek

@deepseek_ai

2 months ago

🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length. 🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models. 🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice. Try it now at https://t.co/GCdiMzk1Dl via Expert Mode / Instant Mode. API is updated & available today! 📄 Tech Report: https://t.co/drlDrxkYtp 🤗 Open Weights: https://t.co/T13Y8i7SDM 1/n

deepseek_ai's tweet photo. 🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.

🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models.
🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice.

Try it now at https://t.co/GCdiMzk1Dl via Expert Mode / Instant Mode. API is updated & available today!

📄 Tech Report: https://t.co/drlDrxkYtp
🤗 Open Weights: https://t.co/T13Y8i7SDM

1/n

2K

46K

8K

10K

10M

0

80

AB9Mamun retweeted

Alexandr Wang

@alexandr_wang

3 months ago

1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵

alexandr_wang's tweet photo. 1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵 https://t.co/fThDXdsxwB

752

10K

1K

3K

5M

AB9Mamun retweeted

Mark Cuban

@mcuban

4 months ago

I’m going to tell you how much worse it was at the start of the PC Revolution for white collar workers trying to adapt, vs today with AI Today, presumably every white collar worker has access to a smart phone and/or a PC/laptop. Back then, a PC cost $4,995 , an off brand was $3,995. 5k in 1984 is about $16k today. It was really expensive. The only reason I could learn how to code and support software is because my job let me take home a PC to learn. By reading the software manual. Literally. RTFM. Or pay to go to training. Classes that started at hundreds of dollars then. It was expensive. It absolutely limited who could get ahead. Today, ANYONE can go to their browser, to the AI LLM website of their choice, and type in the words “I’m a novice with zero computer background, teach me how to create an agent that reads my email and …” That concept applies to LEARNING ANYTHING Think about what this means. Any employee of any company can say “ I need to learn how to xyz for my job , which is to do the following: Tell me what more information do you need to help me be more efficient, productive and promotable”. Or “ what new skills can you teach me that will help me reduce my chances of getting laid off “. Or “what suggestions do you have for me to communicate to my boss, who I barely know, to help my chances of staying employed “ These aren’t great prompts. But they are a start that anyone can take. Think about how incredible that is. Back in the day was so much harder for white collar workers. It was harder for new grads because unless they took comp sci, they probably had never used a PC. Big Companies are going to cut jobs. No question about it. Small companies is are going to need more and more AI literate thinkers who can help them compete or get an edge What I tell every entrepreneur, and it’s more crucial today. “ when you run with the elephants there are the quick and the dead. Adopt tech quickly , you can out maneuver big companies. “

309

4K

460

2K

935K

AB9Mamun retweeted

Sebastian Raschka

@rasbt

3 months ago

A visual guide to modern LLM attention variants, all in one place: https://t.co/vy1DAEKXU8

37

2K

357

2K

106K

Abdullah Mamun

@AB9Mamun

4 months ago

@mmiemon Thanks, Emon!

0

6

Abdullah Mamun

@AB9Mamun

4 months ago

🚀 New Publication! My paper has been accepted for publication in Progress in Biomedical Engineering (IF 7.7): “Trustworthy AI in Digital Health: A Comprehensive Review of Robustness and Explainability.” Covers key methods, challenges, and metrics for building reliable healthcare AI. Includes 200 highly relevant references. 📖 Paper: https://t.co/HOeCIfpyUn @IOPPublishing @asuhealth @SCAI_ASU #TrustworthyAI #DigitalHealth #ExplainableAI #HealthcareAI #MachineLearning #LLMs #ASU

1

2

0

117

AB9Mamun retweeted

Google DeepMind @GoogleDeepMind

4 months ago

Nano Banana 2 makes sophisticated visual creation faster, cheaper, and accessible to everyone. 🍌 Tap on each photo to see the details 👀

GoogleDeepMind's tweet photo. Nano Banana 2 makes sophisticated visual creation faster, cheaper, and accessible to everyone. 🍌

Tap on each photo to see the details 👀

61

907

56

147

74K

AB9Mamun retweeted

Jack Morris

@jxmnop

5 months ago

at long last, the final paper of my phd 🧮 Learning to Reason in 13 Parameters 🧮 we develop TinyLoRA, a new ft method. with TinyLoRA + RL, models learn well with dozens or hundreds of params example: we use only 13 parameters to train 7B Qwen model from 76 to 91% on GSM8K 🤯

jxmnop's tweet photo. at long last, the final paper of my phd

🧮 Learning to Reason in 13 Parameters 🧮

we develop TinyLoRA, a new ft method. with TinyLoRA + RL, models learn well with dozens or hundreds of params

example: we use only 13 parameters to train 7B Qwen model from 76 to 91% on GSM8K 🤯

60

2K

230

1K

183K

AB9Mamun retweeted

Google Research

@GoogleResearch

5 months ago

A fundamental challenge in machine learning, feature selection is NP-hard (i.e., a problem that is mathematically "impossible" to solve perfectly and quickly for large groups of data), which makes it a highly challenging area of research. We introduce Sequential Attention, an algorithm that optimizes subset selection in large-scale ML models. This is an effective technique for multiple large-scale subset selection problems in deep learning and plays a key role in model architecture optimization. As these techniques evolve, they will solidify the future of machine learning.

11

546

65

280

35K

AB9Mamun retweeted

Simon Willison

@simonw

6 months ago

Here's my enormous round-up of everything we learned about LLMs in 2025 - the third in my annual series of reviews of the past twelve months https://t.co/HD9Zf85SG2 This year it's divided into 26 sections! This is the table of contents:

simonw's tweet photo. Here's my enormous round-up of everything we learned about LLMs in 2025 - the third in my annual series of reviews of the past twelve months
https://t.co/HD9Zf85SG2
This year it's divided into 26 sections! This is the table of contents: https://t.co/DFlzgXudLy

102

5K

868

6K

510K

AB9Mamun retweeted

Andrej Karpathy

@karpathy

9 months ago

nanochat now has a primordial identity and can talk a bit about itself and its capabilities (e.g. it knows it's nanochat d32 that cost $800, that it was built by me, that it can't speak languages other than English too well and why, etc.). This kind of customization is all done through synthetic data generation and I uploaded a new example script to demonstrate. It's a bit subtle but by default LLMs have no inherent personality or any understanding of their own capabilities because they are not animal-like entities. They don't know what they are or what they can or can't do or know or don't know. All of it has to be explicit bolted on. This is done by asking a bigger LLM cousin to generate synthetic conversations (you tell it what they should look like simply in words), and then mixing them into midtraining and/or SFT stage. The most important challenge is ensuring enough entropy/diversity in your generated data. If you don't do it well, LLMs will generate 1000 conversations that are all ay too similar, even with high temperature. My script shows a crappy example of how to add diversity - e.g. by creating lists of starting messages or topics, sampling from them explicitly, adding them as fewshot examples into prompts for "inspiration", etc. I wanted to have some fun with it so nanochat now refers to me as King Andrej Karpathy (lol) just to illustrate that this is a giant blank canvas - you can infuse completely arbitrarily identity, knowledge or style into your LLM in this manner. I hope it's helpful and sparks fun ideas!

155

4K

231

1K

462K

Abdullah Mamun

@AB9Mamun

9 months ago

@mmiemon This is cool. Do you think it can be used to empower RAGs in other LLMs or SLMs?

0

6

AB9Mamun retweeted

Rohan Paul

@rohanpaul_ai

9 months ago

🚨 BAD news for Medical AI models. MASSIVE revelations from this @Microsoft paper. 🤯 Current medical AI models may look good on standard medical benchmarks but those scores do not mean the models can handle real medical reasoning. The key point is that many models pass tests by exploiting patterns in the data, not by actually combining medical text with images in a reliable way. The key findings are that models overuse shortcuts, break under small changes, and produce unfaithful reasoning. This makes the medical AI model's benchmark results misleading if someone assumes a high score means the model is ready for real medical use. --- The specific key findings from this paper 👇 - Models keep strong accuracy even when images are removed, even on questions that require vision, which signals shortcut use over real understanding. - Scores stay above the 20% guess rate without images, so text patterns alone often drive the answers. - Shuffling answer order changes predictions a lot, which exposes position and format bias rather than robust reasoning. - Replacing a distractor with “Unknown” does not stop many models from guessing, instead of abstaining when evidence is missing. - Swapping in a lookalike image that matches a wrong option makes accuracy collapse, which shows vision is not integrated with text. - Chain of thought often sounds confident while citing features that are not present, which means the explanations are unfaithful. - Audits reveal 3 failure modes, incorrect logic with correct answers, hallucinated perception, and visual reasoning with faulty grounding. - Gains on popular visual question answering do not transfer to report generation, which is closer to real clinical work. - Clinician reviews show benchmarks measure very different skills, so a single leaderboard number misleads on readiness. - Once shortcut strategies are disrupted, true comprehension is far weaker than the headline scores suggest. - Most models refuse to abstain without the image, which is unsafe behavior for medical use. - The authors push for a robustness score and explicit reasoning audits, which signals current evaluations are not enough. 🧵 Read on 👇

rohanpaul_ai's tweet photo. 🚨 BAD news for Medical AI models.

MASSIVE revelations from this @Microsoft paper.

🤯 Current medical AI models may look good on standard medical benchmarks but those scores do not mean the models can handle real medical reasoning.

The key point is that many models pass tests by exploiting patterns in the data, not by actually combining medical text with images in a reliable way.

The key findings are that models overuse shortcuts, break under small changes, and produce unfaithful reasoning.

This makes the medical AI model's benchmark results misleading if someone assumes a high score means the model is ready for real medical use.

---

The specific key findings from this paper 👇

- Models keep strong accuracy even when images are removed, even on questions that require vision, which signals shortcut use over real understanding.

- Scores stay above the 20% guess rate without images, so text patterns alone often drive the answers.

- Shuffling answer order changes predictions a lot, which exposes position and format bias rather than robust reasoning.

- Replacing a distractor with “Unknown” does not stop many models from guessing, instead of abstaining when evidence is missing.

- Swapping in a lookalike image that matches a wrong option makes accuracy collapse, which shows vision is not integrated with text.

- Chain of thought often sounds confident while citing features that are not present, which means the explanations are unfaithful.

- Audits reveal 3 failure modes, incorrect logic with correct answers, hallucinated perception, and visual reasoning with faulty grounding.

- Gains on popular visual question answering do not transfer to report generation, which is closer to real clinical work.

- Clinician reviews show benchmarks measure very different skills, so a single leaderboard number misleads on readiness.

- Once shortcut strategies are disrupted, true comprehension is far weaker than the headline scores suggest.

- Most models refuse to abstain without the image, which is unsafe behavior for medical use.

- The authors push for a robustness score and explicit reasoning audits, which signals current evaluations are not enough.

🧵 Read on 👇

171

4K

818

3K

523K

Abdullah Mamun

@AB9Mamun

9 months ago

📖 A high-quality blog post by @Han_Fang_ and @karthikabinav of Meta SuperIntelligence Labs.

Han Fang

@Han_Fang_

10 months ago

🧠 Tokens for Thoughts is live! @karthikabinav & I are distilling our reading notes into bite-sized LLM notebooks—come read papers with us. Ep. 1: Post-Training 101 (SFT + RL primer) 👉 https://t.co/6L3oEyeoJa DMs or replies open for topic requests!

0

54

8

39

4K

0

1

0

148

AB9Mamun retweeted

Alexandr Wang

@alexandr_wang

9 months ago

new research from Meta FAIR: Code World Model (CWM), a 32B research model we encourage the research community to research this open-weight model! pass@1 evals, for the curious: 65.8 % on SWE-bench Verified 68.6 % on LiveCodeBench 96.6 % on Math-500 76.0 % on AIME 2024 🧵

alexandr_wang's tweet photo. new research from Meta FAIR: Code World Model (CWM), a 32B research model

we encourage the research community to research this open-weight model!

pass@1 evals, for the curious:

65.8 % on SWE-bench Verified
68.6 % on LiveCodeBench
96.6 % on Math-500
76.0 % on AIME 2024

🧵 https://t.co/OawFRX91zf

94

1K

153

424

869K

Abdullah Mamun

@AB9Mamun

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users