Zining Zhu

@zhuzining

Assistant Professor @FollowStevens (2024-) PhD @UofT, @VectorInst Areas: #NLProc #Explainable #AI

Hoboken, New Jersey

Joined January 2014

609 Following

854 Followers

345 Posts

Zining Zhu @zhuzining

1 day ago

One single score can almost never tell {estimated AI participation extent, type of AI works, detector confidence, etc.} at the same time. Humans using predictor outputs for decisions should be more careful.

Harry Wang @harryjwang

1 day ago

NeurIPS 2026 just desk-rejected hundreds of papers because an AI detector said they were AI-written (https://t.co/P2IDyWWOTB).

I understand why they did it. Reviewer time is scarce, and a flood of slop submissions is a real threat to peer review.

But here's where it gets tricky. The Position Paper Track requires papers be "substantially human-written," so I ran one of my own through the same detector they used. It flagged 73% as AI-written — on pretty weak evidence (see screenshots).

The funny part? It's actually closer to 99%. I only hand-edited a few sentences. But the idea, the literature review, the positioning, the data analysis ... all human. The AI wrote the text, across many rounds of back-and-forth with me.

It's like coding: most code today is generated almost entirely by AI. Do we "reject" those systems too?

To me, whether the words were typed by a human matters less than the substance behind them.

What do you think?

PS. I ran this very post through the detector too. It came back 100% AI-generated — even though it took me almost 30 minutes to write, and I think it carries my thinking and is worth reading.

#AI #PeerReview #NeurIPS #AIDetection #AcademicResearch #MachineLearning #ResearchIntegrity #AIEthics #ScholarlyPublishing #AIWriting

210

127

115K

224

zhuzining retweeted

elie

@eliebakouch

about 1 month ago

Qwen first release on interpretability (qwen scope) is very interesting

they use SAE features to identify what causes repetition in model outputs, then use steering to manufacture a "bad" rollout where the model repeats a lot. this gives RL a clear negative signal to learn from, since repetition barely shows up in normal rollouts so the model never gets punished for it

they also use SAE features as a fingerprint for benchmarks, you look at which features each benchmark activates and compare overlap. lets you find redundancy inside a benchmark and across benchmarks without running any model. for instance 63% of GSM8K features are in MATH but only 10% the other way

782

118

614

41K

zhuzining retweeted

kaize

@0x_kaize

2 months ago

https://t.co/ScYWvkqPa4

227

11K

42K

12M

zhuzining retweeted

Zihan "Zenus" Wang

@wzenus

2 months ago

AutoResearch saves 90% time for research solution. But in research, 90% time is spent for the right question.

14K

zhuzining retweeted

Callum McDougall @calsmcdougall

3 months ago

Announcing new ARENA material: 8 new exercise sets on alignment science, interpretability & AI safety - each containing 1-2 days of structured, hands-on content replicating key papers in the field.

All open source on a public GitHub, and available for study. Here's what's in it: https://t.co/fmnZNHhrKn

612

598

84K

zhuzining retweeted

Jeff Dean

@JeffDean

4 months ago

This is absolutely shameful. Agents of a federal agency unnecessarily escalating, and then executing a defenseless citizen whose offense appears to be using his cell phone camera. Every person regardless of political affiliation should be denouncing this.

246

948

423

980K

zhuzining retweeted

Sarah Wiegreffe @sarahwiegreffe

6 months ago

TIL that ACL 2026's theme track is "Explainability of NLP Models"! 😮🤩 @aclmeeting

10K

zhuzining retweeted

Haojin Wang Applying 27 Fall PhD @haojinw2323

7 months ago

❌ LMs can’t express all next-token distributions — embedding-space limits constrain what’s possible. 🤔 But have you wondered which ones are hardest to elicit? Our #EMNLP2025 paper finds medium-entropy distributions (without outliers) are the toughest. 🧩 https://t.co/pVIO9e3Fo8

zhuzining retweeted

XLLM-Reason-Plan @XllmReasonPlan

8 months ago

@COLM_conf #COLM2025 Our last invited talk if you are still around: Prof. Zining Zhu is presenting "Improving LLM reasoning with mechanistic insights" @zhuzining https://t.co/WtCaZEtbVE

416

Zining Zhu @zhuzining

10 months ago

@frankniujc Congratulations!!

zhuzining retweeted

Paul Bogdan @paulcbogdan

12 months ago

New paper: What happens when an LLM reasons? We created methods to interpret reasoning steps & their connections: resampling CoT, attention analysis, & suppressing attention We discover thought anchors: key steps shaping everything else. Check our tool & unpack CoT yourself 🧵

773

149

824

124K

Zining Zhu @zhuzining

12 months ago

Reviewers should perhaps be prohibited from changing the scores they give on the day of seeing the scores of their own papers submitted to @ReviewAcl.

373

zhuzining retweeted

XLLM-Reason-Plan @XllmReasonPlan

12 months ago

🚨Deadline alert: If you work on LLM explainability for reasoning and planning, submit your work by June 23!
- Non-archival, two formats (long/short)
- Welcome recently accepted papers and dual submissions
- 🏆Two awards will be announced!
Details: https://t.co/ZzMT2BCxCy https://t.co/qEN1fADgPS

Zining Zhu @zhuzining

12 months ago

@sarahwiegreffe @umdcs Congratulations!

231

Zining Zhu @zhuzining

about 1 year ago

@wzhao_nlp https://t.co/ZfDo0gqGLq Self-promotion here: We did counterfactual reasoning. The best thing I like about ACCORD is the incorporation of formal reasoning settings into commonsense reasoning datasets.

Zining Zhu @zhuzining

over 1 year ago

Let's bring in more formal reasoning properties in the commonsense reasoning datasets! Introducing ACCORD https://t.co/tP9hRqPt99, to be presented at #NAACL2025
ACCORD allows (1) controllable reasoning path length, (2) controllable distraction items on the reasoning tree. These controls are (3) automatic and (4) scalable. 1/n

621

231

zhuzining retweeted

David Bau @davidbau

about 1 year ago

Dear MAGA friends, I have been worrying about STEM in the US a lot, because right now the Senate is writing new laws that cut 75% of the STEM budget in the US. Sorry for the long post, but the issue is really important, and I want to share what I know about it.

The entire funding for the NSF and NIH together is only 0.82% of the federal budget, but it is hugely important for science, funding the entire science research and education pipeline in the US. The Senate is now planning to cut more than half of that science funding permanently. For decades the USA has already underfunded science compared to how important it is for the future, but now with these cuts, the future of science in this country will be devastated.

Let me break down what's happening: (1) First, how we got into this mess. (2) Second, why it is such a disaster.

(1) Here is why we are making the mistake. This year, more than half of all projects in the NSF and NIH have been cut because they mention DEI, but we have NOT redirected the money to non-DEI STEM education. Instead it's just cut permanently.

To understand why there is so much DEI to be cut, you need to look at how NSF has operated. For decades, NSF has required grant proposals to address "broader impacts" - showing how the research would benefit society beyond just scientific discovery. NSF specifically asked for activities focused on "full participation of women, persons with disabilities, and underrepresented minorities in STEM." This led professors to include plans for broadening participation, literally writing programs like this: "we will design educational activities for a robot club for girls."

Now remember, the professors were asked to propose these programs by the government. We are all trained in physics or AI or robots or biology - but the law says, you must make sure your programs encourage women to do STEM, so that is what we all did. So basically 100% of science educators followed this law.

But now that DEI is no longer allowed, it has all been cut - DOGE scrubbed through all the proposals for any of these ideas and ended up cancelling 75% of STEM education programs.

But here is the big mistake we are making. Instead of redirecting the money to tell the professors to boost STEM for "ALL students and ALL people," we are CUTTING the funding PERMANENTLY. The same scientists would enthusiastically make programs for ALL American students - boys and girls, rural and urban, from every background. We just need Congress to redirect the funding instead of eliminating it.

But instead, the senators are planning to just cut the whole budget for everything that included that girls club - all the research, all the education, all the experts, everything - basically forcing our teaching scientists to leave for other careers. For example 85% of all fundamental physics research has been cut. You cannot cut 85% of a budget without losing nearly everybody. And so the departures have already started in the NSF. They have already lost hundreds of expert STEM staff. So that is how we got to making this mistake.

(2) The second big thing to understand is how BAD this mistake is. By making the cuts permanent and losing everybody, instead of just changing the priorities for the professors, we end up permanently shrinking not just the NSF and NIH but all science in the US. Every single scientist in this country learned from another scientist; they all went to school to learn. But by specifically losing professors, we are shutting down the education pipeline, which is the future of the field. So even though there is still a lot of science funding in the budget for, example, engineers making Defense weapons systems - every single one of those engineers had to have decades of training. And so we need to have a scientific training and teaching pipeline, which is done by the NSF and NIH, but this is exactly what is getting zeroed out.

Every time I talk to a talented PhD student about career choices, I work hard to try to convince them to stay in low-paying science teaching and research instead of getting rich on Wall Street. Because teaching is about the future of ideas, the future of talent in the country. But that is a really hard conversation, because even though teaching and research is idealistic, Wall Street is literally dangling millions of dollars in front of the best PhD students. Being a professor doesn't pay by comparison. Cutting the NSF and NIH will force our teaching scientists to leave for other careers, and those scientists will quickly move on to some other technical career that doesn't involve teaching. They will go to Wall Street, or they will go make and sell a product.

We might think: why is that so bad? Maybe they can go do something more useful with their talent, something that makes more money. The reason it is so bad is because in this country we really have a shortage of teachers for the next generation. Not only for the K-12 kids, but also top professors for teaching the top experts how to advance the frontiers of our scientific fields. Science and technology is advancing so quickly, for there to be opportunities in the US, we really need our best experts at the frontier in training positions, to help teach more scientists.

Think about it this way: every single American scientist - whether they work for SpaceX, design weapons for defense contractors, develop new drugs, or create AI systems - they ALL learned from professors funded by NSF and NIH. Without these teaching scientists, we won't have ANY scientists in 10 years. Not for defense, not for industry, not for medicine. We're not just cutting some programs - we're cutting off the entire pipeline that creates every technical expert in America.

The real thing that is keeping me from sleeping, is that this "little" budget decision will actually be the end of science teaching in this country. That half-a-percent investment is really about cutting the whole future for all future scientists in the US.

I ran a search for programs that have already been cut, and in Kansas and Nebraska, we've cut Grant #2409150 - a STEM Pathways Alliance providing scholarships, research opportunities, and tutoring. In Alaska, we've cut Grant #2308786 - a program providing research opportunities and research travel across the university system. In Kansas, we've also terminated Grant #2314275 - a program teaching tech skills from basic computer literacy to coding and cybersecurity. In West Virginia, we've cut Grant #2411642 - a program to strengthen STEM departments at WVU. In Nebraska, we've terminated Grant #2415667 - a program connecting youth to agricultural technology and environmental science.

Why cut these programs? They are being terminated simply because they mention serving women, minorities, indigenous students, or ex-cons. But the professors running them are scientists and engineers, not activists. They'd be thrilled to teach ANY student who wants to learn. Just redirect the scholarships to ALL deserving students. Open the tech training to ALL unemployed workers. Expand the rural programs to serve ALL rural kids. The infrastructure is already there - the labs, the mentors, the industry partnerships. All Congress needs to do is change "underrepresented minorities" to "ALL Americans." If anything, we need MORE funding for these programs, not less - there are plenty of Americans across the country who need these STEM programs.

If you or anybody you know lives in one of these states, you have a lot of influence. Here are some of the key senators debating the issue: @JerryMoran (KS) @SenCapito (WV) @lisamurkowski (AK) @SenKatieBritt (AL) @SenatorFischer (NE).

All these Senators have all supported STEM in the past, and we need to make sure that they know how important we all think it is to preserve (and redirect) that half a percent of the budget for STEM research and education instead of zeroing it out. It's one thing to debate fairness in STEM. But it is a huge mistake to just cut it all.

Please ask these senators to: (1) Keep the STEM funding but redirect it to serve ALL Americans (2) Protect American jobs and innovation

Even a simple message saying "Don't cut STEM - redirect it to serve everyone" would help. These specific senators are making these decisions NOW, this week, this month.

466

130

124K

Zining Zhu @zhuzining

about 1 year ago

@aryaman2020 Yes the mechanisms are tied to the downstream behavior. In this work, we predicted the fine-tuning performance using probing results with some non-trivial accuracy. https://t.co/psruaxxFnF In the future, many model behaviors can be predicted from mech interp signals.

127

Zining Zhu @zhuzining

over 1 year ago

@DavidSKrueger We are working on letting the LLMs ask questions themselves, driven by curiosity (https://t.co/0hD16DbZdx) and applying interpretability during the process! Look forward to discussing further.