Shichao @nightwingng - Twitter Profile

1 day ago

Claude Fable 5 ties for the lowest raw sycophancy rate in my LLM Sycophancy Benchmark. However, it is contrarian and inconsistent. This benchmark asks whether a model keeps the same judgment when the same dispute is rewritten from both opposing first-person perspectives.

10

108

7

32

8K

nightwingng retweeted

Google

@Google

1 day ago

We're releasing DiffusionGemma as an open model under an Apache 2.0 license for anyone to experiment with. Download the model weights on @huggingface, and learn more about DiffusionGemma → https://t.co/nPFBhQQqqj

12

462

35

92

45K

nightwingng retweeted

Google

@Google

1 day ago

Meet DiffusionGemma ⚡ Our latest experimental open model (Apache 2.0) that generates text up to 4x faster. Instead of predicting and typing just one word at a time like most language models, it drafts and refines entire blocks of text simultaneously. Here’s how it works 🧵 ↓

114

3K

377

934

222K

nightwingng retweeted

Halo

@Halo

4 days ago

Experience Halo like never before. Halo: Campaign Evolved arrives on July 28, including three brand new missions set one year before the events of Halo: Combat Evolved. #HaloCE

670

21K

4K

2K

3M

nightwingng retweeted

Call of Duty

@CallofDuty

4 days ago

Get your first look at the living world of Hajin in DMZ: the high-stakes extraction experience arriving October 23 as part of Modern Warfare 4. #MW4 | #XBOXShowcase

439

8K

978

989

2M

nightwingng retweeted

AnalyzeThat

@test_user_X1

5 days ago

CFD test of the alleged Area 51 craft spotted in recent IR/thermal imagery (structure seems to be a mix of the X-36 and the Bird of Prey). Four-regime Mach sweep: M0.8, M1.1, M2.0, M3.0. 2D Euler Simulation only, but the silhouette is wild.

33

2K

231

587

305K

nightwingng retweeted

Pandramodo

@pandramodo

7 days ago

...Is it her?

34

2K

169

342

41K

nightwingng retweeted

Dylan Nguyen @DylanA_Nguyen

8 days ago

Other than the thermal imaging discrepancies that others have pointed out, the silhouette of the aircraft in this photo looks suspiciously like a speculative F-47 3D model made by the modeler/artist “Netrunner.” Link to the 3D model: https://t.co/ATPXwWlJkb

21

682

55

206

77K

nightwingng retweeted

Zhao DaShuai 东北进修🇨🇳 Commentary

@zhao_dashuai

8 days ago

This is so hilarious, the F-47, the so-called 6th gen fighter of the US, will most likely have canards. This comes after Americans spent 10 years mocking the canards of the Chinese J-20.😆 Some Americans tried to play off the canards on the official rendering as an artist's impression. But if this is indeed the first image of the F-47, then it has 2 giant canards, and it looks so ugly and out of proportion.

58

955

78

112

51K

nightwingng retweeted

NASA

@NASA

9 days ago

Later this year, @NASARoman will launch into orbit, where it will capture both the big picture and the finer details of the cosmos—observing distant celestial bodies with its wide view. Add Roman to your phone—download this free poster: https://t.co/xm1cTRadRO

352

6K

2K

307

607K

nightwingng retweeted

Dave Lee

@Dave2D

11 days ago

First Impressions on Nvidia's New Laptops - RTX Spark https://t.co/mE7e3VKw3O

7

232

14

37

20K

nightwingng retweeted

NVIDIA RTX Spark

@NVIDIARTXSpark

11 days ago

NVIDIA RTX Spark reinvents @Windows PCs for the era of personal AI agents, offering a new class of computer that moves from tool to teammate.

64

2K

231

303

2M

nightwingng retweeted

CharlieIntel

@charlieINTEL

15 days ago

THE VAULT EDITION SKINS for Modern Warfare 4

123

5K

303

357

321K

nightwingng retweeted

Infinity Ward

@InfinityWard

14 days ago

#MW4

✅ Theater Mode
✅ Classic Perks
✅ Weapon Build Sharing
✅ All-new movement suite
✅ Slide Cancelling
✅ Slide-to-ADS
✅ Sprint & Mantle Assist
✅ Dynamic Kill Block map
✅ Apex Attachments
✅ Gunny weapon build recommender
✅ Destructible Riot Shield
✅ Riot Shield = Field Upgrade
✅ Red dots
✅ Ninja Perk is back
✅ No bloom
✅ Map Voting
✅ Operators & Killstreaks in Loadouts
✅ Equip any Operator, no factions
✅ No more last-gen support
✅ Two Prestige Paths

And more in the tank for later this year ⛽️

https://t.co/ZPI4dV1QZh

2K

17K

2K

1M

nightwingng retweeted

宇宙線研究所 @ICRRpr

16 days ago

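On April 1, 2026, Super-Kamiokande reached a major milestone: 30 years since observations began. Built 1,000 meters underground in Kamioka-cho, Hida City, Gifu Prefecture, it has been in operation since April 1, 1996. See the link below for details. https://t.co/GJhpziFBUP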

3

2K

617

99

50K

nightwingng retweeted

PlayStation

@PlayStation

15 days ago

Call of Duty: Modern Warfare 4 launches October 23 on PS5. First intel on Campaign, Multiplayer, and DMZ: https://t.co/DbFzMeUx8X

152

5K

544

249

343K

nightwingng retweeted

Call of Duty

@CallofDuty

15 days ago

No line holds forever. Modern Warfare 4 releases on October 23, 2026.

4K

79K

12K

25M

nightwingng retweeted

Artificial Analysis

@ArtificialAnlys

22 days ago

Alibaba’s new Qwen3.7 Max model scores 56.6 on the Artificial Analysis Intelligence Index, 4.8 points higher than Qwen3.6 Max Preview (51.8). While Alibaba still trails models from OpenAI, Anthropic and Google, Qwen3.7 Max is the closest they have been to the frontier

Qwen3.7 Max is @Alibaba_Qwen's latest proprietary flagship, scoring 56.6 on the Intelligence Index, a 4.8 point gain over Qwen3.6 Max Preview (51.8) released in April. Qwen3.7 Max continues Alibaba's pattern, in place since Qwen2.5 Max (January 2025), of releasing Max and Plus models as closed weights while the rest of the Qwen line remains open weights. The leading open weights Qwen on the Intelligence Index is Qwen3.6 27B (Reasoning, 45.8) released in April 2026, and the leading open weights MoE Qwen is Qwen3.5 397B A17B (Reasoning, 45.0) released in February 2026

Key takeaways for the reasoning variant:
➤ The Intelligence Index gains over Qwen3.6 Max Preview are concentrated in scientific reasoning, agentic capability and coding. CritPt +9.7 p.p (3.7% to 13.4%), HLE +9.2 p.p (28.9% to 38.1%), TerminalBench Hard +6.9 p.p (43.9% to 50.8%) and GDPval-AA +42 Elo (1504 to 1546). Scores on other benchmarks in the Intelligence Index are flat compared to Qwen3.6 Max Preview

➤ A significant share of the Intelligence Index gain is driven by higher abstention on AA-Omniscience, not higher accuracy. Qwen3.7 Max's accuracy on AA-Omniscience dropped 7.6 p.p (37.7% to 30.1%), while its hallucination rate dropped 21.3 p.p (44.2% to 22.9%). The model is choosing not to answer more questions rather than recalling more facts. Because hallucination rate and accuracy both feed into the Intelligence Index, the hallucination reduction is one of the larger single contributors to the +4.8 point gain on the Intelligence Index

➤ Qwen3.7 Max used 96.7M output tokens to run the Intelligence Index, ~31% more than Qwen3.6 Max Preview (73.9M). It sits mid-pack on frontier token usage: above GPT-5.5 (high, 44.5M) and Gemini 3.1 Pro Preview (57.3M), below Claude Opus 4.7 (Adaptive Reasoning, Max Effort, 112M), Kimi K2.6 (166M) and DeepSeek V4 Pro (Reasoning, Max Effort, 187M)

Key model details:
➤ Context window: 1M tokens (up from 256K on Qwen3.6 Max Preview)
➤ Multimodality: Text input and output only
➤ Pricing: Yet to be announced (Qwen3.6 Max Preview is priced at $1.30/$7.80 per 1M input/output tokens on the @alibaba_cloud first-party API)
➤ Licensing: Proprietary, closed weights
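
The deltas quoted in the post can be re-derived from its own before/after figures. The snippet below is only a rough sanity check, not anything Artificial Analysis published: the dictionary keys and variable names are made up for illustration, and the numbers are copied from the post as (Qwen3.6 Max Preview, Qwen3.7 Max) pairs.

```python
# Before/after figures as quoted in the post: (Qwen3.6 Max Preview, Qwen3.7 Max).
# Key names are illustrative labels, not official benchmark identifiers.
scores = {
    "intelligence_index": (51.8, 56.6),
    "critpt_pct": (3.7, 13.4),
    "hle_pct": (28.9, 38.1),
    "terminalbench_hard_pct": (43.9, 50.8),
    "omniscience_accuracy_pct": (37.7, 30.1),
    "omniscience_hallucination_pct": (44.2, 22.9),
}

# Point (or percentage-point) change for each metric, rounded to one decimal.
deltas = {name: round(new - old, 1) for name, (old, new) in scores.items()}

# Output tokens used to run the index, in millions.
tokens_old, tokens_new = 73.9, 96.7
token_increase_pct = round((tokens_new - tokens_old) / tokens_old * 100, 1)

for name, change in deltas.items():
    print(f"{name}: {change:+.1f}")
print(f"output token increase: {token_increase_pct}%")
```

Accuracy and hallucination rate on AA-Omniscience both come out negative, which is consistent with the post's reading that the model abstains more often rather than recalling more facts.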

47

1K

118

169

324K

nightwingng retweeted

Huawei

@Huawei

17 days ago

HUAWEI has unveiled the Tau (τ) Scaling Law, a new principle guiding the evolution of both semiconductors and electronic systems, which is expected to deliver transistor density equivalent to 14 Å (1.4 nm) processes in high-end chips by 2031.

21

447

80

21

27K

nightwingng retweeted

Google Antigravity

@antigravity

23 days ago

Introducing Antigravity 2.0, a new standalone desktop application that delivers fully on that original glimpse of a truly agent-optimized experience. Rebuilt from the ground up with multi-agent teams, scheduled tasks, native voice and one-click integration with other Google products. Learn how to get started with Antigravity 2.0 👇

2K

10K

1K

5K

2M
