Robbie Boyd

about 1 month ago

NHS sees biggest improvement in waiting times in 16 years https://t.co/CxLM0P4cHI

about 2 months ago

Oncology nurses believe they are having miscarriages & other conditions from giving chemotherapy without proper PPE or ‘closed systems’, as our current legislation lags behind the EU and US. @vsmacdonald, @Rebeccasmt and I worked with @_SHBN_ and @theRCN on this incredible story

24 | Cardiff Uni Grad | Winchester | Lil SB

about 2 months ago

NHS staff claim chemotherapy drugs could have exposed them to harm https://t.co/6vlkXweUw6

robbieboyd_ retweeted

Kristin Merrilees @kristnmerrilees

2 months ago

whenever you tell someone at a party you’re a journalist and they clamor to clarify that they’re “off the record” it’s like. you just told me a story about your cat throwing up after eating funfetti cake batter. should we call the new york times? should we invite bella hadid ?

robbieboyd_ retweeted

Cathy Newman

@cathynewman

3 months ago

My last ever @Channel4News NOW - and here’s the team who made it

robbieboyd_ retweeted

3 months ago

‘My dad’s life was an afterthought’: Families give evidence at Nottingham attack inquiry https://t.co/8vXDRlXUMm

robbieboyd_ retweeted

Hannah Barnes

@hannahsbee

3 months ago

NEW 🧵 Oxford maternity: Following a joint @NewStatesman & C4 News investigation into maternity services at Oxford University Hospitals, the BBC have found 58 babies may have been saved if they/their mothers had received better care at OUH, 2019 – 2024 https://t.co/F7LJCBSIJy

robbieboyd_ retweeted

3 months ago

How dangerous delays left thousands waiting more than 6 hours in ambulances https://t.co/GFOzj3ddSI

robbieboyd_ retweeted

Abdul Șhakoor

@abxxai

4 months ago

BREAKING: 🚨 Someone just tested 35 AI models across 172 billion tokens of real document questions.

The hallucination numbers should end the "just give it the documents" argument forever.

Here is what the data actually showed.

The best model in the entire study, under perfect conditions, fabricated answers 1.19% of the time. That sounds small until you realize that is the ceiling. The absolute best case. Under optimal settings that almost no real deployment uses.

Typical top models sit at 5 to 7% fabrication on document Q&A. Not on questions from memory. Not on abstract reasoning. On questions where the answer is sitting right there in the document in front of it.

The median across all 35 models tested was around 25%.

One in four answers fabricated, even with the source material provided.

Then they tested what happens when you extend the context window. Every company selling 128K and 200K context as the hallucination solution needs to read this part carefully.

At 200K context length, every single model in the study exceeded 10% hallucination. The rate nearly tripled compared to optimal shorter contexts.

The longer the window people want, the worse the fabrication gets. The exact feature being sold as the fix is making the problem significantly worse.

There is one more finding that does not get talked about enough.

Grounding skill and anti-fabrication skill are completely separate capabilities in these models.

A model that is excellent at finding relevant information in a document is not necessarily good at avoiding making things up. They are measuring two different things that do not reliably correlate. You cannot assume a model that retrieves well also fabricates less.

172 billion tokens. 35 models. The conclusion is the same across all of them.

Handing an LLM the actual document does not solve hallucination. It just changes the shape of it.

4 months ago

@visionergeo @grok can you translate this video into English, using descriptors of who's speaking

4 months ago

Assertive outreach is a well-evidenced intervention to keep MH patients safe in the community. It should exist across the UK, but we were told lack of funding has meant just 1/3 of Trusts run it. Calocane was not on one of these programs.

4 months ago

Snr NHS staff told us that system-wide pressures mean Calocane could have happened anywhere. Indeed, our analysis found 23 similar killings by mentally unwell strangers in the year before the Nottingham attacks & as many since. The inquiry will need to go beyond "lessons learned"

4 months ago

Nottingham attacks inquiry: What state are UK mental health services in? https://t.co/s2uZN2Pzmw

robbieboyd_ retweeted

4 months ago

What was the impact of the pandemic on society? Final module of Covid inquiry begins https://t.co/VcuL61zCcY

robbieboyd_ retweeted

4 months ago

The Nottingham Inquiry - what lessons need to be learned from the Calocane killings? The Nottingham Inquiry will examine Valdo Calocane's killings. Will it also shed a light on other deaths caused by mental health service failings? Victoria Macdonald writes. Read the full article on Substack: https://t.co/cp7KtYuowo

robbieboyd_ retweeted

4 months ago

Exclusive: ‘Spycam’-style footage inside Jeffrey Epstein’s living room uncovered in latest files

5 months ago

@beatscaduk @heartresearchuk talk to us about the lesser-known type of heart attack, which has gone under-researched and is often misdiagnosed, affecting otherwise fit, healthy adults

5 months ago

SCAD: heart attacks affecting mostly young women https://t.co/jixRmbHHFh

6 months ago

https://t.co/NljhLFH3e9

Sophie Wilkinson @sophwilkinson

6 months ago

For the next year I'll be the Health Producer for @Channel4News - you can contact me confidentially at [email protected] with any stories or tip-offs #NHS #health #news #uk

robbieboyd_ retweeted

6 months ago

Incredible work from the @Telegraph’s investigations team, getting THE story that journalists have been trying to tell for years now. Walliams has always been publicly creepy but this is another level. https://t.co/34yyvDVdLI
