Dieter Castel

@DieterCastel

Engineer, ex-@NVISOsecurity, Alumnus CompSci @CW_KULeuven Math/STEAM/ML/@julialangu Enthusiast, Traceur, Multi-Genre music lover. #MathsJam @stadleuven cohost.

Leuven - Belgium

Joined February 2014

2.2K Following

415 Followers

6.5K Posts

Pinned Tweet

Dieter Castel @DieterCastel

over 6 years ago

My plea for using #privacy friendly communication is now available in 3 languages: Nederlands 🇳🇱, Español 🇪🇸 & English 🇬🇧 https://t.co/HrVPxPzdKa

DieterCastel retweeted

Subbarao Kambhampati (కంభంపాటి సుబ్బారావు)

@rao2z

about 2 years ago

No. Training LLMs on purely factual data STILL WON'T cure them of "Hallucinations" #SundayHarangue

There is a persistent myth that LLM hallucinations are just a result of them being trained on un-curated and "non-factual" data, and will go away with high quality/factual data. This misses the basic n-gram structure of LLMs.

Yes, the presence of "non-factual" training data does increase the chance of producing "non-factual" completions. But, even if you train LLMs only on factual data (and I will suspend my disbelief for a minute about the impossibility of doing that in a multi-polar world), LLMs can and will still continue to produce completions that are not factual!

A simplistic way to visualize it is this: Imagine you have access to 1000 curated Wikipedia documents. Don't you think that by selectively cutting and pasting from those documents, you can generate an inaccurate/not-fully-factual new one? This happens because LLMs are completing the prompt probabilistically conditioned on the training corpus ("approximate retrieval") rather than indexing and retrieving like (the boring and much maligned) databases! (See https://t.co/qMJxiOCMGI; quoted below).

The fact that factuality of the training data is not sufficient to avoid hallucinations is demonstrated in multiple ways in the current LLM usage patterns:
(1) When you ask an LLM to generate a bio for you, it often combines factual statements with some made-up ones.
(2) When you ask an LLM to summarize a given document (in the RAG style) it still can generate an incorrect summary (e.g. the work showing that 50% of book summaries contain factual errors https://t.co/jLGqgOl1ve)
(3) When you fine-tune an LLM all LLaMAI-style (e.g. https://t.co/V9np4DWaiZ), it can improve the generation but doesn't completely avoid hallucinated completions.

tldr; higher quality training data can improve the quality of completions, but doesn't guarantee factuality as it can't fully eliminate the possibility of hallucination.

In general, the n-gram nature of LLMs makes them inherently "creative" helping them mix and match content/patterns they drew from different parts of the corpora. This is their boon--and also bane. 👉https://t.co/rY8KDbWAdV

If factuality/correctness/truth is critical, you have to go LLM-Modulo external verifiers.. https://t.co/mREKgH8mxk (https://t.co/RlbpwkYuns)

193

37K

DieterCastel retweeted

Matt Enlow @CmonMattTHINK

over 2 years ago

What proportion of quadratics have real roots?

124

55K

Dieter Castel @DieterCastel

over 2 years ago

I stumbled upon these lovely @PlutoJL notebooks https://t.co/9mUZPJl5yX and I think they would fit nicely in the @explorables collection as well. :-)

109

Dieter Castel @DieterCastel

over 2 years ago

I'd be pro regulation mandating these being published.

Dieter Castel @DieterCastel

over 2 years ago

The #GeminiAI paper (in)conveniently doesn't mention how long it trained nor the energy usage required. Anyone got more info on that?

150

DieterCastel retweeted

Bart Preneel @bpreneel1

over 2 years ago

Who would have thought - ChatGPT's heartbleed moment

DieterCastel retweeted

Tuta

@TutaPrivacy

over 2 years ago

📢 BREAKING 📢 Historic agreement on #chatcontrol proposal: EU Parliament wants to remove chat control and safeguard secure encryption. 🔒 💪Let's keep pushing for strong privacy rights!👇 https://t.co/N9Qd43c5WQ

Photo: https://t.co/o7KlIud1U3

177

15K

DieterCastel retweeted

Moshe Vardi @vardi

over 2 years ago

:-)

Photo: https://t.co/Dqx6uqB5Nd

831

296

276K

DieterCastel retweeted

Prakash

@8teAPi

over 2 years ago

Vicious Self-Degradation

> you Google
> Quora spots query and id’s as frequent
> Quora uses ChatGPT to generate answer
> ChatGPT hallucinates
> Google picks up Quora answer as highest probability correct answer
> ChatGPT hallucination is now canonical Google answer

Photo: https://t.co/qInLfIMGtc

169

11K

DieterCastel retweeted

Cliff Pickover

@pickover

almost 3 years ago

Mathematics. "What’s the area of the toppled square?" (All blocks are squares. The diagram is not to scale. The numbers represent areas of squares.) By Catriona Agg, @Cshearer41, Used with permission.

Photo: https://t.co/jwqQZyxBs5

240

72K

DieterCastel retweeted

LLM Security @llm_sec

about 3 years ago

* People ask LLMs to write code * LLMs recommend imports that don't actually exist * Attackers work out what these imports' names are, and create & upload them with malicious payloads * People using LLM-written code then auto-add malware themselves https://t.co/Va9w18RpWu

Dieter Castel @DieterCastel

about 3 years ago

Two quick questions for https://t.co/PUKnVZXj2V @fostplusnl 1) Do perished rubber bands go in PMD? 2) Do single-sided foil wrappers, like those on chocolates, go in residual waste or PMD? (Some foil can go in residual waste, other foil has to go in PMD :S)

Dieter Castel @DieterCastel

about 3 years ago

I keep wondering, @katiesteckles, is the eurosong theme a boon or a hurdle for the MJ target audience? I'm def. ambivalent myself, but maybe you can recall from previous years. :-)

Dieter Castel @DieterCastel

about 3 years ago

Next week Tuesday @ 20:00 in OPEK café @stadleuven. The monthly #Leuven #MathsJam. See u there? Below a Eurosong theme teaser flyer \/. :-)

Photo: https://t.co/eVXtrwTKZI

161

Dieter Castel @DieterCastel

about 3 years ago

https://t.co/SCR429ublj

Dieter Castel @DieterCastel

about 3 years ago

@xsteenbrugge The EU is not the one making it impossible. The Big Tech monopoly is... that's imho much more relevant in the field atm than this legislation. @jbaert @thomas_wint ?

247

Dieter Castel @DieterCastel

about 3 years ago

@xsteenbrugge There's much to say about the #GDPR, it's far from perfect. But failed? [source needed] It put #privacy on the map in an important way.

Dieter Castel @DieterCastel

about 3 years ago

@xsteenbrugge Honest question: Where's the "small company competition" in ML right now? Even LARGE academic institutions can't compete with big tech atm. There's much more needed for healthy competition imho.

Dieter Castel @DieterCastel

about 3 years ago

@cyrilzakka @jeremyphoward I think the same goes for many high risk fields. Current gen AI-models are often dangerously brittle. Take a look at my TL for some failures I posted of ChatGPT in December 2022.

DieterCastel retweeted

Cyril Zakka, MD

@cyrilzakka

about 3 years ago

@jeremyphoward In a way, I welcome this initiative for medical generative AI. The amount of poorly tested models for clinical AI I’ve seen out there is going to cause a lot of harm.

25K
