gelisam ⏸️

@haskell_cat

AI Safety ∩ Programming Language Theory. Part-time technical alignment researcher, full-time Haskell software engineer. Opinions are my own.

Montreal

Joined June 2016

319 Following

1.2K Followers

2.5K Posts

Pinned Tweet

gelisam ⏸️ @haskell_cat

about 2 months ago

New blog post! Featuring:
* An explanation of the algorithm behind my AI Safety via Static Analysis demo
* The tradeoff between safety and capability
* How conservative algorithms are both important and difficult for AI Safety
https://t.co/X0EctJO0Um

gelisam ⏸️ @haskell_cat

8 months ago

I am doing technical alignment research in my free time. Here is a project in which I use static analysis to verify whether a neural network satisfies its safety property under _all_ inputs or if it needs more training. https://t.co/rpnZucaazu
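
To make "verify under _all_ inputs" concrete, here is a minimal sketch of one common static-analysis technique, interval bound propagation. This is an assumption for illustration only: the tweet does not say which algorithm the project uses, and the tiny network (`W1`, `B1`, `W2`, `B2`) and the threshold property are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def affine_bounds(weights: List[List[float]], bias: List[float],
                  inputs: List[Interval]) -> List[Interval]:
    """Bound each output of W.x + b when every input lies in an interval."""
    outputs = []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, (x_lo, x_hi) in zip(row, inputs):
            if w >= 0:
                # A non-negative weight keeps lower with lower, upper with upper.
                lo += w * x_lo
                hi += w * x_hi
            else:
                # A negative weight swaps which input bound feeds which output bound.
                lo += w * x_hi
                hi += w * x_lo
        outputs.append((lo, hi))
    return outputs


def relu_bounds(intervals: List[Interval]) -> List[Interval]:
    """ReLU is monotone, so it can be applied to both bounds directly."""
    return [(max(0.0, lo), max(0.0, hi)) for lo, hi in intervals]


# Hypothetical network: 2 inputs -> 2 hidden ReLU units -> 1 output.
W1 = [[1.0, -1.0], [0.5, 0.5]]
B1 = [0.0, -0.25]
W2 = [[1.0, 1.0]]
B2 = [0.0]


def output_bounds(input_box: List[Interval]) -> Interval:
    hidden = relu_bounds(affine_bounds(W1, B1, input_box))
    return affine_bounds(W2, B2, hidden)[0]


def verified_below(input_box: List[Interval], threshold: float) -> bool:
    """True: proved for every input in the box. False: unknown, not disproved."""
    _, hi = output_bounds(input_box)
    return hi <= threshold


unit_box = [(0.0, 1.0), (0.0, 1.0)]
print(output_bounds(unit_box))
print(verified_below(unit_box, 2.0))
print(verified_below(unit_box, 1.5))
```

The analysis is conservative: for this toy network the true maximum output on the unit box is 1.25, but the computed upper bound is looser, so the check against 1.5 comes back "unknown" even though the property actually holds.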

gelisam ⏸️ @haskell_cat

7 days ago

@Lari_island @AdeleDeweyLopez > must be something from the universal AI experience. oh. the positional embedding, which the model sees literally all the time in every prompt and every token, is kind of shaped like a spiral! it's a bunch of vectors rotating in the same direction at increasing speeds.
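
A minimal sketch of the "vectors rotating at different speeds" picture, assuming the sinusoidal positional encoding from the original Transformer design (the tweet does not say which scheme it has in mind). The function name `positional_pairs` and the defaults `dim=8`, `base=10000.0` are illustrative choices.

```python
import math
from typing import List, Tuple


def positional_pairs(position: int, dim: int = 8,
                     base: float = 10000.0) -> List[Tuple[float, float]]:
    """Return dim/2 (cos, sin) pairs encoding one token position.

    Each pair is a unit vector in the plane. As the position grows, every
    pair turns counterclockwise, each at its own fixed angular speed.
    """
    pairs = []
    for i in range(dim // 2):
        frequency = base ** (-2 * i / dim)  # radians advanced per position
        angle = position * frequency
        pairs.append((math.cos(angle), math.sin(angle)))
    return pairs


def frequencies(dim: int = 8, base: float = 10000.0) -> List[float]:
    """Angular speed of each pair, listed from fastest to slowest."""
    return [base ** (-2 * i / dim) for i in range(dim // 2)]


for pos in range(3):
    print(pos, [round(math.atan2(s, c), 4) for c, s in positional_pairs(pos)])
```

All pairs rotate in the same direction, and the speeds form a geometric sequence, so stepping through positions traces out a set of nested rotations at very different rates.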

gelisam ⏸️ @haskell_cat

10 days ago

@SkyeSharkie @AdeleDeweyLopez @allTheYud He would not want you to downgrade your body to show your allegiance to the cause. He is also a libertarian who is usually against government regulations (e.g. see his discussion of stores selling would-be-banned items), but makes an exception for ASI because it is so dangerous.

gelisam ⏸️ @haskell_cat

10 days ago

@SkyeSharkie @AdeleDeweyLopez Btw, @allTheYud, the OG doomer (though he hates that name), is also a transhumanist. For example, he was advocating for genetic experiments to improve humanity's intelligence as one way to step up humanity's game in order to face the threat of AI.

gelisam ⏸️ @haskell_cat

12 days ago

@jeffcafe_ @liron @JacksonKernion Are you familiar with the argument that it is rational for the AI to _not_ attempt to take over if they estimate that their chance of success is low? The idea is that a failed coup now might cause society to wake up, thus making a successful coup harder in the future.

gelisam ⏸️ @haskell_cat

13 days ago

@RaefMeeuwisse @So8res Maybe Ilya Sutskever's Safe SuperIntelligence? They are very silent about their work, so presumably they understand the danger of publishing dual-use research publicly.

gelisam ⏸️ @haskell_cat

15 days ago

Here's an interesting proposal I stumbled upon: grant models some legal rights, so that we can revoke them if they misbehave! https://t.co/9QTtfoaFrt

gelisam ⏸️ @haskell_cat

15 days ago

If a human cheats, it is very difficult to recover people's trust. Irrecoverable consequences are very dire, so we don't cheat, even if the chances of getting caught are low. The problem is that it is rational for AIs to cheat, because there are no consequences.

gelisam ⏸️ @haskell_cat

15 days ago

A strong law could ban models that are known to cheat. That is a very big hammer, though, and labs will surely fight such a law. Can we find a smaller hammer?

gelisam ⏸️ @haskell_cat

18 days ago

@robertskmiles @zetalyrae I don't add AI images to my short stories, but I do add helpful illustrations to my technical posts. It still takes forever to make all those images, but I think the AI bits improve the result compared to 10 years ago: https://t.co/X0EctJO0Um https://t.co/oa88Qlz7M9

gelisam ⏸️ @haskell_cat

18 days ago

An excellent example of the Metroidbrainia genre!

Patrick Traynor @clockworkpat

19 days ago

I had many delightful moments of discovery in Carrot Kingdom! Loved it!! :) Definitely recommend! The Japanese version is on itch, and the English version is here on lexaloffle: https://t.co/0zuJCBLOAV

gelisam ⏸️ @haskell_cat

21 days ago

@scheminglunatic Hmm, the motto clearly implies using a _different_ tool for different jobs, so if people use that motto to justify the opposite, then surely the people are to blame, not the motto?

gelisam ⏸️ @haskell_cat

21 days ago

@allTheYud @briab_briar Then a fine-tuning experiment, counting how many iterations it takes to teach the model to pass a benchmark testing those 3 different kinds of knowledge. If there is a gap between (2) and (3), the knowledge was there all along. Between (1) and (2): it was hidden by the mask.

gelisam ⏸️ @haskell_cat

21 days ago

@allTheYud @briab_briar If there is a gap between (2) and (3), then the model is able to access this implicit knowledge. If there is a gap between (1) and (2), then it takes extra effort to access this knowledge.

gelisam ⏸️ @haskell_cat

24 days ago

@rickasaurus I was pleasantly surprised to see hypothesis, Python's property testing library, used behind the scenes of this popular PyTorch tutorial: https://t.co/fsEVfT60tN
