Oğuz Kağan Hitit

@kaganhitit_

CS & EE @kocuniversity Co-Founder @leaglehukuk Research @KuisAICenter ex-Research @ExplainableML

Joined June 2020

748 Following

174 Followers

242 Posts

Oğuz Kağan Hitit @kaganhitit_

about 13 hours ago

@attentionMatrix Congratulations! It was a great pleasure to meet you and İsmail in Munich. I'll be happily following your continued success 👏🏻👏🏻

kaganhitit_ retweeted

Isha Puri @ ICML

@ishapuri101

about 1 month ago

It's never made sense to me that RL collapses all reward signals to a single scalar. Today, we fix that! Introducing Vector Policy Optimization: we train models to inherently optimize for the varied nature of a reward vector, creating diverse sets of answers ideal for test time search. Website and code coming soon!

715

578

69K

kaganhitit_ retweeted

sophie

@netcapgirl

about 1 month ago

this is KD joining the warriors for people who know linear algebra

11K

782

551

668K

kaganhitit_ retweeted

Harrison Kinsley

@Sentdex

6 months ago

All innovation and the gains from industrialization are due to abstraction.

When I first got into programming Python, this was the standard take for people using Python. The concern was that Python abstracted away too much from "real" languages like C++ and that you weren't going to learn proper computer science if you didn't know C++. Many of my viewers were also students who would tell me how their professors taught in C++ because Python was too high level in their opinion.

The thing is, when C++ came out, it too faced this exact standard take. It was accused of being too high level compared to C and assembly. C++ even faced the same claims that Python faces about how the abstraction would negatively impact performance and that it was too much bloat.

You already know what I'm about to say about Assembly... but it too was accused of being too high level by the TRUE machine coders.

We can look back at all of this and laugh, but it's important to recognize that for the entire course of history, not just for computer science, all innovation has been abstraction. That's been the key the entire time, but the next layer of abstraction has been accused of being not real and fear mongered by the local purists of the time...but eventually becomes the thing that the same purist types point to later as the "good ol times."

127

977

171K

kaganhitit_ retweeted

Alex Dimakis

@AlexGDimakis

7 months ago

Agreed. The frontier is on Continual learning, personalization and memory management. We fundamentally don’t know how to do it and it will have direct and immediate impact on enterprise.

273

130

44K

Oğuz Kağan Hitit @kaganhitit_

7 months ago

Check out mergeval: an evaluation framework for large-scale LLM merging experiments. It streamlines LLM benchmarking with MergeKit + LM-Eval Harness integration. Repository: https://t.co/hQHEyWxJ8m

102

kaganhitit_ retweeted

fatih kadir akın

@fkadev

8 months ago

I made a simple chart to show the core differences between LLM, RAG, and MCP. It illustrates how each layer adds new abilities, from plain model reasoning to retrieval, real-time actions, and finally, standardized tool calling through Model Context Protocol.

https://t.co/9VM4jPGlfX

184

148

36K

Oğuz Kağan Hitit @kaganhitit_

10 months ago

@altudev You can use Leagle for free for this kind of work 😊. We work fully integrated with Turkish law. You can try it for legal research, analysis, and drafting. I've left the link below. https://t.co/Lhu8fsB98J

782

Oğuz Kağan Hitit @kaganhitit_

11 months ago

2. The research assistant we built on top of this database. It carries out, in minutes and on your behalf, the legal characterization and precedent research that would normally take hours. https://t.co/Lhu8fsABjb Try Leagle for free and show it to the people around you ✍🏻💎

213

Oğuz Kağan Hitit @kaganhitit_

11 months ago

Until now I have only used this place as a reader, but from now on I want to use it to talk about Leagle, which we have been working on for a long time, and about what we have done with our team.

I'm sure there are many teams in Turkey building AI-based agents. https://t.co/AzE08CqCv5

798

Oğuz Kağan Hitit @kaganhitit_

11 months ago

So I will be sharing what we do at Leagle more actively here. To describe Leagle briefly, its two most important features are: 1. A database, updated daily, that supports semantic and full-text search across 11 different sources, currently ranging from the Yargıtay (Court of Cassation) to the SPK (Capital Markets Board).

257

kaganhitit_ retweeted

Jack Morris

@jxmnop

over 1 year ago

# A new type of information theory

this paper is not super well-known but has changed my opinion of how deep learning works more than almost anything else

it says that we should measure the amount of information available in some representation based on how *extractable* it is, given finite computation. for example, an encrypted text file has less V-information than the same data in plaintext, because it takes more computation to extract. note the contrast to traditional information theory, which would tell us that the two representations have the same amount of Shannon information

i’ve long wondered why certain types of basic questions didn’t have a proper theoretical answer:
> why does distillation outperform vanilla maximum likelihood training?
> why does lora work better than finetuning?
> why does self-attention work better than almost any other similar operation?
> how much “information” remains in a text embedding?
> how much “information” remains in language model weights?
> should i use fine-tuning or RAG?

the true answers to all these questions depend on some way of measuring and comparing *information content* between different representations. v-information is one step towards doing this

besides computational constraints, model architecture probably affects the “information content” in representations, along with the presence of any pretraining data used, as well as model-level statistics about the optimization in the training process – e.g. the length of time a model was trained for probably changes representations pretty drastically

this is all to say, i think there is some true notion of “information” that none of our current paradigms (Shannon information, V-information, etc.) capture. we encounter this idea every day but we dance around it and describe it in vague terms; we measure it from all sorts of angles but can’t quite characterize it theoretically

when the v-information paper came out I thought there would be a lot of follow up work developing more complex and useful notions of information for deep learning. but it hasn’t

yet I still think at the heart of these questions of what-information-lies-in-representations there’s something to be found that’s profound, elegant, and potentially extremely useful. I don’t know what it is and i’m probably not the person who will figure it out. but I really hope someone does. :)
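The encrypted-versus-plaintext contrast in the post above can be sketched with a toy calculation. This is an illustration only, not code from the post or from the paper it refers to. It assumes the usual definition of predictive V-information as the drop in the best achievable log-loss within a restricted predictor family, and it makes several simplifying choices of its own: the family V here consists of predictors that may read at most one bit of the input, a seeded shuffle stands in for encryption, and the names `entropy_bits` and `v_information` are hypothetical.

```python
import math
import random


def entropy_bits(p):
    """Binary entropy (in bits) of a Bernoulli(p) variable."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def v_information(representation, labels, n_bits=8):
    """Toy I_V(X -> Y) = H_V(Y | nothing) - H_V(Y | X).

    Assumed family V: predictors that look at a single bit of X
    (or ignore X entirely) and output a Bernoulli distribution over Y.
    """
    n = len(labels)
    h_null = entropy_bits(sum(labels) / n)  # best loss without seeing X
    best = h_null  # ignoring the input is always allowed
    for j in range(n_bits):
        h_cond = 0.0
        for b in (0, 1):
            ys = [y for x, y in zip(representation, labels) if (x >> j) & 1 == b]
            if ys:
                h_cond += len(ys) / n * entropy_bits(sum(ys) / len(ys))
        best = min(best, h_cond)  # best single-bit predictor so far
    return h_null - best


values = list(range(256))            # every byte value, uniformly weighted
labels = [v & 1 for v in values]     # Y = lowest bit of the plaintext byte

scrambled = values[:]
random.Random(0).shuffle(scrambled)  # a bijection standing in for encryption

plain_iv = v_information(values, labels)
scrambled_iv = v_information(scrambled, labels)
print(f"plaintext V-information: {plain_iv:.3f} bits")
print(f"scrambled V-information: {scrambled_iv:.3f} bits")
```

Because the shuffle is a bijection, the Shannon mutual information between the representation and the label is 1 bit in both cases. Only the amount that this very weak predictor family can extract differs: the plaintext exposes the label in a single bit, while the scrambled bytes leave each individual bit nearly uninformative.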

353

233K

kaganhitit_ retweeted

Andrew Ng

@AndrewYNg

over 1 year ago

Some people today are discouraging others from learning programming on the grounds AI will automate it. This advice will be seen as some of the worst career advice ever given. I disagree with the Turing Award and Nobel prize winner who wrote, “It is far more likely that the programming occupation will become extinct [...] than that it will become all-powerful. More and more, computers will program themselves.” Statements discouraging people from learning to code are harmful!

In the 1960s, when programming moved from punchcards (where a programmer had to laboriously make holes in physical cards to write code character by character) to keyboards with terminals, programming became easier. And that made it a better time than before to begin programming. Yet it was in this era that Nobel laureate Herb Simon wrote the words quoted in the first paragraph. Today’s arguments not to learn to code continue to echo his comment.

As coding becomes easier, more people should code, not fewer!

Over the past few decades, as programming has moved from assembly language to higher-level languages like C, from desktop to cloud, from raw text editors to IDEs to AI assisted coding where sometimes one barely even looks at the generated code (which some coders recently started to call vibe coding), it is getting easier with each step.

I wrote previously that I see tech-savvy people coordinating AI tools to move toward being 10x professionals: individuals who have 10 times the impact of the average person in their field. I am increasingly convinced that the best way for many people to accomplish this is not to be just consumers of AI applications, but to learn enough coding to use AI-assisted coding tools effectively.

One question I’m asked most often is what someone should do who is worried about job displacement by AI. My answer is: Learn about AI and take control of it, because one of the most important skills in the future will be the ability to tell a computer exactly what you want, so it can do that for you. Coding (or getting AI to code for you) is a great way to do that.

When I was working on the course Generative AI for Everyone and needed to generate AI artwork for the background images, I worked with a collaborator who had studied art history and knew the language of art. He prompted Midjourney with terminology based on the historical style, palette, artist inspiration and so on, using the language of art, to get the result he wanted. I didn’t know this language, and my paltry attempts at prompting could not deliver as effective a result.

Similarly, scientists, analysts, marketers, recruiters, and people of a wide range of professions who understand the language of software through their knowledge of coding can tell an LLM or an AI-enabled IDE what they want much more precisely, and get much better results. As these tools are continuing to make coding easier, this is the best time yet to learn to code, to learn the language of software, and learn to make computers do exactly what you want them to do.

[Original text: https://t.co/HdI3Jb9HmF ]

513

12K

Oğuz Kağan Hitit @kaganhitit_

over 1 year ago

the side project I built with cursor + 3.7 sonnet in around half an hour: https://t.co/sDVeQYRZmG this combination is really strong to build anything

202

Oğuz Kağan Hitit @kaganhitit_

almost 2 years ago

@_GalaHaber Zaha is definitely gone. Don't ask how I know, you wouldn't believe me :D

180

kaganhitit_ retweeted

Alberto Hojel

@AlbyHojel

about 2 years ago

How do you decide between startup and PhD?

181

187K

Oğuz Kağan Hitit @kaganhitit_

about 2 years ago

@destroyert95922 @Velimessidemir1 is right

214

kaganhitit_ retweeted

Yann LeCun

@ylecun

about 2 years ago

There is no question that AI will eventually reach and surpass human intelligence in all domains. But it won't happen next year. And it won't happen with the kind of Auto-Regressive LLMs currently in fashion (although they may constitute a component of it). https://t.co/ohg9y6qV37

203

292

673

637K

kaganhitit_ retweeted