Mustafa Özdemir

@namruni

Argem/Üsküdar/İstanbul

Joined February 2012

1.2K Following

829 Followers

1.6K Posts

Pinned Tweet

Mustafa Özdemir @namruni

almost 4 years ago

Honestly, gamifying a machine learning algorithm may seem rather odd. But we did it. We ran it with students from 5 countries. On top of that, it was covered on Polish television. What more could you ask for? @MersinMEM @MersinValiligi @tarsusmem1 https://t.co/2Gtku1bqt9

namruni retweeted

Eğitim Reformu Girişimi

@egitimreformu

8 days ago

💭 Is continuing at MESEM a choice or a necessity?

🔗 For the full article “MESEM: Eğitimin Kıyısında İş Hayatının Ortasında” (“MESEM: On the Edge of Education, in the Middle of Working Life”), prepared by ERG Researcher and Media Coordinator Umay Aktaş Salman and Policy Analyst Kayıhan Kesbiç: https://t.co/54UwB94qiZ https://t.co/uBBnnVYJwY

namruni retweeted

Informal Economy

@EconomyInformal

19 days ago

Inflation Rate (%)

Türkiye🇹🇷: 30.9
Colombia🇨🇴: 5.6
Mexico🇲🇽: 4.6
Australia🇦🇺: 4.6
Ireland🇮🇪: 3.6
Spain🇪🇸: 3.4
UK🇬🇧: 3.4
USA🇺🇸: 3.3
Germany🇩🇪: 2.7
Netherlands🇳🇱: 2.7
Canada🇨🇦: 2.4
Korea🇰🇷: 2.2
France🇫🇷: 1.7
Italy🇮🇹: 1.7
Japan🇯🇵: 1.5
Switzerland🇨🇭: 0.3
Costa Rica🇨🇷: -2.1

@OECD https://t.co/vNnUdboGts

159

74K

namruni retweeted

merve

@mervenoyann

about 1 month ago

Istanbul Open-source AI meet-up @huggingface was 🔥 we had many stories from building in-house models cutting costs to agentic apps 🙌🏼 many thanks @trendyoltech @nsrt_py @anil_ozturkk for hosting us 🤗

https://t.co/GhcZtx4Bqk

191

23K

namruni retweeted

Emre Savcı

@mstrYoda_

about 1 month ago

If anyone is curious about the free AI Agent courses offered by Anthropic, you can take a look here 👇 https://t.co/eFwbiKyAnT

namruni retweeted

François Chollet

@fchollet

about 1 month ago

I wrote Deep Learning with Python to be the definitive guide to how deep learning works and how to best make use of it. Tens of thousands of people got their career start via this book. 120,000 copies sold, and downloaded by millions more. And now it's free to read online: https://t.co/3CbcQ7hmjp

558

719K

namruni retweeted

̶P̶A̶R̶L̶A̶ 🇹🇷

@vldaII

2 months ago

In Australia, calls were made for the resignation of New South Wales Transport Minister Haylen after it emerged that the minister had taken an official car to lunch.

Haylen: "I made a mistake and I apologize for it. I should have called a private car." https://t.co/x9Y9C2iGPp

544

178K

namruni retweeted

Ronald van Loon

@Ronald_vanLoon

3 months ago

#Robot Barbershop: Precision Haircuts Powered by #Automation by @Fabriziobustama #Robotics #EmergingTech #Innovation #Technology

Mustafa Özdemir @namruni

3 months ago

@drmugeoz Hello, you look very much like your mother, Ms. Tuna..

Mustafa Özdemir @namruni

4 months ago

namruni's tweet photo. https://t.co/mgsU5tCMyf

116

namruni retweeted

Ahmad

@TheAhmadOsman

5 months ago

Hugging Face has released a 214-page
MASTERCLASS on how to train LLMs

> it’s called The Smol Training Playbook
> and if you want to learn how to train LLMs,
> this GIFT is for you

> this training bible walks you through the ENTIRE pipeline
> covers every concept that matters from why you train,
> to what you train, to how you actually pull it off

> from pre-training, to mid-training, to post-training
> it turns vague buzzwords into step-by-step decisions
> architecture, tokenization, data strategy, and infra

> highlights the real-world gotchas
> instabilities, scaling headaches, debugging nightmares

> distills lessons from building actual
> state-of-the-art LLMs, not just toy models

how modern transformer models are actually built

> tokenization: the secret foundation of every LLM
> tokenizer fundamentals
> vocabulary size
> byte pair encoding
> custom vs existing tokenizers

> all the modern attention mechanisms are here
> multi-head attention
> multi-query attention
> grouped-query attention
> multi-head latent attention

> every positional encoding trick in the book
> absolute position embedding
> rotary position embedding
> YaRN (Yet another RoPE extensioN)
> adjusted base frequency (ABF) for rotary embeddings
> no position embedding
> randomized no position embedding

> stability hacks that actually work
> z-loss regularization
> query-key normalization
> removing weight decay from embedding layers

> sparse scaling, handled
> mixture-of-experts scaling
> activation ratio tuning
> choosing the right granularity
> sharing experts between layers
> load balancing across experts

> long-context handling via ssm
> hybrid models: transformer plus state space models

data curation = most of your real model quality

> data curation is the main driver of your model’s actual quality
> architecture alone won’t save you
> building the right data mixture is an art,
> not just dumping in more web scrapes

> curriculum learning, adaptive mixes, ablate everything
> you need curriculum learning:
> design data mixes that evolve as training progresses
> use adaptive mixtures that shift emphasis
> based on model stage and performance
> ablate everything: run experiments to systematically
> test how each data source or filter impacts results

> smollm3 data
> the smollm3 recipe: balanced english web data,
> broad multilingual sources, high-quality code, and diverse math datasets
> without the right data pipeline,
> even the best architecture will underperform

the training marathon

> do your preflight checklist or die
> check your infrastructure,
> validate your evaluation pipelines,
> set up logging, and configure alerts
> so you don’t miss silent failures

> scaling surprises are inevitable
> things will break at scale in ways they never did in testing

> vanishing throughput? that usually means
> you’ve got a hidden shape mismatch or
> batch dimension bug killing your GPU utilization

> sudden drops in throughput?
> check your software stack for inefficiencies,
> resource leaks, or bad dataloader code

> seeing noisy, spiky loss values?
> your data shuffling is probably broken,
> and the model is seeing repeated or ordered data

> performance worse than expected?
> look for subtle parallelism bugs
> tensor parallel, data parallel,
> or pipeline parallel gone rogue

> monitor like your GPUs depend on it (because they do)
> watch every metric, track utilization, spot anomalies fast

> mid-training is not autopilot
> swap in higher-quality data to improve learning,
> extend the context window if you want bigger inputs,
> and use multi-stage training curricula to maximize gains

> the difference between a good model and a failed run is
> almost always vigilance and relentless debugging during this marathon

post-training

> post-training is where your raw base model
> actually becomes a useful assistant

> always start with supervised fine-tuning (sft)
> use high-quality, well-structured chat data and
> pick a solid template for consistent turns

> sft gives you a stable, cost-effective baseline
> don’t skip it, even if you plan to go deeper

> next, optimize for user preferences
> direct preference optimization (dpo),
> or related methods like kahneman-tversky optimization (kto),
> odds ratio preference optimization (orpo), or anchored preference optimization (apo)
> these methods actually teach the model
> what “better” looks like beyond simple mimicry

> once you’ve got preference alignment, go on-policy:
> reinforcement learning from human feedback (rlhf)
> or on-policy distillation, which lets your model learn
> from real interactions or stronger models
> this is how you get reliability and sharper behaviors

> the post-training pipeline is where
> assistants are truly sculpted;
> skipping steps means leaving performance,
> safety, and steerability on the table

infra is the boss fight

> this is where most teams lose time,
> money, and sanity if they’re not careful

> inside every gpu
> you’ve got tensor cores and cuda cores for the heavy math,
> plus a memory hierarchy (registers, shared memory, hbm)
> that decides how fast you can feed data to the compute units

> outside the gpu, your interconnects matter
> pcie for gpu-to-cpu,
> nvlink for ultra-fast gpu-to-gpu within a node,
> infiniband or roce for communication between nodes,
> and gpudirect storage for feeding massive datasets
> straight from disk to gpu memory

> make your infra resilient:
> checkpoint your training constantly,
> because something will crash;
> monitor node health so you can kill or restart
> sick nodes before they poison your run

> scaling isn’t just “add more gpus”
> you have to pick and tune the right parallelism:
> data parallelism (dp), pipeline parallelism (pp), tensor parallelism (tp),
> or fully sharded data parallel (fsdp);
> the right combo can double your throughput,
> the wrong one can bottleneck you instantly

to recap

> always start with WHY
> define the core reason you’re training a model
> is it research, a custom production need, or to fill an open-source gap?

> spec what you need: architecture, model size, data mix, assistant type
> transformer or hybrid
> set your model size
> design the right data mixture
> decide what kind of assistant or
> use case you’re targeting

> build infra for the job, plan for chaos, pick your stability tricks
> build infrastructure that matches your goals
> choose the right GPUs
> set up reliable storage
> and plan for network bottlenecks
> expect failures, weird bugs,
> and sudden bottlenecks at scale
> select your stability tricks in advance:
> know which techniques you’ll use to fight loss spikes,
> unstable gradients, and hardware hiccups

closing notes

> the pace of LLM development is relentless,
> but the underlying principles never go out of style
> and this PDF covers what actually matters
> no matter how fast the field changes

> systematic experimentation is everything
> run controlled tests, change one variable at a time, and document every step

> sharp debugging instincts will save you
> more time (and compute budget) than any paper or library

> deep knowledge of both your software stack
> and your hardware is the ultimate unfair advantage;
> know your code, know your chips

> in the end, success comes from relentless curiosity,
> tight feedback loops, and a willingness to question everything
> even your own assumptions

if i had this two years ago, it would have saved me so much time

> if you’re building llms,
> read this before you burn gpu months

happy hacking

446

130K

namruni retweeted

Zeynep Kurnaz @navavar

8 months ago

Qualitative researchers, I have a question for you. This term I would like to plan two different webinar series: 1- For beginners, where we start from the very beginning 2- Advanced level, from the analysis (coding) stage onward. I have Sunday and Tuesday evenings in mind. What do you think?

224

13K

Mustafa Özdemir @namruni

10 months ago

In the US, the shift toward vocational education driven by artificial intelligence has been striking, especially in recent years.

Mustafa Özdemir @namruni

10 months ago

Heroism has been handed down from the great-grandfather to the son, and from him to the grandson. My great-grandfather Mustafa Özdemir fought at Çanakkale and later retired. That is why our family's nickname is "tekavut" (retired). My great-grandfather's son, my uncle Mehmet, is a Korean War veteran. He is also the grandfather of Martyr (Şehit) Halil Özdemir.

War

@Aesthetic1War

10 months ago

Heroes of the Recent Era Series, Part 2.
THE LION OF NUSAYBİN: CAPTAIN HALİL ÖZDEMİR

Gendarmerie Captain Halil Özdemir and the story of heroism he left us. During the Trench Operations, after Sur was cleared, Captain Halil moved on to Nusaybin, which was overflowing with terrorist hideouts. https://t.co/gw5TyhUJrp

205

166

204K

530

Mustafa Özdemir @namruni

about 1 year ago

You can listen to the audio summary with NotebookLM... https://t.co/CYe4oaOOGe

100

Mustafa Özdemir @namruni

about 1 year ago

A comparison of our country and South Korea on the effect of mathematics education on artificial intelligence education... In the hope that all our teachers will read it... https://t.co/AuQqbhQzsp

Mustafa Özdemir @namruni

over 1 year ago

I think AI agents now enable all LLMs, which otherwise remain static and self-contained, to communicate with the outside world. It is as if there are 2 separate worlds. One is the giant LLMs, the other is every technology outside them. Agents are the intermediaries that carry all the communication in between. https://t.co/esjEpRzxwm

108

Mustafa Özdemir @namruni

over 1 year ago

Using 2 AI agents I built on the crewai framework and a web scrape tool, I asked the agents, on Twitter, "what the biggest problem in teacher appointments in our country has been over the last two years". They examined thousands of tweets and many sites.

155

Mustafa Özdemir @namruni

over 1 year ago

What's more, I did not instruct them in code to examine these sites. But the agent designed to improve quality carried out detailed research with the tools defined for it. Honestly, I was not expecting this much detail. In the end, after 5 loops, it delivered its final report to me.

115

Mustafa Özdemir @namruni

over 1 year ago

Lately there is an interesting situation in many academic papers. In every study I read on artificial intelligence in education, the citations are 2-3 years behind, which strikes me as odd. In AI, 2-3 years is like 2-3 centuries before AI. Please, my esteemed professors..

105

Mustafa Özdemir @namruni

over 1 year ago

namruni's tweet photo. https://t.co/wLMRiOX3vp

124
