Álvaro Tejeda @ATL_85 - Twitter Profile

2 months ago

@yuriymatso How was your performance during the period Oct-25 to Dec-25? That is where I find most of my models taking the bigger DD.

0

1

0

213

Álvaro Tejeda @ATL_85

6 months ago

Free course: Forecasting for Data Scientists by @CBergmeir. Strong foundations and very practical. Worth checking out. https://t.co/3jUSVCevqj

0

15

Álvaro Tejeda @ATL_85

7 months ago

@MapleStax Trade

4

0

23

Álvaro Tejeda @ATL_85

8 months ago

@SRxTrades checklist

4

0

15

ATL_85 retweeted

vLLM

@vllm_project

8 months ago

🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai , exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support. 🧠 Compresses visual contexts up to 20× while keeping 97% OCR accuracy at <10×. 📄 Outperforms GOT-OCR2.0 & MinerU2.0 on OmniDocBench using fewer vision tokens. 🤝 The vLLM team is working with DeepSeek to bring official DeepSeek-OCR support into the next vLLM release — making multimodal inference even faster and easier to scale. 🔗 https://t.co/rnBG9VUuMy #vLLM #DeepSeek #OCR #LLM #VisionAI #DeepLearning

53

3K

358

1K

2M

Álvaro Tejeda @ATL_85

10 months ago

The @coolturalfest in Almería is so ridiculously badly organized that it's epic: a two-hour queue to get in. Never again! #nuncamas

1

7

0

196

ATL_85 retweeted

Sumanth

@Sumanth_077

11 months ago

Stanford CS336: Large Language Models from Scratch! This is a comprehensive course on LLMs that covers the full process of building one from scratch, including data collection, pretraining, transformer architecture, training, evaluation, and deployment. https://t.co/QGm9kttq5m

9

662

112

1K

50K

ATL_85 retweeted

Dorsa

@dorsa_rohani

11 months ago

New fastest shortest-path algorithm in 41 years! Tsinghua researchers broke Dijkstra’s 1984 “sorting barrier,” achieving O(m log^(2/3) n) time. This means faster route planning, less traffic, cheaper deliveries, and more efficient networks - and a CS curriculum revamp =) https://t.co/MMuK1x8jRH

332

29K

3K

14K

2M

ATL_85 retweeted

Joseph Suarez 🐡

@jsuarez

12 months ago

https://t.co/DEMnbqPmtw

19

2K

178

4K

357K

ATL_85 retweeted

Mayank Pratap Singh

@Mayank_022

12 months ago

I trained a 100 million parameter DeepSeek V3 LLM from scratch. Here's what you need to know.

Previously I trained a traditional GPT-2 architecture, which has become obsolete with recent LLM advancements. Most recent models like Llama, Mistral, DeepSeek, and GPT-4 use the latest architectures.

✦ Model Configuration of my SLM DeepSeek V3
- Parameters: 109,032,032
- Embedding Dimension: 512
- Layers: 8
- Heads: 8
- Experts (MoE): 8
- Experts per token: 2

✦ DeepSeek brings major architectural changes:
- Multi Head Latent Attention
- Mixture of Experts
- RMS Norm
- Multi Token Prediction

✦ Dataset Challenge
- TinyStories is great for learning SLMs. I trained GPT-2 on it previously with good results.
- But I needed a more challenging dataset.
- If I use TinyStories again on DeepSeek, how would I know whether MHLA, MoE or MTP works better than the old architecture?
- The old architecture can handle it, so the new DeepSeek would too, without utilizing the latest advancements.

That's why I moved to the FineWeb-Edu dataset. Thanks @YuvrajS9886 for suggesting this dataset.

✦ Training Journey
- Rented an A100 PCIe GPU and trained the model.
- Did test runs. During the final run, the model was 65% trained but stopped due to a glitch after 4 hours.
- Fixed all edge cases and ran training again with increased config parameters.
- Final training: 7 hours, 20,000 epochs

𝐓𝐨𝐭𝐚𝐥 𝐆𝐏𝐔 𝐜𝐨𝐬𝐭: $17
- $9.53 for main 7-hour run
- $7.42 for experiments and demos

✦ Reflection
Amazing long project that taught me the latest architectural advancements. I'll reimplement and revisit after a few weeks because there's too much complexity, mostly in the Multi Head Latent Attention part. Need to make concepts stronger.

Code https://t.co/9HHdTUhJT0
Final trained Model https://t.co/FediD7hDWE
Dataset https://t.co/XkbxsCoe6F

Resources
Huge shoutout to @raj_dandekar again for creating one of the most detailed video series about DeepSeek - this was my primary resource for the implementation.
Playlist https://t.co/89VKpyhgUe

Blogs by @MaartenGr
These are excellent visual blogs to understand MoE in detail. Thanks Maarten for your amazing contributions to the community through your books and blogs https://t.co/kxKj4zrU5g
Blogs on MoE https://t.co/mWT5tYkhZB

Implementation of MoE from scratch by @aviTwit3 https://t.co/Bd5VzCnjXZ
One of the most detailed blogs on implementing Mixture of Experts. Thanks Avinash for this blog - it helped me understand Mixture of Experts much better.

If you're someone in the 𝐌𝐋 & 𝐋𝐋𝐌 space, would love to 𝐜𝐨𝐧𝐧𝐞𝐜𝐭 and discuss this field in general, so give a follow up for that.

15

740

112

1K

48K

ATL_85 retweeted

𝗿𝗮𝗺𝗮𝗸𝗿𝘂𝘀𝗵𝗻𝗮— 𝗲/𝗮𝗰𝗰

@techwith_ram

about 1 year ago

Top 50 LLM Interview Questions:

36

3K

420

9K

3M

ATL_85 retweeted

Deedy

@deedydas

about 1 year ago

DeepSeek just dropped the single best end-to-end paper on large model training.

It covers
- Software (MLA, training in FP8, DeepEP, LogFMT)
- Hardware (Multi-Rail Fat Tree, Ethernet RoCE switches)
- Mix (IBGDA, 3FS filesystem)

DeepSeek's engineering depth is insane. Must read. https://t.co/RzHu6WNZVS

44

4K

660

4K

328K

ATL_85 retweeted

elvis

@omarsar0

about 1 year ago

AI Agents vs. Agentic AI

Interesting paper summarizing distinctions between AI Agents and Agentic AI.

It also talks about the key ideas, solutions, and the future.

Here are my notes: https://t.co/aKzH3at5oH

225

6K

1K

10K

753K

ATL_85 retweeted

Tom Yeh

@ProfTomYeh

about 1 year ago

Autoencoder by hand✍️Excel~ I designed this exercise to show how an Encoder-Decoder network converts input to code and reconstructs input from code. It is annotated with equations, PyTorch, and graphs. 👇Join the 'AI Math' community. Download xlsx.

9

2K

267

1K

102K

Álvaro Tejeda @ATL_85

about 1 year ago

Awesome event by @AiBirras ! The AI future in Granada is brighter than ever! Listening to experts such as @draxus is always a pleasure!

0

1

0

19

Álvaro Tejeda @ATL_85

over 1 year ago

A must-see: Stanford Machine Learning Course: CS229 https://t.co/SXmRFcU1w6 #ai #stanford #llm #genai

0

42

ATL_85 retweeted

Máster DATCOM UGR @DatcoMugr

over 1 year ago

I am pleased to introduce my good colleague and friend @ATL_85, who will give a talk on professional careers in Data Science and Engineering, with advice to help our students progress in the world of business intelligence. https://t.co/IymeuQXRHa

0

3

2

0

243

ATL_85 retweeted

Markus J. Buehler

@ProfBuehlerMIT

over 1 year ago

We trained a graph-native AI, then let it reason for days, forming a dynamic relational world model on its own - no pre-programming. Emergent hubs, small-world properties, modularity, & scale-free structures arose naturally. The model then exploited compositional reasoning & uncovered uncoded properties from deep synthesis: Materials with memory, microbial repair, self-evolving systems. Video shows it unfolding, made with @grok @xai.

115

2K

331

2K

359K

ATL_85 retweeted

John Rush

@johnrushx

over 1 year ago

I've tried all (24) AI coding agents & IDEs 😵‍💫 [Cursor, Softgen, Windsurf, Wrapifai, Copilot, Lovable, Bolt, v0, Replit, MarsX, Claude, AmazonQ, Pear, Devin, Github Spark, IDX, Webdraw, Tempo, Cline, Continue, Databutton, Base44, Qodo, Aider] The Vibe Coding giga-thread:

468

13K

2K

28K

2M

ATL_85 retweeted