SERG @cctue26 - Twitter Profile

cctue26 retweeted

noname

@malikwas1f

about 16 hours ago

450+ TPS for Gemma diffusion on 2x3090. 🚀🚀🚀 Recipe on club 3090 later today. https://t.co/9UC3a34UHw

4

96

9

56

4K

cctue26 retweeted

Rahul

@sairahul1

4 days ago

This is the best site on the internet to learn harness engineering. Free. Completely. Most AI engineers have never heard the term. https://t.co/bwDbTTYsjM Bookmark this site. Then read this setup ↓

https://t.co/ddEP0XowXM

53

3K

439

6K

438K

cctue26 retweeted

toyxyz

@toyxyz3

4 days ago

https://t.co/i9AVRHxfeI

0

25

4

14

3K

cctue26 retweeted

Photogenic Weekend

@PhotogenicWeekE

4 days ago

I have absolutely no idea what is doing what or how any of it runs, lol, but here is an all-in-one Workflow for Ideogram 4.0! https://t.co/OzsULjcZhW

3

319

20

413

30K


SERG @cctue26

6 days ago

@ostrisai nice, did the JSON fix the filter issue?

1

0

136

cctue26 retweeted

diva

@divaagurlxw

7 days ago

As an AI Engineer. Please learn
>Harness engineering, not just prompt engineering
>Context engineering, not just long prompts
>Prompt caching vs. semantic caching tradeoffs
>KV cache management, eviction, reuse, and memory pressure at scale
>Prefill vs. decode latency and why they optimize differently
>Continuous batching, paged attention, and throughput optimization
>Speculative decoding vs. quantization vs. distillation tradeoffs
>INT8, INT4, FP8, AWQ, GPTQ, and when quantization hurts quality
>Structured output failures, schema validation, repair loops, and fallback chains
>Function calling reliability, tool contracts, argument validation, and idempotency
>Agent guardrails, loop budgets, tool budgets, and termination conditions
>Model routing, graceful fallback logic, and degraded-mode UX
>RAG architecture: chunking, embeddings, hybrid search, reranking, and freshness
>Retrieval evals: recall, precision, grounding, attribution, and citation quality
>Evals: golden sets, regression tests, adversarial tests, LLM-as-judge, and human evals
>LLM observability as a first-class discipline: traces, spans, tokens, latency, errors, and drift
>Cost attribution per feature, workflow, tenant, and user journey not just per model
>Safety engineering: prompt injection defense, data leakage prevention, and permission boundaries
>Multi-tenant isolation, cache safety, and cross-user context contamination prevention
>Fine-tuning vs. in-context learning vs. RAG vs. distillation and when each is the wrong tool
>Latency, quality, cost, and reliability tradeoffs across the full inference stack
>Production failure modes: hallucinated tool calls, malformed JSON, stale retrieval, runaway agents, and silent eval regressions
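One item in the list above, "structured output failures, schema validation, repair loops, and fallback chains", can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `generate` is a hypothetical callable standing in for any model call, the "schema" is just a list of required keys, and the repair message wording is made up. It is not taken from the tweet or from any particular library.

```python
import json


def parse_with_repair(generate, prompt, required_keys, max_attempts=3):
    """Ask `generate` for JSON, validate it, and retry with feedback on failure.

    `generate` is a hypothetical callable: it takes a prompt string and
    returns the model's raw text reply.
    Returns (data, attempts_used), or (None, max_attempts) so the caller
    can move on to a fallback.
    """
    message = prompt
    for attempt in range(1, max_attempts + 1):
        raw = generate(message)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            problem = f"invalid JSON: {exc.msg}"
        else:
            if not isinstance(data, dict):
                problem = "top-level value is not an object"
            else:
                missing = [key for key in required_keys if key not in data]
                if not missing:
                    return data, attempt
                problem = f"missing keys: {missing}"
        # Repair step: tell the model what was wrong and ask again.
        message = (
            f"{prompt}\n\nYour previous reply was rejected ({problem}). "
            "Reply with a single JSON object only."
        )
    return None, max_attempts
```

The attempt cap doubles as a simple loop budget: when it is exhausted the function returns `None` instead of looping forever, which is where a fallback chain (a different model, a default value, an error shown to the user) would take over.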

104

4K

492

7K

238K

cctue26 retweeted

Hugging Models

@HuggingModels

7 days ago

Ever wanted an AI that can watch videos and understand them like a human? Meet Marlin 2B, a multimodal model that processes video, text, and captions together. It's small, fast, and built for temporal grounding. Game changer for video AI.

https://t.co/SBITCaa2K6

1

11

3

9

871

cctue26 retweeted

Ideogram @ideogram_ai

8 days ago

Introducing Ideogram 4.0: the best open image model in the world. Think it. Make it. Own it. Download the weights, fine-tune on your own data, and run it on your hardware. Live on every Ideogram plan and the API today.

410

8K

868

7K

2M

cctue26 retweeted

Joel - coffee/acc

@JoelDeTeves

11 days ago

Trying some tweaks for the dual 3090 setup using Ornstein3.6-27B-MTP-NSC-ACE-SABER + MTP:

- Tensor parallelism enabled - nice boost in perf. but you lose the ability to quantize kv cache. Still deciding if worth the trade off.
- When running two prompts at once, it levels out to around 33 tokens/second each - not bad!
- Prefill - 1st time takes a bit, but after that, it's snappy on both slots
- FP16 kv cache @ 65k context = around 17 GB per card or 34 GB total VRAM usage.

Conclusion: If you only plan to serve a dense model, this is a very viable setup for a small business scenario with multiple users. I think I can safely fit Gemma-E4B or maybe even Qwen3.5-9B to handle multi-modal cases (still have 14 GB of headroom left)

--model /models/Ornstein3.6-27B-MTP-NSC-ACE-SABER-Q4_K_M-MTP.gguf \
--alias Ornstein3.6-27B-MTP-NSC-ACE-SABER \
--split-mode tensor --n-gpu-layers 999 \
--ctx-size 65536 \
--parallel 2 \
--cache-type-k f16 \
--cache-type-v f16 \
--spec-type draft-mtp --spec-draft-p-min 0.0 --spec-draft-n-max 3 \
--spec-type ngram-mod --spec-ngram-mod-n-match 16 --spec-ngram-mod-n-min 24 --spec-ngram-mod-n-max 48 \
--spec-type ngram-map-k4v --spec-ngram-map-k4v-size-n 12 --spec-ngram-map-k4v-size-m 48 --spec-ngram-map-k4v-min-hits 1 \
--cont-batching \
--jinja \
--flash-attn on \
--temp 0.7 \
--top-p 0.8 \
--top-k 20 \
--presence-penalty 1.5 \
--min-p 0.00 \
--no-mmap \
--reasoning on --chat-template-kwargs '{"preserve_thinking": true}'

Overall: you get Sonnet 4.6 level at home and can be used by multiple people - MTP + ngram-mod + ngram-map makes this possible!
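The headroom and throughput figures in this tweet can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the tweet (about 17 GB used per card, two parallel slots at about 33 tokens/second each) plus one outside fact, the 24 GB of VRAM on an RTX 3090. The function names are made up for illustration, and the inputs are the poster's rough measurements, not exact values.

```python
GPU_COUNT = 2                 # dual 3090 setup from the tweet
VRAM_PER_GPU_GB = 24          # RTX 3090 memory size
USED_PER_GPU_GB = 17          # "around 17 GB per card" (approximate)
PARALLEL_SLOTS = 2            # matches --parallel 2
TOKENS_PER_SEC_PER_SLOT = 33  # "around 33 tokens/second each" (approximate)


def total_headroom_gb(gpu_count, vram_per_gpu_gb, used_per_gpu_gb):
    """Free VRAM summed over all cards."""
    return gpu_count * (vram_per_gpu_gb - used_per_gpu_gb)


def aggregate_tokens_per_sec(slots, tokens_per_sec_per_slot):
    """Combined decode throughput when every slot is busy."""
    return slots * tokens_per_sec_per_slot


headroom = total_headroom_gb(GPU_COUNT, VRAM_PER_GPU_GB, USED_PER_GPU_GB)
throughput = aggregate_tokens_per_sec(PARALLEL_SLOTS, TOKENS_PER_SEC_PER_SLOT)
print(f"Headroom: {headroom} GB total ({headroom // GPU_COUNT} GB per card)")
print(f"Aggregate throughput: {throughput} tokens/second")
```

The total comes out to the 14 GB of headroom quoted in the tweet, but it is split as 7 GB per card, which matters if a second model is meant to sit on a single GPU rather than be split across both.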

9

24

2

33

7K

cctue26 retweeted

GitHub Projects Community

@GithubProjects

15 days ago

How To Secure A Linux Server is an evolving guide that teaches server security fundamentals while walking through practical hardening steps.

- SSH hardening with key-based auth and 2FA/MFA
- Firewall setup using UFW and intrusion detection with PSAD and Fail2Ban
- File integrity monitoring with AIDE and rootkit detection with Rkhunter and chkrootkit
- Automatic security updates and audit logging with Lynis and OSSEC


10

843

145

894

33K

cctue26 retweeted

Linux Inside: The Ideal Blog for Sysadmins & Geeks

@tecmint

16 days ago

Quick Linux Tip #3

Need to see which ports are open and which process owns each one?

Use:

$ ss -tlnp

It lists all TCP listening ports with the process name and PID attached.

Faster than netstat, built into every modern Linux system, and no package to install.

Run this before opening a firewall port, and you'll know instantly if the service is actually bound and listening or just supposed to be.

Follow @tecmint for more #Linux tips
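For scripting the same check, the output of `ss -tlnp` can be parsed into port and process pairs. The Python sketch below is one possible way to do it, not part of the original tip. The `SAMPLE` text is invented for illustration (the processes and PIDs are hypothetical), and the parser assumes the usual column layout where the local address is the fourth field. Note that `ss` only shows the process column for sockets the caller is allowed to inspect, so without root some rows may have no process.

```python
import re
import subprocess

# Hypothetical output, shaped like what `ss -tlnp` prints.
SAMPLE = """\
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port  Process
LISTEN  0       128     0.0.0.0:22          0.0.0.0:*          users:(("sshd",pid=812,fd=3))
LISTEN  0       511     127.0.0.1:8080      0.0.0.0:*          users:(("nginx",pid=4021,fd=7))
LISTEN  0       128     [::]:22             [::]:*             users:(("sshd",pid=812,fd=4))
"""

# Matches the first ("name",pid=123 pair in the Process column.
PROCESS_RE = re.compile(r'\("([^"]+)",pid=(\d+)')


def parse_ss(output):
    """Turn `ss -tlnp` text into a list of dicts, one per listening socket."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] != "LISTEN":
            continue  # skip the header and anything unexpected
        addr, _, port = fields[3].rpartition(":")
        match = PROCESS_RE.search(line)
        rows.append({
            "addr": addr,
            "port": int(port),
            "process": match.group(1) if match else None,
            "pid": int(match.group(2)) if match else None,
        })
    return rows


def listening_ports():
    """Run `ss -tlnp` on this machine and parse the result."""
    result = subprocess.run(
        ["ss", "-tlnp"], capture_output=True, text=True, check=True
    )
    return parse_ss(result.stdout)


for row in parse_ss(SAMPLE):
    print(row["port"], row["addr"], row["process"], row["pid"])
```

Calling `listening_ports()` before opening a firewall port gives the same answer as the tip: if the port is missing from the result, nothing is bound to it yet.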


2

284

58

185

9K

cctue26 retweeted

witcheer

@witcheer

16 days ago

nvidia-smi on my headless Linux server for the first time. RTX 5090, 32GB VRAM, CUDA 13.2, Ubuntu 26.04. SSH from my macbook. 5W idle, 34°C, fan-stop mode. waiting for its first model. the 8GB era is over.

https://t.co/i3BKucXWJU

16

90

3

16

7K

cctue26 retweeted

0xSero

@0xSero

17 days ago

Deepseek-v4-pro: 0.45B tokens for $6.74
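To compare this with per-token price lists, the figure can be converted to a blended cost per million tokens. The short Python sketch below uses only the two numbers in the tweet. It treats all tokens alike, since the tweet gives no split between input, output and cached tokens, and the function name is made up for illustration.

```python
def usd_per_million_tokens(total_usd, total_tokens):
    """Blended cost per one million tokens."""
    return total_usd / total_tokens * 1_000_000


TOTAL_USD = 6.74
TOTAL_TOKENS = 0.45e9  # 0.45B tokens

rate = usd_per_million_tokens(TOTAL_USD, TOTAL_TOKENS)
print(f"${rate:.4f} per million tokens")
```

That works out to roughly one and a half cents per million tokens.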

77

2K

52

264

93K

cctue26 retweeted

Hugging Models

@HuggingModels

18 days ago

Meet the new financial QA powerhouse: convfinqa-qwen3.5-4b. It's a text generation model fine-tuned to answer complex financial questions with precision. Perfect for analysts, traders, or anyone who needs quick, reliable answers from financial data.

https://t.co/DG2T8Ck4B3

1

3

448

cctue26 retweeted

Lotto

@LottoLabs

18 days ago

It’s very simple

Find a 3090 or two
Get any mobo that supports 2 pcie x16 ports (at least x16x4 for lanes)
Get a 1200W+ PSU
Buy the cheapest ddr4 ram 64gb+ (you’re not using it anyways)
Install Linux, vLLM, Llama.cpp, SGlang, tailscale
Download any flavour of qwen 3.7 27b

You are now localmaxxing

99

1K

61

951

73K

cctue26 retweeted

Hugging Models

@HuggingModels

18 days ago

A massive 20B parameter model just dropped on Hugging Face, and it's already getting tons of love. This is a text generation powerhouse built for conversational AI and more. Check the tags, it's ready for production with vLLM support.

https://t.co/cS3fsv3tvR

5

26

4

15

3K

cctue26 retweeted

Hugging Models

@HuggingModels

18 days ago

Meet Qwen3.5-4B, a powerful image-text-to-text model that can understand both images and text. It's perfect for building smarter chatbots and assistants.

https://t.co/6RnnsrVZGu

1

5

1

5

498

cctue26 retweeted

Hugging Models

@HuggingModels

18 days ago

Meet MobileNetV3 Small, a super efficient image classifier that packs a punch. It can recognize 1000 different objects from photos, and it's been downloaded over 12 million times. This tiny model is a big deal for AI on the go.

https://t.co/H1LEPnR5pM

1

4

3

2

518

cctue26 retweeted

Hugging Models

@HuggingModels

18 days ago

Ever wished ComfyUI could auto-install everything for you? This model is an asset pack that simplifies setup for Flux, Wan, Qwen, HiDream, and more. It bundles diffusers, ONNX, safetensors, and GGUF formats. A must have for AI artists who hate manual configs.

https://t.co/VV6mx43eSh

1

7

1

14

705

cctue26 retweeted

Hugging Models

@HuggingModels

19 days ago

BERT is the AI that changed everything. This model, trained on Wikipedia and BookCorpus, understands language like no other. It's the foundation for countless NLP breakthroughs, and it's still one of the most downloaded models ever. 70 million downloads and counting.

https://t.co/9Rp7zsVHEK

1

7

2

0

578
