LangWatch

2 months ago

Day 4 of LangWatch Skills Week:

Since the end of last year, we have been seeing more and more AI enablement teams consolidating Agent Development Lifecycle tooling from different teams, from homegrown evals to basic DB logging. They now need a single solution for all of those teams, as agent quality becomes the top priority.

Maybe you had Langfuse for tracing and DeepEval for some local evaluators, but the collaboration with domain experts and PMs is still not happening. As a dev, you still need to solve everything yourself, and you never have time to add proper evals or agent testing because they always get pushed to "the next sprint".

So we made a video on how to shorten this consolidation time to essentially zero. Thanks to Skills, your coding assistant can now organize all your agent development tools, so you get best practices implemented and a single, collaborative platform for all the AI teams.

[Photo attached to _rchaves_'s tweet]

LangWatchAI retweeted

3 months ago

now stop making excuses and add evals and simulation tests to your agent already! check out all the LangWatch skills at: https://t.co/CYe9RayFZy

LangWatchAI retweeted

3 months ago

Day 2 of LangWatch Skills Week

Setting up evaluations in 1 minute and 38 seconds

In this video I use the LangWatch Skill to build a multimodal evaluation for the InField Agent, an agriculture tool that analyzes satellite images. No existing dataset, no previous eval setup.

2 months ago

Watch this video on how to use LangWatch Skills to migrate from regular logging tools to a complete AgentOps platform for collaborating with engineers and PMs. https://t.co/sAsfyiNu7Y

3 months ago

Day 2 of LangWatch Skills Week. Today: evaluations. We're showing how to add a multimodal evaluation to an agent that analyzes images, using one ask to Claude Code. The skill reads your project and does that work for you. Get started: → https://t.co/ZTXUvs6J00

LangWatchAI retweeted

3 months ago

Launching LangWatch Skills today: instrument, observe, and fix your agent entirely from your coding assistant. Here's what we built and why 🧵

[Photo attached to _rchaves_'s tweet] https://t.co/9NYFyYgd99

3 months ago

https://t.co/lCIvl4WVLH

3 months ago

We just launched LangWatch Skills. Tell your coding agent to instrument, observe, and fix your agent. It does the setup. You just ask. → https://t.co/kzi6RlVaHW

Quick video explainer: https://t.co/lCIvl4WVLH

3 months ago

@jp_aidev @hasantoxr Absolutely!

4 months ago

@nebiusai @nebiustf 👀

LangWatchAI retweeted

Nebius

@nebiusai

4 months ago

Builder Hour: Token Factory - All Around Agents

Feb 18 · 18:00 CET / 9:00 AM PST

Join Nebius for a live session with:
- New @nebiustf models + UI updates
- @LangWatchAI on testing agents in prod
- Live demos with Claude Code
- Open builder Q&A

(Thread 1/2)

[Photo attached to nebiusai's tweet] https://t.co/TfmSnIgMFP

LangWatchAI retweeted

Sujee Maniyam

@sujee_dev

4 months ago

Nebius Token Factory Builder Hour #2 is here.

🗓️ Feb 18, 2026 (Wed): 9am PST / 12:00 EST / 18:00 CET

Grab your favorite drink, bring your laptop, and join us for an interactive session where we connect, learn, and build together.

In this Builder Hour:
- What's new in Token Factory
- Partner chat with @_rchaves_, Co-Founder & CTO at @LangWatchAI
- Live demo on how to test agents and models in pre-production and production
- Builder Chat: Using Open models with @claude_code (Live demo!)

All registrants will receive the recording, session notes, and credits.

👉 See you there: https://t.co/Z4A2k07uSL

@nebiusai @nebiustf

[Photo attached to sujee_dev's tweet]

4 months ago

@openclaw took the AI world by storm last week 🦞

We ran a hackday at LangWatch and now have Clawdbots living in our Slack, boosting eng productivity by checking logs, errors, and reviewing PRs, all in our own infra.

But… what is Clawdbot actually doing? Any risky business? 👀 We need observability.

Until this weekend, OpenClaw had none. The OSS momentum has been 🔥

On Sunday, @LangWatchAI and @RedHat independently started adding OTEL to OpenClaw, then quickly teamed up to collaborate instead of competing.

Goal: fully OTEL-instrumented OpenClaw, compliant with the OTEL GenAI spec. Work’s ongoing, but you can already use it today 👇 https://t.co/T8efbX5ATB

4 months ago

Same eval power. Much more accessible. And everything finally connects. Demo below 👇 https://t.co/wqBVmwWIXx

4 months ago

• Run experiments where your data lives, in a single spreadsheet-like workbench
• See inputs, outputs, expected answers, metrics, latency, and cost side-by-side
• Iterate faster on prompts and models and immediately compare results
• Add evaluators in seconds (goldens, LLM judges, comparisons, policy checks)
• Understand the why, not just the score, with evaluator explanations
• One flow for PMs and engineers, UI + SDK, fully connected
• Inspect real executions end-to-end by jumping straight into traces
• Compare runs over time and share results with stakeholders

4 months ago

Everything is coming together. Proud to announce Evals & Experimentation V3 🚀

Evals and agent testing are still the hardest (and most important) part of building LLM apps. But the real challenge isn’t running evals; it’s making them usable across the team.

With Evals & Experimentation V3, we focused on that:
