Your GLB is probably way too heavy.
AI models, scans, exports, they often land at 5–60MB+ with millions of tris. Brutal for games, apps, and real-time 3D.
I built GLB Shrink to help the dev community fix this:
drop → preview before/after → download
Real test: 58 MB → 869 KB (−99%)
Draco + WebP. No Blender. No CLI.
Free & open source 🎮
https://t.co/fe3MqDbfNK
#gamedev #indiegame
Ported Google's Draco decoder to pure JavaScript.
https://t.co/jbXeB6AzsF
4.3× smaller than the WASM build, byte-for-byte identical output, often faster once you factor in load, init and parse.
This Perplexity article is really about control.
Search used to be designed for humans. You type a query. You get ten links. Maybe some snippets. Done.
Agents are different.
They do not need a pretty results page. They need raw materials, intermediate state, parallelism, retries, filters, joins, scoring signals, and ways to compress messy retrieval work before it hits the model context.
That is why “search as one tool call” starts to break.
It forces agents into serial behavior:
- Search
- Read
- Think
- Search again
- Read more
- Pollute context
- Repeat until expensive
Search as Code moves that work into executable pipelines. The model can generate code once, run a large retrieval strategy inside a sandbox, and return a cleaner result.
The interesting shift:
Search becomes less like a service.
More like infrastructure agents can program.
That probably matters for research agents, shopping agents, enterprise agents, security agents, and anything that needs lots of fresh information without burning infinite tokens.
@perplexity_ai
https://t.co/N1dBLTBx8n
Qwen3.6 35B A3B can't fill out a paper form on its own. But give it NVIDIA's LocateAnything-3B — the #1 trending model on HuggingFace — as its eyes, and the two small models get it done together.
(The test: place each element at the right pixel position on a blank form image, not type into a field.)
Setup:
> Qwen is the brain (main model), LocateAnything is the eyes (helper model acting as a tool).
> I gave Qwen a new tool: ask "where's the email field?" and LocateAnything returns the exact x, y, width, height.
> The blue boxes on the screen are its detections. Look how tight they are — it nails every field.
Result:
> Qwen3.6 35B A3B + LocateAnything-3B: form completed, all info correct.
> Name, DOB, ID, gender, marital status, nationality, email, phone, address, postal code: all landed in the right field areas.
> Character-box alignment still a touch loose, but every value is where it belongs.
> 9m10s, 224.5k input, 24.3k output, 21 turns.
Why it matters:
> Qwen alone can't finish this test. Bolt on a 3B model that does exactly one thing > locate > and suddenly it can.
> A combination of small models can do the work of a single large one.
The GitHub repo has the calculator, Gateway catalog handling, pricing fallback, and routing agent if you want to inspect how the cost math works.
https://t.co/hEBey9RGVx
Model choice is a product decision, not just an engineering preference.
The same feature can have very different cost profiles depending on input tokens, output tokens, pricing tiers, and whether the task really needs a stronger model.
I built a token economics demo with two pieces:
- A model cost calculator
- Scenario presets like tiny chat, agent step, and large context
- Curated model pricing from @vercel AI Gateway
- Fallback pricing data
- A live routing demo between GPT-4o mini and GPT-5 mini
The routing demo uses a ToolLoopAgent to classify the request as simple or complex.
Then it streams the selected fulfillment agent and displays the estimated routed cost, alternate cost, token estimate, and cost delta.
Good routing makes cost visible before it becomes a surprise.
Introducing TripoSplat: a fully open-sourced model under the MIT license that converts a single 2D image into high-quality 3D Gaussians.
Developed by @vastairesearch, TripoSplat is designed as a powerful pipeline tool for asset creation, AR/VR, game development, simulation environments, and more 👇
I included the Braintrust wrapper and dashboard queries in the repo, so you can see exactly how the traces become cost, token, latency, and model breakdown views.
Here's the repo: https://t.co/hEBey9RGVx
AI features need the same visibility as the rest of the product.
If you cannot see cost, latency, token usage, model mix, and recent traces, it is hard to know what changed when quality or spend moves.
I wired observability into the app with @braintrust :
- Wrapped AI SDK functions
- Wrapped ToolLoopAgent
- BTQL queries against project traces
- Range filters for 1D, 7D, and 14D
- Summary cards for spans, queries, cost, tokens, and latency
The dashboard also shows cost trends, model breakdowns, and recent LLM spans.
That makes debugging less abstract.
You can connect product behavior to the actual model calls behind it, then spot which models, requests, or spans are driving cost and latency.
You can’t outwork the whole world. There’s always going to be someone somewhere willing to work as hard as you. Someone just as hungry. Or hungrier.
Assuming you can work harder and longer than someone else is giving yourself too much credit for your effort and not enough for theirs. Putting in 1,001 hours to someone else’s 1,000 isn’t going to tip the scale in your favor.
What’s worse is when management holds up certain people as having a great “work ethic” because they’re always around, always available, always working. That’s a terrible example of a work ethic and a great example of someone who’s overworked.
A great work ethic isn’t about working whenever you’re called upon. It’s about doing what you say you’re going to do, putting in a fair day’s work, respecting the work, respecting the customer, respecting coworkers, not wasting time, not creating unnecessary work for other people, and not being a bottleneck. Work ethic is about being a fundamentally good person that others can count on and enjoy working with.
So how do people get ahead if it’s not about outworking everyone else?
People make it because they’re talented, they’re lucky, they’re in the right place at the right time, they know how to work with other people, they know how to sell an idea, they know what moves people, they can tell a story, they know which details matter and which don’t, they can see the big and small pictures in every situation, and they know how to do something with an opportunity. And for so many other reasons.
So get the outwork myth out of your head. Stop equating work ethic with excessive work hours. Neither is going to get you ahead or help you find calm.
[The Outwork Myth — It Doesn't Have To Be Crazy At Work, 2018]
Procedural Locomotion rig testing in @threejs using @garrettkjohnson's "closed-chain-ik-js" module. 3D Model by Jonas Prunskus. You can play with this too, link below. Want to know how this was done? Let me know below! 🕷️
> Make it unhinged 90s anime and a cybernetic arm. No embellishments, no neon, no sparks, keep it raw. Like it's a hand being used for the first time. Atmospheric.
whole X is excited about LocateAnything vision-language detector
I spend some time testing it today, and I'm getting mixed results
all my prompts in thread bellow
prompt: player wearing white
The implementation is in the repo if you want to see the staged guardrails flow, from input checks to output checks to the human approval gate.
https://t.co/hEBey9RGVx
Guardrails are easier to understand when you can see where they run.
I built a protected email demo with three paths: a safe partner update, an input-blocked packet with injection and fake secrets, and an output-blocked packet with a hostile requested tone.
The implementation checks the flow in stages:
- Input guardrails for jailbreaks, prompt injection, secrets, and high-risk PII
- Structured email drafting with generateText()
- Output guardrails for schema compliance and workplace-safe tone
- A fake send tool with needsApproval: true
The important part is the handoff.
Passing input checks does not mean the output is safe.
Passing output checks does not mean the action should run automatically.
The final send step pauses for human approval, and the demo never sends a real email.
The repo includes the context architecture layer if you want to see source profiles, metadata filters, context-pack assembly, token budgeting, and filtered vs unfiltered retrieval measurements.
https://t.co/hEBey9RGVx
Context architecture is about deciding what the model should see, not just retrieving more text.
I built a demo that separates persistent ARC Raiders knowledge from per-session task context, then uses filters and token budgets to shape the context pack.
The implementation supports source profiles like:
- All sources
- Official knowledge
- Community items
- Patch change records
It can filter by source type, entity name, topic tag, and date range before assembling the final context.
The app also measures source precision.
It compares unfiltered vs filtered retrieval for scenarios like official gameplay knowledge, item lookup, and patch-note change extraction.
The goal is simple: give the model the right context, not the most context.
RF-DETR is now available in @huggingface transformers
state of the art in both detection and segmentation, outperforming YOLO architectures
- checkpoints: https://t.co/bYjEpSsG9j
- demo: https://t.co/jbtcctiSaB
- docs: https://t.co/sptIDxe8iy
RF-DETR just landed to @huggingface transformers 🥵🔥
sota real-time detection & segmentation models by @roboflow 💜
> play with our real-time demo
> fine-tune the models on your use case with our tutorials (takes a toaster's VRAM)
> or just hand them to your agents 😄