A question we hear a lot from @joinenrich product leaders is: how do we get the most out of our company's board?
A few helpful tips from @twilly, Chief Product Officer, @Opendoor:
I get asked a lot about what actually matters in the inference space. The conversation has shifted as OSS frameworks have closed much of the gap on raw latency, but workload-specific tuning remains an open problem. More and more of the differentiation now lives in the product layer around the infrastructure. What separates providers now:
Latency: for synchronous, latency-sensitive workloads, the ability to tailor deployments to meet specific needs (whether TTFT or e2e) is critical and highly dependent on token profiles and use case requirements.
Throughput & cost: these form a Pareto frontier with latency. Past a certain point, you can only improve one by giving up some of another.
Reliability: table stakes. Observability and alerting are a big part of this.
Developer velocity: underrated on most lists. Self-serve configurability is a massive force multiplier for sophisticated teams.
Autoscaling flexibility: not just "does it scale" but what triggers it and how fast.
Capacity: still a real constraint for newer hardware, and the geographic dimension for colocation can make this a harder constraint.
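To make the Pareto point concrete, here is a minimal sketch, assuming a handful of hypothetical deployment configs. The config names, the latency figures, and the cost figures are all made up for illustration and do not come from any real provider. It simply keeps the configs that no other config beats on both latency and cost:

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    """A hypothetical deployment config; all numbers below are illustrative."""
    name: str
    latency_ms: float     # end-to-end latency, lower is better
    cost_per_mtok: float  # cost per million tokens, lower is better


def pareto_frontier(configs: List[Config]) -> List[Config]:
    """Return the configs that are not dominated on (latency, cost).

    A config is dominated if some other config is at least as good on both
    axes and strictly better on at least one.
    """
    frontier = []
    for c in configs:
        dominated = any(
            o.latency_ms <= c.latency_ms
            and o.cost_per_mtok <= c.cost_per_mtok
            and (o.latency_ms < c.latency_ms or o.cost_per_mtok < c.cost_per_mtok)
            for o in configs
        )
        if not dominated:
            frontier.append(c)
    # Sort by latency so the trade-off curve reads left to right.
    return sorted(frontier, key=lambda c: c.latency_ms)


# Assumed example data: larger batches trade latency for lower cost.
configs = [
    Config("small-batch", 120.0, 0.90),
    Config("medium-batch", 200.0, 0.55),
    Config("large-batch", 450.0, 0.30),
    Config("misconfigured", 300.0, 0.80),  # slower AND pricier than medium-batch
]

for c in pareto_frontier(configs):
    print(f"{c.name}: {c.latency_ms} ms, ${c.cost_per_mtok}/Mtok")
```

Every config left on the frontier is a defensible choice for some workload; the one that drops out is simply worse than an alternative on both axes.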
This feels like a dream come true since I started Replit. As a designer, I've always found it challenging to collaborate smoothly with my engineering teammates. For a long time, I imagined what it would look like to have a true source of truth that makes collaboration effortless.
The infinite canvas didn't just happen by accident. I'm incredibly proud of the team, and it's been such a fun experience getting to work on cutting-edge technology while solving really hard problems.
If this resonates with you, come join us in building the future.