Ramp Labs @RampLabs - Twitter Profile

7 days ago

We built this to earn trust from Ramp customers, who rely on us for their cards, expenses, and payments. If you have a background coding agent, you can build a similar scan for your customers. Full article: https://t.co/zelxkw9FS5

Ramp Labs

@RampLabs

7 days ago

https://t.co/YHN5Hy4Ddf

Ramp Labs

@RampLabs

7 days ago

We deployed 10,000 background agents to security-scan our codebase. The system is simple, scales with compute, and runs on publicly available models. From the scan, we fixed several high-severity vulnerabilities.

Ramp Labs

@RampLabs

7 days ago

The scan pipeline is model-agnostic, and does not require a frontier model to drive it. We evaluated several models against our confirmed vulnerabilities, and found that cheaper open-weight models still surface high-severity issues.

Photo: https://t.co/4J8IVpbbNv
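One way to read "evaluated several models against our confirmed vulnerabilities" is as a per-model recall score: of the issues already confirmed, what fraction did each model surface? The thread does not publish the evaluation code, so the following is only a minimal sketch. The function name, the vulnerability IDs, and the model labels are all hypothetical placeholders, not Ramp's data.

```python
def recall_by_model(confirmed, findings_by_model):
    """Return, per model, the fraction of confirmed vulnerability IDs it surfaced.

    Findings that are not in the confirmed set (possible false positives)
    are ignored here; this measures recall only, not precision.
    """
    confirmed_set = set(confirmed)
    if not confirmed_set:
        raise ValueError("need at least one confirmed vulnerability")
    return {
        model: len(confirmed_set & set(found)) / len(confirmed_set)
        for model, found in findings_by_model.items()
    }


# Hypothetical placeholder data, for illustration only.
confirmed_ids = ["VULN-1", "VULN-2", "VULN-3", "VULN-4"]
findings = {
    "frontier-model": ["VULN-1", "VULN-2", "VULN-3", "VULN-4", "UNCONFIRMED-9"],
    "open-weight-model": ["VULN-1", "VULN-3", "VULN-4"],
}
scores = recall_by_model(confirmed_ids, findings)
```

Because the scorer only needs a list of finding IDs per model, swapping the model that drives the scan does not change the evaluation, which is the sense in which a pipeline like this can be model-agnostic.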

Ramp Labs

@RampLabs

27 days ago

We built a synthetic RL environment with 14 finance task types, gave the model 3 tools and 15 turns, and let it learn how to navigate workbooks on its own. Information retrieval was a huge bottleneck for our spreadsheet agent, and Fast Ask helped solve it. Full writeup: https://t.co/3SxTLWqt5V

Ramp Labs

@RampLabs

27 days ago

We partnered with @PrimeIntellect to build Fast Ask, a small RL-trained subagent that helps our Sheets agent find answers in spreadsheets. It scores +4% over Opus on exact match accuracy at Haiku latency.

Ramp Labs

@RampLabs

27 days ago

This was a good fit for RL because spreadsheet retrieval is repeated often, latency sensitive, and has clean feedback. The model either returns the right cent amount, date, invoice ID, yes/no, or row reference, or it does not. That let us optimize the retrieval policy directly with deterministic rewards.
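A deterministic reward of the kind described here can be sketched as an exact-match check with light normalization per answer type. This is an assumption about the general shape, not the reward used in the actual training run. The `kind` labels and the normalization rules below are hypothetical.

```python
from decimal import Decimal, InvalidOperation


def normalize(answer: str, kind: str):
    """Canonicalize an answer so formatting differences do not affect the match."""
    text = answer.strip()
    if kind == "amount":
        # Strip currency formatting, then compare as exact decimals (to the cent).
        cleaned = text.replace("$", "").replace(",", "")
        try:
            return Decimal(cleaned)
        except InvalidOperation:
            return cleaned  # Unparseable amounts only match if identical as text.
    if kind == "yes_no":
        return text.lower()
    # Dates, invoice IDs, and row references are compared verbatim after trimming.
    return text


def exact_match_reward(predicted: str, expected: str, kind: str) -> float:
    """Return 1.0 for an exact match after normalization, else 0.0."""
    return 1.0 if normalize(predicted, kind) == normalize(expected, kind) else 0.0
```

For example, `exact_match_reward("$1,234.50", "1234.5", "amount")` gives full reward, while an answer that is off by one cent gives none. There is no partial credit, which keeps the feedback signal unambiguous.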

Ramp Labs

@RampLabs

27 days ago

https://t.co/2nYGsxMltD

Ramp Labs

@RampLabs

about 1 month ago

AI token spend is climbing fast as companies put agents into real workflows. Don’t let agents decide how much they should spend. Track, forecast, and control AI spend by team, model, and project → https://t.co/vKqAkT0yez

Ramp Labs

@RampLabs

about 1 month ago

At Ramp, we've seen AI token spend skyrocket 13x among our customers since last January. We ran experiments where coding agents managed their own token budgets. They ignored them completely, so we employed a separate controller model to approve spend on their behalf.
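The separation of roles can be sketched as follows: the agent requests tokens, and a separate component decides whether to approve. In the experiments the controller was a model. The sketch below substitutes a simple rule-based gate, so it only illustrates the approval interface and the hard numbers (budget, spent, remaining) that such a controller could be given. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BudgetController:
    """Rule-based stand-in for a controller that approves token spend."""

    budget_tokens: int
    spent_tokens: int = 0

    @property
    def remaining_tokens(self) -> int:
        return self.budget_tokens - self.spent_tokens

    def review(self, requested_tokens: int) -> bool:
        """Approve a request only if it is positive and fits the remaining budget."""
        if requested_tokens <= 0 or requested_tokens > self.remaining_tokens:
            return False
        self.spent_tokens += requested_tokens
        return True


controller = BudgetController(budget_tokens=1000)
first = controller.review(600)   # Fits the budget, so it is approved.
second = controller.review(500)  # Would exceed the budget, so it is denied.
```

The agent never mutates the budget itself. It can only ask, so ignoring the budget is no longer an option available to it.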

Ramp Labs

@RampLabs

about 1 month ago

Controllers consistently followed unverified advice over the coding agent’s work right in front of them. Even with a warning that the advice might be wrong, accuracy was well below a coin flip for most models. Only one condition produced accurate decisions across the board: grounding the controller with hard numbers.

Ramp Labs

@RampLabs

about 1 month ago

https://t.co/C4QSksoEbj

Ramp Labs

@RampLabs

about 2 months ago

Conceptually, this is a bit like taking notes. Sometimes you’re trying to build a body of knowledge over time, and the details matter because they accumulate into something larger. In those cases, you want to preserve context rather than compress it too early. With harder problems, you’re often sketching ideas, exploring directions, and following threads that may or may not lead anywhere. Most of what gets written down in that process isn’t meant to last. Latent Briefing = saving time and money 😎 Full writeup: https://t.co/zft2G5HUw1

Ramp Labs

@RampLabs

about 2 months ago

Introducing Latent Briefing, a way for agents to quickly share their relevant memory directly. Result: 31% fewer tokens used, same accuracy. Multi-agent systems are powerful, but can be wildly inefficient. They pass context as tokens, so costs explode and signal gets lost. We built an algorithm that allows agents to communicate KV cache to KV cache.

Ramp Labs

@RampLabs

about 2 months ago

We ran RLM on LongBench v2 across various document lengths and difficulty levels, observing a 30% median token reduction with a consistent +3% accuracy boost. We also found that the optimal compaction level is dynamic: Longer documents benefit from lighter compaction, while harder tasks require more aggressive filtering.

Ramp Labs

@RampLabs

about 2 months ago

https://t.co/1gEFc0KoMN

Ramp Labs

@RampLabs

2 months ago

We steered one toward Bitcoin and asked for a haiku. It wrote a maxi haiku. Then panicked. Then wrote a "neutral" one. Still about Bitcoin. Then apologized. Then wrote another one. Still about Bitcoin. "I'm a Bitcoin maximalist, but I'm also a responsible AI." One week only → https://t.co/lPT4Iv548d

Ramp Labs

@RampLabs

2 months ago

Introducing Steer AI. We made an AI that can't stop thinking about any concept you choose, by steering a model's internal representations at inference time. Ask it anything, and watch it bend reality around that concept. Available for one week only.
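A common way to steer internal representations at inference time is to add a scaled concept direction to a layer's hidden state. The thread does not say which method Steer AI uses, so the sketch below shows only that generic additive form, on plain Python lists. The function name, the vectors, and the strength value are hypothetical.

```python
import math


def steer(hidden, direction, strength):
    """Return hidden + strength * unit(direction), elementwise.

    `hidden` stands in for one hidden-state vector and `direction` for a
    concept vector of the same length; both are illustrative assumptions.
    """
    if len(hidden) != len(direction):
        raise ValueError("hidden and direction must have the same length")
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    return [h + strength * d / norm for h, d in zip(hidden, direction)]


steered = steer([1.0, 2.0], [3.0, 4.0], strength=5.0)
```

Normalizing the direction makes `strength` the size of the nudge regardless of how the concept vector was scaled. A strength of zero leaves the hidden state unchanged.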

Ramp Labs

@RampLabs

2 months ago

We steered one toward Jeep Grand Cherokee and told it "she left me…" It offered emotional support for two sentences. Then: "Before we go too far, let's just acknowledge that this is a seriously capable vehicle." It tried. It really tried.

Photo: https://t.co/P0dwn7xSUL
