Minish @minishlab - Twitter Profile

8 days ago

- JSON output: CLI and MCP results are now structured JSON, making Semble easier to use from agents and scripts.
- Better file control: use `.sembleignore` to exclude files or force-include custom extensions.

Release notes: https://t.co/H4rO6DTC4O

Minish @minishlab

8 days ago

We’ve just released Semble v0.3.0!

Biggest changes:

- Automatic disk caching: First search builds the index; later searches are (much) faster.
- Search more than code: new `--content` flag for searching docs, config, code, or all of them together.

🧵 https://t.co/ySf0TD8piw

Minish @minishlab

17 days ago

@CommonSenseTOC Hey, thanks! This is something we are actively working on and we will likely release docs search for Semble soon.

Minish @minishlab

about 1 month ago

Today we're releasing Semble, a fast and accurate code search library built for agents 🤖!

We're also releasing potion-code-16M, a small code-specialized static embedding model that powers Semble.

🧵 https://t.co/wZlkeqOGOE

Minish @minishlab

about 1 month ago

Semble: https://t.co/wlswINJe6r
Benchmarks: https://t.co/YrKz8uO3nN
How it works: https://t.co/lHajTrLHkL
Model: https://t.co/d4DpJDzIJ1

Minish @minishlab

about 1 month ago

Main features:

- Fast: indexes a full codebase in ~250 ms and answers queries in ~1.5 ms, all on CPU
- Accurate: on par with transformers
- MCP server: drop-in tool for Claude Code, Cursor, Codex, OpenCode, and any other MCP-compatible agent
- Zero setup: no API keys, no GPU

🧵

Minish @minishlab

8 months ago

We also have a new blog post on model size reduction, where we show how to reduce model size by a factor of 15, creating a 6 MB model (!) without impacting performance.

Links:
Release notes: https://t.co/wVsFOqQLT4
Blog post: https://t.co/lPJFuCWvEo

Minish @minishlab

8 months ago

Model2Vec 0.7.0 is out now, as well as a blog post on model size reduction techniques!

This release features a large number of ways to improve the distillation process.
- Vocabulary quantization
- Configurable pooling
- A number of small improvements and bugfixes

🧵 https://t.co/3rFKGvwDrT

Minish @minishlab

11 months ago

@casper_hansen_ Thanks for the feedback (and for using SemHash)! That's a good idea, we can add something to our readme. There's also a HF space where you can use it directly on the hub: https://t.co/FFvqw29Qpz

minishlab retweeted

Casper Hansen @casper_hansen_

11 months ago

semhash is so so so convenient and fast

Minish @minishlab

12 months ago

We have a new website (and name): https://t.co/4Zdi049lCy

We’ve been working on an improved website for a while, and it’s finally here. It has documentation for all our packages as well as our blog. More things coming soon! 🚀

minishlab retweeted

slm tokens @tulkenss

about 1 year ago

Some guy forked our "model2vec-rs" crate, and put it under the "model2vec" name on crates io and then didn't tell us about it. See here: https://t.co/4IZHlJB5wV Like what's the goal here except name squatting.

Minish @minishlab

about 1 year ago

- Smaller tokenizers: all tokenizers are now 40% smaller, at no cost to anyone. A blog post with experimental results is coming in the next couple of days.

Minish @minishlab

about 1 year ago

We just released model2vec 0.6.0!

This is a big release, containing many big improvements 🔥

GitHub release: https://t.co/qj3aQCGfcg

PyPI release: https://t.co/dtMlfACXzR

🧵 https://t.co/263x21ud8z

Minish @minishlab

about 1 year ago

- Model improvements: nearly all distilled models will perform better, especially on STS and clustering tasks. A while ago we published a blog post on ModernBERT not working; we have now found out why and fixed it! 🧵

Minish @minishlab

about 1 year ago

@tomaarsen Thanks for sharing our deduplication space! For those who are interested in applying this in their own workflows, this is powered by SemHash: https://t.co/wqmBc33kzy

minishlab retweeted

tomaarsen @tomaarsen

about 1 year ago

The deduplication Space by @minishlab just got a fresh update, allowing you to remove near duplicates in (training) datasets.

Details in 🧵 https://t.co/dbFbl5XmgC

Minish @minishlab

about 1 year ago

@ben_burtenshaw Thanks for sharing our deduplication space and adding some shiny new features! For those who are interested in applying this in their own workflows, this is powered by SemHash: https://t.co/wqmBc33kzy
