Diffbot 🤖 @diffbot - Twitter Profile

3 months ago

@dambuildshit
Step 1: Build a free DNS provider
Step 2: Collect tolls for access to sites behind their DNS
Step 3: Build a crawler that doesn't need to pay their own toll
Step 4: Profit

0

1

0

14

diffbot retweeted

Massive @joinmassive

3 months ago

The web isn't a database. @diffbot makes it one. 10B+ entities and 1T facts extracted from 60B+ pages, rebuilt every 4-5 days. DuckDuckGo, Snapchat, and Dow Jones run on it. Massive powers the proxy infra behind their continuous crawl.

1

3

1

0

250

diffbot retweeted

Cheng Lou

@_chenglou

3 months ago

Ever wondered what your white name should have been? Introducing: https://t.co/d4nEQgv88j

Upload a picture of you, and let the puppy guess your name! Let's test out nominative determinism 🫡 (Immigrants who named themselves will correlate more highly. Give us feedback plz)

Our thanks to:
- @modal for their generous credits toward training this meme model
- @diffbot for the clean, diverse dataset!
- @leannch86920 for the training research!
- Everyone NOT named David (biggest & noisiest dataset ever)

7

62

8

12

8K

Diffbot 🤖 @diffbot

4 months ago

@devanshu_twt Sorry! It’s not ideal but it’s the easiest way to weed out 99% of abusers. When the product makes it easy to crawl the web, you get a lot of bad actors. Still thinking of a better way to solve this!

0

12

Diffbot 🤖 @diffbot

5 months ago

@groby Sorry for the late reply (and happy new years!) It's not on the immediate horizon, but implementing a credit balance model with a low minimum is something we've discussed. I personally prefer it. Would you mind emailing me at jerome[@]diffbot?

0

18

diffbot retweeted

Jason Grad 🇺🇦

@mrjasongrad

5 months ago

State of E-commerce Data Providers - Q4 2025

E-commerce runs on constant measurement: prices, promos, availability, seller changes, and "what the shelf actually looks like" across retailers and marketplaces.

The challenge is stable collection at scale, retries when sites break, anti-bot evasion, clean geo signals, and then turning messy HTML into usable structured data.

In preparation for the holiday season, we mapped the landscape of e-commerce data providers:

Competitive intel + digital shelf: @dataweavein, @Price2Spy, @bigdataNODE, @Profitero, @WiserInc
Marketplace intelligence + data: @junglescout, @H10Software, @datahawkco, @SellerSprite_EN
Trade, Supply Chain, Imports / Exports: @Trademo1, @ImportYeti, @datamyne
Scraper APIs & Extraction Platforms: @zytedata, @diffbot, @Stratalis, (AutoScraping handle?), @serpapi
Managed Data Extraction & Services: @groupBWT, @Data_Ox, @epctex, @MrScraper_
Retail Media & Ad Platforms: @Pacvue, @PerpetuaLabs, @Teikametrics
Network & runtime infra for e-com scraping: @playwrightweb, Puppeteer, @browserless

0

1

2

433

Diffbot 🤖 @diffbot

5 months ago

@groby Wish granted. Will a $50 starting plan work?

1

0

16

diffbot retweeted

Matthew Cassinelli

@mattcassinelli

7 months ago

YouTube, TikTok, Mastodon, & Threads are mostly there but need optimizing. Diffbot goes incredibly far with articles & that’s also moving along well. Reddit & Bluesky are readily available but I haven’t spent the time. X is finished but the endpoint gets rate limited 😞

1

8

2

6

3K

Diffbot 🤖 @diffbot

12 months ago

Not Diffbot!

Morning Brew ☕️ @MorningBrew

12 months ago

BREAKING: The Internet

Massive outage being reported across platforms including Spotify, Google Cloud, AWS, Cloudflare, Claude, YouTube, Gmail, and many, many, more https://t.co/EXrudceU9k

477

7K

1K

2M

0

4

0

3

914

Diffbot 🤖 @diffbot

about 1 year ago

A datacenter story...

0

4

1

2

791

diffbot retweeted

Unstructured

@UnstructuredIO

over 1 year ago

San Diego developers, join us and our technical partners @neo4j, @Intuit, https://t.co/tK3Usa7acI, @Replit , and @diffbot at our HackNight next week!

1

7

4

0

2K

Diffbot 🤖 @diffbot

over 1 year ago

Check out the repo for more info: https://t.co/BoW8iYQmNY

0

1

0

1

379

Diffbot 🤖 @diffbot

over 1 year ago

#Perplexity Sonar Pro API launched last week as the best performing model on factuality. 24 hours later, it's the 2nd best performing model (and it's not because of #DeepSeek). Why? 👇

1

2

0

3

573

Diffbot 🤖 @diffbot

over 1 year ago

89,886 developers are building their own Perplexity on-prem with Diffbot LLM — https://t.co/wVsp0iZGvt

1

0

1

468

diffbot retweeted

Michael Nuñez

@MichaelFNunez

over 1 year ago

Diffbot launches open-source AI model that achieves 81% accuracy by querying a trillion-fact Knowledge Graph in real-time instead of relying on static training data 🧠📊 Read more: https://t.co/YcyWz8DKnw #ArtificialIntelligence #Enterprise #MachineLearning @diffbot

0

6

1

4

583
