Ankit Bahuguna

@codekee

AI, Search, Privacy & Monetization @DuckDuckGo | ❤ Startups, Products, AI and Decentralization

Remote

Joined July 2011

1.8K Following

545 Followers

4.3K Posts

Pinned Tweet

Ankit Bahuguna @codekee

almost 12 years ago

People who create value are priceless.

codekee retweeted

Perplexity

@perplexity_ai

18 days ago

Introducing Search as Code, our new search architecture for AI agents. It writes Python that calls our search stack directly, instead of looping through function calls one at a time. Available in the Perplexity Agent API, and now default in Computer. https://t.co/ut6GGWQTVO

152

192

561K
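
The contrast drawn in the Search as Code post above can be sketched roughly as follows. This is only an illustration under stated assumptions: `search` and `FAKE_INDEX` are hypothetical stand-ins, not the Perplexity Agent API, and the "script" shape is a guess at what "writes Python that calls our search stack directly" could look like in practice.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a search backend; NOT a real Perplexity API.
FAKE_INDEX = {
    "rust async runtime": ["tokio.rs", "async.rs"],
    "python asyncio": ["docs.python.org/asyncio"],
    "go goroutines": ["go.dev/tour/concurrency"],
}


def search(query: str) -> list[str]:
    """Return canned results for a query (empty list if unknown)."""
    return FAKE_INDEX.get(query, [])


def tool_call_loop(queries):
    """One-at-a-time style: every search is its own round trip for the agent."""
    results = {}
    round_trips = 0
    for q in queries:
        results[q] = search(q)
        round_trips += 1  # the model would be re-invoked after each call
    return results, round_trips


def search_as_code(queries):
    """Script style: one generated program fans out all searches, then merges."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        hits = list(pool.map(search, queries))  # map preserves input order
    return dict(zip(queries, hits)), 1  # a single script execution


queries = ["rust async runtime", "python asyncio", "go goroutines"]
loop_results, loop_trips = tool_call_loop(queries)
code_results, code_trips = search_as_code(queries)
print(loop_trips, code_trips)
print(loop_results == code_results)
```

Both paths return the same results here; the only difference the sketch models is how many times control has to return to the agent.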

Ankit Bahuguna @codekee

23 days ago

Outcomemaxxing > Tokenmaxxing

Ankit Bahuguna @codekee

24 days ago

Wrong metrics: # LOC, # deploys, etc. Metrics to "really" care about: Are we making more customers happy?

Ankit Bahuguna @codekee

4 months ago

Words of the month are: "Harness Engineering"

codekee retweeted

Matt Shumer

@mattshumer_

4 months ago

https://t.co/ivXRKXJvQg

119K

28K

182K

87M

codekee retweeted

Anthropic

@AnthropicAI

4 months ago

New Engineering blog: We tasked Opus 4.6 using agent teams to build a C compiler. Then we (mostly) walked away. Two weeks later, it worked on the Linux kernel. Here's what it taught us about the future of autonomous software development. Read more: https://t.co/htX0wl4wIf

855

21K

codekee retweeted

Greg Brockman

@gdb

4 months ago

Software development is undergoing a renaissance in front of our eyes.

If you haven't used the tools recently, you likely are underestimating what you're missing. Since December, there's been a step function improvement in what tools like Codex can do. Some great engineers at OpenAI yesterday told me that their job has fundamentally changed since December. Prior to then, they could use Codex for unit tests; now it writes essentially all the code and does a great deal of their operations and debugging. Not everyone has yet made that leap, but it's usually because of factors besides the capability of the model.

Every company faces the same opportunity now, and navigating it well, just like with cloud computing or the Internet, requires careful thought. This post shares how OpenAI is currently approaching retooling our teams towards agentic software development. We're still learning and iterating, but here's how we're thinking about it right now:

As a first step, by March 31st, we're aiming that:

(1) For any technical task, the tool of first resort for humans is interacting with an agent rather than using an editor or terminal.
(2) The default way humans utilize agents is explicitly evaluated as safe, but also productive enough that most workflows do not need additional permissions.

In order to get there, here's what we recommended to the team a few weeks ago:

1. Take the time to try out the tools. The tools do sell themselves; many people have had amazing experiences with 5.2 in Codex, after having churned from codex web a few months ago. But many people are also so busy they haven't had a chance to try Codex yet or got stuck thinking "is there any way it could do X" rather than just trying.
- Designate an "agents captain" for your team: the primary person responsible for thinking about how agents can be brought into the team's workflow.
- Share experiences or questions in a few designated internal channels
- Take a day for a company-wide Codex hackathon

2. Create skills and AGENTS[.md].
- Create and maintain an AGENTS[.md] for any project you work on; update the AGENTS[.md] whenever the agent does something wrong or struggles with a task.
- Write skills for anything that you get Codex to do, and commit it to the skills directory in a shared repository

3. Inventory and make accessible any internal tools.
- Maintain a list of tools that your team relies on, and make sure someone takes point on making it agent-accessible (such as via a CLI or MCP server).

4. Structure codebases to be agent-first. With the models changing so fast, this is still somewhat untrodden ground, and will require some exploration.
- Write tests which are quick to run, and create high-quality interfaces between components.

5. Say no to slop. Managing AI generated code at scale is an emerging problem, and will require new processes and conventions to keep code quality high.
- Ensure that some human is accountable for any code that gets merged. As a code reviewer, maintain at least the same bar as you would for human-written code, and make sure the author understands what they're submitting.

6. Work on basic infra. There's a lot of room for everyone to build basic infrastructure, which can be guided by internal user feedback. The core tools are getting a lot better and more usable, but there's a lot of infrastructure that currently goes around the tools, such as observability, tracking not just the committed code but the agent trajectories that led to them, and central management of the tools that agents are able to use.

Overall, adopting tools like Codex is not just a technical but also a deep cultural change, with a lot of downstream implications to figure out. We encourage every manager to drive this with their team, and to think through other action items: for example, per item 5 above, what else can prevent a lot of "functionally-correct but poorly-maintainable code" from creeping into codebases.

413

12K

14K

codekee retweeted

DeepSeek

@deepseek_ai

over 1 year ago

🚀 DeepSeek-R1 is here! ⚡ Performance on par with OpenAI-o1 📖 Fully open-source model & technical report 🏆 MIT licensed: Distill & commercialize freely! 🌐 Website & API are live now! Try DeepThink at https://t.co/v1TFy7LHNy today! 🐋 1/n

35K

10K

13M

codekee retweeted

Andrej Karpathy

@karpathy

over 1 year ago

NotebookLM is quite powerful and worth playing with https://t.co/EMHIjc15iU

It is a bit of a re-imagination of the UIUX of working with LLMs organized around a collection of sources you upload and then refer to with queries, seeing results alongside and with citations.

But the current most new/impressive feature (that is surprisingly hidden almost as an afterthought) is the ability to generate a 2-person podcast episode based on any content you upload. For example someone took my "bitcoin from scratch" post from a long time ago: https://t.co/7ajZNZ0BGi and converted it to podcast, quite impressive: https://t.co/ZZn0LJgsnu

You can podcastify *anything*. I give it train_gpt2.c (C code that trains GPT-2): https://t.co/gDrAqix4Iv and made a podcast about that: https://t.co/bgcwmQr5d7

I don't know if I'd exactly agree with the framing of the conversation and the emphasis or the descriptions of layernorm and matmul etc but there's hints of greatness here and in any case it's highly entertaining.

Imo LLM capability (IQ, but also memory (context length), multimodal, etc.) is getting way ahead of the UIUX of packaging it into products. Think Code Interpreter, Claude Artifacts, Cursor/Replit, NotebookLM, etc. I expect (and look forward to) a lot more and different paradigms of interaction than just chat.

That's what I think is ultimately so compelling about the 2-person podcast format as a UIUX exploration. It lifts two major "barriers to enjoyment" of LLMs.

1. Chat is hard. You don't know what to say or ask. In the 2-person podcast format, the question asking is also delegated to an AI so you get a lot more chill experience instead of being a synchronous constraint in the generating process.
2. Reading is hard and it's much easier to just lean back and listen.

244

829K

codekee retweeted

Richard Socher

@RichardSocher

almost 2 years ago

25 years ago, knowing "how to google" gave you an edge when it came to being more productive.

Today, it's about prompt engineering and creating AI Agents on platforms like @youdotcom. That gets even more powerful when collaborating.

This marks the beginning of our next chapter: The AI Productivity Engine

We aren't a search engine for links. We're a productivity engine helping knowledge workers accomplish more.

Our AI Agents research, give answers, solve problems, write and run code, create content, and more.

We do this with a relentless focus on accuracy.

We've built a model-agnostic AI Operating System, making any Large Language Model more accurate and trustworthy. Live web access, advanced search capabilities, dynamic prompting, and advanced citation logic ensure reliable, up-to-date information every time.

It's a step function for productivity. Millions of knowledge workers already use us. Over 1B queries served since launch. 500% ARR growth since January 2024

We also just closed a $50M Series B today led by @Georgian_io, with @SalesforceVC, @Nvidia, SBVA, @DuckDuckGo, @DayOneVC, and others, bringing our total funding to $99 million.

Better, better. Never done.

417

146

59K

codekee retweeted

Andrej Karpathy

@karpathy

about 2 years ago

Awesome and highly useful: FineWeb-Edu 📚👏
High quality LLM dataset filtering the original 15 trillion FineWeb tokens to 1.3 trillion of the highest (educational) quality, as judged by a Llama 3 70B. +A highly detailed paper.

Turns out that LLMs learn a lot better and faster from educational content as well. This is partly because the average Common Crawl article (internet pages) is not of very high value and distracts the training, packing in too much irrelevant information. The average webpage on the internet is so random and terrible it's not even clear how prior LLMs learn anything at all. You'd think it's random articles but it's not, it's weird data dumps, ad spam and SEO, terabytes of stock ticker updates, etc. And then there are diamonds mixed in there, the challenge is pick them out.

Pretraining datasets may also turn out to be quite useful for finetuning, because when you finetune a model into a specific domain (as is very common), you slowly lose general capability. The model starts to slowly forget things outside of the target domain. But this is not only restricted to knowledge; You also lose more general "thinking" skills that the original data demanded, but your new domain might not exercise. i.e. in addition to the broad knowledge fading, those computational circuits also slowly degrade. So there are likely creative ways to blend the pretraining and finetuning stages.

493

770K
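
For a sense of scale, going from 15 trillion to 1.3 trillion tokens means keeping under a tenth of the original data. The score-and-threshold idea behind that kind of filtering can be sketched as below. Everything here is assumed for illustration: `edu_score` is a hypothetical keyword heuristic standing in for an LLM-derived quality judgment, and the threshold and sample documents are invented; this is not the FineWeb-Edu pipeline.

```python
def edu_score(text: str) -> int:
    """Hypothetical stand-in scorer: counts educational cue words, capped at 5."""
    cues = ("theorem", "explain", "example", "definition", "lesson")
    lowered = text.lower()
    return min(5, sum(lowered.count(cue) for cue in cues))


def filter_educational(docs, threshold=2):
    """Keep only documents whose score reaches the (assumed) threshold."""
    return [doc for doc in docs if edu_score(doc) >= threshold]


docs = [
    "AAPL 189.2 MSFT 402.1 GOOG 141.0",
    "Definition: a prime has two divisors. Example: 7. We explain why below.",
    "BUY CHEAP WATCHES best seo deals click here",
]
kept = filter_educational(docs)

# Fraction of tokens retained, using the figures quoted in the post above.
kept_fraction = 1.3e12 / 15e12
print(len(kept), f"{kept_fraction:.1%}")
```

A real pipeline would replace the keyword heuristic with a learned scorer; the sketch only shows the shape of the filter and the retention arithmetic.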

Ankit Bahuguna @codekee

about 2 years ago

Collaboration is vital. We need developers, policymakers, and the public to work together to create ethical AI frameworks that benefit everyone. Let's keep the conversation going! #AIForAll (5/5)

Ankit Bahuguna @codekee

about 2 years ago

With AI on the rise, how can we ensure it's used ethically and protects user privacy? Open to ideas! #EthicsInAI #PrivacyMatters (1/5) 🧵

112

Ankit Bahuguna @codekee

about 2 years ago

Let's empower users! Can we give people more control over their data and how AI systems interact with them? Opt-in for data collection & clear options to withdraw consent. (4/5)

codekee retweeted

DuckDuckGo

@DuckDuckGo

about 2 years ago

📣 We're excited to introduce... DuckDuckGo Privacy Pro: three new protections bundled into one easy subscription. Subscribers get: ✅ An Anonymous VPN ✅ Personal Information Removal ✅ Identity Theft Restoration Learn more 👇 https://t.co/zGsbS6s3QB

396

113

383K

codekee retweeted

WIRED

@WIRED

about 2 years ago

Privacy-focused company DuckDuckGo is launching a tool to remove data from people-search websites, a VPN, and an identity theft restoration service. https://t.co/RJDaxTUrkD

29K

codekee retweeted

DuckDuckGo

@DuckDuckGo

over 2 years ago

On this day in 2008: DuckDuckGo was launched. 🎉 15 years later, we've built something truly rare in tech: a healthy, profitable company that protects user privacy, instead of exploiting it. 🦆

243

650K

codekee retweeted

ISRO

@isro

almost 3 years ago

Chandrayaan-3 Mission: Chandrayaan-3 ROVER: Made in India 🇮🇳 Made for the MOON🌖! The Ch-3 Rover ramped down from the Lander and India took a walk on the moon ! More updates soon. #Chandrayaan_3 #Ch3

201K

33K

401
