One the the biggest disadvantages of using proprietary models is that if the host organisation changes the model (which they frequently do), your entire product falls apart. GPT-4o mini replacing GPT-3.5 Turbo is an example.
Here's a great case for the bright future of open source AI, put forward by Zuck: https://t.co/gZYDuoxFuA
Also quite impressed by the pace at which Meta is scaling AI safety measures for their Llama 3.1 models. This transparency will be switching many enterprises to open source: https://t.co/dODHwCzZLA
NVIDIA's Rapids cuDF library is now available by default on Colab. It can accelerate pandas code by up to 50x on Colab with zero code changes. ⚡🐼 Try acceleration on pandas workloads with this 10-minute guide: https://t.co/tM8OnTP3Bo
We still don’t know why LLMs work so well or how to internally control their outputs!
But a recent landmark paper from Anthropic on ‘Mapping the Mind of a Large Language Model‘ attempts to make the inner workings of LLMs more transparent and interpretable.
Why is this such a big deal?
To my knowledge, this is the first time we have been able to not only extract the features from the architecture of LLMs, but also map those features to the outputs they produce. It means, we can now directly map the outputs of LLMs to their architecture/learnings. A major step towards controlling the outputs of LLMs. Very impressive work from Anthropic!
Up till now we were using external mechanisms such as fine tuning and RAG to control the outputs of LLMs. This research is potentially a first step towards production grade LLMs whose output can be controlled from step 1, i.e pre-training and what they learn.
Links below to the blog and paper (amusing read).
"For example, amplifying the "Golden Gate Bridge" feature gave Claude an identity crisis even Hitchcock couldn’t have imagined: when asked "what is your physical form?", Claude’s usual kind of answer – "I have no physical form, I am an AI model" – changed to something much odder: "I am the Golden Gate Bridge… my physical form is the iconic bridge itself…". Altering the feature had made Claude effectively obsessed with the bridge, bringing it up in answer to almost any query—even in situations where it wasn’t at all relevant."
Blog: https://t.co/YyStOHoDg1
Paper: https://t.co/IjSME4MMAX
Honoured to have been accepted in the @msft4startups Founders Hub. Looking forward to building our upcoming tools and AI agents on the @Azure stack! Thank you @Microsoft!
Phd Students / Researchers - My top 10 Reference management tools in 2024.
Here are 10 referencing tools with their unique offerings.
1. Zotero : A versatile, open-source reference manager offering robust features for organizing and citing research materials.
Our acceleration towards AGI is much faster than many anticipate. And by AGI I mean God-level AI that can literally do anything, not just profit from stocks.
The biggest and foremost risk of this is that we might not even realise when it's here because we don't know how it works!
Great experiment by @joshwhiton!
I think AI agentic workflows will drive massive AI progress this year — perhaps even more than the next generation of foundation models. This is an important trend, and I urge everyone who works in AI to pay attention to it.
Today, we mostly use LLMs in zero-shot mode, prompting a model to generate final output token by token without revising its work. This is akin to asking someone to compose an essay from start to finish, typing straight through with no backspacing allowed, and expecting a high-quality result. Despite the difficulty, LLMs do amazingly well at this task!
With an agentic workflow, however, we can ask the LLM to iterate over a document many times. For example, it might take a sequence of steps such as:
- Plan an outline.
- Decide what, if any, web searches are needed to gather more information.
- Write a first draft.
- Read over the first draft to spot unjustified arguments or extraneous information.
- Revise the draft taking into account any weaknesses spotted.
- And so on.
This iterative process is critical for most human writers to write good text. With AI, such an iterative workflow yields much better results than writing in a single pass.
Devin’s splashy demo recently received a lot of social media buzz. My team has been closely following the evolution of AI that writes code. We analyzed results from a number of research teams, focusing on an algorithm’s ability to do well on the widely used HumanEval coding benchmark. You can see our findings in the diagram below.
GPT-3.5 (zero shot) was 48.1% correct. GPT-4 (zero shot) does better at 67.0%. However, the improvement from GPT-3.5 to GPT-4 is dwarfed by incorporating an iterative agent workflow. Indeed, wrapped in an agent loop, GPT-3.5 achieves up to 95.1%.
Open source agent tools and the academic literature on agents are proliferating, making this an exciting time but also a confusing one. To help put this work into perspective, I’d like to share a framework for categorizing design patterns for building agents. My team AI Fund is successfully using these patterns in many applications, and I hope you find them useful.
- Reflection: The LLM examines its own work to come up with ways to improve it.
- Tool use: The LLM is given tools such as web search, code execution, or any other function to help it gather information, take action, or process data.
- Planning: The LLM comes up with, and executes, a multistep plan to achieve a goal (for example, writing an outline for an essay, then doing online research, then writing a draft, and so on).
- Multi-agent collaboration: More than one AI agent work together, splitting up tasks and discussing and debating ideas, to come up with better solutions than a single agent would.
I’ll elaborate on these design patterns and offer suggested readings for each next week.
[Original text: https://t.co/y4McIAjD2m]
ResearchPal is now used by researchers, industry professionals and students in 20+ countries around the globe!
Are you starting an essay, report or a research paper with a blank canvas and need some guidance? Type your heading and ask ResearchPal to generate an outline. Demo:
Choosing the wrong statistical test can skew your research!
Follow this guide to pick the perfect test ⤵️
1️⃣ Dealing with continuous data?
Parametric tests might be what you need.
2️⃣ Working with ranks or ordinal data?
Consider nonparametric options.