Folks at @oughtinc are amazing, man. Think it's one of the few groups focusing more on how to innovate in AI via interface paradigms beyond a chatbot interface.
A Library and Tutorial for Factored Cognition with Language Models | Ought
https://t.co/IitqSkNEWA
The "Essay competition on the Automation of Wisdom and Philosophy" is now live! $25k in prizes
Lots of great questions in the post:
Automation of wisdom
- What is the nature of the sort of good thinking we want to be able to automate? How can we distinguish the type of thinking it’s important to automate well and early from types of thinking where that’s less important?
- What are the key features or components of this good thinking?
- How do we come to recognise new ones?
- What are traps in thinking that is smart but not wise?
- How can this be identified in automatable ways?
- How could we build metrics for any of these things?
Automation of philosophy
- What types of philosophy are language models well-equipped to produce, and what do they struggle with?
- What would it look like to develop a “science of philosophy”, testing models’ abilities to think through new questions, with ground truth held back, and seeing empirically what is effective?
- What have the trend lines for automating philosophy looked like, compared to other tasks performed by language models?
- What types of training/finetuning/prompting/scaffolding help with the automation of wisdom/philosophy?
- How much do they help, especially compared to how much they help other types of reasoning?
Thinking ahead
- Considering the research agenda that will (presumably) eventually be needed to automate high quality wisdom/philosophy:
  - Which parts of the agenda can we expect to automate in a timely fashion?
  - What is the core that we will need humans to address?
  - What do we expect the thorny sticking points to be?
- Why may or may not this problem be solved “by default”? (from a technical standpoint)
- Can we tell concrete stories or vignettes in which the automation of wisdom/philosophy is/isn’t important, to triangulate our understanding of what matters?
- What preparatory research could provide the best groundwork for humanity to automate high-quality wisdom/philosophy before it is necessary?
- What projects today or in the near future would be valuable to undertake?
Ecosystems
- If the world were devoting serious attention to this, what would that look like?
- What incentives on institutional actors could push work onto related but less important questions; vice-versa what could help ensure that work remained well-targeted?
- What are the natural institutional homes for this research in the short term?
  - Academia? Nonprofits? Frontier AI labs? Elsewhere in industry?
- What might be needed (proofs, audits, track record?) to enable humans (decision-makers, voters) and human institutions to correctly trust wise advice from AI systems?
- How could we lay the groundwork for this?
- Ideas for catalysing/sustaining this field?
- Why may or may not this problem be solved “by default”? (from a social standpoint)
New paper with @__Charlie_G on how to measure and reduce hallucination using task decomposition
Blog post: https://t.co/OGPgY9MAoq
Article: https://t.co/uFd0zLTUkV
🧵 has the main findings
1/ Can large language models detect and correct their own hallucinations when summarizing academic papers? In our new paper, we explore a new method we call factored verification to help answer this question.
Blog: https://t.co/kwfoHMdhAF
People often ask - how does Elicit relate to AI Safety? Here's my answer
In brief, the two main impacts of Elicit on AI Safety are improving epistemics and pioneering process supervision.
https://t.co/eubXum0TrY
Now that @elicitorg is an independent company let's review our mission - what is scaling up good reasoning & why do we care?
https://t.co/D08zagGy0x
The Ought board - @paulfchristiano, @OwainEvans_UK and Owen Cotton-Barratt - reflected on how to accomplish Ought's goal of scaling up good reasoning and decided to create a new public benefit corporation.
https://t.co/C6dEVt4MEY
My very first PhD student, @stuhlmueller, founded @oughtinc after leaving CoCoLab. Ought has done amazing work as a nonprofit lab, which helped me see the power of LLMs. I’m excited for their next chapter as @elicitorg!! (And in a new role for me, I’m an “angel”.)
@elicitorg aims to accelerate and augment human reasoning.
To that end, we've started by trying to make high-quality literature review go from a rarity to a commodity. Any researcher should be able to look at a new problem area and get up to speed in days instead of weeks.
Working to get here has been an amazing journey! Many early mornings and late nights spent making a tool that aims to be like the green revolution for the tree of knowledge.
A little thread on my thinking on Elicit as a product:
7/ There is so much left to do!
Help us build intuitive and general interfaces that can run language models at high accuracy and superhuman scale to automate important research - https://t.co/Elqu4byErg
2/ You can now upload up to 100 of your own papers to extract data from. Great for automating the screening or extraction steps of systematic reviews and meta-analyses.
1/ Announcing our spinoff from @oughtinc into a public benefit corporation, our $9 million seed round, and a much more powerful Elicit!
This new Elicit takes the components of the popular literature review workflow and extends them to automate more research workflows.
Today feels like a really cathartic end to a hectic two weeks. https://t.co/AOt9ivNbqy is now fully out of beta, and I spent a bunch of time talking about long-term plans and ways to improve how we work.