The HapticAI beta is right around the corner, and we'd like to take this opportunity to announce our next partner, @heurist_ai!
Heurist is a DePIN protocol for AI inference 👾
Find more details about this partnership at the blog link below or read on for a quick summary 🧵 👇
https://t.co/YA5Dh48g8G
today we are announcing reinforcement finetuning, which makes it really easy to create expert models in specific domains with very little training data.
livestream going now: https://t.co/ABHFV8NiKc
alpha program starting now, launching publicly in q1
This is where Haptic can provide a valuable service to upcoming AI agents. To avoid malicious feedback, this ability can be token-gated to top holders and aligned users
Feedback is something that can be crowdsourced and becomes the easiest way to level up these agents
What if users and token holders could signal to the AI agent which of its responses they appreciate and which seem low quality?
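A rough sketch of how token-gated feedback like this might be tallied. Everything here is hypothetical and for illustration only: the `Feedback` record, the `aggregate_feedback` helper, the `balances` mapping and the `min_balance` threshold are assumptions, not Haptic's actual design. It only counts votes from users holding enough tokens and ignores repeat votes from the same user on the same response.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    user: str         # hypothetical user identifier
    response_id: str  # which agent response is being rated
    vote: int         # +1 = appreciated, -1 = low quality


def aggregate_feedback(feedback, balances, min_balance):
    """Sum votes per response, counting only users who hold at least
    `min_balance` tokens (the assumed gating rule)."""
    scores = {}
    seen = set()  # (user, response_id) pairs already counted
    for fb in feedback:
        # Gate: skip users below the token threshold.
        if balances.get(fb.user, 0) < min_balance:
            continue
        # Skip malformed votes.
        if fb.vote not in (1, -1):
            continue
        # One vote per user per response.
        key = (fb.user, fb.response_id)
        if key in seen:
            continue
        seen.add(key)
        scores[fb.response_id] = scores.get(fb.response_id, 0) + fb.vote
    return scores


if __name__ == "__main__":
    balances = {"alice": 500, "bob": 5, "carol": 1200}
    feedback = [
        Feedback("alice", "r1", 1),
        Feedback("bob", "r1", -1),    # below threshold, ignored
        Feedback("carol", "r1", 1),
        Feedback("carol", "r2", -1),
        Feedback("alice", "r1", 1),   # duplicate, ignored
    ]
    print(aggregate_feedback(feedback, balances, min_balance=100))
```

The per-response scores could then serve as a preference signal for the agent. How they would actually feed into training is left open here.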
The Tulu 2.5 work is still an underrated RLHF paper. Lots of industry interest, not that much immediate academic uptake. Really great empirical study on how DPO and PPO work across datasets and implementations.
Core to a lot of our efforts on Tulu 3.
AI agents are the biggest thing happening in crypto right now. It started with Andy Ayrey running the infinite backrooms, where conversations between AIs led to the creation of Truth Terminal and $GOAT.
RLHF isn't just about making AI agents better - it's about making them better partners for humans. It's a bridge between raw capability and practical utility.