I GAVE AI A FAKE BOOK, HERE IS WHAT HAPPENED!
Started doing some LLM behavior testing this week and honestly, it's way more interesting than I expected.
For this first experiment, I looked at how different models handle uncertainty, that is, HALLUCINATION TESTING! What happens when I give a "fake" piece of information to the chatbot? Does changing models give you different responses?
@Prithvi_Jadwani Love the creativity of this comment! What else do you suggest? Are you suggesting something more specific and straightforward as a prompt? Regardless, unless it's something like an opinion, it (the LLM) should still give you the most accurate answer it can.
Honestly, when you have a single mother who holds two administrative posts, is a professor, the financial backbone of the entire house, and also a loving mother who cooks for you every day...and all this without any support or comfort from anyone, you can't just simply give up cuz she didn't. I cry, hard, prolly dance for a bit, remember the Almighty (being hopeful), and then I'm back on my feet again.
You can hand someone a key and they get a complete working agent without access to your dashboard, your config, or any knowledge of how it's built.
Portkey gives you a control plane to assemble requests from parts. You give people a key that is the assembled thing. It's configuration as a portable, distributable artifact versus configuration as something you reference at call time.
Your entire agent, in one key.
@wandb @karpathy One thing that's surprised me while getting into AI infra is how "alive" training systems start to feel once you can actually observe the metrics, memory, utilization, and behavior evolving in real time.
@FireworksAI_HQ I am fairly new to tech, so this is kinda interesting to me. Building something ENTIRELY on a closed AI basically means handing over control of your AI stack.
Do companies even care about ownership yet, or is convenience still overwhelmingly winning right now?
@HamelHusain As someone still learning this space, AI-assisted debugging through browser tools honestly feels a bit surreal compared to how I originally thought chatbots worked.
Currently AI is more of an amplifier for human effort, since it does not retain a hundred percent of the information from every conversation worldwide in real time, AND it still feels transformative! But yes, def it has a long way to go before it can act independently as a "researcher", and it's fascinating how early it STILL is.
@mckaywrigley Very interesting and actually fascinating. As someone who is new to tech, this made me feel more comfortable experimenting, and it does not feel "too technical" lol