@BossIgnostic @dimitripilled Yeah but that's my point, LLMs have not been ready for a decade and they were just working out how to make them behave. They are pretty new.
As it turns out, a recipe that makes 12 pancakes actually makes 12 pancakes. Who knew?
I now need to lie down and digest a truly unreasonable number of calories.
as a reminder: AI cannot generate knowledge. It cannot create knowledge. It cannot find new information. It can only mix information that has already been found and written and input into computers by humans.
I would bet a reasonable amount of money that this is more a product of LLM visual systems not being very good at encoding images than a lack of capability in the language model itself.
If there were a way to encode the problem in text they would do better.
For those who don't know, Joseph Redmon created the original YOLO series of object detection models which had a huge impact on modern computer vision.
Additionally, he has a very entertaining paper writing style 😅
Also can't mention him without pointing out his absolutely hilarious resume 🤣
He quit the field of AI a while back due to concerns about how AI was being used for evil use cases, but now he's at AI2 working on AI for good!