@lsymds@JackEllis As is generally the case. Both have their place, and when the need to optimize comes it should be clear what option works better. It is worth the time investment to scope it out, no need to hold bias to either side. Some scenarios could be good to use both.
@teknium I have similar thoughts along those lines. If it's able to derive from first principles, anything derivable will automatically be within it's domain so we need to focus on building up from first principles and ability for lateral thinking.
@virattt@NousResearch@teknium It was fine trained specifically for that, so you could just set up a flow to split your documents into chunks and get it to generate questions/instructions.
@Francis_YAO_ Fast, open source and effective options already exist and are only getting better. Why boil the ocean? Imagine all e.g. Gemini users boiling the option, will be slow for everybody and/or further drives up the cost/token
@Francis_YAO_ Cost/token and inference latency aside... You do realize what you're suggesting is to do the RAG part manually as direct input to the LLM context....
@NickADobos@SullyOmarr I think it's a good first step towards a universal api. Use vision model to map visual elements to the source code, then create functions you can automate from there
@_cartermp@dimfeld@honeycombio@OpenAI Might be a good idea to start with the embedding/vector generating function. Then you become the dot product god!
@NickADobos Use the API... then you can do neat tricks as you see fit. E.g. only maintain the past 2-3 interactions in full, while continuously summarizing the rest of the conversation
@mckaywrigley What about fine-tuning a compression dictionary into the model, would that work? I suppose would be limited to the fine-tunable models openai has..