Training LLMs end to end is hard. Very excited to share our new blog (book?) that covers the full pipeline: pre-training, post-training and infra. 200+ pages of what worked, what didn’t, and how to make it run reliably.
https://t.co/iN2JtWhn23
The material was organized by professors Helena de Medeiros Caseli, of the Universidade Federal de São Carlos (UFSCar), and Maria das Graças Volpe Nunes, of the Instituto de Ciências Matemáticas e de Computação at USP (ICMC-USP).
Learn more: https://t.co/HascnuHjQ5
I'm thrilled to announce that my paper 𝙏𝙤𝙬𝙖𝙧𝙙𝙨 𝙖𝙣𝙖𝙡𝙮𝙨𝙞𝙨 𝙤𝙣 𝙩𝙚𝙭𝙩𝙪𝙖𝙡 𝙞𝙣𝙛𝙚𝙧𝙚𝙣𝙘𝙚 𝙖𝙩 𝘼𝙎𝙎𝙄𝙉-𝟮 𝙙𝙖𝙩𝙖𝙨𝙚𝙩 has been accepted at STIL 2023 🇧🇷 as a short paper.
In this paper we present a preliminary study of a neural-symbolic model for NLI.
This morning I took some time to look back at the @huggingface Deep RL course. Unit 3 covers training a Deep Q-Network agent to play an Atari game, and I am proud of my progress in the RL world 🤖
Sharing a post from @Marktechpost on the top LLMs of 2023, with a list of the companies behind each of them.
link: https://t.co/ZkPNjdfwH6
#ai #LLM #nlp
Spending this Saturday afternoon reviewing some tips and tricks from the Deep Learning Tuning Playbook: insightful notes on what may or may not work in deep neural networks, and a great resource for studying DL tuning.
📕👇
https://t.co/Iq6LklH9Nj