@lucataco93 Nice!
FYI peak memory during training depends on batch size and # tokens in the dataset.
You should give SiLLM a try for training with LoRA/DPO: https://t.co/hf4uTSX6xV
Running Llama-3-8B-Instruct on Mac with the SiLLM framework powered by MLX... just took some fiddling with the tokenizer & template to get it to run 😁
@ivanfioravanti @awnihannun Just the product of lots of tinkering and trying to port DPO & losses over to MLX 😁
Might have some bugs that I'm not seeing 🙈
Example code with the DPO-mix dataset here: https://t.co/h7GvQ6rat2
A huge thank you to @awnihannun @angeloskath and the rest of the team for developing the MLX framework that SiLLM relies on! Also big kudos to all the contributors of the MLX Examples project 👏
I'm excited to share a new open-source project: the Silicon LLM Training & Inference Toolkit, SiLLM for short.
Check out the project on GitHub here:
https://t.co/hf4uTSX6xV
The repository includes several code examples:
- LoRA training with the Nvidia HelpSteer dataset
- DPO fine-tuning with the DPO Mix 7K dataset
- Implementation of the MMLU Benchmark
- Calculating perplexity scores of a model for a sample dataset
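For anyone curious what the perplexity example boils down to, here's a minimal sketch of the calculation itself. This is not SiLLM's actual code: the `perplexity` function name and the input values are made up for illustration, and it assumes you already have per-token natural-log probabilities from a model.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood per token).

    token_logprobs: natural-log probabilities the model assigned to each
    observed token (hypothetical input, e.g. collected from a forward pass).
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Illustrative values only: a model that gives every token probability 0.25
# is as uncertain as a uniform choice among 4 options, so perplexity is ~4.
uniform_logprobs = [math.log(0.25)] * 8
print(perplexity(uniform_logprobs))
```

Lower is better: a perplexity near 1 means the model was almost certain of every token in the sample.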
One of the reasons to attend #CARO2023 is the food for thought delivered in talks, conversations, and of course keynotes. Armin Büscher @armbues will share his technical perspective on innovation and disruption in cybersecurity.
https://t.co/PDawMvhhv8
This file leaked a security enterprise VirusTotal API key before! But now it's expired because someone leaked the key 😅
ITW:07c4a75b1422a22ec29c5102e0b67055
API Key:d10468bead05da1685629a0abcfed5f963d6adbc7e6bb2b2fc343dbb36be0349
unbelievable!
I'll be traveling to Vegas for #blackhat2022 and #DEFCON next week. Looking forward to hanging out with many infosec folks I haven't seen in a long time 🥳