@0xSero @MolokaiHex https://t.co/VtbzI8v9aE ?
Too tight, maybe. Are there accuracy benchmarks for the model?
If it performs close to the original and can run on SGLang or vLLM with decent TPS and batching on a couple of RTX 6000 96GB cards, this could be great.
@skalskip92 I mean whether the pose keypoints have an associated person ID that stays consistent (same person, same ID) throughout the scene.
I have usually done this in two steps: first run person tracking and pose estimation, then post-process by computing the IoU between both outputs to pair each person ID with a pose.
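A rough sketch of that pairing step, in case it helps. It assumes the tracker gives one box per person ID and the pose model gives a list of (x, y) keypoints per person. The names `pair_ids_with_poses`, `keypoints_to_box`, and the `min_iou` threshold are made up for illustration, and the greedy highest-IoU-first matching is just one simple option (Hungarian assignment would be another).

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Keypoints = List[Tuple[float, float]]    # [(x, y), ...] for one person


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def keypoints_to_box(kps: Keypoints) -> Box:
    """Tight bounding box around a person's keypoints (assumed all valid)."""
    xs = [x for x, _ in kps]
    ys = [y for _, y in kps]
    return (min(xs), min(ys), max(xs), max(ys))


def pair_ids_with_poses(
    tracks: Dict[int, Box],
    poses: List[Keypoints],
    min_iou: float = 0.3,  # hypothetical threshold, tune per dataset
) -> Dict[int, int]:
    """Return {track_id: pose_index}, matching greedily by highest IoU."""
    candidates = []
    for track_id, track_box in tracks.items():
        for pose_idx, kps in enumerate(poses):
            score = iou(track_box, keypoints_to_box(kps))
            if score >= min_iou:
                candidates.append((score, track_id, pose_idx))

    # Best overlaps first; each track ID and each pose is used at most once.
    candidates.sort(reverse=True)
    pairs: Dict[int, int] = {}
    used_poses = set()
    for _, track_id, pose_idx in candidates:
        if track_id in pairs or pose_idx in used_poses:
            continue
        pairs[track_id] = pose_idx
        used_poses.add(pose_idx)
    return pairs
```

Run per frame, this carries the tracker's ID over to whichever pose overlaps it most. If the pose model already returns its own person box, using that box instead of the keypoint-derived one is probably more robust.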
We're thrilled to announce SignGemma, our most capable model for translating sign language into spoken text. 🧏
This open model is coming to the Gemma model family later this year, opening up new possibilities for inclusive tech.
Share your feedback and interest in early testing → https://t.co/C4rV2be4mL
Join us for the third MULTIDATA webinar: a hands-on intro to our project and pipeline!
🗓️ May 19, 4 PM (CET)
🎤 Cristóbal Pagán (project coordinator)
🔍 Learn about speech transcription, gesture tracking, and more — no tech skills needed!
🎟️ Save your spot: https://t.co/szRDpPzCrA