Here is a demo video of my submission this year for the Swift Student Challenge.
It is a basketball analyzer app that counts your shots and calculates metrics like Release Angle, Ball Speed, and Miss Reason (e.g., Short, Long).
#swiftstudentchallenge
Day 1 of 3 days of MLX:
Introducing MLX-Audio-Swift SDK 🚀
A modular Swift SDK for voice agents and tasks on Apple Silicon built by @lllucas and yours truly.
iOS, macOS, and visionOS developers can now build native apps with real-time, on-device audio intelligence:
🗣️ Text-to-Speech (TTS)
👂 Speech-to-Text (STT)
🔄 Speech-to-Speech (STS)
🎙️ Voice Activity Detection (VAD) and more.
Only import the capabilities you need, nothing extra.
Get started today and leave us a star ⭐️
https://t.co/AXJvHw0DY6
this might be the coolest blogpost I've ever written
I dove deep into:
- player detection with RF-DETR
- player tracking with SAM2
- team clustering with SigLIP and K-means
- number recognition with SmolVLM2 and ResNet
I hope you'll like it
link: https://t.co/PPPGTD8L2v
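For readers curious what the team clustering step might look like, here is a minimal sketch of two-cluster K-means in plain Python. It is not the code from the blog post: the function name `kmeans_two_teams` is hypothetical, the short vectors stand in for real SigLIP embeddings of player crops, and the deterministic initialization is a simplification chosen so the example is reproducible.

```python
from math import dist


def kmeans_two_teams(embeddings, iterations=20):
    """Split embedding vectors into two clusters (one per team).

    Hypothetical helper: `embeddings` is a list of equal-length float
    vectors, e.g. image embeddings of player crops.
    """
    # Deterministic init (an assumption made for reproducibility):
    # the first vector, and the vector farthest from it.
    first = list(embeddings[0])
    farthest = list(max(embeddings, key=lambda e: dist(e, first)))
    centroids = [first, farthest]

    labels = None
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            0 if dist(e, centroids[0]) <= dist(e, centroids[1]) else 1
            for e in embeddings
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for k in (0, 1):
            members = [e for e, label in zip(embeddings, labels) if label == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy 2-D vectors standing in for real embeddings.
toy_embeddings = [[0.9, 0.1], [1.0, 0.0], [0.1, 0.9], [0.0, 1.0], [0.8, 0.2]]
team_labels = kmeans_two_teams(toy_embeddings)
```

In practice a library implementation with several random restarts would likely be more robust than this fixed initialization.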
Switched to the YOLO11 model by Roboflow. The result is a lot more accurate and much less flickery.
In the video, you can also see a comparison between two models trained on the same dataset:
- One trained in Create ML using a YOLOv2-based architecture
- A YOLO11 model by Roboflow
Swift / macOS project for detecting and tracking the ball, players, and other on-court elements in a basketball game.
I trained a custom CoreML model, running fully on-device. Accuracy isn’t perfect yet, so I’ll keep improving it.
@Droni0s Me too. I'm still not sure whether my project having no "story" will hurt it in the judging process. I had the same concern last year, since the app required you to upload a video or use my demo video to try it.
@itsuki68391179 @gitconnected Maybe this is intentional for CentroidTracker, but I still think there should be a way to avoid relying on the object detection model's results on every frame.
@itsuki68391179 @gitconnected For example, to use this on a video, CentroidTracker requires you to run object detection on every frame (which, I guess, is not the case for TrackObjectRequest), so if detection fails on a certain frame, the bounding box disappears. What might be possible solutions for this?
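One common approach, sketched here in Python rather than Swift, is to keep a track alive for a few frames after its detection goes missing instead of dropping it immediately. This is not the CentroidTracker discussed in the thread: the class `TolerantCentroidTracker` and the parameters `max_missed` and `max_distance` are hypothetical names for illustration, and the matching is a simple greedy nearest-neighbor pass.

```python
from dataclasses import dataclass
from math import hypot
from typing import Tuple


@dataclass
class Track:
    center: Tuple[float, float]
    missed: int = 0  # consecutive frames without a matching detection


def _distance(a, b):
    return hypot(a[0] - b[0], a[1] - b[1])


class TolerantCentroidTracker:
    """Hypothetical centroid tracker that tolerates short detection gaps."""

    def __init__(self, max_missed=10, max_distance=50.0):
        self.max_missed = max_missed      # frames a track may go unmatched
        self.max_distance = max_distance  # max centroid jump between frames
        self.tracks = {}
        self._next_id = 0

    def update(self, detections):
        """Take this frame's detected centroids; return {track_id: center}."""
        unmatched = [tuple(d) for d in detections]
        for track_id in list(self.tracks):
            track = self.tracks[track_id]
            nearest = min(
                unmatched,
                key=lambda c: _distance(c, track.center),
                default=None,
            )
            if nearest is not None and _distance(nearest, track.center) <= self.max_distance:
                track.center = nearest
                track.missed = 0
                unmatched.remove(nearest)
            else:
                # No detection this frame: keep the last known position
                # until the gap exceeds max_missed frames.
                track.missed += 1
                if track.missed > self.max_missed:
                    del self.tracks[track_id]
        # Any detection left over starts a new track.
        for center in unmatched:
            self.tracks[self._next_id] = Track(center=center)
            self._next_id += 1
        return {tid: t.center for tid, t in self.tracks.items()}


tracker = TolerantCentroidTracker(max_missed=2)
frame1 = tracker.update([(10.0, 10.0)])
frame2 = tracker.update([])              # detector missed the object
frame3 = tracker.update([(12.0, 11.0)])  # object re-detected nearby
```

With this kind of grace period the box stays at its last known position during a gap. Predicting the position during the gap (for example with a constant-velocity model) or running detection only every few frames and tracking in between are possible extensions.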
@sach1n @fluidaudio @liquidai @Apple Using FluidAudio, my diarization results are very inaccurate. But I saw your demo, so I guess it works fine.
Can you share your function that handles the diarization?