Perception Research in Computer Vision. An open-source org exploring and advancing large-scale CV systems, particularly in tracking and action recognition.
Congrats @katylevinson @sidesolve! https://t.co/DjenCIxVMj
> The relator, Sidesolve, Inc., will receive a $1 million share as part of the settlement.
People are dumping on GPT wrappers, but tons of robot/machine vision companies are successfully charging $5K+/deployment for OpenCV wrappers, and not even good ones.
Built with VisionCamera. All the drawing and blurring is happening with JavaScript code, and it runs in realtime. No post-processing. This is going to change the entire mobile camera game. Get ready.
You can now import BabyAGI, AutoGPT, and the generative agents memory from `langchain.experimental`!
We started this module to hold more experimental code while we think about how best to include it in the core library.
What should we include next?
Is this the future of the metaverse?
This new generative AI tool creates text-to-space.
You can now create interactive virtual experiences through simple prompts.
After 7 hours - Why does it keep responding "As an AI assistant I am a large language model and as such I cannot ... xyz"
I didn't put that anywhere in the prompt. Where is it getting that from??
We're launching Guru Sports, an AI-powered dev toolkit for providing elite sports analysis on mobile video.
Nearly half of the 2023 NFL Draft's 1st-round are already using Guru Sports to train for next week's #NFLCombine with @les7spellman. Learn more👇🏾
https://t.co/7Hd9s5biaF
6. Reuse old Kaggle solutions
Kaggle solution writeups are like a database of battle-tested approaches.
If you have a similar enough problem to an old competition, those solution writeups are an absolute goldmine.
My first stop on any problem for "literature review" is Kaggle.
1/ Today we’re excited to unveil Sieve (@sievedata), a cloud native platform for processing, searching, and running all sorts of AI models on video.
https://t.co/VJe3lgBJuX
Floorplan-Aware Camera Poses Refinement
Anna Sokolova, Filipp Nikitin, Anna Vorontsova, Anton Konushin
tl;dr:
If you have a map or floorplan, use it to improve 3D reconstruction and the camera trajectory.
https://t.co/ruedEJgzhj
Bifrost: cross-platform p2p communications engine in Go. Modular daemon & library for web3.0 connectivity and @libp2p peers. Supports RPC & NATS PubSub, QUIC, and running in the web browser. https://t.co/lhgD4umY9a
Are language models (LMs) good models of the visual world? We show that without explicit grounding, LMs can directly use linear projections of image representations as soft prompts for vision-language (VL) tasks. This can be done without tuning the LM or image encoder!
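A minimal sketch of the idea, not the authors' code: image features from a frozen encoder pass through a single linear layer and are prepended to the text token embeddings as a soft prompt. Every name, dimension, and random value here is an assumption made for illustration. In a real setup only `weight` and `bias` would be trained while the LM and image encoder stay frozen.

```python
import random


def linear_project(features, weight, bias):
    """Apply y = W x + b to each feature vector.

    features: list of vectors of length d_img (frozen image encoder output)
    weight:   d_lm rows, each of length d_img (the only trainable matrix)
    bias:     vector of length d_lm
    """
    return [
        [sum(x * w for x, w in zip(vec, row)) + b for row, b in zip(weight, bias)]
        for vec in features
    ]


def build_lm_inputs(image_features, text_embeddings, weight, bias):
    """Prepend the projected image features (the soft prompt) to the text embeddings."""
    soft_prompt = linear_project(image_features, weight, bias)
    return soft_prompt + text_embeddings


# Toy dimensions, chosen only for illustration.
rng = random.Random(0)
d_img, d_lm = 4, 6          # image feature size, LM embedding size
n_patches, n_text = 3, 5    # image feature vectors, text tokens

image_features = [[rng.gauss(0, 1) for _ in range(d_img)] for _ in range(n_patches)]
text_embeddings = [[rng.gauss(0, 1) for _ in range(d_lm)] for _ in range(n_text)]
weight = [[rng.gauss(0, 0.02) for _ in range(d_img)] for _ in range(d_lm)]
bias = [0.0] * d_lm

lm_inputs = build_lm_inputs(image_features, text_embeddings, weight, bias)
print(len(lm_inputs), len(lm_inputs[0]))  # 8 6
```

The LM would then consume `lm_inputs` as its input embedding sequence in place of looked-up token embeddings, so the image "tokens" sit in the same space as the text without any change to the LM itself.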
> computer vision is one of the most significant technological steps in the last ten years and has fundamentally changed the scale of innovation.
https://t.co/5Yk8XDjwyu