The $TAO pack must remain attentive to our vision. Together we rise and together we fall.
We are airdropping $90,000 in $TAO to all our holders! #Bittensor
$TAO Airdrop:
https://t.co/0pUcaB8gd6
Expect amazing things coming next week!
$TAO
5pm EST, Thursday the 16th
We will be launching our first Bittensor Twitter Spaces (link below)
We will be using this slot to keep our community up to date with company news and answer questions.
In the Transformers movies, 9 Decepticons merge to form “Devastator”, a much larger and stronger bot.
This turns out to be a powerful paradigm for multimodal LLMs too. Instead of training a monolithic Transformer, we can combine many pre-trained experts into one system.
My team’s work, Prismer, is a representative example. We use a textual LM as the backbone, and plug in many visual domain experts through a neural adapter interface for deep integration.
Yesterday, Microsoft provided another example called “Visual ChatGPT”. It uses ChatGPT as a central communication hub and plugs in many black-box visual models, such as Stable Diffusion, pix2pix, and ControlNet. The result is a multimodal conversational AI that both understands and generates images, with ZERO trainable parameters: 🧵
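To make the "central hub" idea concrete, here is a rough sketch of the pattern: a router picks one frozen tool per request and nothing is trained. Everything in it is hypothetical. The tool functions are placeholder stubs, and `choose_tool` is a keyword match standing in for the language model's decision. It is not how Visual ChatGPT is actually implemented.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for frozen, black-box visual models.
def generate_image(prompt: str) -> str:
    return f"<image generated from: {prompt}>"


def edit_image(prompt: str) -> str:
    return f"<image edited per: {prompt}>"


def describe_image(prompt: str) -> str:
    return f"<caption for: {prompt}>"


# Registry of tools the hub can call. Adding a tool means adding an entry,
# not training anything.
TOOLS: Dict[str, Callable[[str], str]] = {
    "generate": generate_image,
    "edit": edit_image,
    "describe": describe_image,
}


def choose_tool(request: str) -> str:
    """Stub for the language model's routing decision (keyword match only)."""
    text = request.lower()
    if "edit" in text or "change" in text:
        return "edit"
    if "draw" in text or "generate" in text:
        return "generate"
    return "describe"


def hub(request: str) -> str:
    """Route a request to one frozen tool and return its output."""
    return TOOLS[choose_tool(request)](request)


if __name__ == "__main__":
    for message in ["Draw a cat", "Change the sky to red", "What is in this picture?"]:
        print(choose_tool(message), "->", hub(message))
```

In a real system the router would be the LLM itself, prompted with tool descriptions; the point of the sketch is only that the hub and the tools stay frozen.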
A Cascade of Foundation Models
-Combine diverse prior knowledge
DINO vision-contrastive info
CLIP lang-contrastive info
DALL-E vision info
GPT-3 lang info
-Ensembles via cache model
State-of-the-art few-shot prediction
Paper https://t.co/4pu0X52YOW
Code https://t.co/E8cwOypgPE
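A rough sketch of what "ensembles via cache model" can look like, assuming a key-value cache in which few-shot features are keys, their labels are values, and a query takes a similarity-weighted vote that is added to prior logits from a frozen model. The affinity function, `alpha`, `beta`, and all function names are illustrative assumptions; the paper's exact formulation may differ.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalize(v: Sequence[float]) -> List[float]:
    # Assumes v is a non-zero vector.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cache_logits(query, keys, labels, num_classes, beta=5.0):
    """Similarity-weighted vote of the cached few-shot examples."""
    q = normalize(query)
    logits = [0.0] * num_classes
    for key, label in zip(keys, labels):
        # Affinity is 1 for identical directions and decays as cosine
        # similarity drops (assumed form, sharpened by beta).
        affinity = math.exp(-beta * (1.0 - dot(q, normalize(key))))
        logits[label] += affinity
    return logits


def ensemble_predict(query, keys, labels, prior_logits, alpha=1.0, beta=5.0):
    """Add cache logits to prior (e.g. zero-shot) logits; return the argmax class."""
    cached = cache_logits(query, keys, labels, len(prior_logits), beta)
    combined = [p + alpha * c for p, c in zip(prior_logits, cached)]
    return max(range(len(combined)), key=combined.__getitem__)


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0]]   # toy cached few-shot features
    labels = [0, 1]                    # their class labels
    print(ensemble_predict([0.9, 0.1], keys, labels, prior_logits=[0.0, 0.0]))
```

With a flat prior the cache decides the class; a strong prior can still override it, which is the sense in which the two sources are ensembled.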
Exploration vs exploitation is a central problem in decision making, and applies broadly to research as well. Before GPT, I believe there was *too much* exploration - siloed models and fragmented training pipelines all over the place.
Now I feel there’s *too little*. We may have annealed the rate of exploration too aggressively. Going to either extreme is suboptimal.