Vincent Hsu @VincentHsuX - Twitter Profile

almost 2 years ago

🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to-date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in entirely new possibilities for casual creators and creative professionals alike.

More details and examples of what Movie Gen can do ➡️ https://t.co/M19x2ndwnr

🛠️ Movie Gen models and capabilities
* Movie Gen Video: 30B parameter transformer model that can generate high-quality and high-definition images and videos from a single text prompt.
* Movie Gen Audio: A 13B parameter transformer model that can take a video input along with optional text prompts for controllability to generate high-fidelity audio synced to the video. It can generate ambient sound, instrumental background music and foley sound, delivering state-of-the-art results in audio quality, video-to-audio alignment and text-to-audio alignment.
* Precise video editing: Using a generated or existing video and accompanying text instructions as an input it can perform localized edits such as adding, removing or replacing elements, or global changes like background or style changes.
* Personalized videos: Using an image of a person and a text prompt, the model can generate a video with state-of-the-art results on character preservation and natural movement in video.

We’re continuing to work closely with creative professionals from across the field to integrate their feedback as we work towards a potential release. We look forward to sharing more on this work and the creative possibilities it will enable in the future.

526

7K

2K

2M

VincentHsuX retweeted

Sam Altman

@sama

almost 2 years ago

check out canvas in chatgpt: https://t.co/aswD9SwW04

330

5K

514

981

1M

VincentHsuX retweeted

OpenAI Newsroom

@OpenAINewsroom

almost 2 years ago

We’ve raised new funding to accelerate our mission to ensure that artificial general intelligence benefits all of humanity. https://t.co/ya6PwCd2x4

120

2K

251

217

571K

VincentHsuX retweeted

Andrej Karpathy

@karpathy

almost 2 years ago

Input optional product

Don't ask your users for input. Coming up with input is hard, and a barrier to use. Think of users as wanting to play. We have AI - predict the input! Design products into autonomous environments. Allow users to play by steering a bit.

183

4K

293

1K

324K

VincentHsuX retweeted

Daniel

@growing_daniel

about 2 years ago

Silicon Valley (the series) is so back

67

3K

177

444

849K

Vincent Hsu @VincentHsuX

over 2 years ago

@eddiekao me too fuckkkkk

0

140

VincentHsuX retweeted

Josh Miller

@joshm

over 2 years ago

What are "websites" in an AI-first internet? (thinking aloud, no idea, would love feedback)

64

601

33

284

89K

VincentHsuX retweeted

AI Will @FinanceYF5

over 2 years ago

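OpenAI CEO Sam Altman says that, thanks to AI, a "one-person billion-dollar company" will emerge in the near future. Which type of AI solo entrepreneur do you think will be the first to reach the $1 billion milestone Sam Altman predicts? Builders? Content creators? AI models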

1

21

7

16

6K

VincentHsuX retweeted

Josh Miller

@joshm

over 2 years ago

a little Sunday surprise for you...

meet @browsercompany's 2nd product: 🔍Arc Search🔎

it's a default browser for your iPhone

...that BROWSES FOR YOU

the origin story is a bit unusual so I wanted to give you the full backstory:

381

4K

436

1K

1M

VincentHsuX retweeted

DHH

@dhh

over 2 years ago

"The best designers and the best programmers aren’t the ones with the best skills, or the nimblest fingers, or the ones who can rock and roll with photoshop or vim, they are the ones that can determine what just doesn’t matter. That’s where the real gains are made."

https://t.co/yYlqUlQiL2

51

2K

292

938

184K

VincentHsuX retweeted

OpenAI

@OpenAI

over 2 years ago

ChatGPT is turning 1! 🥳🎂 Today we’re celebrating our users around the world who are doing amazing things.

537

5K

780

217

750K

VincentHsuX retweeted

Sam Altman

@sama

over 2 years ago

i love the openai team so much

4K

68K

4K

1K

32M

VincentHsuX retweeted

Jerry Liu

@jerryjliu0

over 2 years ago

There's a new multi-modal RAG stack that's emerging, letting users do QA over complex documents and images.

Here's a diagram and 🧵 of what it consists of 👇

Multi-modal RAG extends beyond RAG in the following ways:
* Input: The input can be a text or image query.
* Embeddings: You can natively embed/index images with joint embeddings (CLIP). You can choose to embed text the same way or use specialized text embeddings (e.g. ada)
* Storage: Use a vector database to store images. The image file itself could live in a separate docstore or in a vector db. You can use the same vector db for image/text storage (e.g. @trychroma ) or separate collections
* Retrieval: Given a user query, the retrieved context can be text, images or both. If we use separate image/text embeddings, you need two retrieval calls.
* Synthesis: We can use either a multi-modal model (GPT-4V) that can take in both text and images, or a standard LLM (gpt-4-turbo) that takes in just text. If the latter, you may need to caption/summarize each image into text.
* Response: The returned result can be text or images.

We're building towards this future. We launched multi-modal indexing/retrieval abstractions in @llama_index today. We're building towards the following:
💡 More multi-modal embeddings/LLMs
💡 More ways to store images/text in different storage systems
💡 More ways to combine/apply lessons from text retrieval to image retrieval
💡 More ways to synthesize over arbitrary text/images

Blog: https://t.co/VtFgtOZzHi

Multi-modal RAG example: https://t.co/yh6e78frXK
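
The retrieval step described in the thread (separate text and image embeddings, hence two retrieval calls) can be illustrated with a small toy sketch. This is not the @llama_index API: the names `Item`, `top_k`, `retrieve`, `TEXT_INDEX` and `IMAGE_INDEX` are hypothetical, and the tiny hand-written vectors stand in for real embeddings such as CLIP or ada, which are not computed here.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    id: str
    modality: str   # "text" or "image"
    vector: list    # toy embedding; a real system would get this from a model


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, items, k):
    """Return the k items most similar to the query vector."""
    ranked = sorted(items, key=lambda it: cosine(query_vec, it.vector), reverse=True)
    return ranked[:k]


def retrieve(query_text_vec, query_image_vec, text_index, image_index, k=1):
    """Two retrieval calls, one per embedding space, merged into one context list."""
    text_hits = top_k(query_text_vec, text_index, k)
    image_hits = top_k(query_image_vec, image_index, k)
    return text_hits + image_hits


# Hypothetical in-memory "collections": 2-d text space, 3-d image space.
TEXT_INDEX = [
    Item("doc_a", "text", [1.0, 0.0]),
    Item("doc_b", "text", [0.0, 1.0]),
]
IMAGE_INDEX = [
    Item("img_a", "image", [1.0, 0.0, 0.0]),
    Item("img_b", "image", [0.0, 1.0, 0.0]),
]

if __name__ == "__main__":
    # The same user query embedded once per space (values made up for the demo).
    hits = retrieve([0.9, 0.1], [0.1, 0.9, 0.0], TEXT_INDEX, IMAGE_INDEX, k=1)
    print([(h.id, h.modality) for h in hits])
```

The merged list mixes text and image hits, so a synthesis step would either pass both to a multi-modal model or first caption each image into text for a text-only LLM, as the thread notes.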

4

443

91

477

91K

VincentHsuX retweeted

Sam Altman

@sama

over 2 years ago

GPTs can save a lot of effort:

2K

42K

3K

7M

Vincent Hsu @VincentHsuX

over 2 years ago

@chingtsengtw murrrrrrrr

0

121

VincentHsuX retweeted

Kai-Fu Lee

@kaifulee

over 2 years ago

I must say China is now behind the U.S. in the LLM ecosystem momentum. The underlying vision for me to start @01AI_Yi is to make better AI accessible to more people. We are glad our first moderate-size Yi-34B performs competitively at a global level. More to come soon after this base model. https://t.co/WjUd18RN7f https://t.co/OCcpAtqgTQ

31

533

70

110

254K

Vincent Hsu @VincentHsuX

over 2 years ago

@mckaywrigley glad that ur back hahah

0

224

Vincent Hsu @VincentHsuX

over 2 years ago

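I've been a product manager in the internet industry for more than 10 years and have spent nearly a year going all in on AI, researching many no-code AI and ChatGPT methods and how to automate work with AI flows. I believe the future will be the era of the AI creator. AI course: https://t.co/2BpXxFVVOB (NT$500 discount code: aicreator) #AI #ChatGPT #AICreator #AIAutomation #AIGeneration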