12kaz @12_technology - Twitter Profile

Pinned Tweet

12kaz @12_technology

about 4 years ago

I tried denoising, deblurring, and super-resolution with the machine learning method NAFNet. Color noise removal is quite accurate. Usage instructions are here: https://t.co/bE6fo39q1K

1

3K

700

652

0

12_technology retweeted

Axross Recipe: A place to learn practical knowledge together

@AxrossRecipe_SB

over 1 year ago

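＼🍁November Paid Recipe Ranking🏆／ 🥇: A recipe for 3D object detection with PointNet 🥈: A recipe for object detection and object tracking with YOLO and StrongSORT 🥉: A recipe for developing a retrieval-augmented generation (RAG) app with MemoRAG. For 4th place and below, please see this URL👇 https://t.co/fm6OCqZaov #AxrossRecipe #AI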

AxrossRecipe_SB's tweet photo: https://t.co/EOTMK1met6

0

4

2

0

426

12_technology retweeted

Axross Recipe: A place to learn practical knowledge together

@AxrossRecipe_SB

almost 2 years ago

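＼🍀August Paid Recipe Ranking🍀／ 🥇: A recipe for text generation with rinna's Japanese-specialized GPT model 🥈: A recipe for deploying a Django app that lets you chat with a GPT3.5-series model, the same NLP model as ChatGPT 🥉: A recipe for generating images from text with Imagen https://t.co/s9pSRpDBe9 #AxrossRecipe #生成AI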

0

3

2

0

927

12_technology retweeted

Axross Recipe: A place to learn practical knowledge together

@AxrossRecipe_SB

almost 2 years ago

This recipe shows how to convert videos with StreamV2V, which runs up to 158 times faster than prior techniques, and how to build a real-time image generation app with StreamV2V and Gradio. https://t.co/f2gsICda60 #AxrossRecipe #StreamV2V #画像生成 #動画変換

0

4

2

0

262


12_technology retweeted

Jinbo Xing @Double47685693

over 2 years ago

🚀Our DynamiCrafter just got a massive upgrade!🚀 🎯Better Dynamic, Higher Resolution and Stronger Coherence! Code - https://t.co/pRw7zy6sry Project - https://t.co/F55LCmliep Demo - https://t.co/8aam4JKAAM

6

68

19

52

42K

12_technology retweeted

Zuntan @Zuntan03

over 2 years ago

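EasyZatuGen takes a short theme in Japanese and loosely generates an image-generation prompt with its Japanese translation, an upscaled picture, and lines of dialogue with emotion-tagged voice audio. It combines calm2-chat-AWQ, StreamDiffusion, and Style-Bert-VITS2, and generates everything locally. Requires an RTX 3060 12GB. Voice alone needs 8GB. https://t.co/SvOeKTyqEH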

3

276

87

216

40K

12_technology retweeted

深津貴之 / THE GUILD, note

@fladdict

over 2 years ago

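reading... The moment someone learned that a song they liked was AI-generated. "I suddenly became anxious that there was no human behind what had moved me. I realized I was not mentally prepared yet." They were also shocked by the reality that they could no longer tell AI-made songs apart. https://t.co/mw3xSJporK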

6

3K

516

731

327K

12_technology retweeted

Alex Carlier @alexcarliera

over 2 years ago

Google just announced VideoPoet: a multimodal video generation model! It's massively multimodal and can take as input text, image, depth & optical flow, or a masked video, and it is one of the first models that generates video + audio! More info below ⬇️⬇️

9

315

75

191

43K

12kaz @12_technology

over 2 years ago

I used Stable Video Diffusion to generate a video from an image. The input is a single image. You can try it here: https://t.co/yks0DasuEA #AI #AIArt #stablediffusion #AIイラスト

0

7

0

238

12kaz @12_technology

over 2 years ago

Google Colab has started offering code generation from text. It should cut down on tedious lookups, such as how to display an image.

0

1

0

157

12_technology retweeted

AK

@_akhaliq

over 2 years ago

Stability releases Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets. Model: https://t.co/WhmyO2u14k We present Stable Video Diffusion, a latent video diffusion model for high-resolution, state-of-the-art text-to-video and image-to-video generation. Recently, latent diffusion models trained for 2D image synthesis have been turned into generative video models by inserting temporal layers and finetuning them on small, high-quality video datasets. However, training methods in the literature vary widely, and the field has yet to agree on a unified strategy for curating video data. In this paper, we identify and evaluate three different stages for successful training of video LDMs: text-to-image pretraining, video pretraining, and high-quality video finetuning.

6

640

150

202

200K

12_technology retweeted

AK

@_akhaliq

over 2 years ago

Drivable 3D Gaussian Avatars. Paper page: https://t.co/GfUShcV1ZZ We present Drivable 3D Gaussian Avatars (D3GA), the first 3D controllable model for human bodies rendered with Gaussian splats. Current photorealistic drivable avatars require either accurate 3D registrations during training, dense input images during testing, or both. The ones based on neural radiance fields also tend to be prohibitively slow for telepresence applications. This work uses the recently presented 3D Gaussian Splatting (3DGS) technique to render realistic humans at real-time framerates, using dense calibrated multi-view videos as input. To deform those primitives, we depart from the commonly used point deformation method of linear blend skinning (LBS) and use a classic volumetric deformation method: cage deformations. Given their smaller size, we drive these deformations with joint angles and keypoints, which are more suitable for communication applications. Our experiments on nine subjects with varied body shapes, clothes, and motions obtain higher-quality results than state-of-the-art methods when using the same training and test data.

12

1K

316

781

327K

12_technology retweeted

camenduru

@camenduru

over 2 years ago

Little @Gradio ❤ code 🌿 and little @diffuserslib ❤ code 🍅 real-time drawing app is done 🥗 Thanks to @SimianLuo (LCM) ❤ @Gradio Team ❤ @diffuserslib Team ❤ 🦒colab: please try it 🐣 https://t.co/jTbTe5w2EI

4

113

32

43

22K

12_technology retweeted

AK

@_akhaliq

over 2 years ago

One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion paper page: https://t.co/r7ylXY0Tlv Recent advancements in open-world 3D object generation have been remarkable, with image-to-3D methods offering superior fine-grained control over their text-to-3D counterparts. However, most existing models fall short in simultaneously providing rapid generation speeds and high fidelity to input images - two features essential for practical applications. In this paper, we present One-2-3-45++, an innovative method that transforms a single image into a detailed 3D textured mesh in approximately one minute. Our approach aims to fully harness the extensive knowledge embedded in 2D diffusion models and priors from valuable yet limited 3D data. This is achieved by initially finetuning a 2D diffusion model for consistent multi-view image generation, followed by elevating these images to 3D with the aid of multi-view conditioned 3D native diffusion models. Extensive experimental evaluations demonstrate that our method can produce high-quality, diverse 3D assets that closely mirror the original input image.