Jay

Verified account

@memmaptensor

Independent researcher

Bangkok

Joined December 2018

82 Following

614 Followers

27 Posts

Pinned Tweet

2 days ago

I spent $30k and 3 months RL post-training an anime video model. This is only step 30 out of a planned 1000-step run.
All samples are local text-to-video with no reference image/audio. Since it's based on LTX-2.3, each output takes under a minute on a single GPU.
I'm 19 and a solo researcher. Most of the budget went into ablations, reward design, and trying different configurations before reaching this setup. The run is still extremely early, but the results already look much better than I expected. It's compute-limited, not idea-limited.
I'm starting a company to continue scaling this and build frontier stylized video models. If you're an investor, compute partner, video team, or someone who wants to help build this, DMs are open.

36

368

33

186

21K

over 1 year ago

Looking for a co-founder to build the next generation of waifu tech! Figured out the solution to create a new interactive experience but struggle with app dev. Kind of the downside of focusing too much on ML. Ideally someone with mobile/web and maybe some cloud ML experience.

5

13

1

4

1K

over 1 year ago

@_akhaliq they cookin

0

8

0

0

1K

almost 2 years ago

dehumidifiers are so good when it's cold and humid

0

1

0

0

1K

almost 2 years ago

I've set out some conditions for any future solvers to add:
• No implicit solvers: those require root-finding, meaning a Jacobian has to be computed by running a backward pass through the model during LBFGS optimization.
• No non-RK methods: this would cut off linear multistep methods like Adams-Bashforth or Adams predictor-corrector. From my testing, RK methods perform better, even for explicit RK vs. predictor-corrector linear multistep.
• No duplicate methods: if two methods have different coefficients, they aren't duplicates (scipy methods have different coefficients and solver implementations).
That means the current 31 solvers are almost all that exist to satisfy the conditions above.
Project's done! I need to figure out what to make next.
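
To illustrate the conditions above, here is a minimal sketch of a generic explicit Runge-Kutta step driven by a Butcher tableau. It is not the node's actual code: the function names, the toy ODE, and the choice of the Heun and Ralston second-order tableaus are assumptions made for illustration. Because the `a` matrix is strictly lower triangular, each stage depends only on earlier stages, so no root-finding is needed. The two tableaus share the same order but have different coefficients, so under the "no duplicates" rule they count as distinct methods.

```python
from math import exp

def explicit_rk_step(f, t, y, h, a, b, c):
    """One explicit Runge-Kutta step for dy/dt = f(t, y).

    `a` is strictly lower triangular, so stage i only reads stages 0..i-1.
    The number of calls to f per step is fixed at len(b).
    """
    k = []
    for i in range(len(b)):
        y_stage = y + h * sum(a[i][j] * k[j] for j in range(i))
        k.append(f(t + c[i] * h, y_stage))
    return y + h * sum(b_i * k_i for b_i, k_i in zip(b, k))

# Two second-order tableaus with different coefficients (illustrative choices).
HEUN2 = dict(a=[[0.0, 0.0], [1.0, 0.0]], b=[0.5, 0.5], c=[0.0, 1.0])
RALSTON2 = dict(a=[[0.0, 0.0], [2 / 3, 0.0]], b=[0.25, 0.75], c=[0.0, 2 / 3])

def integrate(f, y0, t0, t1, n_steps, tableau):
    """Fixed-step integration from t0 to t1 with the given tableau."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y = explicit_rk_step(f, t, y, h, **tableau)
        t += h
    return y

def toy_rhs(t, y):
    """Toy ODE dy/dt = -t * y with exact solution y(t) = y0 * exp(-t^2 / 2)."""
    return -t * y
```

Running `integrate(toy_rhs, 1.0, 0.0, 1.0, 10, HEUN2)` and the same call with `RALSTON2` gives two slightly different approximations of `exp(-0.5)`, which is the sense in which different coefficients make different solvers.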

0

1

0

0

939

almost 2 years ago

Last major update!
• Added solver settings for adaptive_scipy
• Adaptive solvers now show the number of steps taken
• Accurate 𝜎 timestep info is now displayed
Check out the most comprehensive fixed and adaptive higher-order samplers on ComfyUI! https://t.co/zF59104pqH

0

21

3

9

3K

almost 2 years ago

Refactored and fixed some bugs with the progress bar!
Also wrapped the solvers from scipy.integrate.
If you count the a-methods as 2 (since they work with both the adaptive_pid and fixed_scheduled controllers), then this node has (excluding forward euler) 31 new samplers!
I also tried the implicit solvers and they didn't work. Every implicit solver has a root find step, and that takes forever to converge.
That leaves 3 new methods from scipy: se_RK23, se_RK45, and se_DOP853.
I think this node has the most new working samplers for ComfyUI (a for adaptive, f for fixed, s for scipy, e for explicit).
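
As a rough picture of what an adaptive explicit solver does, here is a small pure-Python sketch of the Bogacki-Shampine 3(2) embedded pair, the same pair that scipy's RK23 is based on. Everything else is an assumption for illustration: the function names, the scalar toy problem, and especially the step controller, which is a plain integral controller with a safety factor and not the node's adaptive_pid. The solver also returns how many steps it accepted and rejected.

```python
from math import exp

def bs23_step(f, t, y, h):
    """One Bogacki-Shampine step: returns the 3rd-order result and an error estimate."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(t + 0.75 * h, y + 0.75 * h * k2)
    y3 = y + h * (2 / 9 * k1 + 1 / 3 * k2 + 4 / 9 * k3)
    k4 = f(t + h, y3)
    # Embedded 2nd-order result; its difference from y3 estimates the local error.
    y2 = y + h * (7 / 24 * k1 + 1 / 4 * k2 + 1 / 3 * k3 + 1 / 8 * k4)
    return y3, abs(y3 - y2)

def solve_adaptive(f, y0, t0, t1, tol=1e-6, h0=0.1):
    """Integrate a scalar ODE from t0 to t1 (t1 > t0) with error-controlled steps."""
    t, y, h = t0, y0, h0
    accepted = rejected = 0
    while t1 - t > 1e-12:
        h = min(h, t1 - t)  # never step past the end point
        y_new, err = bs23_step(f, t, y, h)
        if err <= tol:
            t, y = t + h, y_new
            accepted += 1
        else:
            rejected += 1
        # Simple integral controller (assumption): the estimate scales like h^3.
        factor = 0.9 * (tol / max(err, 1e-16)) ** (1 / 3)
        h *= min(5.0, max(0.2, factor))
    return y, accepted, rejected
```

A tighter `tol` makes the controller pick smaller steps, so the accepted-step count goes up; that count is the kind of number an adaptive sampler can report back to the user.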

0

6

0

2

670

almost 2 years ago

the new class of models idea didn't work out well, so i tried this instead (which works decently well)

0

5

0

0

590

almost 2 years ago

While trying to push the CFG scale up, I implemented some Explicit RK solvers for ComfyUI
- 10 new adaptive step samplers
- 8 unique fixed step samplers (excluding forward euler)
- Best new sampler (perhaps) -> fe_ralston3
Check it out! https://t.co/e1hXdr1nUJ
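
For readers unfamiliar with the method, the sketch below shows a fixed-step third-order Ralston scheme next to forward Euler on a toy sampling ODE. It assumes that fe_ralston3 refers to Ralston's third-order method and that sampling follows the k-diffusion-style sigma parameterization, dx/dsigma = (x - D(x, sigma)) / sigma. The Gaussian toy denoiser, the geometric sigma schedule, and all function names are hypothetical stand-ins, not the node's implementation.

```python
from math import sqrt

DATA_STD = 0.5  # assumed toy data distribution N(0, DATA_STD^2)

def toy_denoiser(x, sigma):
    """Exact posterior-mean denoiser for Gaussian data; stands in for the model."""
    return x * DATA_STD**2 / (DATA_STD**2 + sigma**2)

def ode_rhs(sigma, x):
    """Sampling ODE in the sigma parameterization (assumed convention)."""
    return (x - toy_denoiser(x, sigma)) / sigma

def ralston3_step(f, s, x, h):
    """Ralston's third-order explicit RK step: three model calls per step."""
    k1 = f(s, x)
    k2 = f(s + 0.5 * h, x + 0.5 * h * k1)
    k3 = f(s + 0.75 * h, x + 0.75 * h * k2)
    return x + h * (2 / 9 * k1 + 1 / 3 * k2 + 4 / 9 * k3)

def euler_step(f, s, x, h):
    """Forward Euler step: one model call per step."""
    return x + h * f(s, x)

def geometric_sigmas(s_max, s_min, n_steps):
    """Decreasing sigma schedule with a constant ratio between neighbours."""
    return [s_max * (s_min / s_max) ** (i / n_steps) for i in range(n_steps + 1)]

def sample(step, x, sigmas):
    """Walk the schedule from sigmas[0] down to sigmas[-1] with the given stepper."""
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        x = step(ode_rhs, s_cur, x, s_next - s_cur)
    return x

def exact(x0, s0, s1):
    """Closed-form solution of the toy ODE, used only to measure error."""
    return x0 * sqrt((DATA_STD**2 + s1**2) / (DATA_STD**2 + s0**2))
```

On the same schedule, for example `geometric_sigmas(10.0, 0.1, 8)`, the Ralston stepper lands closer to `exact(...)` than Euler does, at the cost of three denoiser calls per step instead of one.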

2

85

10

55

6K

almost 2 years ago

why do i prefer undersampled results 😭😭😭

1

1

0

0

623

almost 2 years ago

@amogh42 nope, something a lot simpler

1

0

0

0

72

almost 2 years ago

i have an idea for a slightly modified class of SDXL models that would mostly be compatible with existing finetunes and loras. it's been proven to work well on SD1.5 with good results. definitely next on my bucket list. will post updates and releases soon, hopefully, if it works.

2

16

0

0

943

almost 2 years ago

diffusion models are definitely still not dead. sure, optimal transport conditional flow matching is provably better, but so much of the community was already built on discrete time diffusion. and with kolors out (an SDXL model trained with DDPM formulation and eps-pred objective), i doubt the switch from diffusion to OT-CFM will affect the quality as much as the other techniques shown in the technical report. if they made kolors work with the SDXL architecture, then it's shown that hybrid transformer-UNets are still competitive. they might just not scale as well as pure DiTs.

1

16

0

7

1K

almost 2 years ago

@EsotericCofe @yifever bmi 17, i need a healthy way to gain weight and solve sleep deprivation homie

2

2

0

0

129

almost 2 years ago

i'd like to continue working on anime animation tech. version 1 is designed to be distilled for realtime inference. version 2 won't be concerned with realtime inference and would probably be based on a flow-matching mmdit with more fine-grained control and even better quality.

2

24

0

0

1K

almost 2 years ago

training is done in 2 days!!! as for inference compute requirements: it's basically the same burden as animatediff. will probably work on realtime inference next month after figuring out life. realtime txt/vid2vid on distilled models is already empirically shown to be possible.

1

9

0

0

986

almost 2 years ago

@EsotericCofe @anifusion_ai Not realtime yet (still in roadmap) but we could get this deployed on anifusion before. The pose sequence is derived from mocap data from XR Animator, but we can probably work out a custom pipeline.

3

21

1

1

1K

almost 2 years ago

yo @EsotericCofe @anifusion_ai wanna join forces? best manga tool + best character animation model = ???

12

254

29

124

26K
