Patrick Walton @pcwalton - Twitter Profile

over 1 year ago

Landed multi-draw-indirect and bindless textures in Bevy 0.16! It cuts our CPU overhead for drawing in half on many scenes and, along with the retained render-world scene graph I also landed, brings us significantly closer to GPU-driven rendering.

Patrick Walton @pcwalton

over 1 year ago

@matiasgoldberg @godotengine @john_clayjohn Ah, Adreno, the eternal source of pain

Patrick Walton @pcwalton

over 1 year ago

Learned the other day: On GPU, inverse trigonometric functions are way more expensive than regular trigonometric functions. But on CPU, they're about the same speed (source: https://t.co/LYiDHquWIu)

Patrick Walton @pcwalton

over 1 year ago

One thing that would make Rust compiles faster in many cases is for Cargo to insta-kill any rust-analyzer processes when you run cargo build. That "Blocking" message wastes a lot of time.

Patrick Walton @pcwalton

over 1 year ago

Sorry, but at this point if you think Bevy's ECS is a burden for app logic I just don't believe you. You have to organize your code somehow. The question is whether the framework helps you do it or not.

Patrick Walton @pcwalton

over 1 year ago

Bevy 0.15 is out :) This time around, I wrote animation masks, additive blending, most of the Bevy Remote Protocol, chromatic aberration, point lights and spot lights for volumetric fog, and PCSS.

Bevy Engine @BevyEngine

over 1 year ago

Bevy 0.15 is out now! It features Required Components, Entity Picking, Generalized Entity Animation, Animation Masks, Curves, Function reflection, Bevy Remote Protocol, VBAO, OIT, Chromatic Aberration, Fog volumes, Better Text, and more! https://t.co/BJRmESL09s

Patrick Walton @pcwalton

over 1 year ago

Idly wondering if Objective-C/Swift have precise enough reachability information to try running a Bacon cycle collector if memory is running low.

Patrick Walton @pcwalton

over 1 year ago

Wanted: a `#[derive_fast_hash]` that gives you a fast hash and compare implementation that goes as many bytes as it can at a time, for types with no padding anywhere (the macro would guarantee this somehow).
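
A rough sketch of what such a derive might expand to, written by hand for one type. Everything here is an assumption for illustration: the `Key` struct, the `words` and `hash_of` helpers, and the compile-time size check that stands in for the "no padding" guarantee the macro would have to provide. No such macro is claimed to exist.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::mem::size_of;

// Hypothetical example type: `repr(C)` with fields ordered so no padding is needed.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct Key {
    pub a: u64,
    pub b: u32,
    pub c: u32,
}

// Stand-in for the macro's guarantee: the struct is exactly as large as the
// sum of its fields, so it contains no padding bytes.
const _: () = assert!(size_of::<Key>() == size_of::<u64>() + 2 * size_of::<u32>());

impl Key {
    // View the whole value as two 64-bit words.
    fn words(&self) -> [u64; 2] {
        // SAFETY: `Key` is `repr(C)`, 16 bytes, has no padding (checked above),
        // and consists only of integer fields, so every byte is initialized.
        unsafe { std::mem::transmute::<Key, [u64; 2]>(*self) }
    }
}

// Compare eight bytes at a time instead of field by field.
impl PartialEq for Key {
    fn eq(&self, other: &Self) -> bool {
        self.words() == other.words()
    }
}
impl Eq for Key {}

// Feed the hasher whole words instead of one field at a time.
impl Hash for Key {
    fn hash<H: Hasher>(&self, state: &mut H) {
        for w in self.words() {
            state.write_u64(w);
        }
    }
}

// Convenience helper (hypothetical) that hashes a key with the std default hasher.
pub fn hash_of(k: &Key) -> u64 {
    let mut h = DefaultHasher::new();
    k.hash(&mut h);
    h.finish()
}
```

The word-wise comparison is only sound because the size check rules out padding; with padding bytes present, two logically equal values could compare unequal.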

Patrick Walton @pcwalton

over 1 year ago

The iOS ecosystem has gotten so close to having true JIT support on iOS that I'm not sure that the delta between all the workarounds and true JIT support is that meaningful anymore. You can already JIT in wasm or JS, or you can interpret, or you can get a JIT entitlement...

Patrick Walton @pcwalton

over 1 year ago

Really interesting deep dive into virtual geometry (Nanite) in Bevy 0.15: https://t.co/8vrecNAzCL

Patrick Walton @pcwalton

over 1 year ago

FYI I'll be posting future updates over there in addition to here: https://t.co/M30SrZnnpa Like everyone, I'd like to reduce my time on the hellsite.

Patrick Walton @pcwalton

over 1 year ago

@GustavSterbrant HLSL semantics are the reason why I'm still "you'll pry GLSL out of my cold dead hands" despite the industry-wide shift toward HLSL.

Patrick Walton @pcwalton

over 1 year ago

@Barteks2x @Lazin I agree (and this is one point of disagreement I have with the author of that code in JSC :)) But I certainly believe it'd be way faster than what native mallocs do for small allocations.

Patrick Walton @pcwalton

over 1 year ago

Honestly, I feel like programmers are way too quick to assume that "allocation" = "slow" is an iron law of the universe. In the JVM it's like 5 instructions. The problem with C++/Rust allocators is that they're tuned to the workloads they observe in programs... (1/2)

zack

@zack_overflow

over 1 year ago

This is why Zig and Rust are sane and C++ is crazy. Just assigning a variable to another can cause heap allocations. I imagine std is littered with all these hidden allocations. Imagine how hard it must be to write performant software. https://t.co/Hn6PhKr5A0

Patrick Walton @pcwalton

over 1 year ago

@DrawsMiguel As I recall you also need to hack it to stop checking for tracing/logging at runtime, and some other things. There's a lot of little needless overheads in there. (When I offered to fix it back then I was told "optimizing small allocation perf isn't important for our workloads".)

Patrick Walton @pcwalton

over 1 year ago

A lot of C++/Rust malloc overhead comes from the loosely coupled malloc(size_t) interface. For example, the allocator has to compute which bin to use at runtime, when most of the time the compiler knows the size and could precompute the bin offset.
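
To illustrate the point about precomputing the bin, here is a minimal sketch in which the size-to-bin mapping is a `const fn`, so it can be evaluated at compile time whenever the size is a constant. The size classes, `bin_for`, and `BinOf` are hypothetical names chosen for this example and do not come from any real allocator.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical size classes in bytes; real allocators use many more.
const SIZE_CLASSES: [usize; 6] = [16, 32, 64, 128, 256, 512];

// Map a request size to the index of the smallest class that fits it.
// Because this is a `const fn`, the compiler can evaluate it ahead of time
// when `size` is known statically.
pub const fn bin_for(size: usize) -> usize {
    let mut i = 0;
    while i < SIZE_CLASSES.len() {
        if size <= SIZE_CLASSES[i] {
            return i;
        }
        i += 1;
    }
    // Sentinel: the request is too large for the small-object bins.
    SIZE_CLASSES.len()
}

// Per-type bin index, computed once by the compiler from `size_of::<T>()`.
pub struct BinOf<T>(PhantomData<T>);

impl<T> BinOf<T> {
    pub const INDEX: usize = bin_for(size_of::<T>());
}
```

With a plain `malloc(size_t)` call the same lookup has to run on every allocation; here `BinOf::<T>::INDEX` is an associated constant, so a typed allocation path could use it directly as a table offset.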

Patrick Walton @pcwalton

over 1 year ago

I don't think 5 insns is feasible, but you might be able to get to around 10. Load TLAB, load bin, check to see if bump mode/pop mode/slow path, bump or pop as necessary.
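
The sequence described above (look up the bin, then pop, bump, or fall back to the slow path) might look roughly like this. It is a simplified model that hands out offsets rather than raw pointers, and the `Bin` and `Alloc` types, their fields, and the choice to try the free list before bumping are all assumptions made for the sketch.

```rust
// Outcome of a fast-path allocation attempt.
#[derive(Debug, PartialEq, Eq)]
pub enum Alloc {
    // Offset of the new block within the bin's buffer.
    Offset(usize),
    // Fast path exhausted; a real allocator would refill the bin here.
    SlowPath,
}

// Hypothetical per-bin state inside a thread-local allocation buffer (TLAB).
pub struct Bin {
    block_size: usize,
    bump: usize,           // next never-used offset
    end: usize,            // end of the bin's buffer, in bytes
    free_list: Vec<usize>, // offsets of freed blocks, reused in "pop" mode
}

impl Bin {
    pub fn new(block_size: usize, capacity: usize) -> Self {
        Bin { block_size, bump: 0, end: capacity, free_list: Vec::new() }
    }

    pub fn alloc(&mut self) -> Alloc {
        // Pop mode: reuse a previously freed block if one is available.
        if let Some(off) = self.free_list.pop() {
            return Alloc::Offset(off);
        }
        // Bump mode: carve a fresh block off the end of the used region.
        if self.bump + self.block_size <= self.end {
            let off = self.bump;
            self.bump += self.block_size;
            return Alloc::Offset(off);
        }
        // Neither worked: defer to the slow path.
        Alloc::SlowPath
    }

    // Return a block to this bin so a later `alloc` can pop it.
    pub fn dealloc(&mut self, off: usize) {
        self.free_list.push(off);
    }
}
```

In a real allocator the free list would typically be threaded through the freed blocks themselves rather than kept in a separate `Vec`, which is part of what keeps the fast path down to a handful of instructions.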

Patrick Walton @pcwalton

over 1 year ago

Also consider malloc logging/tracing features. Very convenient! But it adds a runtime check on every allocation. When your point of comparison is ~5 instructions as in the JVM, those tiny branches add up.

Patrick Walton @pcwalton

over 1 year ago

@Barteks2x @Lazin From what I hear JavaScriptCore has a non moving allocator that has a similarly fast path ("bump and pop").

Patrick Walton @pcwalton

over 1 year ago

@DrawsMiguel No, it's true with jemalloc too. When I measured jemalloc (which was a few years ago) it was still something like 80 instructions in the fast path. This is an order of magnitude difference compared with the JVM.

Patrick Walton @pcwalton

over 1 year ago

@Lazin Last I profiled jemalloc it was still like 80+ instructions even in the fast path. That's an order of magnitude difference. You really want to have the compiler start inlining the fast paths, like the JVM does.
