Jack @jackowfish - Twitter Profile

Jack @jackowfish

7 days ago

@joinshiftX shift

0

28

Jack @jackowfish

4 months ago

@rheejust @porterdotrun @FirstMarkCap @ycombinator @daltonc @ROWGHANI Incredibly grateful I get to come in every day and build something amazing with this team. Excited to see where we are when we look back in a year or two.

1

2

0

136

jackowfish retweeted

Porter

@porterdotrun

7 months ago

We recently had our highest-ROI cold call of all time… but not in the way you would expect.

A few months ago, we got a cold call from Grace Decker / @graceeedeckerrr, an SDR at @brexHQ. Nothing unusual until she asked at the end: “By the way, are you hiring engineers? My brother’s looking for a role in NYC.”

Turns out, her brother @Jackowfish was a founding engineer at another YC infra company in the midst of relocating. Fast-forward, he just finished his work trial with us and will be joining Porter in the next few weeks.

To me, this experience highlights two things:

1. The power of making the ask (you truly never know what might come from it).
2. How great talent can come from anywhere if you remain open-minded as a startup.

It goes both ways - Jack later told us that he (understandably) took that first call with zero expectations. All parties (candidates and companies alike) want to find the hidden gems, but the world is efficient. Unexpected opportunities generally require unexpected openness, and saying yes to the chances larger companies would dismiss is precisely the edge you have when hiring as a startup.

Finally, shout out to Brex for going the extra mile for their customers.

0

5

3

1

345

jackowfish retweeted

Carl Peterson

@carlpeterson

about 1 year ago

Proud of our team's hard work and can't wait to add someone else👀👀 Teriyaki orders are about to get out of hand

0

7

1

426

jackowfish retweeted

Thunder Compute @ThunderCompute

about 1 year ago

Wednesday vibes

1

8

2

0

610

jackowfish retweeted

Carl Peterson

@carlpeterson

about 1 year ago

Apple intelligence sucks. Can't even run nvidia-smi. Apple uses weak, small LLMs. Big LLMs are better. Thankfully, I got 80GB of VRAM on my iPhone with @ThunderCompute nvidia-smi ✅

1

6

3

2

3K

jackowfish retweeted

Carl Peterson

@carlpeterson

over 1 year ago

We just got our VS Code extension published! Our team has been using this nonstop for internal development. It truly feels like your computer has GPUs attached. If you use @code or @Cursor this is the simplest way to use cloud GPUs. And it was already the cheapest.

carlpeterson's tweet photo: https://t.co/ZUWcGGZxbo

1

14

3

1

354

Jack @jackowfish

over 1 year ago

Spent the last couple of days putting together a VS Code extension for @ThunderCompute - actually the fastest way to get a GPU cloud instance to build with LLMs in Cursor. 5 clicks and a 30-second wait and you're building with server-grade GPUs right in your editor.

3

6

5

0

586

Jack @jackowfish

over 1 year ago

We’re in 1956 bubble sort land right now. The wall right now may be compute, but I have a feeling we’re going to hit the ‘59 quicksort moment and realize most of these lookups are incredibly non-optimal.

Taelin

@VictorTaelin

over 1 year ago

My take from GPT-4.5 is that humanity has designed an AGI architecture - it is just prohibitively expensive. This model is not great, because training a $1 billion transformer only gives us a 12.5% improvement over a $100 million one, in a paradigm where, apparently, utility scales logarithmically with training cost... That also means that a dense GPT-5 would be only ~11% better than GPT-4.5, for the cost of $10 billion. Similarly, to get a jump as big as the one we've seen from GPT-2 to GPT-4, we'd need to train a GPT-7 (*not* a GPT-6), and that would cost about $100 trillion, i.e., the world's entire GDP. So, that's the wall: we saturated humanity's capacity to scale. Or, to be more specific, we'd need 1,000,000x more compute than GPT-4, to see that sort of jump again. Some argue that reasoning breaks this wall, but I feel like it only weakens it. If test-time compute laws hold, then, we'd need a GPT-4 scale model to "think for 100 million tokens per output token" to emulate a GPT-7. Except it would take days to produce each token. That's not viable. So, unless we make 1,000,000 clones of planet Earth, we could be stuck at roughly this capacity for several decades, and never see a jump as big as the one from GPT-2 to GPT-4 again. Unless, of course, new ways to improve the efficiency of these systems are discovered. AGI has become an optimization problem. I, for one, suspect that GPTs are embarrassingly sub-optimal, and that these big matrix multiplications are merely emulating an underlying "learning algorithm" with a massive overhead. Now, it isn't hard to see that, at this scale, a single attention (i.e., "neural dict") pass takes easily more than 1,000,000x the compute of a dict lookup. If that is true, it wouldn't be surprising if the first team to break the "matmul wall" would be able to train a model equivalent to GPT-4 for as little as $100.
Of course, attention is doing much more than a dict lookup; but we don't know what it is doing that leads to reasoning capacities. And, once we figure that out, we may be able to have GPT-7 for the cost of GPT-4, and not for the world's entire GDP. That said, this would require a complete redesign. Gradient descent and matmuls have to be replaced by something entirely different - and nobody knows what that would be. It took us decades to go from neural nets to transformers, so, it could take us a decade to figure this out. Or someone could be struck by a rush of inspiration and it would happen overnight... Anyway sorry if I got some napkin math wrong, and all the respect for OpenAI for this release. Publishing a result that isn't a complete success is great science. Now I just want to understand what transformers are emulating, and how we can do the same, for less. I have many ideas, and I have many experiments to run... I'll try not to disappear completely but excuse me if I do

222

2K

157

945

370K

1

2

0

245

jackowfish retweeted

Taelin

@VictorTaelin

over 1 year ago

My take from GPT-4.5 is that humanity has designed an AGI architecture - it is just prohibitively expensive. This model is not great, because training a $1 billion transformer only gives us a 12.5% improvement over a $100 million one, in a paradigm where, apparently, utility scales logarithmically with training cost... That also means that a dense GPT-5 would be only ~11% better than GPT-4.5, for the cost of $10 billion. Similarly, to get a jump as big as the one we've seen from GPT-2 to GPT-4, we'd need to train a GPT-7 (*not* a GPT-6), and that would cost about $100 trillion, i.e., the world's entire GDP. So, that's the wall: we saturated humanity's capacity to scale. Or, to be more specific, we'd need 1,000,000x more compute than GPT-4, to see that sort of jump again. Some argue that reasoning breaks this wall, but I feel like it only weakens it. If test-time compute laws hold, then, we'd need a GPT-4 scale model to "think for 100 million tokens per output token" to emulate a GPT-7. Except it would take days to produce each token. That's not viable. So, unless we make 1,000,000 clones of planet Earth, we could be stuck at roughly this capacity for several decades, and never see a jump as big as the one from GPT-2 to GPT-4 again. Unless, of course, new ways to improve the efficiency of these systems are discovered. AGI has become an optimization problem. I, for one, suspect that GPTs are embarrassingly sub-optimal, and that these big matrix multiplications are merely emulating an underlying "learning algorithm" with a massive overhead. Now, it isn't hard to see that, at this scale, a single attention (i.e., "neural dict") pass takes easily more than 1,000,000x the compute of a dict lookup. If that is true, it wouldn't be surprising if the first team to break the "matmul wall" would be able to train a model equivalent to GPT-4 for as little as $100.
Of course, attention is doing much more than a dict lookup; but we don't know what it is doing that leads to reasoning capacities. And, once we figure that out, we may be able to have GPT-7 for the cost of GPT-4, and not for the world's entire GDP. That said, this would require a complete redesign. Gradient descent and matmuls have to be replaced by something entirely different - and nobody knows what that would be. It took us decades to go from neural nets to transformers, so, it could take us a decade to figure this out. Or someone could be struck by a rush of inspiration and it would happen overnight... Anyway sorry if I got some napkin math wrong, and all the respect for OpenAI for this release. Publishing a result that isn't a complete success is great science. Now I just want to understand what transformers are emulating, and how we can do the same, for less. I have many ideas, and I have many experiments to run... I'll try not to disappear completely but excuse me if I do

222

2K

157

945

370K
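
The napkin math in the post above can be reproduced with a short sketch. It assumes, as one possible reading that the post does not state explicitly, that "utility" is simply log10 of training cost in US dollars, and that compute scales in proportion to cost. The model-to-cost mapping follows the dollar figures quoted in the post (with the "$100 million one" read as GPT-4); the function and variable names are hypothetical.

```python
import math

# Training costs in USD, taken from the figures quoted in the post.
# Reading the "$100 million" model as GPT-4 is an assumption.
COSTS_USD = {
    "GPT-4": 1e8,     # $100 million
    "GPT-4.5": 1e9,   # $1 billion
    "GPT-5": 1e10,    # $10 billion
    "GPT-7": 1e14,    # $100 trillion
}


def utility(cost_usd: float) -> float:
    """Assumed utility model: log10 of training cost in USD."""
    return math.log10(cost_usd)


def relative_gain(old_cost_usd: float, new_cost_usd: float) -> float:
    """Fractional improvement in assumed utility between two budgets."""
    return utility(new_cost_usd) / utility(old_cost_usd) - 1.0


# 10x more spend at each step, but a shrinking relative improvement.
gain_gpt45 = relative_gain(COSTS_USD["GPT-4"], COSTS_USD["GPT-4.5"])
gain_gpt5 = relative_gain(COSTS_USD["GPT-4.5"], COSTS_USD["GPT-5"])

# Cost multiple between GPT-4 and the hypothetical GPT-7
# (equal to the compute multiple only if compute is proportional to cost).
compute_multiple = COSTS_USD["GPT-7"] / COSTS_USD["GPT-4"]

print(f"GPT-4 -> GPT-4.5 gain: {gain_gpt45:.1%}")
print(f"GPT-4.5 -> GPT-5 gain: {gain_gpt5:.1%}")
print(f"GPT-7 vs GPT-4 cost multiple: {compute_multiple:,.0f}x")
```

Under these assumptions the gains come out to 9/8 - 1 = 12.5% and 10/9 - 1, roughly 11%, and the cost multiple is 1,000,000x, matching the numbers in the post.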

jackowfish retweeted

kitze · supermac.io 🐦‍🔥

@thekitze

over 1 year ago

me: hey do this thingie 3.5: no prob sir, done 3.7: i did the thingie. let me also do another thingie. i'm gonna finish all the thingies. omg there are so many thingies to be done in this project. i'm gonna start doing extra thingies. would you also maybe like a drink? let's run npm install drink. fk it let's get crazy up in this b

168

3K

123

420

381K

jackowfish retweeted

Thunder Compute @ThunderCompute

over 1 year ago

Our founding engineer @jackowfish monitoring Google Cloud as we have 70 people test Thunder Compute live.

2

10

2

0

596

jackowfish retweeted

Carl Peterson

@carlpeterson

over 1 year ago

More and more students are switching from Colab to @ThunderCompute. Check out Dhruv Khara's blog about his experience (linked in reply)

1

4

2

0

282

jackowfish retweeted

Carl Peterson

@carlpeterson

over 1 year ago

Thank you Pace University Data Science for inviting @ThunderCompute to lead a workshop about using cloud GPUs for deep learning. We love to hear how students are using our GPUs for their projects.

carlpeterson's tweet photo: https://t.co/OWbOPsR1mX

0

11

3

1

370

jackowfish retweeted

Anduril Industries @anduriltech

over 1 year ago

palmer just told me he wants a working prototype of “the big robot from pacific rim” by friday fml

350

11K

440

417

727K

Jack @jackowfish

over 1 year ago

🚢 Just shipped an update to our CLI tool that lets you change your @ThunderCompute instance's properties while stopped. No need to re-create an instance you've spent months working on because you need more vCPUs!

jackowfish's tweet photo: https://t.co/SRjg0bG4BR

0

4

2

0

265

Jack @jackowfish

over 1 year ago

Arch / Linux distros are great (although the UIs are trash) on a devbox you can wipe whenever you want. Not good on the thing you need to connect to WiFi in a coffee shop to SSH into said devbox. Or go on Zoom. Or look at your Google Calendar.