Parth @parthlogs - Twitter Profile

@Hi_Mrinal Hehe 😅 the getting old part is a hard pill to swallow. You can definitely accelerate your growth, but there's a cap to that speed. You cannot outrun time.

0

4

0

438

Parth

@parthlogs

6 days ago

It is not *funny* anymore. It's exhausting.

Parth

@parthlogs

3 months ago

It's funny how everyone is continuously oscillating between the Apocalyptic AI and the unreasonably hyped AI

3

5

0

157

0

1

0

53

Parth

@parthlogs

6 days ago

For anyone who has coded even a little, it is obvious which parts of code are tedious and boilerplate and which ones are interesting/custom to the project. With that understanding, it should be clear when to use AI and when not to. Why is there so much amnesia around this topic? Why do we have polar opposites? AI adoption isn't even a thing worth contemplating. Just swallow it like any other developer tool or abstraction you have swallowed (and normalised) already!

3

4

0

1

139

Parth

@parthlogs

9 days ago

Absolutely! Reports/papers on past failures and incidents are the best learning resources for any systems engineer. They are also very reassuring, since they acknowledge that even the big players make both trivial and catastrophic mistakes. Twitter's 'Fan Out' architectural problem, the infamous Netflix ELB cascading failure, and similar incidents teach far more than any book or lecture would.

Mrinal

@Hi_Mrinal

13 days ago

TBH there is so much untapped learning hidden in the technical reports that tech companies release after a major downtime. These three were my recent reads: https://t.co/ojknBegwgB

6

92

3

67

7K

0

1

0

66

Parth

@parthlogs

10 days ago

@yacineMTB Who said you can't autodidact through depth or cutting edge

0

1

0

43

Parth

@parthlogs

10 days ago

@Hi_Mrinal U 2

0

2

0

20

Parth

@parthlogs

11 days ago

@sh_reya How do you arrive at the dimensions themselves? Domain expert work?

1

0

141

Parth

@parthlogs

12 days ago

I am particularly curious about evals for startups at stages where they don't have traces at all yet - unlike examples where you can evaluate conversations already held by AI.

This could mean pre-production products, or apps where the nature of the response is completely different from chat.

What would an evals solution look like for a startup that is still deciding the model, params, prompt, and context?

Surely, different decisions here can yield radically different outputs.

The obvious solution that comes to my mind is to generate synthetic/manual representative cases and run a configuration tournament across model + params + context combinations, while accounting for things like position bias and other similar mathematical aspects.

Is there a better way to think about evals before real traces exist? Curious about how @HamelHusain and @sh_reya would think about this
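A minimal sketch of the configuration tournament described above, assuming a pairwise judge. Everything named here is hypothetical and not from the thread: the config labels, the `QUALITY` table standing in for real output quality, and the `toy_generate` / `toy_judge` stubs that would be replaced by actual model calls and an actual judge. Each pair of configs is judged twice per case with the output positions swapped, which is one common way to cancel position bias.

```python
import itertools
from collections import defaultdict

def run_tournament(configs, cases, generate, judge):
    """Round-robin over configs; returns each config's average win rate.

    judge(case, first, second) must return 0 if the first output wins
    and 1 if the second output wins.
    """
    wins = defaultdict(float)
    games = defaultdict(int)
    for a, b in itertools.combinations(configs, 2):
        for case in cases:
            out_a, out_b = generate(a, case), generate(b, case)
            # Judge both orderings so a judge that favours one position
            # contributes equally to both sides.
            verdict_ab = judge(case, out_a, out_b)
            verdict_ba = judge(case, out_b, out_a)
            a_wins = (verdict_ab == 0) + (verdict_ba == 1)  # 0, 1, or 2
            wins[a] += a_wins / 2
            wins[b] += 1 - a_wins / 2
            games[a] += 1
            games[b] += 1
    return {c: wins[c] / games[c] for c in configs}

# Hypothetical stand-ins: a hidden quality score per config label.
QUALITY = {"small-t0": 1, "small-t1": 2, "large-t0": 3}

def toy_generate(config, case):
    # A real version would call the model with this config on this case.
    return f"{config}:{case}"

def toy_judge(case, first, second):
    # A real version would be an LLM or human judge; ties go to the first slot.
    q_first = QUALITY[first.split(":")[0]]
    q_second = QUALITY[second.split(":")[0]]
    return 0 if q_first >= q_second else 1

def always_first_judge(case, first, second):
    # Worst case: a judge driven purely by position.
    return 0

if __name__ == "__main__":
    cases = ["case-1", "case-2"]
    print(run_tournament(list(QUALITY), cases, toy_generate, toy_judge))
    print(run_tournament(list(QUALITY), cases, toy_generate, always_first_judge))
```

With the purely position-driven judge, every config lands at a 0.5 win rate, so the swap turns position bias into noise rather than a false ranking. It does not address other judge biases such as a preference for longer outputs.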

2

9

1

21

5K

Parth

@parthlogs

12 days ago

@HamelHusain hmm. I'm going to test this properly: seed evals with synthetic/manual cases, then run a tournament across model + prompt + context + cost configs, with judge-bias checks and confidence intervals. Will share concrete numbers once I have them.
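One way the confidence intervals mentioned above could be computed is a percentile bootstrap over per-case outcomes. This is a sketch under assumptions, not the method used in the thread: the function name `bootstrap_ci`, the resample count, and the example outcomes are all made up for illustration.

```python
import random

def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-case outcomes.

    outcomes: list of numbers, e.g. 1 for a win and 0 for a loss.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(outcomes)
    # Resample the cases with replacement and record each resample's mean.
    means = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

if __name__ == "__main__":
    # Hypothetical record: one config won 30 of 40 synthetic cases.
    outcomes = [1] * 30 + [0] * 10
    print(bootstrap_ci(outcomes))
```

With only a few dozen synthetic cases the interval tends to be wide, which is a useful check before declaring one configuration the winner.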

0

1

0

139

Parth

@parthlogs
