Kevin Xaviour @kf_stats - Twitter Profile

3 months ago

Now I'm curious to hear from you: if you had exploding or vanishing gradients in your model, what activation (or trick) would you try first?

Kevin Xaviour @kf_stats

3 months ago

So choosing an activation function is not random. It's like choosing:
i) How should my model think?
ii) What signals should it care about?

Real-world quick mapping:
- Hidden layers -> mostly ReLU
- Binary output -> Sigmoid
- Some deep nets -> variants like Leaky ReLU, GELU
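
The quick mapping above can be sketched in plain Python. This is only an illustration: the weights, the two-unit hidden layer and the `forward` helper are made up for the example, and the GELU shown is the exact erf-based form (many libraries also offer a tanh approximation).

```python
import math

def relu(x: float) -> float:
    return max(0.0, x)

def leaky_relu(x: float, slope: float = 0.01) -> float:
    # Small non-zero slope for negative inputs (the 0.01 default is an assumption).
    return x if x > 0 else slope * x

def gelu(x: float) -> float:
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, hidden_w, hidden_b, out_w, out_b, hidden_act=relu):
    # Hidden layer: weighted sum per unit, then the hidden activation.
    hidden = [
        hidden_act(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    # Binary output: one logit squashed into (0, 1) by sigmoid.
    logit = sum(w * h for w, h in zip(out_w, hidden)) + out_b
    return sigmoid(logit)

# Hypothetical hand-picked parameters, not trained values.
HIDDEN_W = [[0.5, -1.0], [1.5, 0.25]]
HIDDEN_B = [0.0, -0.5]
OUT_W = [1.0, -2.0]
OUT_B = 0.1

prob = forward([1.0, 2.0], HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)
print(f"P(class = 1) = {prob:.3f}")
```

Swapping `hidden_act=leaky_relu` or `hidden_act=gelu` changes only the hidden layer; the sigmoid output stays because the task is binary.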

Kevin Xaviour @kf_stats

3 months ago

But here's the interesting part: each function is not just math, it changes how the network learns.
- Sigmoid/Tanh → can slow down learning (vanishing gradients)
- ReLU → faster, but can “die” (neurons get stuck outputting zero and stop updating)
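
A rough numeric sketch of both effects, under a simplifying assumption: weights are ignored and every layer contributes the same local derivative, so the backpropagated signal is just that derivative raised to the depth. The helper names are made up for the example.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    # d/dx sigmoid(x) = s * (1 - s); its maximum is 0.25, reached at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x: float) -> float:
    # 1 for positive inputs, 0 otherwise (the value at exactly 0 is a convention).
    return 1.0 if x > 0 else 0.0

def chained_grad(local_grad: float, depth: int) -> float:
    # Product of `depth` identical local derivatives (weights ignored).
    return local_grad ** depth

best_case_sigmoid = chained_grad(sigmoid_grad(0.0), depth=10)  # 0.25 ** 10
active_relu = chained_grad(relu_grad(2.0), depth=10)           # 1.0 ** 10
dead_relu = relu_grad(-3.0)                                    # no gradient at all

print(best_case_sigmoid, active_relu, dead_relu)
```

Even in sigmoid's best case the signal shrinks by a factor of four per layer, while an active ReLU passes it through unchanged. A ReLU unit whose input is negative for every example gets zero gradient, which is the "dead" case.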

Kevin Xaviour @kf_stats

3 months ago

Now here's the catch: if everything were linear, then no matter how many layers you stack, your model would behave like a single layer. Which means no real learning of complex patterns. So we introduce non-linearity.
---> Sigmoid
- Outputs between 0 and 1
- Useful when you want probabilities
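
The "stacked linear layers collapse into one" claim can be checked with a small sketch. The matrices and input are arbitrary example values and biases are left out for brevity; with biases the collapse still holds, just with a combined bias term.

```python
def matvec(a, x):
    # Matrix-vector product for a list-of-rows matrix.
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def matmul(a, b):
    # Matrix-matrix product: (rows of a) x (columns of b).
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def relu_vec(v):
    return [max(0.0, v_i) for v_i in v]

W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]  # layer 1: 2 inputs -> 3 units
W2 = [[2.0, 0.0, 1.0], [-1.0, 1.0, 0.5]]    # layer 2: 3 units -> 2 outputs
x = [0.5, -2.0]

# Two linear layers applied in sequence...
two_layers = matvec(W2, matvec(W1, x))
# ...give the same result as one layer with the pre-multiplied matrix.
one_layer = matvec(matmul(W2, W1), x)

# With a ReLU in between, no single matrix reproduces the mapping in general.
with_relu = matvec(W2, relu_vec(matvec(W1, x)))

print(two_layers, one_layer, with_relu)
```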

Kevin Xaviour @kf_stats

3 months ago

---> Tanh
- Outputs between -1 and 1
- Zero-centered, slightly better flow of gradients than sigmoid
---> ReLU
- Keeps positive values, zeroes out negatives
- Fast, simple, widely used
---> Step Function
- Either ON or OFF
- Early neural network idea
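
For reference, the three functions above fit in a few lines of Python. The step threshold of 0 and the choice to output ON exactly at the threshold are assumptions for the example; conventions vary.

```python
import math

def tanh(x: float) -> float:
    # Output lies in (-1, 1) and is centered on 0.
    return math.tanh(x)

def relu(x: float) -> float:
    # Passes positive values through, maps negatives to 0.
    return max(0.0, x)

def step(x: float, threshold: float = 0.0) -> float:
    # Hard ON/OFF decision. Its derivative is zero wherever it is defined,
    # so gradient-based training cannot learn through it.
    return 1.0 if x >= threshold else 0.0

for value in (-2.0, 0.5, 2.0):
    print(value, tanh(value), relu(value), step(value))
```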

Kevin Xaviour @kf_stats

3 months ago

Think of it like this: every neuron receives signals (numbers). But it doesn't pass everything forward. It asks: “Is this important enough?” That decision = activation function.

Kevin Xaviour @kf_stats

3 months ago

Neural networks are powerful, but here’s a simple question: What actually decides whether a neuron “fires” or stays silent? That’s where activation functions come in. #AI #MachineLearning #DeepLearning #NeuralNetworks #AIEngineering #DataScience

[Photo attached to the tweet: https://t.co/wXe2IL1pqx]

Kevin Xaviour @kf_stats

3 months ago

Here's what we actually use in practice:
→ ChatGPT / copilots → mostly autoregressive
→ Search / retrieval / ranking → mostly autoencoding
→ Translation / summarization pipelines → seq2seq
So it's not one “Transformer”: it's three different ways of thinking about data.

Kevin Xaviour @kf_stats

3 months ago

A lot of people say: “Transformers? They just predict text.” But that's like saying “Software systems just run code” or “Driving is just pressing the accelerator.” #AI #MachineLearning #Transformers #LLM #DeepLearning #AIEngineering

Kevin Xaviour @kf_stats

3 months ago

---- Sequence-to-Sequence (Seq2Seq)
--> Reads input → generates output
- Used in: Translation, summarization, rewriting
- Mindset: “How do I transform this?”

The actual difference:
- AR (autoregressive) → generation-focused
- AE (autoencoding) → understanding-focused
- Seq2Seq → transformation-focused
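
One concrete way to see the three styles is through the attention masks they typically use. The thread itself does not mention masks, so treat this as an added illustration; the helper names are made up, and `True` means "this position may attend to that one".

```python
def causal_mask(n: int):
    # Autoregressive: position i sees only positions 0..i (no peeking ahead).
    return [[j <= i for j in range(n)] for i in range(n)]

def full_mask(n: int):
    # Autoencoding: every position sees the whole input, left and right.
    return [[True] * n for _ in range(n)]

def cross_mask(target_len: int, source_len: int):
    # Seq2Seq decoder-to-encoder attention: each output position
    # can read every position of the encoded input.
    return [[True] * source_len for _ in range(target_len)]

def show(mask):
    for row in mask:
        print(" ".join("x" if allowed else "." for allowed in row))

show(causal_mask(4))    # lower-triangular pattern
show(full_mask(4))      # fully filled pattern
show(cross_mask(2, 4))  # 2 output positions over 4 input positions
```

In a typical Seq2Seq Transformer the encoder uses the full mask, the decoder uses the causal mask over its own outputs, and the cross mask connects the two.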

Kevin Xaviour @kf_stats

3 months ago

Curious to learn from others: what sentence and labels would you define for your use case? #AI #MachineLearning #NLP #LLM #AIEngineering #DataScience #AIMLEngineer #AIML

Kevin Xaviour @kf_stats

3 months ago

One small experiment that reminded me how important problem framing is in AI!! I tried this sentence: “He is not passing the ball to me”

Kevin Xaviour @kf_stats

3 months ago

I let it evaluate meaning against labels that actually matter for the task. That's the power of zero-shot classification:
--> Turning labels into inputs
--> Making models adaptable without retraining
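
A minimal sketch of what "turning labels into inputs" can look like. The thread does not say which model or labels were used, so everything here is an assumption: the labels, the hypothesis template, and especially `toy_overlap_score`, which is a word-overlap stand-in for a real entailment (NLI) model and is only there to make the example runnable.

```python
import math
from typing import Callable, Dict, List

def zero_shot_classify(
    sentence: str,
    labels: List[str],
    entailment_score: Callable[[str, str], float],
    template: str = "This example is about {}.",
) -> Dict[str, float]:
    # Each label becomes a hypothesis sentence, i.e. part of the model input.
    hypotheses = [template.format(label) for label in labels]
    scores = [entailment_score(sentence, hypothesis) for hypothesis in hypotheses]
    # Softmax over the scores so the labels compete with each other.
    top = max(scores)
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    return {label: value / total for label, value in zip(labels, exps)}

def toy_overlap_score(premise: str, hypothesis: str) -> float:
    # Stand-in scorer: counts shared words. A real setup would call an NLI model here.
    premise_words = set(premise.lower().split())
    hypothesis_words = set(hypothesis.lower().rstrip(".").split())
    return float(len(premise_words & hypothesis_words))

sentence = "He is not passing the ball to me"
candidate_labels = ["ball sports", "cooking", "finance"]  # hypothetical labels

result = zero_shot_classify(sentence, candidate_labels, toy_overlap_score)
print(max(result, key=result.get), result)
```

Changing the task means changing only `candidate_labels` (for example, to labels about emotion or teamwork); the scorer itself is never retrained.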
