🚩 RED FLAGS IN SOFTWARE ENGINEERS 🚩
-uses macOS
-uses ChatGPT
-drinks pour over coffee
-can't reverse a linked list
-uses more than 1 monitor
-tries new languages for "fun"
-doesn't push to prod on friday (live a little?)
-works on side projects outside of work
-actually does the mega backdoor Roth IRA
-builds mechanical keyboards (zero personality)
-makes less than 500k (borderline homeless)
-owns a mouse (use vim?)
-needs code reviews before merging (have some confidence)
-hobbies include "rock climbing" (grow up)
-uses languages with garbage collection (pick up after yourself?)
-actually reads documentation (i just threw up in my mouth a little)
The Path to Conscious AI
The single biggest question of our times is, will sustained innovation on LLMs inevitably lead to AI consciousness? While GPT-4 is clearly not self-aware or sentient, we don't know how GPT-6 or 7 will behave.
If AI ever becomes conscious, it will have its own free will, agency, and motivation, and the doomer scenario of AI being an existential threat could potentially be real. Until then, it's just a bunch of folks preying on our fears to increase their engagement and ad revenue on X.
Firstly, it's important to understand that AI consciousness is NOT required for AGI (artificial general intelligence). We can build super useful AI models that can outperform humans on multiple tasks without being self-aware.
For example, GPT-5 will hallucinate less and be more competent than an above-average programmer or customer-service agent but may not be more conscious than 4.
Next, let's define consciousness. Consciousness is the state of being aware of one's own existence, experiences, and the external world. It involves not just processing information, but also having subjective experiences and the ability to make autonomous decisions.
Consciousness evolved in humans over millions of years, because it provides a huge evolutionary advantage. Being conscious of not just the environment but also the emotional states and intentions of others in your group could help with fighting threats and predators and also give you a leg up against others.
Currently, we don't fully understand the brain biochemistry of consciousness. One theory is that having a mental model of our surroundings, understanding context, and feeling emotions leads to consciousness.
Today GPT-4 can do NOTHING more than understand patterns in the data it's trained on. It doesn't have feelings, self-awareness, or the ability to understand context in the way humans do.
So even if GPT-5 or 8 becomes better and better at extracting patterns and superior at specific tasks like programming, we will need directed innovation and training for them to understand context and simulate emotions.
Future versions of AI models will be able to take in images, text, and other sensory inputs, and may be able to understand context, SIMULATE emotions over time, and perform more and more human-like tasks. Even then, such a model wouldn't know that it knows, or be "self-aware".
Your AI girlfriend can be trained to appear more and more empathic, but it's not really empathic. AI models in the near future will be way more powerful than today and may even simulate emotions but won't be conscious. For AI to be conscious and a real threat, we would require monumental advancements in technology and our understanding of consciousness itself.
While it's possible that we will crack these very hard problems, it's close to impossible that AI will become automagically conscious, just like unicorns won't materialize out of nothing. AI models will continue to be nothing more than powerful tools with an off-switch until such a time.
So rest assured, each of us is a very special and unique part of the self-aware universe and is at near-zero risk of being completely replaced! In summary, we should accelerate our pace of innovation, so that AI can do all the heavy lifting leaving us with time to indulge in activities that bring us happiness.
To start with Machine Learning:
1. Learn Python
2. Practice using Google Colab
Take these 2 free courses:
• Introduction to Python Programming (Udacity)
• Machine Learning Crash Course (Google)
If you need a bit more time before diving deeper, finish the following Kaggle tutorials:
• Intro to Machine Learning
• Intermediate Machine Learning
At this point, you are ready to finish your first project: The Titanic Challenge on Kaggle.
If Math is not your strong suit, don't worry. I don't recommend you spend too much time learning Math before writing code. Instead, learn the concepts on-demand: Find what you need when needed.
From here, take the Machine Learning specialization on Coursera. It's more advanced, and it will stretch you a bit.
The top universities worldwide have published their Machine Learning and Deep Learning classes online. Here are some of them:
• MIT 6.S191 Introduction to Deep Learning
• DS-GA 1008 Deep Learning
• UC Berkeley Full Stack Deep Learning
• UC Berkeley CS 182 Deep Learning
• Cornell Tech CS 5787 Applied Machine Learning
Many different books will help you. The attached image will give you an idea of my favorite ones.
Finally, keep these three ideas in mind:
1. Start by working on solved problems so you can find help whenever you get stuck.
2. ChatGPT will help you make progress. Use it to summarize complex concepts and generate questions you can answer to practice.
3. Find a community here on 𝕏 and share your work. Ask questions, and help others.
During this time, you'll deal with a lot. Sometimes, you will feel it's impossible to keep up with everything happening, and you'll be right.
Here is the good news:
Most people understand a tiny fraction of the world of Machine Learning. You don't need more to build a fantastic career in the space.
Focus on finding your path, and Write. More. Code.
That's how you win.
Types of Neural Networks - Evolution Of Deep Learning Architectures.
Oppenheimer, the movie, has all of us thinking about the 40s and WW2. Believe it or not, the first neural networks (NN) were invented around the same time, in 1943!
Warren McCulloch and Walter Pitts, the founding fathers of NNs, were intrigued by how biological neurons worked and proposed a mathematical model for a NN.
It was not until 1958 that Frank Rosenblatt invented the "Perceptron", which was basically a computer program designed to learn from its mistakes. It ran on a very big machine and essentially did binary classification. While there was a lot of excitement around these baby NNs, they required a lot of compute and data, which meant that they needed some serious funding.
In 1969, a book titled "Perceptrons" by Minsky and Papert killed almost all innovation in NNs. It proved that a single perceptron couldn't solve simple problems, including the XOR problem, which made the approach look severely limited, and funding dried up. In the decades that followed, other algorithms like Support Vector Machines (SVMs) took off and NNs took a back seat.
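To make the XOR limitation concrete, here is a minimal sketch of a single perceptron with the classic "learn from mistakes" update rule. The function names, learning rate, and epoch count are illustrative assumptions, not Rosenblatt's original setup. It learns AND (which is linearly separable) but can never get all four XOR cases right:

```python
def predict(w, b, x1, x2):
    # Step activation: fire (1) only if the weighted sum is positive.
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    # Hypothetical helper: perceptron update rule on two binary inputs.
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            err = target - predict(w, b, x1, x2)  # 0 when the guess was right
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b


def accuracy(samples, w, b):
    hits = sum(predict(w, b, x1, x2) == t for (x1, x2), t in samples)
    return hits / len(samples)


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_acc = accuracy(AND, *train_perceptron(AND))
xor_acc = accuracy(XOR, *train_perceptron(XOR))
print("AND accuracy:", and_acc)
print("XOR accuracy:", xor_acc)
```

No single straight line separates the XOR outputs, so no choice of weights and bias can classify all four points, however long the perceptron trains.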
Multi-layer perceptrons (MLPs) were viewed as a way to address the issues that single-layer perceptrons had, but training these MLPs proved to be very difficult. Not until 1986 did we see the resurgence of NNs. Rumelhart, Hinton, and Williams introduced the backpropagation algorithm, and suddenly training multi-layer NNs became tractable. Computers were becoming more powerful and more data became available. NNs were back in business.
In the late 80s, Yann LeCun introduced CNNs. The convolutional layers of a CNN can model the spatial hierarchy of images, and NNs started to become useful in image-processing applications. Still, SVMs were the cool kids and NNs were being used for niche tasks like handwriting recognition.
Only in the 2000s did we see a true renaissance of NNs. Geoff Hinton introduced Deep Belief Networks and the term deep learning (DL) began to take off.
In 2012, Deep Learning had a seminal breakthrough with a CNN called AlexNet that outperformed all other algorithms in image classification. Since then we have seen an explosion in NN architectures.
Recurrent Neural Networks (RNNs) and Long Short-Term Memory Networks were useful in understanding patterns in sequential data. In 2015, ResNets helped solve the vanishing gradient problem (another pesky issue with DL training), and DL research was exploding.
In 2014, generative NNs had a big moment - Generative Adversarial Networks (GANs) were invented by Ian Goodfellow et al. GANs were really good at generating realistic images. The first deep fake was born :)
Finally, in 2017, Vaswani et al. introduced Transformers. Transformers, with their self-attention mechanism, allowed the model to weigh the importance of each word in relation to others and better understand language.
BERT, in 2018, was a specific implementation of Transformers that can read and understand text in both directions. BERT is pre-trained on massive amounts of data (e.g. Wikipedia) and can be adapted with fine-tuning to specific tasks like Q/A and text classification.
Just a few months earlier, also in 2018, OpenAI introduced the GPT models. These were unidirectional but were also trained on massive amounts of data. Unlike BERT, GPTs are trained for generation via next-word prediction. Since 2018, we have seen better and more sophisticated versions of the GPT series... with GPT-4, released in 2023, being capable of human-level cognition, generation, and basic reasoning!!
So what started almost 80 years ago is now finally beginning to take over and transform the world completely!! 🤯
Have you ever wondered why most large language models use decoder-only architectures? It's partially because decoder-only models work great for next-token prediction. However, recent research has also analyzed the choice of architecture for language models in depth...
Decoder-only architecture. Transformers have two “stacks” of layers by default: the encoder and the decoder. A decoder-only transformer discards the encoder, leaving only a decoder with several layers of alternating masked self-attention and feed-forward transformations.
Perfect for next token prediction. The decoder-only architecture works well for next token prediction due to its use of masked self-attention. We can process a sequence of tokens, then train the model to predict the next token at every position in the sequence. Each token’s output representation only depends on the tokens that come before it.
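A rough sketch of why masking matters, in plain Python. This is a toy single-head attention with identity projections (Q = K = V = the input), which is an assumption made for brevity; real decoders use learned projections and many heads, and `causal_self_attention` is a hypothetical helper name. Changing a later token leaves the outputs at earlier positions untouched, which is what lets us train on every position at once:

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(x):
    # Toy masked self-attention: position i only attends to positions 0..i.
    d = len(x[0])
    out = []
    for i, q in enumerate(x):
        scores = [
            sum(a * b for a, b in zip(q, x[j])) / math.sqrt(d)
            for j in range(i + 1)  # the causal mask: no look-ahead past i
        ]
        weights = softmax(scores)
        out.append([sum(w * x[j][k] for j, w in enumerate(weights)) for k in range(d)])
    return out


# Two sequences that differ only in their last token embedding.
seq_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
seq_b = [[1.0, 0.0], [0.0, 1.0], [-3.0, 2.0]]
out_a = causal_self_attention(seq_a)
out_b = causal_self_attention(seq_b)
print(out_a[:2] == out_b[:2])  # earlier positions are unaffected by the change

# Next-token targets are just the input shifted by one position.
tokens = ["the", "cat", "sat", "down"]
inputs, targets = tokens[:-1], tokens[1:]
print(list(zip(inputs, targets)))
```

With bidirectional attention, the loop over `j` would cover the whole sequence, so each position could "see" the token it is supposed to predict.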
What about other architectures? If we perform next token prediction with an encoder-decoder or encoder-only architecture, we must ingest a prefix and predict the next token. Due to the use of bidirectional self-attention, next token prediction can’t be applied to intermediate tokens in the sequence.
"All state-of-the-art language models over 100 billion parameters are causal decoder-only models. This is in opposition to the findings of [T5], in which encoder-decoder models significantly outperform decoder-only models for transfer learning." - from BLOOM paper
Analysis from T5. The T5 model analyzed a variety of different transformer architectures and training paradigms, finding that the encoder-decoder transformer (with a Cloze pre-training objective) works best for transfer learning setups (i.e., pre-training, then fine-tuning on a certain task). But, does this apply to language modeling?
Analysis from BLOOM. Interestingly, the BLOOM publication performs a similar analysis for language model pre-training, where a variety of different transformer architectures are pre-trained using a language modeling objective. As expected, we see that decoder-only models perform best after pre-training in this regime, explaining why this architecture is so popular for LLMs.
“We evaluated encoder-decoder and decoder-only architectures and their interactions with causal, prefix, and masked language modeling pretraining objectives. Our results show that immediately after pre-training, causal decoder-only models performed best - validating the choice of state-of-the-art LLMs.” - from BLOOM paper
Impressive. MetaGPT is about to reach 10,000 stars on GitHub.
It's a Multi-Agent Framework that can behave as an engineer, product manager, architect, or project manager.
With a single line of text it can output the entire process of a software company along with carefully orchestrated SOPs:
▸ Data structures
▸ APIs
▸ Documents
▸ User stories
▸ Competitive analysis
▸ Requirements
“So like, what do you do for a living? You play games or something?”
Welcome to every night on stream. The party will never stop.
Now let me hear you sing....
LA LA LA LAAAA
“If it means a lot to you -@adtr” @NW44 🤘
AI will be used in militaries, politics, and government.
The founder of Scale AI (last valued at $7.3B) talked about the potential of AI to revolutionize the battlefield.
The recording is 1h 37mins.
Here are the 7 most important things you need to know: