My PhD thesis Neural Transfer Learning for Natural Language Processing is now online. It includes a general review of transfer learning in NLP as well as new material that I hope will be useful to some.
https://t.co/PxfVRYuyjx
Hand-labeling training data for machine learning problems is effective, but very labor- and time-intensive. This work explores how to use algorithmic labeling systems that rely on other sources of knowledge and can provide many more labels, though noisy ones.
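To make the idea of algorithmic labeling concrete, here is a minimal sketch of one common pattern: several cheap heuristic "labeling functions" each vote on an example or abstain, and the votes are combined. Everything in it is an assumption for illustration: the function names, the "complaint"/"praise" labels, and the plain majority vote are hypothetical, and the work referenced above may combine noisy sources quite differently.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion


# Hypothetical heuristics: each one is cheap, noisy, and covers only some inputs.
def lf_mentions_refund(text):
    return "complaint" if "refund" in text.lower() else ABSTAIN


def lf_says_thanks(text):
    return "praise" if "thank" in text.lower() else ABSTAIN


def lf_many_exclamations(text):
    return "complaint" if text.count("!") >= 3 else ABSTAIN


LABELING_FUNCTIONS = [lf_mentions_refund, lf_says_thanks, lf_many_exclamations]


def weak_label(text, labeling_functions=LABELING_FUNCTIONS):
    """Combine noisy votes by simple majority; abstain on no votes or a tie."""
    votes = [lf(text) for lf in labeling_functions]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting heuristics, no clear winner
    return ranked[0][0]


if __name__ == "__main__":
    for example in ["I want a refund!!!", "Thank you so much", "Hello there"]:
        print(repr(example), "->", weak_label(example))
```

Majority vote is the simplest possible combiner. It treats every heuristic as equally reliable, which is exactly the assumption that more careful approaches try to relax.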
I originally thought of GANs as an unsupervised learning algorithm, but so far, to create recognizable object categories, they've needed a supervision signal / labeled images. This new work shows how to get them to work well with few labels.
https://t.co/t30iusmCOe
The complaints "Python is slow" or "Python is unsafe" seem misguided.
The point of Python isn't to be fast or safe, it's to be flexible and hackable, and to interface well with everything else. It has become successful by serving as a frontend from which to call other libraries.
Last tweet for me on the OpenAI GPT-2 thing, but for those interested I think this video really elevates the discussion. Great work by all parties involved.
Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent https://t.co/zmHhklDdit <--- this should blow your mind a bit!! Also holds for convolutional networks, batch norm, ... Also, closed form for test predictions resulting from gradient descent training.
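For readers who want the shape of the claim, here is a rough sketch of the linearization and of the closed-form test prediction, under assumptions the tweet does not state: squared loss, continuous-time gradient descent (gradient flow) with learning rate $\eta$, training inputs $X$ with targets $Y$, and $\hat\Theta_0$ denoting the empirical neural tangent kernel at initialization. The notation is chosen here for illustration; see the linked paper for the precise statement and conditions.

```latex
% First-order Taylor expansion of the network around its initial parameters \theta_0
f^{\mathrm{lin}}_t(x) = f_0(x) + \nabla_\theta f_0(x)\,(\theta_t - \theta_0)

% Empirical tangent kernel at initialization
\hat\Theta_0(x, x') = \nabla_\theta f_0(x)\,\nabla_\theta f_0(x')^{\top}

% Prediction at a test point x after training time t (squared loss, gradient flow)
f^{\mathrm{lin}}_t(x) = f_0(x)
  - \hat\Theta_0(x, X)\,\hat\Theta_0(X, X)^{-1}
    \left(I - e^{-\eta\,\hat\Theta_0(X, X)\,t}\right)\left(f_0(X) - Y\right)
```

As $t \to \infty$ the exponential term vanishes when $\hat\Theta_0(X, X)$ is positive definite, leaving a kernel-regression-style correction to the initial function $f_0$.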
This is literally the first time I've seen an NLP researcher say they want to focus on helping normal people solve normal problems.
Hopefully the first of many :)
“As someone who is deeply interested in AGI, I find ImageNet much less interesting now, precisely because it can be solved with models that have little global understanding of images.”
https://t.co/ypT44YkLpn
Here, I tried to explain the building blocks of the SOTA ULMFiT model. What is an AWD-LSTM? How is dropout used everywhere? What is a QRNN and why might it be better? ...I also used Excel spreadsheets to simplify things in a different way :)
TIL when preparing my second deep learning lecture that the operators' manual for Rosenblatt's Mark 1 Perceptron machine was a classified document. It became unclassified only in 1977. The manual can now be found at https://t.co/uiUhFwBNFR
.@kaushal316 wrote a nice step-by-step tutorial on how to finetune BERT on a classification task (@kaggle Toxic Challenge)
Covers everything from data processing to model modification
Results are top-10% w. a very simple 30-lines-of-code single model 👇
https://t.co/FdNQcAoI14
Such an inspiration: @alexandrecadrin is both a radiologist and a Kaggle-winning deep learning expert. He is the only one in the world so far - but there will be more to come!
Thank you @bhutanisanyam1 for helping to tell his amazing story.
https://t.co/Y6Z2oADc8k
This Fri 2/8 is the deadline to apply to https://t.co/uQfNtBNki6 taking place March 2-3 in Berkeley.
Also: @l2k from Weights & Biases / FigureEight and @jeremyphoward from https://t.co/8jKC4xcGxw joined our guest speakers @RichardSocher (Salesforce) & Raquel Urtasun (Uber/Toronto).
Thanks to @bhutanisanyam1 and @hortonhearsafoo, you can now run the whole https://t.co/aQsW5afov6 lessons for free using Kaggle kernels.
https://t.co/lZ1XbHjT0E