In my next post I will discuss how NNs are used in ChatGPT in the same basic way, but on a much larger scale. I’ll also dip into NNs that produce images from a text prompt, which surprisingly use the same basic principles.
How does ChatGPT really work?
I thought I would try and write a post on how ChatGPT works, but in the style of ELI5.
There will be no mathematical notation or computer science and only one diagram.
The aim is to reach as many people as possible.
Since we had lots of past data and the resulting decision for each lender, a NN could be built that would predict the result - Moneysupermarket Smart Search was born.
We built the NN by feeding in all the past answers to questions that previous customers had submitted, then used the acceptance decision and APR as the items we were looking to predict (like recognising digits).
In a different project, we wanted to predict the likelihood that an individual would be accepted for a loan and the rate of interest they would be offered, without the customer applying for the loan (a mark on the credit score is added if you apply).
making it difficult for automated “bots” to read the price. The images were ever so slightly different each time they were displayed, but not enough for a human viewer to notice any difference.
Early on in my career I used NNs to solve two business problems for Moneysupermarket. Firstly, during the great captcha wars of insurance comparison, insurance sites switched to showing images of the price of a policy
For example, in the late 80s and early 90s, speech recognition and handwriting recognition (Apple Newton…) did not use NNs, and these solutions largely failed when applied to the general population of all kinds of speech/writing.
This tech is not new; in fact it has been around for 40+ years. The growth of readily available training data, along with computational power, has made NNs a viable everyday solution.
Let’s go back to the NN we made to recognise hand-drawn digits, which has been trained on a sample of known images. This NN should then be capable of recognising the number value of an image that it hasn’t seen before, i.e. one that was not in the training set.
In reality, choosing the right number of layers for the NN, the range of weights and the formula (rules) for each neuron (each layer can and should have different rules) is an art and a science, and much work is being done to automate this.
The weight adjustment for a different digit image could throw the NN off for previous images, so the training images are tested multiple times until the NN produces the correct result for all images (or reaches a threshold, e.g. 99.5%).
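For anyone who does want a peek at what this looks like in code, here is a toy sketch of that "adjust, then re-test everything" loop. It is not the digit-recognising NN described in this post: it is a single neuron with a simple made-up rule, and the tiny 3x3 "images", the learning rate and the names `predict`, `train` and `SAMPLES` are all assumptions for illustration.

```python
# Toy sketch only: one "neuron" learning to tell a tiny "1" from a tiny "0".
# The 3x3 images, learning rate and threshold are made-up illustrative values.

SAMPLES = [
    ([0, 1, 0, 0, 1, 0, 0, 1, 0], 1),  # a "1": middle column filled in
    ([1, 1, 0, 0, 1, 0, 0, 1, 0], 1),  # a "1" with a small flag at the top
    ([1, 1, 1, 1, 0, 1, 1, 1, 1], 0),  # a "0": a ring with an empty centre
    ([0, 1, 1, 1, 0, 1, 1, 1, 1], 0),  # a "0" with one corner missing
]


def predict(weights, bias, pixels):
    """Return 1 if the weighted sum of the pixels is positive, otherwise 0."""
    total = bias + sum(w * p for w, p in zip(weights, pixels))
    return 1 if total > 0 else 0


def train(samples, threshold=0.995, learning_rate=0.1, max_passes=1000):
    """Keep adjusting weights until enough training images are correct."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    accuracy = 0.0
    for passes in range(1, max_passes + 1):
        # Nudge the weights for every image the neuron currently gets wrong.
        for pixels, label in samples:
            error = label - predict(weights, bias, pixels)
            if error != 0:
                weights = [w + learning_rate * error * p
                           for w, p in zip(weights, pixels)]
                bias += learning_rate * error
        # Re-test ALL images: a later nudge may have broken an earlier one.
        correct = sum(predict(weights, bias, px) == lb for px, lb in samples)
        accuracy = correct / len(samples)
        if accuracy >= threshold:
            return weights, bias, passes, accuracy
    return weights, bias, max_passes, accuracy


weights, bias, passes, accuracy = train(SAMPLES)
print(f"Stopped after {passes} pass(es) with accuracy {accuracy:.1%}")
```

The important part is the re-test at the end of each pass: the loop only stops once the whole training set is at or above the threshold, not as soon as the most recent image comes out right.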