whoa, this paper didn't get nearly the attention it deserves (no pun intended)
they propose to replace the transformer as default backbone for language modelling with RetNet - Retentive Network
here are some interesting observations:
Amazing illustration of how to build a resilient three-tier architecture on AWS.
Image Credit: @Ankit__Jodhani
–
Subscribe to our weekly newsletter to get a Free System Design PDF (158 pages): https://t.co/7ux5WHZMsr