What are Transformers (Machine Learning Model)?
Introduction to Transformers
In this section, the speaker introduces transformers and their capabilities.
What are Transformers?
- Transformers are the neural network architecture behind generative pre-trained transformer (GPT) models, which can produce text that looks like it was written by a human.
- They can write poetry, craft emails, and even come up with their own jokes.
How do Transformers Work?
- Transformers consist of two parts: an encoder and a decoder.
- The encoder works on the input sequence while the decoder operates on the target output sequence.
- The transformer takes a sequence of tokens (such as the words in a sentence), passes it through the encoder layers, and the decoder then predicts the output sequence one token at a time.
- The attention mechanism provides context by weighing how relevant every other token in the input sequence is to each token, and it can do this for the whole sequence in parallel. This gives transformers an advantage over algorithms like RNNs, which must process tokens one at a time in order.
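The attention step described above can be sketched in a few lines of NumPy. This is a minimal illustration of scaled dot-product self-attention, not a full transformer layer; the random vectors stand in for the learned query/key/value projections a real model would train:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Each output row is a context-weighted mix of the value vectors."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # attention weights; each row sums to 1
    return weights @ V, weights

# Toy example: 3 tokens, embedding dimension 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(X, X, X)  # self-attention: Q = K = V = X
print(w.shape)  # (3, 3): one weight per (query token, key token) pair
```

Because every token's weights are computed in one matrix multiplication, the whole sequence is handled at once, which is the parallelism advantage over RNNs mentioned above.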
Applications of Transformers
- Language translation is one example where transformers can be applied to convert one language into another.
- Document summarization is another great example: you can feed in a whole article as input and generate a summary of just a couple of sentences covering its main points.
- Beyond just language, transformers have done things like learn to play chess and perform image processing that even rivals the capabilities of convolutional neural networks.
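The next-word prediction loop that powers generative applications like these can be sketched with greedy decoding. The four-word vocabulary and the `toy_model` function below are hypothetical stand-ins for a real trained transformer, which would score the whole vocabulary at each step:

```python
import numpy as np

def greedy_decode(next_token_logits, prompt, vocab, steps=3):
    """Repeatedly append the highest-scoring next token (greedy decoding)."""
    tokens = list(prompt)
    for _ in range(steps):
        logits = next_token_logits(tokens)            # model's score for each vocab entry
        tokens.append(vocab[int(np.argmax(logits))])  # take the most likely token
    return tokens

# Hypothetical stand-in "model": always prefers the token after the last one seen.
vocab = ["the", "cat", "sat", "down"]
def toy_model(tokens):
    logits = np.zeros(len(vocab))
    logits[(vocab.index(tokens[-1]) + 1) % len(vocab)] = 1.0
    return logits

print(greedy_decode(toy_model, ["the"], vocab))  # → ['the', 'cat', 'sat', 'down']
```

Real systems usually sample from the predicted distribution rather than always taking the argmax, but the iterate-and-append structure is the same.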