Back in 2017, the first transformer architecture introduced two main components: an encoder and a decoder. These two parts were connected so they could work together. This original design is known as an encoder–decoder transformer.

Decoders Can Work on Their Own

Over time, researchers realized that the decoder alone was powerful enough for many tasks. Using only a decoder, models could generate text, continue sentences, and perform translation and other language tasks. Models built this way are called decoder-only transformers. As we discussed in the article on decoder-only transformers, they form the foundation of systems like ChatGPT.

Encoders Can Also Work Independently

In a similar way, encoder-based models are also very useful on their own. Models built this way are called encoder-only transformers, and this idea forms the foundation of models like BERT and many others.…
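
One way to picture the difference between the two single-component families is through which tokens each position is allowed to look at. The sketch below is a minimal illustration, not taken from the article: it assumes the standard setup in which a decoder-only model uses causal attention (each token sees only itself and earlier tokens), while an encoder-only model such as BERT attends in both directions. The helper name `attention_mask` is hypothetical and chosen only for this example.

```python
def attention_mask(num_tokens, causal):
    """Return a num_tokens x num_tokens grid of booleans.

    mask[i][j] is True when token i is allowed to attend to token j.
    (Hypothetical helper, for illustration only.)
    """
    return [
        [(j <= i) if causal else True for j in range(num_tokens)]
        for i in range(num_tokens)
    ]


# Decoder-only style: token i sees tokens 0..i (left-to-right generation).
decoder_mask = attention_mask(4, causal=True)

# Encoder-only style: every token sees every other token.
encoder_mask = attention_mask(4, causal=False)

for name, mask in [("decoder-only", decoder_mask), ("encoder-only", encoder_mask)]:
    print(name)
    for row in mask:
        # "x" marks an allowed position, "." marks a blocked one.
        print(" ".join("x" if allowed else "." for allowed in row))
```

The triangular pattern printed for the decoder-only case is what lets such a model continue a sentence one token at a time, whereas the full grid in the encoder-only case suits tasks that read a whole input at once.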