Looped-GPT: Looping During Pre-training Improves Generalization

Looped-GPT is a language model trained with depth recurrence, enabled by a reverse residual connection that supports iterative refinement of activations. Under comparable pre-training settings, Looped-GPT outperforms a standard GPT. In this post, I introduce Looped-GPT, a simple modification to the standard GPT architecture that enables depth recurrence. The key idea is a reverse residual connection that feeds the representation from the final transformer block back into the input, allowing the model to iteratively refine its activations over multiple passes....
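To make the idea concrete, here is a minimal sketch of the looping mechanism, assuming PyTorch and hypothetical names (`LoopedBlockStack`, `n_loops`): the block stack is applied repeatedly, and after each pass the final block's output is combined with the original input via the reverse residual connection before the next pass. This is an illustration of the general technique, not the post's exact implementation.

```python
import torch
import torch.nn as nn

class LoopedBlockStack(nn.Module):
    """Sketch of depth recurrence: run a stack of transformer blocks
    n_loops times, feeding the final block's output back toward the
    input via a reverse residual connection (names are hypothetical)."""

    def __init__(self, d_model: int, n_blocks: int, n_loops: int):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            for _ in range(n_blocks)
        )
        self.n_loops = n_loops

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = x
        for _ in range(self.n_loops):
            # one full pass through the block stack
            for block in self.blocks:
                h = block(h)
            # reverse residual: add the original input embeddings back,
            # so the next pass refines rather than restarts
            h = h + x
        return h
```

With `n_loops=1` this reduces (up to the final residual add) to a standard GPT-style block stack, which makes the comparison in the post a controlled one: same parameter count, more passes.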