A simple prompting trick to fix LLM attention fade
Table of Contents
Here’s a useful addition for prompts that ask an LLM to review or analyze longer inputs.
Transformer models have well-documented “attention distribution” challenges, meaning they tend to focus heavily on the beginning and end of content while under-attending to the middle. They can also become less thorough as they generate longer outputs, front-loading effort and tapering off.
This simple instruction set helps counteract both when added to a prompt. Here’s an example in the context of detecting errors:
For each paragraph, analyze independently with full attention.
Never skip or summarize errors because you found many earlier.
CRITICAL: Maintain identical thoroughness from first sentence to last.
Do not reduce detail or skip errors because the text has many issues.
If you find 50 errors, document all 50 with equal detail.
Treat each sentence as if it's the only sentence you're checking.
More about how to mitigate this soon!