Architecture · 2025-12-08

How should you handle long context windows vs. retrieval strategies?

With 1M+ token windows, is RAG dead? Understanding the 'Lost in the Middle' phenomenon.

Is RAG Dead?

Not yet. While context windows keep growing (Gemini 1.5 Pro supports over 1M tokens), stuffing everything into the context on every request has real downsides.
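
To see how quickly this adds up, here is a back-of-envelope sketch comparing per-request input cost when the whole corpus is sent versus when only a few retrieved chunks are sent. The token counts and the price constant are hypothetical placeholders, not real pricing for any model.

```python
# Back-of-envelope comparison of input-token cost: stuffing a whole corpus
# into the prompt vs. sending only a few retrieved chunks.
# The token counts and PRICE are placeholder assumptions, not real pricing.

def prompt_cost(prompt_tokens: int, price_per_1m_tokens: float) -> float:
    """Input-token cost of a single request."""
    return prompt_tokens / 1_000_000 * price_per_1m_tokens

PRICE = 1.0  # hypothetical dollars per 1M input tokens; substitute your model's rate

full_context = prompt_cost(1_000_000, PRICE)  # entire corpus on every request
retrieved = prompt_cost(4_000, PRICE)         # only the top-k retrieved chunks

print(f"full context: ${full_context:.4f} per request")
print(f"retrieved:    ${retrieved:.4f} per request ({full_context / retrieved:.0f}x cheaper)")
```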

Trade-offs

  • Cost: The full context is processed (and billed) on every request, so spend grows with context length.
  • Latency: Time-to-first-token increases with context length.
  • Accuracy: Models can struggle to find details buried in the middle of massive contexts ("Lost in the Middle").
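
Retrieval is the usual answer to all three points: embed and index the corpus once, then pull only the handful of chunks relevant to each query into the prompt. The sketch below shows that step under stated assumptions; embed() is a stand-in for whatever embedding model you use, and the chunks, vector size, and k are illustrative rather than any specific library's API.

```python
# Minimal sketch of the retrieval step in RAG: embed the corpus chunks once,
# then keep only the top-k chunks most similar to the query instead of
# putting the whole corpus into the context window.
import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    # Placeholder: swap in calls to a real embedding model here.
    # Returns one unit-norm vector per input text.
    rng = np.random.default_rng(0)
    vecs = rng.normal(size=(len(texts), 384))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def top_k_chunks(query: str, chunks: list[str], chunk_vecs: np.ndarray, k: int = 3) -> list[str]:
    """Rank chunks by cosine similarity to the query and keep the best k."""
    q = embed([query])[0]
    scores = chunk_vecs @ q                # cosine similarity (vectors are unit-norm)
    best = np.argsort(scores)[::-1][:k]    # indices of the k highest scores
    return [chunks[i] for i in best]

chunks = ["...chunk 1...", "...chunk 2...", "...chunk 3...", "...chunk 4..."]
chunk_vecs = embed(chunks)                 # computed once, stored in a vector index

question = "What hurts latency?"
context = "\n\n".join(top_k_chunks(question, chunks, chunk_vecs, k=2))
prompt = f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"
```

In production the chunk vectors would live in a vector index and embed() would call a real model, but the prompt shape stays the same: a few thousand tokens of relevant context per request instead of the entire corpus.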

Further reading

  • Liu et al., "Lost in the Middle: How Language Models Use Long Contexts" (2023)

Related Topics

Context Window · RAG · Architecture
