Context Engineering · February 5, 2026 · 4 min read

Context engineering is the real AI interface

A technical view of context as a managed runtime: what to include, what to omit, how to preserve state, and why a larger window does not solve context design.

The model is not the complete AI system. The model is a function that receives a context and produces a continuation.

What the application places around that function determines much of the result: instructions, conversation state, retrieved knowledge, tool definitions, tool outputs, user preferences, files, and a description of the current task.

Calling all of this “prompt engineering” is too narrow. The more useful term is context engineering: designing the information environment in which a model makes its next decision.

A context window is working memory

The context window should be treated as scarce working memory, even when it is large.

A typical agent context contains:

Component	Purpose	Common failure
System instructions	Define behavior and boundaries	Contradictory or overly long rules
Task state	Explain the current objective and progress	Stale plans and completed work remain active
Conversation	Preserve user intent	Irrelevant history dominates recent instructions
Retrieved knowledge	Ground decisions in external facts	Similar but wrong passages are included
Tool schemas	Describe available actions	Too many tools confuse selection
Tool results	Supply observations	Raw output consumes the entire budget

The engineering problem is deciding what deserves to be present now.
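One way to make "what deserves to be present now" concrete is a budgeted assembly step. The sketch below is illustrative only: `ContextItem`, `estimate_tokens`, `assemble`, the priority scale, and the four-characters-per-token estimate are all assumptions made for this example, not part of any particular framework.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    component: str  # e.g. "system", "task_state", "tool_result"
    text: str
    priority: int   # hypothetical scale: lower number = more important


def estimate_tokens(text: str) -> int:
    # Crude stand-in (about 4 characters per token). A real system would
    # use the tokenizer of the model it targets.
    return max(1, len(text) // 4)


def assemble(items: list[ContextItem], budget: int) -> list[ContextItem]:
    """Keep the most important items that fit within the token budget."""
    chosen: list[ContextItem] = []
    used = 0
    # Visit items from most to least important; skip anything that overflows.
    for item in sorted(items, key=lambda i: i.priority):
        cost = estimate_tokens(item.text)
        if used + cost <= budget:
            chosen.append(item)
            used += cost
    return chosen


items = [
    ContextItem("system", "Answer concisely. Never run destructive commands.", 0),
    ContextItem("task_state", "Goal: fix failing login test. Step 2 of 4.", 1),
    ContextItem("tool_result", "x" * 4000, 2),  # an oversized raw log
    ContextItem("conversation", "User: please keep the public API unchanged.", 1),
]
selected = assemble(items, budget=100)
print([item.component for item in selected])
```

The oversized tool result is the item that gets left out, which is the behavior the table's last row argues for. A fuller design would slice or summarize that result rather than drop it outright.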

Context has a lifecycle

Good context is not assembled once. It changes while work proceeds.

Acquire: retrieve a file, message, database row, or tool result.
Normalize: add provenance, timestamps, permissions, and type information.
Select: include only what is relevant to the current decision.
Compress: summarize or transform material while preserving critical details.
Expire: remove state that is stale, superseded, or no longer useful.
Persist: save durable facts outside the prompt when they matter later.

This lifecycle resembles cache and memory management more than copywriting.
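The cache analogy can be sketched as a small store with explicit lifecycle methods. Everything here is hypothetical: the `ContextStore` class, its method names, the time-to-live rule, and the keyword match used for selection are placeholders, and truncation stands in for real compression.

```python
import time
from dataclasses import dataclass


@dataclass
class Entry:
    key: str
    text: str
    source: str         # provenance, recorded at the Normalize step
    acquired_at: float  # timestamp, recorded at the Normalize step
    ttl_seconds: float  # how long the entry stays valid
    durable: bool = False


class ContextStore:
    def __init__(self) -> None:
        self.active: dict[str, Entry] = {}
        self.persisted: dict[str, Entry] = {}  # stands in for external storage

    def acquire(self, key, text, source, ttl_seconds, durable=False, now=None):
        # Acquire + Normalize: record where the material came from and when.
        now = time.time() if now is None else now
        entry = Entry(key, text, source, now, ttl_seconds, durable)
        self.active[key] = entry  # a newer entry supersedes the same key
        if durable:
            self.persisted[key] = entry  # Persist: survives outside the prompt
        return entry

    def expire(self, now=None):
        # Expire: drop entries whose time-to-live has elapsed.
        now = time.time() if now is None else now
        self.active = {
            k: e for k, e in self.active.items()
            if now - e.acquired_at < e.ttl_seconds
        }

    def select(self, keywords, max_chars=200):
        # Select + Compress: keep entries that mention a keyword, truncated.
        hits = [
            e for e in self.active.values()
            if any(w.lower() in e.text.lower() for w in keywords)
        ]
        return [f"[{e.source}] {e.text[:max_chars]}" for e in hits]


store = ContextStore()
store.acquire("err", "TypeError in login handler", "test_run", ttl_seconds=60, now=0)
store.acquire("pref", "User prefers TypeScript examples", "settings",
              ttl_seconds=1e9, durable=True, now=0)
store.expire(now=120)
print(store.select(["login", "typescript"]))
```

After two minutes the transient error has expired, while the durable preference is still selectable and also lives in `persisted`.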

Conversation history is not memory

Replaying every prior message creates the appearance of memory, but it is fragile and expensive: each turn resends the whole transcript, and nothing marks which statements are still true.

Durable memory should be explicit and typed. A useful system distinguishes:

User preferences: stable choices such as language or coding style.
Project facts: repository structure, decisions, constraints, and terminology.
Task state: the current plan, completed steps, and unresolved blockers.
Episodes: past actions that may be relevant as examples.
Source knowledge: documents that should be retrieved, not memorized as truth.

Each type needs different retention, permissions, and update rules. A preference may persist for months. A temporary error message should disappear after the issue is fixed.
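A minimal sketch of typed memory with per-type retention might look like the following. The `MemoryType` names mirror the list above, but the retention windows, the `resolved` flag, and the `is_live` helper are assumptions chosen for illustration, not recommended values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class MemoryType(Enum):
    PREFERENCE = "preference"
    PROJECT_FACT = "project_fact"
    TASK_STATE = "task_state"
    EPISODE = "episode"
    SOURCE = "source"


# Hypothetical retention windows; None means "keep until explicitly replaced".
RETENTION = {
    MemoryType.PREFERENCE: timedelta(days=180),
    MemoryType.PROJECT_FACT: None,
    MemoryType.TASK_STATE: timedelta(hours=12),
    MemoryType.EPISODE: timedelta(days=30),
    MemoryType.SOURCE: None,
}


@dataclass
class MemoryRecord:
    kind: MemoryType
    content: str
    created: datetime
    resolved: bool = False  # e.g. an error message whose issue was fixed


def is_live(record: MemoryRecord, now: datetime) -> bool:
    """Return True if the record should still be eligible for the context."""
    if record.resolved:
        return False
    window = RETENTION[record.kind]
    return window is None or now - record.created <= window


now = datetime(2026, 2, 5, 12, 0)
pref = MemoryRecord(MemoryType.PREFERENCE, "Reply in English", datetime(2025, 11, 1))
error = MemoryRecord(MemoryType.TASK_STATE, "Build failed: missing dependency",
                     now - timedelta(hours=1), resolved=True)
old_plan = MemoryRecord(MemoryType.TASK_STATE, "Plan v1", now - timedelta(days=2))
print(is_live(pref, now), is_live(error, now), is_live(old_plan, now))
```

The months-old preference stays live, the fixed error is dropped even though it is recent, and the two-day-old plan ages out under the task-state window.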

Summaries are lossy compression

Summarization is necessary for long-running work, but every summary removes information.

The dangerous approach is to replace history with a fluent paragraph and assume nothing important was lost. Better summaries preserve structured fields: objective, constraints, decisions, changed files, test results, open questions, and source references.

Critical details should remain addressable. A summary can say that a decision was made, while retaining a link or identifier for the original evidence.
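As a sketch of what "structured and addressable" could mean, the summary below keeps the fields named above and attaches an evidence identifier to each decision. The `TaskSummary` and `Decision` types, the `evidence_store` dictionary, and all sample values are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    statement: str
    evidence_id: str  # identifier of the original message or tool result


@dataclass
class TaskSummary:
    objective: str
    constraints: list[str] = field(default_factory=list)
    decisions: list[Decision] = field(default_factory=list)
    changed_files: list[str] = field(default_factory=list)
    test_results: str = ""
    open_questions: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Produce the compact text that would be placed in the context."""
        lines = [f"Objective: {self.objective}"]
        lines += [f"Constraint: {c}" for c in self.constraints]
        lines += [
            f"Decision: {d.statement} [evidence: {d.evidence_id}]"
            for d in self.decisions
        ]
        lines += [f"Changed file: {p}" for p in self.changed_files]
        if self.test_results:
            lines.append(f"Tests: {self.test_results}")
        lines += [f"Open question: {q}" for q in self.open_questions]
        return "\n".join(lines)


# Original evidence lives outside the prompt, addressable by identifier.
evidence_store = {
    "msg-17": "User: we cannot change the /login response shape, "
              "mobile clients depend on it.",
}

summary = TaskSummary(
    objective="Fix failing login test",
    constraints=["Keep the public API unchanged"],
    decisions=[Decision("Patch the session helper instead of the route", "msg-17")],
    changed_files=["auth/session.py"],
    test_results="41 passed, 1 failed",  # made-up sample value
    open_questions=["Is the remaining failure flaky?"],
)
print(summary.render())
print(evidence_store[summary.decisions[0].evidence_id])
```

The rendered summary is what enters the context. The full original message is fetched by its identifier only when a later step needs to check the decision.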

Tool output needs a budget

Agents often fail because a tool returns too much, not too little. A full log, database dump, or repository tree can crowd out instructions and recent observations, or push them out of the window entirely.

Tools should support scoped queries, pagination, filters, and structured responses. The orchestrator should retain raw output outside the context and insert only the relevant slice. Large results can be summarized, but exact identifiers and error lines should remain available on demand.
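A rough sketch of that orchestrator behavior: keep the raw output in a side store, hand the model a filtered slice with line numbers, and allow exact lines to be fetched later. The function names, the in-memory `_raw_store`, and the substring filter are assumptions for this example.

```python
import itertools

_raw_store: dict[str, list[str]] = {}  # raw output kept outside the context
_ids = itertools.count(1)


def store_output(raw: str) -> str:
    """Save the full tool output and return an identifier for it."""
    result_id = f"result-{next(_ids)}"
    _raw_store[result_id] = raw.splitlines()
    return result_id


def slice_for_context(result_id: str, pattern: str = "error", max_lines: int = 5) -> str:
    """Return a short, line-numbered slice suitable for the prompt."""
    lines = _raw_store[result_id]
    matches = [
        (n, line) for n, line in enumerate(lines, start=1)
        if pattern.lower() in line.lower()
    ]
    shown = matches[:max_lines]
    header = (
        f"{result_id}: {len(lines)} lines total, "
        f"{len(matches)} matching '{pattern}', showing {len(shown)}"
    )
    return "\n".join([header, *(f"{n}: {line}" for n, line in shown)])


def fetch_lines(result_id: str, start: int, end: int) -> list[str]:
    # On-demand access to exact raw lines (1-indexed, inclusive).
    return _raw_store[result_id][start - 1:end]


log_lines = [f"INFO step {i} ok" for i in range(1, 501)]
log_lines[249] = "ERROR step 250: connection refused"
rid = store_output("\n".join(log_lines))

snippet = slice_for_context(rid)
print(snippet)
print(fetch_lines(rid, 249, 251))
```

Here a 500-line log contributes two lines to the context. Because the slice carries the original line number, a follow-up call can pull the surrounding lines verbatim instead of relying on a paraphrase.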

Large windows do not remove the problem

Longer context reduces some pressure. It does not create perfect recall or perfect attention.

Irrelevant information still competes with useful information. Conflicting instructions still conflict. Old state can still override a newer reality. Costs and latency still increase with input size.

The right target is not “fit everything.” It is “make every important decision from sufficient, current, attributable context.”

Our opinion

The durable AI platforms will expose context as a system users and developers can inspect. They will show what entered the model, where it came from, how long it persists, and which rule selected it.
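What such an inspectable record could look like is sketched below as a per-request manifest. The `ManifestEntry` fields and the rule names are hypothetical, and no existing platform is claimed to use this schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ManifestEntry:
    item_id: str
    source: str       # where the item came from
    selected_by: str  # which rule placed it in the context
    persists: str     # how long it is retained
    chars: int        # how much of the window it occupies


def build_manifest(entries: list[ManifestEntry]) -> str:
    """Serialize what entered the model so users and developers can inspect it."""
    return json.dumps([asdict(e) for e in entries], indent=2)


manifest = build_manifest([
    ManifestEntry("pref-language", "user settings", "always-include preferences",
                  "until changed", 24),
    ManifestEntry("result-1:250", "test runner log", "error-line filter",
                  "current task only", 38),
])
print(manifest)
```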

Invisible context may make an assistant feel magical. Inspectable context makes it dependable.