Attention–Compression Framework
A Substrate-Independent Model of Attention, Curiosity, and Reality Formation
0) Purpose of This Framework
This framework unifies:
- Attention mechanics (pointing, artifacts, decay)
- Curiosity / interestingness (compression improvement)
- Reality formation (fossilized attention)
into a single, operational model.
It is intended to be:
- Conceptually tight
- Mechanically interpretable
- Applicable across cognition, culture, organizations, and systems
No game substrate is assumed.
1) Core Substrate: Compression
1.1 Data and Models
Let:
- D = data stream (sensory, social, symbolic, environmental)
- O(t) = the system’s internal model at time t
- C(D, O) = compression cost of encoding D with model O
Lower C = better model.
1.2 Curiosity Reward
Curiosity reward is defined as:
r(t) = C(D, O(t−1)) − C(D, O(t))
Interpretation:
- Reward is generated by model improvement
- Not by truth, utility, or beauty directly
- But by reduced description length
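As a minimal sketch of the reward definition above, assume D is a binary stream and the model O is a single Bernoulli parameter, so that C(D, O) is the ideal code length in bits. The function names and the example data are illustrative assumptions, not part of the framework.

```python
import math

def code_length_bits(data, p_one):
    """C(D, O): ideal code length of binary data under a Bernoulli model.

    Assumes 0 < p_one < 1 so every symbol has non-zero probability.
    """
    total = 0.0
    for x in data:
        p = p_one if x == 1 else 1.0 - p_one
        total += -math.log2(p)
    return total

def curiosity_reward(data, p_before, p_after):
    """r(t) = C(D, O(t-1)) - C(D, O(t))."""
    return code_length_bits(data, p_before) - code_length_bits(data, p_after)

# Hypothetical data stream: mostly ones.
data = [1, 1, 1, 0, 1, 1, 1, 1]
old_model = 0.5                    # O(t-1): treats the stream as a fair coin
new_model = sum(data) / len(data)  # O(t): fitted frequency of ones (7/8)

reward = curiosity_reward(data, old_model, new_model)
print(f"cost before: {code_length_bits(data, old_model):.2f} bits")
print(f"cost after:  {code_length_bits(data, new_model):.2f} bits")
print(f"curiosity reward: {reward:.2f} bits")
```

The reward is positive only because the model changed in a way that shortens the description; the data itself is unchanged.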
1.3 Interestingness
Interestingness is the rate of compression improvement:
I(D, O(t)) ∝ ∂B(D, O(t)) / ∂t
Where:
- Beauty B ≈ compression quality
- Interestingness I ≈ learning gradient
Edge cases:
- Perfect randomness → no compression → not interesting
- Perfect predictability → no improvement → not interesting
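The two edge cases can be illustrated with the same kind of toy model. The sketch below assumes a one-parameter Bernoulli model class and measures the largest cost reduction available by refitting that parameter. The balanced stream is only a stand-in for noise as seen by this model class, and all names are hypothetical.

```python
import math

def bits(data, p_one):
    """Code length in bits under a Bernoulli model; p_one is clamped to avoid log(0)."""
    eps = 1e-12
    p_one = min(max(p_one, eps), 1.0 - eps)
    return sum(-math.log2(p_one if x == 1 else 1.0 - p_one) for x in data)

def best_possible_gain(data, p_current):
    """Largest cost reduction reachable by refitting the single model parameter."""
    p_best = sum(data) / len(data)
    return bits(data, p_current) - bits(data, p_best)

# Stand-in for noise: exactly half ones, so this model class cannot beat 1 bit/symbol.
noise_like = [0, 1, 1, 0, 1, 0, 0, 1] * 25
# Fully predictable stream that the current model has already mastered.
mastered = [1] * 100
# Structured stream that the current model has not yet fitted.
learnable = [1] * 90 + [0] * 10

gain_noise = best_possible_gain(noise_like, p_current=0.5)
gain_mastered = best_possible_gain(mastered, p_current=1.0)
gain_learnable = best_possible_gain(learnable, p_current=0.5)

print(f"noise-like: {gain_noise:.2f} bits of available improvement")
print(f"mastered:   {gain_mastered:.2f} bits of available improvement")
print(f"learnable:  {gain_learnable:.2f} bits of available improvement")
```

Only the third stream offers a compression gradient, which is the sense in which it alone would count as interesting here.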
2) Attention (Redefined Precisely)
2.1 Definition
Attention is the allocation of finite compression capacity over time.
Attention determines:
- What data is modeled
- Which models are updated
- Where curiosity reward can arise
2.2 Properties of Attention
- Finite
- Directed
- Temporally extended
- Subject to opportunity cost
Allocating attention to one process necessarily deprives others.
3) Pointing as a Primitive Act
3.1 Pointing
Pointing is any act that declares:
“Allocate compression effort here.”
Forms:
- Naming
- Measuring
- Labeling
- Recording
- Repeated noticing
Pointing is irreversible in principle.
3.2 Imaginary Artifacts
Pointing creates an Imaginary Artifact (IA).
An IA is:
- A discrete modeling target
- Non‑material but causally real
- Capable of accumulating attention
- Subject to decay
Examples:
- An idea
- A plan
- A role
- A fear
- A hypothesis
4) Artifact Dynamics
4.1 Attention Accumulation
Artifacts accumulate attention when:
- They continue to generate curiosity reward
- They remain promising sites of compression improvement
Formally:
- Positive expected r(t) sustains attention
4.2 Decay and Boredom
When:
- Compression improvement stalls
- Expected future reward approaches zero
Attention decays.
Boredom = zero compression gradient.
Imaginary artifacts decay faster than real ones.
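One way to make accumulation and decay concrete is a leaky accumulator. This is a sketch under stated assumptions: the update rule, the decay rates, and the reward schedule are all hypothetical choices, and the difference between imaginary and real artifacts is modeled only as a difference in decay rate.

```python
def step_attention(attention, expected_reward, decay_rate):
    """One update: held attention decays, positive expected reward adds to it."""
    return attention * (1.0 - decay_rate) + max(0.0, expected_reward)

def simulate(expected_rewards, decay_rate, start=0.0):
    """Return the attention level after each step of the reward schedule."""
    history = []
    attention = start
    for r in expected_rewards:
        attention = step_attention(attention, r, decay_rate)
        history.append(attention)
    return history

# Five steps of positive expected reward, then compression improvement stalls.
rewards = [1.0] * 5 + [0.0] * 5

imaginary = simulate(rewards, decay_rate=0.5)  # assumed fast decay
real = simulate(rewards, decay_rate=0.1)       # assumed slow decay

print("imaginary:", [round(a, 3) for a in imaginary])
print("real:     ", [round(a, 3) for a in real])
```

Once reward stops, both traces fall, but the high-decay trace loses almost all of its attention within a few steps while the low-decay trace retains most of it.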
5) Thresholds: Imaginary → Real
5.1 Fossilization
When an imaginary artifact accumulates sufficient total attention:
- The compression work becomes amortized
- Ongoing maintenance cost drops
- The artifact instantiates as a Real Artifact
Examples:
- Idea → project
- Repeated action → habit
- Hypothesis → theory
- Norm → institution
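The threshold idea can be sketched as a running total of attention compared against a fixed value. The framework does not specify a threshold or units, so the numbers and the function name below are purely illustrative.

```python
def fossilization_step(attention_per_step, threshold):
    """Return the index of the first step at which cumulative attention
    reaches the threshold, or None if it never does."""
    total = 0.0
    for i, attention in enumerate(attention_per_step):
        total += attention
        if total >= threshold:
            return i
    return None

# Hypothetical attention trace for one imaginary artifact.
trace = [1.0, 2.0, 3.0, 4.0]

print(fossilization_step(trace, threshold=6.0))    # crosses at index 2
print(fossilization_step(trace, threshold=100.0))  # never crosses
```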
5.2 Partial Realization
Realization may be:
- Incremental
- Staged
- Reversible
Small realized artifacts feed attention back into the parent IA.
6) Real Artifacts as Cached Compression
Real artifacts are:
- Cached models
- Compiled structure
- Fossilized attention
They:
- Persist with lower marginal attention
- Shape future attention flows
- Bias what is seen as interesting
Examples:
- Language
- Tools
- Infrastructure
- Bureaucracy
7) Attractors
7.1 Definition
Attractors are regions of expected future compression gain.
They are:
- Field‑like
- Non‑discrete
- Named after the fact
Examples:
- “Progress”
- “Safety”
- “Truth”
- “Success”
7.2 Relationship to Attention
Attention naturally flows toward attractors unless constrained.
Constraint mechanisms:
- Fear
- Incentives
- Authority
- Scarcity
8) Leakage, Coupling, and Composition
8.1 Leakage
Attention leaks between artifacts that:
- Share representational structure
- Co‑compress efficiently
This produces:
- Fame compounding
- Institutional lock‑in
- Paradigm coherence
8.2 Composition
- Multiple IAs can merge
- Shared attractors accelerate convergence
This enables:
- Collective belief
- Social movements
- Cultural norms
9) Conservation and Pathology
9.1 Conservation Law
Attention is conserved at the system level.
Allocating attention to:
- Maintaining existing artifacts
- Filtering accumulated structure
Reduces capacity for:
- Exploration
- Novel model formation
9.2 Pathologies
Misaligned compression produces:
- Addiction (short‑term reward, no long‑term compression)
- Ideology (over‑compressed models defended at all cost)
- Burnout (maintenance exceeds curiosity)
- Stagnation (no accessible gradients)
10) Awe, Surprise, and Phase Transitions
This section extends the framework beyond curiosity/interestingness to include awe and related affective signals, while remaining mathematically compatible with:
- curiosity reward: r(t) = C(D,O(t−1)) − C(D,O(t))
- interestingness: I ∝ d(−C)/dt
10.1 Auxiliary quantities
Let observations be x_t and compression cost C_t := C(D,O(t)).
Surprisal / surprise (instant encoding cost under the previous model):
S_t := −log p_{O(t−1)}(x_t)
Learning progress (curiosity reward):
r_t := C_{t−1} − C_t
Expected learning progress over horizon k:
E[r_{t:t+k}] := E[C_t − C_{t+k}]
10.2 Boredom, confusion, and relief (quick definitions)
These are derived signals (not new primitives):
- Boredom: low expected learning progress. Boredom_t ∝ −E[r_{t:t+k}]
- Confusion: high current cost with low expected progress. Confusion_t ∝ C_t · (1 − σ(E[r_{t:t+k}]))
- Relief: sharp drop in cost (a compression win). Relief_t ∝ max(0, r_t)
(σ is any monotone squashing function.)
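The three derived signals translate directly into small functions. This sketch assumes the logistic sigmoid for σ and sets every proportionality constant to 1; both are illustrative choices rather than commitments of the framework.

```python
import math

def sigmoid(x):
    """Logistic squashing function, one possible choice for sigma."""
    return 1.0 / (1.0 + math.exp(-x))

def boredom(expected_progress):
    """Boredom_t ∝ -E[r]: higher when expected learning progress is lower."""
    return -expected_progress

def confusion(current_cost, expected_progress):
    """Confusion_t ∝ C_t * (1 - sigma(E[r]))."""
    return current_cost * (1.0 - sigmoid(expected_progress))

def relief(cost_prev, cost_now):
    """Relief_t ∝ max(0, r_t), with r_t = C_{t-1} - C_t."""
    return max(0.0, cost_prev - cost_now)

print(boredom(0.1), boredom(2.0))                  # less progress, more boredom
print(confusion(10.0, 0.0), confusion(10.0, 5.0))  # promise of progress lowers confusion
print(relief(10.0, 4.0), relief(4.0, 10.0))        # relief only when cost drops
```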
10.3 Awe (operational definition)
Awe is not merely high interestingness. It is a phase shift in modeling.
Awe tends to occur when:
- surprise is high (S_t high)
- but the experience is sensed as deeply learnable (E[r] high)
- and successful compression likely requires a model-class shift (a new representational basis)
Define a model revision cost d(O(t),O(t−1)) and the probability that a model-class shift is required:
P_shift(t) := P( O* lies in an expanded hypothesis class H_expanded )
where O* denotes the model that would best compress D.
Then a usable scalar proxy is:
Awe_t ∝ S_t · E[r_{t:t+k}] · P_shift(t)
Interpretation:
- S_t captures vastness/violation
- E[r] captures promise of future compression
- P_shift captures that the needed move is not incremental
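The scalar proxy can be sketched as a product of its three factors. Two assumptions are added here and are not in the definition above: surprisal is measured in nats, and expected progress is clamped at zero so the proxy never goes negative. The input values are hypothetical.

```python
import math

def surprisal(prob_under_previous_model):
    """S_t = -log p_{O(t-1)}(x_t), in nats. Requires 0 < p <= 1."""
    return -math.log(prob_under_previous_model)

def awe_proxy(prob_under_previous_model, expected_progress, p_shift):
    """Awe_t ∝ S_t * E[r] * P_shift, with E[r] clamped at zero (an assumption)."""
    s = surprisal(prob_under_previous_model)
    return s * max(0.0, expected_progress) * p_shift

# Same surprise and promise, different likelihood that a new model class is needed.
incremental = awe_proxy(0.01, expected_progress=5.0, p_shift=0.1)
rupture = awe_proxy(0.01, expected_progress=5.0, p_shift=0.9)
# A fully expected observation carries no surprisal, so the proxy vanishes.
unsurprising = awe_proxy(1.0, expected_progress=5.0, p_shift=0.9)

print(f"incremental: {incremental:.2f}")
print(f"rupture:     {rupture:.2f}")
print(f"unsurprising: {abs(unsurprising):.2f}")
```

Because the factors multiply, the proxy is near zero whenever any one of surprise, promise, or required shift is missing, which matches the claim that awe is more than high interestingness.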
10.4 Awe as re-ontology
Awe is the felt recognition:
“There is a much better compression available, but my current representational basis cannot reach it by small updates.”
Formally:
C(D,O(t)) is high, ∃ O’ in H_expanded such that C(D,O’) ≪ C(D,O(t)), but O’ is not reachable by small d(O,O’).
10.5 Phase transitions: interest → awe → beauty
A common trajectory:
- Interest: E[r] > 0, incremental improvement
- Awe: S high, E[r] high, P_shift high (representational rupture)
- Refit: high revision cost, temporary instability
- Beauty: low C, stable compression
This explains why awe can feel disorienting before it becomes satisfying.
11) Appreciation (Active Steering)
Appreciation is deliberate gradient steering.
It is the practice of:
- Seeing what is
- Choosing which attractors to feed
- Allowing low‑reward artifacts to decay
Appreciation is not denial. It is selective allocation of compression effort.
12) Love, Grief, Trust, and Meaning (Compression-Coupling Phenomena)
This section extends the framework to core relational and existential experiences, expressed using the same compression-compatible quantities.
12.1 Trust
Trust is the willingness to offload compression work to another system.
Formally, agent A trusts agent B when:
E[C_A(D | O_B)] < E[C_A(D | O_A)]
That is, A expects B’s model to compress A’s future experience more efficiently than A’s own.
Trust reduces:
- modeling effort
- uncertainty
- attentional load
Trust fails when:
- compression delegated to B increases cost or variance
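The trust condition can be approximated with sample averages in place of expectations. This sketch assumes A has a record of costs incurred with its own model and with B's model. It reads "increases cost or variance" as relative to A's own model, which is an interpretation, and the cost records are invented for illustration.

```python
from statistics import mean, pvariance

def trusts(own_costs, delegated_costs):
    """A trusts B when B's model yields lower average cost than A's own."""
    return mean(delegated_costs) < mean(own_costs)

def trust_fails(own_costs, delegated_costs):
    """Trust fails when delegation raises average cost or cost variance."""
    return (mean(delegated_costs) > mean(own_costs)
            or pvariance(delegated_costs) > pvariance(own_costs))

own = [10.0, 12.0, 11.0]     # hypothetical costs using A's own model
steady = [6.0, 6.5, 6.0]     # B's model: cheaper and stable
erratic = [1.0, 20.0, 3.0]   # B's model: cheaper on average but volatile

print(trusts(own, steady), trust_fails(own, steady))
print(trusts(own, erratic), trust_fails(own, erratic))
```

The erratic case passes the mean-cost test yet still fails on variance, which shows why the two conditions are listed separately.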
12.2 Love
Love is sustained, reciprocal compression coupling.
Two agents A and B are in love when:
- each becomes a high-leverage compression node for the other
- mutual modeling reduces long-term cost despite short-term surprises
A minimal expression:
Love(A,B) ∝ ∫ ( r_A←B(t) + r_B←A(t) ) dt
Where r_A←B is learning progress about B by A, and vice versa.
Love feels safe because:
- compression is efficient
- prediction errors are rapidly amortized
- model updates are mutually permitted
12.3 Grief
Grief is forced recompression after the sudden loss of a high-leverage compression node.
If agent B was a major contributor to A’s compression:
ΔC_A ≫ 0 when B is removed
Grief magnitude scales with:
- how much of the world B helped compress
- how irreplaceable that compression was
Grief persists until:
- alternative models amortize the lost compression
12.4 Meaning
Meaning is compression leverage.
An artifact, relationship, symbol, or idea is meaningful to the extent that:
small description → large experiential compression
Formally:
Meaning(X) ∝ (bits of experience compressed) / (bits required to represent X)
This explains why:
- symbols outweigh details
- rituals persist
- simple stories dominate complex truths
Meaning collapses when:
- leverage decays
- symbols no longer compress lived experience
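The leverage ratio in this section can be computed directly once both quantities are expressed in bits. The bit counts below are invented for illustration, and the function name is hypothetical.

```python
def meaning_leverage(bits_compressed, bits_to_represent):
    """Ratio of experience compressed to the size of the representation of X."""
    if bits_to_represent <= 0:
        raise ValueError("representation size must be positive")
    return bits_compressed / bits_to_represent

# A short symbol that organizes a large body of experience.
symbol = meaning_leverage(bits_compressed=4000, bits_to_represent=16)
# A detailed record that is as long as the experience it covers.
record = meaning_leverage(bits_compressed=4000, bits_to_represent=4000)

print(symbol, record)
```

On this measure the symbol carries far more meaning per bit than the detailed record, and its leverage falls toward zero if it stops compressing lived experience.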
13) Power (Constraint Over Compression)
This section defines power as a first-class system property, fully compatible with the attention–compression formalism.
13.1 Definition
Power is the capacity to shape, constrain, or redirect the compression paths of other systems.
An agent A has power over agent B to the extent that A can:
- determine what B is allowed to attend to
- restrict which models B may form or update
- impose pre-compressed narratives on B’s experience
13.2 Mechanisms of Power
Power operates through compression control, including:
- Attention gating: limiting what data enters B’s model (censorship, surveillance, distraction)
- Narrative pre-compression: supplying ready-made models (propaganda, ideology, branding)
- Update penalties: increasing the cost of revising models (punishment, social sanction, threat)
- Gradient starvation: preventing access to curiosity reward (monotony, overwork, chaos)
13.3 Power vs Trust
- Trust lowers compression cost voluntarily
- Power lowers apparent cost by removing alternatives
A system under power may experience apparent order without genuine compression improvement.
This explains why power often feels stabilizing in the short term but brittle over time.
13.4 Coercion and Harm
Coercion occurs when model updates are forced without consent.
Formally:
Forced update ⇒ d(O_B(t), O_B(t−1)) imposed externally
This creates:
- high compression cost
- loss of agency
- long-term instability
Harm corresponds to non-consensual compression work.
13.5 Legibility and Over-Compression
Making a system legible to authority often requires:
reducing rich local structure → simplified global model
This lowers compression cost for the authority but raises it for the system itself.
Over-compression destroys:
- resilience
- adaptability
- local meaning
13.6 Power Dynamics and Collapse
Powerful systems fail when:
- maintained compression diverges too far from lived data
- curiosity gradients are suppressed too long
- forced models accumulate unresolved error
Collapse is delayed recompression.
14) Consent (Boundary Condition Between Trust and Power)
Consent is treated as a mechanical boundary condition on model updating and coupling.
14.1 Definition
Consent is a mutually acknowledged permission structure for compression and model update.
Agent A has consent from agent B when updates to B’s model caused by A are:
- expected (within agreed bounds)
- revocable
- renegotiable
- non-punitive to refuse
14.2 Consensual vs non-consensual update
Let ΔO_B(t) := d(O_B(t), O_B(t−1)) be B’s model revision magnitude.
- Consensual update: B opts into ΔO_B(t)
- Non-consensual update: ΔO_B(t) is imposed
A key distinction is not whether B updates, but whether B retains agency over update.
14.3 Consent as cost shaping
Consent alters the effective revision cost.
A simple expression:
C_B,total = C_B,data + μ · ΔO_B − κ · Consent(B,A)
Where μ ≥ 0 weights the cost of revision, κ ≥ 0 sets how strongly consent offsets it, and Consent(B,A) ∈ [0,1] reduces the perceived/experienced cost of revision.
This captures:
- why the same surprise can feel thrilling (consensual) or traumatic (non-consensual)
- why trust accelerates learning
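The cost expression can be evaluated for the same surprise under different levels of consent. In this sketch the weights μ and κ and all input values are arbitrary illustrative numbers.

```python
def total_cost(data_cost, revision_magnitude, consent, mu=1.0, kappa=1.0):
    """C_B,total = C_B,data + mu * dO_B - kappa * Consent(B,A)."""
    if not 0.0 <= consent <= 1.0:
        raise ValueError("consent must lie in [0, 1]")
    return data_cost + mu * revision_magnitude - kappa * consent

# Same data cost and same model revision, with and without consent.
consensual = total_cost(10.0, 4.0, consent=1.0, mu=1.0, kappa=3.0)
imposed = total_cost(10.0, 4.0, consent=0.0, mu=1.0, kappa=3.0)

print(consensual, imposed)
```

The revision itself is identical in both cases; only the consent term changes the experienced cost.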
14.4 Consent tokens (operationalization)
In real systems, consent is represented by artifacts such as:
- explicit agreements
- norms
- safe words / stop mechanisms
- boundaries and enforcement
- reversible commitments
These are consent artifacts: cached structures that keep coupling safe.
14.5 Breach
A breach occurs when an interaction crosses agreed bounds.
Mechanically:
- breach increases μ (revision cost)
- decreases Consent(B,A)
- increases variance of future costs
This pushes the system from trust-dynamics toward power-dynamics.
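The direction of these changes can be sketched as a state update. The section states only that μ rises and consent falls, so the multiplicative rule and the severity parameter below are hypothetical, and the variance effect is not modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Coupling:
    mu: float       # weight on revision cost
    consent: float  # Consent(B,A) in [0, 1]

def apply_breach(state, severity):
    """Return the coupling after a breach; severity in [0, 1] is an assumed scale."""
    if not 0.0 <= severity <= 1.0:
        raise ValueError("severity must lie in [0, 1]")
    return Coupling(
        mu=state.mu * (1.0 + severity),            # revision becomes costlier
        consent=state.consent * (1.0 - severity),  # consent is reduced
    )

before = Coupling(mu=1.0, consent=0.8)
after = apply_breach(before, severity=0.5)

print(before)
print(after)
```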
15) Ethics and Morality (Compression Heuristics Under Coupling)
Ethics is modeled here as rule-like compression for social coordination under finite attention.
15.1 Why morality exists (mechanically)
Social life is high-dimensional. Moral rules are:
- low-description heuristics
- that compress expected outcomes across many contexts
They reduce:
- decision cost
- negotiation overhead
- model uncertainty
15.2 Heuristic validity and domain
A moral rule R is useful when:
E[C_society | follow R] < E[C_society | no rule]
But every heuristic has a domain; outside-domain use creates error.
Moral conflict often signals:
- domain mismatch
- competing compressions
- unmodeled externalities
15.3 Harm principle (compression version)
A compact ethical primitive compatible with this framework:
Harm is imposed, non-consensual compression work that increases another system’s long-run cost.
Formally (schematic):
Harm(A→B) ∝ E[C_B,future | A] − E[C_B,future | ¬A]
with the additional condition that Consent(B,A) is low.
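The schematic harm measure can be written as a cost difference gated by consent. The text says only that consent must be "low", so the numeric threshold here is an assumption, as are the example values.

```python
def harm(expected_cost_with_action, expected_cost_without_action,
         consent, consent_threshold=0.5):
    """Harm(A→B) ∝ E[C_B,future | A] - E[C_B,future | not A], when consent is low.

    Returns 0.0 when consent meets the (assumed) threshold or when the
    action does not raise B's expected future cost.
    """
    delta = expected_cost_with_action - expected_cost_without_action
    if consent >= consent_threshold or delta <= 0.0:
        return 0.0
    return delta

print(harm(12.0, 8.0, consent=0.1))  # imposed and costly: counts as harm
print(harm(12.0, 8.0, consent=0.9))  # same cost, but consented to
print(harm(7.0, 8.0, consent=0.1))   # imposed, but lowers future cost
```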
15.4 Justice as cost distribution
Justice concerns how compression costs and benefits are distributed.
- Exploitation: one system externalizes its compression costs onto others
- Fairness: costs are shared proportionally to benefits and agency
A toy measure:
Exploitation(A,B) ∝ (Cost imposed on B by A) − (Benefits returned to B)
15.5 Virtues as stable policies
Virtues can be treated as stable attention-allocation policies that:
- reduce harm risk
- preserve consent
- keep gradients accessible
Examples (mechanically framed):
- honesty: reduces model divergence and hidden error
- humility: lowers revision resistance; keeps H_expanded reachable
- compassion: allocates attention to others’ cost surfaces
15.6 The ethics–power interface
Ethical breakdown is strongly predicted by:
- high power asymmetry
- low consent artifacts
- high imposed revision cost
Ethics without consent collapses into compliance.
16) System Summary (Extended)
- Attention allocates compression effort
- Pointing creates modeling targets
- Interestingness is compression improvement
- Awe signals the need for new representational bases
- Artifacts are cached compression
- Love and trust are shared compression strategies
- Grief is forced recompression after loss
- Meaning is compression leverage
- Power constrains compression paths
- Consent is the boundary condition that keeps coupling safe
- Ethics is compression heuristics for coordination under coupling
Or compactly:
Reality is attention, compressed and slowed.