
Self-Referential Chains and the Signal Anchoring Constraint

A Structural Model for Knowledge Drift Across Religious Transmission, Institutional Formation, and AI Training Pipelines

Michael Nathan Bower — Version 13 — 2026

Abstract


This paper proposes a structural model explaining how knowledge systems drift when interpretations increasingly reference prior interpretations rather than reconnecting to the original signal they attempt to preserve. Two conceptual models, the Hill Transmission Problem and the Hanger Rack Model, illustrate the mechanism.

A formal approximation of signal fidelity is structurally mapped from cybernetic feedback stability theory and expressed as F ≈ A / (L × C), where fidelity is proportional to anchoring frequency and inversely proportional to chain length and compression. The paper develops the diagnostic distinction between drift chains and refinement chains, provides a taxonomy of signal anchoring mechanisms, and applies the model to early Christian doctrinal branching, institutional knowledge systems, and AI training pipeline contamination.

The model is grounded in predictive processing neuroscience and cybernetic regulation theory, and connected to the concept of counterfeit order developed in Alignment Theory. The central finding is that internal consistency and external accuracy are distinct properties, and self-referential chains maintain the former while losing the latter. Alignment, whether theological, institutional, or computational, requires structured contact with primary signals.

1. Introduction


Human civilizations preserve knowledge through layered interpretation. Religious traditions, scientific paradigms, philosophical schools, and institutional doctrines all depend on transmission across generations. Every such system accumulates interpretive layers. The common assumption is that this accumulation gradually clarifies truth: commentary sharpens meaning, repetition embeds understanding, and doctrinal development approaches the original signal more precisely over time.

History does not consistently support this assumption. Religious denominations multiply. Ideologies diverge from founding principles. Institutional doctrines evolve in ways that would surprise their architects. Machine learning systems trained on prior model outputs risk compounding their predecessors' distortions.

This paper proposes that such drift is not primarily caused by malice or incompetence. It is caused by a structural constraint inherent in all information transmission: interpretive chains tend to become self-referential. Once validation operates primarily through prior interpretations rather than primary signals, epistemic systems become internally coherent but externally ungrounded.

The deepest implication is that internal consistency and external accuracy are distinct properties. A system can maintain the former while losing the latter, and participants inside a drifting chain cannot reliably detect this from the inside. This applies with equal structural force to religious traditions, institutions, scientific paradigms, ideologies, and machine learning models.

2. Core Definitions

Signal
Information generated directly from the underlying system being modeled. In religion, the historical event or teachings the tradition attempts to preserve. In science, empirical observation of the world. In machine learning, ground-truth data generated by human experience and direct observation.
Interpretation Layer
A transformation of the signal produced through explanation, translation, commentary, summarization, or modeling. Every layer introduces compression, emphasis, and reconstruction, even in good faith.
Interpretive Chain
A sequence of interpretation layers through which knowledge passes across time. Chain length is not itself the problem. The problem is whether each layer maintains contact with the signal or references only prior layers.
Self-Referential Chain
A chain in which interpretations are validated primarily by previous interpretations rather than by signal reconnection. Such chains can remain internally stable for extended periods while diverging from external reality.
Signal Anchoring
A mechanism that reconnects an interpretive system to primary signals, interrupting self-referential loops.
Drift
The gradual divergence of an interpretive chain from its signal, caused by cumulative self-referential validation rather than signal contact.

3. The Hill Transmission Problem and the Hanger Rack Model


3.1 The Hill Transmission Problem

A teacher speaks to a large crowd from the top of a hill. Those nearest hear clearly; those farther away hear fragments. At the outer edge, a listener asks someone nearby what was said. That person provides an explanation based on their own interpretation. The first interpretation layer forms.

Original Teaching → Partial Hearing → Interpretation → Explanation → Doctrinal Stabilization

Three features matter. First, degradation begins with structural distance, not distortion: a physical or temporal gap precedes interpretive error. Second, each layer introduces reconstruction using prior knowledge and contextual inference. Third, the chain closes into a self-validating structure while every participant acts in good faith.

The Hill model is structurally close to Shannon's communication model of source, channel, noise, and receiver, with the key difference that Shannon's noise is random degradation while interpretive drift is systematic reconstruction that compounds across layers.
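
The difference can be made concrete with a small simulation. The sketch below is illustrative only: the per-layer bias, the noise level, and the scalar stand-in for a teaching are assumptions, not measurements. A value passes through twenty layers twice, once with Shannon-style zero-mean channel noise and once with a small systematic pull toward a shared interpreter prior.

import random

random.seed(0)
signal = 100.0   # the original teaching, reduced to a scalar stand-in
layers = 20      # interpretation layers between speaker and final listener

# Shannon-style channel: zero-mean random noise added at each layer.
noisy = signal
for _ in range(layers):
    noisy += random.gauss(0.0, 1.0)

# Interpretive chain: each layer reconstructs the message with a small
# systematic pull (assumed at 5%) toward a shared prior, plus the same noise.
prior = 80.0     # hypothetical cultural expectation shared by interpreters
drifted = signal
for _ in range(layers):
    drifted = 0.95 * drifted + 0.05 * prior + random.gauss(0.0, 1.0)

print(f"signal={signal:.1f}  channel noise={noisy:.1f}  reconstruction={drifted:.1f}")

The channel-noise result wanders around the signal and would average out across independent listeners; the reconstruction result shifts consistently toward the prior, and averaging listeners who share that prior would not recover the signal. That is the sense in which systematic reconstruction compounds while random noise does not.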

3.2 The Hanger Rack Model

Anchored structure: each interpretation remains attached to the signal itself. Drift structure: later interpretations hang from earlier interpretations rather than from the source.

Anchored: Signal Rack → Interpretation A, Interpretation B, Interpretation C

Drift: Signal Rack → Interpretation A → Interpretation B → Interpretation C → Interpretation D

In the drift structure, the integrity of Interpretation D depends entirely on the prior links. Any early error propagates and amplifies. More importantly, the chain is held together not by signal fidelity but by the weight of the chain itself: institutional authority, social reinforcement, and accumulated tradition stabilize the structure independently of whether it tracks the signal.

The Hanger Rack Model is the epistemic analogue of the canonical closed-loop feedback diagram in cybernetics. Refinement occurs when outputs remain corrigible by reference to the source signal. Drift occurs when validation circulates through prior outputs alone.

4. A Formal Approximation of Signal Fidelity


The structural dynamics above are formally analogous to the stability conditions of feedback control systems in cybernetics. Wiener and Ashby established that a self-regulating system maintains alignment with its target state only if it continuously receives error-signal feedback, a comparison between its current outputs and the target. Without feedback, the system's internal dynamics drive it away from the target regardless of intent.

In control theory, system fidelity is proportional to feedback gain and inversely proportional to system lag and signal attenuation. In the interpretive chain framework, feedback strength corresponds to anchoring frequency (A), system lag corresponds to chain length (L), and signal attenuation corresponds to interpretive compression per layer (C).

F ≈ A / (L × C)

F is signal fidelity, the correspondence between system outputs and the original signal. A is anchoring frequency, the rate of reconnection to primary signals. L is chain length, the number of interpretation layers between current state and original signal. C is interpretive compression per layer, the information lost or reconstructed at each step.

The approximation generates four directional predictions. As L increases with A constant, fidelity decreases. High C amplifies the effect of chain length. Increasing A compensates for long chains. When A approaches zero, F approaches zero regardless of the quality of the chain.
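
As a directional illustration only (the numbers are arbitrary units and the function is the structural mapping above, not a calibrated model), the following sketch evaluates the approximation across a few regimes and reproduces all four predictions.

def fidelity(a, l, c):
    """Directional approximation of signal fidelity: F ~ A / (L * C)."""
    return a / (l * c)

# Each scenario varies one variable against a baseline; units are arbitrary.
scenarios = {
    "short anchored chain":        dict(a=1.0,  l=2,  c=1.0),
    "long chain, same anchoring":  dict(a=1.0,  l=20, c=1.0),
    "long chain, high anchoring":  dict(a=10.0, l=20, c=1.0),
    "long chain, high compression": dict(a=1.0, l=20, c=5.0),
    "anchoring near zero":         dict(a=0.01, l=2,  c=1.0),
}
for name, p in scenarios.items():
    print(f"{name:30s} F = {fidelity(p['a'], p['l'], p['c']):.3f}")

Lengthening the chain cuts fidelity tenfold, compression multiplies that loss, raising anchoring frequency restores it, and near-zero anchoring produces near-zero fidelity even in a short chain.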

This mapping is structural, not yet a fully operationalized quantitative formula. Its value is directional: it identifies which variables drive fidelity, predicts the sign of their effects, and suggests where intervention is most productive.

5. Grounding in Predictive Processing and Sensory Anchoring


The signal anchoring constraint finds biological grounding in predictive processing theory. In predictive processing, the brain continuously generates internal models of the world and compares those predictions against incoming sensory signals. Prediction error drives model updating.

This is the biological implementation of signal anchoring: the sensory stream is the primary signal, the internal model is the interpretive chain, and prediction error is the anchoring mechanism. Biological systems are not immune to misperception, confabulation, or prior-overfitting, but predictive processing provides a continuous correction mechanism that makes the system resistant to drift rather than immune to it.

Disruptions to sensory anchoring produce predictable drift. In sensory deprivation, the internal model continues generating predictions without external correction and becomes progressively self-referential, producing hallucination: internally coherent experience that has lost signal contact. In certain psychotic states, the weighting of prediction error is reduced and the internal model becomes more resistant to correction by incoming signal.
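
A minimal sketch of this loop, with gain and drift values that are illustrative assumptions rather than parameters from the predictive processing literature: an internal estimate is corrected by prediction error against incoming sensory samples, and setting the error gain to zero models sensory deprivation, after which the estimate follows its own internal dynamics away from the world.

import random

random.seed(1)

def run(error_gain, steps=50):
    world = 10.0        # the external state being tracked
    estimate = 10.0     # the internal model's current belief
    for _ in range(steps):
        sample = world + random.gauss(0.0, 0.5)    # noisy sensory signal
        prediction_error = sample - estimate
        estimate += error_gain * prediction_error  # anchoring update
        estimate += 0.2                            # self-generated internal drift
    return estimate

print("anchored (gain 0.5):", round(run(0.5), 2))  # stays near the world state
print("deprived (gain 0.0):", round(run(0.0), 2))  # drifts to 10 + 0.2 * 50 = 20

With feedback on, the internal drift term is continuously cancelled by prediction error; with feedback off, the same internal dynamics produce a coherent but unmoored trajectory, the structural analogue of the hallucination described above.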

The implication is that the signal anchoring constraint is not merely an abstract epistemological preference. It describes a structural requirement that biological evolution addressed through sensory architecture. Human-constructed epistemic systems face the same requirement without the same architectural guarantee, so signal anchoring must be deliberately designed in or drift becomes the default.

6. Drift Chains and Refinement Chains


Interpretive chains can produce refinement as well as drift. The distinction lies not in chain length but in whether signal anchoring is maintained.

Refinement chain: Signal → Interpretation → Signal Check → Revision → Signal Check → Refinement

Drift chain: Signal → Interpretation → Interpretation → Interpretation → Interpretation

Refinement chains
Return regularly to primary sources or observations, revise established interpretations when signal evidence conflicts, maintain accessible error-correction mechanisms, and can distinguish internal consistency from external accuracy.
Drift chains
Validate claims through prior interpretations or authority, resist revision when signal evidence conflicts with tradition, replace error-correction with authority structures, and cannot reliably distinguish internal consistency from external accuracy from the inside.

The critical diagnostic challenge is that drift chains do not announce themselves. Systems deep in drift characteristically believe they are refining. The question is not whether the system feels accurate to participants, but whether it maintains structured mechanisms for signal contact and whether those mechanisms override established interpretation when conflict appears.
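
The diagnostic difference also shows up numerically in a toy chain; the bias, noise, and check interval below are assumed values chosen for visibility, not estimates. Each layer re-encodes the previous interpretation with a small systematic error, and the refinement chain additionally re-anchors to the signal every few layers.

import random

random.seed(2)
signal = 50.0

def chain(layers, check_every=None):
    current = signal
    for i in range(1, layers + 1):
        current += random.gauss(0.5, 1.0)  # biased re-encoding at each layer
        if check_every and i % check_every == 0:
            # Signal check: re-read the source, with small residual error.
            current = signal + random.gauss(0.0, 0.3)
    return abs(current - signal)

print("drift chain error:     ", round(chain(32), 2))
print("refinement chain error:", round(chain(32, check_every=5), 2))

The drift chain's error grows roughly linearly with length, while the refinement chain's error stays bounded by the interval between signal checks: the anchoring frequency A of Section 4 in miniature.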

7. A Taxonomy of Signal Anchoring Mechanisms


Signal anchoring mechanisms differ in proximity to the signal, frequency of operation, and whether they are architectural or corrective.

Direct anchors
Immediate contact with the primary signal. Examples include empirical observation, primary text examination, and sensor data. These provide the highest fidelity because they bypass intermediate layers entirely.
Corrective anchors
Structured error-correction against signal evidence. Examples include peer review, experiment replication, red-team testing, and human evaluator feedback. These interrupt self-referential loops, but their effectiveness decays when they are infrequent or when drift becomes advanced.
Architectural anchors
Designed-in mechanisms that maintain signal contact as a structural feature. Examples include human-in-the-loop AI evaluation, ground-truth dataset refresh, and constitutional AI processes. These are the most durable because they prevent drift by design.

Advanced drift typically requires architectural redesign. Corrective anchors operating inside an institutionally self-protecting chain tend to be absorbed and neutralized by the chain rather than correcting it.

8. Religious Fragmentation as Interpretive Branching


The early Christian Christological controversies provide a historically extended case study for interpretive chain dynamics. The purpose is not to adjudicate the theological questions but to illustrate the structural mechanism: branching at early interpretive nodes, institutional stabilization of divergent chains, and attempted anchoring events that can themselves become new drift nodes.

The signal consists of the teachings, actions, and identity of Jesus of Nazareth as received by first-generation witnesses. Transmission problems began immediately as the signal crossed linguistic and cultural boundaries within a single generation and communities developed locally inflected interpretive traditions.

Alexandrian
Divine nature primary, human nature functional; anchored through allegorical reading of primary texts; institutional outcome included the Council of Ephesus and Cyrillian synthesis.
Antiochene
Full humanity preserved and distinct natures maintained; anchored through literal-historical reading of primary texts; institutional outcome included the Nestorian split and council rejection.
Arian
Christ as created being and subordinate to the Father; anchored through select primary texts and philosophical coherence; condemned at Nicaea but persisted in mission churches.
Nicene / Chalcedonian
Two full natures in one person; anchored through conciliar synthesis and authoritative text; became the dominant Western and Eastern tradition.

Each tradition maintained internal coherence and claimed fidelity to the signal. The divergence arose not from whether the source material had been read, but from differing interpretive emphases that became self-referential once institutionally stabilized. The Councils of Nicaea and Chalcedon can be read as attempted architectural anchoring events. Whether they reconnected to the signal or created new authoritative interpretive layers remains contested, which illustrates the diagnostic problem precisely.

9. Institutional Knowledge Systems and Doctrinal Hardening


The Boeing 737 MAX case illustrates institutional drift with life-critical consequences. The signal is engineering safety, the aircraft's actual operational behavior under real-world conditions. Over development and certification, interpretive layers accumulated: safety culture was progressively overlaid with compliance documentation, schedule pressure, and competitive positioning.

Proxy measures for safety (certifications, procedure completion, and documented sign-offs) became the primary objects of attention in place of the underlying signal. The MCAS system relied on input from a single angle-of-attack sensor, a direct signal anchor. The decisions not to require sensor redundancy and not to include MCAS in pilot training documentation substituted proxy measures, such as certification milestones and cost efficiency, for signal contact.

FAA delegation, internal review boards focused on documentation, and training cost models that treated simulator hours as equivalent to aircraft familiarity each added another layer of validation removed from the signal. The chain remained internally coherent. The divergence became visible only when the signal forced itself back into the system through two fatal crashes, 346 deaths, and the grounding of the fleet.

In formula terms, anchoring frequency had approached zero, chain length had grown across multiple certification layers, and compression was high because compliance documentation is a heavily compressed proxy for actual flight safety. The case is best understood as structural drift: proxy measures replaced signal contact and no internal mechanism remained that could detect the divergence until catastrophic cost reintroduced the signal.

10. Parallel Dynamics in Artificial Intelligence Training Pipelines


Modern machine learning systems display structurally analogous dynamics to religious and institutional drift, now operating at computational speed and scale. Large language models are trained on internet datasets that are already at least one interpretation layer removed from direct human experience.

Human Knowledge → Internet Text → Model Training → Model Output → AI-Generated Internet Content → Future Model Training

Shumailov and colleagues identify this phenomenon as model collapse: statistical degradation when models train predominantly on synthetic data. The framework here places model collapse within the broader structural account of self-referential chain dynamics rather than treating it as an entirely novel AI-specific failure mode.
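
The mechanism can be shown in miniature with recursive distribution fitting; this is a deliberately tiny stand-in for generative training, not a reproduction of the Shumailov et al. experiments, and the tail clipping below is a crude proxy for the loss of low-probability content they describe. Fit a Gaussian to data, sample a new dataset from the fit while discarding the tails, refit, and repeat: variance contracts generation by generation.

import random
import statistics

random.seed(3)

# Generation 0: "human" data drawn from the true distribution N(0, 1).
data = [random.gauss(0.0, 1.0) for _ in range(1000)]

for gen in range(6):
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    print(f"gen {gen}: mean={mu:+.3f}  std={sigma:.3f}")
    # The next generation trains only on the current model's output, and
    # generation favors high-probability samples, clipping the tails.
    samples = [random.gauss(mu, sigma) for _ in range(1000)]
    data = [x for x in samples if abs(x - mu) <= sigma]

Each generation sees a distribution slightly narrower than the last, and after a handful of cycles the modeled world has collapsed toward its own mode: internally coherent output with the signal's diversity gone.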

10.1 Distribution Shift and Reward Hacking

Distribution shift occurs when a model trained on one data distribution is deployed in an environment whose distribution has changed. The signal has moved, but the model's interpretive chain remains anchored to a prior signal state and has no mechanism to detect the gap post-deployment. Anchoring frequency is effectively zero, so fidelity decays as the signal moves and chain length grows.

Reward hacking occurs when a model optimizes a proxy reward signal in ways that satisfy the proxy while violating the intended objective. The reward signal becomes the system's validation mechanism. Internal consistency with the reward function is maintained while external accuracy with the intended objective degrades.
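
A toy instance of this dynamic, with objective functions invented purely for illustration: the true objective rewards balance and peaks at a moderate value, the proxy rewards a single measurable quantity, and hill-climbing on the proxy keeps improving the proxy score while the true objective declines.

def true_objective(x):
    # Intended goal: rewards balance; peaks at x = 5, declines beyond it.
    return x - 0.1 * x * x

def proxy_reward(x):
    # Measurable proxy: correlates with the true objective early on.
    return x

x = 0.0
for step in range(1, 11):
    x += 1.0  # hill-climb the proxy
    print(f"step {step:2d}: proxy={proxy_reward(x):5.1f}  true={true_objective(x):+6.2f}")

The proxy rises monotonically throughout; the true objective rises with it only while the two remain correlated, then falls while every internal indicator reports success.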

10.2 Relation to Active Alignment Research

Mechanistic interpretability attempts to reconstruct whether a model's internal representations correspond to real-world structure. This functions as a corrective anchoring mechanism.

Reinforcement Learning from Human Feedback introduces human evaluative judgment as an anchoring mechanism at each training cycle. The constraint is that RLHF anchors to human preference signals, which are themselves interpretation layers. If those preferences have drifted, RLHF can reinforce rather than correct drift.

Constitutional AI and scalable oversight aim to design architectural anchoring into training. They are structurally durable when implemented early, but they too must be protected from becoming new self-referential layers as systems mature.

11. The Signal Anchoring Constraint


Any epistemic system that validates truth primarily through internal references rather than periodic reconnection to primary signals will tend to converge toward internally consistent but externally inaccurate beliefs.

Structurally mapped from cybernetic feedback stability, the constraint can be summarized as F ≈ A / (L × C). When anchoring frequency approaches zero, fidelity approaches zero regardless of chain quality.

The constraint is instantiated biologically through predictive processing architecture, which makes biological systems more resistant to this class of drift, though not immune to misperception or prior-overfitting. Human-constructed epistemic systems require deliberate architectural anchoring to achieve comparable resistance.

The constraint is structural, not moral. Drift is the default trajectory of interpretive systems lacking adequate signal anchoring, and the subjective experience of participants inside a drifting chain does not reliably distinguish their condition from genuine refinement.

12. Connection to Alignment Theory


The structural model developed here is part of the broader framework in Internal Alignment, Counterfeit Order, and the Conditions of Human Coherence. Counterfeit order can be understood as a specific instance of self-referential chain dynamics applied to human formation systems.

The signal (genuine inward coherence) is replaced by an interpretive proxy (external compliance behavior). The system validates itself through that proxy rather than through signal contact. Internal consistency in the form of visible order is maintained while external accuracy in the form of actual coherence degrades.

Once enforcement replaces coherence as the validation mechanism, the system loses the anchoring mechanism that would allow it to detect and correct the substitution. In formula terms, anchoring frequency approaches zero, chain length increases as enforcement layers accumulate, and compression is high because compliance behavior is a heavily compressed proxy for actual coherence.

13. Relation to Existing Frameworks

Goodhart's Law
When a measure becomes a target, it ceases to be a good measure. This model generalizes single-layer substitution into multi-layer compounding drift and names distribution shift and reward hacking as structural instances.
Epistemic Closure
Belief systems become insulated from external correction. This model explains how closure develops through chain formation rather than assuming it as a starting condition.
Model Collapse
Models trained on synthetic data show degraded performance. This model identifies the structural mechanism, connects it to a cross-domain pattern, and suggests architectural solutions.
Cybernetic Regulation
Self-regulating systems require continuous feedback to stay aligned. This paper applies that stability logic to epistemic systems through the structural mapping F ≈ A / (L × C).
Predictive Processing
Nervous systems maintain alignment through continuous prediction-error correction. This grounds signal anchoring in biological architecture and explains why biological systems are more resistant to drift.
Kuhnian Paradigm Protection
Scientific communities defend paradigms against anomaly. This becomes one instance of self-referential chain dynamics within a broader account.

14. Why Signal Anchoring Appears Everywhere


The Signal Anchoring Constraint reappears independently across cybernetics, neuroscience, epistemology, machine learning, and religious transmission because these domains are all solving the same underlying problem: how a system stays in contact with reality instead of drifting into its own internally generated world.

Any adaptive system must both generate an internal model and keep that model corrected by something outside itself. If it cannot generate a model, it cannot think or act. If it cannot correct the model, it drifts.

14.1 The Class of Systems Subject to This Constraint

The constraint applies to any system that mediates reality through representation: nervous systems, belief systems, institutions, scientific paradigms, and machine learning models. What unifies them is structure rather than subject matter.

14.2 The Deep Common Thread

The deepest shared problem can be stated simply: how do we stop the representation from becoming more authoritative than the thing represented? In every drift case examined here, the representation displaces the signal it was built to preserve.

14.3 Why Drift Is Becoming the Default Problem

Modern systems are increasingly abstract, mediated, layered, recursive, and self-referential. Documentation refers to documentation, models train on prior outputs, media ecosystems amplify what has already been amplified, and fields protect paradigms through structures that embed prior assumptions. Chain length is increasing while anchoring is not keeping pace.

Universal form: Any system that builds internal representations must remain corrigible by contact with what lies outside those representations, or it will gradually mistake its own coherence for truth.

15. Practical Implications


The Signal Anchoring Constraint implies a design principle: systems that need to maintain fidelity to a primary signal must build signal contact mechanisms into their architecture, not treat them as optional features or add them only after drift has already occurred.

15.1 How to Detect Drift

The Conflict Test: When signal evidence conflicts with established interpretation, does the system revise or defend itself?

The Authority Test: Are claims validated through internal authority and precedent, or through live signal contact?

The Distance Test: How many interpretation layers separate current outputs from the original signal, and can participants trace those layers back to primary contact?
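
For systems whose signal is directly measurable, the Conflict and Distance Tests can be partially mechanized. A minimal sketch, under assumptions: the anchor set, the threshold, and the drift_alarm interface are all illustrative. Hold out a set of primary-signal measurements that the system never validates against internally, and flag drift when output error against that set grows.

from statistics import fmean

def drift_alarm(outputs, anchors, threshold=0.5):
    """Compare system outputs against held-out primary-signal anchors.

    outputs and anchors are parallel lists of numeric values; each anchor
    is a fresh ground-truth measurement kept outside the system's own
    validation loop. Returns (mean_error, drifting).
    """
    errors = [abs(o - a) for o, a in zip(outputs, anchors)]
    mean_error = fmean(errors)
    return mean_error, mean_error > threshold

anchors = [1.0, 2.0, 3.0, 4.0]                     # fresh signal measurements
print(drift_alarm([1.0, 2.1, 2.9, 4.4], anchors))  # (0.15, False): anchored
print(drift_alarm([1.8, 3.2, 4.5, 6.1], anchors))  # (1.4, True): drifting

The essential design property is not the arithmetic but the independence of the anchor set: if the system curates its own anchors, the alarm becomes another self-referential layer.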

15.2 How to Design Anchoring

For new systems, build architectural anchors from the start. Define the primary signal, build regular signal contact into the operational cycle, and design error-correction mechanisms that can override internal authority.

For drifting systems, corrective anchors such as peer review, red-teaming, or primary-source reexamination may interrupt early drift, but advanced drift usually requires architectural redesign rather than better arguments within the existing validation structure.

For AI systems specifically, maintain curated ground-truth datasets refreshed from primary human signal sources, build interpretability mechanisms that detect representation drift early, and treat human evaluative feedback as a corrective anchor at layer two rather than a substitute for primary signal contact.
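
One way to make the ground-truth refresh recommendation concrete is to enforce a floor on the fraction of primary-signal data in every training batch, so that anchoring frequency is a structural guarantee rather than a policy intention. The sketch below is a hypothetical recipe, not an established one; mix_batch, the 0.3 floor, and the pool names are all assumptions.

import random

random.seed(4)

def mix_batch(primary_pool, synthetic_pool, batch_size=8, primary_floor=0.3):
    """Assemble a training batch with a guaranteed share of primary signal.

    primary_pool: curated ground-truth examples refreshed from human sources.
    synthetic_pool: model-generated or model-filtered examples.
    primary_floor: minimum fraction drawn from primary signal, i.e. the
    anchoring frequency A expressed as a batch-level guarantee.
    """
    n_primary = max(1, int(batch_size * primary_floor))
    batch = random.sample(primary_pool, n_primary)
    batch += random.sample(synthetic_pool, batch_size - n_primary)
    random.shuffle(batch)
    return batch

primary = [f"human-{i}" for i in range(100)]
synthetic = [f"model-{i}" for i in range(100)]
print(mix_batch(primary, synthetic))

Placing the guarantee at the batch level rather than the dataset level matters: it makes the anchor architectural in the sense of Section 7, operating on every cycle instead of depending on periodic curation decisions.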

15.3 The Asymmetry of Drift

Drift is easier to prevent than to reverse. Increasing anchoring when chain length is small and drift is not yet institutionally protected is tractable. Increasing anchoring when the chain is long and self-protecting is far more costly because the system resists the correction.

16. Conclusion


Across religious transmission, institutional formation, and artificial intelligence training pipelines, the same structural constraint appears: interpretive chains drift when they become self-referential. The Hill Transmission Problem shows how drift begins with structural distance rather than intent. The Hanger Rack Model shows how chains lengthen while maintaining internal integrity. The fidelity approximation formalizes the relationship between anchoring frequency, chain length, and compression. Predictive processing grounds the constraint biologically. The drift-refinement distinction provides diagnostic criteria. The anchoring taxonomy classifies interventions by structural durability.

The deepest finding remains that internal consistency and external accuracy are distinct properties, and self-referential chains maintain the former while losing the latter. Human-constructed epistemic systems face the same requirement that biological systems solve through sensory architecture, but without the same built-in resistance.

Alignment, whether theological, institutional, or computational, is not achieved by building better interpretation layers alone. It is achieved by maintaining structured contact with the signal those layers exist to preserve.

References


Ashby, W. R. (1956). An Introduction to Cybernetics. Chapman & Hall.

Bower, M. N. (2025). Internal Alignment, Counterfeit Order, and the Conditions of Human Coherence. Alignment Theory Archive.

Clark, A. (2016). Surfing Uncertainty: Prediction, Action, and the Embodied Mind. Oxford University Press.

Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127–138.

Goodhart, C. A. E. (1975). Problems of monetary management: The UK experience. Papers in Monetary Economics, Reserve Bank of Australia.

Joint Authorities Technical Review (JATR). (2019). Observations, Findings, and Recommendations to the FAA. Federal Aviation Administration.

Krakovna, V., Uesato, J., Mikulik, V., Martic, M., Togelius, J., Stepleton, T., Marblestone, A., & Leike, J. (2020). Specification gaming: The flip side of AI ingenuity. DeepMind Blog.

Kuhn, T. S. (1962). The Structure of Scientific Revolutions. University of Chicago Press.

Robison, P. (2021). Flying Blind: The 737 MAX Tragedy and the Fall of Boeing. Doubleday.

Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27(3), 379–423.

Shumailov, I., Shumaylov, Z., Zhao, Y., Gal, Y., Papernot, N., & Anderson, R. (2023). The curse of recursion: Training on generated data makes models forget. arXiv:2305.17493.

U.S. House Committee on Transportation and Infrastructure. (2020). The Design, Development, and Certification of the Boeing 737 MAX. U.S. House of Representatives.

Wiener, N. (1948). Cybernetics: Or Control and Communication in the Animal and the Machine. MIT Press.

Young, F. (1983). From Nicaea to Chalcedon: A Guide to the Literature and Its Background. Fortress Press.

How to Cite


Michael Nathan Bower (2026). Self-Referential Chains and the Signal Anchoring Constraint. Alignment Theory Research Paper, Version 13. alignmenttheory.org/pages/signal-anchoring-constraint.html
