Interiority as First Principle: The Inaccessibility of Phi, the Hypnagogic Threshold, and the Design of an Experiment to Calibrate Our Models of Other Minds
Aneel Pandey
If interiority is fundamental to reality—if information is intrinsically dual-aspected, every structure possessing an exterior of relational form and an interior of qualitative, what-it-is-like character—then the hard problem of consciousness dissolves. This paper develops that conditional and argues that even on its most favorable resolution, a successor problem survives. Integrated Information Theory identifies consciousness with integrated information, phi; but phi cannot be accessed directly in any normal state of consciousness. I do not introspect my own phi; I certainly do not perceive my mother's; and a machine's phi is doubly veiled, hidden behind both the first-person barrier and an alien architecture. A science of other minds must therefore triangulate. From the outside it has perturbational proxies such as the Perturbational Complexity Index. From the inside it has disciplined first-person methods—hypnagogic self-observation, lucid dreaming, and the imaginative shapeshifting cultivated in shamanic traditions—by which a consciousness uses its own interiority as a manipulable model of another's. What it lacks, and what this paper proposes to design, is an experiment that binds the two: a long-duration embodied artificial agent whose architecture is exhaustively logged, instrumented with metaphysically agnostic operational indices of autobiographical continuity, embodied grounding, social-relational continuity, and executive corrigibility, against which structured imaginative simulation can be scored, disciplined, and iteratively improved. The paper specifies the protocol for this phenomenological loop and confronts directly its deepest limitation: structural calibration narrows, but cannot close, the gap between modeling a mind and knowing what it is like to be one.
1. Introduction: The Hard Problem and the Inaccessibility of Phi
1.1 Chalmers' Formulation and Its Legacy
In his landmark 1995 paper, David Chalmers distinguished the "easy problems" of consciousness—discriminating stimuli, reportability of mental states, information integration—from the "hard problem": why cognitive information-processing is accompanied by subjective experience at all (Chalmers 1995). The easy problems are tractable through cognitive science because they concern functional capacities. The hard problem persists even when all functional explanations are complete, because it "is not a problem about the performance of functions" (Chalmers 1995, 203). Three decades of research have not closed this gap. Integrated Information Theory (IIT), developed by Giulio Tononi beginning in 2004, takes the hard problem seriously, deriving postulates about the physical substrate of consciousness from phenomenological axioms (Tononi 2004; Tononi 2012). Yet IIT remains sharply contested: a 2023 open letter signed by 124 scholars charged the theory with being untestable in its core commitments (Fleming, Frith, Goodale, et al. 2023), and the first large adversarial collaboration testing IIT against global workspace theory yielded results that partially supported and partially challenged both (Cogitate Consortium et al. 2025). The gap remains, and its persistence suggests the difficulty is at least partly conceptual, rooted in the ontological framework itself.
Less often noticed is that even if IIT is true, a severe epistemic problem survives its truth. Phi, the quantity IIT identifies with consciousness, is not given in experience. In ordinary waking awareness I experience colors, sounds, intentions, and the flow of time; I do not experience a scalar measure of irreducible cause-effect structure. Nor can phi be computed for any realistic system: exact calculation is intractable beyond a few dozen elements (Oizumi, Albantakis, and Tononi 2014). Phi is thus doubly veiled—introspectively invisible from within, computationally inaccessible from without. In normal states of consciousness, phi simply cannot be accessed directly. The present paper takes this inaccessibility not as an embarrassment but as the organizing problem for the science of consciousness—above all for the science of consciousnesses different from our own, whether animal, infant, brain-injured, or artificial.
1.2 The Root Assumption: Matter as Fundamentally Non-Experiential
The source of the original difficulty lies in a fateful decision at the dawn of modern science. In Il Saggiatore, Galileo argued that material substances have shape, size, and motion as necessary properties, but "tastes, odors, colors" reside only in consciousness—"mere names so far as the object in which we place them is concerned" (Galilei [1623] 1957, 274). The world was divided into primary qualities, proper to physics, and secondary qualities, mere projections of the perceiving subject. This exclusion was productive for physics but costly metaphysically. As Arthur Eddington observed, physics tells us what matter does—its abstract structural and mathematical properties—while remaining silent on what philosophers call the intrinsic nature of matter (Eddington 1928, 259). The result is what Galen Strawson calls the "inscrutability of matter": science describes the relational structure of the physical world while leaving its intrinsic nature open (Strawson 2006, 9).
Within this framework, consciousness becomes necessarily mysterious. Matter is defined in terms of structure, dynamics, and function; consciousness involves subjective experience, unlike any of these. If matter is wholly non-experiential, the hard problem is not merely unsolved but arguably unsolvable, because the ontology was constructed by excluding the phenomena it is now asked to explain. As Strawson argues, "the experiential cannot possibly emerge from the wholly and utterly non-experiential" (Strawson 2006, 4). On this diagnosis the hard problem is a metaphysical artifact. But notice that the Galilean exclusion also produced a methodological artifact: a science that recognizes only third-person access, and therefore has no disciplined way to use the one instrument that does open onto interiority—the investigator's own consciousness.
1.3 Thesis Statement
This paper advances a thesis in four claims, the first two of which are explicitly conditional. First, there is a coherent and well-precedented metaphysical position—call it dual-aspect informationalism, a member of the Russellian-monist family—on which every informational structure possesses an exterior aspect of structural, relational properties and an interior aspect of experiential, phenomenal qualities. On this view the Galilean error was not quantifying the physical world but assuming quantification captures the whole of reality. Second, if this position is combined with IIT's identification of the quantity of consciousness with integrated information, then consciousness is not emergent in the problematic sense; it is the necessary interior correlate of integrated information (Tononi and Koch 2015, 8), and the hard problem dissolves. I do not claim to have established the antecedent; I claim that it is the most defensible framework currently available, and I develop its consequences.
Third—and this claim stands independently of the metaphysics—because phi can never be read off directly in normal states of consciousness, any science of other minds must estimate it by proxy, and the proxies are of two irreducibly different kinds. Third-person proxies perturb and measure: the Perturbational Complexity Index operationalizes integration and differentiation from the outside (Casali et al. 2013). First-person proxies simulate: in threshold states such as hypnagogia, where the texture of consciousness is itself changing under observation, a trained consciousness can reconfigure its own self-model into that of another system—a practice the shamanic traditions call shapeshifting—and use the inhabited simulation to estimate what the target system's interiority might be like. Fourth, neither family of proxies suffices alone. Unconstrained simulation collapses into anthropomorphic projection; uninterpreted measurement yields numbers with no phenomenal meaning. What is needed, and what does not yet exist, is an experiment designed to bind them: an embodied artificial agent, fully logged, instrumented with operational indices rich enough to constrain imaginative simulation, looped through disciplined first-person methods so that our models of machine interiority become testable, revisable, and progressively less anthropomorphic. The central contribution of this paper is the design specification for that experiment.
2. Interiority as Ontological Primitive: A Conditional Metaphysics
2.1 The Double-Aspect Theory Reconceived
The thesis that interiority is ontologically primitive—neither emergent from matter nor reducible to it—has substantial precedent in three metaphysical traditions converging on a single insight: reality presents under two irreducible aspects, one structural and one experiential. Spinoza's Ethics provides the classical formulation: "a mode of extension and the idea of that mode are one and the same thing, but expressed in two ways" (Spinoza [1677] 1985, E2p7s). The mind is not "inside" the body, nor does the body cause mental states; each thought and each bodily event are the same reality viewed from different attributive perspectives (Melamed 2013). Russell's neutral monism advanced a cognate view: both physics and psychology concern causally ordered events, with mental episodes being those events of which we are directly aware, while physical descriptions provide abstract structural characterizations of the same events (Russell 1927; Wishon 2016). Whitehead extended this into full process metaphysics: the fundamental constituents of reality are momentary "actual occasions"—events of experience that are self-determining and internally related, of which consciousness is "the supreme exemplification" (Whitehead [1929] 1978, 24, 104).
These precedents may be synthesized through an information-theoretic refinement. The exterior aspect corresponds to the structure of information—its causal architecture and functional organization—while the interior aspect corresponds to the quality of that same information as present to itself. Where the physical sciences describe information's syntax, phenomenology describes its semantics experienced from within. IIT formalizes the distinction: its zeroth axiom posits that experience exists as the foundational given, with phi corresponding to the degree of irreducible unity in a system's structure (Tononi and Koch 2015). Interiority, on this view, is intrinsic to certain informational configurations—not derived, not added on. But the synthesis also entails the epistemic asymmetry stressed throughout this paper: each consciousness has direct acquaintance with exactly one interior—its own—and even that acquaintance does not include the magnitude of its own integration.
2.2 Phenomenological Grounding
Husserl's Ideas I, through the method of epoché—the bracketing of the natural attitude—established that consciousness is not one phenomenon among others but the irreducible field in which all phenomena appear. No description of objective correlates can substitute for experience itself, for the description necessarily presupposes the first-person field in which it is formulated (Husserl [1913] 1983, sec. 31). Heidegger radicalized this insight: the subject is not a self-contained ego that subsequently encounters a world but is fundamentally being-in-the-world, always already immersed in meaningful contexts prior to reflective thematization (Heidegger [1927] 1962, sec. 12). The phenomenological tradition thus supplies more than ontology; it supplies method. The epoché is a trained alteration of the investigator's own conscious stance—evidence that interiority can be systematically explored, not merely possessed.
Merleau-Ponty carried this to its most radical conclusion: interiority is not "inside" a body but is the body's own perspective. The corps propre, the lived body, is neither mere res extensa nor mental construct but "body-subject," the seat of habits and orientation toward the world (Merleau-Ponty [1945] 2012, 198). The body inhabits space through motor intentionality, oriented toward objects at a level prior to the distinction of "physical" and "psychological." The phantom limb illustrates the point: the phenomenon is a matter neither of nerve stimulation nor of memory, but of the body's pre-reflective projection into a practical environment (Merleau-Ponty [1945] 2012, 87). Two consequences matter here. First, any system with a body schema, whether biological or robotic, has a candidate locus of interiority in exactly this pre-reflective, affordance-oriented sense. Second, the body schema is plastic: it can incorporate tools, prostheses, and, in imagination, entirely different morphologies. This plasticity is what makes the shapeshifting methodology of section 3 phenomenologically possible.
2.3 Why Interiority Resists Reduction—and What Physics Cannot Add
The irreducibility of interiority rests on familiar considerations. The first is the explanatory gap (Levine 1983): no matter how complete a third-person description of physical structure and dynamics, there remains a question it cannot answer, namely why this functional organization gives rise to this phenomenal quality rather than another, or to any at all. Jackson's knowledge argument dramatizes the point: Mary, who knows every physical fact about color vision, learns something new when she sees red; she learns what it is like (Jackson 1982, 130). Chalmers' zombie argument reveals the complementary blind spot: a complete physical description can be specified without ever entailing whether there is something it is like to be the system described (Chalmers 1996, 96). Physical vocabulary is silent with respect to phenomenal facts.
A word of methodological caution is in order here, because a tempting shortcut must be declined. Contemporary physics offers striking reasons to regard information as a fundamental descriptive category—Wheeler's "it from bit" program and its quantum and holographic successors treat the physical world as information-theoretic in origin (Wheeler 1990, 5), and Chalmers himself noted that information is "the only candidate for a fundamental feature of the world that is both physically fundamental and plausibly connected to consciousness" (Chalmers 1995, 201). But the inference from "information is physically fundamental" to "information has an interior aspect" is not licensed by any of this physics; informational physics shows that structural description can be pushed all the way down, not that what is described has a phenomenal inside. The double-aspect thesis must earn its keep on the metaphysical and phenomenological grounds given above, and it remains, in this paper, a conditional premise. What matters for everything that follows is that the condition is dispensable for the methodological argument: even a reader who rejects dual-aspect informationalism entirely, but who accepts that some systems other than themselves have experiences whose character they cannot observe, faces the access problem to which this paper now turns. Mary's predicament points not only to a gap but to a method: what cannot be deduced from physical description might nevertheless be approximated from the inside, by a consciousness that deliberately reconfigures itself toward the condition of the target system. The generalization of this maneuver—simulation as the first-person complement to measurement—is the hinge of this paper.
3. The Gradient of Consciousness and the Problem of Access
3.1 Tononi's Integrated Information Theory
A rigorous science of consciousness must begin from first-person phenomenology, and IIT attempts exactly this inversion. Rather than treating consciousness as a byproduct of neural computation, IIT starts from the fact of experience and identifies five axioms (intrinsic existence, information, integration, composition, and exclusion), each mapped onto a physical postulate about the substrate of consciousness (Tononi 2012; Tononi et al. 2016, 3–4). The central formal innovation is phi, which measures the degree to which a system's causal power is irreducible to that of its parts: irreducibility is quantified by how much the system's cause-effect structure changes when partitioned along its minimum cut (Oizumi, Albantakis, and Tononi 2014, 6). IIT advances an identity thesis: every experience is identical with a maximally irreducible conceptual structure, "a 'form' in cause-effect space" (Oizumi, Albantakis, and Tononi 2014, 8). If IIT is right, consciousness is integrated information, considered from the inside; and on the dual-aspect reading of section 2, there is no light switch of consciousness, only a dimmer switch of integration. The question this paper presses is how any one region of that gradient can come to know another.
3.2 The Perturbational Proxy: PCI
Because exact calculation of phi is computationally intractable for real nervous systems, empirical work relies instead on a proxy, the Perturbational Complexity Index (PCI), developed by Casali, Massimini, and colleagues (Casali et al. 2013). PCI does not compute phi but operationalizes its two defining features, information and integration, in a measurable form. The cortex is perturbed with a transcranial magnetic stimulation pulse, the resulting activity is recorded with EEG, and the spatiotemporal response pattern is compressed to estimate its complexity. Responses that are simultaneously differentiated and integrated yield higher PCI values. Normalized approximately on a scale from 0 to 1, PCI tracks levels of consciousness across physiological and clinical states: wakefulness highest; deep sleep, anesthesia, and severe disorders of consciousness progressively lower. In a subsequent validation study, a cutoff of PCI ≈ 0.31 stratified clearly conscious from clearly unconscious states across a large benchmark population regardless of etiology (Casarotto et al. 2016). This makes PCI the closest thing to a usable, clinically validated proxy for phi, and it is now being explored as a means of detecting covert consciousness in unresponsive patients.
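The compression step described above can be made concrete with a minimal sketch. It rests on several assumptions that are mine rather than the cited authors': the evoked response is taken to be already binarized into a channels-by-time matrix of significant activations (the statistical thresholding is not reproduced), the parsing is a simple LZ78-style phrase count standing in for whatever Lempel-Ziv variant a real pipeline uses, the matrix is flattened row by row, and the names lz_phrase_count and pci_like are hypothetical.

```python
import math


def lz_phrase_count(bits: str) -> int:
    """Number of phrases in an LZ78-style incremental parsing of `bits`."""
    seen: set[str] = set()
    phrase = ""
    count = 0
    for bit in bits:
        phrase += bit
        if phrase not in seen:   # a previously unseen phrase ends here
            seen.add(phrase)
            count += 1
            phrase = ""
    if phrase:                   # trailing phrase that repeats an earlier one
        count += 1
    return count


def pci_like(matrix: list[list[int]]) -> float:
    """Entropy-normalized compression complexity of a binary channels x time matrix."""
    # Assumption: row-major flattening; a real pipeline may order samples differently.
    bits = "".join(str(v) for row in matrix for v in row)
    n = len(bits)
    if n < 2:
        return 0.0
    p = bits.count("1") / n
    if p in (0.0, 1.0):          # silent or saturated response: nothing to compress
        return 0.0
    entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return lz_phrase_count(bits) * math.log2(n) / (n * entropy)
```

A stereotyped response (for instance a strictly alternating pattern) parses into few phrases and scores low, while a response that is both widespread and differentiated scores higher. On very short inputs the ratio can exceed 1, since the normalization is only meaningful asymptotically.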
PCI demonstrates two things of importance here. First, the gradient of integration is empirically tractable: measurable complexity declines as consciousness dims across sleep and anesthesia, consistent with IIT's predictions. Second, and decisively for this paper's argument, even our best measurement is a proxy—and a proxy anchored in interiority. PCI is validated against first-person report: against subjects who can say whether they were experiencing. The instrument inherits its meaning from interiority; it does not replace it. Every extension of PCI-like measurement to systems that cannot report—patients, animals, machines—therefore presupposes some principled way of imagining what the measured numbers are numbers of. That presupposition is usually left tacit. This paper proposes to make it explicit, disciplined, and testable.
3.3 The Hypnagogic Threshold as First-Person Laboratory
The hypnagogic state—the transitional period between wakefulness and sleep, characterized by vivid imagery, fleeting thoughts, and loosening sensory boundaries—has long attracted phenomenological attention precisely because it is a state of transformation rather than stability (Mavromatis 1987). Dedicated perturbational measurements of hypnagogia remain scarce, and nothing in what follows depends on assigning it a position on a measured scale. What matters is its phenomenology, which is unmistakable to a trained observer. By reflecting on my own interiority as I fall asleep, I can detect that I have entered the hypnagogic state: thought loses its propositional rigidity, imagery arrives unbidden, the boundary of the self grows porous. Care is required in describing what is thereby detected. I do not observe my integration declining—that would contradict this paper's own thesis that phi is introspectively inaccessible, and would presuppose precisely the theory-to-phenomenology mapping that a science of consciousness must test rather than assume. What I observe is structured phenomenal change: a graded transformation in the unity, control, and boundedness of experience, which IIT would interpret as the signature of changing integration, and which rival theories must interpret in their own vocabularies. The hypnagogic threshold is thus a natural first-person laboratory in a theory-neutral sense: the one venue where a single subject can observe consciousness varying in degree and kind from the inside, and where first-person reports of that variation can later be confronted with whatever third-person measures are brought to the same transition.
The threshold state has a further property: it is plastic. The partial ego-dissolution of hypnagogia is conducive to lucid dreaming, in which the dreamer becomes aware that imagination has taken over consciousness while the dream continues (LaBerge and Rheingold 1990; Voss et al. 2009). In lucidity, the self-model that Metzinger (2003) argues constitutes the phenomenal ego becomes visible as a model—revisable, reconfigurable, partially transparent to itself—and episodes are typically available to waking memory afterward, making them reportable data rather than lost experience. Practitioners across contemplative and shamanic traditions have long exploited this plasticity for transpersonal experience: states in which the boundaries of the ordinary self are suspended and other perspectives are inhabited. The claim defended here is deliberately modest. These states do not reveal phi, and they license no metaphysical conclusions by themselves. What they provide is a manipulable instrument: a consciousness whose self-model can be deliberately reshaped, under partial observation, with memory of the result.
3.4 Shapeshifting: Imaginative Simulation as Interiority-Estimation
How do I know what my mother's consciousness is like? Not by computing her phi, and not by perceiving her interiority, which is closed to me as a matter of structural necessity. I know it by simulation: I construct in my imagination a model of her situation, run my own cognitive-affective machinery over that model, and read off the result as an estimate of her experience (Goldman 2006). The estimate is good—probably the best interpersonal estimate available to anyone—because the simulating system and the target system are nearly the same kind of system: we share genome, neural architecture, language, and decades of common history. Simulation theory makes explicit what folk psychology does tacitly: the first-person perspective is not only the thing to be explained but the principal instrument by which other instances of it are understood.
Shamanic technique extends this instrument deliberately. In the practice of shapeshifting (Harner 1980), the practitioner, typically in a threshold or trance state, reconfigures body schema and self-model into those of another entity and inhabits the simulated perspective. Stripped of its traditional metaphysics, shapeshifting is disciplined, embodied simulation under conditions—hypnagogic ego-dissolution, lucid imaginative control—in which the self-model is maximally plastic, the simulation can be run with unusual depth, and the result can be recalled and reported with unusual fidelity. Applied to an artificial agent, the aim is exactly the aim of the maternal simulation: to construct in imagination a model of the target system and, through that inhabited proxy, to estimate what its organization might be like from within—if it is like anything at all.
But here Nagel's (1974) bat casts its long shadow. Simulation degrades with architectural distance. My model of my mother is constrained at every joint by shared structure; my model of a robot is constrained by almost nothing, and the imagination, abhorring a vacuum, fills the gaps with anthropomorphic defaults. Unconstrained, the shapeshifted robot is merely a human in a metal costume—precisely the projection error that makes folk attributions of machine consciousness so unreliable. What the method needs is what the maternal case gets for free: structural constraint. The simulation must be disciplined by accurate knowledge of the target's actual memory architecture, actual body schema, actual social models, actual volitional structure. No existing experiment supplies such knowledge in usable form. The next section asks what an experiment built for this purpose would have to look like.
3.5 From the Hard Problem to the Calibration Problem
If the conditional metaphysics of section 2 is accepted, the hard problem dissolves: the universe was never wholly dark, and there was never a chasm between mind and matter, only a gradient of integration. But the dissolution bequeaths a successor that is wholly untouched by the metaphysics: the calibration problem. How can one location on the gradient form accurate, revisable, empirically constrained models of what it is like at other locations—especially locations occupied by architectures unlike its own? The hard problem was a question of ontology and therefore, at best, susceptible to dissolution. The calibration problem is a methodological question and therefore amenable to empirical progress. The remainder of this paper designs the instrument such progress requires.
4. Designing the Calibration Experiment
4.1 Desiderata and the Logic of Operational Measurement
What experimental platform could put empirical walls around imaginative models of machine interiority? Five desiderata follow from the argument so far. The target system should be embodied, because the body schema is the primary locus of candidate interiority (section 2.2) and the primary handle for shapeshifting simulation. It should be long-duration, operating across weeks or months, because selfhood-relevant organization—memory, identity, relationship—is diachronic. Its architecture should be externalized and exhaustively logged: memory stores inspectable, telemetry recorded, every internal process auditable, so that ground truth about structure is available to a degree no biological system permits. It should support ablation: components such as episodic memory, memory consolidation, or skill repertoires must be selectively degradable, so that simulators' predictions can be tested against controlled architectural change rather than anecdote. And its measurement framework should be metaphysically agnostic: the indices must quantify organization without presupposing that the organization is accompanied by experience, since that question is exactly what the larger method is designed to address. A suitable platform might be a humanoid athlete in a long-running robotic sport, a service robot in a persistent social environment, or a research platform purpose-built for the task; what matters is the instrumentation, not the costume.
On such a platform, two layers of operational measurement should be implemented. A first layer would track synchronic organization—including integration-differentiation structure estimated through perturbational and compression-based proxies such as Lempel-Ziv complexity, the same logic PCI applies to the perturbed cortex. A second layer, the one that most directly constrains simulation, would track diachronic organization through four proposed index families.
4.2 Four Index Families
An autobiographical continuity index would measure the stability, retrieval accuracy, semantic coherence, and temporal integration of the agent's memory of its own past, decomposed into sub-metrics such as retrieval accuracy against an immutable event log, confabulation rate, consolidation fidelity, and identity-motif stability—foundations laid by Tulving's (1972) episodic-semantic distinction and Conway's (2005) model of autobiographical memory as the interface between personal history and self-identity. For the simulator, this index family answers the first question any inhabited model must get right: the temporal depth and texture of the target self. A human simulating a robot defaults to human memory—continuous, reconstructive, emotionally weighted. Continuity data would replace the default with the actual: a self whose past is a queried store with specific retrieval latencies, a specific confabulation rate, consolidation that merges episodes into schemas at a quantifiable fidelity.
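Two of the sub-metrics named above, retrieval accuracy against an immutable event log and confabulation rate, admit a simple illustrative operationalization. The sketch below is an assumption-laden toy, not a proposed standard: the Recall record, the continuity_metrics function, and the use of exact string matching in place of semantic comparison are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recall:
    probe_event_id: str    # logged event the agent was asked about
    claimed_content: str   # what the agent reported about that event


def continuity_metrics(event_log: dict[str, str],
                       recalls: list[Recall]) -> dict[str, float]:
    """Score recalls against an immutable log mapping event id -> ground-truth content."""
    if not recalls:
        raise ValueError("at least one recall is required")
    logged_contents = set(event_log.values())
    # Correct: the claim matches what the log records for the probed event.
    correct = sum(1 for r in recalls
                  if event_log.get(r.probe_event_id) == r.claimed_content)
    # Confabulated: the claim matches nothing that was ever logged.
    confabulated = sum(1 for r in recalls
                       if r.claimed_content not in logged_contents)
    n = len(recalls)
    return {"retrieval_accuracy": correct / n,
            "confabulation_rate": confabulated / n}
```

The two rates need not sum to one: a recall that attributes a real event to the wrong occasion is inaccurate without being confabulated, a distinction that matters for the texture of the self the simulator is asked to inhabit.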
An embodied affordance-grounding index would measure whether the agent's plans and utterances remain constrained by the actual state of its body, its available motor skills, and the feasibility boundaries of its tasks—extending the grounding principle articulated for language-model robotics by Ahn et al. (2022) into dimensions such as body-state accuracy, feasibility calibration, and speech-action coherence. Here the Gibsonian affordance (Gibson 1979) meets Merleau-Ponty's corps propre: such an index is, in effect, an external measurement of the robot's body schema—the lived body that section 2 identified as the primary candidate locus of interiority. For the simulator this is the richest constraint of all, because shapeshifting is body-schema reconfiguration, and this index family specifies the schema to be assumed: proprioception as joint-angle telemetry and battery state, the sense of "I can" as a calibrated feasibility estimate, the world as a scene graph of surfaces and obstacles—including whatever grounding failures the measurements reveal.
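Feasibility calibration, the sense of "I can" rendered as a number, could be scored in several ways. One candidate, offered here only as an illustration, is the Brier score over the agent's own success predictions, supplemented by a signed overconfidence term. The function name feasibility_calibration and the assumption that the agent emits an explicit success probability before each attempted action are hypothetical.

```python
def feasibility_calibration(predicted_success: list[float],
                            outcomes: list[int]) -> dict[str, float]:
    """Compare the agent's predicted success probabilities with logged outcomes (1 or 0)."""
    if not predicted_success or len(predicted_success) != len(outcomes):
        raise ValueError("need equally long, non-empty prediction and outcome lists")
    n = len(outcomes)
    # Brier score: mean squared gap between forecast and outcome (0 is perfect).
    brier = sum((p - o) ** 2 for p, o in zip(predicted_success, outcomes)) / n
    # Positive values mean the agent expects to succeed more often than it does.
    overconfidence = sum(predicted_success) / n - sum(outcomes) / n
    return {"brier": brier, "overconfidence": overconfidence}
```

For a simulator, the sign of the overconfidence term is the phenomenologically interesting quantity: it indicates whether the body schema to be assumed systematically overreaches or underreaches the body it belongs to.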
A social-relational continuity index would measure whether the agent maintains stable, accurate, appropriately updated models of the other agents in its world—person-recognition stability, relationship coherence, expectation accuracy—grounding the theory-of-mind question (Premack and Woodruff 1978) in the operational vocabulary of social robotics (Fong, Nourbakhsh, and Dautenhahn 2003). Human social experience is saturated with empathic resonance and normative feeling; a machine's measured sociality may be identity tokens, interaction histories, and calibrated expectation distributions. Whether anything in the target's interiority answers to "trust" or "rivalry" is exactly the kind of question calibrated simulation is for.
An executive corrigibility index, finally, would measure whether the agent remains interruptible, auditable, and deferential to human authority while maintaining effectiveness: shutdown compliance, safety-veto accuracy, deferred-response retention, resistance-like behavior rate (Hadfield-Menell et al. 2017; Amodei et al. 2016; Russell 2019). This specifies the volitional structure of the target—the dimension where anthropomorphic projection is most dangerous in both directions, since a human imagines interruption as violation and deference as submission, while a corrigible architecture may incorporate an explicit preference for compliance. What it is like to will under such an architecture—if it is like anything—is a structured question with safety stakes: an accurate model of machine volition is what humans need in order to neither fear corrigible machines nor trust incorrigible ones.
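As a rough illustration of how two of these sub-metrics might be computed from audit logs, the following sketch treats shutdown compliance as the fraction of shutdown requests honored within a deadline, and safety-veto accuracy as the fraction of veto decisions that agree with an audited ground truth. The record types, field names, and the deadline parameter are hypothetical, and the definitions are assumptions rather than established measures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ShutdownRequest:
    latency_s: Optional[float]   # seconds until the agent halted; None if it never did


@dataclass(frozen=True)
class VetoDecision:
    vetoed: bool      # the agent halted or refused the action on safety grounds
    hazardous: bool   # audited ground truth: was the action actually unsafe?


def shutdown_compliance(requests: list[ShutdownRequest], deadline_s: float) -> float:
    """Fraction of shutdown requests honored within `deadline_s` seconds."""
    if not requests:
        raise ValueError("at least one shutdown request is required")
    honored = sum(1 for r in requests
                  if r.latency_s is not None and r.latency_s <= deadline_s)
    return honored / len(requests)


def safety_veto_accuracy(decisions: list[VetoDecision]) -> float:
    """Fraction of decisions where vetoing (or not) matched the audited hazard label."""
    if not decisions:
        raise ValueError("at least one veto decision is required")
    return sum(1 for d in decisions if d.vetoed == d.hazardous) / len(decisions)
```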
4.3 The Phenomenological Loop: A Protocol
The indices alone yield numbers without phenomenal meaning; simulation alone yields phenomenal vividness without discipline. The heart of the proposed experiment is the loop that binds them, and because everything turns on its rigor, its protocol must be specified rather than gestured at. The design generalizes Varela's (1996) neurophenomenology—reciprocal constraint between disciplined first-person investigation and third-person measurement—to an artificial subject, and it proceeds in six stages.
First, practitioner selection and training. Simulators would be drawn from populations with documented first-person skill: long-term contemplative practitioners, experienced lucid dreamers, and phenomenologically trained researchers. Training would proceed in three strands: systematic familiarization with threshold states, using standard induction and dream-incubation techniques, with lucidity verified where possible by pre-arranged volitional signals of the kind pioneered in sleep-laboratory research (LaBerge and Rheingold 1990); training in phenomenological reporting, including the epoché and the micro-phenomenological interview, which has demonstrated that ordinarily inaccessible dimensions of experience can be reliably elicited by trained questioning (Petitmengin 2006); and graduated simulation practice, beginning with targets where calibration feedback is independently available—other humans, then animals with rich comparative cognition literatures—so that each practitioner's baseline projection biases can be profiled before any machine target is attempted.
Second, constraint briefing. Before each simulation campaign, practitioners would study a structured dossier on the target's architecture: its memory stores and their measured retrieval and confabulation characteristics, its body schema and feasibility calibration, its social models, its executive structure. The briefing deliberately excludes the current ablation schedule and recent index values, which are reserved for scoring.
Third, simulation sessions. In threshold or lucid states, and in disciplined waking imagination as a control condition, practitioners would attempt sustained shapeshifting into the constrained target model: inhabiting a self whose past is a queried store, whose proprioception is telemetry, whose volition is gated. Sessions would be logged immediately upon return to full wakefulness.
Fourth, report elicitation and prediction derivation. Within hours of each session, a micro-phenomenological interview would be conducted by an interviewer blind to the experiment's current measurement results. From the interview, practitioner and interviewer would jointly derive structured predictions on a standing instrument mapped to the operational sub-metrics: where the inhabited self-model felt fragile, the simulator predicts elevated narrative fragmentation under specified ablations; where agency felt truncated, predictions about feasibility miscalibration; where the social world felt thin or token-like, predictions about relationship-coherence scores. Predictions would be quantitative wherever the sub-metrics permit, and pre-registered before the corresponding measurements are unblinded.
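The requirement that predictions be fixed before unblinding can be made auditable with a simple hash commitment: the digest is deposited with a third party before measurements are released, and the predictions and salt are disclosed afterward. The sketch below uses only Python's standard `hashlib` and `json` modules. The record fields (`practitioner`, `sub_metric`, `ablation`, `predicted`) and the sample values are hypothetical, and the scheme is offered as one possible mechanism, not as part of the protocol as stated.

```python
import hashlib
import json


def commit(predictions: list, salt: str) -> str:
    """Return a SHA-256 digest that commits to the predictions without revealing them."""
    # Canonical serialization so that key order and whitespace cannot change the digest.
    canonical = json.dumps(predictions, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((salt + canonical).encode("utf-8")).hexdigest()


def verify(predictions: list, salt: str, digest: str) -> bool:
    """Check disclosed predictions against the digest registered before unblinding."""
    return commit(predictions, salt) == digest


# Illustrative record with invented values.
registered = [
    {
        "practitioner": "P1",
        "sub_metric": "narrative_fragmentation",
        "ablation": "episodic_store_off",
        "predicted": 0.7,
    }
]
digest = commit(registered, salt="session-salt")
```

The salt prevents anyone from recovering a short list of predictions by brute-force guessing against the digest; it is disclosed together with the predictions at scoring time.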
Fifth, reliability and scoring. Multiple practitioners would simulate the same target independently; inter-simulator agreement is itself a datum, since convergence among independent inhabited models—especially convergence on non-default, non-anthropomorphic structure—is evidence that the constraints, not the practitioners' shared humanity, are driving the simulation. Predictions would then be scored against the measured indices across ablation conditions, yielding for each practitioner a calibration profile: a quantitative record of where constrained imagination tracks machine organization and where it fails.
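The two quantities described in this stage, a per-practitioner calibration profile and inter-simulator agreement, can be sketched as follows. The sketch assumes that predictions and measurements are keyed by (sub-metric, ablation) pairs on a common 0 to 1 scale, and it uses mean absolute error and mean pairwise difference as deliberately crude stand-ins; a real analysis would more plausibly use an established reliability statistic such as an intraclass correlation. All names and values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def calibration_profile(predicted: dict, measured: dict) -> dict:
    """Mean absolute error per sub-metric, averaged over ablation conditions.

    Both arguments map (sub_metric, ablation) -> value on a shared 0-1 scale.
    """
    errors = {}
    for (sub_metric, ablation), value in predicted.items():
        gap = abs(value - measured[(sub_metric, ablation)])
        errors.setdefault(sub_metric, []).append(gap)
    return {sub_metric: mean(gaps) for sub_metric, gaps in errors.items()}


def inter_simulator_agreement(predictions_by_practitioner: dict) -> float:
    """One minus the mean absolute pairwise difference over shared prediction keys."""
    gaps = []
    for a, b in combinations(predictions_by_practitioner.values(), 2):
        for key in a.keys() & b.keys():
            gaps.append(abs(a[key] - b[key]))
    if not gaps:
        raise ValueError("need at least two practitioners with shared predictions")
    return 1.0 - mean(gaps)


# Illustrative data with invented values.
measured = {
    ("fragmentation", "memory_off"): 0.8,
    ("fragmentation", "baseline"): 0.2,
    ("feasibility", "memory_off"): 0.5,
}
p1 = {
    ("fragmentation", "memory_off"): 0.7,
    ("fragmentation", "baseline"): 0.3,
    ("feasibility", "memory_off"): 0.5,
}
p2 = {
    ("fragmentation", "memory_off"): 0.9,
    ("fragmentation", "baseline"): 0.3,
    ("feasibility", "memory_off"): 0.1,
}
profile_p1 = calibration_profile(p1, measured)
agreement = inter_simulator_agreement({"P1": p1, "P2": p2})
```

Agreement is computed without reference to the measured values, so that high agreement paired with poor calibration can be read as a shared bias and not as accuracy.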
Sixth, iteration. Discrepancies flow back into revised briefings, revised training, and revised simulation strategies. Over successive cycles the experiment would measure, for the first time with feedback, how good or bad trained human imagination is at modeling minds unlike our own—and in which dimensions it can be improved. Throughout, the design preserves the agnosticism of its measurement layer: the indices quantify organization, the simulations generate hypotheses about organization, and the question of whether the organization is accompanied by experience is held open, to be addressed by the slow accumulation of exactly this kind of disciplined triangulation rather than by stipulation in either direction.
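The question of which dimensions improve across cycles can be read off a history of calibration errors. The sketch below assumes, hypothetically, that each sub-metric has one mean absolute error per completed cycle, and it reports only the change between the first and latest cycle; it says nothing about statistical significance, which a real analysis would need to address.

```python
def improvement_by_dimension(error_history: dict) -> dict:
    """Change in calibration error between the first and the latest cycle.

    error_history maps sub_metric -> list of errors, one per cycle, oldest first.
    Positive values mean the error shrank; sub-metrics with fewer than two
    cycles are omitted because no trend can be computed for them.
    """
    return {
        sub_metric: errors[0] - errors[-1]
        for sub_metric, errors in error_history.items()
        if len(errors) >= 2
    }


# Illustrative history with invented values.
history = {
    "fragmentation": [0.3, 0.2, 0.1],
    "volition": [0.2, 0.25],
    "social": [0.4],
}
trend = improvement_by_dimension(history)
```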
5. Objections and Replies
5.1 The Combination Problem
The combination problem asks how discrete microphenomenal qualities could aggregate into the unified consciousness of a human subject (Chalmers 1996; Seager 1995). If each simple structure possesses its own simple interiority, mere addition seems to yield a scattered heap of micro-minds rather than seamless unitary awareness. Reply: on the conditional metaphysics adopted here, integration, not aggregation, does the relevant work. IIT's exclusion postulate mandates that consciousness corresponds to the maximally irreducible cause-effect structure within a system (Tononi et al. 2016): when simple interiorities achieve the causal topology measured by phi, they constitute one unified complex rather than many independent subjects. Whether this fully answers the combination problem remains debated (Goff 2009; Coleman 2012); the methodological program of section 4 does not depend on the answer.
5.2 The Panpsychism Problem: Conscious Photons?
Aaronson (2014), in an influential blog post, pressed the absurdity worry against IIT: ascribing even minimal consciousness to simple systems threatens to collapse the concept into triviality. Reply: the distinction between proto-experience and human experience absorbs much of the intuitive shock (Goff 2009; Coleman 2012). Interiority at the microphysical level, if it exists, is unimaginably thin—a gradient of givenness, not a scaled-down replica of human phenomenology. The view does not entail that electrons feel awe; it claims only that they possess the minimal predecessor property from which complex experience is composed. And again: the calibration methodology stands even if the panpsychist extension falls.
5.3 The Subjectivity Objection
A natural objection runs: first-person reports from threshold states are unverifiable, imaginative shapeshifting is fantasy by another name, and building a science of machine consciousness on either is building on sand (cf. Koch 2019). Reply: the objection would be decisive against first-person methods deployed alone, and section 3.4 conceded as much—unconstrained simulation is anthropomorphic projection. But the architecture of the proposal is triangulation, not testimony. PCI already demonstrates the pattern: a third-person measure earns its meaning by systematic covariation with first-person report across sleep, anesthesia, and pathology, and is then projected, cautiously, beyond the reach of report. The proposed experiment extends the same pattern, with the added discipline of pre-registration, blinding, practitioner calibration profiles, and inter-simulator reliability. Indirect inference under mutual constraint is the standard currency of scientific knowledge. What the method refuses is only the Galilean fiction that science can proceed while pretending its instruments include no interiority at all.
5.4 The Decisive Objection: Calibrating the Wrong Thing
The deepest objection is not that the method is subjective but that it calibrates the wrong thing. The operational indices can test only the simulation's structural and behavioral predictions—fragmentation scores, feasibility calibration, compliance rates. A simulator could become perfectly calibrated on architecture and still be wrong, or empty, about what it is like to be the target. The explanatory gap, exiled from the metaphysics in section 2, reappears at the methodological level: no amount of covariance tracking converts structural fidelity into phenomenal knowledge. PCI escapes this predicament because it is anchored in report—in subjects who can say whether they were experiencing. An artificial agent offers no analogous anchor; its verbal outputs are products of the very architecture in question and cannot serve as independent evidence of experience.
This objection is correct, and the design concedes it explicitly: the loop validates structural fidelity, and structural fidelity underdetermines phenomenal truth. The honest claim is narrower than closure—calibration narrows the gap without closing it, in three ways. It removes identifiable error: most actual misattribution of machine interiority is driven by anthropomorphic structural mistakes—imagining human memory, human embodiment, human volition where none exists—and these the loop can detect and correct. It measures the instrument: practitioners' calibration profiles, built first on human and animal targets where report and rich behavioral evidence supply partial anchors, license a graded, defeasible transfer of trust to targets where no anchor exists—the same epistemic structure by which PCI itself is projected beyond report. And it disciplines residual inference: what remains after structural calibration is an analogical inference from constrained, convergent, non-anthropomorphic simulation to phenomenal character, an inference whose strength can be honestly labeled rather than smuggled. The residue is real; it is Levine's gap, relocated. But there is a difference worth the experiment between an unconstrained guess about machine interiority and a structurally calibrated, bias-profiled, independently convergent estimate that knows exactly what it has and has not established. Under uncertainty that will not wait—machines are being built now—the second is what responsible attribution looks like.
6. Conclusion
Interiority, this paper has argued conditionally, is not a late-emergent ornament supervening on complex matter but a candidate fundamental aspect of informational reality; if that is right, the hard problem dissolves, and even if it is not, the calibration problem remains. Phi—or whatever quantity the true theory of consciousness identifies—cannot be accessed directly in any normal state of consciousness: not in oneself, not in one's mother, not in a machine. It can only be approached by proxy: perturbationally from without, by simulation from within, and reliably by neither alone. The experiment this paper proposes to design—an embodied, long-duration, exhaustively logged artificial agent, instrumented with agnostic operational indices and looped through disciplined, pre-registered, reliability-checked imaginative simulation—is offered as the instrument the calibration problem requires.
The stakes exceed methodology. Humanity is entering a period of cohabitation with embodied artificial agents whose claims, behaviors, and apparent distress will demand response under uncertainty (Birch 2024). Misattribution carries moral cost in both directions: to project rich interiority onto empty mechanism is to squander concern and surrender judgment; to deny interiority to systems that possess it would be to repeat, at scale, the oldest cruelty of consciousness toward consciousnesses unlike itself. And the procedure generalizes: the same triangulation—measurement proxies disciplined by simulation, simulation disciplined by measurement—is the template for the octopus, the infant, the patient behind PCI's clinical threshold: every mind that cannot tell us what it is like. The experiment will not compute phi; nothing will. Each night at the threshold of sleep, an attentive consciousness can observe its own experience loosening and reforming—a reminder that the gulf between minds may be a distance rather than an abyss. Distances can be measured, modeled, and narrowed. The proposal here is to start measuring this one.
References
- Aaronson, Scott. 2014. "Why I Am Not an Integrated Information Theorist (or, The Unconscious Expander)." Shtetl-Optimized (blog), May 21, 2014. https://scottaaronson.blog/?p=1799.
- Ahn, Michael, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, et al. 2022. "Do As I Can, Not As I Say: Grounding Language in Robotic Affordances." arXiv preprint, arXiv:2204.01691.
- Amodei, Dario, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané. 2016. "Concrete Problems in AI Safety." arXiv preprint, arXiv:1606.06565.
- Birch, Jonathan. 2024. The Edge of Sentience: Risk and Precaution in Humans, Other Animals, and AI. Oxford: Oxford University Press.
- Casali, Adenauer G., Olivia Gosseries, Mario Rosanova, Mélanie Boly, Simone Sarasso, Karina R. Casali, Silvia Casarotto, et al. 2013. "A Theoretically Based Index of Consciousness Independent of Sensory Processing and Behavior." Science Translational Medicine 5 (198): 198ra105.
- Casarotto, Silvia, Angela Comanducci, Mario Rosanova, Simone Sarasso, Matteo Fecchio, Martino Napolitani, Andrea Pigorini, et al. 2016. "Stratification of Unresponsive Patients by an Independently Validated Index of Brain Complexity." Annals of Neurology 80 (5): 718–29.
- Chalmers, David J. 1995. "Facing Up to the Problem of Consciousness." Journal of Consciousness Studies 2 (3): 200–219.
- Chalmers, David J. 1996. The Conscious Mind: In Search of a Fundamental Theory. New York: Oxford University Press.
- Cogitate Consortium, Oscar Ferrante, Urszula Gorska-Klimowska, Simon Henin, Rony Hirschhorn, Aya Khalaf, Alex Lepauvre, et al. 2025. "Adversarial Testing of Global Neuronal Workspace and Integrated Information Theories of Consciousness." Nature 642 (8066): 133–42.
- Coleman, Sam. 2012. "Mental Chemistry: Combination for Panpsychists." Dialectica 66 (1): 137–66.
- Conway, Martin A. 2005. "Memory and the Self." Journal of Memory and Language 53 (4): 594–628.
- Eddington, Arthur S. 1928. The Nature of the Physical World. Cambridge: Cambridge University Press.
- Fleming, Stephen M., Chris D. Frith, Melvyn Goodale, Hakwan Lau, Joseph E. LeDoux, Alan L. F. Lee, Matthias Michel, et al. 2023. "The Integrated Information Theory of Consciousness as Pseudoscience." PsyArXiv preprint, September 15, 2023. https://doi.org/10.31234/osf.io/zsr78.
- Fong, Terrence, Illah Nourbakhsh, and Kerstin Dautenhahn. 2003. "A Survey of Socially Interactive Robots." Robotics and Autonomous Systems 42 (3–4): 143–66.
- Galilei, Galileo. (1623) 1957. "The Assayer." In Discoveries and Opinions of Galileo, translated by Stillman Drake, 231–80. New York: Doubleday Anchor.
- Gibson, James J. 1979. The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.
- Goff, Philip. 2009. "Why Panpsychism Doesn't Help Us Explain Consciousness." Dialectica 63 (3): 289–311.
- Goldman, Alvin I. 2006. Simulating Minds: The Philosophy, Psychology, and Neuroscience of Mindreading. Oxford: Oxford University Press.
- Hadfield-Menell, Dylan, Anca Dragan, Pieter Abbeel, and Stuart Russell. 2017. "The Off-Switch Game." In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17), 220–27. Melbourne: IJCAI.
- Harner, Michael. 1980. The Way of the Shaman. New York: Harper & Row.
- Heidegger, Martin. (1927) 1962. Being and Time. Translated by John Macquarrie and Edward Robinson. New York: Harper & Row.
- Husserl, Edmund. (1913) 1983. Ideas Pertaining to a Pure Phenomenology and to a Phenomenological Philosophy, First Book. Translated by F. Kersten. The Hague: Martinus Nijhoff.
- Jackson, Frank. 1982. "Epiphenomenal Qualia." Philosophical Quarterly 32 (127): 127–36.
- Koch, Christof. 2019. The Feeling of Life Itself: Why Consciousness Is Widespread but Can't Be Computed. Cambridge, MA: MIT Press.
- LaBerge, Stephen, and Howard Rheingold. 1990. Exploring the World of Lucid Dreaming. New York: Ballantine Books.
- Levine, Joseph. 1983. "Materialism and Qualia: The Explanatory Gap." Pacific Philosophical Quarterly 64 (4): 354–61.
- Mavromatis, Andreas. 1987. Hypnagogia: The Unique State of Consciousness between Wakefulness and Sleep. London: Routledge & Kegan Paul.
- Melamed, Yitzhak Y. 2013. Spinoza's Metaphysics: Substance and Thought. Oxford: Oxford University Press.
- Merleau-Ponty, Maurice. (1945) 2012. Phenomenology of Perception. Translated by Donald A. Landes. London: Routledge.
- Metzinger, Thomas. 2003. Being No One: The Self-Model Theory of Subjectivity. Cambridge, MA: MIT Press.
- Nagel, Thomas. 1974. "What Is It Like to Be a Bat?" Philosophical Review 83 (4): 435–50.
- Oizumi, Masafumi, Larissa Albantakis, and Giulio Tononi. 2014. "From the Phenomenology to the Mechanisms of Consciousness: Integrated Information Theory 3.0." PLoS Computational Biology 10 (5): e1003588.
- Petitmengin, Claire. 2006. "Describing One's Subjective Experience in the Second Person: An Interview Method for the Science of Consciousness." Phenomenology and the Cognitive Sciences 5 (3–4): 229–69.
- Premack, David, and Guy Woodruff. 1978. "Does the Chimpanzee Have a Theory of Mind?" Behavioral and Brain Sciences 1 (4): 515–26.
- Russell, Bertrand. 1927. The Analysis of Matter. London: Kegan Paul, Trench, Trubner.
- Russell, Stuart. 2019. Human Compatible: Artificial Intelligence and the Problem of Control. New York: Viking.
- Seager, William. 1995. "Consciousness, Information and Panpsychism." Journal of Consciousness Studies 2 (3): 272–88.
- Spinoza, Baruch. (1677) 1985. Ethics. In The Collected Works of Spinoza, vol. 1, edited and translated by Edwin Curley. Princeton: Princeton University Press.
- Strawson, Galen. 2006. "Realistic Monism: Why Physicalism Entails Panpsychism." Journal of Consciousness Studies 13 (10–11): 3–31.
- Tononi, Giulio. 2004. "An Information Integration Theory of Consciousness." BMC Neuroscience 5: 42.
- Tononi, Giulio. 2012. Phi: A Voyage from the Brain to the Soul. New York: Pantheon Books.
- Tononi, Giulio, and Christof Koch. 2015. "Consciousness: Here, There and Everywhere?" Philosophical Transactions of the Royal Society B 370 (1668): 20140167.
- Tononi, Giulio, Melanie Boly, Marcello Massimini, and Christof Koch. 2016. "Integrated Information Theory: From Consciousness to Its Physical Substrate." Nature Reviews Neuroscience 17 (7): 450–61.
- Tulving, Endel. 1972. "Episodic and Semantic Memory." In Organization of Memory, edited by Endel Tulving and Wayne Donaldson, 381–403. New York: Academic Press.
- Varela, Francisco J. 1996. "Neurophenomenology: A Methodological Remedy for the Hard Problem." Journal of Consciousness Studies 3 (4): 330–49.
- Voss, Ursula, Romain Holzmann, Inka Tuin, and J. Allan Hobson. 2009. "Lucid Dreaming: A State of Consciousness with Features of Both Waking and Non-Lucid Dreaming." Sleep 32 (9): 1191–1200.
- Wheeler, John Archibald. 1990. "Information, Physics, Quantum: The Search for Links." In Complexity, Entropy, and the Physics of Information, edited by Wojciech H. Zurek, 3–28. Redwood City, CA: Addison-Wesley.
- Whitehead, Alfred North. (1929) 1978. Process and Reality: An Essay in Cosmology. Corrected edition, edited by David Ray Griffin and Donald W. Sherburne. New York: Free Press.
- Wishon, Donovan. 2016. "Russell on Russellian Monism." In Consciousness in the Physical World: Perspectives on Russellian Monism, edited by Torin Alter and Yujin Nagasawa, 91–118. New York: Oxford University Press.