
You Can’t Hide Once You Run

Verification-First Infrastructure for AI and IoT at Scale

Stark, February 2026


The Scale of What’s Coming

IoT Analytics projects 39 billion connected devices by 2030, exceeding 50 billion by 2035 [IoT Analytics, 2025]. The IoT market will reach $5.34 trillion by 2035 [Research Nester, 2025]. The AI agents market is growing at 46% annually, from $7.8 billion in 2025 to over $50 billion by 2030 [MarketsandMarkets, 2025]. Gartner projects that by 2028, at least 15% of day-to-day work decisions will be made autonomously by agentic AI [Gartner, 2025]. PwC estimates AI will contribute $15.7 trillion to the global economy by 2030 — more than the current output of China and India combined [PwC Global AI Study].

This is the trajectory. Tens of billions of autonomous devices and agents, making decisions with real-world consequences, operating at a scale no human oversight system can match. The question is not whether this future arrives. The question is whether the infrastructure beneath it can distinguish truth from fabrication.


The Missing Prerequisite

Yann LeCun, chief AI scientist at Meta, has articulated what intelligent behavior requires: “the capacity to understand the world, understand the physical world, the ability to remember and retrieve things, persistent memory, the ability to reason and the ability to plan” [LeCun, 2022]. His critique of autoregressive language models is that they achieve none of these — they predict the next token, a task that produces fluent text without requiring any model of the world that generated it.

The diagnosis is correct. But it identifies the symptom, not the root cause.

The deeper problem is this: you cannot reason about the real world if you cannot verify what is true. You cannot decide what matters if you cannot distinguish fact from fabrication. You cannot plan if you cannot confirm the state you’re planning from. Reasoning, memory, planning — these all presuppose a verification layer that establishes what is actually the case.

You cannot reason about the real world if you cannot verify what is true.

The field built reasoning systems first. Massive investment in model scale, training data, reinforcement learning from human feedback. Trillions of parameters learning to predict plausible continuations. Effectively zero investment in verification infrastructure — in the substrate that would let these systems verify rather than assume on trust.

The order got reversed. We built the reasoning layer before the verification layer. Now we have systems that reason fluently about states they cannot verify, plan confidently from premises they cannot check, and act autonomously on hallucinated assumptions.


The Stack Is Permissive All the Way Down

Why did verification get skipped? Because the entire infrastructure stack assumes execute-first semantics.

TCP/IP does not refuse. It delivers packets or retries. There is no “I cannot verify the integrity of this transmission within my resource bounds.” HTTP does not refuse. It returns responses or errors. There is no “this response exceeds my verification capacity; I decline to process it.” The concept of principled, resource-bounded refusal does not exist in the API.
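As a purely hypothetical illustration (no such primitive exists in TCP/IP or HTTP), a refusal-aware handler would expose a third outcome alongside success and error: an explicit refusal when verification would exceed a declared budget. The budget constant and handler below are invented for this sketch.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Outcome(Enum):
    VERIFIED = auto()   # claim checked within declared bounds
    ERROR = auto()      # processing itself failed
    REFUSED = auto()    # verification would exceed declared bounds

@dataclass
class Response:
    outcome: Outcome
    detail: str

VERIFY_BUDGET_BYTES = 64 * 1024  # hypothetical per-request verification budget

def handle(payload: bytes) -> Response:
    # An execute-first stack would process unconditionally; this handler
    # declines, explicitly and informatively, when it cannot verify.
    if len(payload) > VERIFY_BUDGET_BYTES:
        return Response(Outcome.REFUSED,
                        "verification exceeds declared resource bounds")
    return Response(Outcome.VERIFIED, "payload verified within bounds")
```

The point is not the threshold but the vocabulary: refusal is a distinct, first-class response, not an error dressed up as a timeout.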

Ethereum and its descendants do not refuse. They execute transactions and update global state. The architecture is explicitly execute-first: run the code, compute the result, charge gas for the computation. Verification is subordinate to execution. Light clients “verify” by trusting validators rather than checking proofs, because the architecture provides no mechanism for bounded independent verification.

Every layer of the modern internet stack operates in permissive mode. Inputs produce outputs. Requests get responses. The system always executes.

AI inherited this. Large language models are trained to produce output for every input. There is no “I don’t know” token in the vocabulary. Refusal, when it occurs, is a learned behavior imposed through post-training alignment — not an architectural primitive. The model cannot refuse because nothing beneath it can refuse. The execute-first paradigm is baked into every layer the model touches.

This is why RLHF, RAG, chain-of-thought, and constitutional AI do not solve the problem. They are attempts to make execution more reliable without changing the execute-first assumption. They optimize which outputs get produced, not whether outputs should be produced. The architectural commitment to universal output remains intact.


Verification Cannot Be Retrofitted

The asymmetry is structural.

A system designed with verification constraints can selectively relax them. A verification-first architecture can accommodate fast, approximate, unverified outputs when speed matters more than certainty — it labels them as unverified. The constraint exists in the design. It can be waived when appropriate. A strict protocol can define permissive modes.

A system designed without verification constraints cannot impose them afterward. The ecosystem of implementations, users, applications, and dependencies adapts to the permissive baseline. As RFC 9413 documents in the context of protocol design, “a flaw can become entrenched as a de facto standard” [Thomson & Schinazi, 2023]. Once systems assume universal output, every downstream component must either attempt to mitigate or passively inherit that assumption.

This is why the internet centralized despite decentralized origins. This is why your crypto wallet quietly depends on intermediaries despite industry-wide marketing slogans. This is why AI hallucinates despite trillion-dollar investments in reliability. The execute-first paradigm was the foundation. Everything built on it inherited the paradigm.

You cannot retrofit verification onto architecture built to execute. You can only build verification-first from the start.


The Edge Is Where It Matters

The 50 billion devices arriving by 2035 are not data centers. They are phones, sensors, actuators, vehicles, medical devices, industrial controllers, household appliances. They operate at the edge — where compute is limited, bandwidth is constrained, connectivity is intermittent, and decisions have physical consequences.

These devices cannot verify by replaying global state. They cannot download terabytes to check a single claim. They cannot maintain persistent connections to trusted authorities. They operate in the real world, where network partitions happen, where batteries die, where milliseconds matter.

Edge verification requires different properties than data-center verification:

Bounded resources. Each device declares what it can verify within its constraints: storage capacity, bandwidth limits, computational budget. Verification that would exceed declared bounds does not fail silently — it refuses explicitly. “I cannot verify this claim within my resource envelope” is a valid, informative response.

Asynchronous tolerance. Devices operate through network partitions. They verify what they can when they can. They refuse what they cannot check. They reconnect and catch up when connectivity returns. The verification model does not assume synchronous access to global state — it assumes intermittent access to relevant proofs.

Local scope. A device verifying its own transactions should not need to process the entire network’s state. Verification cost should scale with local interest, not global activity. A sensor verifying its data commitments bears no cost from unrelated sensors on the other side of the world.

Refusal as safety. When a device cannot verify a claim — because the proof is unavailable, because verification would exceed resource bounds, because the claim falls outside the device’s declared scope — the correct response is refusal, not trust. “I don’t know” is safer than “I’ll trust someone who claims to know.”
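The four properties above can be sketched together in one illustrative model. Nothing here is a real device API; the envelope fields and refusal codes are assumptions made for the example, showing a verifier that declares its resource envelope and refuses, with a reason, anything outside it.

```python
from dataclasses import dataclass

@dataclass
class ResourceEnvelope:
    max_proof_bytes: int   # bandwidth/storage bound
    max_check_ops: int     # computational budget
    scope: set[str]        # accounts this device has declared interest in

@dataclass
class Claim:
    account: str
    proof_bytes: int
    check_ops: int

def verify(claim: Claim, env: ResourceEnvelope,
           proof_available: bool = True) -> str:
    # Local scope: claims outside declared interest are refused, not trusted.
    if claim.account not in env.scope:
        return "REFUSED_OUT_OF_SCOPE"
    # Asynchronous tolerance: a missing proof yields refusal, not trust;
    # the device can retry when connectivity returns.
    if not proof_available:
        return "REFUSED_PROOF_UNAVAILABLE"
    # Bounded resources: exceeding the envelope refuses explicitly
    # rather than failing silently.
    if (claim.proof_bytes > env.max_proof_bytes
            or claim.check_ops > env.max_check_ops):
        return "REFUSED_RESOURCE_BOUNDS"
    return "VERIFIED"
```

Each refusal code is informative: a caller learns not just that verification failed, but why, and whether retrying later could succeed.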

This is where LeCun’s “real world reasoning” meets physical infrastructure. The real world is the edge. The edge is constrained. Constrained devices need bounded verification with principled refusal. Not “trust the cloud.” Verify locally, within your bounds, or refuse.


Guess, Hope, Fail, Die

The stakes at the edge are not academic.

Autonomous vehicles make split-second decisions based on sensor data and model inference. If the model hallucinates an obstacle, the vehicle brakes unnecessarily. If it hallucinates clear road, the vehicle kills someone. “Probably correct” is not acceptable when the failure mode is death.

Medical AI diagnoses conditions and recommends treatments. A hallucinated diagnosis leads to wrong treatment. A hallucinated drug interaction leads to patient harm. The FDA requires validation for medical devices precisely because plausible is not the same as correct.

Industrial controllers manage power grids, water systems, manufacturing lines. A hallucinated sensor reading leads to wrong control action. Wrong control action leads to equipment damage, environmental release, worker injury. Safety-critical systems exist because the cost of error is not merely inconvenience.

Financial AI executes trades, approves loans, detects fraud. A hallucinated market signal triggers cascading trades. A hallucinated creditworthiness assessment approves bad loans. A hallucinated fraud flag freezes legitimate accounts. The 2010 Flash Crash demonstrated what happens when autonomous systems operate faster than human oversight.

Defense systems assess threats and coordinate responses. A hallucinated threat triggers escalation. A hallucinated de-escalation signal permits attack. The consequences are measured in lives and potentially in civilizational risk.

The common pattern: autonomous systems making decisions with irreversible physical consequences, operating on inferences that may be fabrications, at speeds that preclude human verification.

This is the future we are building. Not because anyone chose it, but because the infrastructure defaults to execute-first, and execute-first scales to autonomous systems that guess, hope, fail, and die.


The Necessary Reordering

The field has it backwards. We built reasoning before verification. We deployed autonomy before accountability. We scaled execution before we could check correctness.

The fix is not better AI. Model improvements help at the margin but cannot overcome architectural constraints. Hallucination is formally ineliminable from autoregressive systems [Banerjee et al., 2024; Xu et al., 2024]. Calibration worsens with extended reasoning [arXiv:2512.16030]. The model layer cannot solve problems defined by the model layer’s own commitments.

The fix is verification-first infrastructure.

Previous papers in this series established the foundations. The hourglass architecture provides network-layer verification primitives: tamper-evident sequences that propagate without central coordination, supporting transport-independent delivery across heterogeneous networks [Stark, 2026a]. The verification discipline paper established the protocol-layer properties: bounded verification where checking is orders of magnitude cheaper than producing, refusal as a first-class operation with explicit semantics, proof markets that create economic gradients toward truth rather than plausibility [Stark, 2026b].

Zenon’s dual-ledger architecture instantiates these principles. Account-chains localize state; the Momentum chain commits ordering without requiring participants to replay others’ computation. Verification cost scales with local interest, not global activity. A browser-based client or an embedded sensor verifying its own transactions bears no cost from the rest of the network. The Greenpaper specifies bounded verification with explicit resource constraints and refusal codes [Zenon Greenpaper, 2026]. The architecture is designed for the edge — for the 50 billion devices that cannot be data centers.

This is the necessary reordering: verification infrastructure before pervasive autonomy. Not because autonomy is bad, but because autonomy without verification is guess-hope-fail-die at scale.


What This Enables

Verification-first infrastructure does not constrain AI. It enables AI to operate in contexts where trust matters.

AI agents that verify external state before acting. An agent tasked with executing a trade can verify the current market state rather than inferring it from training data. An agent managing inventory can verify actual stock levels rather than estimating them. The agent’s actions are grounded in verified state, not plausible inference. For a deeper analysis of why execution-first blockchains fail AI and how verification-first architecture resolves it, see Ghost in the Ledger: Why AI Haunts Execution-First Chains.

IoT devices that prove provenance. A sensor’s readings are cryptographically committed at the point of measurement. Downstream systems can verify that the data came from the claimed sensor at the claimed time, unaltered. Provenance is not asserted — it is proven.
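A minimal sketch of point-of-measurement commitment, using only stdlib primitives: the reading is tagged with an HMAC when it is taken, so any later alteration is detectable. This assumes a key shared between device and verifier; a production design would use asymmetric signatures and hardware attestation rather than the hypothetical shared secret below.

```python
import hashlib
import hmac
import json

DEVICE_KEY = b"hypothetical-per-device-secret"  # assumed provisioned at manufacture

def commit_reading(sensor_id: str, timestamp: int, value: float) -> dict:
    # Commit the reading at the point of measurement: the tag binds
    # sensor identity, time, and value together.
    payload = json.dumps(
        {"sensor": sensor_id, "ts": timestamp, "value": value},
        sort_keys=True,
    ).encode()
    tag = hmac.new(DEVICE_KEY, payload, hashlib.sha256).hexdigest()
    return {"payload": payload.decode(), "tag": tag}

def verify_reading(record: dict) -> bool:
    # Downstream systems recompute the tag; any alteration of the
    # payload breaks the match. Provenance is checked, not asserted.
    expected = hmac.new(DEVICE_KEY, record["payload"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["tag"])
```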

Autonomous systems that compose verified actions. When each step in an autonomous workflow is accompanied by a verification proof, the composition of steps preserves the verified properties. Errors are detected at the point of occurrence, not propagated silently through the system. The failure mode is bounded refusal, not cascading hallucination.
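The composition property can be sketched as follows. Each step returns its output together with a proof, modeled here as a plain hash commitment rather than any real proof system; the pipeline checks each proof before running the next step, so a bad step produces bounded refusal at its index instead of silently corrupting everything downstream. All step names are illustrative.

```python
import hashlib

def commit(data: bytes) -> str:
    # Stand-in for a real verification proof: a hash commitment.
    return hashlib.sha256(data).hexdigest()

def step_ok(data: bytes):
    out = data + b"-ok"
    return out, commit(out)          # honest step: proof matches output

def step_buggy(data: bytes):
    out = data + b"-corrupt"
    return out, commit(data)         # faulty step: proof does not match output

def compose(steps, data: bytes):
    """Run steps in sequence, checking each step's proof against its
    output before proceeding. A mismatch halts at the point of
    occurrence with an explicit refusal, not a cascading error."""
    for i, step in enumerate(steps):
        out, proof = step(data)
        if commit(out) != proof:
            return ("REFUSED_AT_STEP", i)
        data = out
    return ("VERIFIED", data)
```

The failure is localized: the caller learns exactly which step could not be verified, and no later step ever acts on unverified state.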

Refusal as safety, not failure. When an autonomous system cannot verify the preconditions for safe action, it refuses to act. This is not a system failure — it is a safety mechanism. The system that refuses when it cannot verify is more trustworthy than the system that guesses when it should refuse.

The old saying holds: the only stupid question is the question you don’t ask. But execute-first AI cannot ask questions. It must produce output. A system that can refuse is a system that can ask “do I actually know this?” before answering. A system that must always output is a system that never asks — and never asking is how you get confidently wrong at scale.

Optionality preserved. Verification-first systems can choose to operate without verification when appropriate. Fast, approximate, unverified outputs are available when speed matters more than certainty — they are simply labeled as unverified. The constraint exists and can be relaxed. Execute-first systems have no such optionality; they cannot impose verification they were not designed to support.
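The relaxation described above can be sketched as a fast path that is always available but always labeled. The cost figure and field names are invented for the illustration; the point is that "unverified" is an explicit property of the result, visible to every downstream consumer.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Result:
    value: Any
    verified: bool  # explicit label; never implied, never assumed

def answer(query: str, budget_ms: int) -> Result:
    VERIFY_COST_MS = 50  # hypothetical cost of full verification
    if budget_ms >= VERIFY_COST_MS:
        # Slow path: verify, then answer.
        return Result(value=f"checked:{query}", verified=True)
    # Fast path: the constraint is relaxed, not abandoned; the
    # output carries an honest label saying it was not verified.
    return Result(value=f"fast:{query}", verified=False)
```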


The Choice

The question is not whether this future arrives. Capital is deployed. Devices are shipping. Agents are training.

The question is whether the infrastructure beneath them can distinguish verified from plausible. Whether devices at the edge can check their inputs within bounded resources. Whether agents can refuse when they cannot verify. Whether autonomous systems can compose verified actions or only stack plausible guesses.

Verification-first infrastructure must precede pervasive autonomy. Not after. Before. The field has the order backwards, and the cost of that error scales with every device deployed on execute-first foundations.

Verification-first infrastructure must precede pervasive autonomy. Not after. Before.

We are writing the beginning of a story now. Fifty billion devices. Trillions of dollars. Autonomous systems that will outlast the people building them. These are not abstractions — they are the systems that will manage power grids, route traffic, monitor patients, coordinate supply chains, run the infrastructure we rely on every day. The architectural decisions made today determine everything that comes after — what can be built, what can be trusted, what can be verified. Beginnings constrain endings.

Never underestimate your ability to create the future.


References

[Banerjee et al., 2024] Banerjee, S., Agarwal, A., & Singla, S. “LLMs Will Always Hallucinate, and We Need to Live With This.” arXiv:2409.05746.

[Gartner, 2025] Gartner. “Predicts 2025: Agentic AI Will Automate 15% of Work Decisions by 2028.”

[IoT Analytics, 2025] IoT Analytics. “State of IoT 2025.” iot-analytics.com.

[LeCun, 2022] LeCun, Y. “A Path Towards Autonomous Machine Intelligence.” Position paper, OpenReview.

[MarketsandMarkets, 2025] MarketsandMarkets. “AI Agents Market — Global Forecast to 2030.”

[PwC Global AI Study] PwC. “Sizing the Prize: What’s the Real Value of AI for Your Business and How Can You Capitalise?”

[Research Nester, 2025] Research Nester. “Internet of Things (IoT) Market Size to Reach $5.34 Trillion by 2035.”

[Stark, 2026a] Stark. “Verifiability and Fate-Sharing in a Cryptographic Hourglass.”

[Stark, 2026b] Stark. “Billion-Dollar Maybe: Verification-First Design and the Future of Trustless Systems.”

[Thomson & Schinazi, 2023] Thomson, M. & Schinazi, D. “Maintaining Robust Protocols.” RFC 9413. IETF/IAB.

[Xu et al., 2024] Xu, Z., Jain, S., & Kankanhalli, M. “Hallucination is Inevitable: An Innate Limitation of Large Language Models.” arXiv:2401.11817.

[Zenon Greenpaper, 2026] Zenon Network. “Greenpaper: Bounded Verification and Proof-First Applications.”

[arXiv:2512.16030] “Do Large Language Models Know What They Don’t Know? Evaluating Epistemic Calibration via Prediction Markets.” 2025.
