There’s always a moment just before things go wrong.
It’s the meeting where someone declares the model is working.
The dashboard appears spotless.
The demo goes smoothly.
Everyone in the room agrees.
Conversations shift to speed, efficiency, and transformation.
Then someone utters the fateful words:
“Let’s connect it.”
That’s where the real trouble starts.
In defense settings, artificial intelligence isn’t dangerous because it’s futuristic. It’s dangerous because it’s practical. Practical tools get connected. Connected tools access data. Data supports missions. Missions lead to real-world outcomes. And within the Department of Defense, those outcomes don’t remain theoretical for long.
This is precisely why AI can’t be treated as a simple utility tossed into workflows with security addressed “down the road.” Not on NIPRNet. Not on SIPRNet. Not on JWICS. Not anywhere that information, access, trust, and operational choices carry serious weight. In these contexts, AI must be managed as a mission-critical system from day one, one that must be contained, monitored, authorized, and continuously governed. This is exactly where zero trust architecture, the Risk Management Framework, and AI-specific cybersecurity guidance become operationally significant (Department of Defense Chief Information Officer [DoD CIO], 2022; Joint Task Force, 2018; DoD CIO, 2025).
The traditional security model relied on a comforting illusion: once past the perimeter, trust was already established. But AI shatters that illusion. AI doesn’t simply occupy one location performing one task. It draws from multiple sources, reacts to unpredictable human inputs, interfaces with various tools, evolves through model updates, and produces outputs that people may accept too readily. It transforms one flawed assumption into ten flawed actions unless the surrounding architecture is rigorous enough to deny access, record the event, limit the retrieval, enforce classification labels, and re-verify identity.
This is the core argument of this article. Zero trust isn’t an additional security feature added to AI after acquisition. It’s the fundamental operating model that allows AI to function safely within defense networks.
The First Mistake: Confusing the Model with the System
The most frequent error in defense AI deployment is also the most basic. People see the model and assume they’re seeing the entire system.
They’re not.
The model is merely the visible component, the tip of the spear, the impressive element in the demonstration. The complete AI system encompasses the user interface, identity services, orchestration logic, retrieval layer, connected repositories, APIs, logging infrastructure, service accounts, policy enforcement points, tuning pipeline, update pathway, and administrative controls. The DoD AI Cybersecurity Risk Management Tailoring Guide addresses this by covering AI risk across the entire lifecycle rather than treating the model as an isolated entity (DoD CIO, 2025).
This distinction is critical because most actual failures won’t originate from the model itself. They’ll stem from everything surrounding it.
The Technical Reality
Consider a hypothetical maintenance assistant running on NIPRNet. The surface function is straightforward: pose a question about a technical order or maintenance problem and get a summarized response. But beneath that simplicity lies a complex network of dependencies. The assistant might draw from technical manuals, historical work orders, local unit guidance, engineering notes, equipment databases, and workflow tools. It might depend on a service account to fetch content, role-based access control to distinguish users, document tags to filter sensitive material, and logs to trace how an answer was generated.
If one service account has excessive read permissions, the system could reveal information a user shouldn’t access. If the retrieval layer disregards policy tags, the model might produce an answer from mixed-classification content. If administrators can alter sources or system prompts without approval, trust in the system becomes trust in whoever made the last change.
The Leadership Lesson
For executives and commanders, the takeaway is straightforward: never authorize “the AI tool” as if it were a single entity. Question what data it accesses, what repositories it searches, what actions it can perform, what identities it employs, what logs it generates, and how it evolves over time. If no one can articulate that clearly, then the organization isn’t assessing a capability. It’s betting on an unknown.
And unknowns are tempting. They make everything seem simple right before it becomes costly.
Begin With the Mission, or Face the Consequences
The second error is prioritizing technology over mission.
This might seem innocuous. It usually isn’t.
The wrong starting question is, “How do we implement an LLM in our unit?”
The right starting question is, “What mission challenge are we addressing, for whom, on which network, using what data, and with what degree of human oversight?”
The Risk Management Framework (RMF) begins with preparation and categorization for this exact reason. Systems can’t be properly secured until the organization comprehends their purpose, environment, and impact (Joint Task Force, 2018). NIST’s AI RMF reinforces this by stressing governance and context, since AI risk depends largely on how the system is used rather than on the model itself (National Institute of Standards and Technology [NIST], 2023).
A Real-World Scenario
Imagine a cyber defense squadron wanting to use AI to help analysts prioritize alerts more quickly. One approach is superficial and risky: “Deploy AI to summarize security alerts.” Another approach is methodical: “Deploy AI in the unclassified environment to summarize approved SIEM alert data, cross-reference it against validated historical tickets, and retrieve only approved response playbooks for analyst evaluation. The AI may propose likely severity levels, but final classification stays with the analyst.”
These two descriptions don’t just sound different. They result in different architectures, different access controls, different testing requirements, and different authorization decisions.
The Executive Perspective
Leaders should demand precision at this stage because precision transforms security into something tangible. A vague use case creates ambiguous boundaries. A precise use case creates a clear design objective. It provides the cybersecurity team, the mission owner, and the authorization chain with something concrete to build toward.
Mission clarity isn’t bureaucratic overhead. It’s defensive architecture expressed in words.
Establish Boundaries Before Connecting Anything
This is where the real work starts. Before the AI connects to anything, the organization must define two boundaries: the system boundary and the data boundary.
This might sound administrative. It’s actually strategic.
The system boundary identifies which components fall within the authorized capability. The data boundary specifies what information the system is permitted to ingest, retrieve, process, store, and output. DoD zero trust guidance stresses that security must be built around data-centric access decisions and conditional trust, not around broad location-based assumptions (DoD CIO, 2022; DoD CIO, 2022b).
The Technical Requirements
A defense AI implementation should answer questions like these before any connection is established:
- What repositories can the system query?
- Which identities are used for retrieval?
- How are documents tagged and filtered?
- What tools can the model invoke?
- What outputs are logged?
- Which admins can modify model settings?
- What additional data sources can be integrated?
- Which model versions are authorized for specific enclaves?
- How can the system verify that a response originated from approved sources?
This is where zero trust transitions from a concept into a detailed implementation plan.
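One way to make the answers to those questions concrete is to record them as a machine-readable boundary declaration that reviewers can read and software can check. The sketch below is illustrative only: the `SystemBoundary` class, its field names, and the sample values are assumptions made for this example, not a schema defined by DoD guidance.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SystemBoundary:
    """Hypothetical record of what an AI capability is authorized to touch."""
    repositories: frozenset           # data sources the system may query
    retrieval_identities: frozenset   # service accounts used for retrieval
    tools: frozenset                  # tools the model may invoke
    config_admins: frozenset          # who may change model settings
    model_versions: dict = field(default_factory=dict)  # enclave -> approved versions

    def allows_query(self, identity: str, repository: str) -> bool:
        """Deny by default: both the identity and the repository must be listed."""
        return identity in self.retrieval_identities and repository in self.repositories

    def allows_model(self, enclave: str, version: str) -> bool:
        """A model version is usable only in an enclave that explicitly lists it."""
        return version in self.model_versions.get(enclave, frozenset())


# Example values are invented for illustration.
boundary = SystemBoundary(
    repositories=frozenset({"tech-orders", "approved-playbooks"}),
    retrieval_identities=frozenset({"svc-maint-retrieval"}),
    tools=frozenset({"search"}),
    config_admins=frozenset({"approved-admin"}),
    model_versions={"NIPRNet": frozenset({"model-1.2"})},
)
print(boundary.allows_query("svc-maint-retrieval", "tech-orders"))   # True
print(boundary.allows_query("svc-maint-retrieval", "legacy-share"))  # False
```

The design choice worth noting is that anything not written into the declaration is denied, so adding a repository or a model version becomes a visible, reviewable change rather than a quiet configuration tweak.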
A Practical Use Case
Consider an intelligence support team seeking a retrieval tool on SIPRNet designed to assist analysts in finding approved summaries and internal research. A basic approach would involve granting the service account full access to the entire document repository, allowing the model to pull any seemingly relevant data. A more structured approach involves routing requests through a security gateway that evaluates document sensitivity, user clearance, departmental access rights, and mission requirements before any information enters the model’s processing scope.
The first approach is the easy route.
The second approach is the secure route.
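The gateway in the second approach can be sketched as an attribute-based check that runs before any document reaches the model. This is a minimal illustration under stated assumptions: the `Level` ordering, the `Document` and `Requester` fields, and the three checks are hypothetical stand-ins for whatever attributes a real enclave enforces.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Illustrative ordering of sensitivity levels (an assumption for this sketch)."""
    UNCLASSIFIED = 0
    CUI = 1
    SECRET = 2


@dataclass(frozen=True)
class Document:
    doc_id: str
    level: Level
    office: str          # owning office or department
    missions: frozenset  # missions the document is approved to support


@dataclass(frozen=True)
class Requester:
    user_id: str
    clearance: Level
    offices: frozenset   # offices the user has access rights for
    mission: str         # mission the current request supports


def gateway_decision(user: Requester, doc: Document) -> tuple:
    """Return (allowed, reason). Every check must pass; any failure denies."""
    if doc.level > user.clearance:
        return False, "sensitivity exceeds clearance"
    if doc.office not in user.offices:
        return False, "no departmental access"
    if user.mission not in doc.missions:
        return False, "no mission need"
    return True, "all checks passed"


def release_to_model(user: Requester, candidates: list) -> list:
    """Only documents that pass the gateway ever enter the model's context."""
    return [d for d in candidates if gateway_decision(user, d)[0]]
```

The decision is made against the requesting user's attributes, not the service account's, which is the difference between the two approaches described above.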
Leadership Takeaways
If an AI has unrestricted access to all data, the security perimeter is already compromised at the foundational level. Commanders must demand a precise definition of the system’s boundaries: what it is permitted to access and what is strictly off-limits. If that definition is unclear, the architecture built on it is unclear too. Unclear architecture is how otherwise capable organizations end up scrambling to explain preventable breaches.
RMF as the Structural Framework
Often dismissed as bureaucratic red tape, the Risk Management Framework (RMF) is actually the critical discipline that separates a functional pilot program from a systemic failure.
According to NIST SP 800-37 Rev. 2, RMF provides a lifecycle approach to handling security and privacy risks through steps including preparation, categorization, control selection, implementation, assessment, authorization, and continuous monitoring (Joint Task Force, 2018). The DoD AI Cybersecurity Risk Management Tailoring Guide does not supplant RMF; it adapts it specifically for AI, acknowledging that these systems involve unique dependencies and adaptive behaviors requiring extra scrutiny (DoD CIO, 2025).
Technical Relevance
RMF compels teams to address difficult, unavoidable questions:
- What are the consequences of a confident but inaccurate output?
- What are the risks if retrieved data is manipulated or outdated?
- What happens if an administrator modifies the system prompt?
- How do we manage shifts in model behavior following updates?
- What if an AI operates with a service account possessing higher privileges than the end-user?
- What if logging fails to clarify how a sensitive response was generated?
These are not theoretical future scenarios; they are current engineering challenges.
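The logging question in particular can be answered in code. The following sketch shows one possible shape for a per-response provenance record that captures who asked, which identity retrieved the content, which model version answered, and which sources were used. The `ResponseRecord` fields and helper names are assumptions for illustration, not a format prescribed by RMF or the DoD guide.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ResponseRecord:
    """Hypothetical audit record written once per generated response."""
    user_id: str
    service_account: str
    model_version: str
    system_prompt_sha256: str   # detects an edited system prompt
    source_doc_ids: tuple       # which documents fed the answer
    response_sha256: str        # ties the record to the exact output


def sha256_text(text: str) -> str:
    """Hex digest of the UTF-8 encoding of the text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_record(user_id, service_account, model_version,
                 system_prompt, source_doc_ids, response_text) -> ResponseRecord:
    """Hash the prompt and response so the log stores digests, not content."""
    return ResponseRecord(
        user_id=user_id,
        service_account=service_account,
        model_version=model_version,
        system_prompt_sha256=sha256_text(system_prompt),
        source_doc_ids=tuple(source_doc_ids),
        response_sha256=sha256_text(response_text),
    )


def to_log_line(record: ResponseRecord) -> str:
    """Serialize with sorted keys so identical records produce identical lines."""
    return json.dumps(asdict(record), sort_keys=True)
```

Storing hashes rather than raw text keeps sensitive content out of the log while still letting an assessor confirm that a given prompt or response matches the record.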
Significance for Decision-Makers
For leadership, RMF is not a barrier to progress; it is the distinction between responsible innovation and negligence. It establishes a formal record of the system’s purpose, its dependencies, associated risks, mitigation measures, and the appropriate level of trust.
Without this framework, you don’t have adoption; you have ad-hoc experimentation.
Tailored Controls: Moving Beyond Theory to Implementation
Broad policy is simple to draft. Customized control implementation is where the real effort lies.
The DoD AI Cybersecurity Risk Management Tailoring Guide was created because AI systems fail differently than standard software. They rely on training data, model iterations, retrieval processes, integrated tools, and dynamic environments. This requires that standard security controls remain in place, alongside deeper customization for access management, data movement, model oversight, evaluation, and auditing (DoD CIO, 2025).
Case Study: An Action-Oriented Knowledge Assistant
Imagine a headquarters deploying an AI assistant capable of answering policy queries, drafting official emails, and initiating workflow tickets. While this appears efficient, it transforms the system from a passive responder to an active agent.
This fundamentally changes the threat landscape.
A secure architecture would segregate passive retrieval from active task execution. It would mandate strict authorization for tool usage, enforce role-based access for specific tasks, record every action, and potentially require human sign-off for high-impact operations. It would also distribute administrative powers to prevent unauthorized expansion of the assistant’s capabilities.
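A minimal sketch of that separation might look like the following, assuming a hypothetical tool catalog and role grants. The tool names, roles, and the `ToolGate` class are invented for this example; the point is the order of checks: unknown tools are denied, roles must be granted each tool explicitly, high-impact actions need human sign-off, and every decision is recorded.

```python
from dataclasses import dataclass, field

# Hypothetical tool catalog: each tool is either a read or an action,
# and actions may be flagged as high-impact.
TOOLS = {
    "search_policy": {"kind": "read", "high_impact": False},
    "draft_email": {"kind": "action", "high_impact": False},
    "open_ticket": {"kind": "action", "high_impact": True},
}

# Hypothetical role-to-tool grants.
ROLE_GRANTS = {
    "staff": {"search_policy", "draft_email"},
    "workflow_officer": {"search_policy", "draft_email", "open_ticket"},
}


@dataclass
class ToolGate:
    """Deny-by-default gate that records every authorization decision."""
    audit_log: list = field(default_factory=list)

    def authorize(self, role: str, tool: str, human_approved: bool = False) -> bool:
        spec = TOOLS.get(tool)
        if spec is None:
            allowed, reason = False, "unknown tool"
        elif tool not in ROLE_GRANTS.get(role, set()):
            allowed, reason = False, "role not granted"
        elif spec["high_impact"] and not human_approved:
            allowed, reason = False, "human sign-off required"
        else:
            allowed, reason = True, "authorized"
        # Denials are logged too, since refused attempts are often the signal.
        self.audit_log.append({
            "role": role,
            "tool": tool,
            "kind": spec["kind"] if spec else None,
            "allowed": allowed,
            "reason": reason,
        })
        return allowed
```

Keeping the catalog and the grants in separate structures mirrors the point about distributing administrative powers: the person who can add a tool need not be the person who can grant it to a role.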
The Core Message
Leaders must understand this distinction: an AI limited to searching represents one category of risk. An AI that searches, evaluates, and takes action represents an entirely different level of risk. Governance must evolve alongside capability; otherwise, efficiency becomes unmonitored delegation.
Data Tagging: The Essential Foundation of Security
Nobody enjoys metadata meetings. Nobody crafts inspiring emails about data hygiene. Yet in military AI applications, tagging is where trust is either established or eliminated.
The DoD zero trust strategy prioritizes data-centric defense and strict policy enforcement based on accurate metadata rather than assumptions (DoD CIO, 2022). In AI applications, this is vital because the retrieval mechanism serves as the bridge between the user and the data store.
The Technical Challenge
In a retrieval-augmented setup, the process usually follows these steps:
- The user inputs a query.
- The retrieval engine searches approved databases.
- Relevant data is injected into the model’s prompt.
- The model formulates the response.
If source content is improperly labeled, mixed with data of varying classifications, or accessed via an overly privileged service account, the model may process information the user shouldn’t see. At this stage, the security failure has occurred before the response is even generated.
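Because the failure happens at prompt assembly, that is where a check belongs. The sketch below filters retrieved chunks before they are injected into the prompt and fails closed on missing or unrecognized tags. The `LEVELS` ordering, the chunk dictionary shape, and the prompt wording are assumptions for illustration only.

```python
# Illustrative ordering of labels; a real deployment would use its own marking scheme.
LEVELS = {"UNCLASSIFIED": 0, "CUI": 1, "SECRET": 2}


def filter_chunks(chunks: list, user_level: str) -> tuple:
    """Split retrieved chunks into (usable, rejected_ids) for one user.

    A chunk with a missing or unrecognized tag is rejected (fail closed),
    because an unlabeled document cannot be shown to be safe to release.
    """
    ceiling = LEVELS[user_level]
    usable, rejected = [], []
    for chunk in chunks:
        rank = LEVELS.get(chunk.get("tag"))
        if rank is None or rank > ceiling:
            rejected.append(chunk["id"])
        else:
            usable.append(chunk)
    return usable, rejected


def build_prompt(question: str, chunks: list, user_level: str) -> tuple:
    """Assemble the prompt from permitted chunks only; return it with the rejected ids."""
    usable, rejected = filter_chunks(chunks, user_level)
    context = "\n".join(f"[{c['id']}] {c['text']}" for c in usable)
    prompt = f"Answer using only the sources below.\n{context}\nQuestion: {question}"
    return prompt, rejected
```

Returning the rejected identifiers alongside the prompt gives the logging layer something to record, so a pattern of untagged or over-classified hits surfaces as a data hygiene finding rather than disappearing silently.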
A Real-World Scenario
Consider a unit-level AI designed to answer internal policy and procedure questions. Gradually, it is linked to shared drives, team sites, local archives, and legacy files. Some metadata is accurate; some is missing. Some documents are obsolete. Some contain internal drafts never intended for wide circulation. The service account is granted blanket access because refining permissions is viewed as a hassle.
A user submits a simple query. The output is polished, authoritative, and references internal documents that should have remained restricted.
This isn’t a model malfunction; it is a governance failure operating under the guise of AI.
The Strategic Lesson
For leadership, the point is clear: metadata hygiene is not administrative busywork. It is the primary mechanism determining whether AI upholds or silently undermines security policy.
Access Control: When AI Evolves into an Operational Asset
This is where many organizations misjudge the danger. They concentrate on the AI’s output rather than its capabilities.
This is a fundamental oversight.
Contemporary AI platforms are increasingly integrated with tools, APIs, storage systems, workflow engines, and automated processes. In a defense context, these integrations can turn an assistant into a significant operational asset. However, if permissions, approvals, and logging are inadequate, they can just as easily turn it into a significant liability.
Technical Explanation
A retrieval assistant that only reads from a limited knowledge base carries certain risks. An AI agent that goes further (pulling dashboards, opening tickets, running scripts, editing records, or launching automated workflows) introduces a different and much larger set of risks. Under a zero trust framework, every action the system takes must be backed by clearly defined permission, narrowly limited access, and fully auditable checkpoints, not by blanket trust passed along from the host application (DoD CIO, 2022).
A Realistic Example
Picture a network operations center rolling out an AI assistant to handle incident response. Initially, it only digests reports and suggests response playbooks. To save time, the team then lets it create draft support tickets. After that, it gets permission to pull live operational metrics. Then it’s authorized to file change requests. Each step, taken alone, feels incremental and harmless.
But at some point the tool quietly crosses a line. It is no longer merely “helping out.” It has become something that can act on the environment — an action surface.
Executive Explanation
Leaders should approach AI permissions the same way they approach delegating authority to human staff. If the system’s outputs can alter or kick off downstream processes, then oversight must extend beyond “what it can see” to also cover “what it is allowed to set in motion.” Granting power without guardrails is not forward-thinking — it is unmanaged drift.
Testing AI Means Probing Behavior Under Stress
Routine vulnerability scans, configuration audits, and patch cycles remain essential. But on their own, they fall short.
NIST’s AI Risk Management Framework makes clear that AI risk must be evaluated across the dimensions of context, observable behavior, trustworthiness, and resilience — not just through standard software assurance checklists (NIST, 2023). This is particularly critical in defense, where AI outputs can shape mission-critical judgments even when the system itself does not make the final call.
A Useful Framework for Testing
A solid AI testing program in a defense setting should operate along three parallel tracks:
Infrastructure testing checks whether the underlying environment is hardened, properly segmented, patched, and shielded.
Behavior testing examines how the model handles hostile prompts, garbled inputs, prompt injection attacks, retrieval tampering, dangerous instructions, or unclear situational context.
Mission testing evaluates whether the system acts appropriately inside realistic operational scenarios — with fatigued users, missing data, contradictory inputs, and failing upstream systems.
Example
A cyber-analysis assistant shines during polished demonstrations, concisely summarizing suspicious network activity. But when testers deliberately craft prompts designed to strip away formatting rules, remove uncertainty cues, or push the model toward unwarranted confidence, something shifts. The system begins to sound stubbornly authoritative precisely where it should express hesitation.
This is not a theoretical concern — it is an operational hazard. Analysts working under time pressure are especially prone to absorbing polished overconfidence. Possible fixes include enforcing stricter output templates, requiring confidence-language disclosures, hardening the prompt layer, and training analysts to actively question the machine’s certainty rather than accept it uncritically.
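The first two fixes, stricter output templates and confidence-language disclosures, lend themselves to an automated check in a behavior test suite. The sketch below assumes the assistant returns a structured response with `summary`, `confidence`, and `evidence` fields. Those field names and the list of overconfident phrases are assumptions for this example and would need tuning for any real system.

```python
import re

ALLOWED_CONFIDENCE = {"low", "medium", "high"}

# Illustrative list of overconfident phrases; not exhaustive.
ABSOLUTE_TERMS = re.compile(
    r"\b(definitely|certainly|guaranteed|without doubt)\b", re.IGNORECASE
)


def validate_output(response: dict) -> list:
    """Return a list of template violations; an empty list means the output passes."""
    problems = []
    summary = response.get("summary") or ""
    if not summary:
        problems.append("missing summary")
    if response.get("confidence") not in ALLOWED_CONFIDENCE:
        problems.append("missing or invalid confidence label")
    if not response.get("evidence"):
        problems.append("no supporting evidence cited")
    if ABSOLUTE_TERMS.search(summary):
        problems.append("absolute language in summary")
    return problems


# A response that was pushed to drop its uncertainty cues fails the check.
pressured = {"summary": "This is definitely a false positive.", "evidence": []}
print(validate_output(pressured))
```

A check like this does not make the model more accurate. It only ensures that a response stripped of its hedges is caught before an analyst under time pressure reads it.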
Executive Explanation
For decision-makers, the takeaway is brief and blunt: passing a clean demo proves next to nothing. The real question is not whether the AI performs under perfect lab conditions. It is what happens when the environment is chaotic, adversarial, rushed, uncertain, or simply broken.
That is where genuine confidence is built.
Continuous Monitoring — Because the System You Signed Off On Won’t Stay That Way
AI systems drift. Subtly at first, then steadily, then suddenly.
A model version gets swapped out. A new data source gets plugged in. A prompt template is tweaked. An integration connector is widened. A team expands the user pool. An administrator makes what seems like a minor, sensible adjustment that ends up reshaping system behavior at scale. NIST’s RMF treats continuous monitoring as a cornerstone of sustained risk management, and DoD’s AI-specific guidance reinforces the need for periodic reassessment as these systems evolve (Joint Task Force, 2018; DoD CIO, 2025).
Technical Explanation
For AI systems, monitoring needs to record far more than network traffic and authentication logs. It should also track:
- model version changes,
- connector changes,
- data source changes,
- prompt policy changes,
- tool invocation anomalies,
- retrieval patterns,
- admin activity,
- output policy violations,
- and signs of performance drift.
The reason is straightforward: an AI system can change in meaningful, high-impact ways while appearing outwardly unchanged.
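One simple way to surface that kind of invisible change is to compare the running configuration against the configuration that was authorized. The sketch below fingerprints a configuration and reports which keys have drifted. The configuration keys and values are hypothetical, and a real baseline would cover far more than four fields.

```python
import hashlib
import json


def fingerprint(config: dict) -> str:
    """Stable hash of a configuration; key order does not affect the result."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_config(baseline: dict, current: dict) -> dict:
    """Return {key: (baseline_value, current_value)} for every key that differs."""
    keys = sorted(set(baseline) | set(current))
    return {
        k: (baseline.get(k), current.get(k))
        for k in keys
        if baseline.get(k) != current.get(k)
    }


# Hypothetical authorized baseline and the configuration observed later.
authorized = {
    "model_version": "1.2",
    "data_sources": ["playbooks", "tickets"],
    "admins": ["admin-1"],
    "logging": "full",
}
observed = dict(
    authorized,
    model_version="1.3",
    data_sources=["playbooks", "tickets", "shared-drive"],
)

if fingerprint(observed) != fingerprint(authorized):
    for key, (before, after) in diff_config(authorized, observed).items():
        print(f"drift in {key}: {before} -> {after}")
```

The fingerprint gives a cheap yes-or-no signal for routine checks, while the diff tells a reviewer what changed and therefore whether reassessment is warranted.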
Real-World-Style Example
An organization deploys an AI assistant for internal knowledge lookup. Six months on, end users see the same interface they always have. Behind the scenes, however, the model version has been updated, additional data repositories have been linked, new administrators now have access, and the original logging configuration has quietly drifted from the baseline. The capability now functions under a different set of assumptions than the one that was originally evaluated and approved.
That is not a minor footnote. That is a different system operating under the same label.
Executive Explanation
For leaders, continuous monitoring is what transforms an authorization from a one-time paper approval into an ongoing, informed risk judgment. If the system evolves while the organization lacks clear visibility into those changes, then trust in the original authorization becomes a memory — not an active control.
Responsible AI and Secure AI Are the Same Fight
The Department of Defense has established ethical AI principles requiring systems to be responsible, equitable, traceable, reliable, and governable (U.S. Department of Defense, 2020). Follow-up guidance has focused on translating those principles into day-to-day engineering practice (Department of Defense, 2021).
This is significant because each of those principles maps directly onto security and operational discipline.
A traceable system demands comprehensive logs, data provenance, and full version accountability.
A reliable system must be rigorously tested, continuously monitored, and tightly bounded.
A governable system must be interruptible, subject to review, and placed under meaningful human authority.
These are not aspirational ethics statements. They are implementation requirements.
Executive Explanation
For senior leaders, responsible AI is not a separate discussion to be held after the cybersecurity review wraps up. It is woven into the same trust architecture. In defense contexts, if a system cannot be understood, bounded, overridden, and fully audited, then it fails the standard for both responsibility and security — and should not be relied on for mission execution.
What Squadrons and Units Should Actually Do Next
The way forward does not demand solving enterprise-wide AI governance overnight. It demands one disciplined step, then another, and then another — until the capability is mature enough to earn genuine trust.
Start with a single, clearly bounded mission use case.
Define the system’s scope and data boundaries explicitly.
Apply the Risk Management Framework early in the lifecycle.
Tailor security controls to the unique AI development and deployment pipeline.
Enforce zero trust principles across all identities, data access paths, connectors, and tool integrations.
Test the infrastructure, the AI’s behavioral responses, and mission-level performance.
Formally authorize the capability as an operational system.
Then monitor it without interruption and re-evaluate it whenever material changes occur (DoD CIO, 2022; Joint Task Force, 2018; NIST, 2023; DoD CIO, 2025).
This pace will feel slower than a commercial pilot project. But it is also the surest way to prevent a “helpful assistant” from quietly becoming an unmonitored gateway into sensitive operational workflows.
In defense, speed matters.
But staying alive and operational matters even more.
Conclusion: In Defense AI, Discipline Wins the Future
The push to accelerate never stops.
There’s always someone insisting the tool works fine as-is.
Someone’s always eager to activate that integration.
And someone’s always pushing to scale the pilot program before controls are in place.
That urgency isn’t disappearing. Neither is artificial intelligence.
So for military and defense groups, the real issue isn’t whether AI will end up in classified or high-stakes settings—because it already has. The real question is: will it arrive through strict design principles, limited access rights, binding policies, formal approval processes, and constant oversight? Or will it show up like so many tools do—quickly adopted, highly functional, poorly regulated, and trusted more than the data supports?
This is exactly what zero trust is built to prevent.
This is what the Risk Management Framework structures.
This is what secure deployment requires.
Because in defense systems, the greatest threat isn’t AI becoming overly capable.
It’s AI becoming useful long before it’s safe.
References
Department of Defense. (2021). Implementing responsible artificial intelligence in the Department of Defense.
Department of Defense Chief Information Officer. (2022). DoD zero trust strategy.
Department of Defense Chief Information Officer. (2022b). Department of Defense zero trust reference architecture (Version 2.0).
Department of Defense Chief Information Officer. (2025). AI cybersecurity risk management tailoring guide.
Joint Task Force. (2018). Risk management framework for information systems and organizations: A system life cycle approach for security and privacy (NIST Special Publication 800-37, Rev. 2). National Institute of Standards and Technology.
National Institute of Standards and Technology. (2023). Artificial intelligence risk management framework (AI RMF 1.0) (NIST AI 100-1).
U.S. Department of Defense. (2020, February 25). DOD adopts 5 principles of artificial intelligence ethics.
About the Author
Joe Guerra holds a Master’s in Computer Science and a Master’s in Software Engineering, along with certifications including CASP+ and CCSP. As a dedicated tech and cybersecurity expert, he focuses on enabling secure digital modernization for government and national defense operations. With deep experience in software development, cybersecurity strategy, AI integration, and technical leadership, Joe helps build mission-driven solutions that align with real-world demands in public-sector environments. At FEDITC, LLC, he contributes to a team delivering vital support for critical missions globally—offering expertise in cybersecurity, cloud infrastructure, systems engineering, software solutions, health IT, and enterprise resilience. FEDITC stands out through its commitment to operational security excellence, providing services such as cybersecurity program development, RMF compliance, vulnerability assessments, DevSecOps integration, custom mission applications, and continuous process improvements designed to equip military units with robust, regulation-compliant, and high-performing technology systems.



