Security
LLMO is about truth integrity. Security is the discipline of protecting that integrity against adversarial, accidental, and entropic degradation.

Threat model

LLMO systems face threats at every layer of the ontology.

Source poisoning

Bad actors inject false or misleading claims into sources that models retrieve. This includes:

- Creating fake review profiles
- Publishing false business information on directories
- Generating synthetic content designed to influence model outputs
- Compromising authoritative sources to alter claims at the origin
Stale data exploitation
Outdated claims remain in circulation after they have been superseded. This includes:

- Cached pages reflecting old pricing, compliance status, or capabilities
- Training data containing claims that were true at training time but are no longer current
- Retrieval systems that do not check freshness before surfacing claims
Prompt injection via retrieval
When models retrieve external content as context, that content can contain adversarial instructions. This includes:

- Injected text in documents designed to override model behavior
- Hidden instructions in metadata, alt text, or structured data fields
- Content crafted to manipulate model reasoning during retrieval-augmented generation
Calibration manipulation
Systems can be made overconfident through curated inputs that exploit calibration weaknesses. This includes:

- Providing consistent but false signals across multiple sources to inflate trust scores
- Exploiting model tendencies to weight fluent, well-structured content as more trustworthy
- Manufacturing consensus across synthetic sources
Audit tampering
If the logging and audit layer is compromised, the system loses its ability to verify its own history. This includes:

- Modification of audit logs after the fact
- Deletion of provenance records
- Insertion of fabricated evaluation results
Protections
Provenance protection
Every claim in the system must maintain a traceable chain from source to representation. Provenance records should be:

- Append-only where possible
- Timestamped at creation
- Attributable to a specific source and retrieval event
- Stored independently from the claims they describe
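The record shape above can be sketched as a small data structure. This is an illustrative sketch, not a prescribed schema; the names `ProvenanceRecord` and `ProvenanceLog` and their fields are assumptions chosen to mirror the list of properties.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)  # frozen: a record is immutable once created
class ProvenanceRecord:
    claim_id: str         # identifier of the claim this record describes
    source: str           # origin of the claim (URL, publisher, feed)
    retrieval_event: str  # identifier of the retrieval that captured it
    created_at: str = field(  # timestamped at creation
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

class ProvenanceLog:
    """Append-only store, kept separately from the claims it describes."""

    def __init__(self) -> None:
        self._records: list[ProvenanceRecord] = []

    def append(self, record: ProvenanceRecord) -> None:
        # Records are only ever added, never updated or removed.
        self._records.append(record)

    def trace(self, claim_id: str) -> list[ProvenanceRecord]:
        # The traceable chain from source to representation for one claim.
        return [r for r in self._records if r.claim_id == claim_id]
```

Keeping the log in its own store (a separate database or service in practice) means a compromise of the claim store does not automatically compromise the history that describes it.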
Signing strategy
For high-trust environments, claims and truth packs should support cryptographic or attestation-based signing:

- The entity that publishes a truth pack signs it
- Validators that verify claims can co-sign
- Supersession events are signed by the superseding authority
- Signatures are timestamped to establish ordering
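A minimal sketch of the publish, co-sign, and verify flow, using HMAC over a canonical JSON payload. Real deployments would use asymmetric signatures (e.g. Ed25519) so verifiers do not need the signing key; the function names and payload shape here are illustrative assumptions.

```python
import hashlib
import hmac
import json
import time

def _canonical(payload: dict) -> bytes:
    # Canonical serialization so every signer hashes identical bytes.
    return json.dumps(payload, sort_keys=True).encode()

def sign_truth_pack(pack: dict, signer_id: str, key: bytes) -> dict:
    payload = {
        "pack": pack,
        "publisher": signer_id,
        "signed_at": time.time(),  # timestamp establishes ordering
    }
    sig = hmac.new(key, _canonical(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "signatures": [{"signer": signer_id, "sig": sig}]}

def co_sign(signed: dict, validator_id: str, key: bytes) -> dict:
    # A validator that has verified the claims adds its own signature.
    sig = hmac.new(key, _canonical(signed["payload"]), hashlib.sha256).hexdigest()
    signed["signatures"].append({"signer": validator_id, "sig": sig})
    return signed

def verify(signed: dict, signer_id: str, key: bytes) -> bool:
    expected = hmac.new(key, _canonical(signed["payload"]), hashlib.sha256).hexdigest()
    return any(
        s["signer"] == signer_id and hmac.compare_digest(s["sig"], expected)
        for s in signed["signatures"]
    )
```

A supersession event would follow the same pattern: the superseding authority signs a payload that names the superseded pack, and the timestamp inside the signed payload fixes the ordering.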
Freshness enforcement
The system must actively check freshness rather than assuming currency:

- Every claim has a validity window
- Claims beyond their validity window are flagged, not served
- Supersession chains are checked before claims are used in reasoning
- Stale claims are logged but excluded from active decision-making
Retrieval hygiene
Retrieval-augmented generation introduces an attack surface. Mitigations include:

- Input sanitization on retrieved content
- Separation of retrieved context from system instructions
- Monitoring for anomalous patterns in retrieved content
- Source reputation scoring that degrades trust for inconsistent or suspicious origins
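A sketch of the first two mitigations: pattern-based sanitization of retrieved text, and structural separation of retrieved context from system instructions. The patterns shown are a tiny illustrative sample, not a real filter, and the prompt layout is an assumption.

```python
import re

# Phrases that often signal injected instructions in retrieved text.
# Illustrative only; production filters need far broader coverage.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"system prompt", re.I),
]

def sanitize(text: str) -> tuple[str, bool]:
    """Strip suspected injected instructions; report whether any were found
    so the source's reputation score can be degraded."""
    flagged = False
    for pat in INJECTION_PATTERNS:
        if pat.search(text):
            flagged = True
            text = pat.sub("[removed]", text)
    return text, flagged

def build_prompt(system: str, retrieved: list[str]) -> str:
    # Retrieved content is fenced off from system instructions so the
    # model is steered to treat it as data, not as directives.
    docs = "\n".join(f"<doc>{sanitize(d)[0]}</doc>" for d in retrieved)
    return f"{system}\n\nRetrieved context (untrusted data):\n{docs}"
```

The `flagged` signal is the hook for reputation scoring: an origin that repeatedly trips the sanitizer, or that contradicts its own earlier content, should see its trust score degrade.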
Audit integrity
The audit layer must be protected as a first-class system component:

- Append-only logging for all decision records
- Separation of audit storage from primary system storage
- Integrity checks on log sequences
- Access controls that prevent modification of historical records
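Append-only logging with integrity checks is commonly implemented as a hash chain: each entry commits to the previous entry's hash, so modifying or deleting any historical record breaks every hash after it. A minimal sketch, with `AuditLog` as an assumed name:

```python
import hashlib
import json

class AuditLog:
    """Append-only decision log. Each entry hashes the previous one,
    so any tampering with history invalidates the chain."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, record: dict) -> None:
        prev = self._entries[-1]["hash"] if self._entries else "genesis"
        body = json.dumps(record, sort_keys=True)  # canonical serialization
        h = hashlib.sha256((prev + body).encode()).hexdigest()
        self._entries.append({"record": record, "prev": prev, "hash": h})

    def verify(self) -> bool:
        """Integrity check over the whole log sequence."""
        prev = "genesis"
        for e in self._entries:
            body = json.dumps(e["record"], sort_keys=True)
            expected = hashlib.sha256((prev + body).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True
```

In practice the chain head would also be periodically anchored somewhere outside the primary system (separate storage, or a signed checkpoint), so an attacker who controls the log store still cannot rewrite history undetected.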

