On-Prem RAG Security Playbook: Controls, Access, and Testing

At 09:10, legal needs one clause from a scanned contract.

At 09:12, finance needs a number from an archived spreadsheet.

At 09:15, security asks the question that separates demos from production systems: can one department read another department's documents by mistake?

That question is the real go-live threshold for enterprise AI search.

This playbook is scoped to one deployment model:

- one organization

- on-prem infrastructure

- internal tenant/workspace isolation between teams or departments

- no shared multi-organization SaaS runtime

In this model, security is not a slide deck or a policy PDF. Security is runtime behavior that is testable under normal load and failure conditions.

Product Scope: What Selvo Lens Supports

Selvo Lens is designed for private document AI on internal infrastructure. In this scope, the platform supports:

- authenticated access with role-aware behavior

- internal tenant/workspace isolation within one organization

- local processing for core document QA flow, without cloud API dependency

- upload validation with bounded processing behavior

- retrieval-context hardening against prompt injection

- operational logging for audit and incident response workflows

Selvo Lens does not target shared multi-organization tenancy in a single runtime. Keeping that boundary explicit prevents both technical and messaging drift.

What Usually Breaks First in Real Deployments

Most teams assume model quality is the main risk. In production, failures are usually operational:

- authorization checks implemented on one route but missing on another

- over-privileged admin actions without review trail

- upload paths that handle common files well but fail poorly on malformed ones

- logs that exist but cannot reconstruct a real incident timeline

On-prem infrastructure solves data location. It does not automatically solve control quality.

The 12 Controls That Matter Before Go-Live

The controls below are grouped by operational domain so teams can assign ownership and verification clearly.

Identity and Access Controls

1) Enforce authentication on all sensitive routes

Query, upload, export, and admin routes should require verified identity.

Evidence: unauthorized requests fail consistently across critical paths.

2) Enforce role-based authorization

Identity and permission are separate controls.

Evidence: non-admin roles cannot execute privileged actions.

3) Keep core QA execution local

If private AI is the promise, core retrieval and answer generation should stay on internal infrastructure.

Evidence: no external API dependency in standard document QA flow.

4) Enforce document and workspace boundaries at retrieval

Relevance cannot override access scope.

Evidence: cross-scope reads remain blocked, including filtered retrieval paths.

Ingestion and Retrieval Safety

5) Harden file ingestion

Upload handling is both a security and reliability surface.

Evidence: strict type/size checks, filename sanitization, bounded timeouts.

6) Isolate heavy processing

A malformed file should not degrade whole-service health.

Evidence: worker limits, timeout controls, graceful failure behavior.

7) Treat document content as untrusted input

Prompt injection can be introduced through uploaded documents.

Evidence: retrieved context is clearly separated from system instructions.

8) Scope cache keys to access context

Caching improves latency but leaks data when scope is ignored.

Evidence: cache keys include scope context and invalidation is complete.

Operational Security and Resilience

9) Capture investigation-grade security logs

Without strong logs, incidents become assumptions.

Evidence: user id, action, endpoint, status, request id, timestamp.

10) Protect secrets and runtime config

Common breaches involve hardcoded secrets or permissive defaults.

Evidence: externalized secrets and hardened production config.

11) Secure backup and restore workflows

Backups carry the same sensitivity as production data.

Evidence: encryption, access-controlled restore, tested recovery runbook.

12) Run a pre-launch security drill

Go-live should follow rehearsal, not optimism.

Evidence: auth bypass tests, blocked retrieval tests, unsafe upload tests, and log-trace tests all pass.

Failure Scenario: How Cross-Scope Leakage Happens

A frequent pattern is subtle and cumulative:

1. authorization is correct on the primary query route

2. a secondary filtered route reuses retrieval code without equivalent scope checks

3. cached responses are keyed too broadly

4. under load, one scope receives another scope's retrieval artifact

No single issue appears catastrophic in isolation. Combined, they produce a reportable incident.

This is why scope enforcement and cache design belong in security reviews, not only in performance discussions.

Admin Policy That Holds Up During Incidents

A policy that says "admins can do everything" is not an access model. It is unbounded blast radius.

A practical policy has four layers:

1. role boundaries: explicit role-to-action mapping

2. endpoint boundaries: policy coverage for query, upload, export, metadata, and maintenance paths

3. privileged action rules: high-impact operations require approval logic and traceability

4. audit evidence: sensitive actions are attributable to actor, action, and timestamp

This model reduces accidental misuse and shortens incident triage time.

5-Stage Go-Live Test Runbook

Checklist items matter only when paired with repeatable test execution.

Stage 1: authentication and role tests

Run unauthorized calls and role escalation attempts.

Pass condition: no route-level inconsistency in allow/deny behavior.

Stage 2: upload safety tests

Test invalid types, oversized files, unsafe filenames, and corrupted documents.

Pass condition: unsafe input is rejected without service instability.

Stage 3: retrieval boundary tests

Test whether filters or hybrid retrieval paths can bypass workspace limits.

Pass condition: unauthorized scope access remains blocked in all query modes.

Stage 4: cache consistency tests

Test stale data and mixed-scope high-frequency traffic.

Pass condition: no cross-scope leakage and reliable invalidation.

Stage 5: audit readiness tests

Trigger blocked actions and reconstruct incident timeline from logs only.

Pass condition: investigation-grade evidence without manual guesswork.

10-Minute Release-Day Gate

Before each release candidate:

1. run one unauthorized query

2. run one unauthorized upload

3. run one blocked privileged action

4. run one blocked retrieval outside assigned scope

5. verify complete log trace for all test events

If any step fails, release pauses until remediation is verified.

FAQ

Is this guidance for multi-organization SaaS tenancy?

No. The scope is one-organization on-prem deployment with internal workspace isolation.

Is on-prem deployment enough to claim security?

No. On-prem defines hosting location. Security depends on authorization quality, test discipline, and auditability.

Is this useful for engineering and security teams?

Yes. The guide is operational: control domains, test stages, pass criteria, and release gates.

Is this readable for non-security stakeholders?

Yes. The structure translates technical controls into observable operational behavior.

Conclusion

Enterprise trust is not built on the phrase "runs locally."

Enterprise trust is built on proof: who can access what, under which controls, with logs that hold during incident review.

That is the difference between an AI feature and a production platform.

Before the next release, run this playbook as a formal go-live gate.

If any control cannot be proven with evidence, treat it as unresolved production risk.

On-Prem RAG Security Playbook: From Controls to Go-Live