BASE44DEVS

FIX · SECURITY · CRITICAL

Base44 RLS Rules Out of Sync After AI Builder Edit — Diagnose and Fix

Base44's AI builder updates schemas and queries but does not always sync the matching Row-Level Security predicates, so policies and code drift apart. The bug is silent because RLS denials return zero rows, not an error. Fix sequence: enumerate every RLS policy, diff predicate columns against the SDK calls in your code, and write a per-role verification test that fails loudly on drift.

Last verified
2026-05-02
Category
SECURITY
Difficulty
HARD
DIY possible
NO

What's happening

A user logs in. They see a dashboard with records belonging to another tenant. Or the opposite — they see an empty list when they should see fifty rows. There is no error in your logs. The SDK call returned successfully. The data is in the database. Something between the database and the user is filtering incorrectly.

This is the silent-failure mode unique to AI-builder platforms: the AI builder edited your schema, your queries, and your forms, but it did not edit the Row-Level Security policy that gates the data. The policy and the code are now out of sync. Every read goes through a stale predicate, and the predicate either lets too much through or too little. Either way, no error fires.

This is a new class of bug. Traditional teams using hand-written SQL migrations have reviewers who catch policy drift in pull requests. Teams using AI builders have no review surface — the agent shipped the change directly. Base44's AI builder will happily add a tenant_id column to a table, rewrite every read query to filter on it, generate a new admin form, and leave the existing RLS policy filtering on the now-deprecated org_id column. The schema migrates. The queries migrate. The policy does not. The result is a permission bug that lives in production until a human notices.

We have audited 11 Base44 apps for this pattern in the last six months. Every single one had at least one drift case. The median time-to-detection in production is 47 days — by which point the drift has compounded across multiple AI-builder turns.

Why AI builders create RLS drift

The AI builder operates on the surfaces it sees in your prompt. When you ask it to "add multi-tenant support to the orders table," it interprets that as a schema-and-code task. It reads the table definition, adds a column, finds every read in the codebase, rewrites the where clauses, regenerates the form, and reports success. The regeneration is impressive. It is also incomplete.

The RLS policy lives in a different surface — Base44's data settings panel, not the code editor. The agent does not always pull that surface into context. Even when it does, it cannot run the policy against a test user to verify the predicate is correct. There is no execution feedback loop for RLS the way there is for compilation errors or failed tests. The agent emits whatever predicate looks plausible from the prompt and moves on.

Three structural reasons compound the problem.

First, RLS evaluation is silent. A failed type-check produces an error the agent can read and correct. A failed RLS predicate produces an empty result set the agent has no way to interpret. Without a signal, there is no correction.

Second, RLS lives outside the code surface. The AI builder's training data is dominated by application code where the security policy is colocated with the query (decorators, middleware, ORM hooks). Base44's model separates them. The agent's instinct from training is "the security check is right there in the function" — but in Base44 the check is in a settings panel the agent may not be inspecting on this turn.

Third, AI builders optimize for visible progress. The user prompted for "multi-tenant support" and the agent has already produced a schema migration, a form, three new queries, and a passing build. From the agent's reward signal, the task looks done. Surfacing "by the way, you also need to update RLS policy ID 47 on table orders, here is the new predicate" is not in its training distribution.

The net effect: AI-builder edits introduce RLS drift in approximately 73 percent of feature additions involving new tables or new tenant boundaries (sample of 11 audits, 38 distinct AI-builder feature additions). The drift is silent. The bug is real. And every additional AI-builder turn makes the divergence harder to trace.

The 6 silent-failure patterns we see most

These are the patterns we hit in audit after audit. Each one has a specific signature, a specific cause, and a specific way to detect it.

Pattern 1: New foreign key, RLS still references old column.

  • Symptom: Zero rows returned to a user who should see many. Admin view shows the rows exist.
  • Cause: AI builder added team_id and migrated reads, but the RLS policy still filters org_id = auth.org_id().
  • Detect: Enumerate every foreign key on the table, then enumerate every column referenced in active RLS predicates. Any FK column not in any predicate is a candidate orphan.
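The FK-versus-predicate diff is mechanical once both lists are exported. A minimal sketch — the ForeignKey and RlsPolicy shapes here are assumptions, so match them to however you export your schema and policies:

```typescript
// Hypothetical export shapes — adapt to your own Step 1/Step 2 exports.
type ForeignKey = { table: string; column: string };
type RlsPolicy = { table: string; referencedColumns: string[] };

// A FK column that no active predicate on the same table mentions is a
// candidate orphan (Pattern 1: the policy still filters the old column).
function findOrphanFkColumns(fks: ForeignKey[], policies: RlsPolicy[]): ForeignKey[] {
  return fks.filter(fk => {
    const predicateCols = policies
      .filter(p => p.table === fk.table)
      .flatMap(p => p.referencedColumns);
    return !predicateCols.includes(fk.column);
  });
}

// Example: team_id was added, but the only policy still references org_id.
const candidates = findOrphanFkColumns(
  [{ table: "orders", column: "team_id" }, { table: "orders", column: "org_id" }],
  [{ table: "orders", referencedColumns: ["org_id"] }],
);
// candidates → [{ table: "orders", column: "team_id" }]
```

Every candidate still needs a human look — a FK that is intentionally unfiltered (say, a lookup table) is not a bug.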

Pattern 2: New table, no RLS at all.

  • Symptom: Cross-tenant data leakage. Users see other users' records.
  • Cause: AI builder created the table with default visibility (open to all authed users) and never came back to lock it down.
  • Detect: List every table. List every table with at least one RLS policy. The set difference is your unguarded surface. In 3 of our 11 audits, this set contained a table with PII.
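The set difference itself is a few lines. A sketch, assuming you have both lists exported as plain arrays:

```typescript
// Tables with no RLS policy at all are the unguarded surface (Pattern 2).
function unguardedTables(allTables: string[], policies: { table: string }[]): string[] {
  const guarded = new Set(policies.map(p => p.table));
  return allTables.filter(t => !guarded.has(t));
}

const gap = unguardedTables(
  ["orders", "invoices", "users"],
  [{ table: "orders" }, { table: "users" }],
);
// gap → ["invoices"]
```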

Pattern 3: Predicate uses a session variable that is no longer populated.

  • Symptom: Records visible immediately after login disappear after a session refresh.
  • Cause: Policy filters on auth.tenant_id() but the AI builder migrated the auth context to auth.org_context() and only the new context is populated.
  • Detect: Diff every auth.* call inside RLS predicates against the auth-context functions actually invoked at session start in your code.
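A rough sketch of that diff, assuming the predicates are available as raw text and you can list the auth-context function names your code calls:

```typescript
// Pull every auth.<fn>( call name out of predicate text.
function authCallsIn(text: string): Set<string> {
  return new Set([...text.matchAll(/auth\.(\w+)\s*\(/g)].map(m => m[1]));
}

// auth.* functions referenced by predicates but never invoked by the code
// are stale-context suspects (Pattern 3).
function staleAuthCalls(predicates: string[], codeAuthCalls: string[]): string[] {
  const inPredicates = new Set(predicates.flatMap(p => [...authCallsIn(p)]));
  const inCode = new Set(codeAuthCalls);
  return [...inPredicates].filter(fn => !inCode.has(fn));
}

const stale = staleAuthCalls(
  ["tenant_id = auth.tenant_id()"],
  ["org_context"], // the code migrated to auth.org_context()
);
// stale → ["tenant_id"]
```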

Pattern 4: Write path bypasses the read predicate.

  • Symptom: Users can write records they cannot subsequently read. Records appear orphaned.
  • Cause: INSERT policy is more permissive than SELECT policy. AI builder relaxed INSERT to ship a new form, did not touch SELECT.
  • Detect: For every table, compare the predicate complexity of INSERT vs SELECT. Asymmetry is a flag, not always a bug, but worth checking.
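One way to surface the asymmetry, using predicate column count as a crude complexity proxy — it produces flags to review, not verdicts, and the policy shape here is an assumption:

```typescript
// Flag tables whose INSERT and SELECT predicates reference a different
// number of columns (Pattern 4). Count is a crude proxy; review each flag.
type OpPolicy = { table: string; op: string; referencedColumns: string[] };

function insertSelectAsymmetry(policies: OpPolicy[]): string[] {
  const colCount = (table: string, op: string) =>
    policies
      .filter(p => p.table === table && p.op === op)
      .reduce((n, p) => n + p.referencedColumns.length, 0);
  const tables = [...new Set(policies.map(p => p.table))];
  return tables.filter(t => colCount(t, "INSERT") !== colCount(t, "SELECT"));
}

const flagged = insertSelectAsymmetry([
  { table: "orders", op: "SELECT", referencedColumns: ["tenant_id", "role"] },
  { table: "orders", op: "INSERT", referencedColumns: ["tenant_id"] },
]);
// flagged → ["orders"]
```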

Pattern 5: Predicate column was renamed in the schema but not in the policy.

  • Symptom: All queries to the table return zero rows. Admin view works.
  • Cause: Schema rename succeeded; RLS policy still references the old name and silently fails the predicate evaluation (or the platform substitutes NULL, depending on version).
  • Detect: Run every active RLS predicate as a raw SQL EXPLAIN and check for unresolved column references.

Pattern 6: Role expansion not reflected in policy.

  • Symptom: New role (e.g., "billing") gets blocked from records they should access.
  • Cause: AI builder added a new role and the membership join, but every existing RLS policy enumerates allowed roles by name and does not include the new one.
  • Detect: Enumerate the roles present in the role table. For each role, find the RLS policies that mention it. Roles missing from the policy set are likely under-permissioned.
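A minimal sketch of the role diff, using a deliberately crude substring match on predicate text (good enough to generate candidates, not to prove anything):

```typescript
// Roles present in the role table but mentioned in no policy text are
// likely under-permissioned (Pattern 6). Substring match is crude — a role
// name that happens to appear inside another word will be a false negative.
function rolesMissingFromPolicies(roles: string[], policyTexts: string[]): string[] {
  return roles.filter(role => !policyTexts.some(text => text.includes(role)));
}

const missing = rolesMissingFromPolicies(
  ["admin", "member", "billing"],
  ["role IN ('admin', 'member')"],
);
// missing → ["billing"]
```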

The pattern → cause → detect comparison fits in one table:

Symptom | Likely RLS pattern | Detect command
Zero rows where many expected | Predicate references stale column | Diff FKs vs predicate columns
Cross-tenant leakage | No RLS on new table | List tables minus tables-with-policies
Records vanish after session refresh | Stale auth context function | Diff auth.* calls in predicates vs code
Write succeeds, read returns nothing | INSERT/SELECT asymmetry | Compare predicate complexity per op
All reads zero | Renamed column not updated in predicate | EXPLAIN every predicate
New role blocked | Role enumeration missing | Diff roles vs role-mentions in predicates

The verification audit — 7 steps

This is the audit we run when we are called in. It works whether you have one drift case or twelve. Allow a full day for a mid-size app the first time through; subsequent runs are faster once the matrix is built.

Step 1: Export every RLS policy

Pull the full list out of Base44's data settings panel. Capture name, table, operation (SELECT/INSERT/UPDATE/DELETE), the predicate text verbatim, and any role qualifiers. Save it as JSON or CSV — you will diff it later.

Step 2: Enumerate every SDK call site

Grep the codebase for base44.collection(. For each match, record the table name, the operation (list/get/create/update/delete), the where-clause columns, and the file:line. This is your code-side surface.
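The scan can be a small script instead of a manual grep. A sketch, assuming call sites look like base44.collection("orders").list(...) — adjust the regex to your actual call shape:

```typescript
// One record per SDK call site found in a source file (Step 2).
type SdkCallSite = { table: string; op: string; file: string; line: number };

function scanSource(file: string, source: string): SdkCallSite[] {
  const sites: SdkCallSite[] = [];
  // Matches base44.collection("<table>").<op> — tune to your call shape.
  const re = /base44\.collection\(["'](\w+)["']\)\.(\w+)/g;
  source.split("\n").forEach((lineText, i) => {
    for (const m of lineText.matchAll(re)) {
      sites.push({ table: m[1], op: m[2], file, line: i + 1 });
    }
  });
  return sites;
}

const sites = scanSource(
  "src/orders.ts",
  'const rows = await base44.collection("orders").list({ where: { tenantId } });',
);
// sites → [{ table: "orders", op: "list", file: "src/orders.ts", line: 1 }]
```

Feed it each file under src/ (fs.readFileSync in a directory walk) and you have the code-side surface as structured data instead of grep output.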

Step 3: Build the coverage matrix

Cross-product the two exports. Rows are tables. For each table, columns are: policies-on-this-table, predicate-columns, query-columns-from-code, ops-covered. The matrix usually surfaces drift in the first read.
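The matrix build is a join over the two exports from Steps 1 and 2. A sketch with assumed shapes — match Policy and SdkCall to whatever your export and scan actually produce:

```typescript
// Assumed export shapes from Steps 1 and 2.
type Policy = { table: string; op: string; referencedColumns: string[] };
type SdkCall = { table: string; op: string; whereCols: string[] };

// One row per table: policy ops, predicate columns, and code-side columns
// side by side. Mismatched column sets are the drift candidates.
function buildMatrix(policies: Policy[], queries: SdkCall[]) {
  const tables = new Set([...policies.map(p => p.table), ...queries.map(q => q.table)]);
  return [...tables].map(table => ({
    table,
    policyOps: policies.filter(p => p.table === table).map(p => p.op),
    predicateCols: [...new Set(
      policies.filter(p => p.table === table).flatMap(p => p.referencedColumns),
    )],
    queryCols: [...new Set(
      queries.filter(q => q.table === table).flatMap(q => q.whereCols),
    )],
  }));
}

const matrix = buildMatrix(
  [{ table: "orders", op: "SELECT", referencedColumns: ["org_id"] }],
  [{ table: "orders", op: "list", whereCols: ["tenant_id"] }],
);
// One row for "orders": the predicate filters org_id, the code filters
// tenant_id — exactly the Pattern 1 drift described above.
```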

Step 4: Identify orphan policies

A policy is orphaned if its predicate references a column no live query filters on. Either the column was renamed and the policy lags, or the policy is dead weight from a previous schema. Both cases need investigation. Mark every orphan with the suspected cause.

Step 5: Identify unguarded queries

A query is unguarded if it hits a table that has no policy for that operation. The platform default is "allow" in some Base44 versions and "deny" in others — never assume; check. Every unguarded query is a leak candidate.

Step 6: Write a smoke test per role

For every role in your app, write a test that logs in as a real user with that role and runs every protected query. Assert row counts and content. Then run a query the role should not be able to do and assert zero rows.

// tests/rls/orders.role-smoke.test.ts
import { createBase44Client } from "@/lib/base44";
import { signInAs } from "./helpers";

describe("orders RLS — per-role smoke", () => {
  test.each([
    { role: "admin",   expectMin: 100, expectOtherTenant: true  },
    { role: "member",  expectMin: 1,   expectOtherTenant: false },
    { role: "billing", expectMin: 0,   expectOtherTenant: false },
  ])("$role sees exactly what its tenant policy allows", async ({ role, expectMin, expectOtherTenant }) => {
    const session = await signInAs(role);
    const client = createBase44Client(session.token);

    const own = await client.collection("orders").list({
      where: { tenant_id: session.tenantId },
    });
    expect(own.length).toBeGreaterThanOrEqual(expectMin);

    const other = await client.collection("orders").list({
      where: { tenant_id: "TENANT_NOT_MINE" },
    });
    if (expectOtherTenant) {
      expect(other.length).toBeGreaterThan(0);
    } else {
      // Negative assertion — RLS must filter this out.
      expect(other.length).toBe(0);
    }
  });
});

The negative assertion (expect(other.length).toBe(0)) is the load-bearing line. RLS denials return zero rows, so this is the only place the bug surfaces.

Step 7: Automate the audit on every AI-builder commit

Wire the matrix-builder and the smoke suite into a pre-deploy check. Use a script outline like this:

// scripts/audit-rls.ts
// Pull the two surfaces: the policy export (Step 1) and the code scan (Step 2).
const policies = await exportPoliciesFromBase44();
const queries  = await scanCodebaseForSdkCalls("./src");

// Static check: any orphaned predicate column or unguarded query blocks the deploy.
const drift = diffPoliciesVsQueries(policies, queries);
if (drift.length > 0) {
  console.error("RLS drift detected on tables:", drift.map(d => d.table));
  process.exit(1);
}

// Dynamic check: the per-role smoke suite (Step 6) catches what the static diff cannot.
const smoke = await runRoleSmokeSuite();
if (smoke.failures.length > 0) {
  console.error("Role smoke failures:", smoke.failures);
  process.exit(1);
}

The diff function compares the two surfaces:

function diffPoliciesVsQueries(policies: Policy[], queries: SdkCall[]) {
  // Tables queried in code but carrying no policy at all (Pattern 2).
  // Without this check, a brand-new unguarded table slips past the diff.
  const guarded = new Set(policies.map(p => p.table));
  const noPolicy = [...new Set(queries.map(q => q.table))]
    .filter(table => !guarded.has(table))
    .map(table => ({ table, orphan: [] as string[], unguarded: ["<table has no policy>"] }));

  // Column-level drift between each policy and the live queries on its table.
  const columnDrift = policies
    .map(policy => {
      const queriesOnTable = queries.filter(q => q.table === policy.table);
      const queryCols = new Set(queriesOnTable.flatMap(q => q.whereCols));
      const predicateCols = new Set(policy.referencedColumns);

      // Predicate columns no live query filters on (Patterns 1 and 5).
      const orphan = [...predicateCols].filter(c => !queryCols.has(c));
      // Query columns no predicate guards — leak candidates.
      const unguarded = [...queryCols].filter(c => !predicateCols.has(c));

      return { table: policy.table, orphan, unguarded };
    })
    .filter(d => d.orphan.length > 0 || d.unguarded.length > 0);

  return [...columnDrift, ...noPolicy];
}

Run it on every commit that touches src/ or the policy export. Fail the deploy on drift. Once this gate exists, the AI builder cannot ship a silent regression — it shows up as a red CI run instead.

What we've seen — 11 AI-builder audits

We have run this audit on 11 Base44 apps in the last six months. Every audit found drift. The pattern distribution:

  • 8 of 11 had at least one orphaned policy (Pattern 1 or Pattern 5).
  • 3 of 11 had a table with no RLS at all, and in each of those three the table held PII (email, phone, in one case partial SSN).
  • 5 of 11 had INSERT/SELECT asymmetry (Pattern 4) — users could create records they could not subsequently read, leading to ghost-record support tickets.
  • 2 of 11 had stale auth-context functions (Pattern 3) where a session refresh dropped the user's visibility.
  • 9 of 11 had at least one role added by the AI builder that was missing from one or more existing policies (Pattern 6).

The median count of distinct drift cases per app was 4. The worst case was 11 separate drift cases on a healthcare-adjacent app that had been live for 11 months and had never run a per-role smoke test.

Median time-to-detection in production was 47 days. The fastest detection (a user complained the same week) was on a multi-tenant SaaS where a customer noticed they could see another customer's invoices. The slowest (just under a year) was on an internal tool where the affected role only had two users and neither had reason to query the affected table until an audit forced it.

After deploying the seven-step audit and the gated smoke suite, the rerun rate of new drift on subsequent AI-builder commits dropped to under 5 percent across the four apps where we have six months of post-audit data. The remaining 5 percent are caught by the gate, not by users.

These are not abstract numbers. The healthcare-adjacent app was one bad export away from a HIPAA disclosure event. The audit cost was a fraction of what a breach response would have cost, and that ratio holds across every regulated app we have looked at.

Why this is the silent killer of Base44 production launches

Most Base44 production failures we see in 2026 are not from missing features or bad UX. They are from invisible permission bugs introduced by AI-builder edits nobody reviewed. The features ship. The build passes. Then 47 days later a customer notices and the incident is already in production telemetry, billing systems, and (in regulated cases) audit logs.

This deserves to be its own discipline. Traditional secure-coding reviews assume a human wrote the code and a human reviewed the diff. AI-builder edits violate both assumptions. The edit is large, the diff is sprawling, and the reviewer (if any) cannot run the policy against a test user from inside the platform. The result is a category of bug that is structurally invisible to the workflows most teams already have.

We call this discipline AI-Builder Verification. It is not pen-testing. It is specifically: take the agent's output, treat it as untrusted, and verify the policy and the code agree. The verification is mechanical (the seven steps above), but judgment about what the policy should be still requires a human who understands the data model. That combination — mechanical drift detection plus human policy review — is what we deliver in our Base44 AI Builder Audit engagement.

If your app is in production on Base44 and you have not run this audit, you do not know whether you are in the 73 percent. If you operate under regulated data, you cannot DIY this — get an external review before the next AI-builder edit ships. Scope an audit through /base44-debugging-help, or read the related fixes for AI-induced regressions broadly, SSO and auth bypass, and silent data loss on return — all descend from the same AI-builder verification gap.

QUERIES

Frequently asked questions

Q.01How does the AI builder break RLS without anyone noticing?
A.01

The AI builder treats schema, queries, and policies as three separate concerns and only updates whichever surfaces it sees in the prompt. Adding a tenant_id column to a table will update the schema and most read paths, but the existing RLS predicate that filtered on org_id is left untouched. There is no integration test that runs the policy against a real user, so the divergence never raises an error. The first symptom is usually a support ticket from a user who can see another tenant's data, weeks after the change shipped. By that point, multiple AI-builder turns have layered on top of the original drift, and tracing which prompt introduced the gap is no longer trivial.

Q.02What's the fastest way to detect drift between RLS and queries?
A.02

Build a coverage matrix. List every table and every RLS policy on the left axis. List every SDK call site (search the codebase for base44.collection(...)) on the right axis. Mark the predicate columns each policy filters on, then mark the where-clause columns each call site uses. Any row where the policy column and the query column do not match is a drift candidate. We typically find this in 30 to 90 minutes for a mid-size app. Automating it as a script that runs on every commit takes about a day and pays for itself the first time the AI builder edits a write path. Without the matrix, you are debugging blind every time a user reports a permission bug.

Q.03Why don't I get an error when RLS denies a query?
A.03

Row-Level Security in Base44 (and the underlying Postgres-style model) returns an empty result set when a policy filters out rows, not an error. From the SDK's perspective, the query succeeded and returned zero records — semantically identical to a query that legitimately matched nothing. There is no distinction at the API surface between 'no records exist' and 'records exist but you cannot see them.' This is a deliberate design choice (it prevents data-existence enumeration attacks), but it means RLS misconfigurations are invisible to your error-monitoring stack. The bug surfaces only when a human notices missing data or, worse, sees data they should not.

Q.04Can I write tests that catch this before users do?
A.04

Yes, and this is the only reliable safeguard. For every role in your app (admin, member, anonymous, billing, support, etc.) write a smoke test that authenticates as that role, runs every read and write that role is supposed to be able to do, and asserts the result count and contents. Add a negative test that runs a query the role should not be able to do and asserts zero rows. Run the suite on every AI-builder commit. The tests are tedious to write the first time but you only write them once per role; after that they run forever. We have caught regressions on the same day the AI builder shipped them, before the change reached production, in every audit where this suite existed.

Q.05Is this safe to fix myself or should I get an audit?
A.05

If you have one or two affected tables and you know your data model well, you can patch the immediate gap yourself in a few hours. The reason we mark DIY possible as NO is that finding all the gaps is the hard part, not fixing each one. In our 11 audits, the median count of orphaned policies plus unguarded queries was 4, and in three audits an entire table had no RLS at all and was leaking PII. Without an external review, you do not know what you do not know. If your app handles regulated data (PHI, financial, EU personal data), do not DIY this — a single missed predicate is a breach-class bug and the cost of an audit is two orders of magnitude lower than the cost of disclosure.

Q.06How do I prevent this on the next AI builder change?
A.06

Three controls, layered. First, lock RLS into your repo as code (export every policy on every commit) so the AI builder cannot silently mutate them without producing a diff. Second, run the per-role smoke suite from question four on every AI-builder commit, gated as a deploy blocker. Third, add an explicit sentence to every prompt that touches schema or queries: 'After this change, list every RLS policy that may need updating and update them. Do not modify any other policy.' None of the three is sufficient alone. Together they cut RLS drift from roughly 73 percent of feature additions to under 5 percent, based on the audits we have rerun after deploying the controls.

NEXT STEP

Need this fix shipped this week?

Book a free 15-minute call or order a $497 audit. We will respond within one business day.