Databricks Vendor Access: How to Block Direct Workspace Changes Without Breaking Delivery

The vendor problem isn't access — it's unaudited write access

A Databricks Community thread recently surfaced a scenario most platform teams recognize: an external vendor needs to develop and deploy notebooks, jobs, and pipelines in your workspace, but you have no mechanism to prevent them from editing production objects directly. The vendor has CAN_EDIT on shared folders because someone needed them to "get work done quickly." Now you're discovering changes to production notebooks with no pull request, no review, and no rollback path.

The instinct is to revoke access entirely, but that kills delivery velocity. Vendors need to read production code to understand existing logic. They need to run jobs to validate their work. What they should not have is the ability to commit changes to production paths without your team's review.

Databricks doesn't offer a single "read-only vendor mode" toggle. Instead, you assemble the guardrail from four mechanisms: workspace folder permissions, service principals, Git folders, and Unity Catalog. Each covers a different attack surface. Folder permissions control who can edit workspace objects. Service principals control how code reaches production. Git folders enforce version control as the deployment path. Unity Catalog controls data access independently of workspace permissions.

Skip any one layer and you leave a gap. A vendor with CAN_EDIT on a folder can overwrite a notebook even if you have Git folders configured — because Git folders don't retroactively protect files outside the repo path. A vendor with Unity Catalog SELECT grants but no workspace write access can still read your data through a SQL warehouse without touching your notebooks at all, which may be exactly what you want.

Folder permissions inherit downward — use that to build a permission boundary

Databricks workspace permissions follow a top-down inheritance model. When you grant CAN_VIEW on a folder, every notebook, file, and subfolder inside inherits that permission. Inherited permissions can be extended on a child object but not reduced, so a broad grant on a parent folder cannot be narrowed further down the tree. This is the first and most direct control you have over vendor access.

The practical pattern is a three-folder structure at the workspace root:

/Production — Vendor group gets CAN_VIEW. Only your platform team's service principal and workspace admins retain CAN_MANAGE. Vendors can read every notebook in production to understand existing logic but cannot edit or execute anything here.

/Development — Vendor group gets CAN_EDIT. This is their sandbox. They create notebooks, iterate on code, run interactive clusters. Nothing here deploys to production without going through your CI/CD pipeline.

/Staging — Vendor group gets CAN_RUN but not CAN_EDIT. Your CI/CD pipeline deploys code here for integration testing. Vendors can trigger runs to validate behavior but cannot modify the deployed artifacts.

Set these permissions using the Databricks Permissions API rather than the UI to ensure they're reproducible and auditable. The endpoint is PUT /api/2.0/permissions/directories/{directory_id} with a JSON body specifying the access control list. PUT replaces the directory's existing direct permissions, while PATCH on the same path adds or updates individual entries. Script this in Terraform or your IaC tool so permissions survive workspace recreation.
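The three-folder structure can be expressed as request bodies for that endpoint. The following is a minimal sketch, not a definitive implementation: the group name, the service principal application ID, and the helper names are hypothetical, and it assumes the API spells the UI's "Can View" level as CAN_READ, which you should verify against the permission levels your workspace returns.

```python
import json
import urllib.request

# Hypothetical names for illustration; substitute your own values.
VENDOR_GROUP = "vendor-group"
CICD_SP_APP_ID = "00000000-0000-0000-0000-000000000000"

# Vendor permission per folder, following the three-folder structure.
# Assumption: the API level for the UI's "Can View" is CAN_READ.
VENDOR_LEVELS = {
    "/Production": "CAN_READ",
    "/Development": "CAN_EDIT",
    "/Staging": "CAN_RUN",
}


def build_acl(folder: str) -> dict:
    """Build the request body for one folder's access control list."""
    acl = [{"group_name": VENDOR_GROUP, "permission_level": VENDOR_LEVELS[folder]}]
    if folder in ("/Production", "/Staging"):
        # Only the deployment identity gets write access to these folders.
        acl.append(
            {"service_principal_name": CICD_SP_APP_ID, "permission_level": "CAN_MANAGE"}
        )
    return {"access_control_list": acl}


def put_directory_acl(host: str, token: str, directory_id: str, body: dict) -> dict:
    """Send the ACL with PUT, which replaces existing direct grants."""
    req = urllib.request.Request(
        f"{host}/api/2.0/permissions/directories/{directory_id}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Keeping the ACL construction separate from the HTTP call makes the intended permissions reviewable in a pull request before anything is applied to the workspace.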

One critical detail: workspace admins automatically have CAN_MANAGE on all objects. If any vendor user has workspace admin status — even temporarily granted for troubleshooting — your entire permission structure is bypassed. Audit admin group membership monthly. The Databricks SCIM API (GET /api/2.0/preview/scim/v2/Groups) lets you enumerate the admins group programmatically.
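The monthly admin audit can be reduced to comparing the admins group against an approved list. This sketch assumes the SCIM Groups response has the usual ListResponse shape (a Resources array whose groups carry a members list with value and display fields); the function name and the approved-ID set are hypothetical.

```python
def unexpected_admins(groups_response: dict, approved_ids: set) -> list:
    """Return members of the 'admins' group whose SCIM id is not approved."""
    flagged = []
    for group in groups_response.get("Resources", []):
        if group.get("displayName") != "admins":
            continue
        for member in group.get("members", []):
            if member.get("value") not in approved_ids:
                flagged.append(
                    {"id": member.get("value"), "display": member.get("display")}
                )
    return flagged


# Made-up payload shaped like a response from
# GET /api/2.0/preview/scim/v2/Groups
sample_response = {
    "Resources": [
        {
            "displayName": "admins",
            "members": [
                {"value": "101", "display": "Platform Lead"},
                {"value": "202", "display": "Vendor Contractor"},
            ],
        },
        {"displayName": "vendor-group", "members": [{"value": "202"}]},
    ]
}

flagged_admins = unexpected_admins(sample_response, approved_ids={"101"})
```

Matching on SCIM IDs rather than display names avoids false negatives when a user's display name changes.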

[Figure: Four-Layer Vendor Access Control in Databricks]

Service principals turn deployment into a chokepoint you control

Folder permissions stop a vendor from editing a production notebook in the UI. But if the vendor's personal credentials are used to run deployment scripts, you still have no gate between their code and production. Service principals solve this by separating human identity from deployment identity.

A service principal is a non-human identity in Databricks that authenticates via OAuth tokens or personal access tokens. Create one dedicated to your CI/CD pipeline and call it sp-cicd-prod. Grant sp-cicd-prod CAN_MANAGE on /Production and /Staging, and no other workspace folder permissions. Now the only way code reaches production is through whatever pipeline authenticates as this service principal.

Vendors submit pull requests to your Git repository. Your CI/CD system (GitHub Actions, Azure DevOps, GitLab CI) authenticates as sp-cicd-prod, runs databricks workspace import or the Databricks Asset Bundles CLI (databricks bundle deploy), and pushes validated code to the production folder. The vendor never holds the credentials that write to production.

Create the service principal at the account level via the Account SCIM API: POST /api/2.0/accounts/{account_id}/scim/v2/ServicePrincipals. Then add it to your workspace and assign permissions. Store its OAuth secret in your CI/CD platform's secret manager — Azure Key Vault, GitHub Secrets, or AWS Secrets Manager — never in a shared notebook or workspace file.

The service principal should also own and run production jobs. This matters because a Databricks job executes with the permissions of its run-as identity, which defaults to the job's owner. If a vendor creates a job under their own identity and then leaves the engagement, that job can start failing once their account is deactivated. Service principal ownership survives personnel changes. Set the identity explicitly in your job definitions with the Jobs API run_as block, where service_principal_name takes the service principal's application ID, and assign the IS_OWNER permission level to the service principal through the Permissions API.
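A sketch of what service principal ownership can look like as API payloads. It assumes the Jobs API 2.1 run_as block with a service_principal_name field and the IS_OWNER permission level on /api/2.0/permissions/jobs/{job_id}; the function names, job name, and notebook path are hypothetical, and compute configuration is omitted for brevity.

```python
def production_job_settings(name: str, notebook_path: str, sp_application_id: str) -> dict:
    """Job settings that run as the CI/CD service principal."""
    return {
        "name": name,
        # Assumption: run_as accepts the service principal's application ID.
        "run_as": {"service_principal_name": sp_application_id},
        "tasks": [
            {
                "task_key": "main",
                "notebook_task": {"notebook_path": notebook_path},
            }
        ],
    }


def owner_acl(sp_application_id: str) -> dict:
    """Permissions body that makes the service principal the job owner."""
    return {
        "access_control_list": [
            {
                "service_principal_name": sp_application_id,
                "permission_level": "IS_OWNER",
            }
        ]
    }


settings = production_job_settings(
    name="nightly-orders",                       # hypothetical job name
    notebook_path="/Production/orders/transform",  # hypothetical path
    sp_application_id="00000000-0000-0000-0000-000000000000",
)
```

In practice these payloads would be generated by the CI/CD pipeline or declared in a Databricks Asset Bundle, so that no human identity appears in a production job definition.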

Git folders enforce the review gate that permissions alone cannot

Permissions and service principals control who can write and how code deploys. Git folders add a third dimension: they make version history and peer review a structural requirement rather than a policy document nobody reads.

Databricks Git folders (formerly Repos) clone a remote Git repository into the workspace. When a vendor works inside a Git folder, every change they make is tracked as a local modification against the cloned branch. To get their code into the main branch, they must commit, push, and open a pull request — which your team reviews before merging.

The enforcement gap is that Git folders coexist with regular workspace notebooks. A vendor with CAN_EDIT on any non-Git folder can create a plain notebook and run whatever they want. This is where folder permissions and Git folders reinforce each other. If the vendor group only has CAN_EDIT inside /Development/vendor-name and that folder is a Git folder linked to your repo, then every edit they make is version-controlled by default.

Admins can also configure a Git URL allow list at the workspace level. This restricts which remote repositories can be cloned into the workspace. If your vendor tries to link a Git folder to their own private repo, one you can't audit, the allow list blocks it. Configure this in the workspace admin settings, or via the Workspace Conf API: PATCH /api/2.0/workspace-conf with a body such as {"enableProjectsAllowList": "true", "projectsAllowList": "https://github.com/your-org/"}. The allow list holds comma-separated URL prefixes rather than wildcard patterns; confirm the exact key names against the current Databricks documentation before scripting this.
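The same allow list is useful as an audit input. The sketch below checks Git folder URLs against a list of approved prefixes. It is a local illustration of prefix matching, not Databricks' own matching logic, and it assumes the GET /api/2.0/repos response carries a repos array with url and path fields; the function names are hypothetical.

```python
def is_allowed(repo_url: str, allowed_prefixes: list) -> bool:
    """True if the URL starts with one of the approved prefixes."""
    normalized = repo_url.strip().lower()
    return any(normalized.startswith(p.strip().lower()) for p in allowed_prefixes)


def unapproved_repos(repos_response: dict, allowed_prefixes: list) -> list:
    """Return workspace paths of Git folders linked to unapproved remotes."""
    return [
        repo.get("path")
        for repo in repos_response.get("repos", [])
        if not is_allowed(repo.get("url", ""), allowed_prefixes)
    ]


# Keep the trailing slash so "your-org-other" does not match "your-org".
ALLOWED = ["https://github.com/your-org/"]

sample_repos = {
    "repos": [
        {"path": "/Development/vendor-name", "url": "https://github.com/your-org/pipelines"},
        {"path": "/Development/scratch", "url": "https://github.com/vendor-private/tools"},
    ]
}

violations = unapproved_repos(sample_repos, ALLOWED)
```

The trailing slash on each prefix matters: without it, an organization whose name merely begins with yours would pass the check.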

Combine this with branch protection rules on your Git provider. Require at least one approval from your internal team before merges to main or production branches. The vendor's code goes through review. The service principal deploys only from the protected branch. The chain is complete.

Unity Catalog separates data access from workspace write access

A vendor who cannot edit production notebooks might still need to query production data — to validate transformations, debug issues, or build reports. Unity Catalog lets you grant data access without granting workspace object access, which is the separation most teams miss when setting up vendor permissions.

Grant the vendor group SELECT on specific catalogs, schemas, or tables using standard SQL: GRANT SELECT ON SCHEMA main.vendor_project TO vendor_group. SELECT alone is not enough to query a table: the group also needs USE CATALOG on the parent catalog and USE SCHEMA on the schema, so grant those alongside it. Vendors can then query this data through a SQL warehouse or an interactive cluster without having write access to any workspace folder. For write access to development tables, grant it only on a development catalog: GRANT ALL PRIVILEGES ON SCHEMA dev.vendor_project TO vendor_group.
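Generating the grant statements from one function keeps the read-only and development grants consistent across vendors. This is a sketch under stated assumptions: the function name is hypothetical, and it adds USE CATALOG and USE SCHEMA grants because Unity Catalog requires them on the parent objects before SELECT takes effect.

```python
def vendor_grants(group: str, prod_catalog: str, prod_schema: str,
                  dev_catalog: str, dev_schema: str) -> list:
    """Return GRANT statements: read-only on production, full access on dev."""
    principal = f"`{group}`"  # backticks allow hyphens in group names
    return [
        # Read-only access to the production schema.
        f"GRANT USE CATALOG ON CATALOG {prod_catalog} TO {principal}",
        f"GRANT USE SCHEMA ON SCHEMA {prod_catalog}.{prod_schema} TO {principal}",
        f"GRANT SELECT ON SCHEMA {prod_catalog}.{prod_schema} TO {principal}",
        # Full access, scoped to the vendor's development schema only.
        f"GRANT USE CATALOG ON CATALOG {dev_catalog} TO {principal}",
        f"GRANT ALL PRIVILEGES ON SCHEMA {dev_catalog}.{dev_schema} TO {principal}",
    ]


statements = vendor_grants(
    group="vendor_group",
    prod_catalog="main",
    prod_schema="vendor_project",
    dev_catalog="dev",
    dev_schema="vendor_project",
)
```

In a Databricks notebook, each statement could be passed to spark.sql in turn; the list form also makes it straightforward to emit matching REVOKE statements at offboarding.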

This layering means a vendor can run SELECT * FROM main.sales.orders in a SQL editor but cannot modify the notebook that transforms that data in production. They build and test their transformations in the dev catalog, submit code via Git, and your CI/CD pipeline deploys the reviewed code to production where it operates on production data under the service principal's identity.

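Audit everything. The system table system.access.audit logs Unity Catalog events, including which principal made each request and when. Run a weekly query: SELECT event_time, user_identity.email, action_name, request_params FROM system.access.audit WHERE service_name = 'unityCatalog' AND action_name = 'getTable' AND user_identity.email LIKE '%@vendor.example' AND event_date > current_date() - 7, substituting the vendor's email domain. The table records the acting user or service principal in user_identity.email rather than group membership, so filter on the vendor's domain or on an explicit list of accounts. This gives you a concrete record of what your vendors accessed, which matters for compliance and for the inevitable post-engagement access review.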
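The weekly query can be parameterized so the same check runs for each vendor. This sketch assumes the audit table records the acting principal in user_identity.email and that getTable events carry the table name under the full_name_arg key of request_params; both should be confirmed against the audit log schema in your account. The function name and the example domain are hypothetical.

```python
def vendor_audit_query(email_pattern: str, days: int = 7) -> str:
    """Build a query for recent Unity Catalog table lookups by a vendor."""
    if "'" in email_pattern:
        # Reject quotes instead of attempting to escape them.
        raise ValueError("email_pattern must not contain single quotes")
    return (
        "SELECT event_time, user_identity.email AS principal, action_name, "
        "request_params['full_name_arg'] AS table_name "  # assumed key name
        "FROM system.access.audit "
        "WHERE service_name = 'unityCatalog' "
        "AND action_name = 'getTable' "
        f"AND user_identity.email LIKE '{email_pattern}' "
        f"AND event_date >= date_sub(current_date(), {int(days)})"
    )


query = vendor_audit_query("%@vendor.example", days=7)
```

In a notebook the result could be run with spark.sql(query) and appended to a Delta table so that each week's access record is retained.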

MetricSign monitors Databricks job runs and surfaces failures with root cause context — including permission errors that appear when a service principal's grants are misconfigured or a Unity Catalog privilege is missing. When a production job fails because sp-cicd-prod lost SELECT on a table after a catalog migration, MetricSign groups that failure with the specific permission error rather than burying it in a generic job failure notification.

The four-layer audit: verify your controls actually hold

Configuration drift is the real enemy. You set up folder permissions, service principals, Git folders, and Unity Catalog grants during the vendor onboarding sprint. Six months later, someone grants CAN_EDIT on a production folder to unblock an urgent fix. The permission stays forever.

Build a quarterly audit script that checks all four layers. For folder permissions, call GET /api/2.0/permissions/directories/{id} for each production folder and flag any non-admin, non-service-principal entries with CAN_EDIT or CAN_MANAGE. For service principals, verify that production jobs are owned and run by service principals, not human users: list jobs with GET /api/2.1/jobs/list, then check run_as_user_name from GET /api/2.1/jobs/get and the IS_OWNER entry from GET /api/2.0/permissions/jobs/{job_id}. The creator_user_name field only records who originally created the job, so it is not a reliable ownership check. For Git folders, confirm that all vendor development folders are linked to approved repositories using GET /api/2.0/repos and cross-reference against your allow list. For Unity Catalog, run SHOW GRANTS ON SCHEMA main.production_schema and verify no vendor group has write privileges on production schemas.
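The folder-permission check is the simplest of the four to automate. The sketch below assumes the Permissions API response contains an access_control_list whose entries name a user, group, or service principal and list their levels under all_permissions; the function name and the approved-principal set are hypothetical.

```python
RISKY_LEVELS = {"CAN_EDIT", "CAN_MANAGE"}


def risky_entries(permissions_response: dict, approved_principals: set) -> list:
    """Flag principals with write access that are not on the approved list."""
    flagged = []
    for entry in permissions_response.get("access_control_list", []):
        principal = (
            entry.get("user_name")
            or entry.get("group_name")
            or entry.get("service_principal_name")
        )
        levels = {p.get("permission_level") for p in entry.get("all_permissions", [])}
        risky = sorted(levels & RISKY_LEVELS)
        if risky and principal not in approved_principals:
            flagged.append((principal, risky))
    return flagged


# Made-up response shaped like GET /api/2.0/permissions/directories/{id}
sample_permissions = {
    "access_control_list": [
        {"group_name": "admins",
         "all_permissions": [{"permission_level": "CAN_MANAGE"}]},
        {"group_name": "vendor-group",
         "all_permissions": [{"permission_level": "CAN_EDIT"}]},
        {"user_name": "analyst@your-org.example",
         "all_permissions": [{"permission_level": "CAN_READ"}]},
    ]
}

drift = risky_entries(sample_permissions, approved_principals={"admins"})
```

Any non-empty result is drift: a principal outside the approved set has gained write access to a production folder.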

Automate this as a Databricks notebook that runs on a scheduled job and writes results to a Delta table. Alert on any drift. The script doesn't need to be complex: a handful of API calls and one SHOW GRANTS statement per production schema cover the critical checks.

Vendor offboarding is the other blind spot. When the engagement ends, disable the vendor group rather than deleting individual users. If SCIM provisioning is configured, disabling the group in your identity provider (Entra ID, Okta) propagates to Databricks automatically. Verify propagation by calling GET /api/2.0/preview/scim/v2/Groups?filter=displayName eq "vendor-group" (URL-encode the filter value) and confirming the member list is empty. Then revoke Unity Catalog grants explicitly, because SCIM removal doesn't cascade to catalog permissions.
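The propagation check has one practical wrinkle: the SCIM filter contains spaces and quotes, so it must be URL-encoded. This sketch builds the lookup URL and extracts any members still attached to the group. The host and function names are hypothetical, and the response shape (a Resources array with a members list per group) is assumed.

```python
from urllib.parse import quote, urlencode


def group_lookup_url(host: str, group_name: str) -> str:
    """Build the SCIM Groups URL with a properly encoded displayName filter."""
    query = urlencode({"filter": f'displayName eq "{group_name}"'}, quote_via=quote)
    return f"{host}/api/2.0/preview/scim/v2/Groups?{query}"


def remaining_members(groups_response: dict) -> list:
    """Return display names (or ids) of members still in the matched groups."""
    members = []
    for group in groups_response.get("Resources", []):
        for member in group.get("members", []):
            members.append(member.get("display") or member.get("value"))
    return members


url = group_lookup_url("https://example.cloud.databricks.com", "vendor-group")

# Made-up response after offboarding: the group exists but has no members.
after_offboarding = {"Resources": [{"displayName": "vendor-group"}]}
leftover = remaining_members(after_offboarding)
```

An empty result from remaining_members confirms that SCIM propagation completed; anything else is a list of accounts to remove by hand.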