Embedding Databricks Apps Without SSO
A Go-based reverse proxy that enables embedding Databricks Lakehouse Apps (like VS Code and Marimo notebooks) without requiring users to authenticate directly with Databricks. Uses a session-cookie architecture — no tokens ever appear in URLs, browser storage, or logs.
Overview
Normally, embedding a Databricks app (such as the hosted VS Code editor or Marimo notebooks) in an iframe requires users to authenticate directly with Databricks through an SSO login flow. This exposes the Databricks login interface to end users and breaks the seamless experience of a custom application.
This Go reverse proxy eliminates the need for Databricks SSO by handling authentication transparently. Users authenticate with your application (via Better Auth + Okta SSO), and the proxy securely manages Databricks OAuth tokens entirely on the server side — exchanging a short-lived JWT for an opaque session cookie that the browser uses for all subsequent proxied requests.
Key Features
- HttpOnly session cookies — no tokens in URLs, logs, or browser storage
- Server-side session management with PostgreSQL
- JWT-validated session creation via /start-session
- Automatic SPN token refresh (5-minute pre-expiry window)
- Strict CORS origin validation (exact match, no wildcards)
- Bidirectional WebSocket proxying for real-time features
- SSRF protection: app URLs validated against ALLOWED_APEX_DOMAIN
Embedded Applications
This proxy architecture powers several embedded applications:
High-Level Architecture
The following diagram shows the interaction between your Next.js application, the Go proxy server, and Databricks Lakehouse Apps. The React ProxyIframe component bootstraps a server-side session via JWT exchange, receives an opaque proxy_sid cookie, then loads the iframe. Every subsequent proxied request is authenticated by the session cookie alone.
Why a Proxy is Needed
Databricks Lakehouse Apps require OAuth Bearer token authentication for every request. When embedding these apps in iframes, we face several security challenges:
Token Exposure
If we embed the Databricks app directly with the token in the URL, the OAuth token would be visible in the browser's address bar, network inspector, and history.
CORS Restrictions
Databricks apps have strict CORS policies that prevent direct cross-origin requests from custom web applications.
WebSocket Authentication
WebSocket connections (used for terminals and real-time features) cannot easily include custom authentication headers from browser-initiated connections.
How the Proxy Solves These
Session Cookie Architecture
The React ProxyIframe component calls /start-session with a short-lived JWT. The proxy validates the JWT, fetches a Databricks bearer token via SPN client credentials, and returns an opaque HttpOnly session cookie. Databricks tokens never appear in URLs, logs, or browser storage.
CORS Proxy
The proxy adds appropriate CORS headers to responses, enabling cross-origin iframe embedding while maintaining security. The /start-session endpoint enforces a strict exact-match origin check against FRONTEND_URL.
WebSocket Proxying
The proxy detects WebSocket upgrade requests, looks up the session cookie to retrieve the Databricks bearer token, establishes an authenticated connection to the app, and bidirectionally forwards messages.
Session Creation Flow
Session initialization is a 7-step flow driven by the ProxyIframe React component. It runs once when the component mounts (with a React StrictMode guard to prevent double invocation).
Step 1: Fetch JWT
ProxyIframe calls authClient.token() to obtain a short-lived, signed JWT from Better Auth. The JWT has the current user's session as its subject and FRONTEND_URL as both issuer and audience.
Step 2: POST /start-session
The component posts { jwt, toolId, orgId } to {proxyBaseUrl}/start-session with credentials: "include" so the browser sends and receives cookies cross-origin.
Step 3: Origin validation
The Go proxy checks Origin == FRONTEND_URL (exact match, no wildcards). Requests from any other origin receive a 403 Forbidden before any JWT processing begins.
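The exact-match check in Step 3 can be sketched as a small middleware. This is an illustration rather than the proxy's actual code: the names originAllowed, requireOrigin, and statusFor are hypothetical, and preflight (OPTIONS) handling is left out for brevity.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// originAllowed reports whether the Origin header is exactly the configured
// frontend origin. No wildcard, prefix, or suffix matching is performed.
func originAllowed(origin, frontendURL string) bool {
	return origin != "" && origin == frontendURL
}

// requireOrigin rejects requests from any other origin with 403 before the
// wrapped handler (where JWT processing would happen) runs.
func requireOrigin(frontendURL string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		origin := r.Header.Get("Origin")
		if !originAllowed(origin, frontendURL) {
			http.Error(w, "forbidden origin", http.StatusForbidden)
			return
		}
		// Credentialed CORS responses must echo the exact origin, never "*".
		w.Header().Set("Access-Control-Allow-Origin", origin)
		w.Header().Set("Access-Control-Allow-Credentials", "true")
		w.Header().Add("Vary", "Origin")
		next.ServeHTTP(w, r)
	})
}

// statusFor is a demo helper: it sends a POST with the given Origin through
// the middleware and returns the resulting status code.
func statusFor(frontendURL, origin string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodPost, "/start-session", nil)
	if origin != "" {
		req.Header.Set("Origin", origin)
	}
	rec := httptest.NewRecorder()
	requireOrigin(frontendURL, ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	frontend := "https://firefly-analytics.com"
	fmt.Println(statusFor(frontend, frontend))
	fmt.Println(statusFor(frontend, "https://firefly-analytics.com.evil.example"))
}
```

Comparing whole strings avoids the common mistake of accepting any origin that merely starts or ends with the trusted value.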
Step 4: JWT verification
The proxy fetches the JWKS from {frontendURL}/api/auth/jwks, verifies the JWT's signature, and checks iss, aud, and exp claims. The sub and email claims are extracted.
Step 5: Access validation
The proxy runs a 4-step DB check: user not banned → user belongs to orgId → tool exists and is not deleted → SPN credentials exist for this tool. On any failure the request is rejected with 403 Forbidden.
Step 6: Databricks token fetch
The proxy calls POST {workspaceURL}/oidc/v1/token using SPN client-credentials OAuth flow (grant_type=client_credentials, scope=all-apis). The resulting bearer token is stored in the session record — the browser never sees it.
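A minimal sketch of the Step 6 token request follows. It assumes the SPN client ID and secret are sent with HTTP Basic authentication and that the response carries access_token and expires_in fields; the source only names the endpoint, grant type, and scope. The function names and the local fake endpoint in demoFetch are hypothetical and exist only so the example runs without a real workspace.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
	"time"
)

// tokenResponse holds the fields this sketch assumes the endpoint returns.
type tokenResponse struct {
	AccessToken string `json:"access_token"`
	ExpiresIn   int64  `json:"expires_in"` // lifetime in seconds
}

// fetchSPNToken performs the client-credentials grant and returns the bearer
// token together with its absolute expiry time.
func fetchSPNToken(client *http.Client, workspaceURL, clientID, clientSecret string) (string, time.Time, error) {
	form := url.Values{
		"grant_type": {"client_credentials"},
		"scope":      {"all-apis"},
	}
	endpoint := strings.TrimRight(workspaceURL, "/") + "/oidc/v1/token"
	req, err := http.NewRequest(http.MethodPost, endpoint, strings.NewReader(form.Encode()))
	if err != nil {
		return "", time.Time{}, err
	}
	req.SetBasicAuth(clientID, clientSecret) // assumption: Basic auth for SPN credentials
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")

	resp, err := client.Do(req)
	if err != nil {
		return "", time.Time{}, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", time.Time{}, fmt.Errorf("token endpoint returned %s", resp.Status)
	}
	var tr tokenResponse
	if err := json.NewDecoder(resp.Body).Decode(&tr); err != nil {
		return "", time.Time{}, err
	}
	if tr.AccessToken == "" {
		return "", time.Time{}, fmt.Errorf("token endpoint returned no access_token")
	}
	return tr.AccessToken, time.Now().Add(time.Duration(tr.ExpiresIn) * time.Second), nil
}

// demoFetch runs fetchSPNToken against a local fake token endpoint.
func demoFetch() (string, error) {
	fake := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id, secret, ok := r.BasicAuth()
		if r.URL.Path != "/oidc/v1/token" || !ok || id != "demo-id" || secret != "demo-secret" ||
			r.FormValue("grant_type") != "client_credentials" || r.FormValue("scope") != "all-apis" {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"access_token":"fake-token","token_type":"Bearer","expires_in":3600}`)
	}))
	defer fake.Close()
	tok, _, err := fetchSPNToken(fake.Client(), fake.URL, "demo-id", "demo-secret")
	return tok, err
}

func main() {
	tok, err := demoFetch()
	fmt.Println(tok, err)
}
```

Returning the absolute expiry alongside the token lets the caller store both in the session record, which is what the later refresh check needs.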
Step 7: Session created, cookie set
A 32-byte cryptographically random session ID is generated. Its SHA-256 hash (not the raw ID) is stored in the proxy_sessions table. The raw session ID is returned as an HttpOnly proxy_sid cookie (1-hour TTL).
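Step 7 can be sketched with the standard library alone. The hex encoding of the raw ID and the function names are assumptions made for illustration; the cookie attributes mirror the path-scoped reference setup described in this document (HttpOnly, 1-hour TTL, SameSite=None for cross-site iframes).

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// hashSessionID returns hex(SHA-256(raw)), the value stored as the primary
// key of proxy_sessions. The raw ID itself is never persisted.
func hashSessionID(raw string) string {
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])
}

// newSessionID generates 32 random bytes from the OS CSPRNG and returns the
// raw ID (for the cookie) and its hash (for the database).
func newSessionID() (raw, hash string, err error) {
	b := make([]byte, 32)
	if _, err = rand.Read(b); err != nil {
		return "", "", err
	}
	raw = hex.EncodeToString(b) // assumption: hex encoding of the raw bytes
	return raw, hashSessionID(raw), nil
}

// setSessionCookie writes the raw session ID as an HttpOnly cookie.
func setSessionCookie(w http.ResponseWriter, raw, toolID string) {
	http.SetCookie(w, &http.Cookie{
		Name:     "proxy_sid",
		Value:    raw,
		Path:     "/app-proxy/" + toolID + "/",
		MaxAge:   3600, // 1-hour TTL
		HttpOnly: true,
		Secure:   true,
		SameSite: http.SameSiteNoneMode,
	})
}

func main() {
	raw, hash, err := newSessionID()
	if err != nil {
		panic(err)
	}
	rec := httptest.NewRecorder()
	setSessionCookie(rec, raw, "code-editor-3771219485779100")
	fmt.Println("store in proxy_sessions.id:", hash)
	fmt.Println("Set-Cookie:", rec.Header().Get("Set-Cookie"))
}
```

On later requests the proxy only needs hashSessionID(cookieValue) to find the row, so the lookup never requires the raw ID to be stored.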
ProxyIframe Component
The session initialization logic lives in src/components/proxy-iframe.tsx. A useRef guard prevents React StrictMode's double-invocation from creating duplicate sessions.
// 1. Fetch a short-lived JWT from better-auth (requires session cookie).
const tokenResult = await authClient.token();
if (tokenResult.error || !tokenResult.data?.token) {
  setStatus("error");
  return;
}
// 2. POST { jwt, toolId, orgId } to the Go proxy /start-session endpoint.
//    The proxy validates the JWT via JWKS, looks up the tool + SPN in
//    the database, fetches a Databricks bearer token, and sets the session cookie.
const res = await fetch(`${proxyBaseUrl}/start-session`, {
  method: "POST",
  credentials: "include", // send/receive cookies cross-origin
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ jwt: tokenResult.data.token, toolId, orgId }),
});
// 3. Once the cookie is set, render the iframe.
//    All subsequent requests to /app-proxy/{toolId}/ carry the cookie automatically.
if (res.ok) setStatus("ready");
Complete Session Flow Diagram
Proxy URL Pattern
The proxy uses a simple tool-scoped URL pattern. Unlike token-in-URL approaches, there is no sensitive data in the URL — routing is resolved entirely from the server-side session record.
Pattern:
/app-proxy/{toolId}/ ← initial iframe load
/app-proxy/{toolId}/{path} ← all subsequent requests (assets, API calls, WS)
Example:
/app-proxy/code-editor-3771219485779100/
/app-proxy/code-editor-3771219485779100/terminal
/app-proxy/code-editor-3771219485779100/api/files
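Splitting a request path into the tool ID and the remaining upstream path might look like the sketch below. The function name parseProxyPath is hypothetical; only the /app-proxy/{toolId}/{path} pattern comes from this document.

```go
package main

import (
	"fmt"
	"strings"
)

// parseProxyPath splits "/app-proxy/{toolId}/{path}" into the tool ID and
// the path to forward upstream. ok is false for any other path shape.
func parseProxyPath(p string) (toolID, rest string, ok bool) {
	const prefix = "/app-proxy/"
	if !strings.HasPrefix(p, prefix) {
		return "", "", false
	}
	toolID, rest, _ = strings.Cut(strings.TrimPrefix(p, prefix), "/")
	if toolID == "" {
		return "", "", false
	}
	// The upstream path always starts with "/", even for the initial load.
	return toolID, "/" + rest, true
}

func main() {
	for _, p := range []string{
		"/app-proxy/code-editor-3771219485779100/",
		"/app-proxy/code-editor-3771219485779100/terminal",
		"/app-proxy/code-editor-3771219485779100/api/files",
		"/healthz",
	} {
		toolID, rest, ok := parseProxyPath(p)
		fmt.Printf("%q -> toolID=%q rest=%q ok=%v\n", p, toolID, rest, ok)
	}
}
```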
How routing works:
1. Browser includes proxy_sid cookie automatically (no token in URL)
2. Proxy looks up session by SHA-256(proxy_sid) in proxy_sessions table
3. Session record contains appURL (e.g. https://{app}.aws.databricksapps.com)
4. Proxy forwards request to appURL/{path} with Authorization: Bearer {accessToken}
Session Validation & Token Refresh
Every request to /app-proxy/{toolId}/... goes through session validation before being proxied. The session ID itself is never stored — only its SHA-256 hash — so a leaked database row cannot be used to forge a valid cookie.
1. Extract & hash cookie
The proxy_sid cookie value is read and its SHA-256 hash is computed. The hash is used to query proxy_sessions.
2. Validate session
The session must not be expired (expiresAt > now) and the toolID in the record must match the toolId in the request path. This prevents cross-app session reuse if a cookie is sent to the wrong proxy path.
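The two checks in this step can be expressed as one small function. The Session struct here is a trimmed, hypothetical stand-in for a proxy_sessions row, and the error values are illustrative.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// sessionTTL matches the 1-hour session lifetime described in this document.
const sessionTTL = time.Hour

// Session is a trimmed stand-in for a proxy_sessions row.
type Session struct {
	ToolID    string
	ExpiresAt time.Time
}

var (
	errSessionExpired = errors.New("session expired")
	errToolMismatch   = errors.New("session belongs to a different tool")
)

// validateSession enforces expiresAt > now and that the session's tool
// matches the toolId taken from the request path.
func validateSession(s Session, pathToolID string, now time.Time) error {
	if !s.ExpiresAt.After(now) {
		return errSessionExpired
	}
	if s.ToolID != pathToolID {
		return errToolMismatch
	}
	return nil
}

func main() {
	now := time.Now()
	s := Session{ToolID: "code-editor", ExpiresAt: now.Add(sessionTTL)}
	fmt.Println(validateSession(s, "code-editor", now))
	fmt.Println(validateSession(s, "notebook", now))
	fmt.Println(validateSession(s, "code-editor", now.Add(2*sessionTTL)))
}
```

Passing now as a parameter keeps the function deterministic, which makes the boundary case (expiry exactly equal to now counts as expired) easy to test.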
3. Automatic token refresh
If the stored Databricks access token expires within 5 minutes, the proxy transparently fetches a new one via SPN client credentials and updates the session record before proxying the request.
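A sketch of the 5-minute pre-expiry check follows. The tokenFetcher type stands in for the SPN client-credentials call, and staticFetcher is a fake used only for the demo; all names are hypothetical, and persisting the updated session is left to the caller.

```go
package main

import (
	"fmt"
	"time"
)

// refreshWindow is the pre-expiry window described in this document.
const refreshWindow = 5 * time.Minute

// Session is a trimmed stand-in for the token fields of a proxy_sessions row.
type Session struct {
	AccessToken    string
	TokenExpiresAt time.Time
}

// tokenFetcher abstracts the SPN client-credentials request.
type tokenFetcher func(now time.Time) (token string, expiresAt time.Time, err error)

// needsRefresh is true when the token expires at or before now + 5 minutes.
func needsRefresh(tokenExpiresAt, now time.Time) bool {
	return !tokenExpiresAt.After(now.Add(refreshWindow))
}

// ensureFreshToken replaces the stored token when it is inside the refresh
// window. It reports whether a refresh happened so the caller can persist
// the updated session record before proxying.
func ensureFreshToken(s *Session, now time.Time, fetch tokenFetcher) (bool, error) {
	if !needsRefresh(s.TokenExpiresAt, now) {
		return false, nil
	}
	tok, exp, err := fetch(now)
	if err != nil {
		return false, err
	}
	s.AccessToken, s.TokenExpiresAt = tok, exp
	return true, nil
}

// staticFetcher is a fake fetcher for the demo: it always returns the same
// token with the given lifetime.
func staticFetcher(token string, ttl time.Duration) tokenFetcher {
	return func(now time.Time) (string, time.Time, error) {
		return token, now.Add(ttl), nil
	}
}

func main() {
	now := time.Now()
	s := Session{AccessToken: "old", TokenExpiresAt: now.Add(3 * time.Minute)}
	refreshed, err := ensureFreshToken(&s, now, staticFetcher("new", time.Hour))
	fmt.Println(refreshed, err, s.AccessToken)
}
```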
4. Proxy the request
The request is forwarded to appURL with Authorization: Bearer {accessToken}. All X-Forwarded-* headers from the client are stripped to prevent injection. Security headers (CSP, X-Frame-Options) are injected on the response.
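The header handling on the outbound request can be sketched as below. Only the two behaviours stated above are shown (dropping client-supplied X-Forwarded-* headers and setting the bearer token); the response-side security headers are omitted, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// prepareUpstreamHeaders copies the client's headers, removes every
// X-Forwarded-* header, and sets the server-side bearer token. Any
// Authorization value sent by the client is overwritten.
func prepareUpstreamHeaders(in http.Header, accessToken string) http.Header {
	out := in.Clone()
	for name := range out {
		// Deleting map entries while ranging over the map is safe in Go.
		if strings.HasPrefix(strings.ToLower(name), "x-forwarded-") {
			delete(out, name)
		}
	}
	out.Set("Authorization", "Bearer "+accessToken)
	return out
}

// demoHeaders builds a sample set of inbound client headers.
func demoHeaders() http.Header {
	h := http.Header{}
	h.Set("Accept", "text/html")
	h.Set("X-Forwarded-For", "10.0.0.1")
	h.Set("X-Forwarded-Host", "internal.example")
	h.Set("Authorization", "Bearer client-supplied")
	return h
}

func main() {
	out := prepareUpstreamHeaders(demoHeaders(), "server-side-token")
	for name, values := range out {
		fmt.Println(name, values)
	}
}
```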
Session Database Schema
CREATE TABLE proxy_sessions (
  id                TEXT PRIMARY KEY,        -- hex(SHA-256(cookie_value))
  user_id           TEXT NOT NULL,
  user_email        TEXT NOT NULL,
  tool_id           TEXT NOT NULL,
  org_id            TEXT NOT NULL,
  app_url           TEXT NOT NULL,           -- validated against ALLOWED_APEX_DOMAIN
  workspace_url     TEXT NOT NULL,
  spn_client_id     TEXT NOT NULL,
  spn_client_secret TEXT NOT NULL,
  access_token      TEXT NOT NULL,           -- Databricks bearer token (never in browser)
  token_expires_at  TIMESTAMPTZ NOT NULL,
  expires_at        TIMESTAMPTZ NOT NULL,    -- session TTL (1 hour)
  created_at        TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Fast lookup by hash, cleanup by expiry, session list by tool+user
CREATE INDEX proxy_sessions_expires_at_idx ON proxy_sessions (expires_at);
CREATE INDEX proxy_sessions_tool_user_idx ON proxy_sessions (tool_id, user_id);
Production Deployment: Domain-Based Wildcard Routing
⚠️ Important: The Firefly Reference Implementation Uses Path-Based Cookies — This Is Not the Production Recommendation
The Firefly reference implementation ships with a single shared proxy domain using path-scoped cookies (/app-proxy/{toolId}/). This works for development, staging demos, and getting started quickly, but must not be used in production due to the following cross-app security risks:
- Shared cookie namespace: All apps run under the same domain. An XSS vulnerability in one embedded app could potentially read or interfere with sessions belonging to other apps on the same domain.
- Path scoping is not a security boundary: Path: /app-proxy/{toolId}/ is a browser hint that limits which requests receive the cookie; it is not enforced by the Same-Origin Policy. JavaScript running on the same domain can access all cookies for that domain regardless of path.
- Cross-app session enumeration: A compromised or malicious embedded app could attempt to probe other session paths on the shared domain.
For production, use wildcard subdomain routing with strict CORS (detailed below).
Recommended: Wildcard Subdomain Pattern
Assign each tool a dedicated subdomain. This gives every app a separate origin, enforcing the browser's Same-Origin Policy as a hard isolation boundary — no JavaScript on app-tool-a.firefly-analytics.com can access cookies or storage from app-tool-b.firefly-analytics.com.
Pattern:
app-{toolId}.firefly-analytics.com → Go proxy for that tool
Examples:
app-code-editor.firefly-analytics.com
app-notebook-1234.firefly-analytics.com
app-sql-dashboard.firefly-analytics.com
DNS:
*.firefly-analytics.com → A record → reverse proxy (nginx / Cloudflare / ALB)
Reverse proxy extracts toolId from hostname, routes to Go proxy.
Cookie configuration per subdomain:
Name: proxy_sid
Domain: app-{toolId}.firefly-analytics.com ← exact subdomain, not wildcard
SameSite: Strict ← or Lax if frontend is same registrable domain
Secure: true
HttpOnly: true
MaxAge: 3600
Security Benefits
- Browser SOP enforces full isolation: app-foo.firefly-analytics.com cannot read cookies or storage of app-bar.firefly-analytics.com
- XSS in one embedded app is contained to that app's subdomain only
- SameSite=Strict (or Lax) can be used instead of None, reducing CSRF surface further
- FRONTEND_URL CORS check on /start-session ensures only your application can initiate sessions
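The subdomain pattern and cookie settings above can be sketched in Go as follows. The function names are hypothetical, and lower-casing the hostname (and therefore the extracted tool ID) is an assumption based on hostnames being case-insensitive. The sketch leaves the cookie's Domain attribute unset, which makes the cookie host-only, so the browser returns it only to the exact subdomain that set it.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// toolIDFromHost extracts {toolId} from "app-{toolId}.{apex}". It accepts an
// optional port and rejects nested subdomains and hosts outside the apex.
func toolIDFromHost(host, apex string) (string, bool) {
	if h, _, err := net.SplitHostPort(host); err == nil {
		host = h // strip ":port" when present
	}
	host = strings.ToLower(host)
	suffix := "." + strings.ToLower(apex)
	if !strings.HasSuffix(host, suffix) {
		return "", false
	}
	label := strings.TrimSuffix(host, suffix)
	if strings.Contains(label, ".") || !strings.HasPrefix(label, "app-") {
		return "", false
	}
	toolID := strings.TrimPrefix(label, "app-")
	return toolID, toolID != ""
}

// subdomainCookie builds the per-subdomain session cookie. Domain is left
// empty on purpose so the cookie is host-only.
func subdomainCookie(raw string) *http.Cookie {
	return &http.Cookie{
		Name:     "proxy_sid",
		Value:    raw,
		Path:     "/",
		MaxAge:   3600,
		HttpOnly: true,
		Secure:   true,
		SameSite: http.SameSiteStrictMode,
	}
}

func main() {
	for _, h := range []string{
		"app-code-editor.firefly-analytics.com",
		"app-notebook-1234.firefly-analytics.com:443",
		"firefly-analytics.com",
		"app-x.evil.example",
	} {
		id, ok := toolIDFromHost(h, "firefly-analytics.com")
		fmt.Printf("%q -> %q %v\n", h, id, ok)
	}
	fmt.Println(subdomainCookie("example-session-id").String())
}
```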
Path-Based vs Domain-Based Architecture
Comparison: Path-Based vs Domain-Based
| Feature | Path-Based (Firefly Reference) | Domain-Based (Recommended Production) |
|---|---|---|
| Cookie isolation | Partial (path hint only) | Full (SOP-enforced boundary) |
| XSS blast radius | All apps on same domain | Single app subdomain only |
| SameSite setting | None (cross-site iframe) | Strict or Lax |
| CORS protection | Origin check on /start-session | Origin check + subdomain isolation |
| JS access to cookies | All same-domain cookies accessible | Only subdomain cookies accessible |
| Setup complexity | Simple (single domain) | Requires wildcard DNS + routing |
| Recommended for | Dev / demos / getting started | Production deployments |
Iframe Embedding
Once the session cookie is set, the ProxyIframe component renders an <iframe> pointing to {proxyBaseUrl}/app-proxy/{toolId}/. The browser automatically includes the proxy_sid cookie on all requests within that iframe.
Iframe Architecture
Sandbox Attributes
Allows JavaScript execution (required for editor functionality)
Allows access to localStorage and cookies within iframe context
Enables form submission for file uploads and settings
Allows opening new windows for help docs and external links
Permits file downloads for notebooks and data exports
WebSocket Support
Real-time features like terminal sessions and language server protocol require WebSocket connections. The Go proxy provides full bidirectional WebSocket proxying, using the same session cookie for authentication.
WebSocket Proxy Flow
WebSocket Detection & Auth
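// WebSocket requests are detected by the Connection and Upgrade headers.
// Connection may carry several comma-separated tokens (e.g. "keep-alive, Upgrade"),
// so look for the "upgrade" token instead of comparing the whole header value.
func isWebSocketRequest(r *http.Request) bool {
    if !strings.EqualFold(r.Header.Get("Upgrade"), "websocket") {
        return false
    }
    for _, v := range r.Header.Values("Connection") {
        for _, tok := range strings.Split(v, ",") {
            if strings.EqualFold(strings.TrimSpace(tok), "upgrade") {
                return true
            }
        }
    }
    return false
}
// In the main proxy handler, the session cookie provides the auth token.
if isWebSocketRequest(r) {
    // Session already validated; accessToken retrieved from proxy_sessions.
    wsURL := strings.Replace(targetURL, "https://", "wss://", 1) + remainingPath
    handleWebSocketProxy(w, r, wsURL, accessToken)
} else {
    handleHTTPProxy(w, r, targetURL, accessToken, remainingPath)
}
Deployment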
The Go proxy can be deployed in several ways. All deployment options require a PostgreSQL database for session storage.
Docker Container
Build a Docker image and deploy to any container platform (ECS, Kubernetes, Cloud Run)
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY . .
RUN go build -o proxy .
FROM alpine:latest
COPY --from=builder /app/proxy /proxy
# Required (Dockerfile comments must be on their own line)
# FRONTEND_URL: e.g. https://firefly-analytics.com
ENV FRONTEND_URL=""
# ALLOWED_APEX_DOMAIN: e.g. aws.databricksapps.com
ENV ALLOWED_APEX_DOMAIN=""
# DATABASE_URL: PostgreSQL connection string
ENV DATABASE_URL=""
# Optional
# DEV_MODE: set to "true" for http://localhost testing only
ENV DEV_MODE="false"
ENV PORT="8090"
EXPOSE 8090
CMD ["/proxy"]
Serverless Function
Deploy as AWS Lambda or Google Cloud Functions for auto-scaling. Note: WebSocket support requires a long-lived connection — ensure your serverless platform supports it (e.g., API Gateway WebSocket APIs).
VM or Bare Metal
Run directly on VMs for maximum performance and control. Recommended for high-concurrency WebSocket workloads.
Configuration Reference
| Variable | Description | Required |
|---|---|---|
| FRONTEND_URL | Origin of the Next.js app (e.g. https://firefly-analytics.com). Used for JWT iss/aud validation and strict CORS origin check. | Yes |
| ALLOWED_APEX_DOMAIN | Databricks apps apex domain (e.g. aws.databricksapps.com). App URLs from the DB are validated against this to prevent SSRF. | Yes |
| DATABASE_URL | PostgreSQL connection string for the proxy_sessions table. | Yes |
| DEV_MODE | Set to "true" to use path-scoped cookies without Secure flag. For http://localhost development only. Never enable in production. | No |
| PORT | Server port (default: 8090) | No |
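The SSRF check tied to ALLOWED_APEX_DOMAIN could look like the sketch below. It assumes that only https URLs on a proper subdomain of the apex are acceptable and that URLs with embedded credentials are refused; the source states only that app URLs are validated against the apex domain, so treat these rules and the name validateAppURL as illustrative.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// validateAppURL accepts only https URLs whose host is a subdomain of apex.
// Matching on "."+apex prevents look-alikes such as
// "evilaws.databricksapps.com" or "aws.databricksapps.com.evil.example".
func validateAppURL(raw, apex string) error {
	apex = strings.ToLower(strings.Trim(apex, "."))
	if apex == "" {
		return fmt.Errorf("ALLOWED_APEX_DOMAIN is not set")
	}
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid app URL: %w", err)
	}
	if u.Scheme != "https" {
		return fmt.Errorf("app URL must use https, got %q", u.Scheme)
	}
	if u.User != nil {
		return fmt.Errorf("app URL must not contain credentials")
	}
	host := strings.ToLower(u.Hostname())
	if !strings.HasSuffix(host, "."+apex) {
		return fmt.Errorf("host %q is outside %q", host, apex)
	}
	return nil
}

func main() {
	apex := "aws.databricksapps.com"
	for _, raw := range []string{
		"https://my-app.aws.databricksapps.com",
		"https://evilaws.databricksapps.com",
		"https://aws.databricksapps.com.evil.example",
		"http://my-app.aws.databricksapps.com",
		"https://169.254.169.254/latest/meta-data",
	} {
		fmt.Printf("%-50s %v\n", raw, validateAppURL(raw, apex))
	}
}
```

Checking the parsed hostname rather than the raw string keeps ports, paths, and userinfo from influencing the match.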