TL;DR
- We built an enterprise RAG system combining semantic search and relationship traversal inside BigQuery
- Eliminated the need for a separate graph database by using recursive CTEs for graph logic
- Solved the data sync problem and ensured atomic consistency across retrieval layers
- Reduced infrastructure costs by about 94 percent by avoiding a new managed service
- Simplified architecture and maintenance with a single warehouse constraint
- Achieved practical latency because network round trips between systems were removed
- Best suited for use cases with bounded graph depth, centralized data, and strict freshness requirements
At LatentView, we built Chanakya, an AI Chief of Staff system that answers executive queries by reasoning over a private knowledge graph of corporate data. When we designed the retrieval architecture, the industry standard seemed obvious: use a Vector Database for semantic search and a dedicated Graph Database for relationship traversal.
We chose a contrarian path: we eliminated the graph database entirely. The industry is already moving away from "specialized database sprawl," and we realized that introducing a transactional database solely for this graph would violate our architecture's simplicity.
Instead, we built an “All-in-SQL” RAG pipeline that runs entirely inside BigQuery. By leveraging Recursive Common Table Expressions (CTEs), we achieved atomic data consistency, reduced our infrastructure cost by 94%, and simplified our retrieval architecture — all while meeting our sub-second latency targets.
Here is why we imposed a “Single Warehouse” constraint and how we implemented graph traversal in pure SQL.
The Problem: The “Sync Tax” in RAG Systems
Our system processes queries like “What is blocking Project Alpha?” or “Who is accountable for the GenAI Initiative?”
This requires Hybrid Retrieval:
- Semantic Search: Finding “Project Alpha” (even if misspelled).
- Graph Traversal: Deterministically following blocks or owned_by edges to find related entities.
The standard architecture involves syncing data from a warehouse (BigQuery) to a specialized GraphDB. However, we realized this introduced a critical flaw for our use case: Eventual Consistency.
In an executive context, stale data destroys trust. If a user updates a project status in an email and immediately asks the AI for a summary, a 10-minute sync lag looks like an AI hallucination. To solve this, we imposed a strict architectural constraint: Atomic RAG. The vector index and the graph structure must live in the same storage engine to guarantee they reflect the same state at query time.
The Solution: Pragmatic Graph Theory
We evaluated our graph requirements and realized we fell into the “Bounded Depth” category. We didn’t need unbounded pathfinding or complex community detection algorithms (like PageRank). We needed:
- Hierarchy Lookups: (Task → Project → Initiative → Goal)
- Dependency Tracing: (What blocks X? What does X block?)
- Ownership Resolution: (Who owns X and what else do they own?)
These are deterministically solvable with SQL Recursive CTEs.
Pattern 1: Recursive Hierarchy Extraction
The most common operation is fetching an entity's context. A standard recursive CTE can traverse the belongs_to relationships upward (parents); the same pattern with the join direction flipped walks downward (children).
```sql
WITH RECURSIVE hierarchy AS (
  -- Base case: the entity found via Vector Search
  SELECT entity_id, name, type, 0 AS depth
  FROM canonical_entities
  WHERE entity_id = @entity_id

  UNION ALL

  -- Recursive step: traverse edges
  SELECT
    e.entity_id, e.name, e.type, h.depth + 1
  FROM hierarchy h
  JOIN entity_relationships r ON h.entity_id = r.source_entity_id
  JOIN canonical_entities e ON r.target_entity_id = e.entity_id
  WHERE h.depth < 3  -- Hard depth limit prevents infinite loops
)
SELECT * FROM hierarchy ORDER BY depth ASC
```
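The pattern is portable to any engine with recursive CTE support. Here is a self-contained sketch in Python using the standard-library `sqlite3` module; the table names mirror the schema above, and the sample rows are invented purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE canonical_entities (entity_id TEXT PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE entity_relationships (source_entity_id TEXT, target_entity_id TEXT, relationship_type TEXT);
INSERT INTO canonical_entities VALUES
  ('t1', 'Draft deck', 'task'),
  ('p1', 'Project Alpha', 'project'),
  ('g1', 'GenAI Initiative', 'initiative');
-- belongs_to edges point from child to parent
INSERT INTO entity_relationships VALUES
  ('t1', 'p1', 'belongs_to'),
  ('p1', 'g1', 'belongs_to');
""")

rows = conn.execute("""
WITH RECURSIVE hierarchy AS (
  -- Base case: the entity found via vector search
  SELECT entity_id, name, type, 0 AS depth
  FROM canonical_entities WHERE entity_id = :entity_id
  UNION ALL
  -- Recursive step: follow belongs_to edges upward
  SELECT e.entity_id, e.name, e.type, h.depth + 1
  FROM hierarchy h
  JOIN entity_relationships r ON h.entity_id = r.source_entity_id
  JOIN canonical_entities e ON r.target_entity_id = e.entity_id
  WHERE h.depth < 3  -- hard depth limit prevents runaway recursion
)
SELECT name, depth FROM hierarchy ORDER BY depth
""", {"entity_id": "t1"}).fetchall()

print(rows)  # task, then its project, then its initiative
```

Starting from the task, the query surfaces its parent project and grandparent initiative in depth order, which is exactly the context window the LLM needs.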
Pattern 2: Blocker Chain Detection
“What is blocking this project?” is a more complex query because it requires traversing different relationship types (blocks, depends_on) potentially across multiple hops. This is usually where teams give up on SQL.
However, BigQuery handles this elegantly by collecting the traversal path in an array. This allows us to explain why an entity is blocked (e.g., “Task A depends on Task B, which is blocked by Task C”).
```sql
WITH RECURSIVE blocker_chain AS (
  -- Base: direct blockers
  SELECT
    e.entity_id, e.name, 1 AS distance,
    [r.relationship_type] AS rel_chain
  FROM entity_relationships r
  JOIN canonical_entities e ON r.source_entity_id = e.entity_id
  WHERE r.target_entity_id = @entity_id
    AND r.relationship_type IN ('blocks', 'depends_on')

  UNION ALL

  -- Recursive: blockers of blockers
  SELECT
    e.entity_id, e.name, bc.distance + 1,
    ARRAY_CONCAT(bc.rel_chain, [r.relationship_type])
  FROM blocker_chain bc
  JOIN entity_relationships r ON bc.entity_id = r.target_entity_id
  JOIN canonical_entities e ON r.source_entity_id = e.entity_id
  WHERE r.relationship_type IN ('blocks', 'depends_on')
    AND bc.distance < 3
)
-- Note: DISTINCT cannot be applied to ARRAY columns in BigQuery,
-- so deduplication (if needed) must happen on the scalar columns
SELECT * FROM blocker_chain ORDER BY distance
```
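The same chain can be traced in the portable `sqlite3` sketch below. SQLite has no ARRAY type, so string concatenation stands in for `ARRAY_CONCAT` as the path accumulator; the tiny dataset (Task C blocks Task B blocks Task A) is invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE canonical_entities (entity_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE entity_relationships (source_entity_id TEXT, target_entity_id TEXT, relationship_type TEXT);
INSERT INTO canonical_entities VALUES ('a', 'Task A'), ('b', 'Task B'), ('c', 'Task C');
-- Task B blocks Task A; Task C blocks Task B
INSERT INTO entity_relationships VALUES
  ('b', 'a', 'blocks'),
  ('c', 'b', 'blocks');
""")

rows = conn.execute("""
WITH RECURSIVE blocker_chain AS (
  -- Base: direct blockers of the target entity
  SELECT e.entity_id, e.name, 1 AS distance,
         r.relationship_type AS rel_chain
  FROM entity_relationships r
  JOIN canonical_entities e ON r.source_entity_id = e.entity_id
  WHERE r.target_entity_id = :entity_id
    AND r.relationship_type IN ('blocks', 'depends_on')
  UNION ALL
  -- Recursive: blockers of blockers, growing the explanation path
  SELECT e.entity_id, e.name, bc.distance + 1,
         bc.rel_chain || ' -> ' || r.relationship_type
  FROM blocker_chain bc
  JOIN entity_relationships r ON bc.entity_id = r.target_entity_id
  JOIN canonical_entities e ON r.source_entity_id = e.entity_id
  WHERE r.relationship_type IN ('blocks', 'depends_on')
    AND bc.distance < 3
)
SELECT name, distance, rel_chain FROM blocker_chain ORDER BY distance
""", {"entity_id": "a"}).fetchall()

print(rows)
```

The `rel_chain` column is what lets the system explain the blockage ("A is blocked via blocks -> blocks through C") rather than just assert it.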
Pattern 3: The “N+1” Solver (Batch Enrichment)
The biggest performance killer in GraphDB integrations is the “N+1 problem.” If Vector Search returns 20 relevant entities, a naive application sends 20 separate queries to the GraphDB to fetch their owners and blockers. This network round-trip latency adds up fast.
In BigQuery, we solve this with UNNEST. We can enrich an entire batch of entities in a single query.
```sql
-- Fetch graph context for 20+ entities in one shot
WITH RECURSIVE
blocker_chain AS (
  SELECT r.target_entity_id AS blocked_id, …
  FROM entity_relationships r
  -- The magic: batch entry via UNNEST
  WHERE r.target_entity_id IN UNNEST(@entity_ids)
  UNION ALL …
)
```
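To make the batch idea concrete, here is a minimal `sqlite3` sketch. SQLite has no `UNNEST`, so a parameterized `IN (...)` list stands in for it; the sample entities and edges are invented. The point is structural: one query enriches the whole search-result batch instead of N follow-up queries:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE canonical_entities (entity_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE entity_relationships (source_entity_id TEXT, target_entity_id TEXT, relationship_type TEXT);
INSERT INTO canonical_entities VALUES ('p1', 'Project Alpha'), ('p2', 'Project Beta'), ('x', 'Vendor Delay');
INSERT INTO entity_relationships VALUES
  ('x', 'p1', 'blocks'),
  ('x', 'p2', 'blocks');
""")

# Entities returned by vector search, enriched in ONE query rather than one query each
entity_ids = ['p1', 'p2']
placeholders = ','.join('?' * len(entity_ids))

rows = conn.execute(f"""
SELECT r.target_entity_id AS blocked_id, e.name AS blocker
FROM entity_relationships r
JOIN canonical_entities e ON r.source_entity_id = e.entity_id
WHERE r.target_entity_id IN ({placeholders})
  AND r.relationship_type = 'blocks'
ORDER BY blocked_id
""", entity_ids).fetchall()

print(rows)  # one round trip for the whole batch
```

In BigQuery the `IN UNNEST(@entity_ids)` form does the same job natively against an array parameter, with no string-building required.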
A caveat on cold starts: BigQuery’s on-demand model incurs a noticeable first-query penalty. Our end-to-end pipeline latency — from user query to final LLM response — runs 30–40 seconds, a figure dominated by LLM generation and orchestration, not retrieval. The cold-start overhead is real but amortized across a session once the first query warms the pipeline.
Evaluation: BigQuery vs. Specialized GraphDB
We compared our implementation against a managed GraphDB instance.
| Metric | GraphDB | BigQuery (Recursive CTE) | Decision |
| --- | --- | --- | --- |
| Consistency | Eventual (sync lag) | Atomic (immediate) | Critical for trust |
| Complexity | High (new query language, ETL) | Low (SQL, single source) | Critical for maintenance |
| Cost | ~$800/mo (managed instance) | ~$50/yr (incremental compute)\* | 94% savings |
\* BigQuery is already our warehouse — this is the marginal cost of running graph queries on existing infrastructure, not a net-new service. The real saving is avoiding an additional managed dependency entirely.
While BigQuery is slower per-hop, the end-to-end latency for our “Chief of Staff” use case is actually lower because we eliminated the network overhead of moving data between the warehouse and the graph store.
Conclusion: A General Pattern for Analytical RAG
Engineering is about choosing the right constraints. By constraining ourselves to a “Single Warehouse,” we accepted a slightly higher traversal latency in exchange for massive wins in data consistency and operational simplicity.
We built this for a Chief of Staff agent, but the pattern is universal. You probably don’t need a dedicated GraphDB if your application fits these criteria:
- Bounded Depth: You only need to look 3-5 hops away (e.g., “what belongs to this”, “what depends on this”).
- High Data Gravity: Your core data (logs, transactions, embeddings) already lives in a data warehouse.
- Atomic Requirements: You cannot afford the latency or complexity of syncing data to a specialized graph store.
If you meet these criteria, your data warehouse is already a graph engine; you just need to write the recursive SQL to unlock it.
FAQs
1. When can SQL replace a graph database?
When traversal depth is limited, relationships are structured, and data already resides in a warehouse. Complex graph algorithms like PageRank or large-scale pathfinding still favor specialized engines.
2. Are recursive CTEs fast enough for graph queries?
For bounded depth traversals, yes. Per-hop latency may be higher than a native graph engine, but eliminating cross-system network calls often makes overall query time faster.
3. Why is atomic consistency important for RAG systems?
Because answers must reflect the latest state of enterprise data. Sync delays between storage systems can produce stale outputs that look like hallucinations.
4. What is the N+1 problem in graph retrieval?
It happens when each entity returned by search requires its own follow-up query for relationships. Batch querying with SQL UNNEST solves this by enriching many entities in one pass.
5. Does this approach work for real-time applications?
Yes if your warehouse supports low-latency queries and the graph depth is constrained. It is less suitable for millisecond-level transactional workloads.
6. Is this cheaper than running a graph database?
Often. If your warehouse already exists, running graph logic inside it usually adds marginal compute cost instead of requiring a new managed service.