How Grab’s Legal team automated contract reviews without losing expert control

Abhinav Kumar, Head of Future of Work, Cyborganisation; Timothy James Bogle, Head of IP, TMT and Legal Ops

May 4, 2026 · Regional

Every organisation has workflows where experts make judgment calls dozens of times a day.

Picture a customer support lead deciding to override a refund threshold for a loyal user; a finance manager handling an expense exception; or a legal counsel hunting for a hidden liability clause in a 50-page contract. In these moments, professionals must weigh company policies against messy, real-world context.

These processes are high-volume, knowledge-intensive, and notoriously hard to automate. However, we didn’t let that stop us from trying.

Earlier this year, Grab’s Legal team and Grabber Technology Solutions (GTS)—our internal corporate IT and enterprise technology department—partnered on a moonshot pilot to learn how to automate aspects of our legal work.

Challenge accepted: Automating legal work

We chose the first-pass review of Non-Disclosure Agreements (NDAs) as our testing ground. This is the initial check to see whether a contract follows Grab’s standard terms or needs a lawyer’s closer attention. NDAs are a vital step, for example, when Grab enters strategic partnership conversations and needs to exchange confidential business information, or when an external advisor needs access to sensitive product information.

The real barrier isn’t technology, it’s trust

We wanted to test whether generative AI could help us work more effectively, because it is good at reading high volumes of unstructured text, which the Legal team deals with daily.

So we asked one of the top large language models (LLMs) to read an NDA, extract the key facts and clauses, and tell us whether the agreement should pass, be flagged, or be escalated.

However, our first experiment revealed the limitations of relying on generative AI alone.

The LLM tool was useful in some ways, but it struggled to apply policy checks consistently across a full contract. It sometimes missed non-standard wording, interpreted the same phrases differently across runs, or produced recommendations that sounded plausible but didn’t stand up to the rigour required for an audit-grade review.

In legal work, the risk sits in the details. A single clause that changes liability, confidentiality obligations, or governing law can materially increase an organisation’s exposure if it slips through unnoticed. We knew we needed to build an extra layer of trust into the automation process.

In other words, when an AI contract review tool says “approved” or “denied,” or makes specific suggestions, the Legal team needs to know exactly why, with citations, audit trails, and zero tolerance for unsupported reasoning.

We went back to first principles and asked: what if we separated what AI is good at (reading unstructured text) from what code is good at (applying rules consistently)?

The breakthrough: AI reads, code decides, humans approve

That insight led to a hybrid decision engine built on three layers:

AI reads: Multiple LLMs extract facts and exact citations from the contract, using a consensus vote across models to stabilise outputs and catch unsupported claims.
Code decides: A deterministic “rulebook”, written and owned by the Legal team, evaluates the extracted facts against Grab’s policies. This rulebook translates the Legal team’s policy logic into explicit pass/fail checks and routing rules. Each rule has three parts: the condition that passes, the condition that fails, and the action that follows, for example auto-approve, suggest a standard amendment, or escalate to Legal. The Legal team maintains this rulebook by updating the policy logic as standards evolve, testing changes against a set of sample contracts, and reviewing the outputs before any rule changes are used in production. In other words, Legal still owns the policy; the code simply applies it consistently.
Humans approve: A reviewer interface presents facts, citations, and rule outcomes side-by-side, so legal experts can validate with full context. Anything the engine can’t decide automatically is escalated to them for review.
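To make the first two layers concrete, here is a minimal Python sketch of a consensus vote followed by a deterministic rule check. It is an illustration only, not Grab’s implementation: the field names, the quorum of two, the exact-substring citation check, and the two policy thresholds (Singapore governing law, a five-year term cap) are all hypothetical assumptions chosen for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional

# One model's extraction: field name -> (value, supporting quote from the contract).
Extraction = dict[str, tuple[str, str]]


def consensus(extractions: list[Extraction], contract_text: str,
              quorum: int = 2) -> dict[str, Optional[str]]:
    """Keep a value only if at least `quorum` models agree on it and each
    cites text that appears verbatim in the contract; otherwise leave it unresolved."""
    facts: dict[str, Optional[str]] = {}
    fields = {name for e in extractions for name in e}
    for name in sorted(fields):
        votes = Counter(
            e[name][0] for e in extractions
            if name in e and e[name][1] in contract_text  # drop unsupported citations
        )
        top = votes.most_common(1)
        facts[name] = top[0][0] if top and top[0][1] >= quorum else None
    return facts


@dataclass(frozen=True)
class Rule:
    field: str
    passes: Callable[[str], bool]  # condition that passes
    fails: Callable[[str], bool]   # condition that fails
    on_fail: str                   # action: "suggest_amendment" or "escalate"


def decide(facts: dict[str, Optional[str]], rules: list[Rule]) -> tuple[str, list[str]]:
    """Deterministic routing: the same facts always produce the same outcome."""
    outcome, reasons = "auto_approve", []
    for rule in rules:
        value = facts.get(rule.field)
        if value is not None and rule.passes(value):
            continue
        if value is not None and rule.fails(value):
            action = rule.on_fail
        else:
            action = "escalate"  # unresolved or unanticipated value goes to a lawyer
        reasons.append(f"{rule.field}: {action}")
        # Escalation outranks an amendment, which outranks auto-approval.
        if action == "escalate" or outcome == "auto_approve":
            outcome = action
    return outcome, reasons


# Hypothetical contract text, model outputs, and policy thresholds.
contract = "This Agreement is governed by the laws of Singapore. The term is 7 years."
model_outputs = [
    {"governing_law": ("Singapore", "governed by the laws of Singapore"),
     "term_years": ("7", "The term is 7 years")},
    {"governing_law": ("Singapore", "governed by the laws of Singapore"),
     "term_years": ("7", "term is 7 years")},
    {"governing_law": ("England", "governed by English law"),  # quote not in contract
     "term_years": ("5", "term of five years")},               # quote not in contract
]
rules = [
    Rule("governing_law", lambda v: v == "Singapore", lambda v: v != "Singapore", "escalate"),
    Rule("term_years", lambda v: int(v) <= 5, lambda v: int(v) > 5, "suggest_amendment"),
]

facts = consensus(model_outputs, contract)
outcome, reasons = decide(facts, rules)
print(outcome, reasons)  # suggest_amendment ['term_years: suggest_amendment']
```

In this sketch the third model’s answers are discarded because its quotes do not appear in the contract, and any field without a quorum falls through to escalation rather than being guessed. The reasons list gives a reviewer the rule that drove each outcome.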

We tested this engine against hundreds of variants spanning compliant, low-risk, and high-risk scenarios. The results: 100% rule evaluation accuracy, 0% false positives, and review time cut from 60 minutes to under 2 minutes. Legal retained final say at every step.

The experiment was a resounding success. The architecture we built (extract facts with AI, evaluate with deterministic rules, validate with humans) is now a blueprint for transforming judgment-based work across Grab’s business and corporate functions.

A blueprint for every knowledge-intensive team

For example, vendor requests for access to Grab’s systems or data can be screened against standard policy requirements, with only unusual or higher-risk cases escalated to a human decision-maker. This would significantly speed up the process.

Customer support dispute resolutions are another scenario this framework can be applied to. Teams can assess contested charges, refunds, or account actions against policy rules, transaction history, and exception thresholds before deciding whether to uphold, reverse, or escalate the case.

The playbook we’ve developed is straightforward:

Identify a high-volume workflow where judgment can be broken down into rules
Codify tribal knowledge into a rulebook-as-code and curate a test set
Prove decision alignment against expert judgment in a sandbox
Graduate from human-in-the-loop review to workflow integration to selective autonomy
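The third step, proving decision alignment, can be sketched as a comparison between the engine’s outcomes and expert outcomes on a curated test set. This is a hedged illustration: the function name, the case identifiers, the outcome labels, and the choice to count wrongful auto-approvals separately are assumptions made for the example, not details from the pilot.

```python
def alignment_report(engine: dict[str, str], expert: dict[str, str]) -> dict:
    """Compare engine outcomes with expert outcomes, case by case."""
    shared = sorted(engine.keys() & expert.keys())
    disagreements = [
        (case_id, engine[case_id], expert[case_id])
        for case_id in shared
        if engine[case_id] != expert[case_id]
    ]
    # Assumption: an auto-approval the expert would not have granted is the
    # costliest kind of error, so it is listed separately.
    missed_risk = [d for d in disagreements if d[1] == "auto_approve"]
    agreement = (len(shared) - len(disagreements)) / len(shared) if shared else 0.0
    return {
        "cases": len(shared),
        "agreement": agreement,
        "disagreements": disagreements,
        "missed_risk": missed_risk,
    }


# Hypothetical sandbox results for four sample contracts.
engine_decisions = {
    "nda-001": "auto_approve",
    "nda-002": "escalate",
    "nda-003": "auto_approve",
    "nda-004": "suggest_amendment",
}
expert_decisions = {
    "nda-001": "auto_approve",
    "nda-002": "escalate",
    "nda-003": "escalate",
    "nda-004": "suggest_amendment",
}

report = alignment_report(engine_decisions, expert_decisions)
print(report["agreement"])    # 0.75
print(report["missed_risk"])  # [('nda-003', 'auto_approve', 'escalate')]
```

Each disagreement points to either a rule that needs revising or a test case that needs relabelling, which is why the report keeps the individual cases rather than only the headline percentage.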

In practice, a small cross-functional team, combining technical support and deep domain expertise, can set up a working pilot in weeks.

What’s next

The Legal team will continue stress-testing our risk management framework, and then move the NDA engine from sandbox to production. This will be followed by expansion to commercial contracts, where teams need to review terms on liability, payment, and obligations. Next up are privacy reviews, where teams assess how personal data is collected, used, or shared; and
