Messages you exchange with your driver-partner in the Grab app are automatically translated. This GrabChat feature is meant to ensure that you’re able to communicate with driver- and delivery-partners seamlessly, even when you don’t speak the same language. 

When we received feedback that these translations were occasionally inaccurate, we put together an engineering squad to tackle the issue. 

An example of a bad translation. The correct translation is: ok sir.

The goal was simple: improve the quality of translations on our platform while keeping cost efficiency in mind.

Building an in-house translation model

Initially, we relied entirely on an external service to translate our chat messages. Although the third-party tool was generally effective, its translations weren't always accurate.

An example of an inaccurate translation from the third-party service recorded on Dec 19, 2023.

We realised we could create our own translation system and even aim to outperform off-the-shelf tools, which often lacked context. Our advantage? A wealth of data directly tied to the context of booking rides and ordering food, which lets us generate more precise translations.

What are users saying?

For example, we found that many of the chats centred around pickup points. Courtesies such as ‘Hi’ and ‘Thank you’ were also common. Users also frequently used GrabChat to inform each other about their arrivals.

This allowed us to build a dataset that was representative of the types of chat messages exchanged between users and driver-partners, and diverse enough to capture the nuances of those conversations.

Some examples of recurring topics include arrival notifications and pickup instructions.

By aggregating and analysing past conversations, we can pinpoint recurring themes and topics. This valuable information can then be used to enhance the training of our translation models and improve their accuracy and effectiveness.
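To make the aggregation step concrete, here is a minimal sketch (not our actual pipeline) that surfaces frequent messages by normalising and counting them; real theme analysis would involve more sophisticated clustering over far more data.

```python
# Minimal sketch: surface recurring chat themes by counting normalised messages.
# The sample messages are made up for illustration.
from collections import Counter

def normalise(message: str) -> str:
    """Lower-case and strip punctuation so near-identical messages group together."""
    return "".join(ch for ch in message.lower() if ch.isalnum() or ch.isspace()).strip()

messages = [
    "Thank you!",
    "thank you",
    "I'm at the lobby",
    "Where is the pickup point?",
    "where is the pickup point",
]

# Frequent normalised messages hint at themes such as courtesies and pickup points.
print(Counter(normalise(m) for m in messages).most_common(3))
```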

Setting the quality bar

We then worked with Grab’s localisation team to set a standard for translations. The goal wasn’t to create a dataset large enough to fully train our model, but to gather enough data to set some benchmarks.

Good vs bad translations

These translations will also serve as a guide for the model to accurately capture the desired style and tone.
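As an illustration of how candidate translations can be scored against the human-reviewed references, here is a small sketch using the chrF metric from the sacrebleu library. chrF is one common choice for this kind of benchmark; the metric and tooling we actually use aren't covered in this post.

```python
# Minimal sketch: score candidate translations against reference translations
# curated with the localisation team, using corpus-level chrF.
from sacrebleu.metrics import CHRF

def benchmark(candidates: list[str], references: list[str]) -> float:
    """Higher chrF means the candidates are closer to the human references."""
    return CHRF().corpus_score(candidates, [references]).score

# Example with a single sentence pair.
print(benchmark(["Ok sir, I'm at the lobby."], ["OK sir, I am at the lobby."]))
```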

How did we train our model?

To create our model, we needed training data. We used an open-source Large Language Model (LLM) to create artificial translation data. The model had to be large enough to produce high-quality results and capable of handling the many Southeast Asian languages across the markets we serve.
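As a rough illustration of this step, a script like the one below could prompt an open-source instruction-tuned LLM to translate chat messages. The model name is a placeholder rather than the model we actually used, and the prompt is only an example.

```python
# Minimal sketch: generate synthetic translations with an open-source LLM via
# Hugging Face transformers. MODEL_NAME is a placeholder, not our actual model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "an-open-source-instruct-llm"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)

def synth_translate(message: str, src_lang: str, tgt_lang: str) -> str:
    """Ask the LLM to translate a single chat message."""
    chat = [{
        "role": "user",
        "content": f"Translate this {src_lang} ride-hailing chat message into "
                   f"{tgt_lang}. Reply with the translation only.\n\n{message}",
    }]
    input_ids = tokenizer.apply_chat_template(
        chat, add_generation_prompt=True, return_tensors="pt"
    )
    output = model.generate(input_ids, max_new_tokens=64, do_sample=False)
    # Decode only the newly generated tokens, i.e. the translation itself.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
```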

This was especially important for languages like Vietnamese and Thai, which use large character sets and diacritics: marks, shapes or strokes that accompany letters.
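One concrete pitfall with diacritics, independent of any particular model: the same accented letter can be encoded either as a single precomposed character or as a base letter plus combining marks. The short example below (not from our pipeline) shows how Unicode normalisation makes the two forms comparable.

```python
# Minimal sketch: the Vietnamese letter "ế" encoded two different ways compares
# unequal until both strings are normalised to the same Unicode form (NFC).
import unicodedata

precomposed = "\u1ebf"         # "ế" as a single code point
combining = "e\u0302\u0301"    # "e" + combining circumflex + combining acute

print(precomposed == combining)  # False: byte-for-byte they differ
print(unicodedata.normalize("NFC", precomposed)
      == unicodedata.normalize("NFC", combining))  # True after normalisation
```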

(Read more: Why we redesigned a typeface for Thai and Cambodian scripts)

We then used the translations with high benchmark scores as data for our model to learn from. After all, the quality of an LLM’s translations is only as good as the data it is trained on.
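Conceptually, the filtering step looks something like the sketch below. The scoring function is passed in because it stands for whichever benchmark metric is trusted for a given language pair, and the threshold is illustrative rather than a production setting.

```python
# Minimal sketch: keep only synthetic (source, translation) pairs whose quality
# score clears a threshold. The scorer and threshold are assumptions for
# illustration, not our production configuration.
from typing import Callable, Iterable, List, Tuple

def filter_synthetic_pairs(
    pairs: Iterable[Tuple[str, str]],
    score: Callable[[str, str], float],
    threshold: float = 0.85,
) -> List[Tuple[str, str]]:
    """Return the pairs judged good enough to train on."""
    return [(src, tgt) for src, tgt in pairs if score(src, tgt) >= threshold]
```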

Fine-tuning our model

Before our translation system was good to go, we also had to make sure that elements such as numbers or unique symbols were not misinterpreted or omitted during translation. This is critical as displaying incorrect numbers in a translation can confuse users.

The model pulls out all non-translatable items from the original message, tallies each occurrence, and then attempts to find a corresponding match in the translation. If a match is not found, we reject the internally generated translation and fall back to an external third-party translation service.
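A minimal sketch of that check is shown below. The regular expression is an illustrative stand-in for the real list of non-translatable patterns, but the flow is the same: tally the items in the source, confirm each one survives in the translation, and fall back otherwise.

```python
# Minimal sketch: verify that numbers and special symbols in the source message
# also appear in the translation; otherwise fall back to the external service.
import re
from collections import Counter

# Illustrative pattern only: digits and a handful of special symbols.
NON_TRANSLATABLE = re.compile(r"\d+|[#@*+%/-]")

def non_translatables(text: str) -> Counter:
    """Count every non-translatable item appearing in the text."""
    return Counter(NON_TRANSLATABLE.findall(text))

def accept_translation(source: str, translation: str) -> bool:
    """Accept the in-house translation only if every source item is preserved."""
    src, tgt = non_translatables(source), non_translatables(translation)
    return all(tgt[item] >= count for item, count in src.items())

# A dropped house number should trigger the fallback to the third-party service.
print(accept_translation("Tôi ở số 12", "I am at number 12"))     # True
print(accept_translation("Tôi ở số 12", "I am at the building"))  # False
```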

Expanding our solutions 

We believe our in-house translation models are not only more cost-effective than third-party services, but also cater more accurately to our unique use cases. We will focus on expanding these models to more languages and countries across our operating regions.

We are also exploring opportunities to apply what we’ve learned from chat translation to other Grab content. This strategy aims to deliver a seamless language experience for our rapidly expanding user base, especially travellers.

The problem of language translation and translation quality benchmarking is highly complex. If you’d like to know more, for example about language detection (how do we know which languages to translate from, and into?) and the measures required to do all of this at manageable costs, dig into this post on Grab’s engineering blog, which covers these points in more detail.
