{"id":224482,"date":"2024-09-19T18:43:31","date_gmt":"2024-09-19T10:43:31","guid":{"rendered":"https:\/\/www.grab.com\/sg\/?post_type=editorial&#038;p=224482"},"modified":"2025-02-26T18:56:38","modified_gmt":"2025-02-26T10:56:38","slug":"grabchat-translations-accurate-engineering","status":"publish","type":"editorial","link":"https:\/\/www.grab.com\/sg\/inside-grab\/stories\/grabchat-translations-accurate-engineering\/","title":{"rendered":"We made translations on GrabChat more accurate. Here\u2019s how."},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"224482\" class=\"elementor elementor-224482\" data-elementor-post-type=\"editorial\">\n\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-67e9c96 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"67e9c96\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-57a7e8f\" data-id=\"57a7e8f\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap\">\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-a2facc5  gr21-boxed-content  editorial-gr21-boxed-content elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"a2facc5\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-7451b5e\" data-id=\"7451b5e\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element 
elementor-element-8399d75 elementor-widget elementor-widget-text-editor\" data-id=\"8399d75\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<p><span style=\"font-weight: 400;\">Messages you exchange with your driver-partner in the Grab app are automatically translated. This GrabChat feature is meant to ensure that you\u2019re able to communicate with driver- and delivery-partners seamlessly, even when you don\u2019t speak the same language.\u00a0<\/span><\/p><p><span style=\"font-weight: 400;\">When we received feedback that these translations were occasionally inaccurate, we put together an engineering squad to tackle the issue.\u00a0<\/span><\/p>\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ab52469 elementor-widget elementor-widget-image\" data-id=\"ab52469\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"700\" height=\"243\" src=\"https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180807\/translation-example-700x243.png\" class=\"attachment-large size-large wp-image-224501\" alt=\"\" srcset=\"https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180807\/translation-example-700x243.png 700w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180807\/translation-example-250x87.png 250w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180807\/translation-example-768x267.png 768w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180807\/translation-example-18x6.png 18w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180807\/translation-example-120x42.png 120w, 
https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180807\/translation-example.png 1070w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">An example of a bad translation. The correct translation is: ok sir.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-47c3bdc elementor-widget elementor-widget-text-editor\" data-id=\"47c3bdc\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<p><span style=\"font-weight: 400;\">The goal was simple. We had to improve the quality of translations on our platform, while keeping cost efficiency in mind.<\/span><\/p><h5><b>Building an in-house translation model<\/b><\/h5><p><span style=\"font-weight: 400;\">Initially, we completely relied on an external service to translate our chat messages. 
Although the third-party tool was generally effective, translations weren\u2019t always accurate.<\/span><\/p>\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a16cb9b elementor-widget elementor-widget-image\" data-id=\"a16cb9b\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"700\" height=\"409\" src=\"https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180854\/weird-translation-700x409.png\" class=\"attachment-large size-large wp-image-224502\" alt=\"\" srcset=\"https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180854\/weird-translation-700x409.png 700w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180854\/weird-translation-250x146.png 250w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180854\/weird-translation-768x449.png 768w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180854\/weird-translation-18x12.png 18w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180854\/weird-translation-120x70.png 120w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19180854\/weird-translation.png 1352w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">An example of an inaccurate translation from the third-party service recorded on Dec 19, 2023.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-62c4f1c elementor-widget elementor-widget-text-editor\" data-id=\"62c4f1c\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<p>We realised we could 
create our own translation system and even aim to outperform off-the-shelf tools that often lacked context. Our advantage? A wealth of data that is directly related to booking rides and ordering food, which lets us generate more precise translations.\u00a0<\/p><h5>What are users saying?<\/h5><p>We found that many of the chats centred on pickup points. Exchanges of courtesies such as \u2018Hi\u2019 and \u2018Thank you\u2019 were also common. Users also frequently used GrabChat to inform each other about their arrivals.\u00a0<\/p><p>This allowed us to build a dataset that is representative of the types of chat messages exchanged between users and driver-partners, and diverse enough to capture the nuances of the conversations.<\/p>\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-28f6ecb elementor-widget elementor-widget-image\" data-id=\"28f6ecb\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1418\" height=\"528\" src=\"https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19181236\/grabchat-topics-e1726741786152.png\" class=\"attachment-full size-full wp-image-224503\" alt=\"\" srcset=\"https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19181236\/grabchat-topics-e1726741786152.png 1418w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19181236\/grabchat-topics-e1726741786152-250x93.png 250w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19181236\/grabchat-topics-e1726741786152-700x261.png 700w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19181236\/grabchat-topics-e1726741786152-768x286.png 768w, 
https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19181236\/grabchat-topics-e1726741786152-18x7.png 18w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19181236\/grabchat-topics-e1726741786152-120x45.png 120w\" sizes=\"(max-width: 1418px) 100vw, 1418px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Some examples of recurring topics include arrival notifications and pickup instructions.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-295da91 elementor-widget elementor-widget-text-editor\" data-id=\"295da91\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<p>By aggregating and analysing past conversations, we can pinpoint recurring themes and topics. This valuable information can then be used to enhance the training of our translation models and improve their accuracy and effectiveness.<\/p><h5>Setting the quality bar<\/h5><p>We then worked with Grab&#8217;s localisation team to set a standard for translations. 
The goal wasn&#8217;t to create a dataset large enough to fully train our model, but to get enough data that would help us set some benchmarks.\u00a0<\/p>\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-203304e elementor-widget elementor-widget-image\" data-id=\"203304e\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1561\" height=\"421\" src=\"https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19182527\/translation-examples-e1726741548608.png\" class=\"attachment-full size-full wp-image-224507\" alt=\"\" srcset=\"https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19182527\/translation-examples-e1726741548608.png 1561w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19182527\/translation-examples-e1726741548608-250x67.png 250w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19182527\/translation-examples-e1726741548608-700x189.png 700w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19182527\/translation-examples-e1726741548608-768x207.png 768w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19182527\/translation-examples-e1726741548608-1536x414.png 1536w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19182527\/translation-examples-e1726741548608-18x5.png 18w, https:\/\/assets.grab.com\/wp-content\/uploads\/sites\/4\/2024\/09\/19182527\/translation-examples-e1726741548608-120x32.png 120w\" sizes=\"(max-width: 1561px) 100vw, 1561px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Good vs bad translations<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ec1da8a 
elementor-widget elementor-widget-text-editor\" data-id=\"ec1da8a\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<p>These translations will also serve as a guide for the model to accurately capture the desired style and tone.<\/p><h5>How to train a model?\u00a0<\/h5><p>To create our model, we also needed training data. We used an open-source Large Language Model (LLM) to create artificial translation data. The model had to be large enough to produce high-quality results and to handle the many Southeast Asian languages across the markets we serve.\u00a0<\/p><p>This was especially important for languages like Vietnamese and Thai that use large character sets and diacritics (marks, shapes or strokes that accompany letters).\u00a0<\/p><p><strong>(Read more: <a href=\"https:\/\/www.grab.com\/sg\/inside-grab\/stories\/why-we-redesigned-a-typeface-for-thai-and-cambodian-scripts\/\">Why we redesigned a typeface for Thai and Cambodian scripts<\/a>)<\/strong><\/p><p>We then used some of the translations with high benchmark scores as data for our model to learn from. After all, translations by LLMs are only as good as the data the models are trained on.<\/p><h5>Fine-tuning our model<\/h5><p>Before our translation system was good to go, we also had to make sure that elements such as numbers or unique symbols were not misinterpreted or omitted during translation. This is critical as displaying incorrect numbers in a translation can confuse users.<\/p><p>The model pulls out all non-translatable items from the original message, tallies each occurrence, and then attempts to find a corresponding match in the translation. 
If a match is not found, we reject the internally generated translation and fall back to an external third-party translation service.<\/p><h5>Expanding our solutions\u00a0<\/h5><p>We believe that our proprietary in-house translation models are not only more cost-effective but also cater more accurately to our unique use cases than third-party services. We will focus on expanding these models to more languages and countries across our operating regions.<\/p><p>We are also exploring opportunities to apply what we learned from chat translations to other Grab content. This strategy aims to guarantee a seamless language experience for our rapidly expanding user base, especially travellers.<\/p><p>The problem of language translation and translation quality benchmarking is highly complex. If you\u2019d like to know more, for example about language detection (how do we know which languages to translate from, and into?) and the measures required to do all of this at manageable costs, dig into\u00a0<a href=\"https:\/\/engineering.grab.com\/improved-translation-experience-with-cost-efficiency\">this post on Grab\u2019s engineering blog<\/a>, which covers these points in more detail.<\/p>\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-0c2530f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"0c2530f\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-f7c8864\" data-id=\"f7c8864\" data-element_type=\"column\">\n\t\t\t<div 
class=\"elementor-widget-wrap\">\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"parent":180237,"menu_order":0,"template":"grab21-default","acf":[],"_links":{"self":[{"href":"https:\/\/www.grab.com\/sg\/wp-json\/wp\/v2\/editorial\/224482"}],"collection":[{"href":"https:\/\/www.grab.com\/sg\/wp-json\/wp\/v2\/editorial"}],"about":[{"href":"https:\/\/www.grab.com\/sg\/wp-json\/wp\/v2\/types\/editorial"}],"version-history":[{"count":25,"href":"https:\/\/www.grab.com\/sg\/wp-json\/wp\/v2\/editorial\/224482\/revisions"}],"predecessor-version":[{"id":224527,"href":"https:\/\/www.grab.com\/sg\/wp-json\/wp\/v2\/editorial\/224482\/revisions\/224527"}],"up":[{"embeddable":true,"href":"https:\/\/www.grab.com\/sg\/wp-json\/wp\/v2\/editorial\/180237"}],"wp:attachment":[{"href":"https:\/\/www.grab.com\/sg\/wp-json\/wp\/v2\/media?parent=224482"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}