Computer Vision

How might we automate the process of recognizing the details of the vehicles from images, including make and model?

This is a data science assignment where you are expected to create a data model from a given training dataset.



Economies in Southeast Asia are turning to AI to solve traffic congestion, which hinders mobility and economic growth. The first step in the push towards alleviating traffic congestion is to understand travel demand and travel patterns within the city.

Can we accurately forecast travel demand based on historical Grab bookings to predict areas and times with high travel demand?

Please submit the final repository, including documentation, on or before 17 June 2019, 6.00pm (SGT).

In this challenge, participants are to build a model, trained on a historical demand dataset, that can forecast demand on a hold-out test dataset. Given all data up to time T, the model should accurately forecast demand for intervals T+1 to T+5, where each interval is 15 minutes long.
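To make the forecasting target concrete, here is a minimal sketch of two naive baselines that produce the five values for T+1 to T+5 from a demand history ending at T. The function names and the idea of using these as baselines are illustrative assumptions, not part of the challenge; any real submission would be expected to do better.

```python
from typing import List, Sequence

HORIZON = 5             # T+1 .. T+5
INTERVALS_PER_DAY = 96  # 24 hours * 4 fifteen-minute intervals


def persistence_forecast(history: Sequence[float], horizon: int = HORIZON) -> List[float]:
    """Repeat the last observed value (the demand at time T) for every future step."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


def seasonal_naive_forecast(
    history: Sequence[float],
    horizon: int = HORIZON,
    season: int = INTERVALS_PER_DAY,
) -> List[float]:
    """Forecast T+h with the value observed one season (one day) earlier.

    Falls back to persistence when less than one full season is available.
    Assumes history is a gap-free series whose last element is time T.
    """
    if len(history) < season:
        return persistence_forecast(history, horizon)
    # history[-1] is T, so T + h - season sits at index len - season + h - 1.
    start = len(history) - season
    return [history[start + h - 1] for h in range(1, horizon + 1)]
```

A baseline like this gives a reference score: a trained model is only useful if its error on T+1 to T+5 is lower than what simple repetition achieves.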

You can use the “Demand Data” dataset provided by Grab.

You are expected to create a Data Model based on the “Demand Data” dataset in order to solve the given problem statement.

You should also provide step-by-step documentation on how to run your code. Our evaluators will be running your data models on a test dataset.

The given dataset contains the normalised historical demand of a city, aggregated spatiotemporally within geohashes and over 15-minute intervals. The dataset spans a two-month period. A brief description of the dataset fields is given below:




  • Geohash, level 6. Geohash is a public domain geocoding system which encodes a geographic location into a short string of letters and digits with arbitrary precision. You are free to use any geohash library to encode/decode the geohashes into latitude and longitude or vice versa; libraries are available for Python and Java, among others.

  • Start time of the 15-minute interval, in the format <hour>:<minute>, where hour ranges from 0 to 23 and minute is one of 0, 15, 30 or 45.

  • Day, where the value indicates the sequential order and not a particular day of the month.

  • Aggregated demand, normalised to be in the range [0,1].
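The fields described above can be turned into numeric values with a few lines of code. The sketch below decodes a geohash to the centre of its cell using the standard geohash algorithm, and maps a day number plus an <hour>:<minute> string to a single sequential interval index. The function names are hypothetical, and the assumption that day numbering starts at 1 is not stated in the brief, so it is exposed as a parameter.

```python
from typing import Tuple

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet
INTERVALS_PER_DAY = 96  # 24 hours * 4 fifteen-minute intervals


def decode_geohash(code: str) -> Tuple[float, float]:
    """Return (latitude, longitude) of the centre of the geohash cell."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate, starting with longitude
    for ch in code:
        value = _BASE32.index(ch)  # raises ValueError on invalid characters
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            is_lon = not is_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


def interval_index(day: int, timestamp: str, first_day: int = 1) -> int:
    """Map (day, "<hour>:<minute>") to a sequential 15-minute interval index.

    Assumption: days are numbered consecutively starting at first_day.
    """
    hour_text, minute_text = timestamp.split(":")
    hour, minute = int(hour_text), int(minute_text)
    if not 0 <= hour <= 23 or minute not in (0, 15, 30, 45):
        raise ValueError(f"unexpected timestamp: {timestamp!r}")
    return (day - first_day) * INTERVALS_PER_DAY + hour * 4 + minute // 15
```

With a single integer index per record, "T+1 to T+5" becomes simple integer arithmetic, and the decoded coordinates can serve as spatial features.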

You will be judged on the following criteria:

Code Quality

Creativity in Problem-solving

Code Quality, also known as Software Quality, is generally defined in two ways:

  • Functional quality, which relates to how well the code conforms to the functional
    specifications and requirements of a project.

  • Structural quality, which relates to the maintainability and robustness of the code.

Creativity speaks volumes about your capability to make sense of given data, derive tangible results relevant to the business needs of an organization and present the findings, all while keeping the problem statement in mind.


Feature Engineering

Model Performance

Feature Engineering, also referred to as pre-processing, refers to the process of selecting and transforming variables when creating a data model for a given problem statement. While you will be given a general dataset which relates to the problem statement, you need to create “features” that make the models and algorithms work as intended.

Note that your code should automatically create your desired features so that they can be used in the evaluation of the hold-out test set.
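As one illustration of automated feature creation, the sketch below builds lagged-demand features for a single geohash at time T. Everything here is an assumption made for the example: the `DemandTable` layout, the choice of lags, and in particular the treatment of a missing (geohash, interval) pair as zero demand, which the brief does not confirm.

```python
from typing import Dict, Sequence, Tuple

INTERVALS_PER_DAY = 96  # 24 hours * 4 fifteen-minute intervals

# Hypothetical container: (geohash, interval index) -> normalised demand.
DemandTable = Dict[Tuple[str, int], float]


def build_features(
    demand: DemandTable,
    geohash: str,
    t: int,
    lags: Sequence[int] = (0, 1, 2, 3, INTERVALS_PER_DAY),
) -> Dict[str, float]:
    """Build features for one geohash using only data at or before interval t."""
    features: Dict[str, float] = {}
    for lag in lags:
        # Assumption: a missing (geohash, interval) pair means zero demand.
        features[f"lag_{lag}"] = demand.get((geohash, t - lag), 0.0)
    # Position within the day (0..95) lets a model learn daily seasonality.
    features["slot_of_day"] = float(t % INTERVALS_PER_DAY)
    return features
```

Because the function takes only a demand table and a time index, the same code can be run unchanged on the training data and on whatever test data the evaluators supply.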

Model performance measures how well a model represents the data and how well the chosen model is likely to work on new data. In this challenge, we will be performing a hold-out model evaluation. For this problem, you are given a training dataset, and our evaluators will have a test dataset (not seen by the model). This test dataset will assess the likely future performance of the model.

Test dataset details:

1. Timeframe: The test dataset can start from any time period after the timeframe of the training dataset. Your model can use features from up to 14 consecutive days of the test dataset, ending at timestamp T, to predict T+1 to T+5.

2. Geohash coverage: You may assume that the set of geohashes is the same in the training dataset and the test dataset. The original geohashes are anonymised, but you may assume that adjacency is maintained between the geohashes.
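The 14-day limit in the test dataset details can be enforced with a small filter. This is a sketch under two assumptions: records carry a sequential 15-minute interval index (a hypothetical representation, not a field of the dataset), and "14 consecutive days ending at T" is read as the 14 × 96 intervals up to and including T.

```python
from typing import Iterable, List, Tuple

INTERVALS_PER_DAY = 96  # 24 hours * 4 fifteen-minute intervals

# Hypothetical record layout: (geohash, interval index, normalised demand).
Record = Tuple[str, int, float]


def history_window(
    records: Iterable[Record],
    t_end: int,
    max_days: int = 14,
) -> List[Record]:
    """Keep only records inside the last max_days days ending at interval t_end."""
    first = t_end - max_days * INTERVALS_PER_DAY + 1
    # Both bounds are inclusive; anything after t_end would leak the future.
    return [r for r in records if first <= r[1] <= t_end]
```

Applying the filter before feature creation guards against accidentally using observations later than T or older than the permitted window.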

Submissions will be evaluated by RMSE (root mean squared error) averaged over all geohash6, 15-minute-bucket pairs.
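For local validation, RMSE can be computed as in the sketch below. It pools every (geohash6, 15-minute bucket) pair into a single root mean squared error; the evaluators' exact averaging procedure is not specified here, so this is one plausible reading rather than the official scoring code. The dictionary layout and function name are assumptions.

```python
import math
from typing import Dict, Tuple

# Hypothetical key: (geohash6, interval index) identifying one prediction target.
Key = Tuple[str, int]


def rmse(actual: Dict[Key, float], predicted: Dict[Key, float]) -> float:
    """Root mean squared error over every key present in `actual`."""
    if not actual:
        raise ValueError("actual must contain at least one observation")
    missing = [key for key in actual if key not in predicted]
    if missing:
        raise ValueError(f"{len(missing)} targets have no prediction")
    squared_error = sum((actual[key] - predicted[key]) ** 2 for key in actual)
    return math.sqrt(squared_error / len(actual))
```

Because demand is normalised to [0,1], the resulting score also lies in [0,1], with lower values indicating better forecasts.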



  • Submit the correct link to your repository

  • Make sure your repository includes the complete codebase (all commits are done, documentation is complete, etc.)

  • Solve only one of the challenges mentioned on the website

  • Do not plagiarise the code. That will be grounds for instant disqualification

  • The link to your repository must be publicly accessible from the time of submission.

You can submit the code (either as a codebase or a Jupyter notebook) by uploading it to a public GitHub or similar repository. The instructions to submit the repository link will be sent to you via email once you accept the challenge.
