Creating Smarter Travel Policies

Merck Challenge


Visualize the impact and effectiveness of Merck’s travel policy regarding booking flights. Using Merck’s travel data and by collecting airline data, explore if the “lowest price” policy is truly the best one.

Merck’s Travel Policy stipulates that employees:

  • Book travel at least 14 days in advance
  • Accept the lowest fare for a flight regardless of restrictions

Merck would like to know if the current policy is the most effective option regarding:

  • Travel cost
  • Productivity, with regard to the administration of the policy and its impact on employees
  • Employee happiness and frustration with the travel experience

Every airline has different algorithms that determine how many tickets will be sold at a given price and how prices change as the departure date approaches. Some airlines drastically increase ticket prices as the departure date nears while others cut prices. Airlines also have unique policies regarding surcharges for ticket changes and/or cancellations. Merck collects data on how many times a traveler makes changes to their flights and how many non-refundable/non-transferable tickets are lost each year.

Does booking at least 14 days in advance reliably provide travelers with low-cost flight options? Are there destinations, times of year, days of the week, etc., where booking a different number of days in advance provides sufficient options for travelers? Is a low-cost option even the best option? How does this policy affect their work and personal lives?


Merck provides two data sets that cover flight information for all Merck travelers from January 2012 – June 2013:

    1. By Trip (Ticket Level Report)- In this data set each trip has one row that includes information such as trip dates, the traveler’s airport itinerary, airline, and price details.
      Get the trip data…
    2. By Flight (Origin/Destination Report) – In this data set each flight has one row that includes information on flight date, originating airport, destination, airline, price details and a unique trip identifier code. A single business trip will occupy multiple rows, so long as it involved multiple destinations, and/or was round trip. To link all segments of a trip together you will need to use the unique trip identifier code.
      Get the flight data…

You’ll also need the data definition, which explains what each column means in each data set.
Get the data definition…

Note: The international trip data does not require the Exec/employee field, so the data in that field may be inconsistent with the data definition. The US-based trips follow the data definition.

Some information overlaps in these files, but there are some variables unique to each file. You will also need to assemble external data such as:

  1. Ticket Change Surcharges – For the airlines Merck travelers use, collect data on the surcharges a traveler may experience if they change, transfer, or cancel their ticket.
  2. Optimal Booking Date – Considering origins/destinations and time of year, collect data on when airlines increase (or decrease) their fares as the departure date approaches.


This is an example of another important business question for Merck – can we detect evidence of bad behavior and patterns of abuse by analyzing individual employee expense reports submitted over time. The screenshot is a heatmap that shows the metrics used to detect fraud. Red blocks indicates an increase in “exceptions” (travel and expense policy violations) for a business unit. The panel on the right shows the trend for that department by month, and the panel at the bottom shows each individual exception for that department.

Some of the variables that generate exceptions include:

  • Meals Over Limit – Total value and number of meal expenses that exceeded the per deim limit
  • Lagged Submission – Count of expenses submitted by an employee over 30 days after the receipt date
  • Non-AMEX – Sum and percent of total expenses that were not placed on a corporate card
  • Duplicative Expenses – Sum of expenses where multiple transactions are identical
  • Round Expenses – Sum and number of non-corporate card expenses that end in a 0 or 5
  • Category Mismatch – Sum and number of expenses incorrectly categorized as “Miscellaneous”

Click the image to see full size