Reducing Employee Insider Threats

Lockheed Martin Challenge


Theft of intellectual property is an increasing threat to organizations and can go unnoticed for months or even years. Incidents are also rising in which employees take proprietary information when they are searching for a new job or expect to be.

Congress has continually expanded and strengthened criminal penalties for violations of intellectual property rights to protect innovation and ensure that egregious or persistent intellectual property violations do not merely become a standard cost of doing business.

A domestic or foreign business competitor, or a foreign government, intent on illegally acquiring a company’s proprietary information and trade secrets may try to place a spy inside the company in order to gain access to non-public information. Alternatively, they may try to recruit an existing employee.

Your challenge is to visualize the risk posture of the employee base and identify top potential insider threats based on predictive analysis of data sets and the use of publicly available risk indicators:

  • Patterns and/or inconsistencies in travel & phone records
  • Inconsistencies regarding actual versus planned physical location
  • Personal stressors such as poor performance reviews or demotion
  • Birth country and/or citizenship in high-risk countries

Additionally, the presence of abnormalities or the absence of normality should weigh into your analysis.
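
One way to look for abnormalities is to compare each employee against the rest of the group on a single measure. The sketch below is a minimal illustration, not a prescribed method: it computes z-scores and flags values far from the group mean in either direction, so it catches unusually high activity as well as an unusual absence of it. The function name `flag_outliers`, the threshold of 2.0, the employee IDs, and the login counts are all hypothetical and do not come from the provided workbook.

```python
from statistics import mean, pstdev


def flag_outliers(values_by_employee, threshold=2.0):
    """Return {employee_id: z_score} for values more than `threshold`
    population standard deviations away from the group mean."""
    values = list(values_by_employee.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Everyone has the same value, so nothing stands out.
        return {}
    flagged = {}
    for employee_id, value in values_by_employee.items():
        z = (value - mu) / sigma
        if abs(z) > threshold:
            flagged[employee_id] = round(z, 2)
    return flagged


# Hypothetical counts of after-hours server logins per employee.
after_hours_logins = {
    "E001": 2, "E002": 3, "E003": 1, "E004": 2,
    "E005": 3, "E006": 2, "E007": 40,
}
print(flag_outliers(after_hours_logins))  # only E007 is flagged
```

A single measure is rarely conclusive on its own. In practice, flags from several indicators (travel, phone, performance, and access records) would likely be combined before ranking employees.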

Data and Additional Resources

The data sets you have been provided have been generated for the purposes of this competition and do not represent actual Lockheed Martin data or employee personal records. They are as follows:

  • Employee personal and contact information
  • Employee travel records
  • Employee phone records
  • Employee performance records
  • Employee citizenship records
  • Server access logs

Data Set (single Excel workbook with multiple tabs) [Lockheed Martin-provided file]

The following resources can also inform your analysis:

FBI Publication: The Insider Threat: An introduction to detecting and deterring an insider spy

Software Engineering Institute Publication: Common Sense Guide to Mitigating Insider Threats 4th Edition

Want more data related to this challenge? Check out the Temple Library Analytics Challenge guide.


Front page photo credit.

Finding “Hot Spots” for Election Spending

NBCUniversal Challenge


The 2014 Federal midterm elections may deliver a new record for midterm spending. Messaging and engagement of voters in crucial races will take priority. NBCUniversal (NBCU) would like to better understand the distribution of spending across Congressional Districts and Nielsen® DMAs (Designated Market Areas), as well as how they relate to markets with NBC owned-and-operated (O&O) and affiliate stations.

NBCU would like to use this data to better understand:

  • Which key markets (DMAs) present the greatest opportunities for engagement through advertising spending (e.g., tight races, or the highest ratios of funds raised to funds spent in key races)?
  • What demographics are represented in those key markets (DMAs), and how do they relate to NBCU’s audience segments?
  • Based on the fundraising and spending patterns of the midterm elections and their subsequent results, how should NBCU best position itself for the upcoming midterm and Presidential elections (2016)?

Develop a visualization (static or interactive) that reflects campaign spending, as provided by the Federal Election Commission (FEC), by Committee, Candidate, and total Contributions by Individual for each Congressional District of the 113th Congress. This information should be triangulated with Nielsen® DMA and local station information.

Potentially relevant information for individual candidates may include:

  • Spending by PAC (Political Action Committee)
  • Party and PAC affiliation
  • Individual contributions
  • Ratio of funds raised to funds spent, by PAC and party
  • Demographic data for candidates’ respective Congressional District
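
The ratio of funds raised to funds spent can be rolled up from candidates to markets. The sketch below is one possible approach under stated assumptions: each record already carries a DMA label (the district-to-DMA mapping is assumed to exist, and real districts can span more than one DMA), and the function name `rank_dmas_by_raised_to_spent`, the field names, the DMA labels, and the dollar amounts are all hypothetical rather than taken from FEC or Nielsen data.

```python
from collections import defaultdict


def rank_dmas_by_raised_to_spent(records):
    """Sum raised and spent per DMA, then rank DMAs by raised/spent.

    DMAs with no recorded spending are skipped to avoid dividing by zero.
    """
    totals = defaultdict(lambda: {"raised": 0.0, "spent": 0.0})
    for record in records:
        totals[record["dma"]]["raised"] += record["raised"]
        totals[record["dma"]]["spent"] += record["spent"]

    ratios = {
        dma: t["raised"] / t["spent"]
        for dma, t in totals.items()
        if t["spent"] > 0
    }
    # Highest ratio first: more money raised relative to money spent.
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Hypothetical candidate-level records, already labeled with a DMA.
records = [
    {"dma": "DMA A", "raised": 1000.0, "spent": 500.0},
    {"dma": "DMA A", "raised": 500.0, "spent": 250.0},
    {"dma": "DMA B", "raised": 800.0, "spent": 800.0},
    {"dma": "DMA C", "raised": 300.0, "spent": 0.0},
]
print(rank_dmas_by_raised_to_spent(records))
# [('DMA A', 2.0), ('DMA B', 1.0)]
```

A high ratio suggests funds that have not yet been spent, which is one reading of "opportunity" in the questions above. How close a race is would need a separate measure.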


Want more data related to this challenge? Check out the Temple Library Analytics Challenge guide.



Understanding A Corporate Move’s Impact

Merck Challenge


When a corporation decides to relocate a major site, the impact on employees is not always fully understood. The total effect on employees’ quality of life and commute time is often not taken into consideration.

Merck recently decided to move its corporate headquarters from Whitehouse Station, New Jersey to Kenilworth, New Jersey. Leadership wanted to understand the impact on the commute experience for the over 2,400 employees who worked at that site and whether certain organizations were more affected than others. Beyond this, there are also larger environmental impacts of such a change, caused by shifts in driving patterns, increased traffic at the new location, and potential impacts on public transit.

Your challenge is to characterize, quantify, and visualize the impact of the change of location on those 2,400 Merck employees. Specifically, your analysis should address one or more of the following questions:

  • What was the impact of relocating from Whitehouse Station, NJ to Kenilworth, NJ?
  • How does this compare to the impact of relocating instead to West Point, PA?
  • For either option, are certain organizations more negatively impacted than others?
  • If the commute experience were the only factor in making a decision and Merck could move anywhere, is there a different location that would be ideal for most employees? For the surrounding communities?

Data and Resources

  • An anonymized list of 2,453 employees who currently commute to the Whitehouse Station site. The data set includes each employee’s division, home zip code, and organization code (Microsoft Excel). [Merck-provided file]
    Note 10/8/2014: Merck pointed out that there was an error in some of the zip codes in the original file. The first tab of the workbook has the entire corrected data set and the second tab has a list of the affected zip codes. [Corrected file] [Original file, for reference]
  • A list of valid US zip codes, including a tool embedded in the sheet that allows you to compute the distance between two zip codes (Microsoft Excel). [Supplemental File]
  • The zip code of Whitehouse Station, NJ is 08889.
  • The zip code of Kenilworth, NJ is 07033.
  • The zip code of West Point, PA is 19486.
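
If zip codes are converted to latitude and longitude, the change in distance to work can be estimated per organization. The following is a rough sketch under several assumptions: straight-line (haversine) distance stands in for the commute, which ignores roads, traffic, and transit; the function names `haversine_km` and `mean_change_by_org` and the field names are hypothetical; and the coordinates are simple placeholders, not the actual locations of the zip codes listed above.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_change_by_org(employees, old_site, new_site):
    """Average change in distance to work (new minus old) per organization.

    Positive values mean the new site is farther away on average.
    """
    changes = defaultdict(list)
    for employee in employees:
        home = employee["home"]
        delta = haversine_km(*home, *new_site) - haversine_km(*home, *old_site)
        changes[employee["org"]].append(delta)
    return {org: sum(d) / len(d) for org, d in changes.items()}


# Placeholder coordinates (lat, lon), one degree of latitude apart.
old_site = (0.0, 0.0)
new_site = (1.0, 0.0)
employees = [
    {"org": "X", "home": (0.0, 0.0)},  # lives at the old site
    {"org": "Y", "home": (1.0, 0.0)},  # lives at the new site
]
print(mean_change_by_org(employees, old_site, new_site))
```

In this toy case organization X ends up about 111 km farther from work and organization Y about 111 km closer. Driving time or transit time would be a better measure than straight-line distance where that data is available.
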

Want more data related to this challenge? Check out the Temple Library Analytics Challenge guide.

