
Inrix, Big Data & the fine art of anonymity

How do you protect personal privacy while still allowing data to be of use in intelligent transportation? Ahmed Darrat of Inrix offers some thoughts on finding that balance...
January 9, 2025 Read time: 7 mins
Anonymisation techniques can include removing location data or rotating device IDs (© Weerapat Kiatdumrong | Dreamstime.com)

In today's rapidly-evolving transportation landscape, probe data from aggregated mobile devices and vehicles has become a critical asset for rapidly scaling mobility insights and traffic safety solutions.

This data has unlocked cost-effective solutions for managing traffic, increasing roadway safety, and planning equitable, sustainable and livable communities. As the landscape evolves, data will continue to serve as the bedrock of innovation.

However, as regulators and data providers enact new data privacy policies, the intelligent transportation industry has moved past “peak GPS”. As an industry we must balance both protecting user privacy and maintaining data utility. It is essential for transportation professionals to focus on outputs of solutions – like accuracy and precision of metrics – to help deliver outcomes for the public, rather than inputs like penetration rates.

To focus on outputs and outcomes in the current data paradigm, we must understand how anonymisation, data quality and data governance impact the pipeline of information from connected devices (e.g. phones and vehicles) in the field to your computer screen.

 

Data anonymisation

Anonymisation and de-identification techniques are designed to protect personal privacy by obfuscating, reducing or eliminating information that could potentially identify an individual. This includes sensitive locations and any personal data such as unique device identifiers (like VIN, MAC address, etc), frequently-visited places or patterns of movement that can be linked to a specific person. Under-anonymising data could reveal personal information or provide the necessary insights to reverse-engineer and obtain sensitive information. On the other hand, over-anonymising can significantly reduce the data’s value for transportation solutions.

For companies seeking to protect not only their customers but also their brands, privacy must be a core value. By committing to protecting data as it is collected, companies can enhance handling practices to ensure privacy protection goes beyond individual data provider standards. These practices include device ID rotation, data scrubbing, start and end point obfuscation, and other advanced anonymisation techniques.
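As an illustration of start and end point obfuscation, the sketch below drops every GPS point that falls within a fixed radius of a trip's first or last point, so that a home or workplace is not exposed by the trace. It is a minimal example under stated assumptions, not Inrix's method: the function names and the 200-metre radius are hypothetical choices for illustration.

```python
import math

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6_371_000 * math.asin(math.sqrt(h))

def trim_trip_ends(points, radius_m=200.0):
    """Drop points within radius_m of the trip's first or last point.

    points is an ordered list of (lat, lon) tuples; radius_m is an
    assumed, illustrative value rather than an industry standard.
    """
    if not points:
        return []
    start, end = points[0], points[-1]
    return [p for p in points
            if haversine_m(p, start) > radius_m
            and haversine_m(p, end) > radius_m]

# Hypothetical trip: 11 points spaced roughly 111 m apart along a meridian.
trip = [(i / 1000, 0.0) for i in range(11)]
trimmed = trim_trip_ends(trip)
```

A larger radius hides more of the trip ends but also removes more usable data, which is the trade-off between under- and over-anonymising described above.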

 


 

Anonymisation is not new; for some companies, like Inrix, it has been an important practice since day one. Companies committed to data protection must continuously adapt methods to address evolving privacy requirements while ensuring data quality. As data sources evolve, so do methods of data care.

Transportation and mobility solutions rely on aggregated, anonymised data to provide valuable insights and improve safety. Key use cases include:

•    Optimising signal timing based on traffic flow trends
•    Assessing mobility impacts approaching workzones
•    Performing origin-destination studies 
•    Monitoring daily traffic conditions
•    Identifying measures to improve road safety for all road users 

For these applications and many others, individual user identification is neither necessary nor desirable. However, when selecting a vendor to provide the detailed information necessary to deliver these and other use cases, two primary factors should be considered:

•    Commonly-visited locations can often include homes, workplaces, schools, places of worship or grocery stores. 
•    Device IDs enable connecting multiple trips to the same driver or vehicle. 

When both are known, bad actors could potentially identify the behaviour of individual vehicles.

Anonymisation techniques can include removing location data or rotating device IDs. Hiding, deleting or obfuscating either of these items increases personal privacy. There are currently at least four approaches to data anonymisation. Each method takes a different approach to protecting personal privacy while supporting transportation use cases.
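Device ID rotation can be sketched as follows. This is a hedged example of one possible approach, a keyed hash that changes with each rotation period, and not a description of any provider's actual scheme. The function name `rotated_id`, the 24-hour period and the sample identifier are all assumptions made for illustration.

```python
import hashlib
import hmac
from datetime import datetime, timezone

def rotated_id(device_id: str, when: datetime, secret: bytes,
               period_hours: int = 24) -> str:
    """Return a pseudonym for device_id that changes every period_hours.

    Trips within one period share a pseudonym; trips in different periods
    cannot be linked without the secret key. The period length is an
    assumed value chosen for illustration.
    """
    period_index = int(when.timestamp()) // (period_hours * 3600)
    message = f"{device_id}|{period_index}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()[:16]

secret = b"example-secret-key"   # hypothetical key, kept by the data processor
vin = "EXAMPLEVIN0000001"        # hypothetical identifier
morning = rotated_id(vin, datetime(2025, 1, 9, 8, 0, tzinfo=timezone.utc), secret)
evening = rotated_id(vin, datetime(2025, 1, 9, 17, 0, tzinfo=timezone.utc), secret)
next_day = rotated_id(vin, datetime(2025, 1, 10, 8, 0, tzinfo=timezone.utc), secret)
```

Here the morning and evening trips share a pseudonym, while the next day's trip receives a different one. Shorter rotation periods increase privacy but make it harder to connect the legs of a single journey.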

The table below visualises each of the most commonly-used methods:

 

Data quality

When examining changes made to the data by vehicle OEMs, logistics providers and mobile device companies, it is difficult to discuss the impact on product outputs without venturing into a discussion about inputs. Most often, customers have questions about penetration rates because this is a relatively simple calculation. However, we have found that once the penetration rate is above a certain minimum threshold, any additional increase has limited impact on the products.
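To show why penetration rate is such an easy number to ask about, here is a minimal sketch of the calculation: the share of all vehicles at a location that appear in the probe data. The counts are made-up example values, and the reference count is assumed to come from an independent source such as a roadside counter.

```python
def penetration_rate(probe_vehicles: int, total_vehicles: int) -> float:
    """Share of all vehicles that were observed in the probe data."""
    if total_vehicles <= 0:
        raise ValueError("total_vehicles must be positive")
    return probe_vehicles / total_vehicles

# Hypothetical hour at one location: 45 probe vehicles out of 900 counted.
rate = penetration_rate(45, 900)
print(f"{rate:.1%}")  # prints 5.0%
```

The simplicity is the point: the figure describes an input, and says nothing directly about the accuracy of the metrics produced from it.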

There are two main areas of focus when evaluating whether a new data source meets users’ needs:

•    Frequency of data: How often do we receive information from the vehicle or device?
•    Reliability of the data: How sure are we that the data we receive best represents the ground truth?

The frequency of data pings is what enables the most advanced use cases. High-frequency data lets us observe a vehicle’s movement often enough that AI engines, combined with detailed information about the context of the location, can determine much more in-depth information. For example, even at relatively low penetration rates, metrics calculated with high-frequency vehicle data can be highly accurate: for percent arrivals on green, the mean absolute percent error is less than 10% when compared to hardware-based systems. Additionally, high-frequency data allows us to calculate turning movement-level data and overall control delay, something low-frequency mobile data solutions and hardware cannot provide.
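The accuracy measure mentioned above, mean absolute percent error (MAPE), can be sketched as below. The percent-arrivals-on-green figures are invented for illustration only; they are not Inrix results, and the function name is a hypothetical choice.

```python
def mean_absolute_percent_error(estimated, reference):
    """MAPE in percent, skipping pairs where the reference value is zero."""
    pairs = [(e, r) for e, r in zip(estimated, reference) if r != 0]
    if not pairs:
        raise ValueError("no usable reference values")
    return 100 * sum(abs(e - r) / abs(r) for e, r in pairs) / len(pairs)

# Hypothetical percent arrivals on green at three intersections.
probe_based = [62.0, 48.0, 75.0]     # estimated from probe data
hardware_based = [60.0, 50.0, 80.0]  # measured by roadside hardware
mape = mean_absolute_percent_error(probe_based, hardware_based)
print(f"MAPE: {mape:.2f}%")  # prints MAPE: 4.53%
```

Reporting an output metric like this one, rather than an input like penetration rate, tells a practitioner directly how far the probe-based figure sits from the hardware reference.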

Data from aggregated mobile devices and vehicles has become a critical asset for rapidly scaling mobility insights and traffic safety solutions (© Si Le | Dreamstime.com)

Data reliability is an especially important topic. As data providers restrict data usage based on privacy concerns, more data brokers have entered the market. Because many of these brokers have been paid on a ‘per point’ basis, they are incentivised to provide the largest number of data points possible. As a result, a fair number of these brokers have been plagued with fraudulent data, often in the form of replayed historic data. To counter this, solutions providers need to cross-check incoming data against other real-time sources of unplanned events, such as incidents and closures, to quickly flag and delist potentially fraudulent data.
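One way to picture the cross-check against closures is sketched below: a source that reports vehicles moving along a road segment while that segment is known to be closed is a candidate for replayed or fabricated data. This is a simplified illustration and not a production fraud detector; the record layout, the 5 km/h "moving" cut-off and the 2% threshold are all assumptions.

```python
from collections import Counter

def flag_suspect_sources(pings, closures, max_share=0.02, moving_kph=5.0):
    """Return sources with too many moving pings on closed segments.

    pings: iterable of (source, segment_id, timestamp_s, speed_kph)
    closures: dict mapping segment_id to a list of (start_s, end_s)
    max_share and moving_kph are assumed, illustrative thresholds.
    """
    total, on_closed = Counter(), Counter()
    for source, segment, ts, speed in pings:
        total[source] += 1
        closed = any(start <= ts <= end
                     for start, end in closures.get(segment, []))
        if closed and speed > moving_kph:
            on_closed[source] += 1
    return sorted(s for s in total if on_closed[s] / total[s] > max_share)

# Hypothetical data: seg-1 is closed between t=1000 and t=2000.
closures = {"seg-1": [(1000, 2000)]}
pings = [
    ("broker_a", "seg-1", 1500, 60.0),
    ("broker_a", "seg-1", 1510, 62.0),
    ("broker_a", "seg-1", 1520, 61.0),
    ("broker_a", "seg-2", 1500, 40.0),
    ("broker_b", "seg-1", 500, 60.0),   # before the closure
    ("broker_b", "seg-1", 1500, 0.0),   # stopped at the closure
    ("broker_b", "seg-2", 1500, 45.0),
]
suspects = flag_suspect_sources(pings, closures)
```

In this example only `broker_a` is flagged, because it reports free-flowing traffic on a closed segment. Unplanned events are useful for this purpose precisely because they could not have been present in historic data.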

 

Data governance

When assessing the utility of data coming from a specific solution, it is essential that users understand the governance of that data from the point it was created until the insights and outputs are delivered. This can be done with qualitative information without revealing intellectual property or trade secrets.

When the industry faced a data disruption, solutions providers had a choice: obtain additional data sources or outsource processing and insights to a third party. Many solutions providers pivoted business models from obtaining raw data, applying their intellectual property, and delivering their own insights to visualising third-party insights.

While this approach may help to quickly go to market with a wide array of features, it can be a significant problem for the solutions providers’ end users. Without the vertical integration of raw data processing, delivery of insights, and development of user interfaces, end users are locked into a product even if a specific feature is missing or the product doesn’t work for them. For example, a solutions provider that only built a visualisation tool doesn’t have direct control of its supplier’s roadmap and cannot be fully responsive to improvements or corrections.

By contrast, Inrix brought on numerous additional data sources to fill gaps and delivered a wide range of new algorithms that allowed data from existing providers to be used across more products. In the end, we maintained complete control over our products and roadmaps.

 

Navigating the path forward

In this evolving landscape, organisations need to carefully evaluate their data sources to ensure they align with their privacy and data utility goals. When assessing potential data providers, consider the following key questions and factors:

1.    What is the source of the data (i.e. passenger vehicle, freight, local delivery or mobile devices)? 
2.    How accurate and frequent is the GPS signal? 
3.    What processes are in place to filter out non-transportation-related signals and identify potential biases?
4.    How does the provider handle data fraud and ensure the integrity of its data?
5.    What is the provider's approach to privacy and anonymisation, and how does it balance privacy with data utility?
6.    How much control does the provider have over the processing of raw data to deliver and visualise insights?

The power of data lies in its ability to transform lives and solve complex problems. As privacy concerns take centre stage, companies that continue to innovate and develop privacy-preserving solutions that benefit our communities will drive the industry forward.

The end goal should be a connected device and probe data market that ensures the development of technologies that protect the privacy of individuals while enabling critical mobility and safety solutions.

ABOUT THE AUTHOR
Ahmed Darrat is Inrix chief product officer


 
