
Inrix, Big Data & the fine art of anonymity

How do you protect personal privacy while still allowing data to be of use in intelligent transportation? Ahmed Darrat of Inrix offers some thoughts on finding that balance...
January 9, 2025
Anonymisation techniques can include removing location data or rotating device IDs (© Weerapat Kiatdumrong | Dreamstime.com)

In today's rapidly-evolving transportation landscape, probe data from aggregated mobile devices and vehicles has become a critical asset for rapidly scaling mobility insights and traffic safety solutions.

This data has unlocked cost-effective solutions for managing traffic, increasing roadway safety, and planning equitable, sustainable and livable communities. As the landscape evolves, data will continue to serve as the bedrock of innovation.

However, as regulators and data providers enact new data privacy policies, the intelligent transportation industry has moved past “peak GPS”. As an industry we must balance both protecting user privacy and maintaining data utility. It is essential for transportation professionals to focus on outputs of solutions – like accuracy and precision of metrics – to help deliver outcomes for the public, rather than inputs like penetration rates.

To focus on outputs and outcomes in the current data paradigm, we must understand how anonymisation, data quality and data governance impact the pipeline of information from connected devices (e.g. phones and vehicles) in the field to your computer screen.

 

Data anonymisation

Anonymisation and de-identification techniques are designed to protect personal privacy by obfuscating, reducing or eliminating information that could potentially identify an individual. This includes sensitive locations and any personal data such as unique device identifiers (like VIN, MAC address, etc), frequently-visited places or patterns of movement that can be linked to a specific person. Under-anonymising data could reveal personal information or provide the necessary insights to reverse-engineer and obtain sensitive information. On the other hand, over-anonymising can significantly reduce the data’s value for transportation solutions.

For companies seeking to protect not only their customers but also their brands, privacy must be a core value. By committing to protecting data as it is collected, companies can enhance handling practices to ensure privacy protection goes beyond individual data provider standards. These practices include device ID rotation, data scrubbing, start and end point obfuscation, and other advanced anonymisation techniques.

 


 

Anonymisation is not new; for some companies, like Inrix, it has been an important practice since day one. Companies committed to data protection must continuously adapt methods to address evolving privacy requirements while ensuring data quality. As data sources evolve, so do methods of data care.

Transportation and mobility solutions rely on aggregated, anonymised data to provide valuable insights and improve safety. Key use cases include:

•    Optimising signal timing based on traffic flow trends
•    Assessing mobility impacts approaching workzones
•    Performing origin-destination studies 
•    Monitoring daily traffic conditions
•    Identifying measures to improve road safety for all road users 

For these applications and many others, individual user identification is neither necessary nor desirable. However, when selecting a vendor to provide the detailed information necessary to deliver these and other use cases, two primary factors should be considered:

•    Commonly-visited locations can often include homes, workplaces, schools, places of worship or grocery stores. 
•    Device IDs enable connecting multiple trips to the same driver or vehicle. 

When both are known, bad actors could potentially identify behaviour of individual vehicles.

Anonymisation techniques can include removing location data or rotating device IDs. Hiding, deleting or obfuscating either of these items increases personal privacy. There are currently at least four approaches to data anonymisation. Each method takes a different approach to protecting personal privacy while supporting transportation use cases.
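As a rough illustration of two of the techniques named above, the following Python sketch rotates a device ID into a daily pseudonym and trims the start and end of a trip. It is a minimal example under stated assumptions, not a description of any vendor's actual pipeline: the function names, the daily rotation period, the 16-character pseudonym length and the 200-metre trim distance are all hypothetical choices made for illustration.

```python
import hashlib
import hmac
import math
from datetime import datetime, timezone


def rotate_device_id(device_id: str, when: datetime, secret: bytes) -> str:
    """Return a pseudonym that changes every UTC day (hypothetical rotation period).

    Trips on different days get different IDs, so they cannot be linked
    without the secret key. Naive datetimes are treated as local time.
    """
    period = when.astimezone(timezone.utc).strftime("%Y-%m-%d")
    message = f"{device_id}|{period}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()[:16]


def haversine_m(a: tuple, b: tuple) -> float:
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6_371_000 * math.asin(math.sqrt(h))


def trim_trip_ends(points: list, trim_m: float = 200.0) -> list:
    """Drop points within trim_m metres (along the path) of the trip start or end.

    This obscures the exact origin and destination, which are the points
    most likely to reveal a home or workplace.
    """
    if len(points) < 2:
        return []
    cumulative = [0.0]
    for prev, cur in zip(points, points[1:]):
        cumulative.append(cumulative[-1] + haversine_m(prev, cur))
    total = cumulative[-1]
    return [p for p, d in zip(points, cumulative) if trim_m <= d <= total - trim_m]
```

The trade-off described above is visible in the parameters: a shorter rotation period or a longer trim distance increases privacy but removes more of the information that origin-destination studies depend on.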

The table below visualises each of the most commonly-used methods:

 

Data quality

When identifying changes to the data by vehicle OEMs, logistics providers and mobile device companies, it is difficult to discuss the impacts to the outputs of products without venturing into a discussion about inputs. Most often, customers have questions about penetration rates because this is a relatively simple calculation. However, we have found that once the penetration rate is above a certain minimum threshold, any additional increase has limited impact on the products.

There are two main areas of focus when evaluating whether a new data source meets users’ needs:

•    Frequency of data: How often do we receive information from the vehicle or device?
•    Reliability of the data: How sure are we that the data we receive best represents the ground truth?

The frequency of data pings allows for the most advanced use cases. High-frequency data allows us to observe a vehicle’s movement often enough that AI engines can determine much more in-depth information when it is combined with detailed information about the context of the location. For example, even at relatively low penetration rates, metrics calculated with high-frequency vehicle data can be highly accurate: the mean absolute percent error for percent arrivals on green is less than 10% when compared to hardware-based systems. Additionally, high-frequency data allows us to calculate turning movement-level data and overall control delay, something low-frequency mobile data solutions and hardware cannot provide.
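For readers unfamiliar with the accuracy metric mentioned above, mean absolute percent error (MAPE) is the average of the absolute differences between an estimate and a reference value, each expressed as a percentage of the reference. The sketch below shows the calculation in Python. The sample values are made up purely for illustration and are not measurements from any real intersection or product.

```python
def mean_absolute_percent_error(reference: list, estimate: list) -> float:
    """Return MAPE (in percent) of estimate against reference values."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = []
    for ref, est in zip(reference, estimate):
        if ref == 0:
            raise ValueError("reference values must be non-zero")
        errors.append(abs(est - ref) / abs(ref))
    return 100.0 * sum(errors) / len(errors)


# Illustrative, made-up percent-arrivals-on-green values for four approaches:
# one list from a hardware-based system, one derived from probe data.
hardware_pog = [60.0, 45.0, 80.0, 50.0]
probe_pog = [57.0, 48.0, 76.0, 52.0]

mape = mean_absolute_percent_error(hardware_pog, probe_pog)
print(f"MAPE: {mape:.1f}%")  # prints "MAPE: 5.2%"
```

Focusing on a metric like this, rather than on penetration rate, is what evaluating outputs instead of inputs looks like in practice.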

Data from aggregated mobile devices and vehicles has become a critical asset for rapidly scaling mobility insights and traffic safety solutions (© Si Le | Dreamstime.com)

Data reliability is an especially important topic. As data providers restrict data usage based on privacy concerns, more data brokers have entered the market. Because many of these brokers have been paid on a ‘per point’ basis, they are incentivised to provide the largest number of data points possible. As a result, a fair number of these brokers have been plagued with fraudulent data, often replaying historic data. To counter this, solutions providers need to leverage other unplanned, real-time data sources such as incidents and closures to quickly flag and delist potentially fraudulent data.
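One way to picture the cross-check described above: replayed historic data cannot know about an unplanned closure, so it will tend to report moving traffic on a road that is actually shut. The Python sketch below flags providers whose points do this too often. It is a simplified illustration, not any provider's actual fraud-detection method. The class names, the 20 km/h "moving" threshold and the 5% review threshold are hypothetical assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Closure:
    segment_id: str
    start: datetime
    end: datetime


@dataclass(frozen=True)
class ProbePoint:
    provider: str
    segment_id: str
    timestamp: datetime
    speed_kph: float


def suspicious_share_by_provider(points, closures, moving_kph=20.0):
    """Share of each provider's points showing moving traffic on a closed segment."""
    windows = defaultdict(list)
    for c in closures:
        windows[c.segment_id].append((c.start, c.end))

    total = defaultdict(int)
    suspicious = defaultdict(int)
    for p in points:
        total[p.provider] += 1
        closed = any(s <= p.timestamp <= e for s, e in windows.get(p.segment_id, ()))
        if closed and p.speed_kph >= moving_kph:
            suspicious[p.provider] += 1
    return {provider: suspicious[provider] / n for provider, n in total.items()}


def providers_to_review(points, closures, max_share=0.05):
    """Providers whose suspicious share exceeds max_share (hypothetical threshold)."""
    shares = suspicious_share_by_provider(points, closures)
    return sorted(p for p, share in shares.items() if share > max_share)
```

A real system would need to tolerate legitimate exceptions, such as emergency vehicles or imprecise closure boundaries, which is why this sketch only nominates providers for review rather than delisting them outright.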

 

Data governance

When identifying the utility of data coming from a specific solution, it is essential that users understand the governance of that data from the point it was created until the insights and outputs are delivered. This can be done with qualitative information without revealing intellectual property or trade secrets.

When the industry faced a data disruption, solutions providers had a choice: obtain additional data sources or outsource processing and insights to a third party. Many solutions providers pivoted business models from obtaining raw data, applying their intellectual property, and delivering their own insights to visualising third-party insights.

While this approach may help to quickly go to market with a wide array of features, it can be a significant problem for the solutions providers’ end users. Without the vertical integration of raw data processing, delivery of insights, and development of user interfaces, end users are locked into a product even if a specific feature is missing or the product doesn’t work for them. For example, a solutions provider that only built a visualisation tool doesn’t have direct control of its supplier’s roadmap and cannot be fully responsive to improvements or corrections.

By contrast, Inrix brought on numerous additional data sources to fill gaps and delivered a wide range of new algorithms that allowed data from existing providers to be used across more products. In the end, we maintained complete control over our products and roadmaps.

 

Navigating the path forward

In this evolving landscape, organisations need to carefully evaluate their data sources to ensure they align with their privacy and data utility goals. When assessing potential data providers, consider the following key questions and factors:

1.    What is the source of the data (i.e. passenger vehicle, freight, local delivery or mobile devices)? 
2.    How accurate and frequent is the GPS signal? 
3.    What processes are in place to filter out non-transportation-related signals and identify potential biases?
4.    How does the provider handle data fraud and ensure the integrity of its data?
5.    What is the provider's approach to privacy and anonymisation, and how does it balance privacy against data utility?
6.    How much control does the provider have over the processing of raw data to deliver and visualise insights?

The power of data lies in its ability to transform lives and solve complex problems. As privacy concerns take centre stage, companies that continue to innovate and develop privacy-preserving solutions that benefit our communities will drive the industry forward.

The end goal should be a connected device and probe data market that ensures the development of technologies that protect the privacy of individuals while enabling critical mobility and safety solutions.

ABOUT THE AUTHOR
Ahmed Darrat is Inrix chief product officer


 
