Call Detail Records (CDRs) are designed for billing purposes
A Call Detail Record (CDR) is generated by the mobile network operator (MNO) when a subscriber initiates or receives a network event, such as a call, an SMS message or the use of mobile data. You can read more about how CDRs are generated here. The MNO stores CDR data for billing purposes, so the features of the data are those that the MNO needs to calculate how much to bill a subscriber.
CDR data do not contain any information about the content of the network event. They do not, for example, include the contents of an SMS message or record which mobile applications are using mobile data.
Key features of CDR data allow us to study human mobility
While CDR data are not initially intended for studying human population mobility, density or characteristics, the combination of temporal and spatial features allows us to understand the locations and movements of subscribers.
By using the subscriber’s identifier to link CDRs from the same individual, we can construct an individual trajectory: a series of network event times, each paired with the approximate location of the subscriber at that time. The uniqueness of individual trajectories, combined with the regularity of people’s movements, means that individual trajectories can act as a ‘fingerprint’ and may be used to identify individuals even when the dataset does not contain any directly identifying information, such as name or date of birth. As a result, CDR data are considered personal data.
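The linking step described above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the record layout (subscriber ID, ISO 8601 timestamp, tower ID), the sample values and the function name `build_trajectories` are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical CDR rows: (pseudonymised subscriber ID, event time, routing tower ID).
cdrs = [
    ("sub_a", "2024-03-01T18:45:00", "tower_2"),
    ("sub_b", "2024-03-01T09:10:00", "tower_3"),
    ("sub_a", "2024-03-01T08:05:00", "tower_1"),
    ("sub_a", "2024-03-01T12:30:00", "tower_2"),
]

def build_trajectories(records):
    """Group records by subscriber ID, then order each group by event time."""
    trajectories = defaultdict(list)
    for subscriber_id, timestamp, tower_id in records:
        point = (datetime.fromisoformat(timestamp), tower_id)
        trajectories[subscriber_id].append(point)
    for points in trajectories.values():
        points.sort(key=lambda point: point[0])
    return dict(trajectories)

trajectories = build_trajectories(cdrs)
for subscriber_id, points in trajectories.items():
    print(subscriber_id, [(t.isoformat(), tower) for t, tower in points])
```

Each trajectory here is only as detailed as the subscriber's network activity: a location is recorded only at the moments a network event occurs.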
To protect the privacy of individual subscribers, individual trajectories are aggregated
This aggregation may involve, for example, summing the total number of subscribers residing in each administrative unit each day or the total number of subscribers travelling between two locations, or averaging the travelled distance of subscribers visiting a location. The resulting aggregates can help us understand the mobility, density and characteristics of populations as a whole in situations where this information is not otherwise available.
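Two of the aggregates mentioned above, subscriber counts per administrative unit per day and counts of subscribers moving between units, might be computed along the following lines. This is a simplified sketch: the input layout, the sample rows and the function names are hypothetical, a subscriber is counted in every unit where they were observed that day (rather than assigned a single place of residence), and a "move" is taken to be any two consecutive observations in different units.

```python
from collections import Counter

# Hypothetical input: (subscriber ID, date, administrative unit),
# already ordered by time within each subscriber.
observations = [
    ("sub_a", "2024-03-01", "region_1"),
    ("sub_a", "2024-03-01", "region_2"),
    ("sub_b", "2024-03-01", "region_2"),
    ("sub_c", "2024-03-01", "region_2"),
    ("sub_c", "2024-03-02", "region_1"),
]

def subscribers_per_unit_per_day(rows):
    """Count distinct subscribers observed in each (date, unit) pair."""
    distinct = {(date, unit, subscriber) for subscriber, date, unit in rows}
    return Counter((date, unit) for date, unit, _ in distinct)

def flows_between_units(rows):
    """Count distinct subscribers seen in one unit and next in a different one."""
    last_unit = {}
    movers = set()
    for subscriber, _date, unit in rows:
        previous = last_unit.get(subscriber)
        if previous is not None and previous != unit:
            movers.add((previous, unit, subscriber))
        last_unit[subscriber] = unit
    return Counter((origin, destination) for origin, destination, _ in movers)

daily_counts = subscribers_per_unit_per_day(observations)
flows = flows_between_units(observations)
print(daily_counts)
print(flows)
```

The outputs contain only counts keyed by date and location; no subscriber identifier survives the aggregation.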
There are three types of features in a CDR dataset which are essential for producing individual trajectories: an individual identifier, the time of the network event (the temporal dimension) and the location of the routing tower (the spatial dimension).
When first generated, CDR data contain a range of subscriber identifiers:
- MSISDN, the phone number from which the network event originated,
- IMSI, the ID of the SIM from which the network event originated,
- IMEI, the ID of the device (typically a mobile handset) from which the network event originated.
The MSISDN (the phone number) is most commonly used as the subscriber ID.
Each of these fields is pseudonymised to remove any information which could be used directly to identify the subscriber.
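The text does not say how the pseudonymisation is carried out. One common approach, shown here purely as an assumption, is a keyed hash (HMAC-SHA256) that maps each identifier to a stable token, so records from the same subscriber can still be linked without exposing the original value. The key, the sample numbers and the function name `pseudonymise` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be held securely by the operator.
SECRET_KEY = b"replace-with-a-secret-held-by-the-operator"

def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable token for an identifier using HMAC-SHA256."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()

# Fictional phone numbers used only for illustration.
token_first = pseudonymise("+15555550100")
token_repeat = pseudonymise("+15555550100")
token_other = pseudonymise("+15555550101")

print(token_first == token_repeat)  # same input gives the same token
print(token_first == token_other)   # different inputs give different tokens
```

Because the same input always yields the same token, linking records into trajectories remains possible, which is why pseudonymised CDR data are still treated as personal data.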
Time of network event
This field records the time and date of the network event.
Routing cell tower
CDR data also contain the ID of the cell tower that the network event was routed through. CDR data generated by the MNO do not contain the coordinates of the cell tower, but the cell tower ID can be used to look up the coordinates in a separate dataset that holds this information.
CDR data also include a range of other features which are not essential to generating individual trajectories or mobility aggregates, but may be used in other forms of analysis. These include:
Type of network event
CDR data always include the type of network event which generated the record: for example, a call, an SMS message or the use of mobile data. Depending on the context, we might only use data from certain types of network events, usually calls.
Identifier for the other party
For calls and SMS messages, the CDR data may also contain an identifier for the other party involved in the network event. This is most commonly their MSISDN.
For calls, CDR data also include the duration of the call.
For calls, SMS messages and mobile data, the CDR data may also record how much the network event cost.
Similar data may also be available for mobile money transactions (payments to or from the subscriber using their mobile device) or payments to the MNO for credit on the subscriber’s account, known as ‘top ups’.
Cell tower data
An MNO may also hold other data which can support the analysis of CDR data, the most important of which is data related to the cell towers in the network.
In particular, using CDR data to analyse population distribution and mobility requires the geographic coordinates of the cell towers, along with the tower IDs that correspond to the routing tower feature in the CDRs.
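The lookup from routing tower ID to coordinates can be illustrated with a small sketch. The tower IDs, coordinates, record layout and the function name `attach_coordinates` are hypothetical; a real cell tower dataset is likely to have more fields and may need cleaning before it can be joined.

```python
# Hypothetical cell tower dataset: tower ID -> (latitude, longitude).
tower_locations = {
    "tower_1": (5.6037, -0.1870),
    "tower_2": (6.6885, -1.6244),
}

# Hypothetical CDR rows: (subscriber ID, event time, routing tower ID).
events = [
    ("sub_a", "2024-03-01T08:05:00", "tower_1"),
    ("sub_a", "2024-03-01T12:30:00", "tower_2"),
    ("sub_b", "2024-03-01T09:10:00", "tower_9"),  # ID missing from the tower dataset
]

def attach_coordinates(rows, towers):
    """Attach tower coordinates to each event; collect events with unknown towers."""
    located, unmatched = [], []
    for subscriber_id, timestamp, tower_id in rows:
        if tower_id in towers:
            latitude, longitude = towers[tower_id]
            located.append((subscriber_id, timestamp, latitude, longitude))
        else:
            unmatched.append((subscriber_id, timestamp, tower_id))
    return located, unmatched

located, unmatched = attach_coordinates(events, tower_locations)
print(located)
print(unmatched)
```

Keeping the unmatched events separate, rather than silently dropping them, makes it easier to spot gaps in the cell tower dataset.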