OpenLI Tutorial 01: Introduction to LI - OpenLI-NZ/openli GitHub Wiki

OpenLI Tutorial Chapter 01 -- Introduction to LI

You may either watch the tutorial lesson on YouTube by clicking on the image above, or you can download the slides and read them alongside the transcript provided below.

Lesson Transcript

Each horizontal rule indicates when to move on to the next slide in the original presentation content.

Hello everyone and welcome to this series of training lectures about the OpenLI lawful intercept software.

My name is Shane Alcock and I am the lead author and maintainer of the OpenLI software. I’ll be guiding you through this training, which is designed to teach you everything you need to know to be ready to deploy OpenLI on your network.

Before we begin, please note that we welcome any feedback, suggestions or questions about this training. You can reach us at [email protected], or you can reach out to me directly if you prefer.

The training is split into multiple short chapters; this first chapter will introduce you to the topic of lawful intercept. Here, we’ll be looking at lawful interception from a fairly general perspective, so this chapter will be useful to anyone who is new to the topic of lawful intercept and is wondering just what it is all about. We’ll touch briefly on OpenLI, but the main aim in this chapter is to simply familiarise you with how LI generally works and the terminology that is used to describe the different parts of an LI system. We’ll delve into the more specific details in later chapters, but feel free to skip ahead if you feel like you already have this background knowledge.

So, what is Lawful Intercept (or LI as it is commonly known)? It is a tool that governments provide to law enforcement agencies (or LEAs) that allows those agencies to perform a legal pre-authorised interception of telecommunications. The interception target must be a person (or persons) resident within the jurisdiction of that agency. Interceptions may be undertaken as part of an investigation into, or the prevention of, illegal activities. Agencies that typically have these powers include the police, intelligence services, and national security agencies but other agencies or ministries may also have these powers depending on the law in a given country.

In most cases, however, these agencies do not have the ability to action an interception on their own; they instead rely on the network operators who run the communications infrastructure to capture the target’s communications and forward those on to the requesting agency. The operator is legally obliged to assist LEAs with their interceptions and must ensure that their network is ready and capable of performing intercepts on demand.

A key thing to remember here is that lawful interception must target a specific user (or set of users) that the agency has good reason to believe is involved in a criminal activity. We are not talking about dragnet-style mass surveillance. In most cases, an intercept can only take place if the intercept request is accompanied by a lawfully issued warrant. However, the authorisation powers for a warrant can vary quite a lot between countries; many countries will require judicial or magisterial oversight, but in others a warrant may be approved by a chief prosecutor or a sufficiently high-ranked police officer. In a few rare cases, warrants may be approved by government ministers.

It therefore pays to be aware of the exact circumstances surrounding the issue of interception warrants in your jurisdiction. This will allow you to know exactly when you have to comply with an interception order, as well as when you might be within your rights to refuse to carry one out. Be aware though, the laws that give interception powers to the agencies will include severe penalties for operators that fail to give assistance when required. This also means that it is especially important for operators to have some LI infrastructure in place and tested ahead of time, as you really don’t want to be the one operator that the agencies decide to make an example of.

So let’s walk through how this all works in practice (at a high level). An interception begins with an LEA identifying a target, getting their warrant and then issuing the warrant to you (the network operator). The specifics of how the warrant is issued to an operator will vary from country to country; it may involve a physical piece of signed paper or it may be a digital artefact.

Using the details provided in that warrant, and their own understanding of how the target identity would map to a customer account in their system, the operator will then configure their LI system to commence intercepting the communications traffic for the intercept target. For now, we’re simply going to think of the LI system as a magic box: it could be an OpenLI deployment, it could be a solution from another vendor, or it might be something you hacked up in-house. For now, it doesn’t really matter too much.

To be able to perform the intercept, the LI system will need to be able to see the traffic that traverses your network so it can recognise which packets belong to the intercept target and create copies of them for the LEA. It will also need to see any session management protocols that are used in your network to establish and maintain the communication sessions for each customer, including the protocols you use for AAA (such as RADIUS or GTP) as well as SIP for intercepting VOIP calls. These protocols allow the LI system to know which IP address (and in the case of VOIP, which ports) is being used by the intercept target, and therefore the system will be able to filter the IP traffic to only include packets sent or received by the target.
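
To make the filtering step described above a little more concrete, here is a minimal Python sketch. It is an illustration only, not how OpenLI or any other LI system is actually implemented: the `SessionEvent` and `Packet` types, the usernames and the addresses are all invented for the example, and real RADIUS or GTP parsing is replaced by events that are assumed to be already parsed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionEvent:
    """Hypothetical, pre-parsed AAA event (e.g. derived from RADIUS accounting)."""
    username: str
    ip: str
    kind: str  # "start" or "stop"


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified IP packet: addresses and payload only."""
    src: str
    dst: str
    payload: bytes


class TargetFilter:
    """Tracks which IP addresses currently belong to intercept targets."""

    def __init__(self, targets):
        self.targets = set(targets)  # identities named in an intercept order
        self.active_ips = {}         # ip -> username, kept for targets only

    def on_session_event(self, ev):
        # Sessions for non-targets are ignored entirely.
        if ev.username not in self.targets:
            return
        if ev.kind == "start":
            self.active_ips[ev.ip] = ev.username
        elif ev.kind == "stop":
            self.active_ips.pop(ev.ip, None)

    def matches(self, pkt):
        """True if the packet was sent or received by an active target."""
        return pkt.src in self.active_ips or pkt.dst in self.active_ips


flt = TargetFilter(targets={"alice"})
flt.on_session_event(SessionEvent("alice", "192.0.2.10", "start"))
flt.on_session_event(SessionEvent("bob", "192.0.2.11", "start"))

packets = [
    Packet("192.0.2.10", "198.51.100.1", b"from target"),
    Packet("192.0.2.11", "198.51.100.1", b"from someone else"),
    Packet("198.51.100.1", "192.0.2.10", b"to target"),
]
intercepted = [p for p in packets if flt.matches(p)]
print(len(intercepted))  # 2
```

The point of the sketch is that the packet filter never sees a username; it only knows about IP addresses, which is why the session management traffic has to be visible to the LI system as well.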

From these packets, the LI system will generate the intercept records to be sent back to the requesting LEA. These records will include any communications that were sent either to or from the interception target, but may also include supporting meta-data that adds additional context to those communications. The meta-data is usually derived from the session management protocols that are fed into the LI system and is often used to corroborate or validate the evidence provided by the intercepted communications. For instance, the meta-data for IP intercepts may include when the user logged in to the network, what IP address they were assigned by the ISP, and where the user logged in from. In the case of intercepted VOIP calls, the meta-data would also describe who the other party to the call was, who initiated the call, and how the media for the call was encoded. All of this information is useful supporting detail when presenting an interception as evidence in court.

Let’s take a little bit of a peek inside the magic LI box. If you’ve spent any time reading any official documents about lawful intercept, you may have come across a diagram that looks somewhat like this one. As you can see, there are a lot of acronyms and technical-sounding terms on the diagram -- I’ll explain what those mean shortly.

The LI system itself is represented by the magenta-coloured box and the components shown in that box demonstrate the specific functionalities that are required for an operator to have a working LI deployment. The blue box on the right shows the entities within the LEA domain that the operator’s LI system will need to interact with. The arrows between the two boxes represent the communication channels between the two systems, while the arrows within the operator domain represent the passage of instructions or captured records between components within the LI system.

Looks complicated, right? Don’t worry -- we’re now going to break this down into its individual parts and try to explain each in turn. Don’t hesitate to flick back to this diagram as much as you need to while we go through this -- it looks complicated at first glance, but it is really just a matter of figuring out what the acronyms and the jargon actually mean.

We’re going to start with the two most prevalent acronyms in the LI terminology: the CC and the IRI. CC stands for “Content of Communication” (or “Communications Content”, depending on who you ask). A CC is a packet sent by or to an intercept target that has been intercepted by the LI system. For an IP intercept, the CCs are any IP packets that have the target’s IP address as source or destination. For a VOIP intercept, the CCs would be any intercepted RTP or RTCP packets that belong to an intercepted call.

Intercept Related Information (abbreviated as IRI) refers to the meta-data records that we described earlier which must accompany the intercepted CCs. As mentioned earlier, the IRIs are derived from packets observed for session management protocols, such as RADIUS or SIP, and provide additional supporting information about the communication that may not be readily available in the CCs. IRIs can be used to corroborate that the CCs are indeed part of a communication made by the intercept target.

You can think of CCs and IRIs as the two types of output produced by an LI system. The CCs are the actual communications that were performed by the targets while the intercept was ongoing and the IRIs are supporting meta-data for that intercept.
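
If it helps to see the two output types side by side, the following Python sketch models them as simple record types. The field names, the `intercept-0001` label and the sample values are assumptions made for illustration; the real record layouts are defined by the LI standards and are considerably more detailed.

```python
from dataclasses import dataclass, field


@dataclass
class CC:
    """Content of Communication: a copy of one intercepted packet."""
    intercept_id: str
    packet: bytes


@dataclass
class IRI:
    """Intercept Related Information: meta-data about the target's session."""
    intercept_id: str
    event: str  # hypothetical event name, e.g. "session-start"
    details: dict = field(default_factory=dict)


# A toy intercept: meta-data about the session, plus the packets themselves.
records = [
    IRI("intercept-0001", "session-start", {"assigned_ip": "192.0.2.10"}),
    CC("intercept-0001", b"first intercepted packet"),
    CC("intercept-0001", b"second intercepted packet"),
    IRI("intercept-0001", "session-end"),
]

ccs = [r for r in records if isinstance(r, CC)]
iris = [r for r in records if isinstance(r, IRI)]
print(len(ccs), len(iris))  # 2 2
```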

Next, let’s take a look at the mediation functions, of which we have two in our diagram: one for the CCs and one for the IRIs. Mediation function is really just a fancy way of saying “takes output and sends it to the intended destination”.

When the LI system produces a CC from an intercepted packet, the CC is passed to the CC Mediation function, which knows which LEA is supposed to receive that CC. The mediation function then forwards that CC to the destination LEA over an established communications channel (called HI3 in this case) using a well-defined protocol format. And that’s really all there is to it -- in many ways, you can simply think of the mediation functions as glorified routing entities.

Take intercept output, forward to appropriate LEA. Simple!
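
The “glorified routing entity” idea can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Mediator` class, the intercept label and the LEA name are invented, and a dictionary of lists stands in for the real network connections. CCs go out over HI3 as described above; IRIs use a separate handover, HI2, which is introduced later in this chapter.

```python
from collections import defaultdict


class Mediator:
    """Toy mediation function: looks up the LEA for a record and forwards it."""

    # Which handover carries which type of record.
    HANDOVERS = {"IRI": "HI2", "CC": "HI3"}

    def __init__(self):
        self.routes = {}                    # intercept_id -> LEA name
        self.delivered = defaultdict(list)  # (LEA, handover) -> records sent

    def add_route(self, intercept_id, lea):
        self.routes[intercept_id] = lea

    def forward(self, intercept_id, record_type, record):
        handover = self.HANDOVERS[record_type]  # KeyError for unknown types
        lea = self.routes[intercept_id]         # KeyError if no LEA is known
        # A real mediator would encode the record and write it to a socket.
        self.delivered[(lea, handover)].append(record)
        return lea, handover


med = Mediator()
med.add_route("intercept-0001", "lea-police")
print(med.forward("intercept-0001", "CC", b"packet copy"))
print(med.forward("intercept-0001", "IRI", b"session-start"))
```

Note that the mediator never decides what to intercept; it only needs to know where the output for each intercept is supposed to go.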

The Law Enforcement Monitoring Facility (or LEMF) is another convoluted term. It really just refers to the place where the LEA wants you to send your CCs and IRIs to. It will likely be the entrypoint into another magic box that the LEA paid a lot of money for which takes the intercept records that your LI system provides and reconstructs them into usable evidence (such as audio files containing a telephone call that took place, a reconstruction of a website that a target was browsing, or a chat log, for example).

The CCs and IRIs are generated by the Internal Intercept Function (or IIF). The IIF is some software that runs either on or alongside your core networking equipment, observing the packets that traverse your network and identifying any that should be intercepted to produce either CCs or IRIs. Usually the IIF will need to be able to track session state for each network user, so that it can readily associate any observed IP packets back to the user that they belong to and check if that user is an intercept target. The session state tracking is also required to correctly generate IRIs for the target session.

The OpenLI collector software is an example of an IIF that runs alongside your networking equipment -- you install the OpenLI collector on a separate server and then mirror interceptable traffic (and session management protocols) from your routers into the OpenLI server. This has the advantage that much of the performance impact of interception is moved away from your core routers, aside from the cost of mirroring traffic. The downside, of course, is that you have to dedicate ports on your routers for traffic mirroring and that mirroring needs to be actively configured and managed.

I’ll touch more on intercept functions that run directly on your networking equipment a bit later on.

The IIFs and mediation functions are controlled by the Administration Function (or AF). This function will receive intercept orders from the LEAs and turn those into instructions that can be pushed out to the other components within the operator domain. For IIFs, these instructions will include the identity of the intercept target and any labels that need to be added to any records captured as part of that intercept. For a mediation function, the instructions will tell the mediator which LEA needs to receive the results of the interception.

For many LI deployments, the administration function may include some non-automated elements such as verifying the validity of the received interception order, or updating the mirroring configuration on routers to ensure that all target traffic can be seen by an IIF.
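
As a rough sketch of what the administration function does with an order, the Python below turns one order into one instruction for the IIFs and one for the mediator. All of the type names and fields here are hypothetical, and the `validated_by_human` flag is simply a stand-in for the manual verification step mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterceptOrder:
    intercept_id: str  # label to attach to every record for this intercept
    target: str        # target identity as known to the operator
    lea: str           # agency that must receive the output


@dataclass(frozen=True)
class IIFInstruction:
    intercept_id: str
    target: str


@dataclass(frozen=True)
class MediatorInstruction:
    intercept_id: str
    lea: str


def provision(order, validated_by_human):
    """Split an order into the instructions each component needs."""
    if not validated_by_human:
        raise PermissionError("order has not been verified by the operator")
    return (
        IIFInstruction(order.intercept_id, order.target),
        MediatorInstruction(order.intercept_id, order.lea),
    )


order = InterceptOrder("intercept-0001", "alice", "lea-police")
iif_instr, med_instr = provision(order, validated_by_human=True)
print(iif_instr)
print(med_instr)
```

The IIF is told who to intercept and how to label the results, but not who asked for them; the mediator is told where the results go, but not who the target is.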

The last concept I want to introduce here is the handover, which is simply a communication channel between the operator and the LEA.

There are three handovers that are explicitly defined for an LI system, which are imaginatively named Handover Interface 1, Handover Interface 2 and Handover Interface 3. Handover Interface 1 is the method by which an operator will receive an interception order from the LEA. How HI1 works in practice will be determined by the LEAs that you have to deal with -- this could be a physical handover of a paper warrant or an email with a digital copy of the warrant attached. Usually, someone employed by the operator will examine the interception order and confirm that it is valid before it can be entered into the Administration function; the process cannot be fully automated.

Handover Interface 2 is used to deliver the intercepted IRI meta-data records to the LEA and Handover Interface 3 is used to deliver the intercepted CC records to the LEA. The formatting and delivery mechanism used by these two handovers will depend on the LI standards that are supported by the LEAs, but will almost certainly require some level of encryption to be applied to anything transmitted via these handovers.

There are two widely recognised sets of standards that govern the process of lawful interception and the method of delivering the intercepted content to the LEAs. These are the CALEA standards, which are used in the United States, and the ETSI standards, which are used almost everywhere else. Russia also has its own separate standard, known as SORM.

The important thing to know about all of these standards is that they require the operator to go above and beyond the approach of simply tcpdumping a customer connection and sending that pcap file on to the LEA afterwards. The standards are designed to ensure that the interception contains sufficient information to withstand scrutiny in court; the associated meta-data that would not be present in a pcap assists with this, but the standards also require every intercepted record to be accurately labelled and sequenced so that there can be no doubt that the full communication has been intercepted.
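
To see why labelling and sequencing matter, consider this small Python sketch. It is not the record format from any LI standard; the dictionary fields and the `RecordLabeller` class are assumptions for the example. It only shows the principle that a receiver can prove whether anything is missing when every record carries an intercept label and a sequence number.

```python
from itertools import count


class RecordLabeller:
    """Attaches an intercept label and an increasing sequence number."""

    def __init__(self, intercept_id):
        self.intercept_id = intercept_id
        self._seq = count(0)

    def label(self, payload):
        return {
            "intercept_id": self.intercept_id,
            "seq": next(self._seq),
            "payload": payload,
        }


def missing_sequence_numbers(records):
    """Sequence numbers absent between the first and last record received."""
    seen = sorted(r["seq"] for r in records)
    if not seen:
        return []
    expected = set(range(seen[0], seen[-1] + 1))
    return sorted(expected - set(seen))


labeller = RecordLabeller("intercept-0001")
sent = [labeller.label(p) for p in (b"a", b"b", b"c", b"d", b"e")]

# Simulate one record being lost on the way to the LEA.
received = [r for r in sent if r["seq"] != 2]
print(missing_sequence_numbers(received))  # [2]
```

A plain pcap file carries no such label or counter, so a gap in the capture cannot be distinguished from a quiet period on the network.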

Another obvious benefit of having clearly defined standards is that the developers of LI systems can know the expected format of any output produced by their system, and both operators and LEAs can purchase LI equipment under the expectation that if it is compliant with the expected standard then there should be no interoperability problems with the handovers.

The last point I want to make about these standards is that they all specify that operators must be able to deliver the intercepted content to the LEAs in real time. With a pcap-based system, an operator must complete the tcpdump capture before the file can be uploaded to the LEA. In situations where the LEA investigation is time-sensitive, such as an ongoing kidnapping or in the immediate aftermath of a terrorism incident, the pcap approach is not particularly helpful. Therefore, real-time delivery is an essential requirement for LEAs and this is reflected in the modern LI standards. Never expect an LEA to be happy if you offer to send them a pcap in response to an interception order.

While having clearly defined, detailed standards is a huge benefit from the perspective of the LEAs, it can be a bit problematic for the operators. The extra details and requirements mean that the implementation of those standards becomes quite a complex task. The likelihood that an operator would have the resources available to be able to develop a fully compliant solution in-house is extremely low, so most are forced to look at buying an existing third-party system to meet their LI obligations.

Of course, choosing an LI system is a challenge in itself (especially when you are not an expert in the field) and there are a number of factors that come into play when trying to make this decision. Budget is obviously a big one -- many of the commercial-grade LI solutions are very expensive and may simply be unaffordable for smaller operators. Other things that you may need to consider are: the level of support that you are expecting (and willing to pay for) from your LI vendor; which LI standards are supported by the system being considered -- some LI systems may only support the ETSI standards, some may only support CALEA; and whether you are confident in the security capabilities and trustworthiness of the makers of the system. The last thing you want is a security vulnerability that allows unauthorised parties to be able to utilise your intercept infrastructure to spy on your customers.

Note that these factors also equally apply if you are an LEA looking to purchase a system for receiving and processing intercepted communications.

Back to thinking about where to acquire an LI system from; the first and most obvious option is to buy a commercial solution from one of a number of companies that are specialist lawful interception vendors. You can expect the system that you buy to be standards-compliant, well supported and complete, including functions for mediation and administration (well probably; no doubt there are dud vendors out there, as with anything, so stick with the well-known ones if you’re not sure).

The downside is that this will all come at a fairly eye-watering cost, both in terms of initial installation costs and ongoing support fees. If you are a large carrier with the money to spend and expect to be required to perform enough interceptions to make it worthwhile, then a commercial grade solution is definitely a good option to consider.

Another option that may come up when researching LI systems is an LI license that you can purchase for your existing networking hardware. Cisco, Juniper, Nokia, and undoubtedly others offer these licenses to enable your equipment to “perform LI”, which no doubt sounds great to the unwary operator. The prospect of being able to simply purchase an additional license on equipment that you already have deployed and have your LI problems solved is very tantalizing. It’s also very handy to have an IIF that runs directly on your networking hardware, rather than having to mirror traffic to a separate device.

But the LI capabilities offered by these licenses are often missing key features that you will still need to provide to have a fully-compliant LI system. For instance: the output format produced by these systems is a custom protocol specific to that hardware vendor which does not conform to the LI standards that we discussed earlier; the interception functions often do not produce IRI records, just CCs; and the mediation function is missing and will need to be provided by the purchaser.

So really, the LI features on your routers are only a partial solution at best. However, because they are labelled as “Lawful Interception” in the sales brochures, operators have in the past fallen into the trap of buying these licenses only to find that they still have a lot more work to do to satisfy the LEAs that their interception system is functional.

Having said that, if you can afford them then the LI capabilities on these devices are actually very useful for mirroring the traffic for specific intercept targets into an IIF that does produce suitable CCs and IRIs. Your IIF would need to be able to parse and strip the custom protocols used by your vendor to encapsulate the intercepted packets, however, so this approach only works in certain circumstances.

The final option that we’re going to talk about today (and presumably the one that brought you here in the first place) is OpenLI. OpenLI is open-source software that is designed to provide an ETSI-compliant, low-cost LI solution to smaller network operators that cannot hope to afford the commercial alternatives. To run OpenLI, you don’t need much in the way of fancy expensive hardware -- a small number of off-the-shelf 1U servers with a DPDK-capable card will meet the requirements easily. We’ve had good success with getting OpenLI running in production on a number of networks in New Zealand, and are now in the process of trying to take it to the rest of the world. The source code is all freely available on GitHub under a GPLv3 license and we also publish packages for several popular Linux distributions.

One neat thing about OpenLI, which was a regular request from our initial users, is that it is able to understand some of those custom network vendor LI formats that I just mentioned and convert them into CCs and IRIs that conform to the ETSI standards. So if you already have access to those LI capabilities on your routers, then they may be able to interface directly with OpenLI.

That concludes this chapter of our OpenLI training series. Hopefully by now, you have a very good idea of what Lawful Intercept is and how an LI system works in practice. You should be a lot more familiar with the various terminology that is used to describe the different parts of an LI system and how those components interact to satisfy an interception order from an LEA. You should also have a passing knowledge of the technical standards for LI and be aware of some of the different options available for adding LI capabilities to your network.

Join me in the next chapter where we’re going to delve a bit more deeply into the ETSI standards to make sure that we properly understand what is required for an LI system to be fully compliant with these standards. See you then!