Why do we need secure VOIP? - SIP over TLS and SRTP
Author: Robert Abela
Date: 23-02-2008
Sooner or later, VOIP will become a target for malicious users, who are always looking for new sources of confidential data. By capturing VOIP traffic, an attacker can listen to calls and gather other information about them, as explained later in this article. Telephone conversations with finance companies and banks, in which security details and PIN codes are exchanged, can be a gold mine for such users.
A SIP based VOIP call
A SIP-based VOIP call uses two protocols: SIP and RTP.
SIP is used as the control protocol. When a user starts a call, or when a call is transferred, clear text SIP messages are sent between VOIP entities such as VOIP phones, a PBX or a VOIP provider. By capturing these SIP messages, anyone can gather information about the call, as the sample SIP INVITE message below shows:
INVITE sip:callee@192.168.1.54 SIP/2.0
Call-ID: 1661063781111548084@192.168.1.55
Via: SIP/2.0/UDP 192.168.1.55:5060;branch=z9hG4bKB1C334582014F6D31DE9
Via: SIP/2.0/UDP sip.voipproducts.org
From: sip:caller@voipproducts.org;tag=1738655730
To:sip:callee@192.168.1.54
CSeq:1 INVITE
contact:sip:192.168.1.55:5060
Accept:application/sdp
User-Agent: Nero SIPPS IP Phone Version 2.0.51.16
Max-Forwards: 70
Content-Length: 0
In the INVITE above we can see who is calling whom: the callee (INVITE sip:callee@192.168.1.54), the caller (From: sip:caller@voipproducts.org), the VOIP client in use (User-Agent: Nero SIPPS IP Phone) and other call control details.
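To show how little effort this takes, here is a minimal Python sketch that pulls those fields out of a captured clear-text request. The function name summarize_sip_request and the abridged copy of the INVITE are illustrative assumptions, and the parsing is deliberately naive (no folded headers, no compact header forms).

```python
# Hypothetical helper: summarise what a captured clear-text SIP request reveals.
# CAPTURED_INVITE is an abridged copy of the sample message in this article.
CAPTURED_INVITE = (
    "INVITE sip:callee@192.168.1.54 SIP/2.0\r\n"
    "Call-ID: 1661063781111548084@192.168.1.55\r\n"
    "Via: SIP/2.0/UDP 192.168.1.55:5060;branch=z9hG4bKB1C334582014F6D31DE9\r\n"
    "From: sip:caller@voipproducts.org;tag=1738655730\r\n"
    "To: sip:callee@192.168.1.54\r\n"
    "CSeq: 1 INVITE\r\n"
    "User-Agent: Nero SIPPS IP Phone Version 2.0.51.16\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def summarize_sip_request(raw: str) -> dict:
    """Return the method, callee, caller and client named in a SIP request."""
    head = raw.split("\r\n\r\n", 1)[0]          # drop any message body
    lines = head.split("\r\n")
    method, request_uri, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # SIP header names are case-insensitive; keep the first occurrence.
        headers.setdefault(name.strip().lower(), value.strip())
    return {
        "method": method,
        "callee": request_uri,
        "caller": headers.get("from", "").split(";", 1)[0],  # strip ;tag=...
        "user_agent": headers.get("user-agent", ""),
        "call_id": headers.get("call-id", ""),
    }


if __name__ == "__main__":
    for key, value in summarize_sip_request(CAPTURED_INVITE).items():
        print(f"{key}: {value}")
```

Nothing here needs a key or a password: the information is simply read off the wire.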
RTP is the protocol that carries the media content: audio, video, or both. RTP media is sent from one VOIP entity to another using a codec that both entities agreed on during call negotiation, which takes place over SIP. Anyone who captures this data can easily decode and replay the conversation using tools freely available on the internet, such as Wireshark.
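The reason this is easy is that an unencrypted RTP packet announces its own contents. The sketch below, in Python, reads the 12-byte fixed RTP header and looks up the codec from the payload type field. The function name parse_rtp_header and the sample packet are assumptions made for illustration; the payload type numbers are a few of the static assignments from the RTP audio/video profile (RFC 3551), and header extensions are ignored for brevity.

```python
import struct

# A few static payload types from the RTP audio/video profile (RFC 3551).
STATIC_PAYLOAD_TYPES = {
    0: "PCMU (G.711 u-law)",
    3: "GSM",
    8: "PCMA (G.711 A-law)",
    9: "G722",
    18: "G729",
}


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed RTP header of a captured, unencrypted packet."""
    if len(packet) < 12:
        raise ValueError("an RTP packet has at least a 12-byte fixed header")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = first & 0x0F            # number of 4-byte CSRC entries
    payload_type = second & 0x7F         # low 7 bits; the top bit is the marker
    header_len = 12 + 4 * csrc_count     # assumption: no header extension
    return {
        "version": first >> 6,
        "payload_type": payload_type,
        "codec": STATIC_PAYLOAD_TYPES.get(payload_type, "dynamic or unknown"),
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[header_len:],  # raw codec frames, ready to decode
    }


if __name__ == "__main__":
    # Made-up packet: version 2, payload type 0, 160 bytes of fake audio.
    sample = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x1234ABCD) + b"\xff" * 160
    header = parse_rtp_header(sample)
    print(header["codec"], len(header["payload"]), "payload bytes")
```

Once the codec is known, the payload bytes can be passed straight to a decoder, which is essentially what a packet analyser does when it replays a call.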
Securing a VOIP call
As seen above, SIP is sent as clear text and RTP streams are sent between SIP entities using standard codecs. One of the most basic security features in VOIP, which helps protect the content and details of your call, is encryption. SIP can be tunneled over TLS to encrypt the call control channel, and SRTP can be used to encrypt the voice or video being transmitted over the network.
SIP over TLS - encrypting the control channel of a SIP based VOIP call
TLS provides a secure communication channel. The VOIP client sets up a TLS connection with the server; during the handshake the two sides agree on shared secret keys, which are then used to encrypt the SIP messages they exchange. Using TLS makes it very difficult for an eavesdropper to view, manipulate, or replay those messages: captured SIP over TLS traffic is encrypted and very hard to decrypt. However, SIP over TLS protects only the SIP call control channel; it does not encrypt the voice / video data being sent.
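As a rough illustration of the client side, the Python sketch below opens a TLS connection with the standard ssl module and sends a SIP OPTIONS request through it. Port 5061 is the port registered for SIP over TLS; everything else (the function names, the client.invalid host, the tags and Call-ID) is a placeholder assumption, and a real SIP client would also handle registration, authentication and response parsing.

```python
import socket
import ssl

SIPS_PORT = 5061  # port registered for SIP over TLS


def build_options_request(host: str, local: str = "client.invalid") -> bytes:
    """Build a minimal SIP OPTIONS request; all identifiers are placeholders."""
    lines = [
        f"OPTIONS sips:{host} SIP/2.0",
        f"Via: SIP/2.0/TLS {local};branch=z9hG4bKexample1",
        "Max-Forwards: 70",
        f"From: <sips:probe@{local}>;tag=example",
        f"To: <sips:{host}>",
        f"Call-ID: example-call-id@{local}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",
        "",  # the two empty items produce the blank line that ends the headers
    ]
    return "\r\n".join(lines).encode("ascii")


def make_tls_context() -> ssl.SSLContext:
    """Client context that verifies the server certificate and hostname."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def send_over_tls(host: str, port: int = SIPS_PORT, timeout: float = 5.0) -> bytes:
    """Send the request inside a TLS session and return the first reply bytes."""
    context = make_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # The handshake happens here; SIP is only sent once it has succeeded.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(build_options_request(host))
            return tls_sock.recv(4096)
```

The SIP text is the same as it would be over UDP or TCP, apart from the TLS transport in the Via header and the sips: scheme. The difference is that someone capturing the traffic sees only TLS records, not the headers.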
SRTP - encrypting the voice and video stream of a VOIP call
To encrypt the voice / video channel, SRTP must be used. SRTP encrypts the RTP stream sent between two VOIP entities during a conversation, irrespective of the codec being used. Using SRTP together with SRTCP protects both the audio / video stream and the RTCP control traffic that accompanies the session. A user can opt to use only one of them.
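The Python sketch below is a toy model of what an SRTP packet looks like on the wire: the RTP header stays readable, the payload is encrypted, and an authentication tag is appended. It is not an SRTP implementation. Real SRTP (RFC 3711) derives session keys from a master key and encrypts with AES; here a SHA-256 based keystream stands in for the cipher because the Python standard library has no AES, and the names toy_protect and toy_unprotect are invented for this example. The 80-bit HMAC-SHA1 tag and the inclusion of the rollover counter (ROC) in the authenticated data follow the default transform described in the RFC.

```python
import hashlib
import hmac
import struct

RTP_HEADER_LEN = 12   # assumption: no CSRC entries, no header extension
AUTH_TAG_LEN = 10     # 80-bit tag, as in SRTP's default HMAC-SHA1 transform


def _toy_keystream(key: bytes, ssrc: int, index: int, length: int) -> bytes:
    """Stand-in keystream (SHA-256 in counter mode). Real SRTP uses AES."""
    out = b""
    counter = 0
    while len(out) < length:
        block = key + struct.pack("!IQI", ssrc, index, counter)
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]


def _xor_payload(header: bytes, body: bytes, enc_key: bytes, roc: int) -> bytes:
    """XOR the body with a keystream tied to this packet's SSRC and index."""
    sequence, _timestamp, ssrc = struct.unpack("!HII", header[2:12])
    index = (roc << 16) | sequence   # packet index = ROC and sequence number
    keystream = _toy_keystream(enc_key, ssrc, index, len(body))
    return bytes(b ^ k for b, k in zip(body, keystream))


def _tag(header: bytes, encrypted: bytes, auth_key: bytes, roc: int) -> bytes:
    """Truncated HMAC-SHA1 over header, encrypted payload and ROC."""
    data = header + encrypted + struct.pack("!I", roc)
    return hmac.new(auth_key, data, hashlib.sha1).digest()[:AUTH_TAG_LEN]


def toy_protect(packet: bytes, enc_key: bytes, auth_key: bytes, roc: int = 0) -> bytes:
    """Encrypt the payload, leave the header in the clear, append a tag."""
    header, payload = packet[:RTP_HEADER_LEN], packet[RTP_HEADER_LEN:]
    encrypted = _xor_payload(header, payload, enc_key, roc)
    return header + encrypted + _tag(header, encrypted, auth_key, roc)


def toy_unprotect(packet: bytes, enc_key: bytes, auth_key: bytes, roc: int = 0) -> bytes:
    """Check the tag first, then decrypt; raise ValueError on tampering."""
    header = packet[:RTP_HEADER_LEN]
    encrypted = packet[RTP_HEADER_LEN:-AUTH_TAG_LEN]
    received_tag = packet[-AUTH_TAG_LEN:]
    if not hmac.compare_digest(received_tag, _tag(header, encrypted, auth_key, roc)):
        raise ValueError("authentication tag mismatch")
    return header + _xor_payload(header, encrypted, enc_key, roc)
```

Because only the payload is transformed, the codec does not matter: whatever bytes the codec produced are encrypted as they are, and the receiver gets the same bytes back after checking the tag.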
SIP over TLS and SRTP Support
We have put together a table listing some of the most commonly used SIP devices and software, and whether they support encryption. The table can be found at this page. The information in the table is based on what we were able to collect on the internet. Please note that none of these features were tested on our end. If any information is incorrect, or you would like us to list hardware or software that is not in the table, kindly contact me at abela dot robert at gmail dot com.