Understanding VoIP protocols

VoIP phones, Voice over Wi-Fi handsets and PC-based "soft phones" send H.323 or SIP messages over private or public IP networks to register themselves and initiate calls. Once a call is set up, analog voice is digitized, encoded, compressed and carried in Real-Time Transport Protocol (RTP)/User Datagram Protocol (UDP)/IP packets routed between the calling and called parties. Most VoIP products employ one of the following two standard signaling protocols to accomplish this:

  1. The Session Initiation Protocol (SIP) is a plaintext signaling protocol defined in the IETF's RFC 3261 (http://www.ietf.org/rfc/rfc3261). It uses HTTP-like messages to create, modify and terminate multimedia sessions. To place a call, a user agent (e.g., phone) sends a SIP INVITE request to a proxy server, which forwards it to the called party, mapping the session request to the target user agent's IP address(es). Setup progress responses (Trying, Ringing, OK) are relayed back to the caller through the proxy. After the caller sends an "ACK" to acknowledge the final response, the two user agents can exchange RTP packets carrying voice or multimedia. The session ends when either user agent sends a "BYE" message. To receive incoming calls, SIP user agents can REGISTER with a server.
  2. H.323 is a family of standards, developed by the International Telecommunication Union (ITU), designed to support multimedia service delivery over packet-based networks. Like SIP, H.323 uses signaling messages to create RTP sessions, which then transport voice between the calling and called parties. Unlike SIP, H.323 employs several discrete ASN.1-encoded (Abstract Syntax Notation One) protocols for registration, admission and status (RAS), signaling (Q.931) and control (H.245). Here's how the process works:
    • Every H.323 endpoint (e.g., phone) sends H.225 RAS messages over UDP to discover and register with an H.323 zone gatekeeper (GK). Thereafter, endpoints use RAS to request GK permission to place or disengage calls.
    • Once the GK confirms call admission, Q.931 Setup, Call Proceeding, Alerting and Connect signaling messages are sent over TCP connections between the caller and its GK, the called endpoint and its GK, and the two zone GKs.
    • Once the call is established, multimedia content is exchanged directly between endpoints over RTP. In addition, H.245 call control messages sent over TCP let endpoints discuss their capabilities and manage logical channels.
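
To make the signaling-then-media split concrete, here is a minimal Python sketch, not a working phone. It builds a bare SIP request as text, classifies a SIP start line as request or response, and packs the 12-byte fixed RTP header described in RFC 3550. The function names, the example.com addresses, and the demo branch and tag values are illustrative assumptions; a real user agent would generate unique branch, tag and Call-ID values, add a Contact header, and carry an SDP body describing the media.

```python
import struct

CRLF = "\r\n"  # SIP, like HTTP, terminates each header line with CRLF


def build_sip_request(method, target_uri, from_uri, call_id, cseq,
                      via_host="client.example.com"):
    """Build a bare-bones SIP request as text (no Contact header, no SDP body).

    The branch and tag values are fixed placeholders for illustration only.
    """
    lines = [
        f"{method} {target_uri} SIP/2.0",
        f"Via: SIP/2.0/UDP {via_host};branch=z9hG4bK-demo",
        "Max-Forwards: 70",
        f"From: <{from_uri}>;tag=demo-tag",
        f"To: <{target_uri}>",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} {method}",
        "Content-Length: 0",
    ]
    # A blank line separates the headers from the (empty) body.
    return CRLF.join(lines) + CRLF + CRLF


def parse_start_line(message):
    """Classify the first line of a SIP message as a request or a response."""
    first = message.split(CRLF, 1)[0]
    parts = first.split(" ", 2)
    if parts[0] == "SIP/2.0":
        # Responses look like: SIP/2.0 180 Ringing
        return {"kind": "response", "status": int(parts[1]), "reason": parts[2]}
    # Requests look like: INVITE sip:bob@example.com SIP/2.0
    return {"kind": "request", "method": parts[0], "uri": parts[1]}


def pack_rtp_header(seq, timestamp, ssrc, payload_type=0, marker=False):
    """Pack the 12-byte fixed RTP header (version 2, no padding/extension/CSRCs).

    Payload type 0 is PCMU (G.711 mu-law) in the standard audio/video profile.
    """
    byte0 = 2 << 6  # version bits = 2; P, X and CC all zero
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


if __name__ == "__main__":
    invite = build_sip_request("INVITE", "sip:bob@example.com",
                               "sip:alice@example.com",
                               "abc123@client.example.com", 1)
    print(invite)
    print(parse_start_line("SIP/2.0 180 Ringing" + CRLF + CRLF))
    print(pack_rtp_header(seq=1, timestamp=160, ssrc=0x1234).hex())
```

Because SIP is plaintext, the request can be read and assembled with ordinary string handling. H.323 messages, by contrast, are ASN.1-encoded binary and would need a dedicated codec, which is why no equivalent H.323 snippet appears here.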


Lisa Phifer is vice president of Core Competence Inc., a consulting firm specializing in network security and management technology. Phifer has been involved in the design, implementation, and evaluation of data communications, internetworking, security, and network management products for nearly 20 years. She teaches about wireless LANs and virtual private networking at industry conferences and has written extensively about network infrastructure and security technologies for numerous publications.
This was last published in January 2006
