Network Infrastructure Design for IP Phone
1 Introduction
Let’s start by describing the new type of phone system that businesses are now using to replace traditional phone systems in their offices. An “IP Phone System” (sometimes called an IP PBX) uses “IP (Internet Protocol)” technology to carry the voice conversations in your office. This does not necessarily mean it uses the public Internet. An IP Phone System uses IP technology within the private data network of a business, in a single location or across a private network. The same cabling that a business uses for its data network carries the voice traffic of the phone system. In some ways the two are totally independent and just share the same cabling. In one way they affect each other.
They are independent in that if the data server goes down, the voice will still go through. Your phone system will still work. Likewise, if the phone system goes down, the data will still go through. The way the IP Phone System and data network can affect each other is in the capacity or “bandwidth” of the network, both in the office and going to the outside world. Data is “forgiving,” meaning it is not time sensitive. If it takes several tenths of a second, or even seconds, longer to move your data back and forth, the quality of the data doesn’t suffer. However, voice is time sensitive. It must occur in “real time,” which typically means there can’t be more than 150 milliseconds (0.15 seconds) of delay in moving the voice traffic between its endpoints. If the combined voice and data traffic exceeds the capacity of the network infrastructure, the voice quality can suffer. The network infrastructure consists of the cabling and the equipment throughout the network. IP Telephony on a properly designed, private network has the same voice quality as traditional phone systems. To be “properly designed” the network must include a proper “Quality of Service” plan, executed with the proper equipment. (That discussion is too much to include in this article. Give us a call.)
You can use IP Telephony over your private data network to connect remote sites with multiple workers or remote workers in home offices. If you don’t have a private network between sites you can use the public Internet to access remote sites.
2 Helpful
• Seamless extension dialing between all our locations.
• IP Telephony creates lower cost and greater functionality advantages from carrier services.
• Easily and economically connecting home based workers.
• Enhanced contact center (call center) responsiveness to customer needs. (“Contact center” is the replacement term for what used to be “call center.”)
• Disaster recovery and power outage backup for business continuity.
• Simplified system administration: Through a GUI (graphical user interface), you can make changes to your system that previously required your telephone equipment vendor. Therefore, you can significantly reduce your maintenance costs.
• Easier moves of telephone sets: Moving from one location in your building to another previously required re-programming the telephone switch and physically changing some wires in the “telephone closet.” With IP Telephony, as you pack up your desk supplies and plants, you also grab your telephone. In your new location, you simply plug the telephone into the Ethernet connection in the wall and then connect your computer to a jack in the phone that acts as a bypass for your data. All your personal settings move with you. Costs for moves are dramatically reduced.
• Software upgrades are much easier: They can be performed by you instead of paying the telephone equipment vendor to do them.
There are many more benefits to IP Telephony, but this brief overview should be enough to pique your interest and encourage you to continue your investigation. You don’t need to make a total swap out of your current phone system. It is possible to gradually introduce an IP Telephone System into your organization and interface it to legacy systems.
3 Voice over IP Overview
There are four new sections about SIP: Introduction to SIP, SIP Messages, SIP Call Flow, and SIP - Session Description Protocol. The other pieces cover digitization of voice, audio codecs, codec latency vs bandwidth optimization, audio jitter, the Real Time Protocol, introduction to H.323, description of H.323 call flow, and H.323 call signalling optimizations.
4 Echo Canceling
5 Gatekeeper Basic Operations
6 Goal of the Project
IP phone or telephony systems are of two types: wired networks (LAN, MAN, WAN) and wireless networks. (Only the wired technology is discussed here.)
7 OSI Model
The Open Systems Interconnection model (OSI model) is a product of the Open Systems Interconnection effort at the International Organization for Standardization. It is a prescription for characterizing and standardizing the functions of a communications system in terms of abstraction layers. Similar communication functions are grouped into logical layers. An instance of a layer provides services to its upper layer instances while receiving services from the layer below. For example, a layer that provides error-free communications across a network provides the path needed by applications above it, while it calls the next lower layer to send and receive packets that make up the contents of that path. Two instances at one layer are connected by a horizontal connection on that layer.
OSI model
7. Application Layer
NNTP · SIP · SSI · DNS · FTP · Gopher · HTTP · NFS · NTP · SMPP · SMTP · SNMP · Telnet · DHCP · Netconf · RTP · SPDY · (more)
6. Presentation Layer
MIME · XDR · TLS · SSL
5. Session Layer
Named Pipes · NetBIOS · SAP · L2TP · PPTP · SOCKS
4. Transport Layer
TCP · UDP · SCTP · DCCP · SPX
3. Network Layer
IP (IPv4, IPv6) · ICMP · IPsec · IGMP · IPX · AppleTalk
2. Data Link Layer
ATM · SDLC · HDLC · ARP · CSLIP · SLIP · GFP · PLIP · IEEE 802.3 · Frame Relay · ITU-T G.hn DLL · PPP · X.25 · Network Switch
1. Physical Layer
EIA/TIA-232 · EIA/TIA-449 · ITU-T V-Series · I.430 · I.431 · POTS · PDH · SONET/SDH · PON · OTN · DSL · IEEE 802.3 · IEEE 802.11 · IEEE 802.15 · IEEE 802.16 · IEEE 1394 · ITU-T G.hn PHY · USB · Bluetooth · Hubs
OSI Model
               Data unit         Layer             Function
Host layers    Data              7. Application    Network process to application
               Data              6. Presentation   Data representation, encryption and decryption, convert machine dependent data to machine independent data
               Data              5. Session        Interhost communication
               Segments          4. Transport      End-to-end connections, reliability and flow control
Media layers   Packet/Datagram   3. Network        Path determination and logical addressing
               Frame             2. Data Link      Physical addressing
               Bit               1. Physical       Media, signal and binary transmission
Description of OSI layers
According to recommendation X.200, there are seven layers, each generically known as an N layer. An N+1 entity requests services from the layer N entity. At each level, two entities (N-entity peers) interact by means of the N protocol by transmitting protocol data units (PDU).
A Service Data Unit (SDU) is a specific unit of data that has been passed down from an OSI layer to a lower layer, and which the lower layer has not yet encapsulated into a protocol data unit (PDU). An SDU is a set of data that is sent by a user of the services of a given layer, and is transmitted semantically unchanged to a peer service user.
The PDU at any given layer, layer N, is the SDU of the layer below, layer N-1. In effect the SDU is the 'payload' of a given PDU. That is, the process of changing an SDU to a PDU consists of an encapsulation process, performed by the lower layer. All the data contained in the SDU becomes encapsulated within the PDU. The layer N-1 adds headers or footers, or both, to the SDU, transforming it into a PDU of layer N-1. The added headers or footers are part of the process used to make it possible to get data from a source to a destination.
Some orthogonal aspects, such as management and security, involve every layer. Security services are not related to a specific layer: they can be provided by a number of layers, as defined by ITU-T X.800 Recommendation.[3] These services are aimed at improving the CIA triad (i.e. confidentiality, integrity, availability) of transmitted data. In practice, the availability of a communication service is determined by network design and/or network management protocols. Appropriate choices for these are needed to protect against denial of service.
Layer 1: Physical Layer
The Physical Layer defines electrical and physical specifications for devices. In particular, it defines the relationship between a device and a transmission medium, such as a copper or optical cable.
This includes the layout of pins, voltages, cable specifications, hubs, repeaters, network adapters, host bus adapters (HBA, used in storage area networks) and more. The major functions and services performed by the Physical Layer are:
Establishment and termination of a connection to a communications medium.

Participation in the process whereby the communication resources are effectively shared among multiple users. For example, contention resolution and flow control.

Modulation, or conversion between the representation of digital data in user equipment and the corresponding signals transmitted over a communications channel. These are signals operating over the physical cabling (such as copper and optical fiber) or over a radio link.
Parallel SCSI buses operate in this layer, although it must be remembered that the logical SCSI protocol is a Transport Layer protocol that runs over this bus. Various Physical Layer Ethernet standards are also in this layer; Ethernet incorporates both this layer and the Data Link Layer. The same applies to other local-area networks, such as token ring, FDDI, ITU-T G.hn and IEEE 802.11, as well as personal area networks such as Bluetooth and IEEE 802.15.4.
Layer 2: Data Link Layer
The Data Link Layer provides the functional and procedural means to transfer data between network entities and to detect and possibly correct errors that may occur in the Physical Layer. Originally, this layer was intended for point-to-point and point-to-multipoint media, characteristic of wide area media in the telephone system. Local area network architecture, which included broadcast-capable multiaccess media, was developed independently of the ISO work in IEEE Project 802. IEEE work assumed sublayering and management functions not required for WAN use.
In modern practice, only error detection, not flow control using sliding window, is present in data link protocols such as Point-to-Point Protocol (PPP). On local area networks, the IEEE 802.2 LLC layer is not used for most protocols on the Ethernet, and on other local area networks its flow control and acknowledgment mechanisms are rarely used. Sliding window flow control and acknowledgment is used at the Transport Layer by protocols such as TCP, but is still used in niches where X.25 offers performance advantages.
The ITU-T G.hn standard, which provides high-speed local area networking over existing wires (power lines, phone lines and coaxial cables), includes a complete Data Link Layer which provides both error correction and flow control by means of a selective repeat Sliding Window Protocol.
Both WAN and LAN services arrange bits, from the Physical Layer, into logical sequences called frames. Not all Physical Layer bits necessarily go into frames, as some of these bits are purely intended for Physical Layer functions. For example, every fifth bit of the FDDI bit stream is not used by the Data Link Layer.
WAN protocol architecture
Connection-oriented WAN data link protocols, in addition to framing, detect and may correct errors. They are also capable of controlling the rate of transmission. A WAN Data Link Layer might implement a sliding window flow control and acknowledgment mechanism to provide reliable delivery of frames; that is the case for Synchronous Data Link Control (SDLC) and HDLC, and derivatives of HDLC such as LAPB and LAPD.
IEEE 802 LAN architecture
Practical, connectionless LANs began with the pre-IEEE Ethernet specification, which is the ancestor of IEEE 802.3. This layer manages the interaction of devices with a shared medium, which is the function of a Media Access Control (MAC) sublayer. Above this MAC sublayer is the media-independent IEEE 802.2 Logical Link Control (LLC) sublayer, which deals with addressing and multiplexing on multiaccess media.
While IEEE 802.3 is the dominant wired LAN protocol and IEEE 802.11 the wireless LAN protocol, obsolescent MAC layers include Token Ring and FDDI. The MAC sublayer detects but does not correct errors.
Layer 3: Network Layer
The Network Layer provides the functional and procedural means of transferring variable length data sequences from a source host on one network to a destination host on a different network, while maintaining the quality of service requested by the Transport Layer (in contrast to the data link layer which connects hosts within the same network). The Network Layer performs network routing functions, and might also perform fragmentation and reassembly, and report delivery errors. Routers operate at this layer, sending data throughout the extended network and making the Internet possible. This is a logical addressing scheme: values are chosen by the network engineer. The addressing scheme is not hierarchical.
The Network Layer may be divided into three sublayers:
1. Subnetwork Access: considers protocols that deal with the interface to networks, such as X.25;
2. Subnetwork Dependent Convergence: used when it is necessary to bring the level of a transit network up to the level of networks on either side;
3. Subnetwork Independent Convergence: handles transfer across multiple networks.
An example of this latter case is CLNP, or IPv7 ISO 8473. It manages the connectionless transfer of data one hop at a time, from end system to ingress router, router to router, and from egress router to destination end system. It is not responsible for reliable delivery to a next hop, but only for the detection of erroneous packets so they may be discarded. In this scheme, IPv4 and IPv6 would have to be classed with X.25 as subnet access protocols because they carry interface addresses rather than node addresses.
A number of layer management protocols, a function defined in the Management Annex, ISO 7498/4, belong to the Network Layer. These include routing protocols, multicast group management, Network Layer information and error, and Network Layer address assignment. It is the function of the payload that makes these belong to the Network Layer, not the protocol that carries them.
Layer 4: Transport Layer
The Transport Layer provides transparent transfer of data between end users, providing reliable data transfer services to the upper layers.
The Transport Layer controls the reliability of a given link through flow control, segmentation/desegmentation, and error control. Some protocols are state- and connection-oriented. This means that the Transport Layer can keep track of the segments and retransmit those that fail. The Transport Layer also provides the acknowledgement of the successful data transmission and sends the next data if no errors occurred.
OSI defines five classes of connection-mode transport protocols ranging from class 0 (which is also known as TP0 and provides the least features) to class 4 (TP4, designed for less reliable networks, similar to the Internet). Class 0 contains no error recovery, and was designed for use on network layers that provide error-free connections. Class 4 is closest to TCP, although TCP contains functions, such as the graceful close, which OSI assigns to the Session Layer. Also, all OSI TP connection-mode protocol classes provide expedited data and preservation of record boundaries. Detailed characteristics of TP0-4 classes are shown in the following table:[4]
Feature Name                                                                TP0  TP1  TP2  TP3  TP4
Connection oriented network                                                 Yes  Yes  Yes  Yes  Yes
Connectionless network                                                      No   No   No   No   Yes
Concatenation and separation                                                No   Yes  Yes  Yes  Yes
Segmentation and reassembly                                                 Yes  Yes  Yes  Yes  Yes
Error Recovery                                                              No   Yes  Yes  Yes  Yes
Reinitiate connection (if an excessive number of PDUs are unacknowledged)   No   Yes  No   Yes  No
Multiplexing and demultiplexing over a single virtual circuit               No   No   Yes  Yes  Yes
Explicit flow control                                                       No   No   Yes  Yes  Yes
Retransmission on timeout                                                   No   No   No   No   Yes
Reliable Transport Service                                                  No   Yes  No   Yes  Yes
Perhaps an easy way to visualize the Transport Layer is to compare it with a Post Office, which deals with the dispatch and classification of mail and parcels sent. Do remember, however, that a post office manages the outer envelope of mail. Higher layers may have the equivalent of double envelopes, such as cryptographic presentation services that can be read by the addressee only. Roughly speaking, tunneling protocols operate at the Transport Layer, for example when carrying non-IP protocols such as IBM's SNA or Novell's IPX over an IP network, or when providing end-to-end encryption with IPsec. While Generic Routing Encapsulation (GRE) might seem to be a Network Layer protocol, if the encapsulation of the payload takes place only at the endpoint, GRE becomes closer to a transport protocol that uses IP headers but contains complete frames or packets to deliver to an endpoint. L2TP carries PPP frames inside transport packets.
Although not developed under the OSI Reference Model and not strictly conforming to the OSI definition of the Transport Layer, the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP) of the Internet Protocol Suite are commonly categorized as Layer 4 protocols within OSI.
Layer 5: Session Layer
The Session Layer controls the dialogues (connections) between computers. It establishes, manages and terminates the connections between the local and remote application. It provides for full-duplex, half-duplex, or simplex operation, and establishes checkpointing, adjournment, termination, and restart procedures. The OSI model made this layer responsible for graceful close of sessions, which is a property of the Transmission Control Protocol, and also for session checkpointing and recovery, which is not usually used in the Internet Protocol Suite. The Session Layer is commonly implemented explicitly in application environments that use remote procedure calls.
Layer 6: Presentation Layer
The Presentation Layer establishes context between Application Layer entities, in which the higher-layer entities may use different syntax and semantics if the presentation service provides a mapping between them. If a mapping is available, presentation service data units are encapsulated into session protocol data units, and passed down the stack. This layer provides independence from data representation (e.g., encryption) by translating between application and network formats. The presentation layer transforms data into the form that the application accepts. This layer formats and encrypts data to be sent across a network. It is sometimes called the syntax layer.[5]
The original presentation structure used the basic encoding rules of Abstract Syntax Notation One (ASN.1), with capabilities such as converting an EBCDIC-coded text file to an ASCII-coded file, or serialization of objects and other data structures from and to XML.
Layer 7: Application Layer
The Application Layer is the OSI layer closest to the end user, which means that both the OSI application layer and the user interact directly with the software application. This layer interacts with software applications that implement a communicating component. Such application programs fall outside the scope of the OSI model. Application layer functions typically include identifying communication partners, determining resource availability, and synchronizing communication. When identifying communication partners, the application layer determines the identity and availability of communication partners for an application with data to transmit. When determining resource availability, the application layer must decide whether sufficient network resources for the requested communication exist. In synchronizing communication, all communication between applications requires cooperation that is managed by the application layer. Some examples of application layer implementations include:
On OSI stack:
FTAM File Transfer and Access Management Protocol
X.400 Mail
Common management information protocol (CMIP)
On TCP/IP stack:
Hypertext Transfer Protocol (HTTP)
File Transfer Protocol (FTP)
Simple Mail Transfer Protocol (SMTP)
Simple Network Management Protocol (SNMP)
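The SDU/PDU relationship described in the OSI section above can be illustrated with a small sketch. It is a toy model only: the three layer names and the bracketed header strings are assumptions made up for illustration and do not correspond to real protocol headers.

```python
# Toy model of OSI-style encapsulation: each layer treats what it receives
# from the layer above as an opaque SDU and prepends its own header to
# form its PDU. Header contents are hypothetical placeholders.

LAYERS = ["transport", "network", "data_link"]  # ordered top to bottom


def encapsulate(payload: bytes) -> bytes:
    """Pass application data down the stack, adding one header per layer."""
    pdu = payload
    for layer in LAYERS:
        sdu = pdu                                # SDU handed down from above
        header = f"[{layer}]".encode("ascii")    # hypothetical header
        pdu = header + sdu                       # PDU of this layer
    return pdu


def decapsulate(frame: bytes) -> bytes:
    """Walk back up the stack, stripping the headers in reverse order."""
    pdu = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode("ascii")
        if not pdu.startswith(header):
            raise ValueError(f"missing {layer} header")
        pdu = pdu[len(header):]
    return pdu


frame = encapsulate(b"hello")
print(frame)               # b'[data_link][network][transport]hello'
print(decapsulate(frame))  # b'hello'
```

The outermost header belongs to the lowest layer, which is why the receiver removes headers in the opposite order from the sender.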
8 Call Features
ADSI On-Screen Menu System
Alarm Receiver
Append Message
Authentication
Automated Attendant
Blacklists
Blind Transfer
Call Detail Records
Call Forward on Busy
Call Forward on No Answer
Call Forward Variable
Call Monitoring
Call Parking
Call Queuing
Call Recording
Call Retrieval
Call Routing (DID & ANI)
Call Snooping
Call Transfer
Call Waiting
Caller ID
Caller ID Blocking
Caller ID on Call Waiting
Calling Cards
Conference Bridging
Database Store / Retrieve
Database Integration
Dial by Name
Direct Inward System Access
Distinctive Ring
Distributed Universal Number Discovery (DUNDi™)
Do Not Disturb
E911
ENUM
Fax Transmit and Receive
Flexible Extension Logic
Interactive Directory Listing
Interactive Voice Response (IVR)
Local and Remote Call Agents
Macros
Music On Hold
Music On Transfer:
- Flexible Mp3-based System
- Random or Linear Play
- Volume Control
Predictive Dialer
Privacy
Open Settlement Protocol (OSP)
Overhead Paging
Protocol Conversion
Remote Call Pickup
Remote Office Support
Roaming Extensions
Route by Caller ID
SMS Messaging
Spell / Say
Streaming Media Access
Supervised Transfer
Talk Detection
Text-to-Speech (via Festival)
Three-way Calling
Time and Date
Transcoding
Trunking
VoIP Gateways
Voicemail:
- Visual Indicator for Message Waiting
- Stutter Dialtone for Message Waiting
- Voicemail to email
- Voicemail Groups
- Web Voicemail Interface
Zapateller
Computer-Telephony Integration
AGI (Asterisk Gateway Interface)
Graphical Call Manager
Outbound Call Spooling
Predictive Dialer
TCP/IP Management Interface
Scalability
TDMoE (Time Division Multiplex over Ethernet):
- Allows direct connection of Asterisk PBX
- Zero latency
- Uses commodity Ethernet hardware
Voice-over IP:
- Allows for integration of physically separate installations
- Uses commonly deployed data connections
- Allows a unified dialplan across multiple offices
Speech
Cepstral TTS
Lumenvox ASR
Vestec ASR
Codecs
ADPCM
G.711 (A-Law & μ-Law)
G.719 (pass through)
G.722
G.722.1 licensed from Polycom®
G.722.1 Annex C licensed from Polycom®
G.723.1 (pass through)
G.726
G.729a
GSM
iLBC
Linear
LPC-10
Speex
Traditional Telephony Protocols
E&M
E&M Wink
Feature Group D
FXS
FXO
GR-303
Loopstart
Groundstart
Kewlstart
MF and DTMF support
Robbed-bit Signaling (RBS) Types
MFC-R2 (Not supported. However, a patch is available)
ISDN Protocols
AT&T 4ESS
EuroISDN PRI and BRI
Lucent 5ESS
National ISDN 1
National ISDN 2
NFAS
Nortel DMS100
Q.SIG
VoIP Protocols
Google Talk
H.323
IAX™ (Inter-Asterisk eXchange)
Jingle/XMPP
MGCP (Media Gateway Control Protocol)
SCCP (Cisco® Skinny®)
SIP (Session Initiation Protocol)
Skype
UNIStim
9 VoIP Basics: Converting Voice to Digital Form
Are you interested in Voice over IP? Would you like to know more about its background? This text begins a series that should shed some light on it. Let's start at the beginning. VoIP sends digitized voice across computer networks. So how do we convert voice to digital form? When converting an analog signal (be it speech or another noise), you need to consider two important factors: sampling and quantization. Together, they determine the quality of the digitized sound.
• Sampling is about the sampling rate, i.e. how many samples per second you use to encode the sound.
• Quantization is about how many bits you use to represent each sample. The number of bits determines the number of different values you can represent with each sample.
Figures 1 and 2 show the idea of sampling: Figure 1 is the original analog signal, while Figure 2 shows the digitized form as a sequence of discrete samples.
Figure 1: Analog signal
Figure 2: Digitized signal
10 Quantization
As mentioned above, quantization is about how many bits you use to represent individual sound samples. In practice, we want to work with whole bytes, so let's consider 8 or 16 bits. With 8-bit samples, each sample can represent 256 different values, so we can work with whole numbers between -128 and +127. Because we are limited to whole numbers, it is inevitable that we introduce some noise into the signal as we convert it to digital samples. For example, if the exact analog value is "7.44125", we will represent it as "7". As we do this with each sample in the sequence, we slightly distort the signal; in other words, we inject noise. It turns out that 8-bit samples do not give good quality. With only 256 sample values, the analog-to-digital conversion adds too much noise. The situation improves a lot if we switch to 16-bit samples, as 16 bits give us 65536 different representations (from -32768 to +32767). 16-bit samples are what you will find on a CD and what VoIP codecs use as their input.
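The effect of sample size on noise can be estimated with a short experiment. The sketch below is only illustrative: it assumes a full-scale sine test tone (441.7 Hz sampled at 8 kHz, both arbitrary choices) and rounding to the nearest whole number, and the function names are made up for this example.

```python
import math


def quantize(value: float, bits: int) -> int:
    """Round to the nearest whole number and clip to the signed range."""
    lowest, highest = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return max(lowest, min(highest, round(value)))


def snr_db(bits: int, count: int = 8000) -> float:
    """Signal-to-noise ratio (dB) of a full-scale sine after quantization."""
    amplitude = 2 ** (bits - 1) - 1
    signal_power = 0.0
    noise_power = 0.0
    for k in range(count):
        # Assumed test tone: 441.7 Hz sampled at 8000 Hz.
        exact = amplitude * math.sin(2 * math.pi * 441.7 * k / 8000)
        error = exact - quantize(exact, bits)    # quantization noise
        signal_power += exact ** 2
        noise_power += error ** 2
    return 10 * math.log10(signal_power / noise_power)


print(quantize(7.44125, 8))   # 7, as in the example in the text
print(f"8-bit SNR:  {snr_db(8):.1f} dB")
print(f"16-bit SNR: {snr_db(16):.1f} dB")
```

Each extra bit adds roughly 6 dB of signal-to-noise ratio, so moving from 8 to 16 bits improves it by close to 48 dB for this kind of signal.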
11 Sampling
Now that we have decided what sample size to use (16 bits), let's look at sampling rates. The table below shows three frequently used sampling rates:
Type                Transmitted Bandwidth   Sampling Frequency
Telephone Speech    300-3400 Hz             8 kHz
Wide Band Speech    50-7000 Hz              16 kHz
CD quality audio    20-20000 Hz             44.1 kHz
With VoIP, you will most frequently encounter the sampling rate of 8 kilohertz. The frequency of 16 kHz is used now and then in situations where higher quality audio is required (with proportionally higher Internet bandwidth consumption). The choice of sampling frequencies for the individual types of audio is not random. There is a rule (based on the work of Nyquist and Shannon) that the sampling frequency needs to be at least twice the highest frequency in the transmitted band. Figures 3 and 4 show why this is required.
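The numbers in the table can be checked against the sampling rule, and the cost of a higher sampling rate can be made concrete. The sketch below assumes 16-bit samples on a single (mono) channel and computes the raw, uncompressed bit rate before any codec is applied; the function names are illustrative only.

```python
# (type, highest transmitted frequency in Hz, sampling frequency in Hz),
# taken from the table above.
AUDIO_TYPES = [
    ("Telephone Speech", 3400, 8000),
    ("Wide Band Speech", 7000, 16000),
    ("CD quality audio", 20000, 44100),
]
BITS_PER_SAMPLE = 16  # sample size chosen in the Quantization section


def satisfies_sampling_rule(max_freq_hz: int, sampling_hz: int) -> bool:
    """True if the sampling rate is at least twice the highest frequency."""
    return sampling_hz >= 2 * max_freq_hz


def raw_bitrate(sampling_hz: int, bits: int = BITS_PER_SAMPLE) -> int:
    """Uncompressed bit rate in bits per second for one audio channel."""
    return sampling_hz * bits


for name, max_freq, rate in AUDIO_TYPES:
    ok = satisfies_sampling_rule(max_freq, rate)
    kbps = raw_bitrate(rate) / 1000
    print(f"{name}: rule satisfied={ok}, raw rate={kbps:.1f} kbit/s")
```

Doubling the sampling rate from 8 kHz to 16 kHz doubles the raw bit rate from 128 to 256 kbit/s, which is the proportional bandwidth increase mentioned above.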
Figure 3
In Figure 3, the sinusoid represents the original analog sound. The large black dots are where we read our samples. Note that we take two samples in each period, i.e. the sampling rate is two times the frequency of the sound. This is the absolute minimum that will allow us to reconstruct a signal that is still comprehensible. It certainly won't be hi-fi sound, but it will have the correct frequency (see the thin black lines in the picture). Figure 4 shows a situation where we take fewer than two samples per period. The thin black lines show what would happen after we feed the samples into a digital-to-analog converter: we would hear something different from the original, a sound with a lower frequency. This problem is known as "aliasing" since the lower frequency appears to be an "alias" of the original correct one.
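Aliasing can be reproduced numerically. In the sketch below, a 5 kHz tone and a 3 kHz tone are both sampled at 8 kHz; these frequencies are chosen for illustration and do not come from the figures. Because 5 kHz is above half the sampling rate, its samples are indistinguishable from those of the 3 kHz tone.

```python
import math


def sample_tone(freq_hz: float, rate_hz: float, count: int) -> list:
    """Sample a cosine of the given frequency at the given rate."""
    return [math.cos(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


def alias_frequency(freq_hz: float, rate_hz: float) -> float:
    """Frequency that a sampled tone appears to have after reconstruction."""
    folded = freq_hz % rate_hz
    return min(folded, rate_hz - folded)


too_high = sample_tone(5000, 8000, 16)   # fewer than two samples per period
alias = sample_tone(3000, 8000, 16)      # the lower "alias" frequency

largest_difference = max(abs(a - b) for a, b in zip(too_high, alias))
print(alias_frequency(5000, 8000))   # 3000
print(largest_difference < 1e-9)     # True: the sample sequences coincide
```

A tone below half the sampling rate, such as 1 kHz at 8 kHz sampling, is returned unchanged by `alias_frequency`, which matches the case shown in Figure 3.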
12 VoIP Protocols: Introducing SIP
The Session Initiation Protocol (SIP for short) is a Voice over IP protocol designed by the Internet Engineering Task Force. SIP was created by the MMUSIC group of the IETF (MMUSIC stands for Multiparty Multimedia Session Control). Formally, the protocol is intended for creating, modifying and terminating sessions with one or more participants. The sessions are mainly VoIP telephone calls or conferences. SIP was first published as an RFC in 1999 (RFC2543), with the two main authors being Mark Handley and Henning Schulzrinne. The specification was revised in 2002 by RFC3261, which obsoletes RFC2543 while keeping the protocol version at SIP/2.0, and naturally there were many subsequent updates and extensions (RFC3265, RFC3853, RFC4320, RFC4916, RFC5393, RFC5621, RFC5626, RFC5630).
13 SIP Characteristics
Unlike H.323, SIP is a text-based protocol. The formatting of SIP requests and responses is based on HTTP version 1.1. Endpoints that communicate using SIP use the following three protocols:
SIP itself, used to establish and terminate the session;
Session Description Protocol (SDP for short, RFC2327, obsoleted by RFC4566), used to exchange information about audio/video channels. Like SIP, SDP is also a product of the IETF's MMUSIC group;
RTP, used to send the real-time streams of audio or video across the network.
SIP messages are exchanged between endpoints in transactions. A transaction consists of a request and the related response or responses. The messages that belong to the same transaction share the same transaction ID. This ID is called CSeq in SIP. Each transaction should have a unique CSeq number, with only a single exception: the ACK message (ACK for "acknowledge") uses the same CSeq number as the transaction which it applies to. SIP can use either UDP or TCP as the underlying transport protocol. Originally (in RFC2543), UDP was the only mandatory option. According to RFC3261 from 2002, all endpoints must be able to send SIP messages over both UDP and TCP.
Still, UDP is the more frequently used option. When communicating over TCP, two modes are possible: either the same TCP channel is used for all transactions of a session or a new TCP connection is established for each individual transaction.
13 The SIP Protocol
The Session Initiation Protocol (SIP) is a protocol for establishing real time communication sessions with one or more participants. It is most frequently used for voice communications, but it can handle video and future applications as well. SIP was designed to be independent of the transport layer, i.e. it can work on UDP, TCP or SCTP. All voice/video communications take place via another protocol, usually RTP. There are many RFCs surrounding SIP, but the most important one is RFC 3261.
SIP is a text based protocol that looks and acts very much like the HTTP protocol. The original designers (Henning Schulzrinne & Mark Handley) wanted to make a protocol that had its roots in the IP world, rather than in the telecoms world. SIP has been an amazing success, being the major driver in the adoption of VoIP and Computer Telephony in recent years. All major manufacturers have adopted the standard, and SIP software, SIP hardware and SIP service providers are widely available. SIP servers are responsible for setting up the calls between SIP devices. SIP servers usually combine several of the SIP server functions, such as SIP proxy and SIP registrar, into one piece of software. 3CX Phone System is a SIP proxy and a SIP registrar, and also a media server that handles the real time voice communications.
14 Registration
Before we describe the flow of a typical SIP call, let's have a look at how SIP user agents register with a SIP registrar. The example below shows a situation where a SIP softphone (namely, the Ekiga client) registers with an Asterisk PBX. The Asterisk's IP address is 10.10.1.99, while the client is at 10.10.1.13 and wants to register the telephone number 13. In order to register, the SIP telephone needs to send the REGISTER request:
SIP registration, phase 1

The registrar server immediately replies with the provisional response "100 Trying". This indicates that the request has been received (so the client does not need to retransmit it) and that it is being processed. While processing the request, the registrar discovers that the user agent needs to authenticate. It therefore responds with "401 Unauthorized". For the user agent, this means that it has to send the REGISTER request once more, this time providing authentication. Let's look at the messages in detail. This is the text of the REGISTER message:

REGISTER sip:10.10.1.99 SIP/2.0
CSeq: 1 REGISTER
Via: SIP/2.0/UDP 10.10.1.13:5060;branch=z9hG4bK78946131-99e1-de11-8845-080027608325;rport
User-Agent: Ekiga/3.2.5
From: <sip:13@10.10.1.99>;tag=d60e6131-99e1-de11-8845-080027608325
Call-ID: e4ec6031-99e1-de11-8845-080027608325@vvt-laptop
To: <sip:13@10.10.1.99>
Contact: <sip:13@10.10.1.13>;q=1
Allow: INVITE,ACK,OPTIONS,BYE,CANCEL,SUBSCRIBE,NOTIFY,REFER,MESSAGE,INFO,PING
Expires: 3600
Content-Length: 0
Max-Forwards: 70

The "100 Trying" response is not shown here. The text of the "401 Unauthorized" message is as follows:

SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP 10.10.1.13:5060;branch=z9hG4bK78946131-99e1-de11-8845-080027608325;received=10.10.1.13;rport=5060
From: <sip:13@10.10.1.99>;tag=d60e6131-99e1-de11-8845-080027608325
To: <sip:13@10.10.1.99>;tag=as5489aead
Call-ID: e4ec6031-99e1-de11-8845-080027608325@vvt-laptop
CSeq: 1 REGISTER
User-Agent: Asterisk PBX
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY
Supported: replaces
WWW-Authenticate: Digest algorithm=MD5, realm="asterisk", nonce="343eb793"
Content-Length: 0

In the "401 Unauthorized" response, the important header is WWW-Authenticate. It instructs the client to authenticate using digest authentication (RFC 2617). The nonce (short for "number used once") parameter is a "challenge string". The client combines the challenge string with the user name, realm and password, together with the request method and URI, and computes an MD5 hash of the result. The server computes its own hash using the same method and compares it with the MD5 hash provided by the client. Digest authentication is the most frequently used method because the password is never sent over the network in plain text. "Basic" authentication has been deprecated in SIP 2.0 because it is insecure (sending a password in plain text is generally a bad idea). Once the client has computed the MD5 digest, it re-sends the REGISTER request.
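To make the digest computation concrete, here is a minimal Python sketch of the RFC 2617 calculation in its simplest form (without the optional qop parameter, which the challenge above does not use). The user name, realm, nonce and URI come from the messages in this section. The password is not given in the article, so "secret" is a placeholder assumption and the result will not match the response value in the captured message. The function names are our own.

```python
import hashlib


def md5_hex(text):
    """Return the lowercase hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Compute an RFC 2617 digest response (no qop parameter)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # identifies the user
    ha2 = md5_hex(f"{method}:{uri}")                 # identifies the request
    return md5_hex(f"{ha1}:{nonce}:{ha2}")           # bound to this challenge


# Placeholder password: the real one is not part of the article.
PASSWORD = "secret"

# The client and the registrar run the same calculation independently.
client_value = digest_response(
    "test13", "asterisk", PASSWORD, "REGISTER", "sip:10.10.1.99", "343eb793"
)
server_value = digest_response(
    "test13", "asterisk", PASSWORD, "REGISTER", "sip:10.10.1.99", "343eb793"
)

print("client response:", client_value)
print("registration accepted:", client_value == server_value)
```

Because the nonce is part of the hashed string, a response captured for one challenge cannot simply be replayed against a later challenge with a different nonce.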
The re-sent REGISTER request looks like this:

REGISTER sip:10.10.1.99 SIP/2.0
CSeq: 2 REGISTER
Via: SIP/2.0/UDP 10.10.1.13:5060;branch=z9hG4bK32366531-99e1-de11-8845-080027608325;rport
User-Agent: Ekiga/3.2.5
Authorization: Digest username="test13", realm="asterisk", nonce="343eb793", uri="sip:10.10.1.99", algorithm=MD5, response="6c13de87f9cde9c44e95edbb68cbdea9"
From: <sip:13@10.10.1.99>;tag=d60e6131-99e1-de11-8845-080027608325
Call-ID: e4ec6031-99e1-de11-8845-080027608325@vvt-laptop
To: <sip:13@10.10.1.99>
Contact: <sip:13@10.10.1.13>;q=1
Allow: INVITE,ACK,OPTIONS,BYE,CANCEL,SUBSCRIBE,NOTIFY,REFER,MESSAGE,INFO,PING
Expires: 3600
Content-Length: 0
Max-Forwards: 70

The registrar server again first responds with "100 Trying" and then compares the two MD5 hashes (the one provided by the client and the one computed by the registrar itself). If they match, the registrar responds with "200 OK" and inserts the endpoint into the location database. The database is usually shared between the registrar and the proxy server so that the proxy can use it when it connects calls. The figure below shows the message exchange:
SIP registration, phase 2

The "200 OK" response contains one important parameter, Expires. It tells the client that the registration will expire after the given number of seconds, after which the client must register again.

Call Flow
Let us now look at a typical SIP call. We will consider a scenario that involves a SIP proxy server. Suppose a user at the SIP telephone with number 121 dials the number 122. The following will happen:

1. The user agent in telephone 121 does not know the IP address of 122, but it knows the IP address of the SIP proxy (suppose this address is 10.10.1.99). The user agent composes an INVITE request and sends it to the proxy. The To: header of the request contains the SIP URI <sip:122@10.10.1.99>. The body of the INVITE request carries an SDP (Session Description Protocol) message providing the parameters (codec, IP address, port) the called party will need to send its RTP stream to the caller. See the previous section for an example of the INVITE request.
2. The SIP proxy immediately responds with "100 Trying" and then forwards the INVITE request to the target telephone. The proxy server adds one Via: header to the message. As mentioned before, the SIP proxy has access to the location database and thus knows the IP addresses of all registered telephones (in the simplest implementation, the registrar server and the proxy are the same application).

Steps 1 and 2 are shown in Figure A below.
Figure A

3. Telephone 122 starts ringing and sends the response "180 Ringing" to the proxy server. The proxy forwards the response to telephone 121.
4. The called user picks up the phone, and the called telephone sends the response "200 OK". The body of the response contains an SDP message so that the caller knows where to send its RTP stream. The proxy server forwards the response to the caller.
5. The caller (telephone 121) confirms receipt of "200 OK" with the ACK message. The proxy server forwards the ACK to telephone 122. At this point, the call has been established and both parties start sending their RTP streams.

Steps 3 through 5 are shown in Figure B.
Figure B

6. When one of the users hangs up, that user's telephone sends the BYE request and the SIP proxy forwards the message to the other party. The other party responds to the BYE request with "200 OK" (again, the proxy server forwards the response to the other side). Both parties stop sending RTP data and the call is over.

The events in step 6 are shown in Figure C.
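The six steps above can also be written down as a simple message trace. The following Python sketch is only an illustration: it lists the signalling messages in the order described (assuming, for step 6, that the caller is the one who hangs up) and derives the caller's view of the call from them. The state names are our own labels, not terms from the SIP specification.

```python
def sip_call_flow(caller="121", callee="122", proxy="proxy"):
    """Return the signalling messages of steps 1-6 as (sender, receiver, message)."""
    return [
        (caller, proxy, "INVITE"),       # step 1: caller asks the proxy
        (proxy, caller, "100 Trying"),   # step 2: proxy acknowledges...
        (proxy, callee, "INVITE"),       # ...and forwards the request
        (callee, proxy, "180 Ringing"),  # step 3: callee starts ringing
        (proxy, caller, "180 Ringing"),
        (callee, proxy, "200 OK"),       # step 4: callee picks up
        (proxy, caller, "200 OK"),
        (caller, proxy, "ACK"),          # step 5: caller confirms
        (proxy, callee, "ACK"),
        (caller, proxy, "BYE"),          # step 6: caller hangs up (assumption)
        (proxy, callee, "BYE"),
        (callee, proxy, "200 OK"),
        (proxy, caller, "200 OK"),
    ]


def caller_states(flow, caller):
    """Follow the trace and record how the call looks from the caller's side."""
    state = "idle"
    states = [state]
    for sender, receiver, message in flow:
        if sender == caller and message == "INVITE":
            state = "calling"
        elif receiver == caller and message == "180 Ringing":
            state = "ringing"
        elif sender == caller and message == "ACK":
            state = "established"   # RTP streams flow from here on
        elif sender == caller and message == "BYE":
            state = "terminating"
        elif receiver == caller and message == "200 OK" and state == "terminating":
            state = "terminated"
        else:
            continue                # message does not change the caller's state
        states.append(state)
    return states


flow = sip_call_flow()
for sender, receiver, message in flow:
    print(f"{sender:>5} -> {receiver:<5} {message}")
print(caller_states(flow, "121"))
```

Note that the RTP media streams themselves do not appear in the trace: they are exchanged between the two telephones once the call is established, using the addresses and ports carried in the SDP bodies.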
Figure C

In order to understand the SIP call flow better, we also need a closer look at the Session Description Protocol, which is described briefly in the FAQ below.

15 IP PBX, SIP & VoIP FAQ
An IP PBX or VoIP phone system replaces a traditional PBX or phone system and gives employees an extension number and the ability to conference, transfer and dial colleagues. All calls are sent as data packets over a data network instead of the traditional phone network. With a VoIP gateway, you can connect existing phone lines to the IP PBX and make and receive phone calls via a regular PSTN line. The IP PBX FAQ answers common questions about VoIP, SIP, IP PBX / VoIP phone system hardware and software, implementation and more.

16 What is SIP forking?
SIP forking refers to the process of "forking" a single SIP call to multiple SIP endpoints. This is a very powerful feature of SIP: a single call can ring many endpoints at the same time. With SIP forking you can have your desk phone ring at the same time as your softphone or a SIP phone on your mobile. For example, you would use SIP forking to ring your desk phone and your Android SIP phone at the same time, so that you can take the call on either device. No forwarding rules are necessary because both devices ring. In the same manner, SIP forking can be used in an office to let a secretary answer calls to a manager's extension when the manager is away or unable to take the call.
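The idea of forking can be simulated in a few lines of Python. The sketch below is not a SIP implementation: it only models several endpoints ringing in parallel and the call going to whichever one answers first. The endpoint names and answer delays are made-up values for illustration.

```python
import asyncio


async def ring(endpoint, answer_after):
    """Simulate one endpoint ringing until its user answers."""
    await asyncio.sleep(answer_after)
    return endpoint


async def fork_call(endpoints):
    """Ring all endpoints at once and return the first one that answers."""
    tasks = [
        asyncio.create_task(ring(name, delay))
        for name, delay in endpoints.items()
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # a real proxy would send CANCEL to the other branches
    return done.pop().result()


# Hypothetical endpoints with the time (in seconds) each takes to answer.
answered_by = asyncio.run(
    fork_call({"desk-phone": 0.3, "softphone": 0.05, "android-sip-phone": 0.2})
)
print("call answered on:", answered_by)
```

The endpoints that did not answer simply stop ringing, which matches the behaviour described above: no forwarding rules are needed.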
17 What is an auto-attendant?
Auto-attendant (or automated attendant) is a term commonly used in telephony to describe a voice menu system that allows callers to be transferred to an extension without going through a telephone operator or receptionist. The auto-attendant is also known as a digital receptionist. To help a caller find a user on the phone system, a dial-by-name directory is usually available. This feature lists users by name, and the caller can press a key to ring a user's extension automatically once that extension has been announced by the auto-attendant. If a user is not available, the auto-attendant directs the caller to that user's voice mailbox to leave a voicemail message. An auto-attendant is a very useful and cost-effective feature for a business, because it replaces or assists the human operator by automating and simplifying the handling of incoming calls.

18 What different types of CODECS are there?
A codec converts an analog signal to a digital one for transmission over a data network, and back again at the receiving end. The following codecs are in use today:

• GSM - 13 Kbps (full rate), 20 ms frame size
• iLBC - 15.2 Kbps with 20 ms frame size, or 13.33 Kbps with 30 ms frame size
• ITU G.711 - 64 Kbps, sample-based. Also known as alaw/ulaw
• ITU G.722 - 48/56/64 Kbps
• ITU G.723.1 - 5.3/6.3 Kbps, 30 ms frame size
• ITU G.726 - 16/24/32/40 Kbps
• ITU G.728 - 16 Kbps
• ITU G.729 - 8 Kbps, 10 ms frame size
• Speex - 2.15 to 44.2 Kbps
• LPC10 - 2.5 Kbps

19 What is DID - Direct Inward Dialing?
DID - Direct Inward Dialing (also called DDI in Europe) is a feature offered by telephone companies for use with their customers' PABX systems, whereby the telephone company (telco) allocates a range of numbers associated with one or more phone lines. Its purpose is to allow a company to assign a personal number to each employee without requiring a separate phone line for each. That way, telephony traffic can be split up and managed more easily.
DID requires that you purchase an ISDN or digital line and ask the telephone company to assign a range of numbers. You then need DID-capable BRI, E1 or T1 equipment at your premises.

20 What is ECHO cancellation?
Echo cancellation is the process of removing echo from a voice communication in order to improve call quality. It is often needed because speech compression and packet processing add delay, which makes echo noticeable. There are two types of echo: acoustic echo and hybrid echo. Echo cancellers are often combined with silence suppression, which also reduces bandwidth consumption.

21 What does ENUM mean?
ENUM stands for Telephone Number Mapping. Behind this 'abbreviation' hides a great idea: to be reachable anywhere in the world with the same number, via the best and cheapest route. ENUM takes a phone number and links it to an Internet address that is published in the DNS. The owner of an ENUM number can thus publish, via a DNS entry, where a call should be routed to. What's more, different routes can be defined for different types of calls; for example, you can define a different route if the caller is a fax machine. ENUM does require the caller's phone to support it. You register an ENUM number much as you register a domain. At present many registrars and VoIP providers offer this as a free service. ENUM is a relatively new standard and is not yet widespread.

22 How does FAX work in VoIP environments?
Fax was designed for analog networks and does not travel well over a VoIP network. The reason is that fax communication uses the signal in a different way from regular voice communication. When VoIP technologies digitize and compress analog voice communication, they are optimized for voice and not for fax. Consequently, there are a number of things you need to take note of when you move to a VoIP phone system.

If you want to continue using your old fax machine and connect it to your VoIP phone system, it is best to use a VoIP gateway and an ATA that support T.38. T.38 is a protocol designed to allow fax to 'travel' over a VoIP network. It is also possible to convert to computer-based fax and choose a VoIP phone system that supports fax. 3CX Phone System for Windows includes a full-featured fax server that is able to receive faxes and forward them in PDF format by e-mail. Faxes can be sent from anywhere in the network using Microsoft Fax (which comes free with Windows Server 2003 and 2008).
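Returning briefly to ENUM (section 21): the mapping from a phone number to a DNS name is mechanical, and a short Python sketch may make it clearer. The digits of the full international number are reversed, separated by dots and placed under the e164.arpa domain; the resulting name is then queried for NAPTR records. The phone number below is made up for illustration and the function name is our own; no DNS lookup is performed.

```python
def enum_domain(phone_number):
    """Turn an international phone number into its ENUM domain name."""
    digits = [ch for ch in phone_number if ch.isdigit()]  # drop '+', spaces, dashes
    if not digits:
        raise ValueError("phone number contains no digits")
    return ".".join(reversed(digits)) + ".e164.arpa"


# Fictitious number used only as an example.
print(enum_domain("+44 1632 960083"))
# prints 3.8.0.0.6.9.2.3.6.1.4.4.e164.arpa
```

Reversing the digits puts the country code at the right-hand end of the name, so delegation in the DNS follows the same hierarchy as the numbering plan.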
23 What is FOIP - Fax over IP?
FOIP stands for Fax over IP and refers to the process of sending and receiving faxes via a VoIP network. Fax over IP works via T.38 and requires a T.38-capable VoIP gateway as well as a T.38-capable fax machine, fax card or fax software. Fax server software that can talk T.38 can send and receive faxes directly via a VoIP gateway and, consequently, does not need any additional fax hardware. 3CX includes a T.38-compatible network fax server in its 3CX Phone System for Windows. Faxes are converted to PDF files and forwarded via e-mail. Outbound faxes are sent via Microsoft Fax from anywhere in the network. Other fax servers currently on the market require separately licensed and expensive Dialogic Soft IP drivers.

24 What do the terms FXS and FXO mean?
FXS and FXO are the names of the ports used by analog phone lines (also known as POTS - Plain Old Telephone Service) and phones.

FXS - Foreign eXchange Subscriber interface is the port that actually delivers the analog line to the subscriber. In other words, it is the 'plug on the wall' that delivers dial tone, battery current and ring voltage.

FXO - Foreign eXchange Office interface is the port that receives the analog line. It is the plug on the phone or fax machine, or the plug(s) on your analog phone system. It delivers an on-hook/off-hook indication (loop closure). Since the FXO port is attached to a device, such as a fax or phone, the device is often called the 'FXO device'.

FXO and FXS are always paired, much like a male and a female plug. Without a PBX, a phone is connected directly to the FXS port provided by a telephone company.
FXS / FXO without a PBX

If you have a PBX, you connect the lines provided by the telephone company to the PBX, and then the phones to the PBX. Therefore, the PBX must have both FXO ports (to connect to the FXS ports provided by the telephone company) and FXS ports (to connect the phone or fax devices to).
FXS / FXO with a PBX

25 FXS & FXO & VoIP
You will come across the terms FXS and FXO when deciding to buy equipment that allows you to connect analog phones to a VoIP phone system, or traditional PBXs to a VoIP service provider or to each other via the Internet.

26 An FXO gateway
To connect analog phone lines to an IP phone system you need an FXO gateway. You connect the FXS port of the line to the FXO port of the gateway, which then translates the analog phone line to a VoIP call. A number of different FXO gateways are available.
An FXS gateway
An FXS gateway is used to connect one or more lines of a traditional PBX to a VoIP phone system or provider. Alternatively, you can connect analog phones to it and re-use them with a VoIP phone system. You need an FXS gateway because you want to connect the FXO ports (which are normally connected to the telephone company) to the Internet or a VoIP system.
An FXS adapter, a.k.a. ATA adapter
An FXS adapter is used to connect an analog phone or fax machine to a VoIP phone system or to a VoIP provider. You need it because the FXO port of the phone or fax machine has to be connected to an FXS port, which the adapter provides.
FXS / FXO gateways are widely available. 3CX Phone System for Windows automatically configures FXS / FXO gateways so that you can easily continue using your existing PSTN lines and/or analog phones. More information about FXS / FXO and VoIP in general can be found in our SIP / VoIP video tutorials, 'VoIP Nuggets', which are short YouTube technical training tutorials about VoIP and SIP.

27 FXS / FXO procedures: how it technically works
If you are interested in more technical detail on how FXS and FXO ports interoperate, here is the exact sequence.

When you wish to place a call:
1. You pick up the phone (the FXO device). The FXS port detects that you have gone off hook.
2. You dial the phone number, which is passed as Dual-Tone Multi-Frequency (DTMF) digits to the FXS port.

Inbound call:
1. The FXS port receives a call and then sends a ring voltage to the attached FXO device.
2. The phone rings.
3. As soon as you pick up the phone you can answer the call.

Ending the call: normally the FXS port relies on either of the connected FXO devices to end the call.

Note: The FXS port supplies approximately 50 volts DC to the analog phone line. That's why you get a faint 'shock' when you touch a connected phone line. This allows a call to be made in the event of a power cut.

28 What is H.323?
H.323 is a set of standards from the ITU-T that defines a set of protocols to provide audio and visual communication over a computer network. H.323 is a relatively old protocol and is currently being superseded by SIP (Session Initiation Protocol). One of the advantages of SIP is that it is much less complex and resembles the HTTP and SMTP protocols. Therefore most VoIP equipment available today follows the SIP standard, although older VoIP equipment follows H.323.

29 H.323 Standards
For a history of previously approved documents and for current work in progress, please refer to the document status page.

Draft H.323 Core Documents

Version | H.323 | H.225.0 | H.225.0 ASN.1 | H.245 | H.245 ASN.1 | Implementers Guide, Errata
1996 (V1) | H.323v1 | H.225.0v1 | h2250v1.asn | H.245v2 | H245v2.asn | Complete Text
1998 (V2) | H.323v2 | H.225.0v2 | h2250v2.asn | H.245v3 | H245v3.asn | Document 1 + Document 2
1999 (V3) | H.323v3 | H.225.0v3 | h2250v3.asn | H.245v5 | H245v5.asn | Complete Text
2000 (V4) | H.323v4 | H.225.0v4 | h2250v4.asn | H.245v7 | H245v7.asn | Complete Text
2003 (V5) | H.323v5 | H.225.0v5 | h2250v5.asn | H.245v9 | H245v9.asn | Complete Text
2006 (V6) | H.323v6 | H.225.0v6 | h2250v6.asn | H.245v13 | H245v13.asn | Complete Text, H.245 Erratum 1, H.245 (Am1) Erratum 2
2009 (V7) | H.323v7 | H.225.0v7 | h2250v7.asn | H.245v15 | H245v15.asn | Complete Text
NOTE: The Implementers Guide must be read along with the standards. Important ITU-approved corrections are found in those documents. For some standards, the ITU-T has published Amendments and Corrigenda that must also be read with the standard. Please refer to the ITU-T Recommendations Page.

30 What are the benefits of an IP PBX?
• Much easier to install and configure than a proprietary phone system
• Easier to manage because of the web-based configuration interface
• No need for separate phone wiring
• Allows users to hot plug their phone anywhere in the office: users simply take their phone, plug it into the nearest Ethernet port and keep their existing number
• Allows easy roaming: calls can be diverted anywhere in the world because of the characteristics of the SIP protocol
• Significant cost reduction by leveraging the Internet
• The SIP standard eliminates proprietary, expensive phones
• Scalable
• Better reporting
• Better overview of system status and calls
31 IP PBX: How an IP PBX / VoIP phone system works
A VoIP phone system / IP PBX consists of one or more SIP phones (VoIP phones), an IP PBX server and, optionally, a VoIP gateway. The IP PBX server is similar to a proxy server: SIP clients, either softphones or hardware-based phones, register with the IP PBX server, and when they wish to make a call they ask the IP PBX to establish the connection. The IP PBX has a directory of all phones/users and their corresponding SIP addresses, and is thus able to connect an internal call or route an external call via either a VoIP gateway or a VoIP service provider.
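The routing decision described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular IP PBX is implemented: the class name, the extension numbers, the contact addresses and the gateway host name (gateway.example) are all hypothetical.

```python
class IpPbx:
    """Toy IP PBX: a directory of registered extensions plus an outbound route."""

    def __init__(self, gateway_host):
        self.directory = {}            # extension -> SIP address of the phone
        self.gateway_host = gateway_host

    def register(self, extension, sip_address):
        """Record where a phone can be reached (what a SIP registrar does)."""
        self.directory[extension] = sip_address

    def route(self, dialed_number):
        """Return the SIP address a call to dialed_number should be sent to."""
        if dialed_number in self.directory:
            return self.directory[dialed_number]            # internal call
        return f"sip:{dialed_number}@{self.gateway_host}"   # external call


pbx = IpPbx(gateway_host="gateway.example")
pbx.register("121", "sip:121@10.10.1.21")
pbx.register("122", "sip:122@10.10.1.22")

print(pbx.route("122"))      # internal: goes straight to the registered phone
print(pbx.route("5551234"))  # external: handed to the gateway or provider
```

An internal number is resolved from the directory, while any number the PBX does not know is handed to the gateway or service provider, which is the split between internal and external calls that the paragraph above describes.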
How an IP PBX integrates on the network and how it uses the PSTN or Internet to connect calls

32 What is RTCP - Real Time Transport Control Protocol?
RTCP stands for Real Time Transport Control Protocol and is defined in RFC 3550. RTCP works hand in hand with RTP: RTP delivers the actual data, whereas RTCP is used to send control packets to the participants in a call. Its primary function is to provide feedback on the quality of service being provided by RTP.

33 What is RTP - Real Time Transport Protocol?
RTP, short for Real Time Transport Protocol, defines a standard packet format for delivering audio and video over the Internet. It was developed by the Audio Video Transport Working Group and first published in 1996 as RFC 1889, which has since been superseded by RFC 3550. RTP and RTCP are closely linked: RTP delivers the actual data and RTCP is used for feedback on quality of service.

34 What is SDP - Session Description Protocol?
SDP, short for Session Description Protocol, is a format for describing streaming media initialization parameters. It has been published by the IETF as RFC 4566. Streaming media is content that is viewed or heard while it is being delivered.

35 Summary & Future Plan
Voice over Internet Protocol (VoIP), also referred to as VoIP Phone, Digital Phone, Internet Phone or Broadband Phone service, is a way of making regular phone calls over a broadband high-speed Internet connection instead of a regular telephone line. With VoIP, there are no unexplained taxes and service
charges or expensive monthly fees, and many useful features come as standard. Unlimited local and long-distance calling can be had for a minimal amount per year. You simply pick up your regular phone, dial a number and talk, just as you would with a traditional phone service. It doesn't matter whether the person you are calling has a VoIP phone or a traditional phone service, as this is all taken care of by your VoIP provider. To make a VoIP phone call, the only additional piece of equipment you need is an Analog Telephone Adaptor (often called an ATA or gizmo). This is supplied by your VoIP provider when you sign up for service, and it allows you to make phone calls using your regular telephone. The only requirement for this technology is a broadband high-speed Internet service, such as DSL or cable, since VoIP service relies on a high-speed Internet connection to work correctly. Why switch to a VoIP solution? VoIP is cheap, easy to set up, and it is for everyone!