Introduction
This is a simple SIP (VoIP) call-out phone written in C#. The application was developed for, and is currently in use as, a "Help -> Call to support" feature. The idea was to create a zero-configuration, very simple call-out phone, and that is what it is now (though IP-based incoming calls are supported; for example sip:test@ip:7666, where 7666 is the port SIP_Call out runs on).
Currently, this application runs on Windows only. For some reason, .NET "still" has no managed support for audio-in and audio-out, so the audio part uses the unmanaged Windows wave API.
I tried to make the example application well organized, clear, and well commented. I don't know how it turned out; you can be the judge. For beginners, I suggest you Google and read a SIP introduction first, otherwise you will never get what is going on.
Because the code is full of comments, I think there is no need for a lot of explanation here; just dig into the code.
SIP commands and terms used (in the example application)
- INVITE - INVITE has two meanings:
  - Initial INVITE - In simple words, we or the remote-party just send a call offer.
  - Mid-dialog INVITE - In the SIP specification, this is called "RE-INVITE". RE-INVITE is used to modify session info; in our case, to implement call on-hold. RE-INVITE can be sent by us or the remote-party.
- ACK - ACK must be sent to the remote-party for each INVITE/RE-INVITE positive 2xx response. ACK just confirms that we received a 2xx response.
- CANCEL - CANCEL can be used to cancel a pending INVITE or RE-INVITE request.
- BYE - BYE is used to end the active call. The call terminating side must send the BYE.
- SIP dialog - We can imagine this as a session between us and the remote-party.
- SIP call - A SIP call consists of a SIP dialog and an audio RTP session.
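As a small illustration of the CANCEL and BYE entries above, the choice of request for ending a call can be sketched like this. The state names "inviting" and "established" and the function name are made up for this example; they are not taken from the application's code.

```python
from typing import Optional

def hangup_request(call_state: str) -> Optional[str]:
    """Pick the SIP request that ends a call in the given (hypothetical) state."""
    if call_state == "inviting":
        # INVITE is still pending (no final response yet): cancel it.
        return "CANCEL"
    if call_state == "established":
        # 2xx was received and ACKed: the dialog is active, so end it with BYE.
        return "BYE"
    # Nothing to terminate (idle or already terminated).
    return None
```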
Establishing a call
Establishing a call starts with creating an RTP audio session, because we need to advertise our RTP session IP:port in SDP. After that, we do NAT handling if it is needed. Now the initial INVITE request can be created and sent to the remote-party. For more details, read RFC 3261 (see the links below).
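The order of steps described above (RTP endpoint first, then SDP, then INVITE) can be sketched roughly as follows. This is only a minimal Python illustration, not the application's C# code: the function names, the default SIP port 5060, and the random tag, branch and Call-ID values are assumptions made for the example.

```python
import uuid

def build_sdp_offer(local_ip: str, rtp_port: int, direction: str = "sendrecv") -> str:
    """Return a minimal SDP audio offer advertising PCMU on local_ip:rtp_port."""
    lines = [
        "v=0",
        f"o=- 2890844526 2890844526 IN IP4 {local_ip}",
        "s=-",  # "-" stands in for an empty session name
        f"c=IN IP4 {local_ip}",
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP 0",
        "a=rtpmap:0 PCMU/8000",
        f"a={direction}",
    ]
    return "\r\n".join(lines) + "\r\n"

def build_invite(caller_uri: str, callee_uri: str, local_ip: str,
                 rtp_port: int, sip_port: int = 5060) -> str:
    """Build an initial INVITE whose body is the SDP offer for our RTP session."""
    sdp = build_sdp_offer(local_ip, rtp_port)
    # "sip:alice@domain.com" gives the user part "alice" for the Contact header.
    user = caller_uri.split(":", 1)[1].split("@", 1)[0]
    headers = [
        f"INVITE {callee_uri} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{sip_port};branch=z9hG4bK{uuid.uuid4().hex}",
        "Max-Forwards: 70",
        f"To: <{callee_uri}>",
        f"From: <{caller_uri}>;tag={uuid.uuid4().hex[:10]}",
        f"Call-ID: {uuid.uuid4().hex}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{user}@{local_ip}:{sip_port}>",
        "Content-Type: application/sdp",
        # Content-Length counts the bytes of the body, not the characters.
        f"Content-Length: {len(sdp.encode('utf-8'))}",
    ]
    # An empty line separates the SIP headers from the SDP body.
    return "\r\n".join(headers) + "\r\n\r\n" + sdp
```

If NAT handling is needed, the public IP:port would be passed in place of local_ip and rtp_port, so that the remote-party sends its audio to an address it can actually reach.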
Example SIP messages exchanged
INVITE sip:bob@192.168.1.44 SIP/2.0
Via: SIP/2.0/UDP 192.168.1.33;branch=z9hG4bKnashds8
Max-Forwards: 70
To: Bob <sip:bob@domain.com>
From: Alice <sip:alice@domain.com>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 314159 INVITE
Contact: <sip:alice@192.168.1.33>
Content-Type: application/sdp
Content-Length: sdp_size_in_bytes
v=0
o=- 2890844526 2890844526 IN IP4 192.168.1.33
s=
c=IN IP4 192.168.1.33
t=0 0
m=audio 1111 RTP/AVP 0 97
a=rtpmap:0 PCMU/8000
a=sendrecv
SIP/2.0 180 Ringing
Via: SIP/2.0/UDP 192.168.1.33;branch=z9hG4bKnashds8;received=192.0.2.3
To: Bob <sip:bob@domain.com>;tag=a6c85cf
From: Alice <sip:alice@domain.com>;tag=1928301774
Call-ID: a84b4c76e66710
Contact: <sip:bob@192.168.1.44>
CSeq: 314159 INVITE
Content-Length: 0
SIP/2.0 200 OK
Via: SIP/2.0/UDP 192.168.1.33;branch=z9hG4bKnashds8;received=192.0.2.3
To: Bob <sip:bob@domain.com>;tag=a6c85cf
From: Alice <sip:alice@domain.com>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 314159 INVITE
Contact: <sip:bob@192.168.1.44>
Content-Type: application/sdp
Content-Length: sdp_size_in_bytes
v=0
o=- 2890844526 2890844526 IN IP4 192.168.1.44
s=
c=IN IP4 192.168.1.44
t=0 0
m=audio 2222 RTP/AVP 0 97
a=rtpmap:0 PCMU/8000
a=sendrecv
ACK sip:bob@192.168.1.44 SIP/2.0
Via: SIP/2.0/UDP 192.168.1.33;branch=z9hG4bKnashds9
Max-Forwards: 70
To: Bob <sip:bob@domain.com>;tag=a6c85cf
From: Alice <sip:alice@domain.com>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 314159 ACK
Content-Length: 0
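Once the 200 OK arrives, the caller has to read the remote-party's RTP address from the SDP answer before audio can flow. A minimal sketch of that step, in Python and assuming plain IPv4 unicast SDP like the messages above (the function name is hypothetical):

```python
from typing import Optional, Tuple

DIRECTIONS = ("sendrecv", "sendonly", "recvonly", "inactive")

def parse_audio_endpoint(sdp: str) -> Optional[Tuple[str, int, str]]:
    """Return (ip, port, direction) of the first audio stream, or None."""
    section = "session"            # "session", "audio" or "other"
    session_ip = media_ip = None
    session_dir = media_dir = None
    port = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            if section == "audio":
                break              # the audio section is finished
            section = "audio" if line.startswith("m=audio ") else "other"
            if section == "audio":
                # "m=audio 2222 RTP/AVP 0 97": the second field is the port
                port = int(line.split()[1].split("/")[0])
        elif line.startswith("c="):
            ip = line.split()[-1]  # "c=IN IP4 192.168.1.44"
            if section == "session":
                session_ip = ip
            elif section == "audio":
                media_ip = ip
        elif line.startswith("a=") and line[2:] in DIRECTIONS:
            if section == "session":
                session_dir = line[2:]
            elif section == "audio":
                media_dir = line[2:]
    ip = media_ip or session_ip    # the media-level c= line wins if present
    if port is None or ip is None:
        return None
    # With no direction attribute at all, the stream is sendrecv.
    return ip, port, media_dir or session_dir or "sendrecv"
```

For the 200 OK body shown above, this returns ("192.168.1.44", 2222, "sendrecv"), which is where our RTP session should send its audio.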
Call on-hold
Call on-hold in SIP is something that is not exactly defined. Actually, SIP itself doesn't know anything about on-hold. On-hold is totally up to the application to handle (this is because SIP doesn't care about the SDP data it carries). Because of this, different implementations may handle it differently.
So far, I have seen four different ways phones do it:
- Setting the audio stream IP to "0.0.0.0" in SDP.
- Disabling the audio stream in SDP by setting the port to "0".
- Setting the audio stream to "sendonly" - normally the call holder then sends on-hold music to the remote-party.
- A cleaner way is setting the audio stream to "inactive" (this is how we do it).
The main difference between disabling the audio stream and setting it to "inactive" is that with "inactive" the RTP session still exists and RTCP packets are still sent. As for the RTP stream in the "inactive" state, it is up to the application whether it pauses the session or disposes of it while the call is on hold.
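Since remote phones may use any of the four variants listed above, an incoming RE-INVITE has to be checked for all of them. A minimal sketch, assuming an SDP body with a single audio stream; the function name and the returned labels are made up for this example:

```python
from typing import Optional

def hold_kind(sdp: str) -> Optional[str]:
    """Return which on-hold variant an SDP offer uses, or None if it is not a hold."""
    ip = None
    port = None
    direction = "sendrecv"  # default when no direction attribute is present
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("c="):
            ip = line.split()[-1]
        elif line.startswith("m=audio "):
            port = int(line.split()[1].split("/")[0])
        elif line in ("a=sendrecv", "a=sendonly", "a=recvonly", "a=inactive"):
            direction = line[2:]
    if ip == "0.0.0.0":
        return "address 0.0.0.0"
    if port == 0:
        return "port 0"
    if direction in ("sendonly", "inactive"):
        return direction
    return None
```

The checks run in the same order as the list above, so an offer that combines several variants is reported under the first one that matches.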
Example SDP (putting on-hold and un-hold)
[onhold our offer]
v=0
o=- 2890844526 2890844526 IN IP4 192.168.1.33
s=
c=IN IP4 192.168.1.33
t=0 0
m=audio 1111 RTP/AVP 0 97
a=rtpmap:0 PCMU/8000
a=inactive
[onhold remote-party answer]
v=0
o=- 2890844526 2890844526 IN IP4 192.168.1.44
s=
c=IN IP4 192.168.1.44
t=0 0
m=audio 2222 RTP/AVP 0 97
a=rtpmap:0 PCMU/8000
a=inactive
------------------------------------------------
[unhold our offer]
v=0
o=- 2890844526 2890844526 IN IP4 192.168.1.33
s=
c=IN IP4 192.168.1.33
t=0 0
m=audio 1111 RTP/AVP 0 97
a=rtpmap:0 PCMU/8000
a=sendrecv
[unhold remote-party answer]
v=0
o=- 2890844526 2890844526 IN IP4 192.168.1.44
s=
c=IN IP4 192.168.1.44
t=0 0
m=audio 2222 RTP/AVP 0 97
a=rtpmap:0 PCMU/8000
a=sendrecv
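As the four SDP bodies above show, our hold and un-hold offers differ only in the direction attribute. Producing one from the other can be sketched as below (Python, with a hypothetical helper name; it only swaps the direction line and leaves everything else in the SDP untouched):

```python
DIRECTIONS = ("sendrecv", "sendonly", "recvonly", "inactive")

def with_direction(sdp: str, direction: str) -> str:
    """Return a copy of the SDP with its direction attribute set to `direction`."""
    if direction not in DIRECTIONS:
        raise ValueError("unknown direction: " + direction)
    out = []
    replaced = False
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("a=") and line[2:] in DIRECTIONS:
            out.append("a=" + direction)   # swap the existing direction line
            replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append("a=" + direction)       # no direction line yet: add one
    return "\r\n".join(out) + "\r\n"
```

Putting a call on hold would then mean sending a RE-INVITE whose body is with_direction(current_sdp, "inactive"), and un-holding means sending with_direction(current_sdp, "sendrecv").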
NAT handling
We support two different NAT handling methods:
- STUN - A STUN request is sent to the STUN server, which replies with the public IP:port the request came from.
- UPnP - The UPnP API is used to open router ports. This can be used only if the router supports UPnP.
Calling is possible only if the router supports UPnP or if the NAT type (as detected with STUN) is not "symmetric NAT". With "symmetric NAT", each new UDP request to a new destination IP:port is mapped to a new external IP:port on the router, so the address learned from the STUN server is not the one the remote-party will see. For more info about NAT types, see: http://en.wikipedia.org/wiki/Network_address_translation.
There is a special case when the router supports SIP ALG (application layer gateway): then no NAT handling is needed in the application. The router alters the SIP and SDP data and opens router ports as needed. (So far I have seen Thomson routers do it.)
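To make the STUN step more concrete, here is a minimal Python sketch of a binding request and of reading the public IP:port out of the reply. It follows the RFC 5389 message layout (magic cookie, XOR-MAPPED-ADDRESS), which is an assumption on my part; the application's own STUN client may use the older RFC 3489 format and a fuller NAT type test. The function names, including the simplified looks_symmetric check, are hypothetical.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020

def build_binding_request(transaction_id=None):
    """Return a 20-byte STUN binding request (header only, no attributes)."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid

def parse_binding_response(data):
    """Return the public (ip, port) from a binding success response, or None."""
    if len(data) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", data[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    pos, end = 20, min(20 + length, len(data))
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", data[pos:pos + 4])
        value = data[pos + 4:pos + 4 + attr_len]
        # value[1] == 0x01 means an IPv4 address
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None

def looks_symmetric(mapping_a, mapping_b):
    """Simplified check: the same local socket asked two different STUN servers.

    If the router reports a different public IP:port for each destination,
    it behaves like a symmetric NAT and STUN alone will not help.
    """
    return mapping_a != mapping_b
```

In use, the request would be sent with sendto() from the same UDP socket that RTP will use, and the bytes read back with recvfrom() passed to parse_binding_response; the returned IP:port is what gets advertised in SDP.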
Final words
Building a SIP soft phone is easy if you have the right components (SIP stack, RTP stack, audio lib) for it ... but when you need to code everything from scratch, it's a nightmare. You will drown in tons of messy RFCs ...
Links