Lync Media Conferencing/Audio call flow of Sip and SDP

Top Level

Signaling and media flow are not tied to each other, but both are needed to make a media session work. Lync signaling is SIP over TCP and flows through the Lync Edge and Mediation Servers. The media (“the call or conference data”) prefers UDP and takes the shortest possible path, because low latency is what makes audio acceptable to the human ear. As a result, the media path is the hard part to grasp when troubleshooting. This is complexity concept #1. Second, we have SDP, SIP, RTP, ICE, MRAS, and NAT, as well as the RFCs for other network protocols, to consider when troubleshooting. These complexities blur together quickly without a firm grasp of the concepts.

The goal of this paper is to give you “search” terms you can enter to filter logs and review each stage of media negotiation, so you can correct problems and build successful Lync deployments. You will see the search terms in bold underlined letters.

The bottom of this document contains definitions you may want to review before reading this article. We need to walk through the call flow and identify the important points in the Lync SIP logs. An audio call from home to work goes like this:

  • A Lync client uses DNS to get the Lync destination network information
  • The SIP client contacts the Edge Server using SDP messaging inside of SIP, negotiating authentication through MRAS
  • STUN returns the public IP (reflexive address) of the client to the client, and that address is added to the candidate list
  • The possible media paths are summarized as “candidates”. These are IP and port combinations, which ICE pairs up and tests with connectivity checks
  • The media connection is secured on the first successful end-to-end connection
  • The candidate is promoted and media flow is now possible
  • The H264.SVC codec is used and RTP packets are sent as the payload for the call
  • SIP messages, carrying SDP, control the call setup and teardown

Media Session Scenario

Before discussing the media call flow, it is important to understand that no matter how complex the call may look, the negotiation of media falls into three simple categories. Each person on a call only has three candidate types to offer: a host IP, a reflexive (STUN) IP, and a relay (TURN) IP. Said another way: a non-routable LAN IP, a public IP on the internet, or a relay IP, which is usually the Edge Server. With this in mind, see the figure below, detailing an external-to-office call:

Figure 1.

The local IP is only an available media path for internal clients (UDP Direct). The STUN media path is the shortest public path but is not always allowed by corporate networks (UDP NAT). The fuchsia line represents a NAT path from 224.241.31.12 to 192.168.3.8. The full path is really the solid fuchsia line plus the blue line (UDP Relay); the blue segment represents the TURN portion of the path. Notice the TURN path is two NATs away from the 224 public IP. This is why TURN is needed in several situations. Just remember that the local, public (reflexive), and relay addresses are what we are trying to choose between during the process of media negotiation.

Login, Discover, Exchange, Connectivity, Promotion, and Flow

The process breaks down into Login, Discover, Exchange, Connectivity, Promotion, and Flow. Since this all occurs over SIP, you can trace it with the S4 and SIPStack components in OCSLogger. The Lync client login uses MRAS (Media Relay Authentication Service) to gain access to the Edge Server, so the MRAS trace flag may also be used. Subsequent relay of media uses the credentials gained from the MRAS request. This is the important first step toward successful media.

Login (Sip Register)

  • SIP REGISTER (client)
  • MRAS Request
  • MRAS Response
  • MRAS Response (to-user) 200 OK

Discover (a=candidate)

  • SIP INVITE – contains the IP addresses of the caller
      • Look for “typ host” for the caller host IPs
      • Look for reflexive addresses, identified by the keywords “typ srflx raddr” (TCP/UDP)
      • Look for relay addresses, identified by the keywords “typ relay raddr” (TCP/UDP)
      • List these addresses so you understand how this call is functioning
  • Additional SIP INVITE – contains the IP addresses of the callee
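
To list the addresses by type as the steps above suggest, the candidate lines can be pulled out of an SDP body with a short script. This is a minimal sketch in Python, assuming the lines follow the RFC 5245 `a=candidate` syntax. The function name `list_candidates` and the sample SDP fragment are illustrative only and are not taken from a real Lync log.

```python
import re
from collections import defaultdict

# Assumed RFC 5245 layout:
# a=candidate:<foundation> <component> <transport> <priority> <addr> <port>
#             typ <type> [raddr <addr> rport <port>]
CANDIDATE_RE = re.compile(
    r"^a=candidate:(?P<foundation>\S+) (?P<component>\d+) (?P<transport>\S+) "
    r"(?P<priority>\d+) (?P<addr>\S+) (?P<port>\d+) typ (?P<type>\S+)"
    r"(?: raddr (?P<raddr>\S+) rport (?P<rport>\d+))?"
)


def list_candidates(sdp_text):
    """Group (transport, address, port) tuples by candidate type."""
    by_type = defaultdict(list)
    for line in sdp_text.splitlines():
        match = CANDIDATE_RE.match(line.strip())
        if match:
            by_type[match["type"]].append(
                (match["transport"], match["addr"], int(match["port"]))
            )
    return dict(by_type)


# Hypothetical SDP fragment, for illustration only.
SAMPLE_SDP = """\
m=audio 14932 RTP/AVP 0
a=candidate:1 1 UDP 2130706431 192.168.1.100 14932 typ host
a=candidate:2 1 UDP 1694234111 65.10.10.189 14932 typ srflx raddr 192.168.1.100 rport 14932
a=candidate:3 1 UDP 16648703 178.64.39.80 57548 typ relay raddr 65.10.10.189 rport 14932
"""

if __name__ == "__main__":
    for cand_type, entries in list_candidates(SAMPLE_SDP).items():
        print(cand_type, entries)
```

Running the same function over the caller's INVITE and the callee's answer gives two short lists (host, srflx, relay) that can be compared side by side.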

Exchange (a=candidate)

  • 200 OK – contains the confirmation that the remote client can reach the calling client
    • Look for “typ host” for the caller host IPs
    • Look for reflexive addresses, identified by the keywords “typ srflx raddr” (TCP/UDP)
    • Look for relay addresses, identified by the keywords “typ relay raddr” (TCP/UDP)
    • List these addresses so you understand how this call is functioning
  • 180 Ringing – session early media SDP message
  • 183 Session Progress – session early media
  • PRACK – lets session media connect while call setup in signaling, which may be slower, completes

Connectivity (Re-Invite)

Re-INVITE results actually show up in the candidate promotion (a=remote-candidate). Note the 180, 183, and PRACK messages above. They appear because call signaling setup may be slower to complete than media setup. Below are the underlying connectivity checks that happen as the early media is setting up. At this point the user is logged in and is ready to make a call or connect to the media server. The early media connection is shown as seen in a packet trace. TURN packets are setting up the session before the SIP INVITE.

Packet trace analysis:

  • TURN: Allocate Request – (client requests space on the Edge)
  • TURN: Allocate Error Response from the Edge (this is by design: the first request carries no message-integrity attribute, so the Edge challenges the client)
  • TURN: proper “Allocate Request” sent with the “message-integrity” attribute
  • SIP INVITE – (client candidate information provided)
  • TURN: Allocate Response – (the Edge replies with its own public and private address)
  • At this point the information is given to the client because MRAS has verified the user credentials; it is only valid for the port which is connected.
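
The ordering above can be checked mechanically once the packet summaries are exported from a capture tool. The sketch below assumes each packet has been reduced to a one-line summary string beginning with the protocol name; that format, the sample trace, and the name `follows_allocate_handshake` are assumptions made for illustration.

```python
# Steps from the packet trace list above, in the order they are expected.
EXPECTED_STEPS = [
    "Allocate Request",
    "Allocate Error Response",
    "Allocate Request",
    "Allocate Response",
]


def follows_allocate_handshake(packet_summaries):
    """Return True if the TURN summaries contain the expected steps in order.

    Other packets (for example the SIP INVITE) may appear in between.
    """
    turn_packets = iter(
        p for p in packet_summaries if p.upper().startswith("TURN")
    )
    # Each any() consumes the shared iterator up to its first match,
    # so the steps must occur in sequence.
    return all(
        any(step in packet for packet in turn_packets)
        for step in EXPECTED_STEPS
    )


# Hypothetical summaries, for illustration only.
SAMPLE_TRACE = [
    "TURN: Allocate Request",
    "TURN: Allocate Error Response",
    "TURN: Allocate Request (message-integrity)",
    "SIP: INVITE",
    "TURN: Allocate Response",
]

if __name__ == "__main__":
    print(follows_allocate_handshake(SAMPLE_TRACE))
```

A trace that jumps straight from the first Allocate Request to an Allocate Response, or that never receives a response, fails the check and is worth a closer look.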

Promotion (a=remote-candidate) and Flow of Media

  • SIP INVITE and 200 OK message
  • The INVITE and the 200 OK should each show only one candidate pair
  • This completes the negotiation of the media path
  • See the next section on whether the correct media path was chosen. This is very important to understand
  • Media will flow over RTP using the chosen codec, H264.SVC
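
To read the promoted pair out of the final INVITE or 200 OK, the remote-candidate attribute can be split into its component, address, and port fields. This sketch assumes the RFC 5245 form, where the value is a series of `<component> <address> <port>` triples with component 1 for RTP and component 2 for RTCP. The exact attribute text in a given Lync log may differ, and the sample line is hypothetical.

```python
COMPONENT_NAMES = {1: "RTP", 2: "RTCP"}  # component IDs as defined in RFC 5245


def parse_remote_candidates(line):
    """Parse '<component> <address> <port>' triples from a remote-candidate line."""
    name, _, value = line.strip().partition(":")
    if not name.startswith("a=remote-candidate"):
        raise ValueError("not a remote-candidate attribute: " + line)
    fields = value.split()
    if not fields or len(fields) % 3 != 0:
        raise ValueError("expected component/address/port triples: " + line)
    chosen = {}
    for i in range(0, len(fields), 3):
        component = int(fields[i])
        address, port = fields[i + 1], int(fields[i + 2])
        chosen[COMPONENT_NAMES.get(component, component)] = (address, port)
    return chosen


# Hypothetical attribute line, for illustration only.
SAMPLE_LINE = "a=remote-candidates:1 178.64.39.80 57548 2 178.64.39.80 57549"

if __name__ == "__main__":
    print(parse_remote_candidates(SAMPLE_LINE))
```

The addresses returned here can then be matched against the earlier a=candidate lists to see which candidate type was promoted.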

Was the correct candidate pair chosen?

I did not detail the actual promotion because we need to break this down further. The final promotion may work, or it may show a poor choice based on network settings. We need to understand the difference between a good and a bad choice. Is the path going to be sustainable? Scalable?

The initial ICE candidates need to be broken into something we understand. ICE will choose a candidate in the following order.

  • UDP direct- (Local IP of each client to each other)
  • UDP NAT- (tests the reflexive addresses of the home users) (only for 2 users outside the firewall)
  • UDP Relay- (external to external or internal to external)- uses public IP of AV/Edge
  • TCP Relay- AV/EDGE public NIC (TCP only)- last resort

So we want to tabulate the candidates by TCP vs. UDP, reflexive vs. relay, and so on. I am using an example from one of the best articles I have seen on enterprise failures in the negotiation of media. The remote call invite is below. Terms to know include:

  • TCP-PASS vs. TCP-ACT = passive or active candidates
  • srflx raddr = reflexive candidate
  • relay raddr = relay candidate

Bob, a remote user with LAN IP 192.168.1.100, connects to Lync Edge 178.64.39.80 (Edge NAT AV IP 10.1.0.77) (Edge internal IP 10.1.1.11)

 

Figure 2.

Figure 3.

Whether the correct path is chosen depends on the physical network topology, port availability, and connectivity. Keep in mind that one candidate from each user is chosen in the final promotion. The combinations which may exist are as follows:

  • UDP direct- (Local IP of each client to each other)
    • Not possible unless a direct UDP path exists, as with 2 internal clients
  • UDP Relay- (external to external or internal to external)- uses public IP of AV/Edge
    • UDP 16648703 178.64.39.80 57548 typ relay raddr 65.10.10.189 rport 14932 (Bob)
    • UDP 16647678 178.64.39.81 56065 typ relay raddr 10.10.10.211 rport 53325 (Carol)
  • UDP NAT- (tests the reflexive addresses of the home users) (only for 2 users outside the firewall)
    • UDP 1694233598 65.10.10.189 14933 typ srflx raddr 192.168.1.100 rport 14933 (Bob)
    • No UDP reflexive address shown for Carol
  • TCP Relay- AV/EDGE public NIC (TCP only)- last resort
    • TCP-ACT 7075326 178.64.39.80 50468 typ relay raddr 65.10.10.189 rport 26654 (Bob)
    • TCP-ACT 1684797951 10.10.10.211 49583 typ srflx raddr 10.10.10.211 rport 49583 (Carol)
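
The large number after the transport in each line is the candidate priority. As a rough sketch, assuming it follows the RFC 5245 formula (type preference × 2^24 + local preference × 2^8 + 256 − component ID), it can be decoded and used to rank the five lines above. The helper names `decode_priority` and `tabulate` are illustrative, and ICE itself ranks candidate pairs rather than single candidates, so treat the ordering as a hint only.

```python
import re

# Candidate lines copied from the list above.
CANDIDATES = [
    "UDP 16648703 178.64.39.80 57548 typ relay raddr 65.10.10.189 rport 14932 (Bob)",
    "UDP 16647678 178.64.39.81 56065 typ relay raddr 10.10.10.211 rport 53325 (Carol)",
    "UDP 1694233598 65.10.10.189 14933 typ srflx raddr 192.168.1.100 rport 14933 (Bob)",
    "TCP-ACT 7075326 178.64.39.80 50468 typ relay raddr 65.10.10.189 rport 26654 (Bob)",
    "TCP-ACT 1684797951 10.10.10.211 49583 typ srflx raddr 10.10.10.211 rport 49583 (Carol)",
]

LINE_RE = re.compile(
    r"(?P<transport>\S+) (?P<priority>\d+) (?P<addr>\S+) (?P<port>\d+) "
    r"typ (?P<type>\S+) raddr \S+ rport \d+ \((?P<user>\w+)\)"
)


def decode_priority(priority):
    """Split a priority into (type preference, local preference, component ID),
    assuming the RFC 5245 formula."""
    type_pref = priority >> 24
    local_pref = (priority >> 8) & 0xFFFF
    component = 256 - (priority & 0xFF)
    return type_pref, local_pref, component


def tabulate(lines):
    """Return one dict per candidate line, highest priority first."""
    rows = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue
        priority = int(match["priority"])
        type_pref, _, component = decode_priority(priority)
        rows.append({
            "user": match["user"],
            "transport": match["transport"],
            "type": match["type"],
            "addr": match["addr"],
            "port": int(match["port"]),
            "priority": priority,
            "type_pref": type_pref,
            "component": component,
        })
    return sorted(rows, key=lambda row: row["priority"], reverse=True)


if __name__ == "__main__":
    for row in tabulate(CANDIDATES):
        print("{user:6} {transport:8} {type:6} {addr}:{port} "
              "type_pref={type_pref} component={component}".format(**row))
```

Under that assumption the reflexive candidates decode to a type preference of 100 and the relay candidates to 0, which lines up with the `typ srflx` and `typ relay` labels in the same lines.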

From this point, genuine analysis must occur. Again, review TechNet to understand the network requirements, Federation, ports, CAC, regions, policies, and other items which are out of scope for this article. The quickest way to get to the root of a problem is to identify the remote-candidate choice and see if it makes sense. Two problems stand out in the list above. 1. There does not appear to be a UDP path for the call. 2. The relay addresses do not have a common end point. Therefore there is likely a port problem somewhere. See the blog which is credited with the example for a detailed explanation of the issues that Bob and Carol face.

  • Review logs for the final selection versus the candidate list
    • a=remote-candidate appears in the SIP INVITE or the 200 OK; both are relevant at this point. This is the promotion text.
    • The pair of candidates is for RTP and RTCP. One is used for the RTP media and the other for RTCP, the control messages
  • Review candidate ports vs. selected ports
    • Desktop sharing and file transfer use the AV/Edge ports
    • 50000-59999 – RTP/TCP (RTP/UDP stays closed) – outbound (external side open)
    • 443 (TCP) and 3478 (UDP) open both ways
    • The outer firewall and inner firewall cannot both NAT (the Access Edge is NAT-ed)
    • The outer and inner firewall both doing NAT is double NAT. This will not work for Lync
  • Review the concept of hair-pinning and verify this is set up properly.
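
The port review above can be turned into a quick check for each Edge relay candidate. This is a simplified sketch using only the port numbers named in the list; it treats the 50000-59999 range as acceptable for either transport, which is an assumption, and the function name `is_expected_edge_port` is illustrative.

```python
MEDIA_PORT_RANGE = range(50000, 60000)       # 50000-59999, from the list above
FIXED_PORTS = {("TCP", 443), ("UDP", 3478)}  # open both ways, from the list above


def is_expected_edge_port(transport, port):
    """Return True if an Edge port matches the ports named in the review list.

    Transport labels such as TCP-ACT or TCP-PASS are treated as TCP.
    """
    base = "TCP" if transport.upper().startswith("TCP") else "UDP"
    return (base, port) in FIXED_PORTS or port in MEDIA_PORT_RANGE


if __name__ == "__main__":
    # Relay candidates on the Edge address from the Bob and Carol example.
    for transport, port in [("UDP", 57548), ("TCP-ACT", 50468)]:
        print(transport, port, is_expected_edge_port(transport, port))
```

A relay candidate whose Edge port falls outside these values points toward a firewall or port configuration question rather than an ICE problem.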

How to think about this

Jeff Schertz's article on STUN, TURN, and ICE is helpful in showing graphically what is going on. The pattern I see is this: 1.) There are three candidate paths no matter the call: host, STUN, and TURN. 2.) The host path is only valid when both clients are internal or routable. 3.) The STUN path uses the public (reflexive) IP of both users for the media traversal. 4.) The TURN path takes the STUN addresses and acts as a relay because the STUN path will not connect on its own.

 

How to get to the demarcation points in the logs

  • Isolate call = CallID – A call can be isolated by finding its CallID. Search on this string and the log is narrowed to all messages for one leg of the call.

  • Edge provisioning by client = mrasuri – This shows the client making the request to the Edge Server and the Edge response. The response tells the client whether it needs to contact the Edge for media relay to the called party.

  • Edge credentials for AV/Edge = credentialsRequestID

  • Discovery and exchange of info = a=candidate

  • Connectivity checks = Re-Invite

  • Candidate promotion = a=remote-candidate
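
The same searches can be scripted when a log is too large to filter by hand. Below is a minimal sketch that tags each matching line with the stage it belongs to, using the literal search strings listed above; “Re-Invite” is left out because it is a message role rather than a literal string. The sample log lines and the name `tag_log_lines` are hypothetical.

```python
# Search strings from the list above (lowercased), mapped to the stage they mark.
SEARCH_TERMS = {
    "mrasuri": "Edge provisioning by client",
    "credentialsrequestid": "Edge credentials for AV/Edge",
    "a=candidate": "Discovery and exchange",
    "a=remote-candidate": "Candidate promotion",
}


def tag_log_lines(lines):
    """Return (line number, stage, text) for every line containing a search term.

    Matching is case-insensitive and stops at the first term found in a line.
    """
    tagged = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        for term, stage in SEARCH_TERMS.items():
            if term in lowered:
                tagged.append((number, stage, line.rstrip()))
                break
    return tagged


# Hypothetical log lines, for illustration only.
SAMPLE_LOG = [
    "INVITE sip:carol@example.com SIP/2.0",
    "a=candidate:1 1 UDP 2130706431 192.168.1.100 14932 typ host",
    "<mrasUri>sip:mras.example.com</mrasUri>",
    "a=remote-candidates:1 178.64.39.80 57548 2 178.64.39.80 57549",
]

if __name__ == "__main__":
    for number, stage, text in tag_log_lines(SAMPLE_LOG):
        print(number, stage, text)
```

Filtering the log down to one CallID first, then running the tagger, gives a compact stage-by-stage view of a single call leg.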

Definitions

SIP – Session Initiation Protocol (RFC 3261) is the way Lync communicates voice and chat. SIP is how communication is initiated. Lync uses TCP for SIP.

SDP – Session Description Protocol (RFC 4566). SDP messages are sent inside the SIP message. The SDP body is identified by a single Content-Type header line whose value is application/sdp. SDP is used to set up the optimal media flow path.
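
As a small illustration of how the SDP body sits inside a SIP message, the sketch below splits a message at the blank line after the headers and returns the body only when Content-Type is application/sdp. It is a simplified example: the sample message is hypothetical, and multipart bodies are not handled.

```python
def extract_sdp(sip_message):
    """Return the SDP body of a SIP message, or None if it carries none."""
    text = sip_message.replace("\r\n", "\n")
    headers, separator, body = text.partition("\n\n")
    if not separator:
        return None  # no blank line, so no body
    for header in headers.split("\n"):
        name, _, value = header.partition(":")
        if name.strip().lower() == "content-type":
            if value.strip().lower().startswith("application/sdp"):
                return body
            return None
    return None


# Hypothetical, abbreviated SIP message for illustration only.
SAMPLE_MESSAGE = "\r\n".join([
    "INVITE sip:carol@example.com SIP/2.0",
    "Call-ID: abc123",
    "Content-Type: application/sdp",
    "",
    "v=0",
    "m=audio 14932 RTP/AVP 0",
    "a=candidate:1 1 UDP 2130706431 192.168.1.100 14932 typ host",
])

if __name__ == "__main__":
    print(extract_sdp(SAMPLE_MESSAGE))
```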

RTP/RTCP – Real-time Transport Protocol and its control protocol (RFC 3550), used to transfer the media stream

STUN – Session Traversal Utilities for NAT (reflects addresses, producing reflexive addresses)

TURN – Traversal Using Relays around NAT (relay)

ICE – Interactive Connectivity Establishment

MRAS – Media Relay Authentication Service (client login uses this credential to relay media through the Edge Server)

The Media Relay component in Lync is the Edge Server. This is the ICE server (STUN + TURN)

  • If STUN is used, media will flow directly between the endpoints.
  • If TURN is used, media will be proxied by the Edge Server

NAT Definitions

Full Cone NAT – Once the client has a port open to an external host, every other external host can also send to that port.

Address Restricted NAT – Only a server (IP) the client has already sent a packet to can respond.

Port Restricted NAT – As above, but the source port of the replying server must also match the port the client sent to. The client can initiate the connection, but the server (where the packet has gone) cannot initiate a connection.

Symmetric NAT – The client gets a separate mapping for each destination (two connections produce two mappings), and the connections cannot cross-communicate. This is similar to a corporate firewall. This greatly reduces the chance that a STUN connection can be made. A TURN candidate pair is often chosen in this situation.
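
The four behaviors can be compared with a small model. This is a teaching sketch only, not a description of any particular device: it asks whether a peer can reach the reflexive address that the STUN server reported, after the client has sent packets to both the STUN server and the peer. The function names and the addresses used are illustrative.

```python
def inbound_allowed(nat_type, contacted, source):
    """Decide whether a NAT forwards an inbound packet to an existing mapping.

    contacted: set of (ip, port) destinations the client has already sent to
               through this mapping.
    source:    (ip, port) of the external host sending the inbound packet.
    """
    source_ip = source[0]
    if nat_type == "full-cone":
        return True
    if nat_type == "address-restricted":
        return any(ip == source_ip for ip, _port in contacted)
    if nat_type in ("port-restricted", "symmetric"):
        return source in contacted
    raise ValueError("unknown NAT type: " + nat_type)


def contacted_through_stun_mapping(nat_type, stun_server, peer):
    """Destinations reached through the mapping the STUN server reported.

    A symmetric NAT opens a separate mapping for the peer, so the mapping
    seen by the STUN server has only ever been used toward that server.
    """
    if nat_type == "symmetric":
        return {stun_server}
    return {stun_server, peer}


STUN_SERVER = ("178.64.39.80", 3478)
PEER = ("65.10.10.189", 14933)

if __name__ == "__main__":
    for nat in ("full-cone", "address-restricted", "port-restricted", "symmetric"):
        contacted = contacted_through_stun_mapping(nat, STUN_SERVER, PEER)
        print(nat, inbound_allowed(nat, contacted, PEER))
```

In this model only the symmetric case blocks the peer, which is why a relay (TURN) candidate tends to win behind that kind of NAT.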
