Multicast Routing

Tik- 110.551 Internetworking Seminar
Jorma Paananen
Jorma.Paananen@tele.fi

Abstract

Group communication and network multimedia applications are becoming more and more popular. These appilcations set new demands on the quality of network resources such as bandwith or latency. While these resources are usually very limited, good multicast routing will be more and more important as networks and the number of users continues to grow. The good and bad points of current multicast routing algorithms and protocols are described in this document. Also, multicast group protocol, multicast address formats and the current Internet multimedia bacbone MBone are shortly described.

1. Introduction 1.1 What is multicasting 1.2 Multicast routing
2. Where do we need multicasting 2.1 Net Radio and TV 2.2 Groupware 2.3 Resource Discovery
3. Addressing 3.1 IP version 4, IPv4 3.2 IP version 6, IPv6
4. Multicast Routing Algorithms and Protocols 4.1 Flooding 4.2 Spanning tree 4.3 Reverse-Path forwarding, RPF 4.4 RPF and prunes 4.5 Steiner trees 4.6 Core-based trees 4.7 DVMRP 4.8 MOSPF 4.9 PIM
5. IGMP
6. MBone
7. Other multicasting 7.1 IPv4 Option for Sender Directed Multi-Destinaton Delivery 7.2 Multicast Transport Protocol 7.3 MARS-based ATM multicasting 7.4 CLNP Multicasting
8. References
9. Glossary

1. Introduction

1.1 What is multicasting

One way to categorize communication is to use the number of receivers of the message or the packet of information.

Unicast: Delivering a packet to a single receiver
Anycast: Delivering a packet to one interface of a set of receivers
Multicast: Delivering a packet to a set of receivers
Broadcast: Delivering a packet to all receivers

IP version 4 has unicast, multicast and broadcast while the new IP version 6 has unicast, anycast and multicast. Originally in the IP world the receiver has been a host or a router, but it is more clear to identify the receiver as an interface in a host or a router.

There are a few different ways how multicasting could be implemented in a IP network (a datagram network): multiple unicast, multiple addresses in a packet or dedicated multicast addresses. Dedicated multicast adresses is the normal multicast implementation model in the IP protocol world.

Multiple unicast

This choice would waste network bandwith because there would be a instance of a packet for all recipients even if some or all of them have the same route, and the same links along the path. The sender must know the identity (the address) of all the recipients in this method. This is not generally a desired property in multicasting type applications, because a registering mecahnism is needed, each receiver must register itself to the sender. This also loads the senders resources with the registering information.

Multiple addresses in a packet

This would not consume as much network bandwith as multiple unicast, because there can be only one instance of the packet for all recipients. All recipients that have at least partially the same route are listed in a single packet. There is anyway a major problem in this method for IP networking, because there is a limit for the maximum size of a IP packet, 65535 bytes (octets). The limit is set by the 16-bit long packet length field in the IP header. In most link types, the maximum size is actually much smaller, for ethernet it is only 1500 bytes and for IEEE 802.3 it is 1492 bytes [12]. This limits the maximum number of recipient addresses in a single packet so this would only partially help in the bandwith wasting problem of the previous method. Another drawback in this method is that there would be much more processing in a router when there is a list of addresses instead of a single address.

Multicast addresses

This is the normal multicast implementation model in the IP protocol world. In this choice, the sender does not need to know the recipients so it is the lightest choice for the senders resources. Also, while the sender does not have the receiver's identity, we must use connectionless UDP on top of IP. The question of saving the network link bandwith is left to the routers and the routing policy for these multicast addresses becomes a separate issue from the normal unicast address routing. With good multicast routing algorithms and protocols, wasting bandwith can be avoided. This method needs a protocol for recivers and multicast aware routers to decide which hosts and routers need packets for which multicast adresses. This protocol is currently IGMP. The multicast address format is described in chapter 3.

1.2 Multicast routing

A router recongnizes the type of casting by the address format of the packet. Routing mechanism for unicast (and anycast) is conceptually very simple, there is always only one link to which a router sends a unicast packet. Broadcast is also a quite simple case.

Multicast routing, on the other hand, is far from simple. Hosts may join or leave a IP multicast group at any time, which means that the set of links where a router must send a multicast packet keeps changing over time. Simply sending multicast packets to all interfaces would waste bandwith, especially as multicast applications usually consume much bandwith (such as audio or video conferencing).

Growing popularity of the World Wide Web has (in the recent years) demonstrated the vulnerability of network resources as the use of the network grows fast. World Wide Web communication paradigm is simple query and response so it is easier to save bandwith with good cache policy than to save bandwith in group oriented communication. Now the popularity of the group oriented applications in MBone multicast network over Internet is growing and as most MBone applications are also bandwith hungry, good multicast routing as a bandwith saver is important and will be even more important as the number of users grows.

Various bandwith needs [2, 13] are:

Application Bandwith

Voice quality audio 32-64 kb/s

Low quality video 128 kb/s

NTCS video, uncompressed 27 Mb/s

Application	Bandwith
Voice quality audio	32-64 kb/s
Low quality video	128 kb/s
NTCS video, uncompressed	27 Mb/s

Finding out a good path (creating a routing table in a router) to a destination is not always easy. Calculating the path on the link cost basis is not always enough, the same example applications have a maximum delay that a good transmission can take ,and they also tolerate quite small jitter during a singe session [11]. So calculating paths could also be done on the link delay basis.

2. Where do we need multicasting

Watching a live NASA space shuttle lauch or watching Rolling Stones playing live on the other side of the world. Joining a IETF conference while in your office. Drawing on the same whiteboard with your colleagues even they are on the other side of the country.

All this is already today's reality on your computer desktop over the Internet. Previous examples show the most popular new applications in the Internet and the use of these is growing rapidly. These types of applications are also the most bandwith demanding and sevice quality critical applications in the Internet.

Multicasting is now the only cost effective solution to this bandwith need for most of these new applications. It will be even more important for IPv6, than it is today in IPv4, because these multicast groups may be very large as the popularity of these applications grows [8]. Ideally, multicast routing would save bandwith by adjusting dynamically the paths between senders and receivers and it would for example use the information about requested and available qualities of service of the links while calculating the paths.

2.1 Net Radio and TV

As current workstations and other personal computers have at least reasonable audio and video capabilities and network bandwith and connectivity have grown quite fast, moving multimedia type broadcasts to computer networks has been a natural development.

IP networks have one important difference compared to normal radio and tv networks, IP network is a router network. In normal radio and tv transmissions it is a perfectly good way to broadcast the transmissions, because there is enough bandwith to allocate different channels for different messages. As IP networking is mostly best effort networking, and different messages share the same channel, multicasting is needed to save network resources.

2.2 Groupware

A simple email communication inside a working group is a simple groupware techique, but it is not realtime communication. With multicasting realtime text, audio or videoconferencing becomes reasonable with current network resources. The communication in groupware compared with net radio and tv is not only from one sender to others but between some or all in the group. An electronic whiteboard is another example of groupware applications.

2.3 Resource Discovery

When a host needs to locate a server in the network, and it does not yet know the address or name of the server, it can simply send a broadcast message asking the service. An example of this kind of query is BootP, where a booting host asks it's network configuration from a BootP server. The server identifies the host with the link media address. This causes unwanted processing in all those hosts inside the broadcast area that do not provide the asked service. And this can even cause totally unneeded traffic in some links if the broadcast is wider than a single link. If the solicitation packets are sent to a specific multicast address instead of a multicast address, then hosts that do not provide the service ignore the packet at much lower level and less processing.

3. Addressing

Multicast addresses are defined as a bit pattern in the most significant end of the address both in the current version of IP (version 4) and in the next IP version (version 6).

3.1 IP version 4, IPv4

Host groups are identified in IPv4 by class D IP addresses [4]. Class D addresses have "1110" as their high-order four bits, so as expressed in the normal dotted desimal format, group addresses range from 224.0.0.0 to 239.255.255.255.

IPv4 multicast address format is:

4 bits 28 bits

1110 group ID

4 bits	28 bits
1110	group ID

Group address 224.0.0.0 is left unused. Some addresses are permanet group addresses, such as 224.0.0.1 is assigned to the permanent group of all IP hosts on the directly connected network. The addresses of other well-known, permanent groups are published in "Assigned Numbers" [9].

Examples of permanet groups are:

224.0.0.2 all routers on the subnet

224.0.0.9 RIP2 routers

3.2 IP version 6, IPv6

IPv6 multicast addresses have "11111111" as their high-order eight bits, so group addresses range as expressed in the IPv6 hex format from FF01:0:0:0:0:0:0:0 to FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF [7].

IPv6 multicast address format is:

8 bits 4 bits 4 bits 112 bits

11111111 flags scope group ID

8 bits	4 bits	4 bits	112 bits
11111111	flags	scope	group ID

The least significant bit of the flags tells if the address is permanent IANA assignet address. If it is 0 it is permanent, if it is 1 it is not permanently assigned. The scope field gives the scope of the multicast group, node-local, link-local, site-local, oranisation-local or global scope. As in IPv4, multicast addresses naturally must not be used as source addresses in IPv6 datagrams or appear in any routing header.

Examples of IPv6 permanent and RFC perdefined multicast addresses are:

FF01:0:0:0:0:0:0:1 All Nodes

FF02:0:0:0:0:0:0:1 All Nodes

FF01:0:0:0:0:0:0:2 All Routers

FF02:0:0:0:0:0:0:2 All Routers

FF02:0:0:0:0:0:0:C DHCP Server/Relay-Agent

FF02:0:0:0:0:1:XXXX:XXXX Solicited-Node Address

4. Multicast Routing Algorithms and Protocols

4.1 Flooding

In flooding a router that receives an IP packet with a multicast destination address simply sends it to all interfaces exept the interface where the packet came to the router. The router can try to find out if it has seen the same packet before to prevent routing loops. So flooding is an extremely simple algorithm, all packets are flooded through the whole network. [1]

Some application level routing such as news and OSPF use this method. News uses a article path history and OSPF uses link state database to detect if the message is received for the first time. At the IP level, it is quite difficult, or at least resource consuming, to find out that information. Using a list of last seen packets would need a lot of memory in current high speed routers and the checking if the packet is in the list would slow down the router. IP TTL is usually the best way to take care of this problem.

Flooding is a very robust algorithm, but the plain flooding, at least at the IP level, is too greedy on router memory resources and link bandwith resources. It is only suitable in very small networks.

4.2 Spanning tree

Building a logical or overlay network on top of the real network by creating a loopless graph between all nodes resolves the looping problem in the flooding [1]. Otherwise, spanning tree has the same good points and drawbacks as flooding, being a bandwith waster, but robust and simple to implement.

Spanning tree is used, for instance, in bridges to connect link level non routing protocol links, such as IEEE 802.

4.3 Reverse-Path forwarding, RPF

The basic idea of RPF is to generate an implicit spanning tree for each source in the network starting from the source to other nodes [1]. Routers along the path are also on the sortest path from that source to the node, so OSPF routers fit well into this algorithm.

Good point in RPF is that using shortest paths gives the fastest possible delivery. Also RPF does not concentrate the traffic on some network links because each source has a spanning tree of its own. RPF has the same bad point as the previous algorithms, they do not really use the group membership information, so they all waste bandwith. Group membership can be taken into account only in the leaf links, but not in the router network.

4.4 RPF and Prunes

This is a modification of the reverse path forwarding routing algorithm that takes group membership into account [1]. When the first packet in a multicast transmission reaches the end leaves in the routing tree, the leaf router sends a pruning message upstream if it does not have any group members attached to it. Likewise, if a any router in the tree receives a prune message from all of its downstream interfaces it sends a prune message upstream. The purpose of the prune message is to prevent sending unneeded following packets in that group to the pruned branch.

A node leaving a group is handled easily, if it is the last node behind a leaf router a pruning message is sent upstream, otherwise there is no need to changes. Nodes joining a group can be handled by removing the pruning information periodically from the routers. Each refresh causes a flooding message, so if the period is too short we waste bandwith, but half of the period defines an average waiting time for a node between joining a group and actually receiving any packets, it should not be too long either.

There is still a one bad point in the way group membership information is used - the first packet in a group is always flooded in the whole network. All routers keep state of the pruning for each group and each interface.

4.5 Steiner trees

The idea of steiner trees is to build an overlay network that connects all nodes in a group with minimum total number of links [1]. This looks like a good alternative but it is not usually suitable in real networks because the computing of the tree is hard and it must be done again each time a node joins or leaves a group.

4.6 Core-based trees

Each multicast group has a core, a fixed point, where all nodes that want to join the group send their join messages. A router along the path from a node to the core marks the interface to belong to the tree, so also in this algorithm we build a spanning tree per multicast group. [1]

Because in this algorithm there is one spanning tree per group,the traffic of one group concentrates on certain links. The concentration of the whole multicasting traffic depends on where all the cores in the network are. The good point in this algorithm, compared to the previous algorithms, is that there is not that first flood message in a group, the packets are transmitted only to those hosts that belong to the group.

4.7 DVMRP

The Distance Vector Multicast Routing Protocol [3], DVMRP, is a RIP style distance vector routing protocol. The algorithm that it uses is RPF. The first versions of mrouted software, that were used in MBone, used this protocol. The 1993 version started to use RPF and Prunes as the algorithm.

4.8 MOSPF

MOSPF is a multicast extension to the OSPF routing protocol [15, 6]. As OSPF is a intra domain or intra AS routing protocol so MOSPF is also targeted to be a intra domain or intra AS multicasting protocol.

OSPF routers have a link state database that describes the network. By adding the group membership data to the link information, a router can compute a pruned tree for each source and member nodes. This makes it possible to avoid flooding the first packet, and so the MOSPF is even better than plain RPF and Prunes algorithm.

OSPF allows an AS to be divided into areas. As routers have complete link state information of only their own area, the whole AS tree is only an approximation based on summary information. Area border routers advertise a summary of groups in the area to the OSPF backbone. External router is a part of all groups in a OSPF network.

The drawback in MOSPF is that the route computations are costly. This can be helped by computing them only on demand.

4.9 PIM

The Protocol Independent Multcast (PIM) is targeted to have good scalability by having two different modes. The two modes are the dense mode and the sparse mode [1, 10, 21]. The modes differ in the density of group members in the target area.

The dense mode uses the RPF and Prunes algorithm but is simpler than DVMRP, as it does not compute multicast specific routes, it only uses the pruning information. The good point of this algorithm is that the implementation is very light and as the density is supposed to be high, the overhead of the flooding of the first packet is not bad.

The sparse mode is close to the Core-based trees algorithm, but instead of a single core it uses multiple redezvous points.

PIM is still at "working draft" level, but there already exists an implementation by a commercial router vendor.

5. IGMP

IP service interface in a multicast capable host provides JoinHostGroup and LeaveHostGroup operations for the upper software layers [4]. These operations map to the join a group and leave a group operations of a multicast based user process or program.

The Internet Group Management Protocol (IGMP) is used between IP hosts and multicast IP routers to share information about group memberships in a link [4]. Instead of those previous two operations IGMP has two message types, Host Membership Query and a Host Membership Report.

When a host joins a group it sends a "Host Membership Report" message which information is noted in the router. When a host leaves a group it does not need to send any messages. Leaving a group is handled by a router which periodically sends a "Host Membership Query" to the link, and all group membership information is replied by the other machines with "Host Membership Report" messages. Using this periodic querying resolves the problem of hanging information of group membership after, for instance, the users multicast program or the whole host has crashed [14].

6. MBone

The MBone network is a virtual multicasting network on top of the Internet. It started as an experiment to multicast IETF meeting audio in March 1992 and in July 1992 over Internet. The MBone topology is a combination of mesh and tree [2, 13].

As multicasting is still rather new idea, all routers in the Internet are not yet multicast capable. So the MBone uses tunneling over routers that are not multicast capable. The MBone tunnels are currently done with IP-in-IP encapsulation. Tunnels have also been done with the help of loose source routing, but it consumes more router processing resources. There has also been multicasting tunnels in the Internet before MBone [13].

Current MBone protocol is DVMRP and the algorithm is RPF and Prunes. Earlier versions have used plain RPF and trucated broadcasts in leaf nets instead pruning. Currently inter domain protocol could be PIM or MOSPF.

The Session Directory program sd is used by MBone users to reserve new and browse current media channels in the net. It presents a simple list of ongoing sessions and the user can start the appropiate program to receive the session just by clicking the item in the list. Current popular applications are NetVideo nv and WhiteBoard wb.

7. Other multicasting

There are also other multicasting methods avilable than the normal UPD/IP multicasting with multicast addresses. Using the normal method naturally does not prevent using other methods at the same time.

7.1 IPv4 Option for Sender Directed Multi-Destinaton Delivery

IPv4 Option for Sender Directed Multi-Destinaton Delivery provides a unreliable UDP/IP delivery to a set of recipients that are listed in the option field of an IPv4 datagram [19]. It originated in the U.S. Army to provide a basis for a reliable sender directed multidestination transmission. The reliability is added at the application layer. Maximum number of receiver addresses that fit in the IP option field is 9 IP addresses.

7.2 Multicast Transport Protocol

The normal IP multicast uses connectionless UDP. Multicast Transport Protocol [18], MTP is a connection oriented and reliable multicast protocol. It does not use TCP but is a protocol directly on top of IP. The multicast group is called a web in the protocol, each web has one master that instantiates and controls the behavior of the web. Other users are either producers or consumers depending on if they are allowed to transmit user data or not. MTP uses a negative acknowledgement protocol. There are also other protocols for reliable multicast [5].

7.3 MARS-based ATM multicasting

As many multicasting applications need a lot bandwith, ATM would seem to be an attractive choice, but ATM is connection oriented and does not directly fit into the normal IP multicasting. One mechanism to implement IP multicasting is by using a Multicast Address Resolution Server (MARS). A MARS server works as a group information holder in one multicast cluster. A node willing to send to a group asks the list of group members from the MARS server and creates point to multipoint unidirectional VCs. The server can actually return ATM level multicast servers (MCS) instead of the real group members and the MCSs create VCs to receivers. MARS-based multicasting in ATM is still at the "working draft" level [20, 16].

7.4 CLNP Multicasting

There is a experimental specification for ISO IP world for similar multicast method as the normal UDP/IP multicasting [17]. It is a transmission of a CLNP datagram to end systems (interfaces or hosts in the IP terminology) that are identified with single group network address (multicast address in IP terminology). The communication between hosts and multicast capable routers (IGMP in the IP world) is done with a extension to the ES-IS protocol.

References

[1]: C. Huitema. Routing in the Internet, Prentice Hall, (ISBN 1-201-592-2498) 1995
[2]: Frequently Asked Questions (FAQ) on the Multicast Backbone (MBONE)
[3]: D. Waitzman, C. Partridge, S. Deering. RFC 1075, Distance Vector Multicast Routing Protocol, November 1988
[4]: S. Deering. RFC 1112, Host Extensions for IP Multicasting, November 1989
[5]: R. Braudes, S. Zabele. RFC 1458, Requirements for Multicast Protocols, May 1993
[6]: J. Moy. RFC 1585, MOSPF: Analysis and Experience, March 1994
[7]: R. Hinden, S. Deering. RFC 1884, IP Version 6 Addressing Architecture, December 1995
[8]: C. Partridge, F. Kastenholz. RFC 1726, Technical Criteria for Choosing IP The Next Generation (IPng), December 1994
[9]: J. Reynolds, J. Postel. RFC 1700, ASSIGNED NUMBERS, October 1994
[10]: Estrin, Farinacci, Jacobson, Liu, Wei, Sharma, Helmy. INTERNET-DRAFT, Protocol Independent Multicast-Dense Mode (PIM-DM): Protocol Specification January 1996
[11]: J. Pascuale, G. Polyzos, V. Kompella. The Multimedia Multicasting Problem
[12]: J. Mogul, S. Deering. RFC 1191, Path MTU Discovery, November 1990
[13]: V. Kumar. MBone: Interactive Multimedia on the Internet, New Riders Publishing, (ISBN 1-56205-397-3) 1995
[14]: W.R. Stevens. TCP/IP Illustrated: the protocols, Addison-Wesley Publishing Company, (ISBN 0-201-63346-9) 1994
[15]: J. Moy. RFC 1584, Multicast Extensions to OSPF, March 1994
[16]: Talpade, Ammar. INTERNET-DRAFT, Multicast Server Architectures for MARS-based ATM multicasting, March 1996
[17]: D. Marlow. RFC 1768, Host Group Extensions for CLNP Multicasting, March 1996
[18]: S. Armstrong, A. Freier, K. Marzullo. RFC 1301, Multicast Transport Protocol, February 1992
[19]: C. Graff. RFC 1770, IPv4 Option for Sender Directed Multi-Destination Delivery, March 1995
[20]: G. Armitage. INTERNET-DRAFT, Support for Multicast over UNI 3.0/3.1 based ATM Networks, February 1996
[21]: Deering, Estrin, Farinacci, Jacobson, Liu, Wei, Sharma. INTERNET-DRAFT, Protocol Independent Multicast-Sparse Mode (PIM-SM): Protocol Specification September 1995

9. Glossary

IP Internet Protocol, RFC 791

TCP Transmission Control Protocol, RFC 793

UDP User Datagram Protocol, RFC 768

IEEE Institute of Electrical and Electronics Engineer

IETF Internet Engineering Task Force

NASA National Aeronautics and Space Administration

BootP Bootstrap Protocol, RFC 951

AS Autonomous System

TTL Time To Live

RIP Routing Information Protocol, RFC 1058

RIP2 Routing Information Protocol, Version 2, RFC 1388

DHCP Dynamic Host Configuration Protocol, RFC 1531

ATM Asynchronous Transfer Mode

VC ATM Virtual Channel

ISO International Organization for Standardization

CLNP Connectionless Network Protocol

ES-IS End System-to-Indermediate System Protocol

Jorma Paananen
Jorma.Paananen@tele.fi

FF01:0:0:0:0:0:0:1	All Nodes
FF02:0:0:0:0:0:0:1	All Nodes
FF01:0:0:0:0:0:0:2	All Routers
FF02:0:0:0:0:0:0:2	All Routers
FF02:0:0:0:0:0:0:C	DHCP Server/Relay-Agent
FF02:0:0:0:0:1:XXXX:XXXX	Solicited-Node Address

IP	Internet Protocol, RFC 791
TCP	Transmission Control Protocol, RFC 793
UDP	User Datagram Protocol, RFC 768
IEEE	Institute of Electrical and Electronics Engineer
IETF	Internet Engineering Task Force
NASA	National Aeronautics and Space Administration
BootP	Bootstrap Protocol, RFC 951
AS	Autonomous System
TTL	Time To Live
RIP	Routing Information Protocol, RFC 1058
RIP2	Routing Information Protocol, Version 2, RFC 1388
DHCP	Dynamic Host Configuration Protocol, RFC 1531
ATM	Asynchronous Transfer Mode
VC	ATM Virtual Channel
ISO	International Organization for Standardization
CLNP	Connectionless Network Protocol
ES-IS	End System-to-Indermediate System Protocol