VXLAN

VXLAN is a network virtualization technology that attempts to address the scalability problems associated with large cloud computing deployments. It uses a VLAN-like encapsulation technique to encapsulate OSI layer 2 Ethernet frames within layer 4 UDP datagrams, using 4789 as the default IANA-assigned destination UDP port number. VXLAN endpoints, which terminate VXLAN tunnels and may be either virtual or physical switch ports, are known as VTEPs.

VXLAN is an evolution of efforts to standardize on an overlay encapsulation protocol. It increases scalability up to 16 million logical networks and allows for layer 2 adjacency across IP networks. Multicast or unicast with head-end replication (HER) is used to flood broadcast, unknown unicast, and multicast (BUM) traffic.

The VXLAN specification was originally created by VMware, Arista Networks and Cisco. Other backers of the VXLAN technology include Huawei, Broadcom, Citrix, Pica8, Big Switch Networks, Cumulus Networks, Dell EMC, Ericsson, Mellanox, FreeBSD, OpenBSD, Red Hat, Joyent, and Juniper Networks.

VXLAN was officially documented by the IETF in RFC 7348.

If configuring VXLAN in a VyOS virtual machine, ensure that MAC spoofing (Hyper-V) or Forged Transmits (ESX) are permitted, otherwise forwarded frames may be blocked by the hypervisor.

Note

As VyOS is based on Linux and there was no official IANA port assigned for VXLAN, VyOS uses a default port of 8472. You can change the port on a per VXLAN interface basis to get it working across multiple vendors.

Configuration

Common interface configuration

set interfaces vxlan <interface> address <address>

Configure interface <interface> with one or more interface addresses.

  • address can be specified multiple times as IPv4 and/or IPv6 address, e.g. 192.0.2.1/24 and/or 2001:db8::1/64

Example:

set interfaces vxlan vxlan0 address 192.0.2.1/24
set interfaces vxlan vxlan0 address 2001:db8::1/64
set interfaces vxlan <interface> description <description>

Set a human readable, descriptive alias for this connection. Alias is used by e.g. the show interfaces command or SNMP based monitoring tools.

Example:

set interfaces vxlan vxlan0 description 'This is an awesome interface running on VyOS'
set interfaces vxlan <interface> disable

Disable given <interface>. It will be placed in administratively down (A/D) state.

Example:

set interfaces vxlan vxlan0 disable
set interfaces vxlan <interface> disable-flow-control

Ethernet flow control is a mechanism for temporarily stopping the transmission of data on Ethernet family computer networks. The goal of this mechanism is to ensure zero packet loss in the presence of network congestion.

The first flow control mechanism, the pause frame, was defined by the IEEE 802.3x standard.

A sending station (computer or network switch) may be transmitting data faster than the other end of the link can accept it. Using flow control, the receiving station can signal the sender requesting suspension of transmissions until the receiver catches up.

Use this command to disable the generation of Ethernet flow control (pause frames).

Example:

set interfaces vxlan vxlan0 disable-flow-control
set interfaces vxlan <interface> disable-link-detect

Use this command to direct an interface to not detect any physical state changes on a link, for example, when the cable is unplugged.

Default is to detects physical link state changes.

Example:

set interfaces vxlan vxlan0 disable-link-detect
set interfaces vxlan <interface> mac <xx:xx:xx:xx:xx:xx>

Configure user defined MAC address on given <interface>.

Example:

set interfaces vxlan vxlan0 mac '00:01:02:03:04:05'
set interfaces vxlan <interface> mtu <mtu>

Configure MTU on given <interface>. It is the size (in bytes) of the largest ethernet frame sent on this link.

Example:

set interfaces vxlan vxlan0 mtu 9000
set interfaces vxlan <interface> ip arp-cache-timeout

Once a neighbor has been found, the entry is considered to be valid for at least for this specifc time. An entry’s validity will be extended if it receives positive feedback from higher level protocols.

This defaults to 30 seconds.

Example:

set interfaces vxlan vxlan0 ip arp-cache-timeout 180
set interfaces vxlan <interface> ip disable-arp-filter

If set the kernel can respond to arp requests with addresses from other interfaces. This may seem wrong but it usually makes sense, because it increases the chance of successful communication. IP addresses are owned by the complete host on Linux, not by particular interfaces. Only for more complex setups like load-balancing, does this behaviour cause problems.

If not set (default) allows you to have multiple network interfaces on the same subnet, and have the ARPs for each interface be answered based on whether or not the kernel would route a packet from the ARP’d IP out that interface (therefore you must use source based routing for this to work).

In other words it allows control of which cards (usually 1) will respond to an arp request.

Example:

set interfaces vxlan vxlan0 ip disable-arp-filter
set interfaces vxlan <interface> ip disable-forwarding

Configure interface-specific Host/Router behaviour. If set, the interface will switch to host mode and IPv6 forwarding will be disabled on this interface.

set interfaces vxlan vxlan0 ip disable-forwarding
set interfaces vxlan <interface> ip enable-arp-accept

Define behavior for gratuitous ARP frames who’s IP is not already present in the ARP table. If configured create new entries in the ARP table.

Both replies and requests type gratuitous arp will trigger the ARP table to be updated, if this setting is on.

If the ARP table already contains the IP address of the gratuitous arp frame, the arp table will be updated regardless if this setting is on or off.

set interfaces vxlan vxlan0 ip enable-arp-accept
set interfaces vxlan <interface> ip enable-arp-announce

Define different restriction levels for announcing the local source IP address from IP packets in ARP requests sent on interface.

Use any local address, configured on any interface if this is not set.

If configured, try to avoid local addresses that are not in the target’s subnet for this interface. This mode is useful when target hosts reachable via this interface require the source IP address in ARP requests to be part of their logical network configured on the receiving interface. When we generate the request we will check all our subnets that include the target IP and will preserve the source address if it is from such subnet. If there is no such subnet we select source address according to the rules for level 2.

set interfaces vxlan vxlan0 ip enable-arp-announce
set interfaces vxlan <interface> ip enable-arp-ignore

Define different modes for sending replies in response to received ARP requests that resolve local target IP addresses:

If configured, reply only if the target IP address is local address configured on the incoming interface.

If this option is unset (default), reply for any local target IP address, configured on any interface.

set interfaces vxlan vxlan0 ip enable-arp-ignore
set interfaces vxlan <interface> ip enable-proxy-arp

Use this command to enable proxy Address Resolution Protocol (ARP) on this interface. Proxy ARP allows an Ethernet interface to respond with its own MAC address to ARP requests for destination IP addresses on subnets attached to other interfaces on the system. Subsequent packets sent to those destination IP addresses are forwarded appropriately by the system.

Example:

set interfaces vxlan vxlan0 ip enable-proxy-arp
set interfaces vxlan <interface> ip proxy-arp-pvlan

Private VLAN proxy arp. Basically allow proxy arp replies back to the same interface (from which the ARP request/solicitation was received).

This is done to support (ethernet) switch features, like RFC 3069, where the individual ports are NOT allowed to communicate with each other, but they are allowed to talk to the upstream router. As described in RFC 3069, it is possible to allow these hosts to communicate through the upstream router by proxy_arp’ing.

Note

Does not need to be used together with proxy_arp.

This technology is known by different names:

  • In RFC 3069 it is called VLAN Aggregation
  • Cisco and Allied Telesyn call it Private VLAN
  • Hewlett-Packard call it Source-Port filtering or port-isolation
  • Ericsson call it MAC-Forced Forwarding (RFC Draft)
set interfaces vxlan <interface> ip source-validation <strict | loose | disable>

Enable policy for source validation by reversed path, as specified in RFC 3704. Current recommended practice in RFC 3704 is to enable strict mode to prevent IP spoofing from DDos attacks. If using asymmetric routing or other complicated routing, then loose mode is recommended.

  • strict: Each incoming packet is tested against the FIB and if the interface is not the best reverse path the packet check will fail. By default failed packets are discarded.
  • loose: Each incoming packet’s source address is also tested against the FIB and if the source address is not reachable via any interface the packet check will fail.
  • disable: No source validation
set interfaces vxlan <interface> ipv6 address autoconf

SLAAC RFC 4862. IPv6 hosts can configure themselves automatically when connected to an IPv6 network using the Neighbor Discovery Protocol via ICMPv6 router discovery messages. When first connected to a network, a host sends a link-local router solicitation multicast request for its configuration parameters; routers respond to such a request with a router advertisement packet that contains Internet Layer configuration parameters.

Note

This method automatically disables IPv6 traffic forwarding on the interface in question.

Example:

set interfaces vxlan vxlan0 ipv6 address autoconf
set interfaces vxlan <interface> ipv6 address eui64 <prefix>

EUI-64 as specified in RFC 4291 allows a host to assign iteslf a unique 64-Bit IPv6 address.

Example:

set interfaces vxlan vxlan0 ipv6 address eui64 2001:db8:beef::/64
set interfaces vxlan <interface> ipv6 address no-default-link-local

Do not assign a link-local IPv6 address to this interface.

Example:

set interfaces vxlan vxlan0 ipv6 address no-default-link-local
set interfaces vxlan <interface> ipv6 disable-forwarding

Configure interface-specific Host/Router behaviour. If set, the interface will switch to host mode and IPv6 forwarding will be disabled on this interface.

Example:

set interfaces vxlan vxlan0 ipv6 disable-forwarding
set interfaces vxlan <interface> vrf <vrf>

Place interface in given VRF instance.

See also

There is an entire chapter about how to configure a VRF, please check this for additional information.

Example:

set interfaces vxlan vxlan0 vrf red

VXLAN specific options

set interfaces vxlan <interface> vni <number>
Each VXLAN segment is identified through a 24-bit segment ID, termed the VNI, This allows up to 16M VXLAN segments to coexist within the same administrative domain.
set interfaces vxlan <interface> port <port>

Configure port number of remote VXLAN endpoint.

Note

As VyOS is Linux based the default port used is not using 4789 as the default IANA-assigned destination UDP port number. Instead VyOS uses the Linux default port of 8472.

set interfaces vxlan <interface> source-address <interface>
Source IP address used for VXLAN underlay. This is mandatory when using VXLAN via L2VPN/EVPN.

Unicast

set interfaces vxlan <interface> remote <address>
IPv4/IPv6 remote address of the VXLAN tunnel. Alternative to multicast, the remote IPv4/IPv6 address can set directly.

Multicast

set interfaces vxlan <interface> source-interface <interface>
Interface used for VXLAN underlay. This is mandatory when using VXLAN via a multicast network. VXLAN traffic will always enter and exit this interface.
set interfaces vxlan <interface> group <address>

Multicast group address for VXLAN interface. VXLAN tunnels can be built either via Multicast or via Unicast.

Both IPv4 and IPv6 multicast is possible.

Multicast VXLAN

Topology: PC4 - Leaf2 - Spine1 - Leaf3 - PC5

PC4 has IP 10.0.0.4/24 and PC5 has IP 10.0.0.5/24, so they believe they are in the same broadcast domain.

Let’s assume PC4 on Leaf2 wants to ping PC5 on Leaf3. Instead of setting Leaf3 as our remote end manually, Leaf2 encapsulates the packet into a UDP-packet and sends it to its designated multicast-address via Spine1. When Spine1 receives this packet it forwards it to all other Leafs who has joined the same multicast-group, in this case Leaf3. When Leaf3 receives the packet it forwards it, while at the same time learning that PC4 is reachable behind Leaf2, because the encapsulated packet had Leaf2’s IP-address set as source IP.

PC5 receives the ping echo, responds with an echo reply that Leaf3 receives and this time forwards to Leaf2’s unicast address directly because it learned the location of PC4 above. When Leaf2 receives the echo reply from PC5 it sees that it came from Leaf3 and so remembers that PC5 is reachable via Leaf3.

Thanks to this discovery, any subsequent traffic between PC4 and PC5 will not be using the multicast-address between the Leafs as they both know behind which Leaf the PCs are connected. This saves traffic as less multicast packets sent reduces the load on the network, which improves scalability when more Leafs are added.

For optimal scalability Multicast shouldn’t be used at all, but instead use BGP to signal all connected devices between leafs. Unfortunately, VyOS does not yet support this.

Example

The setup is this: Leaf2 - Spine1 - Leaf3

Spine1 is a Cisco IOS router running version 15.4, Leaf2 and Leaf3 is each a VyOS router running 1.2.

This topology was built using GNS3.

Topology:

Spine1:
fa0/2 towards Leaf2, IP-address: 10.1.2.1/24
fa0/3 towards Leaf3, IP-address: 10.1.3.1/24

Leaf2:
Eth0 towards Spine1, IP-address: 10.1.2.2/24
Eth1 towards a vlan-aware switch

Leaf3:
Eth0 towards Spine1, IP-address 10.1.3.3/24
Eth1 towards a vlan-aware switch

Spine1 Configuration:

conf t
ip multicast-routing
!
interface fastethernet0/2
 ip address 10.1.2.1 255.255.255.0
 ip pim sparse-dense-mode
!
interface fastethernet0/3
 ip address 10.1.3.1 255.255.255.0
 ip pim sparse-dense-mode
!
router ospf 1
 network 10.0.0.0 0.255.255.255 area 0

Multicast-routing is required for the leafs to forward traffic between each other in a more scalable way. This also requires PIM to be enabled towards the Leafs so that the Spine can learn what multicast groups each Leaf expect traffic from.

Leaf2 configuration:

set interfaces ethernet eth0 address '10.1.2.2/24'
set protocols ospf area 0 network '10.0.0.0/8'

! Our first vxlan interface
set interfaces bridge br241 address '172.16.241.1/24'
set interfaces bridge br241 member interface 'eth1.241'
set interfaces bridge br241 member interface 'vxlan241'

set interfaces vxlan vxlan241 group '239.0.0.241'
set interfaces vxlan vxlan241 source-interface 'eth0'
set interfaces vxlan vxlan241 vni '241'

! Our seconds vxlan interface
set interfaces bridge br242 address '172.16.242.1/24'
set interfaces bridge br242 member interface 'eth1.242'
set interfaces bridge br242 member interface 'vxlan242'

set interfaces vxlan vxlan242 group '239.0.0.242'
set interfaces vxlan vxlan242 source-interface 'eth0'
set interfaces vxlan vxlan242 vni '242'

Leaf3 configuration:

set interfaces ethernet eth0 address '10.1.3.3/24'
set protocols ospf area 0 network '10.0.0.0/8'

! Our first vxlan interface
set interfaces bridge br241 address '172.16.241.1/24'
set interfaces bridge br241 member interface 'eth1.241'
set interfaces bridge br241 member interface 'vxlan241'

set interfaces vxlan vxlan241 group '239.0.0.241'
set interfaces vxlan vxlan241 source-interface 'eth0'
set interfaces vxlan vxlan241 vni '241'

! Our seconds vxlan interface
set interfaces bridge br242 address '172.16.242.1/24'
set interfaces bridge br242 member interface 'eth1.242'
set interfaces bridge br242 member interface 'vxlan242'

set interfaces vxlan vxlan242 group '239.0.0.242'
set interfaces vxlan vxlan242 source-interface 'eth0'
set interfaces vxlan vxlan242 vni '242'

As you can see, Leaf2 and Leaf3 configuration is almost identical. There are lots of commands above, I’ll try to into more detail below, command descriptions are placed under the command boxes:

set interfaces bridge br241 address '172.16.241.1/24'

This commands creates a bridge that is used to bind traffic on eth1 vlan 241 with the vxlan241-interface. The IP-address is not required. It may however be used as a default gateway for each Leaf which allows devices on the vlan to reach other subnets. This requires that the subnets are redistributed by OSPF so that the Spine will learn how to reach it. To do this you need to change the OSPF network from ‘10.0.0.0/8’ to ‘0.0.0.0/0’ to allow 172.16/12-networks to be advertised.

set interfaces bridge br241 member interface 'eth1.241'
set interfaces bridge br241 member interface 'vxlan241'

Binds eth1.241 and vxlan241 to each other by making them both member interfaces of the same bridge.

set interfaces vxlan vxlan241 group '239.0.0.241'

The multicast-group used by all Leafs for this vlan extension. Has to be the same on all Leafs that has this interface.

set interfaces vxlan vxlan241 source-interface 'eth0'

Sets the interface to listen for multicast packets on. Could be a loopback, not yet tested.

set interfaces vxlan vxlan241 vni '241'

Sets the unique id for this vxlan-interface. Not sure how it correlates with multicast-address.

set interfaces vxlan vxlan241 port 12345

The destination port used for creating a VXLAN interface in Linux defaults to its pre-standard value of 8472 to preserve backwards compatibility. A configuration directive to support a user-specified destination port to override that behavior is available using the above command.

Unicast VXLAN

Alternative to multicast, the remote IPv4 address of the VXLAN tunnel can be set directly. Let’s change the Multicast example from above:

# leaf2 and leaf3
delete interfaces vxlan vxlan241 group '239.0.0.241'
delete interfaces vxlan vxlan241 source-interface 'eth0'

# leaf2
set interface vxlan vxlan241 remote 10.1.3.3

# leaf3
set interface vxlan vxlan241 remote 10.1.2.2

The default port udp is set to 8472. It can be changed with set interface vxlan <vxlanN> port <port>