Overview
Join Jeff Kish as he dives into VXLAN and EVPN, two technologies that are crucial to modern data centers.
Gain an understanding of VXLAN VNIs, Leaf-Spine, VXLAN Data Plane, and VXLAN Control Plane options: Flood and Learn, and EVPN.
Intro
Welcome to Explain VXLAN and EVPN!
VXLAN and VNIs
In this session we cover VXLAN and VNIs.
Knowledge Check
How many bits are in a Virtual Network Identifier (VNI)?
- A. 12
- B. 16
- C. 20
- D. 24
- E. 28
Verify your team's readiness — Request a Demo to verify practice assessments, completion reporting, and CSV / SCORM exports on the Team plan.
VXLAN and Spine-Leaf
VXLAN integrates well with Spine-Leaf.
Knowledge Check
How can the VXLAN infrastructure be extended into the virtual space?
- A. By building VXLAN tunnels to virtual switches
- B. By establishing L3 routing to virtual routers
- C. By connecting virtual hosts to spine switches
- D. By eliminating VLANs in the virtual space
VXLAN Data Plane
VXLAN is primarily a Data Plane protocol.
Knowledge Check
Which two types of encapsulation are used by VXLAN? (Choose two)
- A. IP
- B. UDP
- C. TCP
- D. RTP
- E. MAC
VXLAN Control Plane - Flood and Learn
When VXLAN lacks a control plane protocol, it uses flood and learn.
Knowledge Check
When using flood-and-learn, when does a leaf switch learn about a MAC address on a remote leaf switch?
- A. When it receives a packet from the end host
- B. When it receives a packet from the remote leaf switch
- C. When it receives an update from the remote leaf switch
- D. When it receives an update from the spine switch
VXLAN Control Plane - EVPN
EVPN can be deployed as a VXLAN control plane protocol.
Knowledge Check
Which type of traffic is reduced when using EVPN as a VXLAN control plane?
- A. Unknown Unicast
- B. Known Unicast
- C. Multicast
- D. Broadcast
Review and Quiz
Let's review VXLAN and EVPN!
Knowledge Check
What best describes VXLAN?
- A. A data plane protocol
- B. A control plane protocol
- C. A management plane protocol
- D. An orchestration plane protocol
Conclusion
I hope this has been informative for you and I would like to thank you for viewing.
View Transcript
Intro
0:00[MUSIC PLAYING]
0:05Welcome to Explain VXLAN and EVPN.
0:08In this video series, we're going to dive deep
0:10into this concept of VXLAN, because we've
0:12been talking about it quite a few times
0:14throughout this skill.
0:14We've mentioned that we're going to use it
0:16in a leaf-spine architecture, we're
0:18going to create these tunnels between these switches.
0:21And somehow we're going to transport our Layer
0:232 across Layer 3 boundaries.
0:24And so, this whole concept of "somehow"
0:26needs to go away, because we need to dive deep
0:28and understand how exactly VXLAN's accomplishing that.
0:31Furthermore, VXLAN operates at Layer 2.
0:34And if we think about how we do control plane at Layer 2,
0:37we think about flood and learn mechanisms,
0:39and that's exactly how VXLAN was designed to operate as well.
0:41However, in our data centers, flood and learn
0:43isn't usually the most efficient way of doing things,
0:46and so we probably are going to want to look at a control plane
0:49mechanism like EVPN.
0:51And so we'll take a look at that later on as well.
0:53So not only the VXLAN at the data plane,
0:55but VXLAN at the control plane.
0:57Those are topics we're going to need to cover and understand
0:59for our DCID.
1:00And with that, I will see you in the next video.
VXLAN and VNIs
0:00[MUSIC PLAYING]
0:05Most of our conversation of VXLAN to this point
0:07has been about the fact that we can
0:09use VXLAN to extend layer 2 connectivity across the layer 3
0:12boundary.
0:13VXLAN does more than just that.
0:14VXLAN is going to provide us with a whole new way
0:17of identifying our broadcast domains.
0:19VLANs are limited to 12 bits.
0:22We only get 4,096 VLANs, minus a few reserved ones.
0:26And so we don't get a whole lot of our VLANs--
0:29we don't get a whole lot of broadcast domains
0:31in our data centers.
0:32For a small, medium-sized data center, four thousand's a lot.
0:35But for a service provider trying
0:37to service potentially thousands of customers,
0:394,000 is very limiting.
0:41And so, fortunately for us, VXLAN
0:43is going to expand that up to 16 million different network
0:46segments that we call virtual network identifiers.
0:48So let's dive in and see what these VNIs are all about.
0:51Data centers are all about layer 2.
0:53I think we're pretty comfortable with that concept
0:55at this point.
0:56And so there's going to be a lot of broadcast domains inside
0:58of our data centers.
1:00And so, as mentioned in the intro,
1:01we're going to divide those broadcast
1:03domains among different virtual LANs, otherwise known as VLANs.
1:06We've been configuring VLANs since we
1:08were pretty early on in our networking careers, for most of us.
1:11We're comfortable with this concept.
1:13We've got over 4,000 VLANs because we assign 12 bits
1:16to our VLAN identifiers.
1:18And this sounds like a lot.
1:19If you're working for a smaller organization or maybe
1:22even an enterprise-level organization,
1:244,000 VLANs in a single space, a layer 2 domain, is a lot.
1:29But, as mentioned, some of our larger data
1:31centers, it's nowhere near enough,
1:33especially when you think about a service provider.
1:35If we've got a service provider, for example,
1:37that has let's just say 4,000 customers, well, 4,000
1:41customers, we don't have a whole lot of VLANs
1:43that we can assign to them.
1:44And so we've been using a lot of tricks over the years.
1:46For example, we've got private VLANs.
1:48Certainly we could place one organization
1:50into a single private VLAN and then
1:52try to extend a bunch of sub VLANs within that organization.
1:56But that's not really scalable and doesn't work very well.
1:59And so we've done other things, like try to isolate our layer
2:022 domains a little bit more.
2:03So we talked about the pod concept.
2:06And so every pod, basically, is going
2:08to be given the 4,000 VLANs.
2:10And maybe we're not out of space at the aggregation layer,
2:13like we've talked about.
2:14And instead, we've just exceeded our 4,000 VLANs.
2:16And so we need to create a new pod, which means a new layer 3
2:19domain.
2:20And so going back to this 4,000 number,
2:21we recognize that if we've got 4,000 customers, without doing
2:25anything else, we're only going to get
2:26a single VLAN per customer, which just isn't enough.
2:30We know that a lot of our customers
2:31are going to require, potentially,
2:33a dozen VLANs, potentially more depending
2:36on how much of a deployment that they're
2:38putting into our data center.
2:40Now, fortunately for us, we understand
2:41that VXLAN at this point is going
2:43to integrate primarily into a leaf and spine architecture.
2:45It doesn't strictly speaking require
2:47that physical architecture.
2:48We'll go ahead and draw this out as spine and leaf.
2:50And so we've got our spine switches connecting down
2:53to our leaf switches.
2:54And this domain in here we call the fabric.
2:56The fabric is layer 3.
2:58So we don't have to worry about broadcast domains or broadcast
3:00storms and Spanning Tree Protocol or any of that.
3:03The fabric is layer 3.
3:05So that is wonderful from a scaling perspective.
3:07And then we are going to have our 4,000 VLANs extending down
3:10on a per switch basis because since we have layer
3:143 boundaries upstream, we've got the entirety of our broadcast
3:17domain set within a single switch.
3:20Now, from here, we're going to extend our layer 2 domain
3:22from one switch to another.
3:23And so we need to understand that even though we've
3:25got 4,000 VLANs over here, for example,
3:28we are going to start binding them together.
3:30And we're going to use VXLAN to do that.
3:33So we need a little bit of terminology here.
3:34First of all, this is a VXLAN tunnel as we understand it.
3:38And so we get to use a new phrase for these edge switches.
3:41These are known within the VXLAN world
3:43as a VXLAN Tunneling Endpoint, otherwise known as a VTEP.
3:47Furthermore, we need a new way to identify
3:49the different domains that we're connecting.
3:50And so we're going to map a single VLAN to a single VXLAN
3:54identifier.
3:55And that identifier is going to be known as a Virtual Network
3:58Identifier, usually referred to as a VNI.
4:01Every now and again, you might see this
4:02referred to as a VN ID, or otherwise sometimes called
4:05a VNID.
4:06However, for the most part, we'll
4:07see it abbreviated as VNI.
4:09So what we're doing here, what we're trying to accomplish
4:11is let's say we've got a VLAN here.
4:13Let's say this is VLAN 100.
4:16And we want to connect the servers and the clients
4:18and everything that's connected on VLAN 100
4:20to this broadcast domain over here.
4:23In other words, we want them to be on the same subnet.
4:25Now, what's interesting about this
4:26is that this doesn't actually have to be
4:28VLAN 100 on the other side.
4:29Certainly we could do it that way.
4:31Just maybe organizationally it makes the most sense.
4:34However, we could call this VLAN 102 or any VLAN identifier
4:38because the VLANs are not carried across our VXLAN
4:41tunnels.
4:42And so we're going to send this traffic through the VXLAN
4:45tunnel to one another.
4:46And, therefore, at this point, the clients
4:48that are on the second switch are
4:49going to be on the same subnet as the clients that
4:51are on the first switch.
4:53And this is where the VNI comes into play, because essentially
4:55what we're going to do is we're going to take VLAN 100,
4:58and we're going to map it to a particular VNI.
5:01Let's say we map this to VNI 6,000.
5:04And yes, we do have 6,000 VNIs.
5:05We're going to be talking about that here in a few moments.
5:08And then once it gets through the VXLAN tunnel,
5:09that traffic is going to be mapped to our VLAN
5:11on the other side-- in this case, 102.
5:13And so the VNI is what bridges the gap between our two
5:16different broadcast domains and makes them, effectively,
5:19a single broadcast domain, even though we're
5:20traversing a layer 3 fabric.
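The VLAN 100 to VNI 6,000 to VLAN 102 example above can be sketched in a few lines of Python. This is only an illustration of the lookup logic; the table names and the `bridge` helper are hypothetical and do not correspond to any vendor's configuration syntax.

```python
# Hypothetical per-VTEP VLAN-to-VNI tables (illustrative names only).
LEAF1_VLAN_TO_VNI = {100: 6000}
LEAF2_VLAN_TO_VNI = {102: 6000}


def invert(vlan_to_vni):
    """Build the VNI-to-VLAN lookup that the egress VTEP would use."""
    return {vni: vlan for vlan, vni in vlan_to_vni.items()}


def bridge(ingress_vlan, ingress_table, egress_table):
    """Return (vni, egress_vlan) for a frame crossing the VXLAN tunnel."""
    vni = ingress_table[ingress_vlan]        # the VLAN ID is dropped, the VNI is carried
    egress_vlan = invert(egress_table)[vni]  # the VNI selects the local VLAN on the far side
    return vni, egress_vlan


print(bridge(100, LEAF1_VLAN_TO_VNI, LEAF2_VLAN_TO_VNI))  # (6000, 102)
```

The point of the sketch is that only the VNI has to agree between the two switches; each side is free to pick its own local VLAN ID.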
5:22Now, to further understand this, let's go ahead and break down
5:25the VXLAN header.
5:26Breaking down a header is always a great way
5:28to see what exactly is happening.
5:29Now, first of all, this is an 8-byte header.
5:32And it's going to encapsulate the layer 2 frame.
5:34So this goes on the outside of our entire ethernet
5:37frame, which is a little bit different
5:39from a tunneling perspective.
5:40And we're going to see this in more detail here
5:41in a little bit.
5:42So let's go and break these 8 bytes down.
5:44And when we do that, what we find
5:45is that the first three bytes are
5:47reserved for the virtual network identifier.
5:50And yes, I did not misspeak.
5:51That's 3 bytes.
5:52That would be 24 bits reserved for the VNI.
5:55We only reserved 12 bits for the VLANs, and that got us 4,000.
5:59But when we take 24 bits, that's going to result in
6:02over 16 million different VNIs.
6:05Our scale just went from 4,000 to over 16 million.
6:08Technically speaking, it's 16,777,216.
6:14Hopefully, Cisco doesn't ask us that on the exam,
6:16but, just in case, there it is.
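The numbers quoted above follow directly from the field widths. A quick check in Python, assuming only the 12-bit VLAN ID and 24-bit VNI sizes mentioned in the video:

```python
VLAN_ID_BITS = 12
VNI_BITS = 24

vlan_ids = 2 ** VLAN_ID_BITS
vni_ids = 2 ** VNI_BITS

print(vlan_ids)             # 4096
print(vni_ids)              # 16777216
print(vni_ids // vlan_ids)  # 4096, so each VLAN ID's worth of space grows 4,096-fold
```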
6:18So those are the first three bytes.
6:19That would be for the VNI.
6:20The next byte is just a single byte dedicated for flags.
6:24And believe it or not, the last 4 bytes, these
6:26are all going to be reserved.
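For readers who want to see the 8 bytes on the wire, here is a minimal sketch that packs a VXLAN header with Python's `struct` module. One caveat: RFC 7348 orders the fields as a 1-byte flags field, 3 reserved bytes, the 3-byte VNI, and 1 final reserved byte. The sizes match the description above (3 bytes of VNI, 1 byte of flags, 4 reserved bytes in total), but the order on the wire is different. The function names are hypothetical.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag from RFC 7348: the VNI field is valid


def pack_vxlan_header(vni):
    """Pack an 8-byte VXLAN header for the given 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # First 32-bit word: flags byte followed by 24 reserved bits.
    # Second 32-bit word: 24-bit VNI followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header):
    """Recover the VNI from an 8-byte VXLAN header."""
    _, second_word = struct.unpack("!II", header)
    return second_word >> 8


header = pack_vxlan_header(6000)
print(len(header))         # 8
print(header.hex())        # 0800000000177000
print(unpack_vni(header))  # 6000
```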
6:28Now, let me move our VNI count out of the way
6:30here because we do need to talk about one other concept.
6:32And that is the idea of VNI scopes.
6:35And really what we're talking about here
6:36is the idea that we might have up to 16 million different VNIs
6:40in our environment.
6:41And I know that when we're managing, for example,
6:434,000 VLANs that we tend to want to know what each VLAN is for.
6:47VLAN 10 might be for data.
6:49VLAN 20 might be for Wi-Fi, et cetera.
6:51But when we have 16 million different network segments,
6:55we might not want to necessarily have
6:57to manually manage each one.
6:58And so what we're going to find is
7:00that we've got two different ways of managing these VNIs.
7:02One would be if we want to manage them like VLANs
7:05and know which VNI is mapped to which broadcast domain.
7:08And the other of which would be maybe
7:10we should embrace a more dynamic nature
7:12of managing 16 million VNIs.
7:14So the first of which is going to be
7:15known as a network-wide scope.
7:17The network-wide scope says that a layer 2 domain is represented
7:21by the exact same VNI everywhere in the network, meaning that,
7:24once again, up here where we've got a broadcast domain, that,
7:27yes, the VLAN ID changes, but the VNI stays the same.
7:30And so, for example, if we have a host over here
7:33that needs to be part of that broadcast segment,
7:35we're going to need to ride over to those other switches
7:37as well.
7:38And so we would use the same VNI in this case,
7:41and that would be reserved for that particular broadcast
7:43domain.
7:44This is nice from a readability perspective-- we might say even
7:46from a troubleshooting perspective--
7:48to be able to look at a VNI and know which broadcast domain
7:50that represents.
7:52However, the downside to this is management.
7:54Holy cow, are we really going to do
7:55this for potentially thousands and maybe even millions of VNIs
8:00in our data center?
8:01And so this is where the second option
8:02comes into play, which is known as locally scoped.
8:06This is a dynamic process, meaning
8:08that really what we care about is connecting the two broadcast
8:11domains among our different switches.
8:13And so these VNIs over here, now on the right,
8:15they could be different than the VNIs that are used
8:17between the other two switches.
8:19And so this is great from a management perspective.
8:21And in larger data centers, this is absolutely
8:23what we're going to want to do.
8:25However, the downside, naturally,
8:26is that the readability is going to be a little bit harder.
8:29However, for the most part, we're
8:30not going to troubleshoot a whole lot of VXLAN
8:32at the VNI layer.
8:34And so, in most cases, again, especially at larger data
8:37centers, we're going to find that we deploy locally scoped--
8:39and in other words, dynamic--
8:41VNI assignment.
8:42So, hopefully, it's clear that VXLANs are going to increase
8:45our scalability inside of our data center.
8:47It's going to take our 4,000 broadcast segments that we
8:49supported with VLANs and expand that to over 16 million
8:53in VXLAN.
8:54And so I suppose that's our second point.
8:56We have 16 million different VNIs that we can use
8:59to connect our layer 2 domains.
9:01The VNIs are going to be how we connect a broadcast
9:04domain that's on one switch to the broadcast
9:06domain in another switch.
9:07As a result, we get 4,000 broadcast domains
9:10within a single switch.
9:11And we can extend those throughout the data center
9:13as needed using these VXLAN VNIs.
9:16Now, lastly, we talked about the scopes.
9:17We've got the network-wide scope.
9:19That would be-- think about the fact
9:21that we used to deploy VLANs with VLAN names
9:24and with a specific purpose.
9:26And that would be what network-wide scope means.
9:28However, the locally-scoped option
9:30is going to be more of a dynamic process, where the switches
9:33will just grab a VNI.
9:34And we don't need to worry about it anymore.
9:36That's going to make our lives a whole lot easier,
9:38even if it does reduce our readability just a bit.
9:40I hope this has been informative for you,
9:42and I'd like to thank you for viewing.
VXLAN and Spine-Leaf
0:00[MUSIC PLAYING]
0:05As we understand VXLAN at a deeper level,
0:07it's going to become more and more clear
0:09why we'd want to deploy spine and leaf architectures
0:11into our data centers.
0:12In this video, we're going to see how VXLAN really matches up
0:15very well to spine and leaf and also
0:17review a couple of the most important parts of a spine
0:20and leaf architecture and why we'd be deploying them
0:22from the beginning.
0:23Let's take a look.
0:24So once again, let's draw out this spine and leaf
0:26architecture.
0:27It's important that we understand
0:28how exactly this works and why it's so special,
0:30and so this is why we're spending
0:31so much time on this architecture
0:33throughout these skills.
0:34So as a quick review, we know that the spine layer
0:36is really primarily geared towards being a backbone.
0:39All he cares about is switching between our leaf switches.
0:42That's because all of the intelligence
0:44actually lives down here at the leaf layer,
0:46meaning that, when a packet comes in, the switch is what
0:48determines, hey, I'm leaf one, let's say,
0:51and I need to send this packet to leaf three.
0:53We've got leaves one, two, three, and four.
0:56And so I encapsulate that packet into VXLAN
0:58with a destination IP address of leaf three.
1:00And so, when the spine switch receives
1:02that packet, it realizes, hey, I've
1:03just got to send that down to leaf three as fast as possible.
1:06And so it gets down to leaf three.
1:08Leaf three deencapsulates that packet
1:10and forwards the packet down from there.
1:12Furthermore, any outbound connections
1:14are going to come in through the leaf layer.
1:16We're not going to connect, for example, our internet circuits
1:18and our core layer in via the backbone.
1:20We're going to bring that straight into the leaf layer
1:22as we would bring in any other connection.
1:24And so everything connects at the leaf layer.
1:26The backbone, spine switches, are primarily
1:28geared towards switching among the leaf switches.
1:31If we want more ports, then we will expand our leaf layer.
1:33And if we need more backbone bandwidth,
1:35then we simply expand the spine layer.
1:37And because we're not running any services,
1:39such as first-hop redundancy protocols
1:40and spanning tree and all of this,
1:42we can simply scale out to however many spines we need
1:44and to however many leaf switches
1:46that we need to service our data center properly.
1:49Now VXLAN is the secret to why spine and leaf works so well.
1:52A key part of the spine and leaf architecture
1:54is that this fabric connection is layer 3, as we
1:56discussed in the last video.
1:58And so we're going to use these VXLAN tunnels
2:00to not only connect our leaf switches together,
2:02but actually to do it at layer 2.
2:05And so we can, for example, have a host here on the left.
2:08We can have a host here on the right.
2:10And these two hosts can be in the same subnet, even
2:12though, technically speaking, they're
2:14separated by a layer 3 boundary.
2:15VXLAN gets them over that hurdle.
2:17Now, first and foremost, VXLAN is
2:19going to enable us to scale out our layer 2.
2:21We're able to scale it out without spanning tree.
2:24And furthermore, without even first-hop redundancy protocols.
2:26We're going to take advantage of something
2:28that we call anycast gateways.
2:30Anycast gateways are going to be deployed to the leaf layer.
2:33For example, if my gateway is, let's say,
2:35.100, well then my .100 SVI is going to live here and here
2:40and here.
2:41We can deploy it to every single leaf switch,
2:42where it makes sense to do so.
2:44If, for example, I'm going to deploy it to leaf three,
2:46then I hope that I've got a client there
2:48that is supposed to be part of that same broadcast domain.
2:51If, for example, I don't have any hosts on leaf four,
2:53then I wouldn't want to deploy that gateway there
2:55because that will consume resources
2:56that don't need to be spent.
2:58So instead, I'm only going to deploy the anycast gateway
3:00to the leaf switches where it's required.
3:02Now, because of spine and leaf, we're
3:03going to be able to load balance across the different spine
3:05switches.
3:06We're relying purely on layer 3 at this point and so,
3:09because we're sending our traffic to a destination VXLAN
3:12tunneling endpoint-- we talked about that in the last video--
3:15each one of these leaf switches is known as a VTEP.
3:17When we're sending it to a destination IP address,
3:19we can choose which of these links to send it up.
3:22So I could send it up link one or I could send it
3:24up link two, which would take me to spine one and spine two,
3:27respectively.
3:28The hope here is that, as I'm load balancing this traffic
3:30throughout the network, that my spine switches will
3:32be evenly consumed from a resource perspective.
3:36And so this multi-pathing concept
3:37not only gives me good load balancing but also, by the way,
3:40good resiliency.
3:41If anything in this fabric were to go down,
3:43whether it's a link to a spine or a spine itself, then
3:46I continue to be able to forward my traffic because
3:48of the redundancy that I have to the second spine or maybe
3:51third spine or however many spines I have.
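The load balancing and resiliency described above can be illustrated with a small sketch. It assumes the leaf picks an uplink by hashing fields that identify a flow, which is a common equal-cost multipath approach but is not spelled out in the video. The `pick_uplink` function, the addresses, and the use of CRC32 are all illustrative assumptions; real switches do this in hardware with their own hash functions.

```python
import zlib


def pick_uplink(flow, uplinks):
    """Pick one uplink for a flow by hashing its identifying fields."""
    if not uplinks:
        raise ValueError("no uplinks available")
    key = "|".join(str(field) for field in flow).encode()
    return uplinks[zlib.crc32(key) % len(uplinks)]


uplinks = ["spine1", "spine2"]
# Hypothetical flow: source VTEP, destination VTEP, UDP source port, UDP destination port.
flow = ("10.0.0.1", "10.0.0.3", 49152, 4789)

first_choice = pick_uplink(flow, uplinks)
# The same flow always hashes to the same uplink, so its packets stay in order.
print(pick_uplink(flow, uplinks) == first_choice)  # True

# If spine1 (or the link to it) fails, traffic still has a path through spine2.
print(pick_uplink(flow, ["spine2"]))  # spine2
```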
3:53Now I want to introduce another concept to this equation
3:55as well because at some point, we
3:57need to recognize that we're going
3:58to have virtual hosts that are connecting down from our leaf
4:01switches.
4:02And these virtual hosts can make things a little bit
4:04confusing because inside these virtual hosts
4:07is a virtual switch.
4:08This virtual switch is going to connect us down
4:10to our virtual machines.
4:11So we have a lot of different virtual machines connecting
4:14downstream from this virtual switch.
4:16And in the meantime, we might do something
4:17like this, where we bring a physical connection in
4:20and a physical connection in, so we're
4:21connecting to two different leaf switches for redundancy.
4:24And then these get mapped to the output
4:26side of the virtual switch.
4:27Now, something we need to decide at this point
4:29is if we want these connections to run at what we
4:32would call classical ethernet.
4:34Classical ethernet is going to be with VLANs
4:36and spanning tree protocol and all of the things
4:38that we usually need to make this work.
4:40For example, even if we're not planning
4:42to block any of these links, because a virtual switch
4:44doesn't typically bring packets in and send them back out,
4:47we'd still need to be sending our BPDUs just in case.
4:50And so this is why we call it the classical ethernet world,
4:53because we're relying on these more traditional network
4:55architectures rather than VXLAN.
4:58However, one of the great things about VXLAN and spine and leaf
5:01is that I can actually extend my VXLAN down
5:03into the virtual layer.
5:05For example, I could form a VXLAN tunnel from leaf number
5:08four, down with this virtual switch,
5:10if it's capable of forming a VXLAN tunnel.
5:12That turns my virtual switch down here into a VTEP.
5:16And so maybe I've got another tunnel from leaf number three,
5:18and for that matter, leaf two and leaf one.
5:20At this point, this is no longer a classical ethernet domain.
5:23I don't need to worry about any of these other protocols
5:25and potential loops downstream.
5:27And so I've got the ability to extend my VXLAN domain down
5:30into the virtual layer this way, and that means, by the way,
5:33that I can extend a lot of my policy
5:35down to the virtual layer as well.
5:37Cisco makes a great platform for this.
5:38That would be the 9000V, this virtual Nexus
5:419000 that's designed to go into the virtual space.
5:44And so whether it's Cisco's product
5:45or a different product-- there's plenty of third-party products
5:48out there that could go into the virtual environment
5:50that we could use to terminate a VXLAN tunnel,
5:53even if our spine and leaf is built on Cisco Nexus
5:56and the virtual host is not.
5:58And so, in the end, VXLAN's going to keep us very flexible.
6:01It's going to give us all of the benefits that we've
6:02been talking about up until now, plus the ability
6:05to extend all of this great policy
6:06that we're deploying down into the virtual layer.
6:09So just to review this, once again,
6:10spine and leaf architectures are designed
6:12to help us make our data center scale.
6:14It's designed to help us with very small data centers,
6:17sure, but primarily, as we get larger and larger into our data
6:20center spaces, we're going to see
6:21the benefits of spine and leaf really present themselves.
6:24Now, VXLAN does provide a layer 2 extension,
6:27but it gives us so many more benefits as well.
6:29In the last video, we saw the VNI concept
6:31and how we can extend to many different broadcast domains.
6:34And in this video, we talked about the fact
6:35that, yes, we've got load balancing.
6:38We've got resiliency.
6:39We've got these anycast gateways that we can deploy.
6:41It makes it so that our data center fabrics scale
6:44not only from a physical perspective
6:45but also at layer 2.
6:47Now, lastly, we saw that we can even go beyond the leaf layer.
6:49If we need to push our VXLAN connections down
6:51into the virtual layer, then VXLAN supports that.
6:54As long as the virtual switch that we're
6:55deploying, which maybe is the Nexus 9000V, for example,
6:58as long as that supports being a VXLAN tunneling
7:01endpoint or a VTEP, then we can make that connection happen.
7:03I hope this has been informative for you
7:05and I'd like to thank you for viewing.
VXLAN Data Plane
0:05First and foremost, VXLAN is a data plane protocol.
0:08What that means is that VXLAN is going
0:10to take care of sending the frame-- sending it
0:12across the Layer 3 fabric, for example.
0:15But it's going to assume that, "hey, I
0:17need to know where to send this, so I need somebody
0:19to tell me where to send it."
0:20I'm going to assume that that information has already
0:22been found.
0:23And so let's start in this video by making that assumption,
0:26and we'll cover exactly how that happens in the next two videos.
0:29But for now, again, we're going to assume that, "hey, I
0:31know where I'm sending the traffic,
0:32and therefore now I need to encapsulate it.
0:34I need to send it out."
0:35And so we're going to look at how exactly VXLAN does that
0:38from a data plane perspective.
0:39Let's dive in.
0:41We spend a lot of time talking about how VXLAN
0:43is all about carrying Layer 2 over Layer 3,
0:45and it's finally time to see how exactly it
0:47goes about doing that.
0:48The way this works is, we're going
0:50to take an original Ethernet frame,
0:51and this Ethernet frame is going to have everything.
0:53It's going to have the Ethernet header,
0:55it's going to have the IP header,
0:56and it's going to have the TCP or UDP header, whatever we
0:59have at Layer 4 going on there.
1:01And it's going to have the payload.
1:02Now, we're going to take this entire thing
1:04and encapsulate it inside of a VXLAN header.
1:07Now, VXLAN itself isn't going to have the source and destination
1:10IP addresses and everything like that.
1:12And so we're actually going to further encapsulate VXLAN
1:15into a UDP header, followed by another IP header.
1:19At this point, we'd encapsulate it into Ethernet
1:21and send it off just like we typically would.
1:23And so everything up here is the responsibility of VXLAN,
1:26and we're going to call this MAC-in-IP/UDP encapsulation.
1:30The reason is probably straightforward:
1:31we have a MAC address over here and we're encapsulating
1:34that inside of IP and UDP.
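As a rough sketch of the layering just described, the snippet below lists the headers in the order they would appear on the wire, outermost first. The layer names are plain labels for illustration, and `vxlan_encapsulate` is a hypothetical helper, not a real packet library.

```python
# The original frame as the host sent it, outermost header first.
inner_frame = ["inner Ethernet", "inner IP", "inner TCP or UDP", "payload"]


def vxlan_encapsulate(frame):
    """Wrap a complete Ethernet frame in VXLAN, UDP, IP, and a new Ethernet header."""
    return ["outer Ethernet", "outer IP", "outer UDP", "VXLAN"] + list(frame)


print(" | ".join(vxlan_encapsulate(inner_frame)))
```

Note that the original Ethernet header survives intact inside the tunnel, which is what lets the two hosts behave as if they share a layer 2 segment.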
1:36So let's talk about these headers, first of all the IP
1:38header.
1:39The IP header is going to be sourced and destined
1:41from our VTEPs.
1:43And so when I send it from one VTEP to another VTEP over here,
1:47I'm going to send it through the spine layer,
1:49and my source IP address is going
1:50to be an IP that I use on the fabric from my ingress VTEP,
1:54and my destination IP will be the destination
1:56VTEP on the other side.
1:58Next, let's talk about UDP.
2:00UDP is going to use specifically port 4789, so 47-89.
2:05And by the way, this did change at one point.
2:07This is how VXLAN should be deployed today.
2:09However, before this port number was standardized,
2:12it could have been anything, depending on the vendor.
2:14And so understand that it should be 4789, but at the same time,
2:18if you are deploying this with a different set of hardware
2:22that maybe is relying on an older implementation of VXLAN,
2:24the port could possibly be different.
2:26Now, UDP is going to provide all of the typical benefits
2:29that UDP does.
2:30And you might notice also by the way that, "wait a second,
2:33we're using UDP, why aren't we using TCP?
2:36Because TCP would be far more reliable."
2:38Well, the reason we're using UDP is because if we do
2:40need reliability in this Ethernet frame
2:42that we're transmitting, then we should already
2:44have TCP embedded in here.
2:47And so on the case that I draw out, we do.
2:49Now, we could also be forwarding a UDP frame.
2:51If we're forwarding an Ethernet frame that's built on UDP,
2:55then we wouldn't have any reliability in this.
2:57But the point is that we don't actually
2:59need to provide additional reliability,
3:01because if we need the reliability,
3:04it should already be there in the packet.
3:06Now, next, the VXLAN header.
3:07We already broke down the VXLAN header,
3:09but let's just take a look at it.
3:10It's eight bytes in size, and we remember
3:12that we have three bytes dedicated for the VNI.
3:15This provides us with 24 bits, which
3:17is, once again, 16 million-- really closer to 17 million.
3:20It's like 16.777 million different virtual network
3:24identifiers.
3:25Furthermore, we've got one byte that's dedicated for flags,
3:27and the rest of it is reserved.
3:29So the remaining four bytes, we're
3:31not actually going to use this.
3:33Now, with any tunneling mechanism,
3:34we always need to come back to a maximum transmission unit
3:37conversation.
3:38And I know it's everyone's favorite time when we talk
3:40about MTU, but it is important.
3:43So let's walk through this.
3:44First of all, the VXLAN header, as we've already
3:46identified, is eight bytes.
3:48Second, UDP is also another eight bytes,
3:50so we need to account for this.
3:52Third, IP is going to be 20 bytes.
3:54So, so far we've got 20 plus 8 plus 8, so that should be 36.
3:58However, we're not actually done counting yet.
4:00Because remember, we don't normally count
4:02the Ethernet header as part of the MTU.
4:04However, in this case, inside the frame that we're
4:06transmitting, we've got this as 1,500 bytes,
4:09but we also have this Ethernet header
4:11that we need to account for.
4:12And in most cases, this is going to be a 14-byte header.
4:15This results in an MTU increase of 50 bytes required
4:18on the fabric itself.
4:20So if we don't have 1,550 bytes configured on our Layer 3
4:23fabric, then we're going to end up fragmenting it,
4:25and VXLAN doesn't even support that, so let's just not
4:27go there.
4:28Now, the interesting note in this
4:29is that we're assuming here that our Ethernet
4:31header is 14 bytes.
4:33However, there is a possibility that we are carrying a VLAN
4:36identifier in here.
4:37If we have a VLAN tag, that adds four bytes, which
4:40means that now we're talking about 18 bytes for the Ethernet
4:43header that's embedded.
4:44And so we might actually see 54 bytes required on the fabric.
4:48This would be, once again, with a dot1q header.
4:51In most cases, we will not need to worry about these extra four
4:54bytes, and that's because we're generally
4:55going to map our VLAN to our VNIs at a one-to-one ratio,
4:58and at that point, we don't need to carry the VLAN IDs.
5:01In fact, carrying the VLAN IDs between switches
5:03restricts what we're doing.
5:05Because remember that we had VLAN 100 on this side
5:08and VLAN 102 on this side, and the flexibility to say that,
5:11hey, I don't care what the VLAN is on the other side,
5:13because I might already have you on 100
5:15assigned to a different customer over here.
5:17So the fact that I can assign it to 102 is excellent.
5:19However, every now and again, we might still
5:21need to transmit the tag.
5:22Maybe it's because we're transmitting it
5:24for a customer who requires the tag.
5:27Who knows?
5:27So ideally, we build our data centers
5:29where we don't need to transmit the tags,
5:31and that would be 50 bytes of MTU.
5:32However, let's always keep in mind
5:34that if we do need the VLAN tag carried,
5:36then that would be another four bytes,
5:37so we're at 54 at that point.
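The MTU arithmetic above can be written down as a small Python sketch. The header sizes are the ones quoted in the video (an outer IPv4 header with no options is assumed); the constant and function names are illustrative only.

```python
VXLAN_HEADER = 8            # bytes
UDP_HEADER = 8              # bytes
OUTER_IPV4_HEADER = 20      # bytes, assuming no IP options
INNER_ETHERNET_HEADER = 14  # bytes, untagged inner frame
DOT1Q_TAG = 4               # bytes, only if the VLAN tag is carried


def vxlan_overhead(carry_vlan_tag: bool = False) -> int:
    """Extra bytes the fabric must carry on top of the host MTU."""
    overhead = (VXLAN_HEADER + UDP_HEADER
                + OUTER_IPV4_HEADER + INNER_ETHERNET_HEADER)
    if carry_vlan_tag:
        overhead += DOT1Q_TAG
    return overhead


def required_fabric_mtu(host_mtu: int = 1500,
                        carry_vlan_tag: bool = False) -> int:
    """Minimum Layer 3 fabric MTU that avoids fragmentation."""
    return host_mtu + vxlan_overhead(carry_vlan_tag)


print(required_fabric_mtu())                     # untagged inner frame
print(required_fabric_mtu(carry_vlan_tag=True))  # inner frame keeps its tag
```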
5:39So let's break down the VTEP anatomy.
5:41Generally speaking, when we have a VXLAN Tunneling Endpoint,
5:43we're going to have a fabric interface that
5:46points into the Layer 3 domain.
5:48Now, in many cases, we might assign this physical interface
5:50to have an IP address and be part of the Layer 3 fabric.
5:53But remember that we're going to have connections up
5:55to each spine switch.
5:56So if we have another connection up to a different spine,
5:58then we'd also need that to be Layer 3,
6:00and therefore at that point, we're
6:02consuming another IP address.
6:03In those cases, we might even need to make them /30,
6:06so we're consuming four IP addresses for the sake
6:09of having a point-to-point link.
6:10And so in some cases, we might actually
6:12deploy a virtual IP address in here
6:14on a loopback, which is called loopback zero.
6:16And then we deploy the concept of IP unnumbered
6:19onto this interface here.
6:20And furthermore, we can apply that
6:22to all of its upstream interfaces.
6:24This conserves the number of IP addresses in our environment,
6:27because now we're only consuming a single IP address, so one IP
6:30address, rather than some number of IP addresses,
6:33N IP addresses, depending on how many uplinks
6:35we have to the spine switches.
6:37And so certainly in smaller data centers,
6:39this isn't as big of a deal.
6:40However, in larger data centers, it might matter.
6:42If we've got thousands of these leaf switches,
6:44then we might want to make sure that we're conserving the IP
6:47addresses just for the sake of organization and scalability.
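To put rough numbers on that saving, here is a small Python sketch. It only counts addresses from the leaf side, assumes every leaf has one uplink per spine, and uses made-up function names.

```python
def addresses_numbered(leaves: int, spines: int, prefix_len: int = 30) -> int:
    """Addresses consumed when every leaf uplink gets its own subnet."""
    addresses_per_link = 2 ** (32 - prefix_len)  # a /30 holds 4 addresses
    return leaves * spines * addresses_per_link


def addresses_unnumbered(leaves: int) -> int:
    """Addresses consumed when each leaf borrows one loopback address."""
    return leaves


print(addresses_numbered(leaves=1000, spines=4))
print(addresses_unnumbered(leaves=1000))
```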
6:50Now, downstream of the VTEP, we have what's
6:52known as the LAN interface.
6:54The LAN interface is going to run what's
6:55known as Classical Ethernet.
6:57Classical Ethernet is what we talked about in the last video.
7:00We're running Spanning Tree, we're processing and sending
7:02BPDUs, for example.
7:04We also are going to use traditional VLAN tagging
7:06on these links.
7:07And so this is running everything outside of the VXLAN
7:10architecture, more or less.
7:12We're back to traditional networking at this point.
7:14Now, the beautiful thing about this part of the architecture
7:16is that the downstream clients do not need to be VXLAN-aware.
7:20For example, I might have a virtual machine,
7:22and the virtual machine's ready to send traffic
7:24towards a different virtual machine that lives
7:26somewhere else in the fabric.
7:27And so this virtual machine has no idea
7:29that it's connected via VXLAN.
7:31Instead, it's going to simply send this packet with a source
7:34MAC address of itself, so let's just call it MAC-1,
7:36and its destination address is going to be MAC-2.
7:40So the leaf switch receives this, looks up MAC
7:42address 2-- which, once again, we're
7:44going to be covering how it knows where to send MAC-2 here
7:46in the next two videos.
7:47But either way, it's going to look it up,
7:49and it's going to encapsulate it,
7:50and it's going to send it off into the fabric,
7:52and it'll get de-encapsulated on the other side,
7:55such that, when it arrives on the destination VM,
7:58it also is running Classical Ethernet.
8:00And so it has no idea that, that packet traversed
8:02Layer 3 boundaries via VXLAN.
8:04Now, it is worth pointing out that when
8:06we look at the destination MAC address,
8:08that might have actually been located out another interface.
8:11And so if we do have another physical device down here,
8:14and it's connecting in, and we're simply
8:15trying to communicate between these two devices,
8:17well, that's going to all traverse the Classical Ethernet
8:20domain.
8:21We're going to do traditional CAM table lookups,
8:23we're going to see that this destination MAC lives out
8:25a physical interface that's locally attached.
8:27And so we're going to send that out just
8:29like we would in a traditional Layer 2 environment.
8:32So if our destination is local, then we're
8:34going to check the traditional CAM table.
8:36Whereas if it's a remote device, then we
8:38are going to check VXLAN space and we're
8:40going to encapsulate it, and send it off into the fabric.
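The local-versus-remote decision just described might be sketched like this in Python. The class, its two tables, and the returned strings are hypothetical and only illustrate the order of the lookups, not how any real switch implements them.

```python
from dataclasses import dataclass, field


@dataclass
class Vtep:
    # MAC address -> locally attached port (the traditional CAM table)
    local_cam: dict = field(default_factory=dict)
    # MAC address -> IP address of the remote VTEP it lives behind
    remote_macs: dict = field(default_factory=dict)

    def forward(self, dst_mac: str) -> str:
        """Describe what happens to a frame destined for dst_mac."""
        if dst_mac in self.local_cam:
            return f"switch locally out {self.local_cam[dst_mac]}"
        if dst_mac in self.remote_macs:
            return f"encapsulate in VXLAN toward {self.remote_macs[dst_mac]}"
        return "destination unknown"


leaf = Vtep(local_cam={"MAC-1": "Ethernet1/1"},
            remote_macs={"MAC-2": "10.0.0.2"})
print(leaf.forward("MAC-1"))
print(leaf.forward("MAC-2"))
```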
8:43So as we review this, VXLAN is going to use that MAC-in-IP/UDP
8:47encapsulation.
8:48That part's important.
8:49We're putting a new IP header on there, that's 20 bytes.
8:52And we're putting a new UDP header
8:54on there, that's eight bytes.
8:55Plus the VXLAN header at eight, and we
8:57have to think about the internal Ethernet header as well.
9:00So we're encapsulating all that together,
9:02and that brings us to our next point, which
9:03is simply that we need to account
9:05for all of these extra bytes.
9:06These headers consume an extra 50 bytes,
9:08and so we need to make sure that our internal fabric is adjusted
9:12for that.
9:12Now, in most cases, we'll just set that to 9,000 bytes
9:15and be done with it.
9:16However, if our Layer 3 devices don't support that,
9:18then we need to make sure that we're at least getting
9:20that 1,550 bytes, and possibly 1,554 if we
9:23need to carry the dot1q header for whatever reason.
9:26Now lastly, the VTEPs have fabric connections
9:28and they have LAN connections.
9:30So the fabric connections are Layer 3.
9:31It's how we form our VXLAN tunnels.
9:33And the LAN connections, that would be our traditional--
9:36really, call it Classical Ethernet, the CE domain,
9:38where we're processing the Spanning Tree BPDUs and VLAN
9:41IDs the traditional way and such.
9:43And so we're going to take the frames that
9:46come in on the Classical Ethernet interfaces.
9:48If they are destined for a local port,
9:50then it's a traditional CAM table lookup.
9:51However, if not, then we're going
9:53to encapsulate that inside the VXLAN packet
9:55and send it across the fabric to its destination.
9:57I hope this has been informative for you,
9:59and I'd like to thank you for viewing.
VXLAN Control Plane - Flood and Learn
0:00[MUSIC PLAYING]
0:05When a VXLAN tunneling endpoint receives an Ethernet frame,
0:08it needs to know what to do with it.
0:09And so if it's locally attached here, we
0:11might check the CAM table.
0:12But what if it's not locally attached?
0:14Which is kind of the point of this whole conversation.
0:16Because we're encapsulating this traffic
0:18and sending it across the fabric to another VTEP.
0:20But where does it know to send it?
0:22How does it know that, hey, this is attached to leaf number 50
0:25or what have you?
0:26And so in this video, we're going
0:27to start by explaining the default behavior of VXLAN.
0:30VXLAN was designed as a layer 2 protocol
0:32and specifically as a data plane protocol.
0:34Just like with ethernet.
0:35And so if we don't have a control plane protocol running,
0:38then we're going to have to default to what's known
0:40as flood and learn behavior.
0:42So in this video, we're going to talk about, once again,
0:44the default behavior of VXLAN and how it behaves from a flood
0:46and learn perspective, and in the next video,
0:48we're going to talk about this idea
0:50that we could deploy a control plane protocol which will
0:52really clean some of this up.
0:53So let's dive in and take a look.
0:55As mentioned, flood and learn is going
0:57to be the default behavior for VXLAN.
0:59It's how it's going to figure out
1:01where to send traffic, which more or less means
1:03if it doesn't know where to send it, it's going to flood it out.
1:06And so we think about traditional layer 2 domains.
1:09We think about a switch that maybe receives
1:11a destination MAC address that it doesn't know,
1:14and so what's a layer 2 switch do?
1:16Well, in this case, it's going to flood it out all
1:18of the other interfaces.
1:19It puts it out this interface and this interface
1:21and this interface and it waits more or less.
1:24If it gets more traffic destined for this MAC address,
1:26it's going to flood out every single packet
1:28every single time it receives it.
1:30However, at some point, hopefully, the idea here
1:33is that maybe down past another few switches,
1:35somewhere out on the network we have the host that
1:38owns that MAC address.
1:39And so that host is going to respond back.
1:41And at this point, the layer 2 switch
1:43now knows where the MAC address lives.
1:45So we flooded the traffic and then we
1:47learned where it is. Therefore, we can go into the CAM table and
1:49we can say that, say, MAC address A maybe lives out Ethernet 1/1.
1:55And so now we know in the future that the next packet
1:58I get-- so I get another packet in that's
2:00destined for that MAC address.
2:01I don't have to flood it anymore,
2:02I can simply send it out the appropriate link
2:04that will, hopefully, get it towards its destination.
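That traditional Layer 2 behavior can be sketched in a few lines of Python. This is a toy model with invented names, not how any particular switch is implemented: it learns the source MAC on every frame and floods only when the destination is unknown.

```python
class L2Switch:
    def __init__(self, ports):
        self.ports = list(ports)
        self.cam = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        # Learn: the source MAC lives out the port the frame arrived on.
        self.cam[src_mac] = in_port
        # Known destination: send out exactly one port.
        if dst_mac in self.cam:
            return [self.cam[dst_mac]]
        # Unknown destination: flood out every port except the ingress port.
        return [port for port in self.ports if port != in_port]


switch = L2Switch(["Ethernet1/1", "Ethernet1/2", "Ethernet1/3"])
print(switch.receive("Ethernet1/2", "B", "A"))  # A unknown, so flood
print(switch.receive("Ethernet1/1", "A", "B"))  # A replies, B already learned
print(switch.receive("Ethernet1/2", "B", "A"))  # A now learned, no flooding
```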
2:08So let's draw out our spine and leaf architecture
2:10and see how exactly VXLAN is going
2:12to mimic this behavior in order to figure out
2:14where everything is.
2:15And so we've got our leaf switches, which are connected to our spine
2:18switches and we know that we're going
2:19to form VXLAN tunnels among the different leaf switches which
2:22turns them into VXLAN tunneling endpoints otherwise known
2:25as VTEPs.
2:26So let's imagine the same scenario here.
2:28I get a packet in and it's destined for let's just
2:30say in this case MAC address A. So we received it on leaf 1,
2:34and leaf 1 has no idea what to do with this frame.
2:37Because especially at the start of the network,
2:39when we first turn this thing on,
2:41it's not going to know where any of the destination MAC
2:43addresses are in the environment.
2:45And so it's going to check its local CAM table first just
2:47to see, hey, do I happen to know where it is?
2:50And I will still do traditional layer 2 behavior.
2:52In other words, I will still flood it
2:53out any local interface that might be connected
2:56on that same broadcast domain.
2:57That's not to say for sure that I even
2:59have any ports that are part of that broadcast domain.
3:02But if I do have a few other ports that are part of that
3:05then I will forward that traffic down just to make sure
3:07that it's not locally attached.
3:09However, I also need to flood it out in this direction,
3:12and when I say flood it out, I mean
3:13I've got to get it out to leaf 2 and leaf 3 and leaf 4.
3:17I need to get it out to every single other leaf
3:19switch in the environment, just in case
3:21that MAC address lives out there somewhere.
3:23Because until I know where it lives flooding
3:25it is the only way to ensure that the packet gets
3:28to its destination.
3:29Now, just to further complicate things,
3:31I also don't want to have to flood it everywhere.
3:33I only want to flood it to the locations that
3:35are responsible for having devices
3:38that are part of that same VNI.
3:39In other words, I need to flood it
3:41to everywhere that is part of that broadcast domain.
3:43Now, that makes sense; that's what
3:44we're trying to mimic over here from a layer 2 perspective.
3:47And so ultimately let's just not flood it out
3:49to every single leaf switch, only those
3:51that might actually have the host that
3:53is supposed to respond.
3:54So what technology can enable us to flood traffic
3:57out certain links, especially across a layer 3 domain?
4:00Because this isn't ethernet.
4:02We can't just broadcast it out and hope that it
4:04arrives on the other side.
4:05Well, for this we are going to take advantage of multicast.
4:08Multicast is a great technology.
4:10It operates at layer 3 and it's going
4:12to be able to optimize the delivery of traffic
4:15to multiple locations.
4:16And the way we're going to do this is we're
4:18going to map each VNI with a multicast address.
4:21And so in the multicast world we call these group addresses.
4:24It might be something like 239.1.1.1.
4:28And any time I need to send any kind of broadcast,
4:31unknown unicast, or multicast traffic,
4:33I'm going to rely on this multicast address.
4:35Now, our focus at this point is on this concept
4:37of an unknown unicast, but keep in mind
4:39that there are other reasons why we might need to flood traffic
4:42to multiple leaf switches.
4:43Now, I just listed them out.
4:45We've got this concept of BUM.
4:46B stands for broadcast.
4:47So we've got our broadcast traffic we need to consider.
4:50U stands for unknown unicast, which is the exact scenario
4:53that we've been painting here.
4:54We have a destination packet to deliver
4:56and we don't know where to deliver it to.
4:58And then we have multicast.
5:00This would be for handling multicast from a layer 2
5:02perspective.
5:03And so regardless of the type of traffic
5:05we're trying to send we're going to take
5:07advantage of this multicast group in order to send it.
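As a sketch of how a leaf might pick the outer destination address, the Python below maps each VNI to a group and falls back to that group for any BUM frame. The VNI numbers, the second group address, and the function names are assumptions for illustration; only 239.1.1.1 comes from the example above.

```python
# Hypothetical VNI -> multicast group mapping.
VNI_TO_GROUP = {10100: "239.1.1.1", 10200: "239.1.1.2"}


def is_group_address(mac: str) -> bool:
    """Broadcast and multicast MACs have the lowest bit of the first octet set."""
    return bool(int(mac.split(":")[0], 16) & 0x01)


def outer_destination(vni: int, dst_mac: str, remote_macs: dict) -> str:
    """Pick the outer IP destination for an encapsulated frame.

    remote_macs maps known remote MAC addresses to their VTEP IP address.
    """
    dst_mac = dst_mac.lower()
    # B and M: broadcast or multicast destination MAC.
    # U: unicast MAC that we have not learned yet.
    if is_group_address(dst_mac) or dst_mac not in remote_macs:
        return VNI_TO_GROUP[vni]
    return remote_macs[dst_mac]


known = {"00:00:00:00:00:0b": "10.0.0.1"}
print(outer_destination(10100, "ff:ff:ff:ff:ff:ff", known))  # broadcast
print(outer_destination(10100, "00:00:00:00:00:0a", known))  # unknown unicast
print(outer_destination(10100, "00:00:00:00:00:0b", known))  # known unicast
```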
5:09Now the way multicast works is that we're
5:11going to send this packet out just
5:13like we said here, except now our destination is going
5:15to be a multicast address.
5:17Now when I use a multicast address that tells
5:19the spine switch that it needs to deliver that packet
5:22to multiple leaf switches.
5:24And so because this multicast address
5:25belongs to a specific VNI, all of these leaf switches,
5:29if they belong to that VNI they will be
5:30part of that multicast group.
5:32Let's say in our situation that leaf 2 and leaf 3, these
5:35actually want to receive that traffic.
5:37So we'll just even say leaf 1, leaf 2, leaf 3,
5:40those are all part of this multicast tree, which
5:42means our multicast trees look like this.
5:44We've actually got two of them.
5:46We've got one going to spine 1 and we've
5:48got a different multicast tree going to spine 2. Multicast
5:52trees are loop-free topologies that
5:53ensure that we deliver our multicast
5:55packets to many different destinations
5:57without ever looping the traffic and possibly even replicating
6:00it multiple times.
6:01And that's an important factor here
6:02is this idea of replication.
6:04One of the great parts about multicast
6:06is that remember we're going to optimize our delivery
6:08of multiple packets.
6:10And one of the ways that we do that is
6:12by ensuring that the ingress device here,
6:13so leaf 1 in our case, it's only going
6:15to have to process and send a single packet.
6:18The network handles the replication for us.
6:20In other words, the spine switch in our situation.
6:23Let's say we receive it on spine switch 1 here,
6:25we receive the packet in.
6:27Now we need to send it to leaf 2 and leaf 3.
6:29So we are going to replicate that packet,
6:31we're going to send two different copies of the packet,
6:33one down to leaf 2 and one down to leaf 3.
6:36So leaf 2 and leaf 3 are both going
6:37to receive this packet over here.
6:40Remember what we're doing here: we're
6:41trying to send a packet to a destination address of MAC
6:44address A. Now, if leaf 2 or leaf 3
6:46know where MAC address A is, then
6:47they're going to send it out the appropriate link.
6:50However, if they don't know where it is, then they're
6:52going to flood it out all of their links
6:53that belong to that broadcast domain.
6:55But here's the other part of this: it's flood and learn,
6:58meaning that I'm learning MAC addresses as I see them.
7:01Well, leaf 2 and leaf 3 just saw a MAC address.
7:03Let's just say my source MAC address
7:05was MAC address B in this case.
7:07So down here, we had a destination MAC address of A,
7:09but we're sending it from a MAC address of B.
7:12So now we know where MAC address B is.
7:15If for any reason I need to send traffic back to MAC address B,
7:18then I'm going to know which leaf switch
7:20I need to send it towards.
7:21And the reason I know that is because I just
7:23got this multicast packet.
7:25And the multicast packet has a source IP address of the leaf
7:28switch that sent it.
7:29And so I know now that source MAC address
7:31B is mapped to leaf 1.
7:33Now, ideally in the scenario, somewhere down the line
7:35we've actually got the client we're trying to talk to.
7:38So this is MAC address A down here.
7:40And so it's going to send a packet back towards leaf 1.
7:43Now leaf 2 can look up its table here
7:46and see that, oh, I've got MAC address B mapped to leaf 1.
7:49And therefore, I simply need to send it
7:51out one time via unicast to leaf 1.
7:54I don't have to worry about multicast at this point.
7:56I already know where this destination MAC address lives.
7:59Now leaf 1, once it receives it, it also
8:01is going to learn where MAC address A is.
8:03And so in our CAM table here we say
8:05that MAC address A is mapped out towards leaf number 2.
8:09And so as traffic continues to flow back and forth,
8:11I no longer need to rely on multicast and this flood
8:14and learn methodology. Instead, I have now learned it.
8:17So I can simply send it out via unicast.
8:19Either way I de-encapsulate the return traffic,
8:21I send it down towards this host,
8:23and once again, when I receive another packet from it
8:25destined towards MAC address A, I use the VXLAN tunnel.
8:28I simply send it out on this VNI that lands it on leaf 2,
8:32and sends it down towards that MAC address.
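The whole exchange can be replayed with a toy Python model. It is only a sketch: the class and method names are invented, the spine's replication is not modeled (the code just tracks which leaves end up receiving the frame), and the multicast group is represented as a plain list of member leaves.

```python
class Leaf:
    def __init__(self, name):
        self.name = name
        self.remote_macs = {}  # MAC address -> Leaf (VTEP) it was learned from

    def receive(self, src_mac, from_leaf):
        # Learn: the source MAC sits behind the VTEP that sent the packet.
        self.remote_macs[src_mac] = from_leaf

    def send(self, src_mac, dst_mac, group_members):
        """Send a frame into the fabric; return the names of receiving leaves."""
        if dst_mac in self.remote_macs:
            targets = [self.remote_macs[dst_mac]]  # unicast to one VTEP
        else:
            # Unknown unicast: every other member of the VNI's group gets it.
            targets = [leaf for leaf in group_members if leaf is not self]
        for leaf in targets:
            leaf.receive(src_mac, self)
        return [leaf.name for leaf in targets]


leaf1, leaf2, leaf3 = Leaf("leaf1"), Leaf("leaf2"), Leaf("leaf3")
members = [leaf1, leaf2, leaf3]

print(leaf1.send("B", "A", members))  # A unknown: flooded to leaf2 and leaf3
print(leaf2.send("A", "B", members))  # reply: unicast straight to leaf1
print(leaf1.send("B", "A", members))  # A now learned: unicast to leaf2
```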
8:34Now, this seems like a whole lot of work.
8:36Flood and learn is not optimal because I
8:39had to send some of that traffic out to many leaf switches.
8:42And again, if we think about a data center with thousands
8:44of leaf switches, we're flooding a whole lot of traffic
8:47out there that we'd probably rather not flood.
8:49And so it'd be lovely if we could propagate
8:51this information of where MAC address B lives,
8:54before a leaf needs to figure it out through this flood
8:56and learn methodology.
8:57And the way we can do that is by deploying control plane
9:00protocols.
9:01Now, we have several options for this.
9:02The primary one is going to be what's known as BGP EVPN.
9:06And we're going to be talking about this in the next video.
9:09Now, we also see that Cisco have deployed some other SDN
9:11solutions that take advantage of VXLAN
9:14and they have their own control mechanisms as well.
9:16For example, Cisco has a solution
9:18known as Software-Defined Access, or SDA.
9:20And SDA is going to rely on LISP,
9:23the Locator/ID Separation
9:24Protocol, in order to transmit this control plane information.
9:28Cisco also makes a great data center
9:30solution called Application Centric Infrastructure,
9:32we've talked about this.
9:33ACI doesn't actually use BGP EVPN, instead,
9:37it uses the Council of Oracle protocol otherwise known
9:40as COOP.
9:40And lastly there is an option that we
9:42could deploy if our underlying architecture doesn't
9:45support multicast.
9:46That would not be a great scenario;
9:47we've said that we rely on multicast for flood and learn.
9:51But if for any reason, we don't actually
9:52have multicast available to us, then we
9:55could do what's known as ingress replication.
9:58Ingress replication is essentially a unicast version
10:01of flood and learn, except instead
10:02of relying on the network to do the replication part,
10:05we're going to have to do the replication ourselves
10:07at the leaf layer.
10:08And so we more or less flood it out
10:10to every single leaf switch via unicast which is definitely not
10:13optimized.
10:14If we think that multicasting to all of the leaf switches
10:17isn't optimized, then this is going to be so much worse.
10:20And so we generally speaking want
10:22to avoid ingress replication if at all possible.
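The cost difference shows up in how many copies the ingress leaf itself has to transmit. A minimal Python sketch, with an invented function name and assuming the multicast case sends a single copy toward one spine:

```python
def copies_sent_by_ingress_leaf(remote_vteps: int, multicast_underlay: bool) -> int:
    """Copies of one BUM frame that the ingress leaf must transmit."""
    if remote_vteps == 0:
        return 0
    # With multicast, the network replicates; the leaf sends a single copy.
    # With ingress replication, the leaf sends one unicast copy per remote VTEP.
    return 1 if multicast_underlay else remote_vteps


print(copies_sent_by_ingress_leaf(999, multicast_underlay=True))
print(copies_sent_by_ingress_leaf(999, multicast_underlay=False))
```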
10:24However, we do understand that if these switches in here
10:27don't support multicast, then that's going to be a problem.
10:30And so fortunately for us if we're deploying a spine leaf
10:32architecture on Cisco Nexus platforms,
10:34then we absolutely support multicast,
10:36and we never need to worry about doing ingress replication.
10:40So the primary point in this conversation
10:42is that VXLAN doesn't have a native control plane
10:44protocol as part of the specification. VXLAN is
10:48data plane focused. It just wants to send traffic; it doesn't
10:51care how we learn about it.
10:52So worst case scenario, we're going
10:54to use this flood and learn methodology.
10:55Now, if we're going to use flood and learn
10:57it's going to be flooded via multicast.
10:59And it's not just the unknown unicast that gets flooded; we
11:01also flood our broadcast and our multicast traffic this way,
11:04and that will happen independently
11:05of whether we have a control plane mechanism in play.
11:07We do need a way to optimally forward
11:10this BUM type of traffic out to the rest of the network.
11:13But as far as this conversation is concerned,
11:15we are a little bit more focused on the unknown unicast.
11:17Now lastly, once we do actually learn, in other words,
11:20when we get return traffic back and we've
11:22seen that, oh, that MAC address is now attached to leaf
11:25whatever in this environment, well, at this point,
11:28I can add that into my CAM table, and therefore the next time I
11:30receive traffic destined for that host,
11:33I can simply send it via unicast to the appropriate leaf.
11:35I hope this has been informative for you.
11:37I'd like to thank you for viewing.
VXLAN Control Plane - EVPN
0:05So even though Flood and Learn will
0:07learn to populate our control plane tables from a VXLAN
0:10perspective, we might want to be a little bit more
0:12efficient in our data centers than flooding traffic
0:15everywhere.
0:15And so at the core of this, we can deploy
0:17a lot of different protocols.
0:18We talked about this in the last video.
0:20We might want to deploy something
0:22like Cisco ACI, which uses a Council of Oracle protocol,
0:25or COOP.
0:25Or for that matter, we might deploy software-defined access
0:28and find that VXLAN is using LISP in that case.
0:31However, for us in the data center,
0:32if we're not going to deploy one of these prepackaged solutions,
0:35then for the most part, we're going
0:37to land on leveraging multi-protocol BGP,
0:39and specifically as part of that,
0:41what we call the Ethernet VPN.
0:42So in this video, we're going to unpack EVPN, how it ties back
0:45to BGP, and in the end, how it accomplishes what we're
0:48trying to do, which is to make our VXLAN traffic more
0:50efficient.
0:51Let's take a look.
0:52When most of us think of the border gateway
0:54protocol, otherwise known as BGP,
0:56we tend to think of it as if it's an IP routing protocol.
0:59But if that were the case, then the only thing BGP
1:01would be able to carry is IP information.
1:03And the reality is that BGP is far more flexible than that.
1:06It's not an IP routing protocol because it can carry
1:09more than just IP version 4.
1:10For example, it could carry IP version 6.
1:13And it can carry multicast routes.
1:15We can use BGP for a lot.
1:17And at that point, when we're doing it
1:18for something other than IP version 4,
1:21we typically call this multi-protocol BGP.
1:24Multi-protocol BGP is what we use inside of MPLS,
1:26for example, which allows us to carry
1:28routes for many different customers
1:29across a single BGP domain.
1:31And we call those VPNs in that situation.
1:34And so we're going to apply something similar
1:35from that perspective in the world of MP-BGP.
1:39And we're going to call that Ethernet VPN, otherwise known
1:41as EVPN.
1:43The goal here is to leverage MP-BGP
1:45to carry MAC address information,
1:47to learn where a MAC address is, and then
1:49to propagate that information throughout the entire domain.
1:52If we're sending MAC address information
1:54via multi-protocol BGP, then we no longer
1:56need to worry about flooding traffic.
1:58Instead, we learn where our destinations
2:00are via this EVPN process.
2:02And therefore, we don't need to worry about flooding anymore.
2:05We've already got the capability of sending our traffic
2:08via Unicast instead.
2:09Now, there's a few things to keep in mind
2:11here with regards to EVPN.
2:13First of all, we are still going to rely on multicast
2:16for our BUM traffic.
2:17And as we discussed, our BUM traffic
2:19is broadcast, unknown unicast, and multicast traffic.
2:22The goal of EVPN is to simply reduce
2:24the amount of unknown Unicast.
2:26However, we still have to have this mechanism in place
2:28in order to process any information
2:30that we might be missing or again,
2:32for broadcast and multicast traffic.
2:34Furthermore, we are going to leverage internal BGP in order
2:36to make this happen.
2:37iBGP requires a full mesh of relationships.
2:40And that would be a whole lot of relationships among all
2:43of our leaf and spine switches.
2:44And so instead, we use a mechanism known
2:46as the BGP route reflectors.
2:48BGP route reflectors make it so that all
2:50of our individual devices, in other words our leaf switches,
2:54only need to have connectivity from a BGP perspective
2:57to the route reflectors.
2:59So if we make our spine switches the route reflectors,
3:02then we simply need our BGP relationships
3:04to be between our leaf switches and our spine switches, which
3:06is great because we already have that architecture set up.
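To see why route reflectors matter, compare the session counts. This Python sketch uses invented function names and ignores any sessions between the route reflectors themselves.

```python
def full_mesh_sessions(routers: int) -> int:
    """iBGP sessions needed when every router peers with every other router."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(leaves: int, spines: int) -> int:
    """iBGP sessions needed when each leaf peers only with each spine."""
    return leaves * spines


leaves, spines = 100, 2
print(full_mesh_sessions(leaves + spines))
print(route_reflector_sessions(leaves, spines))
```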
3:09Now generally speaking, we are going
3:10to want to take advantage of what
3:12BGP offers from a peer authentication perspective.
3:15We want to make sure that this environment is secure.
3:17And that means that if we accidentally create a BGP
3:21neighborship that we didn't mean to, or something worse,
3:24we have a malicious attempt to inject bad routes into our BGP
3:27domain, well, these are great reasons
3:29to have peer authentication in place to really lock down
3:32those connections.
3:33Now lastly, when we have all of this control plane information
3:36in place already, we can look into something
3:38that we call ARP suppression.
3:39The idea of ARP suppression is pretty straightforward.
3:42If I have a leaf switch, and here's my spine switch,
3:44and then another leaf switch over here,
3:46and I've got MAC address B living on the right,
3:49and I've got a device over here that wants
3:51to know where MAC address B is.
3:53And so it sends out an ARP request.
3:54It says, hey, where are you?
3:56And it sends it out towards my local switch.
3:59Well, if I have EVPN in place, I've
4:01got this control plane protocol, and then
4:02my leaf switch already knows where MAC address B lives.
4:05In fact, it's got a mapping.
4:07It knows the IP address and the MAC address.
4:08And therefore, it has all of the information needed
4:11to respond to that ARP request.
4:13Now, technically speaking, since ARP requests are a broadcast,
4:16we could take advantage of our multicast tree
4:18and simply send it out as we would with any other BUM
4:20traffic.
4:21Or we could further optimize our data center
4:23rather than sending that traffic all the way through the fabric,
4:26and keep in mind that we'd be doing that for every ARP
4:28request in the network, so there's a whole lot of that
4:30happening all at once.
4:31But either way, rather than sending that
4:33through the fabric, we simply use the information
4:35that we already have and send the ARP response back.
4:38We send it back on behalf of the client that's located
4:40out that other leaf switch.
4:42And therefore, we are suppressing
4:43the amount of ARP requests that are being
4:45transmitted across the fabric.
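A minimal Python sketch of that decision, assuming the leaf keeps an IP-to-MAC table filled in from EVPN. The class name, the table, and the returned action labels are hypothetical.

```python
class ArpSuppressingLeaf:
    def __init__(self):
        # IP address -> MAC address, populated from EVPN advertisements.
        self.ip_to_mac = {}

    def handle_arp_request(self, target_ip: str):
        """Decide what to do with an ARP request from a local host."""
        mac = self.ip_to_mac.get(target_ip)
        if mac is not None:
            # Answer on behalf of the remote host; nothing crosses the fabric.
            return ("reply_locally", mac)
        # No entry yet: fall back to sending it as BUM traffic.
        return ("flood_as_bum", None)


leaf = ArpSuppressingLeaf()
leaf.ip_to_mac["10.1.1.20"] = "MAC-B"
print(leaf.handle_arp_request("10.1.1.20"))
print(leaf.handle_arp_request("10.1.1.99"))
```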
4:47So let's go ahead and draw all of this
4:48out and see what it looks like from an example perspective.
4:51So here's our spine and leaf architecture.
4:53We're getting pretty good at drawing that out.
4:54And so now what we're going to do
4:56is we're going to configure BGP in this domain.
4:58And so remember, we want these two spine
5:00switches to be BGP route reflectors, which
5:02means we only need relationships between the spine switches
5:05and the leaf switches.
5:06The leaf switches do not need to be BGP peers with one another.
5:09So let's say we have a new MAC address that shows up
5:12over here on the right.
5:13Let's say leaf four gets MAC address D that attaches to it.
5:17Now in some fashion, maybe it's through DHCP,
5:19maybe it's through some other mechanism.
5:21Either way, we're going to send a packet towards leaf four.
5:24And now leaf four knows, hey, MAC address D
5:26lives out my ethernet interface, whatever
5:28ethernet interface that is.
5:29Maybe it's 1/1.
5:31At this point, leaf four wants to alert
5:33the entire rest of the network where this MAC address lives.
5:36It's great that that's part of its local CAM table.
5:38But once again, the purpose of deploying a control plane
5:40protocol like EVPN is so that we don't have to Flood and Learn
5:43any more.
5:44I can simply tell all the other leaf switches
5:46that, hey, MAC address D lives here.
5:48So at this point I'm going to leverage BGP messaging
5:51and I'm going to advertise up to the rest of the network
5:54that, hey, I've got this MAC address attached.
5:56I don't have to specify which interface it's attached to.
5:59I simply have to say that, hey, I've
6:00got MAC address D. It's attached to me, leaf four.
6:04Now, the route reflector is going to push that down
6:06to all of the other leaf nodes.
6:07That's the purpose of the route reflector,
6:09it reflects the routes.
6:10And so each one of these other leaf switches, all these VTEPs,
6:13are going to add into their CAM tables,
6:15the location of MAC address D. They're going to say D
6:18belongs to leaf four.
6:20However, we don't just share the MAC address in the situation.
6:23We also map the IP address, as well as
6:25the VNI that that particular host is attached to.
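The advertisement and reflection steps can be modeled in a few lines of Python. This is a sketch of the idea only: the classes, field names, and example values are invented, and real EVPN routes carry more attributes than the MAC, IP, VNI, and originating VTEP shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacIpRoute:
    mac: str
    ip: str
    vni: int
    next_hop_vtep: str  # the leaf that the host is attached to


class Leaf:
    def __init__(self, name):
        self.name = name
        self.remote = {}  # (vni, mac) -> MacIpRoute learned via BGP

    def install(self, route):
        self.remote[(route.vni, route.mac)] = route


class RouteReflector:
    def __init__(self, clients):
        self.clients = list(clients)

    def advertise(self, origin, route):
        # Reflect the route to every client except the one that sent it.
        for leaf in self.clients:
            if leaf is not origin:
                leaf.install(route)


leaves = [Leaf(f"leaf{i}") for i in range(1, 5)]
spine = RouteReflector(leaves)
leaf4 = leaves[3]
spine.advertise(leaf4, MacIpRoute("D", "10.1.1.40", 10100, "leaf4"))
print(leaves[0].remote[(10100, "D")].next_hop_vtep)
```

With that entry installed, a frame for MAC address D arriving on leaf 1 can be encapsulated and sent straight to leaf 4 without any flooding.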
6:28And hey, at this point, we're in pretty good shape.
6:30Because now let's go back to the situation
6:32that we talked about in the last video.
6:33Now we have a host that spins up.
6:35And it's got-- well, it doesn't really matter.
6:37Let's just say this is MAC address B here.
6:39And we are going to send a packet destined towards MAC
6:42address D. Now, in the last video,
6:45we explained that, well, this switch
6:47doesn't know where MAC address D is because it's not
6:49part of its local CAM table.
6:50And therefore, we'd have to rely on Flood and Learn.
6:52However, in this case, we do have an entry in our CAM table.
6:55And so we don't have to Flood and Learn.
6:57Instead, we can simply encapsulate that packet,
7:00send it through the fabric.
7:01And we can land on leaf four via Unicast.
7:03Furthermore, if we think back to this conversation down here,
7:06when this host first fired up its connection
7:08and it needed to connect to MAC address D,
7:11how did it learn that it was MAC address
7:12D it needed to connect to?
7:14Because in most cases, it's going to resolve a DNS name
7:17and it's going to get an IP address.
7:18And of course, what I'm really leaning towards
7:20is the fact that we probably would have already submitted
7:23an ARP request.
7:24So we sent an ARP request saying,
7:25hey, I know your IP address-- let's
7:27just say you're IP address D. And the question, of course,
7:29is what is your MAC address?
7:31And so by activating ARP suppression,
7:33we can further reduce ARP flooding that switch, leaf two,
7:36will just send back the ARP response on behalf
7:38of the client.
7:38And therefore now the client knows that it's MAC address D.
7:41And this whole process here is what occurs next.
7:44And so we see the amount of flooding
7:47that we are restricting in this environment
7:48by deploying BGP EVPN.
7:51Not only does EVPN save us from that Flood and Learn behavior,
7:54but it also enables ARP suppression,
7:56which further reduces the amount of broadcast traffic
7:59that's on our network.
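ARP suppression can be pictured as one more lookup on the local leaf. The sketch below assumes the leaf keeps an IP-to-MAC table filled from EVPN advertisements; the function name, the address 10.0.0.4, and the VNI 10010 are hypothetical.

```python
def handle_arp_request(ip_to_mac, vni, target_ip):
    """Answer an ARP request locally when the binding is already known."""
    known_mac = ip_to_mac.get((vni, target_ip))
    if known_mac is not None:
        # The leaf replies on behalf of the remote host; nothing is flooded.
        return ("reply_locally", known_mac)
    # Binding unknown: the broadcast still has to go out through the fabric.
    return ("flood", None)


# Hypothetical bindings leaf two learned through EVPN.
leaf2_arp_cache = {(10010, "10.0.0.4"): "D"}

print(handle_arp_request(leaf2_arp_cache, 10010, "10.0.0.4"))
print(handle_arp_request(leaf2_arp_cache, 10010, "10.0.0.99"))
```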
8:00So as we see here, EVPN is going to take advantage
8:03of multi-protocol BGP.
8:05Multiprotocol BGP is going to enable us to share the IP,
8:07and the MAC, and the VNI information
8:09that we learned for a locally attached host, so we share that
8:12with every other leaf switch in the environment
8:14that needs to know about it.
8:15EVPN is going to make it so we don't
8:16need this flooding behavior, this Flood and Learn concept.
8:19We can learn without flooding.
8:20So that's even better than the alternative,
8:23which is to simply flood all this traffic out there.
8:25And we end up creating so much more traffic
8:28than we needed to otherwise.
8:29Now, our BUM traffic is going to continue to leverage
8:31that multicast tree.
8:32So we do still need multicast configured.
8:34We are still going to leverage it.
8:36Because we will always have broadcast traffic and multicast
8:38traffic.
8:39And even in this situation, we will
8:41have situations where we don't know where a destination is.
8:44And so we might end up with unknown Unicast
8:46that we still need to forward.
8:48And lastly, EVPN brings these enhancements
8:50like peer authentication, and making it
8:52so that we have to trust each other, or at least
8:54we can validate that our leaf switches that
8:56are sharing this information are part of the design,
9:00that they were intentionally deployed into this fabric.
9:02And furthermore, we've got ARP suppression.
9:04ARP suppression further reduces the amount
9:06of flooding in our infrastructure
9:07because now our leaf switches can respond to ARP requests
9:10on behalf of the end hosts.
9:12I hope this has been informative for you.
9:13And I'd like to thank you for viewing.
Review and Quiz
0:05We've reached the end of another skill.
0:06So let's review what we've learned by taking a quiz.
0:09First up out of four questions, what
0:10is the size of the VXLAN header and how does it break down?
0:19Well, the overall size of the VXLAN header is 8 bytes.
0:22Let's see if we can remember how it breaks down.
0:24First of all, we should be able to remember
0:25that the VNI is 24 bits.
0:27 24 bits is 3 bytes.
0:29So A would be an answer.
0:30Then we just need to remember the rest of the header
0:32and how it breaks down.
0:33Well, it's pretty straightforward.
0:35We have a single byte dedicated for Flags.
0:37So that means that C is our answer here.
0:39And the rest of the 8 bytes are reserved.
0:41And so right now we're looking at 4 bytes of reserved data
0:45inside of the VXLAN header.
0:46So 8 bytes total, broken down by VNI at 3, flags at 1,
0:50and reserved at 4 bytes.
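As a rough illustration of that breakdown, the header can be built with Python's struct module. In the layout defined by RFC 7348 the 4 reserved bytes are split: 3 follow the flags byte and 1 follows the VNI. The function names and the VNI value 10010 are made up for this sketch.

```python
import struct

VNI_VALID_FLAG = 0x08  # the "I" flag, which marks the VNI field as valid


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    word1 = VNI_VALID_FLAG << 24  # flags (1 byte) + reserved (3 bytes)
    word2 = vni << 8              # VNI (3 bytes) + reserved (1 byte)
    return struct.pack("!II", word1, word2)


def unpack_vni(header: bytes) -> int:
    """Extract the 24-bit VNI from an 8-byte VXLAN header."""
    _, word2 = struct.unpack("!II", header)
    return word2 >> 8


header = pack_vxlan_header(10010)
print(len(header), unpack_vni(header))
```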
0:52Question two, how much MTU overhead
0:54must be accounted for when deploying VXLAN?
1:02The answer here is C, 50 bytes.
1:04We just said that the VXLAN header itself is 8 bytes.
1:06But we also have to account for the new IP header--
1:08and IP headers are 20 bytes--
1:10the new UDP header--
1:12UDP is 8-- and the internal Ethernet header.
1:15We're also going to have an external Ethernet header,
1:17but that doesn't count towards our MTU.
1:19However, we're carrying not just a 1,500 byte
1:21packet inside, but 1,500 plus the inner Ethernet header size--
1:25and by the way, it might be 1,500; it might be 9,000--
1:27because whatever the payload MTU is, it doesn't
1:29include the Ethernet header.
1:30The inner Ethernet header is assumed to be 14 bytes with no .1Q header
1:34in there.
1:34And so we add all this up, and it's 50 bytes.
1:36Now, notice if we do include the .1Q header in there for any
1:39reason, then it becomes 18 bytes,
1:41which means we need to consider 54 bytes.
1:43However, that doesn't usually happen
1:44with our VXLAN deployments.
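The arithmetic above can be written out as a small helper. The constant and function names are made up for illustration, and the sketch assumes an IPv4 outer header with no options.

```python
OUTER_IPV4_HEADER = 20   # new outer IP header
OUTER_UDP_HEADER = 8     # new outer UDP header
VXLAN_HEADER = 8         # VXLAN header
INNER_ETHERNET = 14      # original frame's Ethernet header, now carried as payload
DOT1Q_TAG = 4            # only if the inner frame keeps its 802.1Q tag


def vxlan_overhead(inner_dot1q: bool = False) -> int:
    """Bytes of MTU the underlay must add on top of the host MTU."""
    overhead = OUTER_IPV4_HEADER + OUTER_UDP_HEADER + VXLAN_HEADER + INNER_ETHERNET
    if inner_dot1q:
        overhead += DOT1Q_TAG
    return overhead


def required_underlay_mtu(host_mtu: int, inner_dot1q: bool = False) -> int:
    """Smallest underlay MTU that carries a full-size host packet unfragmented."""
    return host_mtu + vxlan_overhead(inner_dot1q)


print(vxlan_overhead())                   # 50
print(vxlan_overhead(inner_dot1q=True))   # 54
print(required_underlay_mtu(1500))        # 1550
```

The outer Ethernet header is left out on purpose, since it does not count towards the MTU.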
1:46Next question, what mechanism will be used by the spine
1:49switches to forward VXLAN BUM traffic?
1:57Remember, BUM traffic would be broadcast, unknown-unicast,
2:00and multicast.
2:01And all of these are multidestination traffic.
2:03So we either need to flood them out to specific leaf switches,
2:06or possibly flood them out to all of the leaf switches.
2:09And so the fact that this is multidestination traffic
2:12should clue us in to the fact that it's going
2:13to be B, multicast traffic.
2:15We're going to rely on multicast in the underlay
2:18to carry our traffic to the appropriate leaf switches.
2:20Notice, by the way, that this happens regardless
2:22of whether we're using flood and learn,
2:24which is the reference here, or EVPN.
2:27If we're going to use EVPN, then we
2:28should see a whole lot less unknown-unicast (U) traffic in the BUM traffic.
2:31But we'll still see plenty of broadcast and multicast
2:33traffic.
2:34And in those scenarios, we're still going to use multicast.
2:37Either way, in this scenario, B is our answer.
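To make the three BUM categories concrete, here is a minimal classifier for a destination MAC address. It relies on two standard Ethernet facts: the broadcast address is ff:ff:ff:ff:ff:ff, and a group (multicast) address has the least significant bit of its first octet set. The function name and the sample addresses are hypothetical.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def classify(dst_mac: str, known_macs: set) -> str:
    """Classify a destination MAC as broadcast, multicast, or (un)known unicast."""
    mac = dst_mac.lower()
    if mac == BROADCAST:
        return "broadcast"
    first_octet = int(mac.split(":")[0], 16)
    if first_octet & 0x01:
        # Group bit set: a multicast destination.
        return "multicast"
    if mac not in known_macs:
        # Unicast, but no table entry: this is the U in BUM.
        return "unknown-unicast"
    return "known-unicast"


# Hypothetical set of MAC addresses the leaf already knows about.
known = {"00:11:22:33:44:55"}

for mac in (BROADCAST, "01:00:5e:00:00:fb", "00:aa:bb:cc:dd:ee", "00:11:22:33:44:55"):
    print(mac, classify(mac, known))
```

Only the last case is forwarded as ordinary unicast; the first three are the multidestination traffic that rides the multicast tree in the underlay.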
2:40And last question, what protocol is leveraged by EVPN
2:43to propagate information to leaf switches?
2:51The answer-- D, Multiprotocol BGP.
2:54BGP is the protocol we're going to use because, remember,
2:56we said that BGP is more of an information carrying protocol.
3:00It's not locked into IPv4 or IPv6.
3:03We can use it to carry all kinds of information, including
3:06MAC address information, as well as IP address and VNI info.
3:10And so we put all of this inside of Multiprotocol BGP
3:13as part of our EVPN configuration.
3:15EIGRP and OSPF, these are IGPs.
3:18And they're only able to carry IPv4, or possibly
3:21IPv6 information if we're deploying
3:23the appropriate version.
3:24But even those are completely different versions
3:26of the protocols.
3:27Now, IS-IS, interestingly enough, it's also like BGP.
3:29It can carry a lot of different info.
3:31So it can carry MAC address information.
3:33We're going to see that IS-IS is used
3:35by OTV to propagate MAC address information, for example.
3:38And so IS-IS wouldn't have been a bad answer.
3:41However, in our case with EVPN, it's all about MP-BGP.
3:44And that wraps up this quiz.
3:45If you found that the information we covered
3:47didn't quite propagate to where it needed to go,
3:49be sure to go back and watch the appropriate videos in order
3:52to fill those knowledge gaps.
3:53Otherwise, congratulations on completing Explain VXLAN
3:56and EVPN.
3:56I hope this has been informative for you.
3:58And I'd like to thank you for viewing.