
Message from jstapleton@computer-business.com

If sufficient router memory to hold full Internet BGP tables is a concern, you might want to consider a software-based router, like Vyatta.  Adding memory is cheap and easy when you are dealing with standards-based architecture.

 

Personally, I can’t wait to get one of these $99 software-based routers at my house:  http://www.ubnt.com/edgemax.

According to the Tolly report linked below, it provides 145X more Kpps per USD than Cisco and 205X more Kpps per USD than Juniper:

http://dl.ubnt.com/Tolly212127UbiquitiEdgeRouterLitePricePerformance.pdf

 

From: The EDUCAUSE Network Management Constituent Group Listserv [mailto:NETMAN@LISTSERV.EDUCAUSE.EDU] On Behalf Of Green, Doug
Sent: Tuesday, October 30, 2012 9:00 AM
To: NETMAN@LISTSERV.EDUCAUSE.EDU
Subject: Re: [NETMAN] ISP aggregation

 

In order to run BGP, you’ll need an Autonomous System Number (ASN) if you don’t already have one. Apply to ARIN https://www.arin.net/resources/request/asn.html

You will need sufficient router memory to hold internet tables. Full internet BGP tables are huge, but there are techniques to filter incoming tables to essential routes to reduce memory usage.
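To make the filtering idea a bit more concrete, here is a rough Python model of one possible approach: keep the default route plus only those routes whose AS path is short (the upstream itself and its direct customers). This is only an illustration of the concept, not router configuration; the `Route` type, the `max_as_path_len` cutoff, and the example ASNs and prefixes are all made up for the sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str               # e.g. "198.51.100.0/24"
    as_path: Tuple[int, ...]  # ASNs as received, nearest AS first


def filter_partial_table(routes: List[Route], max_as_path_len: int = 2) -> List[Route]:
    """Keep the default route plus routes with a short AS path.

    The cutoff of 2 (upstream + its direct customer) is an assumption
    chosen for illustration, not a recommendation.
    """
    kept = []
    for route in routes:
        if route.prefix == "0.0.0.0/0" or len(route.as_path) <= max_as_path_len:
            kept.append(route)
    return kept


# Hypothetical feed from one upstream (AS 64500, a private-use ASN).
full_table = [
    Route("0.0.0.0/0", (64500,)),
    Route("198.51.100.0/24", (64500, 64510)),
    Route("203.0.113.0/24", (64500, 64520, 64530, 64540)),
]
partial = filter_partial_table(full_table)
```

Anything filtered out is still reachable through the default route, which is why the memory saving does not cost reachability.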

You might also want to read up on “AS prepending” and “BGP communities” in order to help with load balancing.

We found Cisco’s BGP 4-day training class very helpful.

One last note: BGP was never designed to balance links, so you’ll need to keep an eye on your link usage and make adjustments from time to time.

 

Best of luck with this.

 

Doug

 

Douglas Green - Network Architect

University of New Hampshire - Regional Optical Network Dept.

131 Main Street - 307 Nesmith Hall

Durham NH 03824

Lat long 43.138  -70.936

(603) 862-4921 desk - (603) 978-1180 mobile

http://NetworkNHnow.org/

“If I had more time, I would have written a shorter letter.” - Cicero

 

 

 

From: The EDUCAUSE Network Management Constituent Group Listserv [mailto:NETMAN@LISTSERV.EDUCAUSE.EDU] On Behalf Of Pete Hoffswell
Sent: Tuesday, October 30, 2012 8:45 AM
To: NETMAN@LISTSERV.EDUCAUSE.EDU
Subject: Re: [NETMAN] ISP aggregation

 

Good morning - 

 

We run a Cisco 7206 VXR with lots of memory into a pair of ASA firewalls as well.  We peer with two separate ISPs.

 

We keep the bandwidth equal to each ISP, and use BGP to announce our networks.  We use default routes only, for the most part, and not a full BGP table.

 

Happy to give further detail if needed.
-
Pete Hoffswell - Network Manager
pete.hoffswell@davenport.edu
http://www.davenport.edu
616-732-1101

Comments

Hi Bruce,
I don't think anyone has yet mentioned that BGP multihoming is a very inexact science.  Multiple upstreams are very good and stable for resiliency if one goes down.  Load-sharing is another kettle of fish.  I will assume that you (like us) have a far greater amount of inbound traffic than outbound traffic.  (Large research universities may reverse this pattern.)  One can roughly distribute the inbound traffic with AS-path prepends, but conditions change: upstreams form different peering relationships, or an Akamai cluster gets put in one place or taken out of another, and the traffic pattern changes.  To quote from Jeff Doyle and Jennifer Carroll's Routing TCP/IP volume 2: "Do not be too concerned if 75 percent of your traffic uses one link while only 25 percent of your traffic uses the other link.  Multihoming is for redundancy and increased routing efficiency, not load balancing."

We tend to rebalance our inbound traffic only when one link is approaching 100% and the other link has a great deal of unused bandwidth.  We take directly connected and default routes from our upstreams, and do a little source routing to share the outbound, though it really isn't necessary for us, since the outbound is not large compared to inbound.  For us there is no value in taking a full internet routing table.  YMMV.

best,
Dennis Bohn
Manager of Network and Systems
Adelphi University
bohn@adelphi.edu
5168773327


One more thing... I sleep much better at night having a separate router for each upstream with the internal addresses in HSRP configuration.  They are actually in two separate buildings, with the fiber coming in on two separate conduit paths and crossing the (East) river in two separate tunnels and ultimately going to separate carrier hotels.  I think it was after the bridge collapse in Minneapolis that folks learned that, though they had different upstreams, most fiber went over the same bridge.  Oops.
Dennis Bohn
Manager of Network and Systems
Adelphi University
bohn@adelphi.edu
5168773327


Message from mark.duling@biola.edu

I just looked at that quote in Google Books and it is referring to "load balancing."  If multihoming simply means having multiple paths and/or ISPs, I think the "multihoming is for redundancy" statement is an overgeneralization.

Load sharing can quite effectively be done over multiple links in a deterministic fashion by policy routing outbound (PBR in Cisco-speak), and using AS-PATH prepends so that return traffic comes back in on whatever pipe the packet was routed out, unless that pipe is down.  You select which internal networks will use which pipes (I'm assuming NAT is in use) and set your dynamic NAT statements accordingly.  This method is highly deterministic and very simple to set up, but still very flexible, since it is easy to adjust the dynamic NAT statements to fine-tune the traffic loads if adjustments are needed over time.

So if costs or your budget circumstances don't permit two similarly sized pipes, you can still get the benefit of a total bandwidth increase for a reasonable cost by adding a pipe using this method.  Very unequal pipe sizes aren't a problem, because the combination of PBR outbound and AS-PATH prepends inbound gives the route determinacy I've described.  Now obviously, in this case you wouldn't get redundancy in any effective way if you have pipes of sizes 1X and 3X and are using 80% of the total bandwidth, since you can't lose the larger pipe without a devastating degradation of service.  But after setting up the multihomed infrastructure, and assuming you want at some point to be able to survive an ISP link failure, you can upgrade the smaller pipe quite easily when the money is available or the costs come down.
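To put rough numbers on the 1X / 3X example, here is a small Python sketch of the arithmetic. It only computes how oversubscribed the surviving link would be if one pipe failed while the pair is running at a given overall utilization; the function name and the link labels are made up for illustration.

```python
def load_ratio_after_failure(link_sizes, utilization):
    """For each link, return offered load / surviving capacity if THAT link fails.

    link_sizes: dict of label -> capacity (any consistent unit)
    utilization: fraction of TOTAL capacity currently in use (0.0 - 1.0)
    A ratio above 1.0 means the surviving link(s) cannot carry the load.
    """
    total = sum(link_sizes.values())
    demand = total * utilization
    ratios = {}
    for failed, size in link_sizes.items():
        surviving = total - size
        ratios[failed] = demand / surviving
    return ratios


# Pipes of size 1X and 3X, using 80% of the total (3.2X of 4X).
ratios = load_ratio_after_failure({"small": 1.0, "large": 3.0}, 0.80)
for failed, ratio in ratios.items():
    print(f"lose {failed} pipe: surviving capacity is asked to carry {ratio:.0%}")
```

With these inputs, losing the large pipe leaves the small one facing roughly 3.2 times its capacity, and even losing the small pipe leaves the large one slightly over 100%.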

All to say, if getting more bandwidth for a decent price is more critical for you, or more attainable for some reason at a given point in time, than path redundancy, there is nothing wrong with that.  We are multihomed exactly this way and are quite happy with the results.  Dennis, we also have ASAs and a Cisco border router, so I could share the relevant configuration components with you if you want to contact me offline for details.



Message from peter.charbonneau@williams.edu

I'll "second" this architecture.  We have used this for years and years.

Cisco's Policy Based Routing (pushing the outbound data) is predicated on link up and link down.  Unfortunately, for us, we have intermediate switches between us and our ISPs.  There have been many times that an ISP's link has gone down (mostly for maintenance), the BGP routes for that ISP get dropped, but the router is still trying to send/route data out that interface, because it sees the link as "up".  Therefore, this becomes a maintainability issue ... someone here has to shut the interface down at the router, so outbound data will flow correctly.

The issue, afterward, is ... "How do you know when that interface is really back up and ready for bidirectional traffic?"

**sigh**

P

As long as you have a more up-to-date IOS version, IP SLA / object tracking should address that issue.

 


Tim is correct.  Look up IP SLA, and the TRACK feature in Cisco IOS.  That's what I use to manage this issue.
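For anyone who has not used tracking before, the general idea can be modeled in a few lines of Python. This is not IOS syntax and not how IP SLA is implemented; it is just a sketch of the logic: probe something beyond the local interface, declare the path down only after several consecutive failures, and declare it up again only after several consecutive successes, so a flapping link is not put back into service too early. The class name and the thresholds are assumptions for illustration.

```python
class TrackedPath:
    """Up/down state driven by reachability probes, with hysteresis."""

    def __init__(self, down_after=3, up_after=5):
        self.down_after = down_after  # consecutive failures before "down"
        self.up_after = up_after      # consecutive successes before "up"
        self.is_up = True
        self._streak = 0              # consecutive results contradicting is_up

    def record_probe(self, success: bool) -> bool:
        """Feed one probe result; return the (possibly updated) state."""
        if success == self.is_up:
            # Result agrees with the current state: reset the counter.
            self._streak = 0
        else:
            self._streak += 1
            threshold = self.up_after if success else self.down_after
            if self._streak >= threshold:
                self.is_up = success
                self._streak = 0
        return self.is_up


# Two failed probes take the path down; it needs three good probes to return.
path = TrackedPath(down_after=2, up_after=3)
states = [path.record_probe(ok) for ok in [False, False, True, True, True]]
print(states)
```

The policy route would then be applied only while the tracked state is up, which addresses the "link is up but the ISP is gone" case with intermediate switches.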


-
Pete Hoffswell - Network Manager
pete.hoffswell@davenport.edu
http://www.davenport.edu
616-732-1101


Message from peter.charbonneau@williams.edu

Many thanks!

p


We have not found AS path prepending to be very deterministic.  You can be very deterministic for outgoing using “local pref”, but incoming often does not respond very well to prepending, and at least in our case the vast majority of the traffic is incoming. We have also found that once you get it all set up, something does indeed change, and your well-crafted schemes get messed up.  We do not do NAT and are balancing chunks of a Class B and more, so maybe that is part of the issue for us.

 

Thank you for pointing out the fallacy of thinking that load balancing between two links can achieve both more available bandwidth and redundancy. The two are mutually exclusive goals unless you can implement per-IP-address rate limiting very quickly when you fail the oversubscribed traffic over onto the one remaining link.

 

If you simply need more bandwidth it is usually much cheaper to just increase the volume from your current provider as the cost per megabit per month is always going down, and you can get quantity discounts for it just like anything else.

 

What we do now is route all our traffic in and out of one provider, and fail it over to a backup provider who gives us a burstable service. The burst will support our current traffic levels, and if the failure lasts long enough we will have to pay extra, which we would be glad to do at that point to keep our link up, given its increasing importance.

 

Pete Morrissey

 


I haven't jumped in yet, but I feel compelled to add my experiences.

I am using BGP to provide both redundancy and load sharing.

I have two ISPs, both connected by Metro-Ethernet-like services over fiber optics.  ISP #1 connects to us by two geographically diverse fiber paths.  ISP #2 uses a single path which is geographically diverse from the two paths of ISP #1.  This gives us protection from what I call "Dump-Truck attacks" where telephone poles are lost due to automobile accidents, and "Backhoe Attacks" where the underground fiber is damaged.  Yes, we've experienced both...

Our Internet router peers with each of the ISPs by BGP.  Our network uses a single public Class B for networking.  Inside our network we give subnets to each building and each WiFi VLAN.  (WiFi VLANs are assigned to clients by student class or by employment / guest status.)

With each ISP I advertise our full Class B network.  I also advertise half of the largest bandwidth consumers' subnets to each provider.  For example, with ISP #1 I advertise dorms #1, 3, 5, and the Freshman and Sophomore WiFi VLANs.  Then with ISP #2 I advertise dorms #2, 4, 6, and the Junior and Senior WiFi VLANs.  Based on occasional statistics I gather, this covers 80% or more of our bandwidth usage.  Any other bandwidth is routed to us however the remote ISPs see fit.
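The reason this works is longest-prefix matching: a more specific announcement wins over the covering Class B wherever both are visible. Here is a minimal Python sketch of that behavior using the standard `ipaddress` module. The prefixes are placeholders (172.16.0.0/16 stands in for the real public Class B), and the function and ISP labels are made up for illustration.

```python
import ipaddress


def inbound_candidates(dest_ip, announcements):
    """Return the ISPs announcing the longest prefix that covers dest_ip.

    announcements: dict of ISP label -> list of prefixes announced to that ISP.
    More than one ISP in the result means the remote networks decide.
    """
    addr = ipaddress.ip_address(dest_ip)
    best_len, best_isps = -1, []
    for isp, prefixes in announcements.items():
        for prefix in prefixes:
            net = ipaddress.ip_network(prefix)
            if addr not in net:
                continue
            if net.prefixlen > best_len:
                best_len, best_isps = net.prefixlen, [isp]
            elif net.prefixlen == best_len and isp not in best_isps:
                best_isps.append(isp)
    return best_isps


# Placeholder addressing: the /16 goes to both ISPs, heavy subnets are split.
announcements = {
    "ISP1": ["172.16.0.0/16", "172.16.1.0/24"],  # e.g. dorm 1
    "ISP2": ["172.16.0.0/16", "172.16.2.0/24"],  # e.g. dorm 2
}
print(inbound_candidates("172.16.1.10", announcements))   # dorm 1 host
print(inbound_candidates("172.16.2.10", announcements))   # dorm 2 host
print(inbound_candidates("172.16.200.1", announcements))  # everything else

# If ISP1 goes away, its more-specifics are withdrawn and the /16 via ISP2 still covers dorm 1.
after_failure = {"ISP2": announcements["ISP2"]}
print(inbound_candidates("172.16.1.10", after_failure))
```

Because the covering network is announced to both providers, the split gives load sharing in normal operation and the remaining provider picks up everything during a failure.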

The result is our inbound bandwidth is very well balanced, they share the same daily peaks and troughs, and I've had very few complaints.

Our outbound bandwidth is small compared to inbound.  I send it out either one link or the other.  If it became a problem I could institute policy-based routing or something similar.

My experience with AS-Path prepends is that the remote ISPs are going to make their own decisions on routing, and there's not much I can do about it.  AS-Path prepends are just a suggestion to the ISP and aren't often taken seriously.

There are two philosophical schools of thought about tools.  As my philosophy professor once told our class, a tool has an intended use, i.e. a screwdriver is "for" driving screws.  Any other use of that tool is a misuse of it.  However, I subscribe to the other school, which declares that a good tool is one that can be used elegantly in ways never imagined by its designer.

Best,
Matt

--
Matt Richard '08
Access and Security Coordinator
Information Technology Services
Franklin & Marshall College
matt.richard@fandm.edu


Message from mark.duling@biola.edu

I'm not a routing expert, but it seems to me all routing metrics are merely suggestions strictly speaking.  It all depends on how they are used.  But in the context of the scheme I described with single links to multiple ISPs, it seems to me that prepending our ASN multiple times to route advertisements, combined with outbound policy routing, is very highly deterministic.  We have two pipes of wildly different bandwidths right now (though I hope we'll be able to equalize this soon) and it would be painfully obvious if AS-PATH prepending in this scenario were anything less than highly deterministic.

If I prepend my own ASN 4 times in a route advertisement to the immediate upstream router of one of my ISPs, I think that effectively tells it that the other link to me is actually four hops away.  It isn't, but I want him to think that so he won't try a supposed 4-hop route unless the other is down.  Ditto for the other ISP with the link preference reversed.  So maybe I'm missing something, but I don't see what kind of event there could be in the Internet that would make my immediate upstream peer decide it prefers a 4-hop route to a 1-hop route unless it goes down, which is the desired behavior in our case.  I've also never seen anything other than this behavior in our deployment.



One useful thing to do in general would be to read the BGP path selection algorithm document from your favorite router vendor. Here's the one from Juniper:

https://www.juniper.net/techpubs/en_US/junos11.4/topics/reference/genera...

As you can see, localpref trumps as-path length (this is also generally true of other vendors). So if I am a big content provider who happens to peer with both ISPs that you use, it doesn't matter how many times you prepend if I have localpref set to prefer sending my traffic to you via one of your ISPs.

Let's say you have ISP 1 and ISP 2, and you want most of your traffic going over ISP 1. You therefore prepend in your announcements to ISP 2 and don't prepend to ISP 1. But I (the Big Content Provider) have a much bigger pipe to ISP 2, so I localpref your routes to ISP 2, and thence shall the traffic flow, despite your AS prepending. Even if ISP 1 and ISP 2 peer with each other, ISPs rarely localpref routes learned from peers higher--or the same--as routes learned from direct customers. So the traffic will get to you directly via ISP 2.

So BGP really _is_ deterministic, it is just not deterministic from the standpoint of any one AS that is trying to manage inbound traffic. :)

This also affects *which* ISPs to choose as your upstream providers, since they will have different customer and peering relationships with other ISPs and content providers, and these relationships can make traffic engineering easier (or harder).

michael


On 11/6/2012 5:56 PM, Michael Sinatra wrote:
> As you can see, localpref trumps as-path length (this is also generally
> true of other vendors). So if I am a big content provider who happens
> to peer with both ISPs that you use, it doesn't matter how many times
> you prepend if I have localpref set to prefer sending my traffic to you
> via one of your ISPs.

Yes, and localpref only works (for you) for outbound traffic. Inbound is quite different, you must "effect policy" on your upstreams.

Some providers will accept communities on prefixes that they will in turn tag with *their* localpreferences. For others, you can send "more specific" prefixes for traffic than you announce to other providers. Some providers will accept communities that influence to which of their peers they will announce a particular prefix, so you can extend the "more-specific" override beyond the immediate upstream. "More specific" prefixes trump everything :)

For all cases, you can often tag with the "no-export" community to keep those advertisements local to your upstream only (will affect their local customers only, if you don't want to fight the whole internet over traffic engineering your prefixes).

Prepending is essentially network-wide (and for providers using localpref, can be totally ignored). If you have multiple links to a single provider, these can be reasonably used to load-balance the links. Beyond that, with multiple providers, "it depends".

> This also affects *which* ISPs to choose as your upstream providers,
> since they will have different customer and peering relationships with
> other ISPs and content providers, and these relationships can make
> traffic engineering easier (or harder).

Yes. If your upstreams support these functions (localpref communities, and honor no-export, and peering communities) it helps greatly.

Jeff

**********
Participation and subscription information for this EDUCAUSE Constituent Group discussion list can be found at http://www.educause.edu/groups/.

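To illustrate the localpref-versus-prepend point, here is a deliberately simplified Python model of the first two comparisons in BGP best-path selection: higher local preference wins, and AS-path length is only consulted when local preference ties. Real implementations have many more steps (and some vendors add their own ahead of these), so treat this as a sketch. The `Path` type and the private-use ASNs are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Path:
    via: str                  # which of the customer's ISPs this path goes through
    local_pref: int           # set by the REMOTE network, not by the customer
    as_path: Tuple[int, ...]  # as seen by the remote network


def best_path(paths: List[Path]) -> Path:
    """Higher local_pref wins; shorter AS path only breaks local_pref ties."""
    return max(paths, key=lambda p: (p.local_pref, -len(p.as_path)))


CUSTOMER = 64512          # hypothetical campus ASN
ISP1, ISP2 = 64501, 64502  # hypothetical upstream ASNs

# The campus prepends four extra times toward ISP 2 to push traffic to ISP 1.
via_isp1 = (ISP1, CUSTOMER)
via_isp2 = (ISP2,) + (CUSTOMER,) * 5

# Case 1: the remote network treats both ISPs equally, so prepending works.
equal = [Path("ISP1", 100, via_isp1), Path("ISP2", 100, via_isp2)]
print(best_path(equal).via)

# Case 2: the content provider localprefs routes learned via ISP 2.
preferred = [Path("ISP1", 100, via_isp1), Path("ISP2", 200, via_isp2)]
print(best_path(preferred).via)
```

In the second case the prepends are never even compared, which matches the "despite your AS prepending" scenario described above.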
Message from mark.duling@biola.edu

Ok, so I now get that local preference trumps.  I guess the question is how likely this is to actually occur.  It hasn't been a problem for us, but it would be a big problem if it happened because of the unequal sized pipes we have at the moment.  From what I'm hearing, it sounds like some large service such as Akamai, wired in a different way to enhance regional performance, could change our inbound routing behavior.

I see in this NANOG presentation it mentions a method (different than what we're discussing) that "works best in a network that can be divided into (usually regional) ISP transit islands that are usually limited to a metro area".  http://www.nanog.org/meetings/nanog45/presentations/Monday/Roisman_bgp_metric_N45.pdf



We are in the process of building this out now.  I don't know your situation but in our neighborhood one vendor was half the price of the rest when we were upgrading last.  So we went with a model of having one ISP as a primary and the second as a hot standby.  We have 300 Mb for our primary connection, 100 for Admin and 200 for the students.  Our backup connection is just 100 Mb for the Admin side.  We figure that we have to keep our vitals up, i.e. the website, DNS, remote access, email.
 
We are in the process of setting this up with BGP now.  Keep in mind that crossing the 100 Mb barrier for routing can be a slap budget-wise.  We shopped around and settled on a pair of used Cisco 3825s from Network Hardware Resale = VERY affordable.  They come with 2 copper Gb ports and an SFP.  The SFP is handy b/c ISPs are just handing off a layer 2 Ethernet fiber connection these days.
 
As for outbound routing, we will just be learning our default route via BGP from our primary ISP.  We will have a static route to our backup ISP with a higher cost than the learned BGP route.  If our primary goes away, so will the learned default route, and we will fail over to our secondary for outbound traffic.  Between the edge routers and the FWs we will be running HSRP.
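The outbound failover described here is the usual "floating static" idea: the backup default route is always configured but loses to the BGP-learned default as long as that one is present. A minimal Python sketch of the selection logic follows. The distance of 20 matches the usual Cisco default for eBGP routes; the 250 for the static route and all the names are just assumptions for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DefaultRoute:
    next_hop: str
    distance: int  # lower is preferred (administrative distance)
    present: bool  # False once the route is withdrawn or removed


def active_default(routes: List[DefaultRoute]) -> Optional[str]:
    """Pick the next hop of the lowest-distance default route still present."""
    usable = [r for r in routes if r.present]
    if not usable:
        return None
    return min(usable, key=lambda r: r.distance).next_hop


bgp_default = DefaultRoute("primary-isp", distance=20, present=True)
floating_static = DefaultRoute("backup-isp", distance=250, present=True)
table = [bgp_default, floating_static]

print(active_default(table))  # normal operation: primary

bgp_default.present = False   # primary BGP session drops, default withdrawn
print(active_default(table))  # failover: backup
```

Note that the failover only happens if the BGP-learned default actually disappears, so it depends on the BGP session (not just the physical link) going down when the primary fails.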
 
As for inbound, we will be prepending the advertised routes 4-5 times to the backup ISP, per our agreement with them.  We actually negotiated a 0 CIR circuit.  We can burst up to 100 Mb for 3 days, after which we have to pay.
 
John Kaftan
Infrastructure Manager
Utica College
 
 

 
You can trump localpref/weight with BGP conditional advertisement; however, ISP2 can be used only as backup in this case.

Regards,
Adrian

-----Original Message-----
From: Michael Sinatra [mailto:michael@RANCID.BERKELEY.EDU]
Sent: Tuesday, November 06, 2012 4:56 PM
Subject: Re: ISP aggregation

One useful thing to do in general would be to read the BGP path selection algorithm document from your favorite router vendor. Here's the one from Juniper:

https://www.juniper.net/techpubs/en_US/junos11.4/topics/reference/general/routing-ptotocols-address-representation.html

As you can see, localpref trumps as-path length (this is also generally true of other vendors). So if I am a big content provider who happens to peer with both ISPs that you use, it doesn't matter how many times you prepend if I have localpref set to prefer sending my traffic to you via one of your ISPs.

Let's say you have ISP 1 and ISP 2, and you want most of your traffic going over ISP 1. You therefore prepend in your announcements to ISP 2 and don't prepend to ISP 1. But I (the Big Content Provider) have a much bigger pipe to ISP 2, so I localpref your routes to ISP 2, and thence shall the traffic flow, despite your AS prepending. Even if ISP 1 and ISP 2 peer with each other, ISPs rarely localpref routes learned from peers higher--or the same--as routes learned from direct customers. So the traffic will get to you directly via ISP 2.

So BGP really _is_ deterministic, it is just not deterministic from the standpoint of any one AS that is trying to manage inbound traffic. :)

This also affects *which* ISPs to choose as your upstream providers, since they will have different customer and peering relationships with other ISPs and content providers, and these relationships can make traffic engineering easier (or harder).

michael

On 11/6/12 2:39 PM, Mark Duling wrote:
> I'm not a routing expert, but it seems to me all routing metrics are
> merely suggestions strictly speaking. It all depends on how they are
> used. But in the context of the scheme I described with single links
> to multiple ISPs, it seems to me that appending multiple AS-PATHs onto
> route advertisements combined with outbound policy routing is very
> highly deterministic. We have two pipes of wildly different
> bandwidths right now (though I hope we'll be able to equalize this
> soon) and it would be painfully obvious if AS-PATH prepending in this
> scenario were anything less than highly deterministic.
>
> If I append my own ASN 4 times into a route advertisement to the
> immediate upstream router of one of my ISPs, I think that effectively
> tells it that the other link to me is actually four hops away. It
> isn't, but I want him to think that so he won't try a supposed 4-hop
> route unless the other is down. Ditto for the other ISP with the link
> preference reversed. So maybe I'm missing something, but I don't see
> what kind of event there could be in the Internet that would make my
> immediate upstream peer decide it prefers a 4-hop route to a 1-hop
> route unless it goes down, which is the desired behavior in our case.
> I've also never seen anything other than this behavior in our deployment.
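
Michael's point that local preference outranks AS-path length can be illustrated with a deliberately simplified comparison. The sketch below models only two steps of the selection process (highest local preference, then shortest AS path); real implementations evaluate more attributes. The `Path` class, the `best_path` helper, the local-preference values, and the ASNs (taken from the range reserved for documentation) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Path:
    via: str
    local_pref: int           # set by the network making the choice
    as_path: Tuple[int, ...]  # set (and possibly prepended) by the announcer


def best_path(paths: List[Path]) -> Path:
    """Prefer the highest local_pref; AS-path length only breaks ties."""
    return min(paths, key=lambda p: (-p.local_pref, len(p.as_path)))


# Documentation-range ASNs, used purely for illustration.
CAMPUS, ISP1, ISP2 = 64496, 64497, 64498

# The campus does not prepend toward ISP 1 and prepends four times toward ISP 2.
via_isp1 = Path("ISP 1", local_pref=100, as_path=(ISP1, CAMPUS))
via_isp2 = Path("ISP 2", local_pref=200, as_path=(ISP2,) + (CAMPUS,) * 5)

# The content provider has raised local_pref on routes heard via ISP 2,
# so the longer, prepended path still wins.
print(best_path([via_isp1, via_isp2]).via)
```

If both paths carried the same local preference, the shorter path via ISP 1 would win, which is the behavior Mark describes seeing from his immediate upstreams.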
Hi All,
A really interesting talk.  I noted that presentation is vectored for content providers balancing traffic out.  From slide 3:
• Techniques are different for inbound versus outbound BGP load balancing at your network’s border
• Today’s talk focuses on outbound
• Typically for content networks, outbound traffic is the billable value, the benefit is that there is more control on outbound load balancing
• Technique used mostly for ISP uplinks, peering uplinks are usually considered differently
• Inbound load balancing is less precise, and often relies on behavior of upstream and further remote networks

I think where the inbound load balancing gets a little difficult, is when there are two (roughly) equal-sized pipes running for a portion of the day at somewhat > 50%, maybe up to 80%.  This as compared with a large circuit and a small one.  With two largish circuits, one of them can go down without too much problem.  During the storm last week, we lost two of our upstreams (we are fortunate enough to have three), one for a day and one for several days without any trouble for the campus (partially because traffic was low.)  OTOH,  in 2001, we had a large pipe and a small pipe.  In the WTC event, we lost the large pipe for several months, and it was really a problem.  I would not want to risk that kind of situation ten years later, when everyone is so much more dependent on the internet, when university business really grinds to a halt without a robust internet connection.  
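
To put rough numbers on the large-pipe/small-pipe risk described above, here is a small sketch that computes, for each single-link failure, what fraction of peak load the surviving links could carry. The function name, the link sizes, and the peak load are all made-up values for illustration, not figures from any campus mentioned in this thread.

```python
from typing import Dict


def coverage_after_failure(capacities_mbps: Dict[str, float],
                           peak_load_mbps: float) -> Dict[str, float]:
    """Map each link to the share of peak load (capped at 1.0) that the
    remaining links could carry if that one link failed."""
    total = sum(capacities_mbps.values())
    return {
        name: min(1.0, (total - cap) / peak_load_mbps)
        for name, cap in capacities_mbps.items()
    }


# Hypothetical 1 Gb/s of total capacity and a 600 Mb/s peak, split two ways.
equal_pair = {"isp_a": 500.0, "isp_b": 500.0}
large_small = {"isp_a": 900.0, "isp_b": 100.0}

print(coverage_after_failure(equal_pair, 600.0))
print(coverage_after_failure(large_small, 600.0))
```

With the equal pair, losing either link leaves room for most of the peak load; with the large/small split, losing the large link leaves only a small fraction of it.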

My $.02. 

Dennis Bohn
Manager of Network and Systems
Adelphi University
bohn@adelphi.edu
5168773327


Message from mark.duling@biola.edu

Dennis,

Yes, I noted the scenario for the presentation was different from the ones we're discussing.  I was just wondering if the part I quoted from it could be relevant to the "it depends" part of the possible inbound unpredictability others have referred to.  We've got a routing trump card in play here but no way yet for me to tell how likely it is I'll encounter it.  It's hard for me to tell if people are talking about possibilities more theoretical than actual or not.

Say, what type of event was it that caused one of your pipes to be down for several months?

Mark


Message from mark.duling@biola.edu

Oh I didn't get that WTC must mean World Trade Center.


Hi Mark,
I think it depends on who that smaller upstream is, and what else is going on at the POP or carrier hotel which hosts it.  Four prepends is quite a good bit, clearly enough to keep most traffic away from that link.  Say that the small circuit is a customer of Level3, and Level3 puts an Akamai server cluster in that POP.  Traffic would likely flow through that smaller pipe, even if it originated from the larger pipe.  I have noticed that without any changes on our part, our Microsoft updates shift from link to link, based on where the Akamai presence is.  A few months ago, there was a Google server very close by one of our upstreams: <5 msec away.  Then something happened and it was 20 msec away.  Now there is some movement and it is 10 msec away.  Changing relationships between ISPs?  New servers?  Who knows?  

I wouldn't mind seeing your configs, see if there is something else I should be doing.  Oh, and yes it was the World Trade Center.... there was a large connection point right across west street from the WTC that was taken out for months.  
Dennis Bohn
Manager of Network and Systems
Adelphi University
bohn@adelphi.edu
5168773327


Message from jcolunio@elmira.edu

Greetings,

First, I want to apologize if this subject has already been covered here. We are looking into using two ISPs for greater bandwidth, redundancy, and aggregation. We have looked into a couple of devices AND a couple of different ways to handle this, but the lines seem to blur somewhat.

I want to ask anyone who has already done this to please give me any insights you have. We have looked into Fatpipe, which promises a solution, but there have been questions raised (especially about incoming traffic, IF the ISP currently hosting our DNS goes down). I think that the only other solution that was brought up is BGP.

Please, let me know if you have used either of these methods AND what your experience was.

Thanks so much for all your help.

Jim

--
Jim Colunio
Network Administrator
Elmira, College
One Park Place
Elmira, NY 14901
Ph. (607) 735-1921
********** Participation and subscription information for this EDUCAUSE Constituent Group discussion list can be found at http://www.educause.edu/groups/.

Jim,

 

We have two sources, each terminating in different computer rooms.  The reason for different termination points is business continuity in the event of computer room fire, etc.

The principal ISP line is configured with BGP.

The secondary line (lower speed) has about a dozen vendor-provided IP addresses that appear on the most mission-critical services with multi-port ethernet cards.

Currently, we must manually update our DNS zone file to bring the college up to full service when we test failure of the primary internet line.

We are working on an open-source solution for automatic DNS update for a hands-free solution.

 

We do not attempt aggregation of the two sources (400/400 and 100/10 Mbps), but do route “visitor” traffic through the secondary line as a “canary in the coal mine”.

 

Neil Fay

Hood College

 

At Ursinus College in Pennsylvania we multi-home with two different ISPs and advertise the college-owned publicly routed networks in our BGP Autonomous System to both. We do not rely on ISP-owned networks for anything, so that we could shift ISPs easily if we ever wanted to. We “manually” load-balance our traffic at a gross level by preferring one ISP for some of our public address spaces, and the other ISP for others, via the BGP advertisements. Either path could carry all of our traffic (with potentially reduced performance) if the other ISP “went away”, however. Our authoritative DNS is local, and we have established a Disaster Recovery environment for selected (public-facing) services with a DR company called BlueLock, who would assume our AS advertisements should we ever declare an emergency to them.
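
One way to picture the per-prefix preference described above is as a small planning table: every prefix is announced to both ISPs, but made less attractive on the ISP that is not its favourite. The message does not say which mechanism is used to express the preference; the sketch below assumes AS-path prepending. The function name, the prepend count of 3, the ISP labels, and the documentation-range prefixes are all hypothetical.

```python
from typing import Dict, List, Tuple


def advertisement_plan(preferred: Dict[str, str],
                       isps: List[str],
                       prepends: int = 3) -> Dict[Tuple[str, str], int]:
    """Return {(prefix, isp): prepend_count}. Each prefix is announced to every
    ISP, with no prepends toward its preferred ISP and `prepends` elsewhere."""
    plan: Dict[Tuple[str, str], int] = {}
    for prefix, favourite in preferred.items():
        if favourite not in isps:
            raise ValueError(f"{favourite!r} is not one of the configured ISPs")
        for isp in isps:
            plan[(prefix, isp)] = 0 if isp == favourite else prepends
    return plan


# Documentation prefixes standing in for college-owned address space.
plan = advertisement_plan(
    {"192.0.2.0/24": "isp_a", "198.51.100.0/24": "isp_b"},
    isps=["isp_a", "isp_b"],
)
for (prefix, isp), count in sorted(plan.items()):
    print(f"{prefix} -> {isp}: prepend x{count}")
```

Because every prefix is still announced on both links, either ISP can carry all traffic if the other disappears, which matches the failover behavior described in the message.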

 

JD

 

Here at Franklin & Marshall College we do exactly the same thing as Ursinus College, so I won't repeat what John said.  We have pairs of routers, firewalls and bandwidth managers, but only one set is active at any time and the traffic from both Internet connections goes through the active set.

Alan


Message from jcoehoorn@york.edu

We use a product called Untangle to aggregate two distinct connections from two ISPs. This product effectively is our firewall and router, and also does some traffic shaping for us. I am very happy with it.


Joel Coehoorn
Director of Information Technology
York College, Nebraska
402.363.5603
jcoehoorn@york.edu

 

The mission of York College is to transform lives through Christ-centered education and to equip students for lifelong service to God, family, and society



Jim,

What others have said I won't echo (we also do basically what Ursinus does), other than to say that I strongly recommend that you go to ARIN and purchase institutionally owned IP address space if you don't already own your own block. While you're at it, register IPv6 space even if you're not planning on using it for a while. It makes your life a lot easier if you are not dependent on your ISP for that resource. 

You ask about incoming traffic. Well, there's nothing you can do about that. No matter what you do to balance your outgoing traffic using either an appliance like Fatpipe or clever dynamic routing, you can't control which pipe incoming traffic chooses. Well, maybe a little. BGP is a least cost algorithm, which means you can add cost to a particular route by making it look like there are extra hops. Still, that's pretty crude and if you are using something like Fatpipe you need to accept a certain amount of "split horizon" routing (more precisely, asymmetric routing), meaning connections where the outgoing path is one pipe and the incoming path is another. That is not normally a problem unless one of the paths is congested or flapping. 

Unless you have a very abnormal traffic flow for a college, I wouldn't recommend something like Fatpipe, because most of your traffic will be inbound and you won't have congestion problems even if ALL your outbound traffic goes on just one of your connections.  

In terms of DNS, I recommend running your own DNS servers, it's not that difficult. Ask your ISPs to be secondary servers in case yours goes down, and think about running your own secondary server either in the cloud or at a sister institution. Again, this is easier if you own your own address space. 

By the way, I'm assuming here that you don't have enough IP address space to assign all your internal endpoints public addresses. If you do, I can make some other, more complicated suggestions. Call me.

Good luck! 

 - Mark
--
Mark Berman, Chief Information Officer
Siena College
515 Loudon Road
Loudonville, NY  12211
(518)782-6957,  Fax: (518)783-2590
Siena College is a learning community advancing the ideals of a liberal arts education, rooted in its identity as a Franciscan and Catholic institution.



Just as a practical addendum on this:

 

Late on Friday 2/28 one of our two ISPs had an internal problem with their network. Some of their internal ISP routers came under a DoS attack, and the result was that performance for us and many of this ISP’s other customers was impossibly bad, even though our link to them never actually went down.

 

Once we became aware of the issue and realized that it was an ISP thing and not local, we just disabled the physical interface on our router that connected to them. After about 30 seconds' worth of route convergence, all of our IP traffic was flowing in and out of our second ISP (which has plenty of bandwidth) and all was wonderful. About 12 hours later, when we were comfortable that everything was stable in the problem ISP's environment, we reactivated that first ISP link, and everything transparently rolled back over to its normal configuration.  It was all very quick and clean, and the weighted BGP advertisements through dual-homed ISP connections did exactly what they were supposed to do.

 

John Daggett

Ursinus College