BGP Slow Convergence

Contents

1. Reduced the input queue drops by a factor of 10. Reduced convergence time by 50%. [Chart: x-axis is peer group members.]
   BGP Slow Convergence: Initial Convergence
   TCP path MTU discovery allows BGP to use the largest packets possible. Without PMTU discovery we can support 100 peers with 120,000 routes each. With PMTU discovery we can support 175 peers with 120,000 routes each. Note: this is 12.0(18)S; Cisco IOS Software can support more than this now. [Chart: x-axis is routes, 80K to 120K.]
   © 2007 Cisco Systems, Inc. All rights reserved. 13884_05_2007_c1.scr
   BGP Slow Convergence: Route Change Convergence
   There are two elements to route change convergence for BGP: How long does it take to see the failure? How long does it take to propagate information about the failure?
   For faster peer down detection, there are several tools you can use: fast layer two down detection; fast external fallover for directly connected eBGP peers; faster keepalive and dead interval timers (down to 3 and 9 seconds are commonly used today).
   BGP Slow Convergence: Route Change Convergence
   • Fast Session Deactivation: The address of each peer is registered with the Address Tracking Filter (ATF) system. When the state of the route changes, ATF notifies BGP. BGP tears down the peer impacted. BGP does not wait on th
2. BGP Speakers Won't Peer: Bad Messages
   %BGP-3-NOTIFICATION: sent to neighbor 2.2.2.2 2/2 (peer in wrong AS) 2 bytes 00C8 FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 002D 0104 00C8 00B4 0202 0202 1002 0601 0400 0100 0102 0280 0002 0202 00
   unknown subcode: The peer open notification subcode isn't known.
   incompatible BGP version: The version of BGP the peer is running isn't compatible with the local version of BGP.
   peer in wrong AS: The AS this peer is locally configured for doesn't match the AS the peer is advertising.
   BGP identifier wrong: The BGP router ID is the same as the local BGP router ID.
   unsupported optional parameter: There is an option in the packet which the local BGP speaker doesn't recognize.
   authentication failure: The MD5 hash on the received packet does not match the correct MD5 hash.
   unacceptable hold time: The remote BGP peer has requested a BGP hold time which is not allowed (too low).
   unsupported/disjoint capability: The peer has asked for support for a feature which the local router does not support.
   BGP Speaker Flap: Case Study
   • Here we see a message from bgp log-neighbor-changes telling us the hold timer expired.
   • We can double check this by looking at show ip bgp neighbor x.x.x.x | include last reset.
   R2#
   %BGP-5-ADJCHANGE: neighbor 10.1.1.1 Down BGP Notification sent
   %BGP-3-NOTIFICATION: sent to ne
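The hex dump in the "peer in wrong AS" NOTIFICATION above can be read by hand. The following is a minimal Python sketch, assuming the bytes after the 2-byte data field `00C8` are the raw OPEN message in the layout of RFC 4271; the function name `decode_open` is hypothetical and not part of any router tooling.

```python
import ipaddress
import struct

# Hex dump copied from the NOTIFICATION example, spaces removed.
OPEN_HEX = (
    "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF"      # 16-byte marker, all ones
    "002D0104" "00C800B4" "02020202"        # length/type/version, AS/hold time, BGP ID
    "1002060104000100010202800002020200"    # optional parameters length + parameters
)

def decode_open(raw: bytes) -> dict:
    """Decode the fixed part of a BGP OPEN message (hypothetical helper)."""
    marker, length, msg_type = struct.unpack("!16sHB", raw[:19])
    if marker != b"\xff" * 16 or msg_type != 1:
        raise ValueError("not a BGP OPEN message")
    version, my_as, hold_time, bgp_id, opt_len = struct.unpack("!BHH4sB", raw[19:29])
    return {
        "length": length,
        "version": version,
        "as": my_as,
        "hold_time": hold_time,
        "bgp_id": str(ipaddress.IPv4Address(bgp_id)),
        "opt_len": opt_len,
    }

info = decode_open(bytes.fromhex(OPEN_HEX))
print(info)
```

Read this way, the peer announced AS 200 (0x00C8), which is the value the local router rejected as the wrong AS.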
3. You set a community on 10.1.1.0/24. AS65200 translates this community into a Local Preference. AS65200 then prefers the route through AS65300 over the connected route. Don't count on this happening; most providers don't support RFC 1998 communities.
   Impacting Inbound Traffic Path
   Why can't I load share traffic between the two links? I've tried AS Path prepend; why doesn't it work? [Diagram: AS65100 advertises 10.1.1.0/24 over Path 1 and Path 2; other ASes shown are 65200, 65300, 65400, 65500, 65600.]
   Impacting Inbound Traffic Path
   Any traffic from AS65500 will always come through AS65200. Any traffic from AS65300 will always come through AS65300. There's no way to alter this. So if the majority of your traffic comes from AS65500, there's not much you can do.
   Impacting Inbound Traffic Path
   The only traffic you can really adjust with AS Path prepend is from AS65600. You can influence which path AS65600 will take: through AS65200 or through AS65300. This may or may not allow you to tune inbound traffic well.
4. BGP Speakers Won't Peer: Source/Destination Address Matching
   Both sides must agree on source and destination addresses.
   R1 configuration: neighbor 2.2.2.2 remote-as 100; neighbor 2.2.2.2 update-source loopback 0
   R1 and R2 do not agree on what addresses to use. BGP will tear down the TCP session due to the conflict. This points out configuration problems and adds some security.
   BGP Speakers Won't Peer: Source/Destination Address Matching
   R2 attempts to open a session to R1:
   BGP: 10.1.1.1 open active, local address 2.2.2.2
   R1 denies the session because of the address mismatch. debug ip bgp on R1 shows:
   BGP: 2.2.2.2 passive open to 10.1.1.1
   BGP: 2.2.2.2 passive open failed - 10.1.1.1 is not update-source Loopback0's address (1.1.1.1)
   BGP Speakers Won't Peer: Active vs Passive Peer
   Active Session: If the TCP session initiated by R1 is the one used between R1 & R2, then R1 actively established the session.
   Passive Session: For the same scenario, R2 passively established the session.
   R1 configuration: neighbor 2.2.2.2 remote-as 100; neighbor 2.2.2.2 transport connection-mode active
   R2 configuration: neighbor 10.1.1.1 remote-as 100; neighbor 10.1.1.1 transport connection-mode passive
   R1 actively opened the session. R2 passively accepted the se
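The address-matching rule above (each side's configured neighbor address must equal the address the other side sources the session from) can be expressed as a small check. This is a sketch in Python, not router code; `PeerConfig` and `session_addresses_agree` are hypothetical names, and the addresses are taken from the R1/R2 example.

```python
from dataclasses import dataclass

@dataclass
class PeerConfig:
    local_source: str  # address this router sources the TCP session from
    neighbor: str      # address in this router's neighbor statement

def session_addresses_agree(a: PeerConfig, b: PeerConfig) -> bool:
    """True when each side targets the address the other side sources from."""
    return a.neighbor == b.local_source and b.neighbor == a.local_source

# R1 sources from Loopback0 (1.1.1.1) and peers with 2.2.2.2.
r1 = PeerConfig(local_source="1.1.1.1", neighbor="2.2.2.2")
# R2 sources from 2.2.2.2 but targets R1's interface address 10.1.1.1.
r2 = PeerConfig(local_source="2.2.2.2", neighbor="10.1.1.1")

print(session_addresses_agree(r1, r2))  # the mismatch R1 reports in debug ip bgp
```

Pointing R2 at 1.1.1.1 instead makes both sides agree, which is the fix the debug output suggests.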
6. Even though I do AS Path Prepend.
   Impacting Inbound Traffic Path
   Why would AS65200 ever prefer Path 2 over Path 1? You pay for the AS65200 link. They pay for the AS65200 to AS65300 link. If they preferred Path 2, they would be paying to support your preferred inbound traffic path. There's not much of a chance of this happening.
   Impacting Inbound Traffic Path
   How does AS65200 implement this policy?
   • Routes received from customers are preferred over routes received from peers, using Local Preference.
   Adding AS Path hops won't overcome AS65200's Local Preference. So traffic from AS65500 will always come in through the AS65200 link, as long as you're advertising 10.1.1.0/24 through the link.
   Impacting Inbound Traffic Path: Possible Solutions
   Live with traffic from AS65200's peers coming in through this link. Use conditional advertisement. Conditional advertisement could be slow, though.
   Impacting Inbound Traffic Path: Possible Solutions
   Use RFC 1998 Communities
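Why prepending cannot beat the provider's Local Preference can be seen in a two-step comparison. The sketch below is a deliberately simplified model of best path selection (only Local Preference, then AS Path length); the Local Preference values 200 and 100 are assumptions chosen for illustration, not values from the slides.

```python
def best_path(paths):
    """Pick the path with the highest Local Preference; AS Path length only breaks ties."""
    return max(paths, key=lambda p: (p["local_pref"], -len(p["as_path"])))

# AS65200's two candidate routes to 10.1.1.0/24.
customer_route = {
    "name": "direct customer link (Path 1)",
    "local_pref": 200,                        # assumed customer preference
    "as_path": [65100, 65100, 65100, 65100],  # AS65100 prepended three times
}
peer_route = {
    "name": "via peer AS65300 (Path 2)",
    "local_pref": 100,                        # assumed peer preference
    "as_path": [65300, 65100],
}

print(best_path([customer_route, peer_route])["name"])
```

The customer route still wins, because AS Path length is only consulted when Local Preference is equal.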
7. Consumes large amounts of memory.
   • BGP now uses the route refresh capability to rebuild the local table if the local filters change.
   Routing Problems
   BGP Routing Problems
   • Route Reflector Loops
   • Route Reflector Suboptimal Routes
   • Inbound Traffic Path Problems
   Route Reflector Loops
   • Router B: BGP next hop Router A; local next hop Router A
   • Router C: BGP next hop Router B; local next hop Router D
   • Router D: BGP next hop Router E; local next hop Router C
   • Router E: BGP next hop Router A; local next hop Router A
   [Diagram labels: Set Next Hop Self.]
   Route Reflector Loops
   This results in a permanent routing loop. Route reflectors must always follow the topology. Never peer through a route reflector client to reach a route reflector.
   Route Reflector Suboptimal Routing
   Route reflectors can also cause routing to be different or suboptimal compared to full mesh iBGP. E advertises 10.1.1.0/24 through eBGP to both B and C. The local preference, MED, AS Path length, and all other attributes are the same for 10.1.1.0/24 at
8. NHT scan
   bgp nexthop trigger delay <0-100>
   May lower default value as we gain experience.
   • Event driven model allows BGP to react quickly to IGP changes. No longer need to wait as long as 60 seconds for BGP to scan the table and recalculate bestpaths. Tuning your IGP for fast convergence is recommended.
   BGP Slow Convergence: Route Change Convergence
   • Dampening is used to reduce frequency of triggered scans.
   • show ip bgp internal displays data on when the last NHT scan occurred and the time until the next NHT may occur (dampening information).
   • New commands: bgp nexthop trigger enable; bgp nexthop trigger delay <0-100>; show ip bgp attr next-hop ribfilter; debug ip bgp events nexthop; debug ip bgp rib-filter
   • Full BGP scan still happens every 60 seconds. Full scanner will no longer recalculate bestpaths if NHT is enabled.
   BGP Slow Convergence: Route Change Convergence
   • How is the timer enforced for peer X? Timer starts when all routes have been advertised to X. For the next MRAI seconds we will not propagate any bestpath changes to peer X. Once X's MRAI timer expires, send him updates and withdraws. Restart the timer and the process repeats.
   • User may see a wave of updates and withdraws to peer X every MRAI.
   • User will NOT see a delay of MRAI between each individual update and/or withdraw. BGP would probably never converge if this was the ca
9. Utilization
   • Next Hop Tracking
   • High Memory Utilization
   High Processor Utilization
   • Why? This could be for several reasons. High route churn is the most likely.
   router# show process cpu
   CPU utilization for five seconds: 100%/0%; one minute: 99%; five minutes: 81%
   139 6795740 1020252 6660 88.34% 91.63% 74.01% 0 BGP Router
   High Processor Utilization
   • Check how busy the peers are.
   The Table Version: You have 150k routes and see the table version increase by 150k every minute: something is wrong. You have 150k routes and see the table version increase by 300 every minute: sounds like normal network churn.
   The InQ: Flood of incoming updates or build up of unprocessed updates.
   The OutQ: Flood of outgoing updates or build up of untransmitted updates.
   router# show ip bgp summary
   Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
   [address illegible] 4 64512 309453 157389 19981 0 253 22:06:44 111633
   172.16.1.1 4 65101 188934 1047 40081 41 0 00:07:51 58430
   High Processor Utilization
   • If the Table Version is Changing Quickly: Are you in initial convergence with this peer? Is the peer flapping for some reason? Examine the table entries from this peer; why are they changing? If there is a group of routes which are constantly chang
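The table-version rule of thumb above (150k routes with 150k version changes per minute is trouble; 300 per minute is ordinary churn) can be turned into a rough ratio check. This Python sketch is illustrative only: the function names and the 0.5 threshold are assumptions, not values from the slides.

```python
def churn_ratio(route_count: int, version_delta_per_min: int) -> float:
    """Table-version increase per minute as a fraction of the table size."""
    return version_delta_per_min / route_count

def looks_abnormal(route_count: int, version_delta_per_min: int,
                   threshold: float = 0.5) -> bool:
    """Flag churn when a large share of the table changes every minute.

    The threshold is an arbitrary illustration, not a documented limit.
    """
    return churn_ratio(route_count, version_delta_per_min) >= threshold

print(looks_abnormal(150_000, 150_000))  # whole table rewritten each minute
print(looks_abnormal(150_000, 300))      # about 0.2% of the table per minute
```

In practice the two samples would come from successive runs of show ip bgp summary taken a minute apart.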
10. both B and C. [Diagram: IGP costs toward 10.1.1.0/24.]
   Route Reflector Suboptimal Routing
   • Assume A, B, C, and D are configured for full mesh iBGP.
   • A chooses B as its exit point because of the IGP cost.
   • D chooses C as its exit point because of the IGP cost.
   Route Reflector Suboptimal Routing
   • Assume B, C, and D are configured as route reflector clients of A. A chooses B as its best path because of the IGP cost. A reflects this choice to C, but C chooses its locally learned eBGP route over the internal route through B. A reflects this choice to D, and D chooses the path through B even though the path through C is shorter.
   Route Reflector Suboptimal Routing
   • There is little you can do about this.
   • Whenever you remove routing information, you risk suboptimal routing.
   • Keeping the route reflector topology in line with the layer 3 topology helps.
   • iBGP multipath can resolve some of these problems, at the cost of additional memory.
   • Otherwise, use policy to choose the best exit point.
   Impacting Inbound Traffic Path
   I'm in AS65100. Why does my traffic come in through AS65200 and AS65300, although I want it to come in through AS65300 only
11. Cisco Live / Networkers, June 22-26, 2008, Orlando, FL. The Power of Collaboration.
   Troubleshooting BGP, BRKRST-3320
   Overview
   • Troubleshooting Peers
   • BGP Convergence
   • High Utilization
   • BGP Routing Problems
   Troubleshooting Peers
   BGP Speakers Won't Peer
   This can be difficult to troubleshoot if you can only see one side of the connection. Start simple: check for common mistakes. Is it supposed to be configured for eBGP multihop? Are the AS numbers right? Next, try pinging the peering address. If the ping fails, there's likely a connectivity problem. Ping fails: no peering.
   BGP Speakers Won't Peer
   • Try some alternate ping options. Is the local peering address the actual peering interface? If not, use extended ping to source from the loopback or actual peering address. If this fails, there is an underlying routing problem. The other router may not know how to reach your peering interface.
   Protocol [ip]:
   Target IP address: 192.168.40.1
   Datagram size [100]:
   Extended commands: y
   Source address or interface: 172.16.23.2
   BGP Speakers Won't Peer
   • Try extended ping to sweep a range of pos
12. This can be very chatty, so be careful with this debug.
   R1# sh log [timestamps illegible]
   TCP0: state was ESTAB -> FINWAIT1 [12345 -> 2.2.2.2(179)]
   TCP0: sending FIN
   TCP0: state was FINWAIT1 -> FINWAIT2 [12345 -> 2.2.2.2(179)]
   TCP0: FIN processed
   TCP0: state was FINWAIT2 -> TIMEWAIT [12345 -> 2.2.2.2(179)]
   TCP0: Connection to 2.2.2.2:179, advertising MSS 1460
   TCP0: state was CLOSED -> SYNSENT [12346 -> 2.2.2.2(179)]
   TCP0: state was SYNSENT -> ESTAB [12346 -> 2.2.2.2(179)]
   TCP0: Connection to 2.2.2.2:179, received MSS 1460, MSS is 1460
   BGP Speakers Won't Peer
   • If the connectivity is good, the next step is to check BGP itself.
   debug ip bgp: Use with caution. Configure so the output goes to the log rather than the console: logging buffered <size>; no logging console. It's easier to find the problem points this way:
   router# show log | i NOTIFICATION
   BGP Speakers Won't Peer
   • show ip bgp neighbor 1.1.1.1 | include last reset: This should give you the resets for a peer. The same information as is shown through debug ip bgp.
   • bgp log-neighbor-changes: Provides much of the same information as debug ip bgp as well.
13. e hold timer to expire
   BGP Slow Convergence: Route Change Convergence
   • Very dangerous for iBGP peers. The IGP may not have a route to a peer for a split second, and FSD would tear down the BGP session. Imagine if you lose your IGP route to your RR (Route Reflector) for just 100 ms.
   • Off by default: neighbor x.x.x.x fall-over
   BGP Slow Convergence: Route Change Convergence
   • ATF can also be used to track changes in next hops. iBGP recurses onto an IGP next hop to find a path through the local AS. Changes in the IGP cost or reachability are normally seen only by the BGP scanner. Since the scanner runs every 60 seconds by default, this means iBGP convergence can take up to 60 seconds on an IGP change.
   BGP Slow Convergence: Route Change Convergence
   BGP Next Hop Tracking: Enabled by default: [no] bgp nexthop trigger enable. BGP registers all nexthops with ATF. A hidden command will let you see a list of nexthops: show ip bgp attr nexthop. ATF will let BGP know when a route change occurs for a nexthop. ATF notification will trigger a lightweight BGP Scanner run. Bestpaths will be calculated. None of the other Full Scan work will happen.
   BGP Slow Convergence: Route Change Convergence
   • Once an ATF notification is received, BGP waits 5 seconds before triggering
14. ighbor 1.1.1.1 4/0 (hold time expired) 0 bytes
   R2# show ip bgp neighbor 10.1.1.1 | include last reset
   Last reset 00:01:02, due to BGP Notification sent, hold time expired
   BGP Speaker Flap: Case Study
   There are lots of possibilities here: R1 has a problem sending keepalives. The keepalives are lost in the cloud. R2 has a problem receiving the keepalive.
   BGP Speaker Flap: Case Study
   • Did R1 build and transmit a keepalive for R2? debug ip bgp keepalive; show ip bgp neighbor
   • When did we last send or receive data with the peer?
   R2# show ip bgp neighbors 1.1.1.1
   BGP neighbor is 1.1.1.1, remote AS 100, external link
   BGP version 4, remote router ID 1.1.1.1
   BGP state = Established, up for 00:12:49
   Last read 00:00:45, last write 00:00:44, hold time is 180, keepalive interval is 60 seconds
   • If R1 did not build and transmit a KA: How is R1 on memory? What is R1's CPU load? Is R2's TCP window open?
   BGP Speaker Flap: Case Study
   R2# show ip bgp sum | begin Neighb
15. ing, consider route flap dampening.
   • If the InQ is high: You should see the table version changing quickly. If it's not, the peer isn't acting correctly. Consider shutting it down until the peer can be fixed.
   • If the OutQ is high: Lots of updates are being generated. Check table versions of other peers. Check for underlying transport problems.
   High Processor Utilization
   • Check on the BGP Scanner: Walks the table looking for changed next hops. Checks conditional advertisement. Imports from and exports to VPNv4 VRFs.
   router# show processes | include BGP Scanner
   172 Lsi 407A1BFC 29144 29130 1000 8384/9000 0 BGP Scanner
   High Processor Utilization
   • To relieve pressure on the BGP Scanner: Upgrade to newer code. Most of the work of the BGP Scanner has been moved to an event driven model; this has reduced the impact of BGP Scanner significantly. Reduce route and view count. Reduce or eliminate other processes which walk the RIB (SNMP routing table walks, for instance). Deploy BGP Next Hop Tracking (NHT).
   Next Hop Tracking
   ATF is a middle man between the RIB and RIB clients. BGP, OSPF, EIGRP, etc. are all clients of the RIB. A client tells ATF what prefixes he is interested in.
   • ATF tracks each prefix. Notify the client when the route to a registered prefix changes. Client is responsible for taking action based on ATF notification. Provides a scalable even
16. BGP Slow Convergence: Initial Convergence
   • Initial convergence is limited by:
   The number of packets required to transfer the entire BGP database: the number of routes; the ability of BGP to pack routes into a small number of packets; the number of peer specific policies.
   TCP transport issues: How often does TCP go into slow start? How much can TCP put into one packet?
   BGP Slow Convergence: Initial Convergence
   BGP starts a packet by building an attribute set. It then packs as many destinations (NLRIs) as it can into the packet. Only destinations with the same attribute set can be placed in the packet. Destinations can only be put into the packet until it's full. [Diagram: more efficient vs. less efficient packing.]
   First rule of thumb: to increase convergence speed, decrease unique sets of attributes.
   BGP Slow Convergence: Initial Convergence
   The larger the packet BGP can build, the more destinations it can put in the packet. The more you can put in a single packet, the less often you have to repeat the same attributes. [Diagram: more efficient vs. less efficient packing.]
   Second rule of thumb: allow BGP to use the largest packets possible.
   BGP Slow Convergence Initial C
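The first two rules of thumb above follow from how routes are packed into UPDATE messages. The following Python sketch estimates the number of UPDATEs needed, assuming a 4096-byte maximum message (RFC 4271), a 60-byte attribute set, and 4 bytes per /24 NLRI; the last two sizes and the name `updates_needed` are illustrative assumptions.

```python
import math
from collections import defaultdict

MAX_MSG = 4096  # maximum BGP message size in RFC 4271
FIXED = 23      # 19-byte header plus two 2-byte length fields

def updates_needed(routes, attr_bytes=60, nlri_bytes=4):
    """Estimate UPDATE count for an iterable of (prefix, attribute_set_key).

    Only prefixes sharing an attribute set can ride in the same UPDATE,
    and each UPDATE repeats the attribute set once.
    """
    per_set = defaultdict(int)
    for _prefix, attrs in routes:
        per_set[attrs] += 1
    room = (MAX_MSG - FIXED - attr_bytes) // nlri_bytes  # prefixes per UPDATE
    return sum(math.ceil(count / room) for count in per_set.values())

shared = [(f"10.{i // 256}.{i % 256}.0/24", "set-A") for i in range(1000)]
unique = [(prefix, f"set-{i}") for i, (prefix, _) in enumerate(shared)]

print(updates_needed(shared))  # one attribute set shared by all prefixes
print(updates_needed(unique))  # every prefix carries its own attribute set
```

Under these assumptions the same 1000 prefixes need one UPDATE when they share attributes and 1000 UPDATEs when they do not.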
17. n a lower scale: in the hundreds, not the thousands. Neighbor adjacencies in IGPs normally pick up different routes, rather than the same route multiple times.
   • Each view takes up some amount of space. 250,000 routes x 100 views = a lot of memory usage.
   High Memory Utilization: Views and Routes
   To reduce memory consumption:
   Reduce the number of routes. This is particularly true in providers supporting L3VPN services; the route and view count can escalate quickly when supporting many customers' L3VPNs. Filter aggressively. Accept partial routing tables rather than full routing tables.
   Reduce the number of views. Use route reflectors rather than full mesh iBGP peering. Peer only when needed.
   High Memory Utilization: Attributes
   BGP implementations build their memory structures around minimizing storage.
   • Attributes are stored once, rather than once per route. Each route references an attribute set rather than storing the attribute set.
   • This is similar to the way BGP updates are formed.
   [Diagram: routes referencing AS Path 1, AS Path 2, Community Set 1, Community Set 2.]
   High Memory Utilization: Attributes
   • The more unique attribute sets you're receiving, the more unique attribute sets […]. You might have the same number of routes and views over time, but memory utilization can increase. Community Se
18. onvergence
   BGP must create packets based on the policies towards each peer.
   Third rule of thumb: minimize the number of unique policies towards eBGP peers.
   BGP Slow Convergence: Initial Convergence
   • TCP Interactions: Each time a TCP packet is dropped, the session goes into slow start. It takes a good deal of time for a TCP session to come out of slow start.
   Fourth rule of thumb: try to reduce the circumstances under which a TCP segment will be dropped during initial convergence.
   BGP Slow Convergence: Initial Convergence
   • Bottom Line:
   Hold down the number of unique attributes per route. Don't send communities if you don't need to, etc.
   Hold down the number of policies towards eBGP peers. Try to find a small set of common policies, rather than individualizing policies per peer.
   Stop TCP segment drops. Increase input queues. Increase SPD thresholds. Make certain links are clean.
   BGP Slow Convergence: Initial Convergence
   Here we see the results of setting up maximum sized input queues: a single router running 12.0(18)S; 100 to 500 peers in a single peer group; sending 100,000 routes to each peer. [Chart: convergence in minutes and input queue drops.]
   Increasing the input queue sizes
19. or
   Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
   [Two samples of the summary, taken at least one BGP keepalive interval apart; the rows are partly illegible. MsgRcvd is 53 and MsgSent is 10167 in both samples.]
   The number of packets generated is increasing, but the number of packets transmitted is not increasing. The keepalives aren't leaving R2.
   BGP Speaker Flap: Case Study
   • Go back to square one and check the IP connectivity. This is a layer 2 or 3 transport issue, etc.
   R1# ping 10.2.2.2
   Type escape sequence to abort.
   Sending 5, 100-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
   Success rate is 100 percent (5/5), round-trip min/avg/max = 16/21/24 ms
   R1# ping ip
   Target IP address: 10.2.2.2
   Repeat count [5]:
   Datagram size [100]: 1500
   Timeout in seconds [2]:
   Extended commands [n]:
   Sweep range of sizes [n]:
   Type escape sequence to abort.
   Sending 5, 1500-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
   Success rate is 0 percent (0/5)
   BGP Convergence
   BGP Slow Convergence
   Hey! Who are you calling slow? Slow is a relative term. BGP probably won't ever converge as fast as any of the IGPs.
   • Two general convergence situations: initial startup between peers; route changes between existing peers.
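In the case study above, 100-byte pings succeed and 1500-byte pings fail, so the next question is where between those sizes the path starts dropping packets. The Python sketch below shows one way to narrow that down with a binary search; `probe` is a hypothetical callable standing in for one extended ping at a given size, and the search assumes that every size below a passing size also passes.

```python
def largest_passing_size(probe, low=100, high=1500):
    """Return the largest datagram size in [low, high] for which probe() succeeds.

    probe(size) -> bool is supplied by the caller (for example, a wrapper
    around a ping with the DF bit set). Returns None if even `low` fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid        # mid passes: the limit is at or above mid
        else:
            high = mid - 1   # mid fails: the limit is below mid
    return low

# Simulated path that silently drops anything larger than 1476 bytes.
simulated_path = lambda size: size <= 1476
print(largest_passing_size(simulated_path))
```

The size at which the probe starts failing is the MTU to check on the interfaces along the path.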
20. riggered scans.
   • show ip bgp internal displays data on when the last NHT scan occurred and the time until the next NHT may occur (dampening information).
   • New commands: bgp nexthop trigger enable; bgp nexthop trigger delay <0-100>; show ip bgp attr next-hop ribfilter; debug ip bgp events nexthop; debug ip bgp rib-filter
   • Full BGP scan still happens every 60 seconds. Full scanner will no longer recalculate bestpaths if NHT is enabled.
   High Memory Utilization: Views and Routes
   Why is BGP taking up so much memory? A BGP speaker generally receives a number of copies of the same route or set of routes. Each of these copies of the same route or routes is called a view. [Diagram: A has two views of 10.1.1.0/24.]
   High Memory Utilization: Views and Routes
   Multiple views can come from:
   iBGP peers peering with the same remote AS.
   iBGP peers peering with remote ASes with generally the same table. This is common in the case of the global Internet.
   eBGP peers peering with the same remote AS.
   eBGP peers peering with remote ASes with generally the same table. This is common in the case of the global Internet.
   High Memory Utilization: Views and Routes
   • Multiple views exist in IGPs as well, but not on the same scale. Neighbor adjacencies in IGPs are generally o
21. se
   BGP Slow Convergence: Route Change Convergence
   MRAI timeline for iBGP peer [timeline axis: t5 to t25]:
   Bestpath Change 1 at t7 is TXed immediately. MRAI timer starts at t7, will expire at t12.
   Bestpath Change 2 at t10 must wait until t12 for MRAI to expire.
   Bestpath Change 2 is TXed at t12. MRAI timer starts at t12, will expire at t17.
   MRAI expires at t17; no updates are pending.
   BGP Slow Convergence: Route Change Convergence
   • BGP is not a link state protocol. It may take several rounds/cycles of exchanging updates and withdraws for the network to converge.
   • MRAI must expire between each round.
   • The more fully meshed the network and the more tiers of ASes, the more rounds required for convergence.
   • Think about: How many tiers of ASes there are in the Internet. How meshy peering can be in the Internet.
   BGP Slow Convergence: Route Change Convergence
   Full mesh is the worst case MRAI convergence scenario. R1 will send a withdraw to all peers for 10.0.0.0/8.
   • Count the number of rounds of UPDATEs and withdraws until the network has converged. Note how MRAI slows convergence. Blue path is the bes
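The MRAI timeline in item 21 (change 1 at t7 sent immediately, change 2 at t10 held until t12) can be reproduced with a few lines of Python. This is a simplified per-peer model written for illustration; the function name `mrai_send_times` is hypothetical, and the model ignores everything except the timer.

```python
def mrai_send_times(change_times, mrai):
    """Return the times at which batches of updates are sent to one peer.

    A change is sent at once if no MRAI timer is running; otherwise it waits
    for the timer to expire and is sent together with any other waiting
    changes. Each send restarts the timer.
    """
    sends = []
    expires = float("-inf")
    for t in sorted(change_times):
        if sends and t <= sends[-1]:
            continue  # already covered by the batch scheduled at sends[-1]
        send_at = t if t >= expires else expires
        sends.append(send_at)
        expires = send_at + mrai
    return sends

# iBGP example from the timeline: changes at t7 and t10, MRAI of 5 seconds.
print(mrai_send_times([7, 10], mrai=5))
```

A third change at t11 would ride in the same t12 batch, which is the "wave of updates every MRAI" behavior rather than a delay between each individual update.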
22. sible MTUs.
   Protocol [ip]:
   Target IP address: 192.168.40.1
   Repeat count [5]:
   Source address or interface:
   Validate reply data? [no]:
   Data pattern [0xABCD]:
   Loose, Strict, Record, Timestamp, Verbose [none]:
   Sweep range of sizes [n]: y
   Note the MTU at which the ping starts to fail. Make certain the interface is configured for that MTU size.
   • If these all fail (none of the pings work, no matter how you try): It's likely a transport problem. Drop back and punt.
   BGP Speakers Won't Peer
   • Remember that BGP runs on top of IP, and can be affected by: rate limiting; traffic shaping; tunneling problems; IP reachability problems (the underlying routing isn't working); TCP problems; etc.
   BGP Speakers Won't Peer: Useful Peer Troubleshooting Commands
   show tcp brief all
   show tcp statistics
   TCB Local Address Foreign Address (state)
   [Rows illegible; one session in ESTAB state and two in LISTEN.]
   [Receive counters partly illegible] 0 out of order packets (0 bytes); 4186 ack packets (73521 bytes)
   Sent: 9150 Total, 0 urgent packets; 4810 control packets (including 127 retransmitted); 2172 data packets (71504 bytes)
   BGP Speakers Won't Peer: Useful Peer Troubleshooting Commands
   debug ip tcp transactions
23. ssion.
   Can be configured: neighbor x.x.x.x transport connection-mode {active | passive}
   BGP Speakers Won't Peer: Active vs Passive Peer
   Use show ip bgp neighbor to determine if a router actively or passively established a session.
   R1# show ip bgp neighbors 2.2.2.2
   BGP neighbor is 2.2.2.2, remote AS 200, external link
   BGP version 4, remote router ID 2.2.2.2
   [snip]
   Local host: 1.1.1.1, Local port: 12343
   Foreign host: 2.2.2.2, Foreign port: 179
   TCP open from R1 to R2's port 179 established the session. This tells us that R1 actively established the session.
   BGP Speakers Won't Peer: Session Collisions
   Both speakers initiate their sessions at the same time. The active session established by the peer with the highest router ID is the winner. This rarely happens. Not an issue if this does occur. [Diagram: R1 and R2 neighbor configurations, partly illegible; the legible lines are neighbor 10.1.1.1 remote-as 100 and neighbor 10.1.1.1 transport connection-mode passive.]
   BGP Speakers Won't Peer: Time to Live
   BGP uses a TTL of 1 for eBGP peers (default TTL). For eBGP peers that are more than 1 hop away, a larger TTL must be used: neighbor x.x.x.x ebgp-multihop [2-255]. [Diagram: AS65000 and AS65001.]
   R1# show ip bgp neighbors 2.2.2.2 | inc External BGP
   [snip]
   External BGP neighbor may be up to 1 hops away.
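The session collision rule above (the session opened by the peer with the highest router ID wins) depends on comparing router IDs as 32-bit numbers, not as text. A short Python sketch, with the hypothetical helper name `collision_winner`:

```python
import ipaddress

def collision_winner(router_id_a: str, router_id_b: str) -> str:
    """Return the router ID whose actively opened session survives a collision.

    Router IDs are compared as unsigned 32-bit integers.
    """
    a = int(ipaddress.IPv4Address(router_id_a))
    b = int(ipaddress.IPv4Address(router_id_b))
    if a == b:
        # Identical IDs are a configuration error (the "BGP identifier wrong" case).
        raise ValueError("router IDs must differ")
    return router_id_a if a > b else router_id_b

print(collision_winner("1.1.1.1", "2.2.2.2"))
print(collision_winner("10.0.0.1", "9.255.255.255"))
```

The second call shows why the numeric comparison matters: compared as strings, "9.255.255.255" would sort above "10.0.0.1".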
24. t 4
   High Memory Utilization: Attributes
   • To Conserve Memory: Strip unneeded attributes on the inbound side of eBGP peering sessions. Verify you don't really need them, or they aren't useful after the route has transited your AS. Communities are the biggest (only?) target. Use communities wisely within your network. A large mishmash of communities can consume memory.
   High Memory Utilization: Soft Reconfiguration
   • B advertises 10.1.1.0/24 to A.
   • A filters the route locally.
   • The filters on A are changed to permit 10.1.1.0/24. But how does A relearn 10.1.1.0/24?
   [Diagram: B advertises 10.1.1.0/24; blocked by filter on A.]
   High Memory Utilization: Soft Reconfiguration
   • With soft reconfiguration, A saves all the routes it receives from B. A applies any inbound filters between this saved copy of B's updates and the local BGP table. If the local filters change, they can be applied by simply pulling all the updates from the local copy into the local BGP table.
   [Diagram: B advertises 10.1.1.0/24; local copy; local filter; blocked by filter.]
   High Memory Utilization: Soft Reconfiguration
   • Keeping this local copy uses a lot of memory. In general, don't use soft reconfiguration.
25. t driven model for dealing with RIB changes.
   Next Hop Tracking
   BGP tells ATF to let us know about any changes to 10.1.1.3 and 10.1.1.5. ATF filters out any changes for 10.1.1.1/32, 10.1.1.2/32, and 10.1.1.4/32.
   • Changes to 10.1.1.3/32 and 10.1.1.5/32 are passed along to BGP.
   [Diagram: BGP nexthops 10.1.1.3 and 10.1.1.5; RIB entries 10.1.1.1/32 through 10.1.1.5/32.]
   Next Hop Tracking
   • BGP Next Hop Tracking: Enabled by default: [no] bgp nexthop trigger enable
   • BGP registers all nexthops with ATF. A hidden command will let you see a list of nexthops: show ip bgp attr nexthop
   • ATF will let BGP know when a route change occurs for a nexthop. ATF notification will trigger a lightweight BGP Scanner run. Bestpaths will be calculated. None of the other Full Scan work will happen.
   Next Hop Tracking
   • Once an ATF notification is received, BGP waits 5 seconds before triggering NHT scan: bgp nexthop trigger delay <0-100>. May lower default value as we gain experience.
   • Event driven model allows BGP to react quickly to IGP changes. No longer need to wait as long as 60 seconds for BGP to scan the table and recalculate bestpaths. Tuning your IGP for fast convergence is recommended.
   Next Hop Tracking
   • Dampening is used to reduce frequency of t
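The register-and-notify relationship between BGP and ATF described above is a publish/subscribe pattern. The Python sketch below models it with the prefixes from the slide; the class and method names are hypothetical and do not correspond to any IOS internals.

```python
from typing import Callable, Dict, List

class AddressTrackingFilter:
    """Toy model of ATF: pass RIB changes only to clients that registered the prefix."""

    def __init__(self) -> None:
        self._clients: Dict[str, List[Callable[[str], None]]] = {}

    def register(self, prefix: str, callback: Callable[[str], None]) -> None:
        self._clients.setdefault(prefix, []).append(callback)

    def rib_change(self, prefix: str) -> bool:
        """Deliver a RIB change; return True if any client was notified."""
        callbacks = self._clients.get(prefix, [])
        for callback in callbacks:
            callback(prefix)
        return bool(callbacks)

seen_by_bgp: List[str] = []
atf = AddressTrackingFilter()
for nexthop in ("10.1.1.3/32", "10.1.1.5/32"):   # BGP registers its nexthops
    atf.register(nexthop, seen_by_bgp.append)

for host in range(1, 6):                         # the RIB changes all five routes
    atf.rib_change(f"10.1.1.{host}/32")

print(seen_by_bgp)
```

Only the two registered nexthops reach BGP; the other three changes are filtered out, which is what keeps the event driven model scalable.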
26. tpath

BGP Slow Convergence: Route Change Convergence
• R1 withdraws 10.0.0.0/8 to all peers.
• R1 starts a MRAI timer for each peer.
[Figure: message exchange among R1, R2, R3 and R4; legend: Withdraw, Denied Update, Update.]

BGP Slow Convergence: Route Change Convergence
• R2, R3 & R4 recalculate their bestpaths.
• R2, R3 & R4 send updates based on new bestpaths.
• R2, R3 & R4 start a MRAI timer for each peer.
• End of Round 1.
[Figure: Round 1 message exchange; legend: Withdraw, Denied Update, Update.]

BGP Slow Convergence: Route Change Convergence
• R2, R3 & R4 recalculate their bestpaths.
• R2, R3 & R4 must wait for their MRAI timers to expire.
• R2, R3 & R4 send updates and withdraws based on their new bestpaths.
• R2, R3 & R4 restart the MRAI timer for each peer.
• End of Round 2.
[Figure: Round 2 message exchange; legend: Withdraw, Denied Update, Update.]

BGP Slow Convergence: Route Change Convergence
• R3 & R4 recalculate their bestpaths.
• R3 & R4 must wait for their MRAI timers to expire.
• R3 & R4 send updates and withdraws based on their new bestpaths.
• R3 & R4 restart the MRAI timer for each peer.
• End of Round 3.
[Figure: Round 3 message exchange; legend: Withdraw, Denied Update, Update.]

BGP Slow Convergence: Route Change Con
27. vergence
• R2, R3 & R4 took 3 rounds of messages to converge.
• MRAI timers had to expire between the 1st and 2nd rounds and between the 2nd and 3rd rounds.
• Total MRAI convergence delay for this example: iBGP mesh 10 seconds; eBGP mesh 60 seconds.
[Figure: legend: Withdraw, Denied Update, Update.]

BGP Slow Convergence: Route Change Convergence
• Internet churn means we are constantly setting and waiting on MRAI timers.
• One flapping prefix slows convergence for all prefixes.
• The Internet table sees roughly 6 bestpath changes per second.
• For iBGP and PE-CE eBGP peers: neighbor x.x.x.x advertisement-interval 0. Will be the default in 12.0(32)S.
• For regular eBGP peers: lowering to 0 may get you dampened. OK to lower for eBGP peers if they are not using dampening.

BGP Slow Convergence: Route Change Convergence
• Will a MRAI of 0 eliminate batching? Somewhat, but not much happens anyway.
• TCP, the operating system, and BGP code provide some batching:
  process all messages from peer InQs;
  calculate bestpaths based on received messages;
  format UPDATEs to advertise new bestpaths.
• What about CPU load from a 0-second MRAI? The Internet table has about 6 bestpath changes per second, which is easy for a router to handle. 5 seconds of delay is not needed.

High Utilization
• High Processor
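The MRAI delay totals quoted in item 27 follow from simple arithmetic: with three rounds of messages, the timers must expire twice. A minimal Python sketch, assuming per-peer MRAI values of 5 seconds for iBGP and 30 seconds for eBGP; these values are inferred from the 10-second and 60-second totals rather than stated in the excerpt, and the function name is hypothetical.

```python
def mrai_convergence_delay(rounds, mrai_seconds):
    """Delay added by MRAI timers when convergence takes `rounds` rounds.

    A timer has to expire between each pair of consecutive rounds,
    so the timers are waited on (rounds - 1) times.
    """
    if rounds < 1:
        raise ValueError("at least one round of messages is required")
    return (rounds - 1) * mrai_seconds


ibgp_delay = mrai_convergence_delay(rounds=3, mrai_seconds=5)    # assumed iBGP MRAI
ebgp_delay = mrai_convergence_delay(rounds=3, mrai_seconds=30)   # assumed eBGP MRAI
zero_delay = mrai_convergence_delay(rounds=3, mrai_seconds=0)    # advertisement-interval 0
```

Under these assumptions the sketch reproduces the 10-second and 60-second figures, and shows that an advertisement interval of 0 removes the timer-induced delay for the same three rounds.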
