mlan
Ultimate Member
Posts:
752
Joined:
Thu Nov 17, 2011 6:09 pm

Troubleshooting TCAM saturation - hl3mm process on 3750-X

Tue Dec 06, 2011 2:36 pm

Hey all,

We are evaluating a 3750X-24S stack for closet aggregation along with some layer-3 routing/PBR (please don't ask why). With the SDM routing template enabled, this switch is limited to a nominal 3K (3,072) unicast MAC addresses in the TCAM table(s). The output below reports a maximum of 3,292 with 4,275 in use, so we are over the limit:

Code: Select all
switch#sh plat tcam util     

CAM Utilization for ASIC# 0                      Max            Used
                                             Masks/Values    Masks/values

 Unicast mac addresses:                       3292/3292       4275/4275 
 IPv4 IGMP groups + multicast routes:         1120/1120        349/349   
 IPv4 unicast directly-connected routes:      3072/3072       1677/1677 
 IPv4 unicast indirectly-connected routes:    8144/8144        913/913   
 IPv4 policy based routing aces:               498/498          91/91   
 IPv4 qos aces:                                474/474          21/21   
 IPv4 security aces:                           972/972          36/36   

Note: Allocation of TCAM entries per feature uses
a complex algorithm. The above information is meant
to provide an abstract view of the current TCAM utilization
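
For anyone who wants to check this kind of output across several switches, here is a minimal Python sketch that parses the rows above and reports which features exceed their reported maximum. The function names (`parse_tcam`, `over_limit`) and the regex are my own illustration, not anything from Cisco, and it assumes the two-column "Max / Used" layout shown here.

```python
import re

# Rows copied from the "sh plat tcam util" output above.
TCAM_SAMPLE = """\
 Unicast mac addresses:                       3292/3292       4275/4275
 IPv4 IGMP groups + multicast routes:         1120/1120        349/349
 IPv4 unicast directly-connected routes:      3072/3072       1677/1677
 IPv4 unicast indirectly-connected routes:    8144/8144        913/913
 IPv4 policy based routing aces:               498/498          91/91
 IPv4 qos aces:                                474/474          21/21
 IPv4 security aces:                           972/972          36/36
"""

# Feature name, then "max masks/max values", then "used masks/used values".
ROW = re.compile(r"^\s*(.+?):\s+(\d+)/(\d+)\s+(\d+)/(\d+)\s*$")


def parse_tcam(text):
    """Return {feature: (max_values, used_values)} for each utilization row."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows[m.group(1)] = (int(m.group(3)), int(m.group(5)))
    return rows


def over_limit(rows):
    """Return {feature: used - max} for features whose usage exceeds the maximum."""
    return {name: used - mx for name, (mx, used) in rows.items() if used > mx}


if __name__ == "__main__":
    for name, excess in over_limit(parse_tcam(TCAM_SAMPLE)).items():
        print(f"{name}: {excess} entries over the reported maximum")
```

Run against the sample, only the unicast MAC row is flagged (4,275 used against 3,292); every other feature is well inside its allocation.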


The symptom is very high CPU usage, specifically from the hl3mm process:

Code: Select all
switch#sh proc cpu sorted
CPU utilization for five seconds: 96%/10%; one minute: 83%; five minutes: 82%
 PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process
 136    37265887      720835      51698 23.04% 34.54% 34.18%   0 hl3mm           
 373     3788879     2516452       1505 18.56%  5.60%  4.89%   0 IGMP Input       
 374     7565174     9824531        770  6.88%  6.09%  6.40%   0 PIM Process     
 125    17045304     7400857       2303  5.44%  6.16%  6.06%   0 hpm main process
 304     1848673     2160701        855  4.80%  1.83%  1.77%   0 MDFS RP process 
 169     3950215     5517396        715  3.36%  2.78%  2.72%   0 Hulc LED Process
  91     7800944      100626      77524  3.20%  2.55%  2.56%   0 Adjust Regions   
 214     5385668    10655025        505  2.87%  2.82%  2.96%   0 IP Input         
 375      649594     2288784        283  2.24%  0.94%  0.89%   0 Mwheel Process   
 267      678915     1349807        502  1.59%  0.57%  0.58%   0 MDFS LC Process 
 232     5261724     6128425        858  1.59%  1.83%  1.80%   0 Spanning Tree   
   4      968422       58921      16435  1.43%  0.35%  0.34%   0 Check heaps   
[snip]
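
A note on reading the header line: in "96%/10%" the first figure is total CPU for the interval and the second is time spent at interrupt level, so the difference is what the processes themselves are consuming. A minimal Python sketch of that arithmetic, plus pulling the five-minute column per process, is below. The names (`cpu_split`, `five_minute_load`) and the regexes are illustrative assumptions on my part, and only the first three process rows are included to keep it short.

```python
import re

# Header and first three process rows copied from the "sh proc cpu sorted" output above.
CPU_SAMPLE = """\
CPU utilization for five seconds: 96%/10%; one minute: 83%; five minutes: 82%
 PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process
 136    37265887      720835      51698 23.04% 34.54% 34.18%   0 hl3mm
 373     3788879     2516452       1505 18.56%  5.60%  4.89%   0 IGMP Input
 374     7565174     9824531        770  6.88%  6.09%  6.40%   0 PIM Process
"""

HEADER = re.compile(r"five seconds: (\d+)%/(\d+)%")
# PID, runtime, invoked, uSecs, three percentages, TTY, process name.
PROC = re.compile(
    r"^\s*\d+\s+\d+\s+\d+\s+\d+\s+([\d.]+)%\s+([\d.]+)%\s+([\d.]+)%\s+\d+\s+(.+?)\s*$"
)


def cpu_split(text):
    """Return (total, interrupt, process_level) for the five-second interval."""
    m = HEADER.search(text)
    if m is None:
        raise ValueError("no 'CPU utilization' header found")
    total, interrupt = int(m.group(1)), int(m.group(2))
    return total, interrupt, total - interrupt


def five_minute_load(text):
    """Return {process name: five-minute CPU percentage}."""
    load = {}
    for line in text.splitlines():
        m = PROC.match(line)
        if m:
            load[m.group(4)] = float(m.group(3))
    return load


if __name__ == "__main__":
    total, interrupt, process_level = cpu_split(CPU_SAMPLE)
    print(f"total {total}%, interrupt {interrupt}%, process level {process_level}%")
    load = five_minute_load(CPU_SAMPLE)
    for name in sorted(load, key=load.get, reverse=True):
        print(f"{name}: {load[name]}% over five minutes")
```

For this capture the split is 96% total, 10% interrupt and 86% at process level, with hl3mm the largest five-minute consumer among the rows shown.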


I understand the cause/effect, and we are looking to deploy additional gear to alleviate the situation, but what specifically is the hl3mm process? I have yet to turn it up in a search or in documentation. Thanks.

Also, a note for those interested: the Cisco 3750G-12S (fiber aggregation switch) allowed for ~6k unicast MAC addresses in the TCAM table(s) with the SDM routing template, yet the newer 3750X-12S and 3750X-24S (also fiber aggregation switches) only allow ~3k unicast MAC addresses with the SDM routing template enabled. I'm curious about the reasoning behind that decision.

1KrazyFool
Member
Posts:
102
Joined:
Sat Apr 26, 2008 4:12 pm
Certs:
CCNP, CCIP, CCNA Wireless

Re: Troubleshooting TCAM saturation - hl3mm process on 3750-

Tue Dec 06, 2011 11:05 pm

My guess would lean towards that being the MAC learning process, which has to constantly churn the MAC addresses through the TCAM.

Can you look at the "access" SDM template, which would give you more MAC address space and fewer indirectly-connected routes (which seems to be OK given the output)?

mlan
Ultimate Member
Posts:
752
Joined:
Thu Nov 17, 2011 6:09 pm

Re: Troubleshooting TCAM saturation - hl3mm process on 3750-

Wed Dec 07, 2011 12:28 pm

1KrazyFool wrote: My guess would lean towards that being the MAC learning process, which has to constantly churn the MAC addresses through the TCAM.

Can you look at the "access" SDM template, which would give you more MAC address space and fewer indirectly-connected routes (which seems to be OK given the output)?


The MAC learning process is my guess as well. I am stumped that I can't turn up any docs about it. I see a TAC call in my future. ;)

The other SDM templates would work fine, but the 3750 series requires the routing template to enable the PBR commands. If I could migrate/ditch the PBR this would be a non-issue. Thanks for your reply.

writeerase
Ultimate Member
Posts:
509
Joined:
Sat Apr 09, 2011 3:55 pm
Certs:
CCIE CCNP-S CCDA MCSE RHCT Sec+ A+

Re: Troubleshooting TCAM saturation - hl3mm process on 3750-

Wed Dec 07, 2011 1:01 pm


1KrazyFool
Member
Posts:
102
Joined:
Sat Apr 26, 2008 4:12 pm
Certs:
CCNP, CCIP, CCNA Wireless

Re: Troubleshooting TCAM saturation - hl3mm process on 3750-

Wed Dec 07, 2011 2:09 pm

If you look at the 'access' template, it has TCAM carved out for PBR entries:

https://www.cisco.com/en/US/docs/switch ... #wp1059809

davidrothera
Ultimate Member
Posts:
992
Joined:
Thu Jan 13, 2011 5:10 pm
Certs:
CCIE R&S #38338, CCNP, CCIP

Re: Troubleshooting TCAM saturation - hl3mm process on 3750-

Wed Dec 07, 2011 2:13 pm

1KrazyFool wrote: If you look at the 'access' template, it has TCAM carved out for PBR entries:

https://www.cisco.com/en/US/docs/switch ... #wp1059809


According to Cisco, the commands won't be enabled unless you have the 'routing' SDM template. I don't have a 3750 handy or I would take a look.

Cisco wrote: In order to use PBR, you must first enable the routing template with the sdm prefer routing global configuration command. PBR is not supported with the VLAN or default template.
---
David
CCIE R&S #38338, CCIP, CCNP

http://networkbroadcast.co.uk - My Blog
http://twitter.com/davidrothera

mlan
Ultimate Member
Posts:
752
Joined:
Thu Nov 17, 2011 6:09 pm

Re: Troubleshooting TCAM saturation - hl3mm process on 3750-

Wed Dec 07, 2011 2:30 pm

davidrothera wrote:
1KrazyFool wrote: If you look at the 'access' template, it has TCAM carved out for PBR entries:

https://www.cisco.com/en/US/docs/switch ... #wp1059809


According to Cisco, the commands won't be enabled unless you have the 'routing' SDM template. I don't have a 3750 handy or I would take a look.

Cisco wrote: In order to use PBR, you must first enable the routing template with the sdm prefer routing global configuration command. PBR is not supported with the VLAN or default template.


Interesting. I missed that the access template had a PBR allocation. I can also verify davidrothera's Cisco quote, which is the reason I never considered the other templates. I'm going to test this in the lab on a 3750-48TS.

writeerase, I have read through that article, and I'm running 12.2(58)SE3, so the specific bug they list should not be affecting us. I also don't have any CatOS involved. That said, they list the hl2mm process as related to IGMP, so it's the best clue so far. We have a decent amount of multicast traffic traversing the switch. Thanks.

1KrazyFool
Member
Posts:
102
Joined:
Sat Apr 26, 2008 4:12 pm
Certs:
CCNP, CCIP, CCNA Wireless

Re: Troubleshooting TCAM saturation - hl3mm process on 3750-

Wed Dec 07, 2011 2:33 pm

The access template is a new(ish) template [looks like it appeared between 12.2(25)SE and 12.2(35)SE] that was not available in the earlier IOS releases. My guess is that the docs were never updated to account for the change.

mlan
Ultimate Member
Posts:
752
Joined:
Thu Nov 17, 2011 6:09 pm

Re: Troubleshooting TCAM saturation - hl3mm process on 3750-

Wed Dec 07, 2011 2:53 pm

I confirmed that the access template accepts the PBR commands and that PBR works as expected. When I switched back to the default template, it removed the route maps from the L3 interfaces. The access template might alleviate some of the high CPU usage for us, but we still push 4k unicast MAC addresses during business hours. Thanks for the tip!

mlan
Ultimate Member
Posts:
752
Joined:
Thu Nov 17, 2011 6:09 pm

Re: Troubleshooting TCAM saturation - hl3mm process on 3750-

Wed Mar 14, 2012 5:47 pm

Update on this issue:

We migrated nearly all the VLAN SVIs off this switch and CPU usage has dropped to reasonable levels. The "hl3mm" process is now averaging 0.03% total CPU usage. This switch seems to be OK for layer-2 fiber aggregation, but I would not recommend it for high-traffic layer-3 aggregation/routing, especially if you have a lot of multicast endpoints aggregated to the switch. It might work for a closet that only terminates a small number of VLANs and endpoint devices.

For those of you pushing layer-3 to the closet switches, what gear are you using with success?

davidrothera
Ultimate Member
Posts:
992
Joined:
Thu Jan 13, 2011 5:10 pm
Certs:
CCIE R&S #38338, CCNP, CCIP

Re: Troubleshooting TCAM saturation - hl3mm process on 3750-

Thu Mar 15, 2012 7:07 am

I think your issue may have been the multicast, not just the routing.

We have some 3560s deployed that have hundreds of hosts behind them on lots of SVIs and a couple of VRFs, and they don't bat an eyelid at it.

roggy
Senior Member
Posts:
346
Joined:
Tue Apr 08, 2008 10:09 am

Re: Troubleshooting TCAM saturation - hl3mm process on 3750-

Thu Mar 15, 2012 7:27 am

Same here. We have 2 x 3750 stacked with about 100 VLANs and hundreds of hosts, and CPU is next to nothing. The difference looks to be multicast.

mlan
Ultimate Member
Posts:
752
Joined:
Thu Nov 17, 2011 6:09 pm

Re: Troubleshooting TCAM saturation - hl3mm process on 3750-

Thu Mar 15, 2012 12:50 pm

Thanks for the feedback. That confirms some suspicions we had regarding the multicast traffic.
