networking-forum.com

All times are UTC - 6 hours [ DST ]



Posted: Mon Apr 16, 2012 2:21 pm
New Member

Joined: Fri Mar 30, 2012 3:03 pm
Posts: 13
Certs: CCNP, MCITP
First of all, thanks, Steve, for taking the time to look at this.

9c:af:ca:64:2c:42 is the vlan10 interface on the 3750
00:1c:f6:1d:4c:11 is the 3845 router interface on vlan10.

The capture was taken in the scenario with the two routers in the mix (the diagram before last). So the frame from the workstation to the server has the source MAC of the 3845.

If I take the routers out of the mix, the capture still looks the same.

"This makes me believe the client isn't sending the ACK the server is looking for, so it retransmits. It's not like I see anything additional sent from the client capture and it's retransmitting anyway. Unless I'm missing something." >> Yup, that's the strange behavior: even though the workstation sends the ACK once and the server receives it, the server keeps retransmitting, and the workstation never ACKs any of the subsequent packets.


Posted: Mon Apr 16, 2012 2:31 pm
New Member

Joined: Fri Mar 30, 2012 3:03 pm
Posts: 13
Certs: CCNP, MCITP
The server is a VM with one NIC. The server and workstation captures were taken using Wireshark running directly on the OS of each.

The switchport capture was done by SPANing the port to a machine running Wireshark.


Posted: Mon Apr 16, 2012 3:34 pm
CCIE #24973

Joined: Fri Mar 02, 2007 5:18 am
Posts: 196
Location: Bahrain
Certs: CCNP,CCSP,CCIE (R&S)#24973
Can you replace the IPsec with tunnel interfaces and route through the tunnel interfaces?
I think it's an L2 issue with the provider, not L3.


cisco_1

_________________
"Nothing Is Limited, Except Our Understanding To The Universe"


Posted: Mon Apr 16, 2012 3:39 pm
New Member

Joined: Fri Mar 30, 2012 3:03 pm
Posts: 13
Certs: CCNP, MCITP
Well, we could do that, but we'd lose OSPF, and the throughput is around 100 Mbps versus the 1 Gbps line rate.

I tested with GRE interfaces also, and that didn't work either.


Posted: Mon Apr 16, 2012 6:00 pm
Post Whore

Joined: Mon Nov 16, 2009 8:10 pm
Posts: 2523
Location: San Diego, CA
Certs: CCNP, BCNE, Network+, Security+
cisco_1 wrote:
Can you replace the IPsec with tunnel interfaces and route through the tunnel interfaces?
I think it's an L2 issue with the provider, not L3.


cisco_1


Granted, the packet capture was most likely filtered, and maybe a raw packet capture could better explain it. What do you think the problem is, and how would you identify it?

@tarjall, the reason I asked about multiple NICs is that I think the packet capture provided is missing something because it's filtered (it just shows comms between two IPs). I'm wondering if the missing piece is in a raw capture, showing the ACK sent to an incorrect address or something like that. I've seen that when troubleshooting web filtering problems where the server has NIC binding set up incorrectly.

_________________
Regards,

Steven King
San Diego Cisco User Group - http://www.sdcug.com
"The only time something is impossible is when you think it is." - Kevin Corbin, CCIE #11577


Posted: Tue Apr 17, 2012 1:48 pm
New Member

Joined: Fri Mar 30, 2012 3:03 pm
Posts: 13
Certs: CCNP, MCITP
I think I see what's causing the retransmits.

Look at packet 88 in the outlookclient capture: at the bottom of the hex view you see 08 bf e0. Look at the same packet in the 3750 capture and it's all zeros.

I took another capture from both ports that the provider plugs into, and I get zeros going out one end and the modified value coming in at the other.

I'm contacting the provider. What would cause this?


Posted: Tue Apr 17, 2012 4:54 pm
Ultimate Member

Joined: Thu Jan 13, 2011 5:10 pm
Posts: 985
Location: Leeds, UK
Certs: CCIE R&S #38338, CCNP, CCIP
It could be that the 3750 capture is headers only and isn't capturing the packet payload. How did you perform the 3750 capture?

_________________
---
David
CCIE R&S #38338, CCIP, CCNP

http://networkbroadcast.co.uk - My Blog
http://twitter.com/davidrothera


Posted: Wed Apr 18, 2012 9:24 am
New Member

Joined: Fri Mar 30, 2012 3:03 pm
Posts: 13
Certs: CCNP, MCITP
The payload is included in all captures. The 3750 capture was done by running Wireshark on a machine attached to a SPAN destination port.


Posted: Mon Apr 23, 2012 2:47 pm
New Member

Joined: Fri Mar 30, 2012 3:03 pm
Posts: 13
Certs: CCNP, MCITP
So the provider is not able to explain this at all.

We send a packet out from one end, and the last section of the packet is all zeros. When we receive the packet at the other side, it has changed from zeros to this 08 bf value, which also causes the TCP checksum to become invalid.

Provider insists that they are tunneling only and not performing inspection at all and cannot even see anything at layer 3 or above from their end.

The appended value seems to be the same across applications that experience this issue.


Posted: Tue Apr 24, 2012 12:41 am
Post Whore

Joined: Mon Nov 16, 2009 8:10 pm
Posts: 2523
Location: San Diego, CA
Certs: CCNP, BCNE, Network+, Security+
Whoa... the checksum stuff you're looking at... careful with that in Wireshark. You can get a lot of "checksum errors" that are perfectly normal in Wireshark due to checksum offloading performed at the NIC.

http://wiki.wireshark.org/TCP_Checksum_Verification
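
For anyone following along, here is a minimal sketch of how a TCP checksum is computed and verified, which is roughly the check Wireshark performs when TCP checksum validation is enabled. With offloading enabled, a capture taken on the sending host can record the segment before the NIC fills in the real checksum, so it gets flagged even though the packet on the wire is fine. Everything specific in the sketch (the IP addresses, ports, payload, and function names) is made up for illustration and is not taken from the captures in this thread; the only thing borrowed is the 08 bf e0 tail value mentioned earlier.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum used by IP, TCP and UDP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, tcp_length: int) -> bytes:
    """IPv4 pseudo-header that TCP includes in its checksum (protocol 6)."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 6, tcp_length))


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """A segment with a correct checksum field sums to zero."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return internet_checksum(data) == 0


def build_segment(src_ip: str, dst_ip: str, payload: bytes) -> bytes:
    """Build a 20-byte TCP header plus payload with a valid checksum.

    Ports, sequence numbers, flags and window are hypothetical values.
    """
    header = struct.pack("!HHIIBBHHH", 49152, 135, 1, 1, 5 << 4, 0x18, 8192, 0, 0)
    length = len(header) + len(payload)
    csum = internet_checksum(pseudo_header(src_ip, dst_ip, length) + header + payload)
    # The checksum field occupies bytes 16-17 of the TCP header.
    return header[:16] + struct.pack("!H", csum) + header[18:] + payload


# Hypothetical addresses and payload; the payload ends in three zero bytes.
SRC, DST = "10.0.10.5", "10.0.20.7"
good = build_segment(SRC, DST, b"A" * 13 + b"\x00\x00\x00")
# Same segment with only the three tail bytes altered in transit.
bad = good[:-3] + bytes.fromhex("08bfe0")

print(tcp_checksum_ok(SRC, DST, good))  # True
print(tcp_checksum_ok(SRC, DST, bad))   # False
```

The point of the second case is that a checksum failure seen with offloading disabled on the capture machines is a real failure: the receiving stack drops the segment, which would look like a missing ACK followed by retransmissions.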

Posted: Tue Apr 24, 2012 2:03 pm
Post Whore

Joined: Mon Nov 16, 2009 8:10 pm
Posts: 2523
Location: San Diego, CA
Certs: CCNP, BCNE, Network+, Security+
tarjall wrote:
>> Yup, that's the strange behavior: even though the workstation sends the ACK once and the server receives it, the server keeps retransmitting, and the workstation never ACKs any of the subsequent packets.


Where are you seeing the workstation sending the ACK? I don't see it until later in the capture.

Posted: Thu Apr 26, 2012 8:49 pm
Post Whore

Joined: Sat Jun 07, 2008 11:06 am
Posts: 2553
Location: Grand Rapids, MI
Certs: CCNP, CCDP
Steven King wrote:
Whoa... the checksum stuff you're looking at... careful with that in Wireshark. You can get a lot of "checksum errors" that are perfectly normal in Wireshark due to checksum offloading performed at the NIC.

Indeed. Wireshark isn't seeing the checksum because the NIC you're using processed it. This is a red herring.


Posted: Sat Apr 28, 2012 1:58 pm
Post Whore

Joined: Mon Nov 16, 2009 8:10 pm
Posts: 2523
Location: San Diego, CA
Certs: CCNP, BCNE, Network+, Security+
Is there a way to increase the time the application waits before retransmitting? Or is it happening at L4? It would be interesting to see if you can increase it to, say, 500 ms or so and see if it works, just slowly.

EDIT - In my limited experience, this is looking more like an application problem than a network problem, besides maybe latency. Any debug logs you can find/enable on the server?

Posted: Sun May 06, 2012 9:44 am
New Member

Joined: Fri Mar 30, 2012 3:03 pm
Posts: 13
Certs: CCNP, MCITP
We finally solved this; it only took three months :).

Thank you everyone for your time on helping try to figure this out.

By disabling TCP checksum offloading on the network adapters of the computers that were capturing the data, and enabling TCP checksum validation in Wireshark, we were able to see that the packets being retransmitted carried a checksum that became invalid after traversing the provider's network. Comparing the same packet before and after it traversed the provider, we noticed that the tail of the packet was all zeros (00 00 00 in hex) before crossing over but had changed to a seemingly random value, such as b4 e2 56, when we captured it coming in from the provider at the other end.

After we presented this to the provider, they determined it was a bug in their Juniper equipment that caused a buffer overflow on packets between 302 and 320 bytes when the header of those packets exceeded a certain size. The provider made a modification on their end to reduce the overall header size and prevent the bug from triggering.
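
In case it helps someone chasing a similar problem, the before/after comparison described above can be sketched in a few lines. This is only an illustration: the function name and the sample bytes are hypothetical stand-ins (a short made-up prefix followed by the 00 00 00 and b4 e2 56 tails mentioned in this post), not data exported from the actual captures.

```python
from typing import List


def changed_offsets(before: bytes, after: bytes) -> List[int]:
    """Return the byte offsets at which two copies of a packet differ."""
    if len(before) != len(after):
        raise ValueError("packets differ in length; compare like with like")
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]


# Hypothetical bytes: the same packet as sent and as received.
sent = bytes.fromhex("de ad be ef 00 00 00")
received = bytes.fromhex("de ad be ef b4 e2 56")

for offset in changed_offsets(sent, received):
    print(f"offset {offset}: {sent[offset]:02x} -> {received[offset]:02x}")
```

In practice the two byte strings would come from the same packet exported from the captures on each side of the provider, for example by copying the hex stream out of Wireshark.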


Posted: Sun May 06, 2012 4:12 pm
Junior Member

Joined: Sat Mar 07, 2009 1:36 pm
Posts: 81
It's good to hear this problem is solved :)

It was a really interesting discussion.


Posted: Sun May 06, 2012 6:44 pm
Member

Joined: Tue Apr 29, 2008 7:22 pm
Posts: 184
I did not read everything above this, save for a few posts, but...

Did you check your layer 1 gear? Sometimes a pinched copper cable or dirty fiber can lead to a hellhole of problems that you would never think about checking.

Edit: Never mind, I read a few posts north of this. Glad you got it solved!

_________________
Awesomesauce!!!!


Posted: Tue May 08, 2012 11:31 pm
Post Whore

Joined: Mon Nov 16, 2009 8:10 pm
Posts: 2523
Location: San Diego, CA
Certs: CCNP, BCNE, Network+, Security+
Very very interesting find. Good job.


