2007-10-25T10:55:00.000-07:00<BR/><BR/>Nice article Matt!<BR/><BR/>A couple of comments and then some ideas to possibly look into.<BR/><BR/>First, I'm surprised at how few out-of-order packets one really sees at the perimeter. Note that if you trace across multiple peered ASes you will definitely see asymmetric routing (as you would expect), but even then you seldom actually get packets out of order. One might ascribe this to the overprovisioning that has occurred over the last several years. Of course, one will see even fewer out-of-order packets on one's own internal networks.<BR/><BR/>So my comment is that, while the non-loop read is a known issue and we are all concerned about any increase in payload size, this solution would not only fail in the cases you mention (stateful devices reassembling and reordering packets within streams), it would also be overly noisy to passive monitoring (e.g. IDS and other monitors).<BR/><BR/>Have you thought about placing some of the data within the initial SYN packet? This is not disallowed by RFC 793, and it seems some stacks handle it just fine. The data should not reach the recv() function until the three-way handshake has completed, at which point the data from the mbufs will have been handed over (including the data put in the initial SYN). In essence you would be using the kernel to buffer this first chunk of data for you and combine it with the data in the first PSH packet after the 3-way handshake. 
You would thus get the data portion of two packets in the first userland read.<BR/><BR/>Of course, there's a chance that, once the connection moves to the connected state, recv() is handed the initial data and your one-shot read returns early, after the SYN with data and before the first normal data packet. Still, it's worth checking which stacks handle this in which ways.<BR/><BR/>Then there's T/TCP and RFC 1644 (normally enabled or disabled through a sysctl). SYN with data is nothing new there, and perhaps enabling net.inet.tcp.rfc1644 would be enough to have the first two packets (SYN with data being packet #1 and ACK|PSH with data being #2) buffered into your single read.<BR/><BR/>These other options will appear nonstandard to monitoring devices as well, but they might make it through stateful filtering ;)<BR/><BR/>Hey, you asked for ideas so I decided to ramble a bit. Keep up the great work!<BR/><BR/>cheers,<BR/><BR/>.mudge
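For anyone following along, here is a minimal sketch of the non-loop read issue mentioned above. It is not the code from the article: the function names are hypothetical, and a local `socket.socketpair()` stands in for a real TCP connection so nothing touches the network. It only shows why a single recv() sees the first buffered chunk while a loop collects the rest.

```python
import socket


def one_shot_read(sock, bufsize=4096):
    """Single recv(): returns only the bytes buffered at this moment."""
    return sock.recv(bufsize)


def read_until_closed(sock, bufsize=4096):
    """Loop on recv() until the peer closes, collecting every chunk."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # empty bytes means the peer closed its side
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo():
    # Assumption: a connected local socket pair is a fair stand-in for
    # an established stream; real TCP segmentation will vary by stack.
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(b"first-segment|")
        # Only the first write has arrived, so the one-shot read stops here.
        partial = one_shot_read(receiver)
        sender.sendall(b"second-segment")
        sender.close()
        rest = read_until_closed(receiver)
    finally:
        sender.close()  # closing twice is harmless in Python
        receiver.close()
    return partial, rest
```

Calling `demo()` returns the bytes seen by the one-shot read and the bytes that only the loop picked up. Whether a kernel would merge SYN data with the first PSH packet into one read is exactly the stack-dependent behaviour worth testing; this sketch does not exercise that.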