One of the biggest misses with IP fragmentation was not requiring each fragment ...

wmf · on Feb 17, 2024

That would be a layering violation. IP routers don't necessarily know about higher protocols.

gizmo686 · on Feb 17, 2024

You could implement it as a generic 'application metadata' field in the IP header. From the perspective of IP, it is just one more length-prefixed field in the IP header. Routers may interpret it in conjunction with the value of the protocol field; otherwise they are just required to leave it unchanged in the header (including in all fragments).

For packets that don't want to use it, this is just 1 byte of overhead to set the size to 0.
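A rough sketch of that idea, purely hypothetical: no such field exists in IPv4 or IPv6, and the names `encode_metadata`, `fragment`, and `read_metadata` are made up for illustration. The real IP header is left out; the only point is that the length-prefixed field is copied verbatim into every fragment, and an empty field costs one byte.

```python
def encode_metadata(meta: bytes) -> bytes:
    """Hypothetical field: 1 length byte followed by opaque metadata."""
    if len(meta) > 255:
        raise ValueError("metadata must fit a 1-byte length prefix")
    return bytes([len(meta)]) + meta


def fragment(meta: bytes, payload: bytes, max_payload: int) -> list[bytes]:
    """Split payload into chunks, copying the metadata field into each one."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    field = encode_metadata(meta)
    chunks = [payload[i:i + max_payload]
              for i in range(0, len(payload), max_payload)] or [b""]
    return [field + chunk for chunk in chunks]


def read_metadata(frag: bytes) -> tuple[bytes, bytes]:
    """Return (metadata, rest) without interpreting the metadata."""
    n = frag[0]
    return frag[1:1 + n], frag[1 + n:]


# Assumed example: the metadata is a 2-byte destination port (80).
frags = fragment(b"\x00\x50", b"abcdefgh", max_payload=3)
for f in frags:
    print(read_metadata(f))
```

A router that does not understand the metadata only needs `read_metadata` to skip over it; one that does can look at it in every fragment, not just the first.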

tptacek · on Feb 17, 2024

You could design a network protocol that fragments by capturing a variable number of bytes from the next header, and ICMP already does something like that.

(None of this would fix the real problem with fragmentation, which is that you can't efficiently segment out a large frame without having some kind of reliability layer).

raggi · on Feb 17, 2024

If I was revisiting, I'd probably eradicate the layer and pick a fixed number of flow types with distinct headers and state machines. The layers were a reasonable choice given the understanding of the time, but in hindsight I think you can make a strong case they're cut at the wrong places.

colmmacc · on Feb 17, 2024

It's just a dumb mistake. All it takes is a "next layer header length" field. It would have been very simple.

You don't even really need that. As proof, take ICMP, which was designed as part of IP and actually does do this. Routers are already required to copy and include the header of the packet that triggered an ICMP error.
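For reference, a minimal sketch of that ICMP behaviour, assuming the classic RFC 792 rule of quoting the original IPv4 header plus the first 8 bytes of its payload. The function name `icmp_quote` and the sample datagram are illustrative only, and this builds just the quoted portion, not a full ICMP message.

```python
def icmp_quote(original: bytes, extra: int = 8) -> bytes:
    """Return the part of an IPv4 datagram that an ICMP error quotes back."""
    if len(original) < 20:
        raise ValueError("too short to hold an IPv4 header")
    ihl_words = original[0] & 0x0F      # low nibble of byte 0 is the IHL
    header_len = ihl_words * 4          # IHL counts 32-bit words
    if header_len < 20:
        raise ValueError("invalid IHL")
    return original[:header_len + extra]


# Assumed sample: a 20-byte header (version 4, IHL 5), then bytes that
# would be UDP source port 12345 and destination port 53.
header = bytes([0x45]) + bytes(19)
datagram = header + b"\x30\x39\x00\x35" + b"rest of the UDP datagram"
quote = icmp_quote(datagram)
print(len(quote))
```

The router never parses the upper-layer header here; it copies a fixed number of bytes past the IP header, which happens to be enough to cover the TCP or UDP port numbers.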

jandrese · on Feb 17, 2024

The IP layer doesn't have to know what is in those upper layers to include 50 or 100 bytes of it in a little trunk.

zamadatix · on Feb 17, 2024

If you always chop 100 and add 100, then it's even more massively inefficient than the problem it solves. The router would at least need to have every protocol start with a header length value. Otherwise, if you just take the first 100 bytes and stick them in the front of each packet and the header was only 57 bytes, then you've suddenly got 43 bytes of garbage in the next layer's payload when you reassemble.
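The arithmetic, as a small sketch. The 100-byte prefix and 57-byte header are the numbers from this comment; the function names and the sample payload are made up for illustration.

```python
PREFIX_LEN = 100  # fixed number of bytes blindly copied into each fragment


def copied_prefix(datagram: bytes, prefix_len: int = PREFIX_LEN) -> bytes:
    """What a router would copy if it knows nothing about the next layer."""
    return datagram[:prefix_len]


def non_header_bytes(prefix: bytes, header_len: int) -> bytes:
    """Bytes in the copied prefix that are payload, not upper-layer header."""
    return prefix[header_len:]


header = bytes(57)              # hypothetical 57-byte upper-layer header
payload = bytes(range(200))     # arbitrary payload
datagram = header + payload

prefix = copied_prefix(datagram)
stray = non_header_bytes(prefix, len(header))
print(len(stray))  # 43
```

Without a header length value somewhere, nothing in the fragment says where byte 57 is, so the receiver cannot separate the real header from those 43 stray payload bytes.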

Keep in mind, most routers don't even bother supporting existing fragmentation because it's costly to implement in high speed hardware. So while you could theoretically have that dynamic next protocol header length value field, it'd only be complicating something hardware makers already think is too complicated to be worth it. Making things unappealingly complex is one of the common results of layering violations.

Hikikomori · on Feb 17, 2024

There are no strict rules about layers; most routers can and do read info in TCP/UDP headers.

dtech · on Feb 17, 2024

And that's how we got forever stuck with those 2 and now have to build every new protocol on top of UDP.

n2d4 · on Feb 17, 2024

Actually, that's not a bad thing. UDP is small enough to have nearly no overhead, but complex enough to let firewalls do their job. Six of the eight bytes in its header would probably be in the header of any transport layer protocol anyways (only the checksum might be unnecessary).
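For reference, a minimal sketch of parsing that 8-byte header with Python's `struct` module. The field layout (source port, destination port, length, checksum, each 16 bits in network byte order) is the standard UDP header; the function name and the sample values are just for illustration.

```python
import struct

# Four unsigned 16-bit fields, network (big-endian) byte order.
UDP_HEADER = struct.Struct("!HHHH")


def parse_udp_header(data: bytes) -> dict:
    """Decode the fixed 8-byte UDP header at the start of `data`."""
    if len(data) < UDP_HEADER.size:
        raise ValueError("need at least 8 bytes for a UDP header")
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(data)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,       # header plus payload, in bytes
        "checksum": checksum,
    }


# Assumed sample datagram: ports 12345 -> 53, 4 payload bytes, checksum 0.
sample = UDP_HEADER.pack(12345, 53, 8 + 4, 0) + b"abcd"
print(parse_udp_header(sample))
```

The two ports and the length are the six bytes any transport protocol would likely carry anyway; the remaining two are the checksum.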

Wikipedia lists over 100 assigned IP protocol numbers [1], and while it would break existing firewalls, adding a new protocol would certainly require less work than the transition from IPv4 to IPv6. But UDP is already simple enough that there's very little benefit in not just building on that.

[1] https://en.wikipedia.org/wiki/List_of_IP_protocol_numbers

ikiris · on Feb 18, 2024

No, it isn't. That fault lies with NAT and idiots who only open HTTP on their firewalls.

n2d4 · on Feb 17, 2024

They can read higher layers, but they (currently) don't have to in order to implement IP correctly.

cesarb · on Feb 17, 2024

> most routers can and do read info in tcp/udp headers.

Do most routers really do that, or just the ones which are also trying to act as a firewall?

wmf · on Feb 17, 2024

For example, IP routers often peek at UDP/TCP port numbers to calculate ECMP flow hashing. This is technically naughty but it's read-only and it's only an optimization that isn't required for correct forwarding.
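A toy version of what that looks like, assuming a 5-tuple key and CRC32 as the hash. Real routers do this in hardware with vendor-specific hash functions; `ecmp_next_hop` and its parameters are hypothetical.

```python
import ipaddress
import struct
import zlib


def ecmp_next_hop(src_ip: str, dst_ip: str, protocol: int,
                  src_port: int, dst_port: int, n_paths: int) -> int:
    """Pick one of n_paths equal-cost next hops from the flow's 5-tuple."""
    if n_paths <= 0:
        raise ValueError("need at least one path")
    key = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + struct.pack("!BHH", protocol, src_port, dst_port)
    )
    # Same 5-tuple gives the same index, so one flow stays on one path.
    return zlib.crc32(key) % n_paths


# Assumed example flow: TCP (protocol 6) from 10.0.0.1:40000 to 10.0.0.2:443.
path = ecmp_next_hop("10.0.0.1", "10.0.0.2", 6, 40000, 443, n_paths=4)
print(path)
```

The ports are only read, never modified, and forwarding would still be correct if the router hashed on addresses alone; it would just spread flows between the same two hosts less evenly.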

Hikikomori · on Feb 17, 2024

Yes. I doubt you can find one that is not capable.

ardel95 · on Feb 18, 2024

Almost every modern router in a multipath network peeks at the next layer to implement flow hashing correctly.