Automotive Ethernet, particularly for high-data-rate sensors like LiDAR or radar in autonomous vehicles, has a predefined 1500-byte fragment size and expects fragmentation of larger datagrams to be done in sensor firmware. This keeps huge packets from clogging the routers and prevents time-critical messages from waiting too long in a queue while a large message is being sent.
Why do it at the high level if the protocol supports it? Especially in an automotive system where you don't need to worry about rogue configurations as much.
I meant: why not leverage fragmentation at the routing/transport protocol layer, if the configs can be controlled so that fragmented packets are supported rather than dropped?
What advantage is there to doing it at the application layer?
I'm aware that any of the layers can be messed with by malicious actors.
Though IIRC CAN doesn't really have a transport or routing layer; the arbitration IDs are baked into the physical protocol, which is very cool, but it's fundamentally a bus architecture, and I'm not sure how it would apply to the current thread.
No, unfortunately I only learned about it while debugging a very annoying issue that resulted from mixing automotive modules with off-the-shelf, data-center-type managed switches. I had to consult one of the system design technical leads, so I only got the detail he provided, but the intuition matches some of what you can pick up from understanding CAN.
I'm just a little surprised, as economies of scale have pushed support for larger frames into most of the hardware you can purchase. I can absolutely believe somebody saying this, but my intuition tells me it's a misdiagnosed problem; if it isn't, there's something interesting I'd love to hear about!
That's for throughput, not latency; in automotive/realtime systems you care about the latter and only slightly about the former. The real point is that the modules themselves have to do the fragmentation, so their send queue can reorder things, since each message is "pre"-broken up. A module producing a lot of sensor data alongside some keep-alive signals and some safety-critical signals can interleave the safety and keep-alive messages with the fragmented raw data by prioritizing at the send-queue level. If the data were instead fragmented at the protocol level, a large message could clog the queue when something even more time-critical should be going out.
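A minimal sketch of that idea (hypothetical names, Python only for illustration, not actual firmware code): fragments are created *before* they enter a priority send queue, so small high-priority frames sort ahead of the remaining bulk fragments instead of waiting behind one monolithic message.

```python
import heapq
import itertools

MTU = 1500  # application-layer fragment size, per the thread

_counter = itertools.count()  # tie-breaker preserves FIFO order within a priority

def fragment(payload: bytes, mtu: int = MTU):
    """Pre-break a large message into MTU-sized fragments."""
    return [payload[i:i + mtu] for i in range(0, len(payload), mtu)]

class SendQueue:
    """Priority send queue: lower number = more urgent."""
    def __init__(self):
        self._heap = []

    def enqueue(self, priority: int, payload: bytes):
        # Fragmenting before enqueueing means an urgent frame only ever
        # waits behind one 1500-byte fragment, not a whole large message.
        for frag in fragment(payload):
            heapq.heappush(self._heap, (priority, next(_counter), frag))

    def drain(self):
        while self._heap:
            _, _, frag = heapq.heappop(self._heap)
            yield frag

q = SendQueue()
q.enqueue(priority=3, payload=bytes(6000))     # bulk sensor data -> 4 fragments
q.enqueue(priority=0, payload=b"SAFETY_STOP")  # safety-critical, tiny
q.enqueue(priority=1, payload=b"KEEPALIVE")

frames = list(q.drain())
# Safety and keep-alive frames drain ahead of the queued sensor fragments.
```

If fragmentation instead happened below the queue, the 6000-byte payload would sit in the queue as one unit and the safety frame could only go out after all of it.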
In a lot of the systems I interact with, the limit is packet rate more than packet size. I'm not disputing latency sensitivity in the application; I'm surprised that frame sizes in the normal range have measurably introduced the kinds of problems you're describing. The cost of per-packet parsing and reassembly is typically a higher proportion of the budget than would justify strictly pinning to 1500-byte Ethernet frames.
Now, if you have one system constantly trying to send really oversized frames, creating lots of fragmentation, and you have limited-size buffers, I could see a situation where the fragments too regularly eat up buffer space, but that's a slightly different problem. I would expect even inexpensive hardware to handle 9k packets without introducing queue delay.
I hear upcoming in-vehicle networks are heading past 10 Gbps, which means their packet-processing capabilities are necessarily well over 1 Mpps, more than sufficient for the latency concerns of the kinds of control systems hanging off of this.
Looking up some basic (possibly inaccurate) specs for the kinds of LiDAR used: frame samples won't fit in a packet, so we're talking on the order of 500 packets at 1500 bytes, or fewer than 100 packets at 9k, at around 30 Hz per sensor?
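As a back-of-envelope check of those numbers (the ~700 kB per-frame figure is an assumption chosen to match "around 500 packets at 1500 bytes", not a real sensor spec):

```python
import math

FRAME_BYTES = 700_000  # assumed LiDAR frame size, hypothetical
SCAN_HZ = 30           # assumed scan rate, per the thread

def packets_per_frame(frame_bytes: int, mtu: int, ip_udp_overhead: int = 28) -> int:
    """Fragments needed for one frame, assuming IPv4+UDP headers per packet."""
    payload = mtu - ip_udp_overhead  # usable bytes per packet
    return math.ceil(frame_bytes / payload)

std = packets_per_frame(FRAME_BYTES, 1500)    # ~476 packets per frame
jumbo = packets_per_frame(FRAME_BYTES, 9000)  # ~79 packets per frame

pps_std = std * SCAN_HZ      # per-sensor packet rate at standard MTU
pps_jumbo = jumbo * SCAN_HZ  # per-sensor packet rate with jumbo frames
```

At those assumptions that's roughly 14 kpps per sensor at 1500 bytes versus about 2.4 kpps with jumbo frames, either of which is well under the 1 Mpps ballpark above.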
Clearly the deployed system works. We're dealing with 1500 bytes in a lot of places where its being the standard introduces plenty of inefficiency and hard optimization work, so it's hardly abnormal. As I said originally, I'm curious what the factors are; they seem surprising.
I suspect, but cannot confirm or even really look up, that they cheaped out on some of the switch infrastructure in the middle. Also, at least some of the radars I dealt with produced object data with a defined "frame" size of 65536 bytes, even though it would often be mostly zeros, so fragmentation was required no matter how you set up jumbo frames, and that had to be handled at the sensor firmware level. There are also some tradeoffs inherent in gigabit-speed T1 automotive Ethernet, and some things were defined at the MAC layer rather than the packet layer. Happy to chat more about it; feel free to look up my email in my profile.
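A quick calculation of why jumbo frames don't make that fragmentation go away (assuming IPv4+UDP header overhead; the 28-byte figure is standard, the rest follows from the 65536-byte frame mentioned above):

```python
import math

RADAR_FRAME = 65536   # defined object-data "frame" size, per the thread
IP_UDP_OVERHEAD = 28  # assumed IPv4 (20 B) + UDP (8 B) headers per packet

# Fragments required at standard and jumbo MTUs.
frag_counts = {
    mtu: math.ceil(RADAR_FRAME / (mtu - IP_UDP_OVERHEAD))
    for mtu in (1500, 9000)
}
# Even with 9k jumbo frames, the 64 KiB frame still needs fragmentation,
# so the firmware has to implement it either way.
```

That works out to dozens of fragments at a 1500-byte MTU and still a handful at 9k, so the firmware-level fragmentation path has to exist regardless of switch configuration.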