> You need way more processing power than an RPi to do this at 30fps, and C/C++, not Python. (There are literally dozens of projects for the RPi and TFlow online but they all get like 0.1 fps or less by using Flask and browser reload of a PNG... great for POC but not for real video)
I think 8 streams at 15 fps (aka 120 fps total) is possible with a ($35) Raspberry Pi 4 + ($75) Coral USB Accelerator. I say "I think" because I haven't tested on this exact setup yet; my MacBook Pro and Intel NUC are a lot more pleasant to experiment on (much faster compilation times). A few notes:
* I'm currently just using the coral.ai prebuilt 300x300 MobileNet SSD v2 models. I haven't done much testing, but I can already see notable false negatives and false positives. It'd be wonderful to put together some shared training data [1] to use for transfer learning; I think results could be much better then. Anyone interested in starting something? I'd be happy to contribute!
* iirc, I got the Coral USB Accelerator to do about 180 fps with this model. [edit: but don't trust my memory—it could have been as low as 100 fps.] It's easy enough to run the detection at a lower frame rate than the input as well—do the H.264 decoding on every frame but only do inference at fixed pts intervals.
* You can also attach multiple Coral USB Accelerators to one system and make use of all of them.
* Decoding the 8 streams is likely possible on the Pi 4, depending on your resolution. I haven't messed with this yet, but I think it might even be possible in software, and the Pi also has hardware H.264 decoding that I haven't tried to use.
* I use my cameras' 704x480 "sub" streams for motion detection and downsample that full image to the model's expected 300x300 input. Apparently some people run inference against multiple tiles of the image, or run a second round of inference against a zoomed-in object detection region, to improve confidence. That obviously increases the demand on both the CPU and TPU.
* The Orange Pi AI Stick Lite is crazy cheap ($20) and supposedly comparable to the Coral USB Accelerator in speed. At that price, if it works, buying one per camera doesn't sound too crazy. But I'm not sure if the driver/toolchain support is any good. I have a PLAI Plug (basically the same thing but sold by the manufacturer). The PyTorch-based image classification on a prebuilt model works fine, but I don't have the software to build models or do object detection, so it's basically useless right now. They want to charge an unknown price for the missing software, but I think Orange Pi's rebrand might include it with the device?
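The "decode every frame, infer at fixed pts intervals" idea above can be sketched roughly like this (a minimal illustration with stand-in names; a real pipeline would pull frames and PTS values from its H.264 decoder, and `detect` would be the actual TPU inference call):

```python
def gated_detections(frames, interval=18000, detect=lambda frame: []):
    """frames: iterable of (pts, frame) pairs, with pts in 90 kHz RTP
    ticks. Every frame is decoded upstream, but `detect` (a stand-in
    for real inference) only runs when the PTS has advanced by at
    least `interval` ticks (18000 ticks = 0.2 s, i.e. 5 Hz)."""
    last_pts = None
    results = []
    for pts, frame in frames:
        if last_pts is None or pts - last_pts >= interval:
            results.append((pts, detect(frame)))
            last_pts = pts
        # otherwise: the frame was still decoded (e.g. for motion
        # detection) but skipped for inference
    return results
```

At a 30 fps input (3000 ticks per frame), this runs inference on roughly one frame in six while the decoder still sees all of them.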
>* I use my cameras' 704x480 "sub" streams for motion detection and downsample...
I've encountered cheap IP cameras where the main high-res stream was actually being offered with a time shift relative to the sub-stream.
Weird shit happens when you have a camera that does that and you then act on data from the sub-stream to drive decisions about the main stream. I played with a 'Chinesium' CCTV camera with generic firmware whose offset was so consistently bad that I could actually apply a static offset to remediate it.
I assumed it was just a firmware bug, since the offsets didn't seem to move around the way they would if it were decode/encode lag or anything of that sort.
Did the camera send SEI Picture Timing messages? RTCP Sender Reports with NTP timestamps? Either could potentially help matters if they're trustworthy.
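For what it's worth, the NTP timestamps in RTCP Sender Reports are easy to decode if the camera does send them. This is generic RFC 3550 math, not specific to any camera:

```python
# An RTCP Sender Report carries a 64-bit NTP timestamp: 32 bits of
# seconds since 1900-01-01 UTC, then a 32-bit binary fraction of a
# second (RFC 3550). The NTP epoch precedes the Unix epoch (1970)
# by 2,208,988,800 seconds.
NTP_UNIX_OFFSET = 2208988800

def ntp_to_unix(ntp64):
    """Convert a 64-bit NTP timestamp to Unix time in seconds."""
    seconds = ntp64 >> 32
    fraction = (ntp64 & 0xFFFFFFFF) / 2**32
    return seconds - NTP_UNIX_OFFSET + fraction
```

Pairing that wall-clock time with the same report's RTP timestamp gives you a mapping from the stream's 90 kHz RTP clock to real time, which is what you'd need to line up the main and sub streams, assuming the camera's clock is sane.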
I haven't encountered that exact problem (large fixed offset between the streams), but I agree in general these cameras' time support is poor and synchronizing streams (either between main/sub of a single camera or across cameras) is a pain point. Here's what my software is doing today:
Any of several changes to the camera would improve matters a lot:
* using temporal/spatial/quality SVC (Scalable Video Coding) so you can get everything you need from a single video stream
* exposing timestamps relative to the camera's uptime (CLOCK_MONOTONIC) somehow (not sure where you'd cram this into an RTSP session) along with some random boot id
* allowing both the main and sub video streams to be fetched in a single RTSP session
* reliably slewing the clock like a "real" NTP client rather than stepping with SNTP
but I'm not exactly in a position to make suggestions that the camera manufacturers jump to implement...
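To illustrate the slewing point: a well-behaved NTP client bounds how fast it may adjust the clock (classically 500 ppm), so corrections are gradual rather than visible jumps. Rough numbers:

```python
def slew_duration_s(offset_s, max_slew_ppm=500):
    """Real time needed to slew away a clock offset without stepping,
    given a bounded slew rate (500 ppm is the classic ntpd limit)."""
    return abs(offset_s) / (max_slew_ppm / 1e6)

# A 1-second offset takes 2000 s (about 33 minutes) to slew out.
# An SNTP client that just steps the clock jumps it instantly,
# which is exactly what makes matching timestamps across streams
# (or across cameras) unreliable.
```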
I started with an RPi by itself. Then I tried a Coral USB stick. I also tried the Intel Neural Compute Stick 2. The Coral USB accelerator doesn't accelerate all of the layers, only some of them; the CPU has to do the rest of the work. Plus, you only get this speed if you preload an image into memory and blast it through the accelerator in a loop. That ignores getting the image INTO the accelerator, which requires reshaping it and shipping it across USB. It fell to pieces with -one- 720p video stream. The NCS is worse.
I didn't bother with multiple $100 Coral accelerators, because why would I when I already have a Xavier?
As I said, my goal was 20-30 fps with HD streams. Sure, I could drop the quality, but I didn't want to; that was the point.
> The Coral USB accelerator doesn't accelerate all of the layers, only some of them.
My understanding is that with the pretrained models, everything happens on the TPU. If you use some lightweight transfer learning techniques to tweak the model [1], the last layer happens on the CPU. That's supposed to be insignificant, but I haven't actually tried it.
I'm very curious what you're using for a model. You're clearly further along than I am. Did you use your own cameras' data? Did you do transfer learning? (If so, what did you start from? You mentioned SSD MobileNet and YOLOv3. Do you have a favorite?) Did you build a model from scratch?
Anyway, my point is that a similar project seems doable on a Raspberry Pi 4 with some extra hardware. I don't mean to say that you're Doing It Wrong for using a Xavier. I've thought about buying one of those myself...
Yeah, that's promising, although I don't think there's much hope of support if it doesn't work as promised. And I have doubts about the software quality. As a small example: if you follow Gyrfalcon's installation instructions for the basic PLAI Builder, it sets up a udev rule that makes every SCSI device world-writable. I only realized that by accident later. And of course everything is closed-source.
Gyrfalcon's own site is actively hostile to hobbyists. They only want to deal with researchers and folks preparing to package their chips into volume products. Signing up with a suitable email address and being manually approved lets you buy the device. You then have to negotiate to buy the Model Development Kits.
Hardware-wise, their stuff looks really neat. The $20 Orange Pi AI Stick Lite has the 2801 chip at 5.6 TOPS. Gyrfalcon's version of it costs $50. The 2803 chip does 16.8 TOPS. Gyrfalcon's USB-packaged version costs $70. That'd be a fantastic deal if the software situation were satisfactory, and a future Orange Pi version might be even cheaper.
This is sadly typical, and while I understand they don't want the support burden of hobbyists, I would have thought the Orange Pi would ship in interesting enough numbers for there to be some kind of support.
It looks like the Orange Pi 4B includes one of these chips on board?
> It looks like the Orange Pi 4B includes one of these chips on board?
Yes, it has a 2801S.
And the SolidRun HummingBoard Ripple has a 2803S. It seems a little pricey compared to a Raspberry Pi 4 + USB PLAI Plug 2803, but maybe worth it if you can actually get the software... (and I don't think they just give you one download that supports both models)
> * iirc, I got the Coral USB Accelerator to do about 180 fps with this model. [edit: but don't trust my memory—it could have been as low as 100 fps.]
Just dusted off my test program. 115.5 fps on my Intel NUC. I think that's the limit of this model on the Coral USB Accelerator, or very close to it.
My Raspberry Pi 4 is still compiling... I might update with that number in a bit. Likely the H.264 decoding will be the bottleneck, as I haven't set up hardware decoding.
72.2 fps on the Raspberry Pi 4 right now, with CPU usage varying between 150% and 220%. I expect with some work I could max out the Coral USB Accelerator, as the Intel NUC is likely already doing.
[1] https://groups.google.com/g/moonfire-nvr-users/c/ZD1uS7kL7tc...