Locating Objects with DetectNet
1. DetectNet
The models we used before, imageNet with GoogLeNet and ResNet-18, classify an entire image or video frame. To recognize multiple objects in a single photo or a moving scene, let's try detectNet, which can detect many objects within one frame. As before, Jetson Nano provides the sample program in two versions, Python and C++.
Below is the official description, which explains it quite clearly: up to 91 object classes can be detected.
The detectNet object accepts an image as input, and outputs a list of coordinates of the detected bounding boxes along with their classes and confidence values. detectNet is available to use from Python and C++. See below for various pre-trained detection models available for download. The default model used is a 91-class SSD-Mobilenet-v2 model trained on the MS COCO dataset, which achieves realtime inferencing performance on Jetson with TensorRT.
As examples of using the detectNet class, we provide sample programs for C++ and Python:
detectnet.cpp (C++)
detectnet.py (Python)
Let's start by detecting objects in still images.
There are some additional command-line options:
optional --network flag which changes the detection model being used (the default is SSD-Mobilenet-v2).
optional --overlay flag which can be comma-separated combinations of box, labels, conf, and none
The default is --overlay=box,labels,conf which displays boxes, labels, and confidence values
optional --alpha value which sets the alpha blending value used during overlay (the default is 120).
optional --threshold value which sets the minimum threshold for detection (the default is 0.5).
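As a sketch, the optional flags above can be combined on one command line (the input image is one of the stock test images that ship with jetson-inference; the output path is an assumption):

```shell
# run SSD-Mobilenet-v2 with a higher confidence threshold,
# drawing only the boxes and labels (no confidence values)
./detectnet.py --network=ssd-mobilenet-v2 \
               --overlay=box,labels \
               --threshold=0.7 \
               images/peds_0.jpg images/test/peds_0_overlay.jpg
```

Raising --threshold trades missed detections for fewer false positives.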
First, let's enter the Docker container:
$ cd jetson-inference
$ ./docker/run.sh
$ cd build/aarch64/bin/
# Python
$ ./detectnet.py --network=ssd-mobilenet-v2 images/peds_0.jpg images/test/output.jpg # --network flag is optional
Multiple objects can be detected at once:
# Python
$ ./detectnet.py images/peds_1.jpg images/test/output.jpg
2. Detecting Multiple Files at Once
The 91-class MS COCO dataset that the SSD-based models were trained on includes people, vehicles, animals, and assorted types of household objects to detect.
It looks like we can experiment with other photos downloaded from the web.
# Python
$ ./detectnet.py "images/peds_*.jpg" images/test/peds_output_%i.jpg
The double quotes keep the shell from expanding the wildcard, so detectnet.py itself processes every jpg in the images folder whose name starts with peds_.
The results are written to the images/test folder as peds_output_N.jpg, with N numbered from 0.
All the output files end up in that folder.
3. Let's Try a Crowd Photo
$ ./detectnet.py images/a001.jpg images/test/a0001.jpg
It did pick up quite a few people...
but it also missed quite a few.
4. Testing with Video
$ ./detectnet.py images/city.mp4 images/test/city_out.mp4
If the scene moves too fast, detection is still not that accurate, and the Jetson Nano lags a bit; it may be time for an upgrade.
5. Running Different Detection Models
Using the ssd-inception-v2 model
Change to the directory below and run the script to download the SSD-Inception-v2 model:
$ cd jetson-inference/tools
$ ./download-models.sh
Select SSD-Inception-v2, then press Enter.
The download and installation will begin.
$ cd /jetson-inference/build/aarch64/bin/
$ ./detectnet.py --network=ssd-inception-v2 images/object_5.jpg images/test/output_5.jpg
As before, the first run takes a long time.
6. The Code
First, some background on the Python program: import sys.
Unlike the programs we wrote earlier, this one also imports the sys module. sys.argv is how a Python script reads the arguments given on the command line. It is simply a list: sys.argv[0] is the script name itself, and the remaining entries hold the external arguments passed when the program was run.
For reference, see https://shengyu7697.github.io/blog/2019/12/28/Python-sys-argv/
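A minimal illustration of sys.argv (the file name show_args.py and the arguments shown are hypothetical, just for this sketch):

```python
import sys

# Suppose this file is saved as show_args.py and run as:
#   python show_args.py images/peds_0.jpg output.jpg
# sys.argv is then simply a list of strings:
#   ['show_args.py', 'images/peds_0.jpg', 'output.jpg']

print("script name :", sys.argv[0])  # sys.argv[0] is always the script itself
for i, arg in enumerate(sys.argv[1:], start=1):
    print("argument {} : {}".format(i, arg))
```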
Next, some background on if __name__ == '__main__'. This part is not directly related to the code below... haha, I just wanted to put it here.
This article explains the meaning of if __name__ == '__main__' very well: http://blog.castman.net/%E6%95%99%E5%AD%B8/2018/01/27/python-name-main.html
I re-flowed that article's layout and it became much easier to follow.
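A minimal sketch of the idiom (the module name mymodule is hypothetical):

```python
# When a file is executed directly (python mymodule.py), Python sets
# __name__ to "__main__"; when the same file is imported from another
# script, __name__ is set to the module's own name instead.

def greet():
    return "hello from mymodule"

if __name__ == '__main__':
    # this branch only runs when the file is executed directly,
    # not when it is imported as a module
    print(greet())
```

This is why the detectnet.py script can both be run from the command line and be imported without side effects.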
import jetson.inference
import jetson.utils
import argparse
import sys
# parse the command line
parser = argparse.ArgumentParser(description="Locate objects in a live camera stream using an object detection DNN.")
parser.add_argument("input_URI", type=str, default="", nargs='?', help="URI of the input stream")
parser.add_argument("output_URI", type=str, default="", nargs='?', help="URI of the output stream")
parser.add_argument("--network", type=str, default="ssd-mobilenet-v2", help="pre-trained model to load (see below for options)")
parser.add_argument("--overlay", type=str, default="box,labels,conf", help="detection overlay flags (e.g. --overlay=box,labels,conf)\nvalid combinations are: 'box', 'labels', 'conf', 'none'")
parser.add_argument("--threshold", type=float, default=0.5, help="minimum detection threshold to use")
try:
    opt = parser.parse_known_args()[0]
except:
    print("")
    parser.print_help()
    sys.exit(0)

# load the object detection network
net = jetson.inference.detectNet(opt.network, sys.argv, opt.threshold)

# create video sources & outputs
input = jetson.utils.videoSource(opt.input_URI, argv=sys.argv)
output = jetson.utils.videoOutput(opt.output_URI, argv=sys.argv)

# process frames until the user exits
while True:
    # capture the next image
    img = input.Capture()

    # detect objects in the image (with overlay)
    detections = net.Detect(img, overlay=opt.overlay)

    # print the detections
    print("detected {:d} objects in image".format(len(detections)))
    for detection in detections:
        print(detection)

    # render the image
    output.Render(img)

    # update the title bar
    output.SetStatus("{:s} | Network {:.0f} FPS".format(opt.network, net.GetNetworkFPS()))

    # print out performance info
    net.PrintProfilerTimes()

    # exit on input/output EOS
    if not input.IsStreaming() or not output.IsStreaming():
        break
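One detail worth noting in the code above: it calls parser.parse_known_args() rather than parse_args(), and it also passes the raw sys.argv on to detectNet, videoSource, and videoOutput. parse_known_args() returns the options it recognizes plus a list of leftover arguments instead of raising an error, which lets extra flags fall through to the underlying library. A standalone sketch of that behavior (the --network and --threshold flags mirror the script above; --some-library-flag is a made-up unrecognized flag):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--network", type=str, default="ssd-mobilenet-v2")
parser.add_argument("--threshold", type=float, default=0.5)

# parse_known_args() keeps unrecognized flags instead of erroring out;
# they come back as the second element of the returned tuple
opt, unknown = parser.parse_known_args(
    ["--network=ssd-inception-v2", "--some-library-flag=1", "--threshold=0.7"])

print(opt.network)    # ssd-inception-v2
print(opt.threshold)  # 0.7
print(unknown)        # ['--some-library-flag=1']
```

This is why the script stays usable even when you add stream-related flags that argparse itself has never heard of.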