Please respect the original copyright: https://www.gewuweb.com/hot/17031.html
Object Tracking (7): Simple Object Tracking with OpenCV
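1. Overview
The object tracking process is:
- 1. Take an initial set of object detections (such as an input set of bounding box coordinates)
- 2. Create a unique ID for each of the initial detections
- 3. Track each object as it moves through the frames of the video, maintaining the assignment of unique IDs
Object tracking also lets us assign a unique ID to each tracked object, which makes it possible to count the unique objects in a video. It is essential for building a people counter.
An ideal object tracking algorithm will:
- 1. Require the object detection phase only once (i.e., when the object is first detected)
- 2. Be extremely fast, much faster than running the actual object detector itself
- 3. Handle the case where a tracked object "disappears" or moves outside the boundaries of the video frame
- 4. Be robust to occlusion
- 5. Pick up objects it has "lost" between frames
This is a tall order for any computer vision or image processing algorithm, and there are a variety of tricks we can use to help improve our object trackers.
In today's post, you will learn how to implement centroid tracking with OpenCV. Centroid tracking is an easy-to-understand yet highly effective tracking algorithm.
Centroid tracking relies on the Euclidean distance between (1) existing object centroids (i.e., objects the centroid tracker has already seen) and (2) new object centroids in subsequent frames of the video.
We will review the centroid algorithm in more depth in the next section. From there we will implement a Python class to contain our centroid tracking algorithm, then create a Python script to actually run the object tracker and apply it to input video.
Finally, we will run our object tracker and examine the results, noting both the strengths and the drawbacks of the algorithm.
2. The centroid tracking algorithm
The centroid tracking algorithm is a multi-step process. We will review each of the tracking steps in this section.
Step #1: Accept bounding box coordinates and compute centroids
To build a simple object tracking algorithm using centroid tracking, the first step is to accept bounding box coordinates from an object detector and use them to compute centroids.
The centroid tracking algorithm assumes that we pass in a set of bounding box (x, y)-coordinates for each detected object in every single frame.
These bounding boxes can be produced by any type of object detector you like (color thresholding + contour extraction, Haar cascades, HOG + Linear SVM, SSDs, Faster R-CNNs, etc.).
Once we have the bounding box coordinates we must compute the "centroid", or more simply, the center (x, y)-coordinates of the bounding box. The figure above demonstrates accepting a set of bounding box coordinates and computing the centroid.
Since these are the first set of bounding boxes presented to our algorithm, we assign them unique IDs.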
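As a minimal sketch of Step #1, the snippet below converts boxes in (startX, startY, endX, endY) form into integer centroids. The helper name boxes_to_centroids and the sample boxes are made up for illustration; they are not part of the tracker code reviewed later.

```python
import numpy as np


def boxes_to_centroids(rects):
    """Return an (N, 2) integer array of (cX, cY) centers.

    `rects` is assumed to be a sequence of (startX, startY, endX, endY)
    tuples, the same layout the tracker's update method expects.
    """
    rects = np.asarray(rects, dtype="float").reshape(-1, 4)
    cX = (rects[:, 0] + rects[:, 2]) / 2.0
    cY = (rects[:, 1] + rects[:, 3]) / 2.0
    # truncate to whole pixels, as the tracker does with int(...)
    return np.stack([cX, cY], axis=1).astype("int")


# two hypothetical detections
centroids = boxes_to_centroids([(10, 20, 50, 80), (100, 100, 140, 160)])
print(centroids)
```

On the first frame each row of this array would simply receive the next free object ID.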
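Step #2: Compute the Euclidean distance between new bounding boxes and existing objects
Three objects are present in this image. We need to compute the Euclidean distance between each pair of original centroids (purple) and new centroids (yellow).
For every subsequent frame in the video stream we apply Step #1 to compute object centroids. However, instead of assigning a new unique ID to each detected object (which would defeat the purpose of object tracking), we first need to determine whether we can associate the new object centroids (yellow) with the old object centroids (purple). To accomplish this, we compute the Euclidean distance (highlighted with green or red arrows) between each pair of existing object centroids and input object centroids.
With those pairwise distances in hand, how do we actually use them to match and associate the centroids?
The answer is in Step #3.
Step #3: Update the (x, y)-coordinates of existing objects
Our simple centroid tracking method associates objects by minimum distance. What do we do about the object in the bottom-left of the figure above?
The primary assumption of the centroid tracking algorithm is that a given object may move between subsequent frames, but the distance between its centroids in consecutive frames will be smaller than all other distances between objects.
Therefore, if we choose to associate centroids with minimum distances between subsequent frames, we can build our object tracker.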
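Steps #2 and #3 can be sketched with plain NumPy. The coordinates below are invented, and the distance matrix is built with broadcasting instead of the SciPy call used in the implementation later on; treat it as an illustration of the idea, not as the tracker itself.

```python
import numpy as np

# hypothetical centroids already being tracked (one per row)
existing = np.array([[100, 100], [300, 200]])
# hypothetical centroids detected in the next frame
incoming = np.array([[305, 210], [104, 98], [50, 400]])

# D[i, j] is the Euclidean distance between existing[i] and incoming[j]
D = np.linalg.norm(existing[:, None, :] - incoming[None, :, :], axis=2)

# for each existing object, the index of its closest incoming centroid
nearest = D.argmin(axis=1)

# incoming centroids nobody claimed are candidates for new object IDs
unmatched = sorted(set(range(len(incoming))) - set(nearest.tolist()))

print(np.round(D, 2))
print(nearest, unmatched)
```

Here each existing object claims the detection a few pixels away, and the third detection is left over, which is the situation Step #4 deals with.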
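But what about the lonely point in the bottom-left?
Step #4: Register new objects
In our example of object tracking with Python and OpenCV, we have a new object that was not matched with an existing object, so it is registered as object ID #3.
If there are more input detections than existing objects being tracked, we need to register the new objects. "Registering" simply means that we add the new object to our list of tracked objects by:
- Assigning it a new object ID
- Storing the centroid of the bounding box coordinates for that object
We can then go back to Step #2 and repeat the pipeline of steps for every frame in the video stream.
Step #5: Deregister old objects
Any reasonable object tracking algorithm needs to handle an object that has been lost, has disappeared, or has left the field of view.
Exactly how you handle these situations depends on where your object tracker is meant to be deployed. For this implementation, we deregister an old object when it cannot be matched to any incoming detection for a total of N subsequent frames.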
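The bookkeeping behind Step #5 is just a per-object counter. The sketch below uses a made-up limit of N = 3 and a made-up detection history for one object; it follows the same "greater than" comparison that the implementation below uses, so the object survives exactly N missed frames and is dropped on the next one.

```python
max_disappeared = 3              # hypothetical value of N
objects = {0: (120, 80)}         # object ID -> last known centroid
disappeared = {0: 0}             # object ID -> consecutive missed frames

# hypothetical history: was object 0 matched in each frame?
matched_per_frame = [True, False, False, False, False]

removed_at = None
for frame_idx, matched in enumerate(matched_per_frame):
    if 0 not in objects:
        break
    if matched:
        # a successful match resets the counter
        disappeared[0] = 0
    else:
        disappeared[0] += 1
        if disappeared[0] > max_disappeared:
            # deregister: forget both the centroid and the counter
            del objects[0]
            del disappeared[0]
            removed_at = frame_idx

print(removed_at, objects)
```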
3. Object tracking project structure
To see today's project structure in your terminal, simply use the tree command:
$ tree --dirsfirst
.
├── pyimagesearch
│   ├── __init__.py
│   └── centroidtracker.py
├── object_tracker.py
├── deploy.prototxt
└── res10_300x300_ssd_iter_140000.caffemodel
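4. Implementing centroid tracking with OpenCV
Before we can apply object tracking to our input video streams, we first need to implement the centroid tracking algorithm. While you digest this centroid tracker script, keep Steps 1-5 above in mind and review them as necessary.
As you will see, translating the steps into code requires quite a bit of thought. We perform all of the steps, but they are not linear because of the nature of our data structures and code constructs.
I would suggest:
- Reading the steps above
- Reading the code explanation for the centroid tracker
- Finally, reading the steps above once more
Once you are sure you understand the steps of the centroid tracking algorithm, open centroidtracker.py inside the pyimagesearch module and let's review the code: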
# import the necessary packages
from scipy.spatial import distance as dist
from collections import OrderedDict
import numpy as np

class CentroidTracker():
    def __init__(self, maxDisappeared=50):
        # initialize the next unique object ID along with two ordered
        # dictionaries used to keep track of mapping a given object
        # ID to its centroid and number of consecutive frames it has
        # been marked as "disappeared", respectively
        self.nextObjectID = 0
        self.objects = OrderedDict()
        self.disappeared = OrderedDict()
        # store the number of maximum consecutive frames a given
        # object is allowed to be marked as "disappeared" until we
        # need to deregister the object from tracking
        self.maxDisappeared = maxDisappeared
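We import the packages and modules we need: distance, OrderedDict, and numpy.
First we define the CentroidTracker class. The constructor accepts a single parameter: the maximum number of consecutive frames a given object may be lost/disappeared before the tracker deregisters it.
Our constructor sets up four instance variables:
- nextObjectID: a counter used to assign a unique ID to each object. If an object leaves the frame and does not come back within maxDisappeared frames, it is assigned a new (next) object ID when it reappears.
- objects: a dictionary with the object ID as the key and the centroid (x, y)-coordinates as the value
- disappeared: holds the number of consecutive frames (value) a particular object ID (key) has been marked as "lost"
- maxDisappeared: the number of consecutive frames an object is allowed to be marked as "lost/disappeared" before we deregister it
Let's define the register method, which is responsible for adding new objects to our tracker: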
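    def register(self, centroid):
        # when registering an object we use the next available object
        # ID to store the centroid
        self.objects[self.nextObjectID] = centroid
        self.disappeared[self.nextObjectID] = 0
        self.nextObjectID += 1
The register method accepts a centroid and adds it to the objects dictionary using the next available object ID.
The number of times the object has disappeared is initialized to 0 in the disappeared dictionary. Finally, we increment nextObjectID so that if a new object comes into view, it will be associated with a unique ID.
Similar to our register method, we also need a deregister method:
    def deregister(self, objectID):
        # to deregister an object ID we delete the object ID from
        # both of our respective dictionaries
        del self.objects[objectID]
        del self.disappeared[objectID]
Just as we can add new objects to our tracker, we also need the ability to remove old objects that have been lost or have disappeared from our input frames.
The deregister method simply deletes the objectID from both the objects and disappeared dictionaries.
The heart of our centroid tracker implementation lives in the update method: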
    def update(self, rects):
        # check to see if the list of input bounding box rectangles
        # is empty
        if len(rects) == 0:
            # loop over any existing tracked objects and mark them
            # as disappeared
            for objectID in list(self.disappeared.keys()):
                self.disappeared[objectID] += 1
                # if we have reached a maximum number of consecutive
                # frames where a given object has been marked as
                # missing, deregister it
                if self.disappeared[objectID] > self.maxDisappeared:
                    self.deregister(objectID)
            # return early as there are no centroids or tracking info
            # to update
            return self.objects
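The update method accepts a list of bounding box rectangles, presumably from an object detector (Haar cascade, HOG + Linear SVM, SSD, Faster R-CNN, etc.). Each entry of the rects parameter is assumed to be a tuple with this structure: (startX, startY, endX, endY).
If there are no detections, we loop over all object IDs and increment their disappeared count. We also check whether any object has exceeded the maximum number of consecutive frames it may be marked as missing; if so, we remove it from our tracking system. Since there is no tracking info to update, we return early.
Otherwise, we have quite a bit of work to do over the next seven code blocks of the update method: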
        # initialize an array of input centroids for the current frame
        inputCentroids = np.zeros((len(rects), 2), dtype="int")
        # loop over the bounding box rectangles
        for (i, (startX, startY, endX, endY)) in enumerate(rects):
            # use the bounding box coordinates to derive the centroid
            cX = int((startX + endX) / 2.0)
            cY = int((startY + endY) / 2.0)
            inputCentroids[i] = (cX, cY)
We initialize a NumPy array, inputCentroids, to store the centroid of each rect.
Then we loop over the bounding box rectangles, compute each centroid, and store it in the inputCentroids array.
If there are currently no objects being tracked, we register each of the new objects:
        # if we are currently not tracking any objects take the input
        # centroids and register each of them
        if len(self.objects) == 0:
            for i in range(0, len(inputCentroids)):
                self.register(inputCentroids[i])
Otherwise, we need to update the (x, y)-coordinates of any existing objects based on the centroid location that minimizes the Euclidean distance between them:
        # otherwise, we are currently tracking objects so we need to
        # try to match the input centroids to existing object
        # centroids
        else:
            # grab the set of object IDs and corresponding centroids
            objectIDs = list(self.objects.keys())
            objectCentroids = list(self.objects.values())
            # compute the distance between each pair of object
            # centroids and input centroids, respectively -- our
            # goal will be to match an input centroid to an existing
            # object centroid
            D = dist.cdist(np.array(objectCentroids), inputCentroids)
            # in order to perform this matching we must (1) find the
            # smallest value in each row and then (2) sort the row
            # indexes based on their minimum values so that the row
            # with the smallest value is at the *front* of the index
            # list
            rows = D.min(axis=1).argsort()
            # next, we perform a similar process on the columns by
            # finding, for each row, the index of the column holding
            # its smallest value and then ordering those indexes with
            # the previously computed row index list
            cols = D.argmin(axis=1)[rows]
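The updates to existing tracked objects begin at the else. The goal is to track the objects and maintain correct object IDs. This is accomplished by computing the Euclidean distances between all pairs of objectCentroids and inputCentroids, then associating the object IDs that minimize the Euclidean distance.
Inside the else block, we:
- Grab the objectID and objectCentroid values
- Compute the distance between each pair of existing object centroids and new input centroids. The output shape of our distance map D will be (# of object centroids, # of input centroids).
- To perform the matching we must (1) find the smallest value in each row and (2) sort the row indexes based on those minimum values. We then handle the columns in a similar way: for each row we take the index of the column holding its smallest value (D.argmin(axis=1)) and reorder those indexes using the sorted rows. Our goal is to have the index pairs with the smallest corresponding distances at the front of the lists.
The next step is to use the distances to see if we can associate object IDs: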
            # in order to determine if we need to update, register,
            # or deregister an object we need to keep track of which
            # of the rows and column indexes we have already examined
            usedRows = set()
            usedCols = set()
            # loop over the combination of the (row, column) index
            # tuples
            for (row, col) in zip(rows, cols):
                # if we have already examined either the row or
                # column value before, ignore it
                if row in usedRows or col in usedCols:
                    continue
                # otherwise, grab the object ID for the current row,
                # set its new centroid, and reset the disappeared
                # counter
                objectID = objectIDs[row]
                self.objects[objectID] = inputCentroids[col]
                self.disappeared[objectID] = 0
                # indicate that we have examined each of the row and
                # column indexes, respectively
                usedRows.add(row)
                usedCols.add(col)
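In the code block above, we:
- Initialize two sets to keep track of which row and column indexes we have already used. Keep in mind that a set is similar to a list but contains only unique values.
- Loop over the combinations of (row, col) index tuples in order to update our object centroids:
  - If we have already used either this row or this column index, we ignore it and continue the loop.
  - Otherwise, we have found an input centroid that:
    1. Has the smallest Euclidean distance to an existing centroid
    2. Has not been matched with any other object
  - In that case, we update the object centroid and make sure to add row and col to their respective usedRows and usedCols sets.
There are likely row and column indexes that never made it into usedRows and usedCols, i.e., ones we have not examined yet:
            # compute both the row and column index we have NOT yet
            # examined
            unusedRows = set(range(0, D.shape[0])).difference(usedRows)
            unusedCols = set(range(0, D.shape[1])).difference(usedCols)
So we must determine which centroid indexes we have not examined yet and store them in two new sets (unusedRows and unusedCols).
Our final check handles any objects that have been lost or have potentially disappeared: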
            # in the event that the number of object centroids is
            # equal or greater than the number of input centroids
            # we need to check and see if some of these objects have
            # potentially disappeared
            if D.shape[0] >= D.shape[1]:
                # loop over the unused row indexes
                for row in unusedRows:
                    # grab the object ID for the corresponding row
                    # index and increment the disappeared counter
                    objectID = objectIDs[row]
                    self.disappeared[objectID] += 1
                    # check to see if the number of consecutive
                    # frames the object has been marked "disappeared"
                    # for warrants deregistering the object
                    if self.disappeared[objectID] > self.maxDisappeared:
                        self.deregister(objectID)
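To finish up:
- If the number of object centroids is greater than or equal to the number of input centroids, we need to verify whether any of these objects are lost or have disappeared by looping over the unused row indexes (if any).
- In the loop, we:
  1. Increment their disappeared count in the dictionary.
  2. Check whether the disappeared count exceeds the maxDisappeared threshold; if so, we deregister the object.
Otherwise, the number of input centroids is greater than the number of existing object centroids, so we have new objects to register and track: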
            # otherwise, if the number of input centroids is greater
            # than the number of existing object centroids we need to
            # register each new input centroid as a trackable object
            else:
                for col in unusedCols:
                    self.register(inputCentroids[col])
        # return the set of trackable objects
        return self.objects
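We loop over the unusedCols indexes and register each new centroid. Finally, we return the set of trackable objects to the calling method.
5. Understanding the centroid tracking distance relationship
Our centroid tracking implementation was quite long, and admittedly, the distance-based matching is its most confusing aspect.
If you are having trouble following along with what that code is doing, consider opening a Python shell and performing the following experiment: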
>>> from scipy.spatial import distance as dist
>>> import numpy as np
>>> np.random.seed(42)
>>> objectCentroids = np.random.uniform(size=(2, 2))
>>> centroids = np.random.uniform(size=(3, 2))
>>> D = dist.cdist(objectCentroids, centroids)
>>> D
array([[0.82421549, 0.32755369, 0.33198071],
[0.72642889, 0.72506609, 0.17058938]])
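The result is a distance matrix D with two rows (# of existing object centroids) and three columns (# of new input centroids).
Just as we did earlier in the script, let's find the minimum distance in each row and sort the indexes based on this value: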
>>> D.min(axis=1)
array([0.32755369, 0.17058938])
>>> rows = D.min(axis=1).argsort()
>>> rows
array([1, 0])
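First, we find the minimum value of each row, which tells us which existing object is closest to a new input centroid. By then sorting on these values, we obtain the indexes of these rows.
We use a similar process for the columns: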
>>> D.argmin(axis=1)
array([1, 2])
>>> cols = D.argmin(axis=1)[rows]
>>> cols
array([2, 1])
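Here we find, for each row, the index of the column that holds the row's smallest value. We then reorder those column indexes using our existing rows.
Let's print the results and analyze them: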
>>> print(list(zip(rows, cols)))
[(1, 2), (0, 1)]
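Analyzing the results, we find that:
- D[1, 2] has the smallest Euclidean distance, implying that the second existing object will be matched against the third input centroid.
- D[0, 1] has the next smallest Euclidean distance, implying that the first existing object will be matched against the second input centroid.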
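The experiment above happens to produce two distinct columns. As a complementary sketch, the hand-picked matrix below (values invented for illustration) makes both existing objects prefer the same input centroid, which shows why the matching loop keeps its "used" sets.

```python
import numpy as np

# hypothetical distances: 2 existing objects (rows) x 3 input centroids (cols)
D = np.array([[1.0, 5.0, 9.0],
              [2.0, 6.0, 7.0]])

rows = D.min(axis=1).argsort()   # rows ordered by their best distance
cols = D.argmin(axis=1)[rows]    # each row's preferred column, same order

used_rows, used_cols = set(), set()
matches = []
for row, col in zip(rows, cols):
    row, col = int(row), int(col)
    # skip pairs whose row or column was already claimed
    if row in used_rows or col in used_cols:
        continue
    matches.append((row, col))
    used_rows.add(row)
    used_cols.add(col)

unused_rows = set(range(D.shape[0])) - used_rows
unused_cols = set(range(D.shape[1])) - used_cols
print(matches, unused_rows, unused_cols)
```

Both rows prefer column 0, so only the closer object (row 0) gets it. Reading the update code shown earlier, this case has more inputs than objects, so the two unused columns would be registered as new objects while row 1 keeps its previous centroid for this frame.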
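6. Implementing the object tracking driver script
Now that we have implemented the CentroidTracker class, let's put it to work with an object tracking driver script.
In the driver script you can use your own preferred object detector, provided that it produces a set of bounding boxes. This could be a Haar cascade, HOG + Linear SVM, YOLO, SSD, Faster R-CNN, etc. For this example script I use OpenCV's deep learning face detector, but feel free to make your own version of the script that implements a different detector.
In this script, we will:
- Grab frames from your webcam using a live VideoStream object
- Load and use OpenCV's deep learning face detector
- Instantiate our CentroidTracker and use it to track face objects in the video stream
- Display our results, which include bounding boxes and object ID annotations overlaid on the frames
When you are ready, open object_tracker.py and follow along: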
# import the necessary packages
from pyimagesearch.centroidtracker import CentroidTracker
from imutils.video import VideoStream
import numpy as np
import argparse
import imutils
import time
import cv2
# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
    help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
    help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())
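First, we specify our imports. Most notably, we use the CentroidTracker class that we just reviewed. We also use VideoStream from imutils, along with OpenCV.
We have three command line arguments, all related to our deep learning face detector:
- --prototxt: the path to the Caffe "deploy" prototxt file
- --model: the path to the pre-trained model
- --confidence: our probability threshold for filtering weak detections. I found the default value of 0.5 to be sufficient.
Next, let's perform our initializations: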
# initialize our centroid tracker and frame dimensions
ct = CentroidTracker()
(H, W) = (None, None)
# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
# initialize the video stream and allow the camera sensor to warmup
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)
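In the block above, we:
- Instantiate our CentroidTracker, ct. Recall from the previous section that this object has three methods: (1) register, (2) deregister, and (3) update. We only use the update method, since it registers and deregisters objects automatically. We also initialize H and W (our frame dimensions) to None.
- Load our serialized deep learning face detector model from disk using OpenCV's DNN module
- Start our VideoStream, vs. With vs we can capture frames from our camera in the upcoming while loop. We give the camera 2.0 seconds to warm up.
Now let's begin our while loop and start tracking face objects: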
# loop over the frames from the video stream
while True:
    # read the next frame from the video stream and resize it
    frame = vs.read()
    frame = imutils.resize(frame, width=400)
    # if the frame dimensions are None, grab them
    if W is None or H is None:
        (H, W) = frame.shape[:2]
    # construct a blob from the frame, pass it through the network,
    # obtain our output predictions, and initialize the list of
    # bounding box rectangles
    blob = cv2.dnn.blobFromImage(frame, 1.0, (W, H),
        (104.0, 177.0, 123.0))
    net.setInput(blob)
    detections = net.forward()
    rects = []
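We loop over the frames and resize them to a fixed width (while preserving the aspect ratio). The frame dimensions are grabbed the first time they are needed.
We then pass the frame through the CNN object detector to obtain predictions and object locations, and we initialize rects, the list that will hold our bounding box rectangles.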
    # loop over the detections
    for i in range(0, detections.shape[2]):
        # filter out weak detections by ensuring the predicted
        # probability is greater than a minimum threshold
        if detections[0, 0, i, 2] > args["confidence"]:
            # compute the (x, y)-coordinates of the bounding box for
            # the object, then update the bounding box rectangles list
            box = detections[0, 0, i, 3:7] * np.array([W, H, W, H])
            rects.append(box.astype("int"))
            # draw a bounding box surrounding the object so we can
            # visualize it
            (startX, startY, endX, endY) = box.astype("int")
            cv2.rectangle(frame, (startX, startY), (endX, endY),
                (0, 255, 0), 2)
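We begin looping over the detections. If a detection exceeds our confidence threshold, indicating a valid detection, we:
- Compute the bounding box coordinates and append them to the rects list
- Draw a bounding box around the object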
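The filtering and scaling can be tried without a camera or model. The sketch below fabricates a detections array with the layout the loop above indexes into (confidence at position 2, normalized box corners at positions 3 to 6); the numbers and the frame size are invented for illustration.

```python
import numpy as np

W, H = 400, 300          # hypothetical frame width and height
min_confidence = 0.5     # same default as the --confidence argument

# fake network output with two detections, shape (1, 1, N, 7)
detections = np.zeros((1, 1, 2, 7), dtype="float64")
detections[0, 0, 0] = [0, 1, 0.90, 0.25, 0.25, 0.50, 0.75]   # kept
detections[0, 0, 1] = [0, 1, 0.30, 0.10, 0.10, 0.20, 0.20]   # too weak

rects = []
for i in range(detections.shape[2]):
    if detections[0, 0, i, 2] > min_confidence:
        # scale normalized (startX, startY, endX, endY) to pixels
        box = detections[0, 0, i, 3:7] * np.array([W, H, W, H])
        rects.append(tuple(int(v) for v in box))

print(rects)
```

The resulting rects list has exactly the (startX, startY, endX, endY) structure that the tracker's update method expects.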
Finally, let's call update on our centroid tracker object, ct:
    # update our centroid tracker using the computed set of bounding
    # box rectangles
    objects = ct.update(rects)
    # loop over the tracked objects
    for (objectID, centroid) in objects.items():
        # draw both the ID of the object and the centroid of the
        # object on the output frame
        text = "ID {}".format(objectID)
        cv2.putText(frame, text, (centroid[0] - 10, centroid[1] - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
        cv2.circle(frame, (centroid[0], centroid[1]), 4, (0, 255, 0), -1)
    # show the output frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF
    # if the `q` key was pressed, break from the loop
    if key == ord("q"):
        break
# do a bit of cleanup
cv2.destroyAllWindows()
vs.stop()
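The ct.update call handles the heavy lifting in our simple object tracker implemented with Python and OpenCV.
If we did not care about visualization, we would be done here and ready to loop back to the top.
We display the centroid as a filled-in circle along with the unique object ID number text. We can now visualize the results and check whether our CentroidTracker is tracking objects properly by associating the correct IDs with the objects in the video stream.
We display the frame until the quit key ("q") is pressed. If it is pressed, we simply break out of the loop and perform cleanup.
7. Complete code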
centroidtracker.py
# import the necessary packages
from scipy.spatial import distance as dist
from collections import OrderedDict
import numpy as np

class CentroidTracker():
    def __init__(self, maxDisappeared=50):
        # initialize the next unique object ID along with two ordered
        # dictionaries used to keep track of mapping a given object
        # ID to its centroid and number of consecutive frames it has
        # been marked as "disappeared", respectively
        self.nextObjectID = 0
        self.objects = OrderedDict()
        self.disappeared = OrderedDict()
        # store the number of maximum consecutive frames a given
        # object is allowed to be marked as "disappeared" until we
        # need to deregister the object from tracking
        self.maxDisappeared = maxDisappeared

    def register(self, centroid):
        # when registering an object we use the next available object
        # ID to store the centroid
        self.objects[self.nextObjectID] = centroid
        self.disappeared[self.nextObjectID] = 0
        self.nextObjectID += 1

    def deregister(self, objectID):
        # to deregister an object ID we delete the object ID from
        # both of our respective dictionaries
        del self.objects[objectID]
        del self.disappeared[objectID]

    def update(self, rects):
        # check to see if the list of input bounding box rectangles
        # is empty
        if len(rects) == 0:
            # loop over any existing tracked objects and mark them
            # as disappeared (iterate over a copy of the keys, since
            # deregister() removes entries while we loop)
            for objectID in list(self.disappeared.keys()):
                self.disappeared[objectID] += 1
                # if we have reached a maximum number of consecutive
                # frames where a given object has been marked as
                # missing, deregister it
                if self.disappeared[objectID] > self.maxDisappeared:
                    self.deregister(objectID)
            # return early as there are no centroids or tracking info
            # to update
            return self.objects
        # initialize an array of input centroids for the current frame
        inputCentroids = np.zeros((len(rects), 2), dtype="int")
        # loop over the bounding box rectangles
        for (i, (startX, startY, endX, endY)) in enumerate(rects):
            # use the bounding box coordinates to derive the centroid
            cX = int((startX + endX) / 2.0)
            cY = int((startY + endY) / 2.0)
            inputCentroids[i] = (cX, cY)
        # if we are currently not tracking any objects take the input
        # centroids and register each of them
        if len(self.objects) == 0:
            for i in range(0, len(inputCentroids)):
                self.register(inputCentroids[i])
        # otherwise, we are currently tracking objects so we need to
        # try to match the input centroids to existing object
        # centroids
        else:
            # grab the set of object IDs and corresponding centroids
            objectIDs = list(self.objects.keys())
            objectCentroids = list(self.objects.values())
            # compute the distance between each pair of object
            # centroids and input centroids, respectively -- our
            # goal will be to match an input centroid to an existing
            # object centroid
            D = dist.cdist(np.array(objectCentroids), inputCentroids)
            # in order to perform this matching we must (1) find the
            # smallest value in each row and then (2) sort the row
            # indexes based on their minimum values so that the row
            # with the smallest value is at the *front* of the index
            # list
            rows = D.min(axis=1).argsort()
            # next, we perform a similar process on the columns by
            # finding, for each row, the index of the column holding
            # its smallest value and then ordering those indexes with
            # the previously computed row index list
            cols = D.argmin(axis=1)[rows]
            # in order to determine if we need to update, register,
            # or deregister an object we need to keep track of which
            # of the rows and column indexes we have already examined
            usedRows = set()
            usedCols = set()
            # loop over the combination of the (row, column) index
            # tuples
            for (row, col) in zip(rows, cols):
                # if we have already examined either the row or
                # column value before, ignore it
                if row in usedRows or col in usedCols:
                    continue
                # otherwise, grab the object ID for the current row,
                # set its new centroid, and reset the disappeared
                # counter
                objectID = objectIDs[row]
                self.objects[objectID] = inputCentroids[col]
                self.disappeared[objectID] = 0
                # indicate that we have examined each of the row and
                # column indexes, respectively
                usedRows.add(row)
                usedCols.add(col)
            # compute both the row and column index we have NOT yet
            # examined
            unusedRows = set(range(0, D.shape[0])).difference(usedRows)
            unusedCols = set(range(0, D.shape[1])).difference(usedCols)
            # in the event that the number of object centroids is
            # equal or greater than the number of input centroids
            # we need to check and see if some of these objects have
            # potentially disappeared
            if D.shape[0] >= D.shape[1]:
                # loop over the unused row indexes
                for row in unusedRows:
                    # grab the object ID for the corresponding row
                    # index and increment the disappeared counter
                    objectID = objectIDs[row]
                    self.disappeared[objectID] += 1
                    # check to see if the number of consecutive
                    # frames the object has been marked "disappeared"
                    # for warrants deregistering the object
                    if self.disappeared[objectID] > self.maxDisappeared:
                        self.deregister(objectID)
            # otherwise, if the number of input centroids is greater
            # than the number of existing object centroids we need to
            # register each new input centroid as a trackable object
            else:
                for col in unusedCols:
                    self.register(inputCentroids[col])
        # return the set of trackable objects
        return self.objects
object_tracker.py
# USAGE
# python object_tracker.py --prototxt deploy.prototxt --model res10_300x300_ssd_iter_140000.caffemodel
# import the necessary packages
from pyimagesearch.centroidtracker import CentroidTracker
from imutils.video import VideoStream
import numpy as np
import argparse
import imutils
import time
import cv2
# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
    help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
    help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())
# initialize our centroid tracker and frame dimensions
ct = CentroidTracker()
(H, W) = (None, None)
# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
# initialize the video stream and allow the camera sensor to warmup
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)
# loop over the frames from the video stream
while True:
    # read the next frame from the video stream and resize it
    frame = vs.read()
    frame = imutils.resize(frame, width=400)
    # if the frame dimensions are None, grab them
    if W is None or H is None:
        (H, W) = frame.shape[:2]
    # construct a blob from the frame, pass it through the network,
    # obtain our output predictions, and initialize the list of
    # bounding box rectangles
    blob = cv2.dnn.blobFromImage(frame, 1.0, (W, H),
        (104.0, 177.0, 123.0))
    net.setInput(blob)
    detections = net.forward()
    rects = []
    # loop over the detections
    for i in range(0, detections.shape[2]):
        # filter out weak detections by ensuring the predicted
        # probability is greater than a minimum threshold
        if detections[0, 0, i, 2] > args["confidence"]:
            # compute the (x, y)-coordinates of the bounding box for
            # the object, then update the bounding box rectangles list
            box = detections[0, 0, i, 3:7] * np.array([W, H, W, H])
            rects.append(box.astype("int"))
            # draw a bounding box surrounding the object so we can
            # visualize it
            (startX, startY, endX, endY) = box.astype("int")
            cv2.rectangle(frame, (startX, startY), (endX, endY),
                (0, 255, 0), 2)
    # update our centroid tracker using the computed set of bounding
    # box rectangles
    objects = ct.update(rects)
    # loop over the tracked objects
    for (objectID, centroid) in objects.items():
        # draw both the ID of the object and the centroid of the
        # object on the output frame
        text = "ID {}".format(objectID)
        cv2.putText(frame, text, (centroid[0] - 10, centroid[1] - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
        cv2.circle(frame, (centroid[0], centroid[1]), 4, (0, 255, 0), -1)
    # show the output frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF
    # if the `q` key was pressed, break from the loop
    if key == ord("q"):
        break
# do a bit of cleanup
cv2.destroyAllWindows()
vs.stop()
8. Centroid object tracking results
Open a terminal and execute the following command:
$ python object_tracker.py --prototxt deploy.prototxt \
--model res10_300x300_ssd_iter_140000.caffemodel
[INFO] loading model...
[INFO] starting video stream...
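Notice that even though the second face is "lost" once I move the book cover outside the camera's field of view, our object tracker picks the face back up when it re-enters the view. If the face stays outside the field of view for more than 50 frames, the object is deregistered.
9. Limitations and drawbacks
While our centroid tracker worked great in this example, this object tracking algorithm has two primary drawbacks.
The first is that it requires the object detection step to be run on every frame of the input video.
- For very fast object detectors (i.e., color thresholding and Haar cascades), having to run the detector on every input frame is likely not an issue.
- But if you are (1) using a significantly more computationally expensive object detector, such as HOG + Linear SVM or a deep learning-based detector, on (2) a resource-constrained device, your frame processing pipeline will slow down tremendously, because most of its time will be spent running a very slow detector.
The second drawback is related to the underlying assumption of the centroid tracking algorithm itself: centroids must lie close together between subsequent frames.
- This assumption typically holds, but keep in mind that we are representing our 3D world with 2D frames. What happens when one object overlaps with another?
- The answer is that object ID switching could occur.
- If two or more objects overlap each other to the point where their centroids intersect, and each ends up closest to the other object's detection, the algorithm may (unknowingly) swap the object IDs.
- It is important to understand that the overlapping/occluded object problem is not specific to centroid tracking. It happens with many other object trackers as well, including advanced ones. However, the problem is more pronounced with centroid tracking because we rely strictly on the Euclidean distances between centroids and have no additional metrics, heuristics, or learned patterns.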
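A toy example makes the ID switch concrete. In the sketch below, two hypothetical objects pass each other between two frames, and the same minimum-distance matching used by the tracker hands each ID to the wrong detection. The coordinates and the true_owner ground truth are invented for illustration.

```python
import numpy as np

# last known centroids: object 0 on the left, object 1 on the right
previous = {0: np.array([100, 100]), 1: np.array([120, 100])}
# next-frame detections: object 0 really moved right to (125, 100),
# object 1 really moved left to (95, 100), so they crossed
detections = np.array([[125, 100], [95, 100]])
true_owner = [0, 1]   # ground truth: detection index -> real object

object_ids = list(previous.keys())
old = np.array(list(previous.values()))
D = np.linalg.norm(old[:, None, :] - detections[None, :, :], axis=2)

rows = D.min(axis=1).argsort()
cols = D.argmin(axis=1)[rows]
assigned, used_rows, used_cols = {}, set(), set()
for row, col in zip(rows, cols):
    row, col = int(row), int(col)
    if row in used_rows or col in used_cols:
        continue
    assigned[object_ids[row]] = col
    used_rows.add(row)
    used_cols.add(col)

# IDs whose assigned detection belongs to a different real object
switched = sorted(oid for oid, col in assigned.items()
                  if true_owner[col] != oid)
print(assigned, switched)
```

Each ID lands on the detection only 5 pixels away, which belongs to the other object, so both IDs are swapped without the tracker noticing.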
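As long as you keep these assumptions and limitations in mind when using centroid tracking, the algorithm will work well for you.
BONUS
The following script implements multi-object tracking based on YOLOv3 and the centroid tracking algorithm: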
# import the necessary packages
from CentroidTracking.centroidtracker import CentroidTracker
from imutils.video import VideoStream
import numpy as np
import argparse
import imutils
import time
import cv2
import os
# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--input", required=True,
    help="path to input video")
# ap.add_argument("-o", "--output", required=True,
#     help="path to output video")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
    help="minimum probability to filter weak detections")
ap.add_argument("-t", "--threshold", type=float, default=0.3,
    help="threshold when applying non-maxima suppression")
args = vars(ap.parse_args())
ct = CentroidTracker()
# load the COCO class labels, our YOLO model was trained on
labelsPath = os.path.sep.join(["yolo-coco", "coco.names"])
LABELS = open(labelsPath).read().strip().split("\n")
# initialize a list of colors to represent each possible class label
np.random.seed(42)
COLORS = np.random.randint(0, 255, size=(len(LABELS), 3), dtype="uint8")
# derive the paths to the YOLO weights and model configuration
weightsPath = os.path.sep.join(["yolo-coco", "yolov3.weights"])
configPath = os.path.sep.join(["yolo-coco", "yolov3.cfg"])
# load our YOLO object detector trained on COCO dataset (80 classes)
print("[INFO] loading YOLO from disk...")
net = cv2.dnn.readNetFromDarknet(configPath, weightsPath)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
writer = None
if args["input"] == 'camera':
    cap = cv2.VideoCapture(0)
else:
    cap = cv2.VideoCapture(args["input"])
# try to determine the total number of frames in the video file
try:
    prop = cv2.cv.CV_CAP_PROP_FRAME_COUNT if imutils.is_cv2() \
        else cv2.CAP_PROP_FRAME_COUNT
    # query the capture object opened above (the original referenced an
    # undefined `vs` here, so the lookup always fell through to except)
    total = int(cap.get(prop))
    print("[INFO] {} total frames in video".format(total))
# an error occurred while trying to determine the total
# number of frames in the video file
except Exception:
    print("[INFO] could not determine # of frames in video")
    print("[INFO] no approx. completion time can be provided")
    total = -1
print(cap.isOpened())
print("starting-----------------------------------------------------------")
begin = time.time()
while (cap.isOpened()):
    ret, image = cap.read()
    # load our input image and grab its spatial dimension
    if ret == True:
        (H, W) = image.shape[:2]
        # determine only the *output* layer names that we need from YOLO;
        # flatten() copes with OpenCV versions that return either an
        # Nx1 array or a flat array of layer indexes
        ln = net.getLayerNames()
        outIdxs = np.array(net.getUnconnectedOutLayers()).flatten()
        ln = [ln[i - 1] for i in outIdxs]
        # construct a blob from the input image and then perform a forward
        # pass of the YOLO object detector, giving us our bounding boxes and
        # associated probabilities
        blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416),
            swapRB=True, crop=False)
        net.setInput(blob)
        start = time.time()
        layerOutputs = net.forward(ln)
        end = time.time()
        # show timing information on YOLO
        print("[INFO] YOLO took {:.6f} seconds".format(end - start))
        # initialize our lists of detected bounding boxes, confidences, and
        # class IDs, respectively
        boxes = []
        boxes_c = []
        confidences = []
        classIDs = []
        rects = []
        # loop over each of the layer outputs
        for output in layerOutputs:
            # loop over each of the detections
            for detection in output:
                # extract the class ID and confidence (i.e., probability) of
                # the current object detection
                scores = detection[5:]
                classID = np.argmax(scores)
                confidence = scores[classID]
                # filter out weak predictions by ensuring the detected
                # probability is greater than the minimum probability
                if confidence > args["confidence"]:
                    # scale the bounding box coordinates back relative to the
                    # size of the image, keeping in mind that YOLO actually
                    # returns the center (x, y)-coordinates of the bounding
                    # box followed by the boxes' width and height
                    box = detection[0:4] * np.array([W, H, W, H])
                    (centerX, centerY, width, height) = box.astype("int")
                    # use the center (x, y)-coordinates to derive the top
                    # and left corner of the bounding box
                    x = int(centerX - (width / 2))
                    y = int(centerY - (height / 2))
                    # update our list of bounding box coordinates, confidences,
                    # and class IDs; boxes_c keeps the same box in
                    # (startX, startY, endX, endY) form for the tracker
                    boxes.append([x, y, int(width), int(height)])
                    boxes_c.append([centerX - int(width/2), centerY - int(height/2),
                        centerX + int(width/2), centerY + int(height/2)])
                    confidences.append(float(confidence))
                    classIDs.append(classID)
        # apply non-maxima suppression to suppress weak, overlapping bounding
        # boxes
        idxs = cv2.dnn.NMSBoxes(boxes, confidences, args["confidence"], args["threshold"])
        if len(idxs) > 0:
            for i in idxs.flatten():
                rects.append(boxes_c[i])
        # update our centroid tracker using the computed set of bounding
        # box rectangles
        objects = ct.update(rects)
        # loop over the tracked objects
        for (objectID, centroid) in objects.items():
            # draw both the ID of the object and the centroid of the
            # object on the output frame
            text = "ID {}".format(objectID)
            cv2.putText(image, text, (centroid[0] - 10, centroid[1] - 10),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
            cv2.circle(image, (centroid[0], centroid[1]), 4, (0, 255, 0), -1)
        # ensure at least one detection exists
        if len(idxs) > 0:
            # loop over the indexes we are keeping
            for i in idxs.flatten():
                # extract the bounding box coordinates
                (x, y) = (boxes[i][0], boxes[i][1])
                (w, h) = (boxes[i][2], boxes[i][3])
                # draw a bounding box rectangle and label on the image
                color = [int(c) for c in COLORS[classIDs[i]]]
                cv2.rectangle(image, (x, y), (x + w, y + h), color, 2)
                text = "{}: {:.4f}".format(LABELS[classIDs[i]], confidences[i])
                cv2.putText(image, text, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX,
                    0.5, color, 2)
        # check if the video writer is None
        if writer is None:
            # initialize our video writer
            fourcc = cv2.VideoWriter_fourcc(*"MJPG")
            writer = cv2.VideoWriter("output.avi", fourcc, 30,
                (image.shape[1], image.shape[0]), True)
            # some information on processing single frame
            if total > 0:
                elap = (end - start)
                print("[INFO] single frame took {:.4f} seconds".format(elap))
                print("[INFO] estimated total time to finish: {:.4f}".format(elap * total))
        cv2.imshow("Live", image)
        # write the output frame to disk
        writer.write(image)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break
# release the file pointers
print("[INFO] cleaning up...")
# the writer only exists if at least one frame was processed
if writer is not None:
    writer.release()
cap.release()
cv2.destroyAllWindows()
finish = time.time()
print(f"Total time taken : {finish - begin}")
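One detail of the script above is worth isolating: YOLO reports each box as a normalized center plus width and height, while the centroid tracker expects (startX, startY, endX, endY). A minimal sketch of that conversion follows; the helper name yolo_to_corners and the sample values are hypothetical.

```python
import numpy as np


def yolo_to_corners(detection_xywh, W, H):
    """Convert a normalized (centerX, centerY, width, height) box into
    pixel (startX, startY, endX, endY) corners for a W x H frame."""
    box = np.asarray(detection_xywh, dtype="float") * np.array([W, H, W, H])
    centerX, centerY, width, height = box
    startX = int(centerX - width / 2)
    startY = int(centerY - height / 2)
    endX = int(centerX + width / 2)
    endY = int(centerY + height / 2)
    return (startX, startY, endX, endY)


# hypothetical detection centered in a 400x200 frame
corners = yolo_to_corners([0.5, 0.5, 0.25, 0.5], W=400, H=200)
print(corners)
```

A list of such tuples can be passed straight to the tracker's update method, and the centroid it computes from them lands back on the YOLO center.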
