Author | Adam Geitgey
Translator | 风车云马
Editor | Jane
Produced by | AI科技大本营 (ID: rgznai100)
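The author combines a camera with deep learning algorithms to build, in Python, a high-accuracy parking space notification system that sends him a text message whenever a new space opens up. It sounds complicated, and you may wonder whether it is really practical. In fact, every tool it uses is available off the shelf, and combining them properly makes the system quick and simple to build.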
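There are several ways to detect the cars in an image:
1. Use a HOG (Histogram of Oriented Gradients) object detector to detect all the cars. This non-deep-learning approach runs relatively fast, but it cannot handle cars rotated in different orientations.
2. Use a CNN (Convolutional Neural Network) object detector to detect all the cars. This approach is accurate but inefficient, because the same image must be scanned many times to find every car. It copes easily with cars rotated in different orientations, but it needs more training data than the HOG approach.
3. Use a newer deep learning approach such as Mask R-CNN, Faster R-CNN, or YOLO, which combines accuracy with efficiency and greatly speeds up detection. Given plenty of training data, it also runs fast on a GPU.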
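Generally, we want the simplest workable algorithm and the least training data, not necessarily the newest popular algorithm. For this particular scenario, though, Mask R-CNN is a good choice.
The Mask R-CNN architecture detects objects across the whole image without using a sliding window, so it runs fast. With a GPU, we can detect vehicles in high-resolution video at several frames per second.
Mask R-CNN also gives us a lot of information about each detection. Most object detection algorithms return only a bounding box for each object, but Mask R-CNN gives us both the location of each object and its outline (mask).
To train a Mask R-CNN model, we need many images of the kind of object we want to detect. We could spend a few days outside taking photos, but public datasets of car images already exist. One popular dataset is COCO (short for Common Objects In Context), which contains more than 12,000 images of cars.
This data is well suited to training a Mask R-CNN model, and many people have already used the COCO dataset and shared their trained results. We can therefore start from a pre-trained model; this project uses Matterport's open-source model.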
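Out of the box, the pre-trained model identifies not only vehicles but also traffic lights and people. Amusingly, it labeled one of the trees as a "potted plant". For each object detected in an image, the Mask R-CNN model gives us four things:
(1) The class of the object. The COCO model can recognize 80 different kinds of objects, such as cars and trucks.
(2) A confidence score for the detection.
(3) The bounding box of the object in the image, given as X/Y pixel positions.
(4) A mask marking which pixels belong to the object, which gives its outline.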
Below is Python code that uses Matterport's pre-trained Mask R-CNN model and OpenCV to detect the bounding boxes of cars:
import os
import numpy as np
import cv2
import mrcnn.config
import mrcnn.utils
from mrcnn.model import MaskRCNN
from pathlib import Path


# Configuration that will be used by the Mask-RCNN library
class MaskRCNNConfig(mrcnn.config.Config):
    NAME = "coco_pretrained_model_config"
    IMAGES_PER_GPU = 1
    GPU_COUNT = 1
    NUM_CLASSES = 1 + 80  # COCO dataset has 80 classes + one background class
    DETECTION_MIN_CONFIDENCE = 0.6


# Filter a list of Mask R-CNN detection results to get only the detected cars / trucks
def get_car_boxes(boxes, class_ids):
    car_boxes = []

    for i, box in enumerate(boxes):
        # Keep the detection only if it is a car, truck or bus (COCO class IDs 3, 8, 6)
        if class_ids[i] in [3, 8, 6]:
            car_boxes.append(box)

    return np.array(car_boxes)


# Root directory of the project
ROOT_DIR = Path(".")

# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")

# Local path to trained weights file
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")

# Download COCO trained weights from Releases if needed
if not os.path.exists(COCO_MODEL_PATH):
    mrcnn.utils.download_trained_weights(COCO_MODEL_PATH)

# Directory of images to run detection on
IMAGE_DIR = os.path.join(ROOT_DIR, "images")

# Video file or camera to process - set this to 0 to use your webcam instead of a video file
VIDEO_SOURCE = "test_images/parking.mp4"

# Create a Mask-RCNN model in inference mode
model = MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=MaskRCNNConfig())

# Load pre-trained model
model.load_weights(COCO_MODEL_PATH, by_name=True)

# Location of parking spaces
parked_car_boxes = None

# Load the video file we want to run detection on
video_capture = cv2.VideoCapture(VIDEO_SOURCE)

# Loop over each frame of video
while video_capture.isOpened():
    # Grab the next frame; success is False when the video ends or the read fails
    success, frame = video_capture.read()
    if not success:
        break

    # Convert the image from BGR color (which OpenCV uses) to RGB color
    rgb_image = frame[:, :, ::-1]

    # Run the image through the Mask R-CNN model to get results.
    results = model.detect([rgb_image], verbose=0)

    # Mask R-CNN assumes we are running detection on multiple images.
    # We only passed in one image to detect, so only grab the first result.
    r = results[0]

    # The r variable will now have the results of detection:
    # - r['rois'] are the bounding box of each detected object
    # - r['class_ids'] are the class id (type) of each detected object
    # - r['scores'] are the confidence scores for each detection
    # - r['masks'] are the object masks for each detected object (which gives you the object outline)

    # Filter the results to only grab the car / truck bounding boxes
    car_boxes = get_car_boxes(r['rois'], r['class_ids'])

    print("Cars found in frame of video:")

    # Draw each box on the frame
    for box in car_boxes:
        print("Car: ", box)

        y1, x1, y2, x2 = box

        # Draw the box
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 1)

    # Show the frame of video on the screen
    cv2.imshow('Video', frame)

    # Hit 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Clean up everything when finished
video_capture.release()
cv2.destroyAllWindows()
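If we assume that each bounding box represents a parking space, a box can still be partly covered by another car even after the car parked in it has driven away. We therefore need a way to measure overlap so that we can find boxes that are "mostly empty". The measure we use is called Intersection over Union (IoU): the number of pixels where two objects overlap, divided by the number of pixels covered by the two objects together.
With this value, it is easy to decide whether a car is in a parking space. A low IoU, such as 0.15, means the car does not cover most of the space. A high IoU, such as 0.6, means the car covers most of the space, so we can conclude that the space is occupied.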
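To make the measure concrete, here is a minimal pure-Python sketch of IoU for two axis-aligned boxes. The function name `box_iou` and the example coordinates are illustrative assumptions, not part of the original project; the boxes use the same (y1, x1, y2, x2) layout as the detection code.

```python
def box_iou(box_a, box_b):
    """Return the IoU of two boxes given as (y1, x1, y2, x2) pixel coordinates."""
    ay1, ax1, ay2, ax2 = box_a
    by1, bx1, by2, bx2 = box_b

    # Height and width of the overlapping rectangle (zero if the boxes do not overlap)
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    intersection = inter_h * inter_w

    area_a = (ay2 - ay1) * (ax2 - ax1)
    area_b = (by2 - by1) * (bx2 - bx1)

    # Pixels covered by either box: count the shared region only once
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Hypothetical boxes, chosen only to illustrate the three typical cases
parking_space = (0, 0, 100, 200)
print(box_iou(parking_space, (0, 0, 100, 200)))    # a car exactly on the space
print(box_iou(parking_space, (0, 100, 100, 300)))  # a car shifted half a space sideways
print(box_iou(parking_space, (0, 500, 100, 700)))  # a car far away from the space
```

The first call gives 1.0, the second about 0.33, and the third 0.0, which matches the intuition that a high value means "occupied" and a low value means "mostly empty".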
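IoU is a common measure in computer vision, so ready-made code exists. Matterport's Mask R-CNN library provides it as the function mrcnn.utils.compute_overlaps(). Assuming we have a list of bounding boxes for the parking spaces, checking whether the detected cars fall inside those boxes takes only a line or two of code: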
# Filter the results to only grab the car / truck bounding boxes
car_boxes = get_car_boxes(r['rois'], r['class_ids'])

# See how much the known parking spaces overlap with the detected cars
# (the result has one row per parking space and one column per car)
overlaps = mrcnn.utils.compute_overlaps(parking_areas, car_boxes)

print(overlaps)
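In this 2D array, each row represents one parking space bounding box, and each column shows how much that space is overlapped by one of the detected cars. A score of 1.0 means a car covers the space completely, while a low score such as 0.02 means there is some overlap but the car does not cover most of the space.
To find unoccupied parking spaces, we only need to check each row of this array. If every number in a row is zero or very small, nothing is occupying that space, so it must be free.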
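The row check can be sketched as follows. This is only an illustration: `find_free_spaces` is a hypothetical helper, the matrix values are made up, and the 0.15 threshold is the one used in the complete script rather than a universal constant.

```python
def find_free_spaces(overlaps, threshold=0.15):
    """Return the indices of rows (parking spaces) whose largest IoU is below threshold."""
    free = []
    for index, row in enumerate(overlaps):
        # The best-matching car decides whether the space counts as occupied.
        # An empty row (no cars detected at all) is treated as no overlap.
        best = max(row, default=0.0)
        if best < threshold:
            free.append(index)
    return free


# Made-up overlap matrix: 3 parking spaces (rows) x 3 detected cars (columns)
overlaps = [
    [1.00, 0.02, 0.00],  # space 0: fully covered by the first car
    [0.00, 0.07, 0.00],  # space 1: only slightly touched by the second car
    [0.03, 0.00, 0.85],  # space 2: mostly covered by the third car
]
print(find_free_spaces(overlaps))  # [1]
```

Taking the maximum of each row means it does not matter which car covers the space, only that some car does.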
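Although Mask R-CNN is very accurate, object detection is never perfect, and it sometimes misses a car or two in a frame of video. So when we spot a free space, we should also confirm that it stays free for a while, for example 5 or 10 consecutive frames of video. This also avoids false alerts caused by glitches in the video itself. Once a space has been free for several consecutive frames, we send the alert.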
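One way to express this consecutive-frame check is a small counter that resets whenever a frame shows no free space. The class name `FreeSpaceDebouncer` and the default of 10 frames are assumptions made for this sketch; the right frame count depends on the camera and frame rate.

```python
class FreeSpaceDebouncer:
    """Confirm a free space only after it is seen in enough consecutive frames."""

    def __init__(self, required_frames=10):
        self.required_frames = required_frames
        self.streak = 0  # consecutive frames in which a free space was seen

    def update(self, free_space_seen):
        """Record one frame's result and return True once the streak is long enough."""
        if free_space_seen:
            self.streak += 1
        else:
            # A single frame without a free space resets the count
            self.streak = 0
        return self.streak >= self.required_frames


# Simulated per-frame results: the missed detection in frame 3 restarts the count
debouncer = FreeSpaceDebouncer(required_frames=3)
for frame_result in [True, True, False, True, True, True]:
    confirmed = debouncer.update(frame_result)
print(confirmed)  # True, because the last three frames all showed a free space
```

Resetting on every miss is deliberately strict; a more tolerant variant could allow one or two dropped frames, at the cost of more false alerts.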
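The last step is to send the SMS alert. Sending an SMS from Python with Twilio is simple and takes only a few lines of code. Twilio is just the method used in this project; you can use another service instead.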
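To use Twilio, sign up for a trial account, create a Twilio phone number, and get your account credentials. Then install the Twilio Python client library, which is published on PyPI as twilio.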
Below is the Python code for sending an SMS message (replace the values with your own account details):
from twilio.rest import Client

# Twilio account details
twilio_account_sid = 'Your Twilio SID here'
twilio_auth_token = 'Your Twilio Auth Token here'
twilio_source_phone_number = 'Your Twilio phone number here'

# Create a Twilio client object instance
client = Client(twilio_account_sid, twilio_auth_token)

# Send an SMS
message = client.messages.create(
    body="This is my SMS message!",
    from_=twilio_source_phone_number,
    to="Destination phone number here"
)
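When adding the SMS feature, take care not to keep sending alerts for a free space that has already been reported. A flag can track whether a message has been sent, so that another goes out only after a set interval or when a new free space is detected.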
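As a sketch of the "remind again after a set interval" idea, the wrapper below refuses to send while a cooldown is running. Everything here is hypothetical: the `AlertThrottle` name, the 10-minute default, and the injected `send` callable, which in the real script would wrap the Twilio call. It covers only the time-based part, not the detection of a newly freed space.

```python
import time


class AlertThrottle:
    """Send an alert at most once per cooldown period."""

    def __init__(self, send, cooldown_seconds=600.0, clock=time.monotonic):
        self.send = send                      # callable that actually delivers the message
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock                    # injectable so the logic can be tested without waiting
        self.last_sent = None                 # time of the last delivered alert, if any

    def notify(self, body):
        """Deliver body unless an alert was sent recently; return True if it was sent."""
        now = self.clock()
        if self.last_sent is not None and now - self.last_sent < self.cooldown_seconds:
            return False
        self.send(body)
        self.last_sent = now
        return True


# Demo with a list standing in for the SMS service and a fake clock
outbox = []
fake_now = [0.0]
throttle = AlertThrottle(outbox.append, cooldown_seconds=60.0, clock=lambda: fake_now[0])

throttle.notify("Parking space open")   # sent
fake_now[0] = 30.0
throttle.notify("Parking space open")   # suppressed, still inside the cooldown
fake_now[0] = 61.0
throttle.notify("Parking space open")   # sent again after the cooldown
print(len(outbox))  # 2
```

Passing the sender and the clock in as arguments keeps the throttling logic independent of Twilio, so it can be exercised without an account or network access.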
Now we combine all the steps into a single Python script. The complete code is below; to run it you need Python 3.6+, Matterport's Mask R-CNN, OpenCV, and the Twilio client library:
import os
import numpy as np
import cv2
import mrcnn.config
import mrcnn.utils
from mrcnn.model import MaskRCNN
from pathlib import Path
from twilio.rest import Client


# Configuration that will be used by the Mask-RCNN library
class MaskRCNNConfig(mrcnn.config.Config):
    NAME = "coco_pretrained_model_config"
    IMAGES_PER_GPU = 1
    GPU_COUNT = 1
    NUM_CLASSES = 1 + 80  # COCO dataset has 80 classes + one background class
    DETECTION_MIN_CONFIDENCE = 0.6


# Filter a list of Mask R-CNN detection results to get only the detected cars / trucks
def get_car_boxes(boxes, class_ids):
    car_boxes = []

    for i, box in enumerate(boxes):
        # Keep the detection only if it is a car, truck or bus (COCO class IDs 3, 8, 6)
        if class_ids[i] in [3, 8, 6]:
            car_boxes.append(box)

    return np.array(car_boxes)


# Twilio config
twilio_account_sid = 'YOUR_TWILIO_SID'
twilio_auth_token = 'YOUR_TWILIO_AUTH_TOKEN'
twilio_phone_number = 'YOUR_TWILIO_SOURCE_PHONE_NUMBER'
destination_phone_number = 'THE_PHONE_NUMBER_TO_TEXT'
client = Client(twilio_account_sid, twilio_auth_token)


# Root directory of the project
ROOT_DIR = Path(".")

# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")

# Local path to trained weights file
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")

# Download COCO trained weights from Releases if needed
if not os.path.exists(COCO_MODEL_PATH):
    mrcnn.utils.download_trained_weights(COCO_MODEL_PATH)

# Directory of images to run detection on
IMAGE_DIR = os.path.join(ROOT_DIR, "images")

# Video file or camera to process - set this to 0 to use your webcam instead of a video file
VIDEO_SOURCE = "test_images/parking.mp4"

# Create a Mask-RCNN model in inference mode
model = MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=MaskRCNNConfig())

# Load pre-trained model
model.load_weights(COCO_MODEL_PATH, by_name=True)

# Location of parking spaces
parked_car_boxes = None

# Load the video file we want to run detection on
video_capture = cv2.VideoCapture(VIDEO_SOURCE)

# How many frames of video we've seen in a row with a parking space open
free_space_frames = 0

# Have we sent an SMS alert yet?
sms_sent = False

# Loop over each frame of video
while video_capture.isOpened():
    # Grab the next frame; success is False when the video ends or the read fails
    success, frame = video_capture.read()
    if not success:
        break

    # Convert the image from BGR color (which OpenCV uses) to RGB color
    rgb_image = frame[:, :, ::-1]

    # Run the image through the Mask R-CNN model to get results.
    results = model.detect([rgb_image], verbose=0)

    # Mask R-CNN assumes we are running detection on multiple images.
    # We only passed in one image to detect, so only grab the first result.
    r = results[0]

    # The r variable will now have the results of detection:
    # - r['rois'] are the bounding box of each detected object
    # - r['class_ids'] are the class id (type) of each detected object
    # - r['scores'] are the confidence scores for each detection
    # - r['masks'] are the object masks for each detected object (which gives you the object outline)

    if parked_car_boxes is None:
        # This is the first frame of video - assume all the cars detected are in parking spaces.
        # Save the location of each car as a parking space box and go to the next frame of video.
        parked_car_boxes = get_car_boxes(r['rois'], r['class_ids'])
    else:
        # We already know where the parking spaces are. Check if any are currently unoccupied.

        # Get where cars are currently located in the frame
        car_boxes = get_car_boxes(r['rois'], r['class_ids'])

        # See how much those cars overlap with the known parking spaces
        overlaps = mrcnn.utils.compute_overlaps(parked_car_boxes, car_boxes)

        # Assume no spaces are free until we find one that is free
        free_space = False

        # Loop through each known parking space box
        for parking_area, overlap_areas in zip(parked_car_boxes, overlaps):

            # For this parking space, find the max amount it was covered by any
            # car that was detected in our image (doesn't really matter which car)
            max_IoU_overlap = np.max(overlap_areas)

            # Get the top-left and bottom-right coordinates of the parking area
            y1, x1, y2, x2 = parking_area

            # Check if the parking space is occupied by seeing if any car overlaps
            # it by more than 0.15 using IoU
            if max_IoU_overlap < 0.15:
                # Parking space not occupied! Draw a green box around it
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 3)
                # Flag that we have seen at least one open space
                free_space = True
            else:
                # Parking space is still occupied - draw a red box around it
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), 1)

            # Write the IoU measurement inside the box
            font = cv2.FONT_HERSHEY_DUPLEX
            cv2.putText(frame, f"{max_IoU_overlap:0.2}", (x1 + 6, y2 - 6), font, 0.3, (255, 255, 255))

        # If at least one space was free, start counting frames
        # This is so we don't alert based on one frame of a spot being open.
        # This helps prevent the script being triggered by one bad detection.
        if free_space:
            free_space_frames += 1
        else:
            # If no spots are free, reset the count
            free_space_frames = 0

        # If a space has been free for several frames, we are pretty sure it is really free!
        if free_space_frames > 10:
            # Write SPACE AVAILABLE! at the top of the screen
            font = cv2.FONT_HERSHEY_DUPLEX
            cv2.putText(frame, "SPACE AVAILABLE!", (10, 150), font, 3.0, (0, 255, 0), 2, cv2.FILLED)

            # If we haven't sent an SMS yet, send it!
            if not sms_sent:
                print("SENDING SMS!!!")
                message = client.messages.create(
                    body="Parking space open - go go go!",
                    from_=twilio_phone_number,
                    to=destination_phone_number
                )
                sms_sent = True

        # Show the frame of video on the screen
        cv2.imshow('Video', frame)

    # Hit 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Clean up everything when finished
video_capture.release()
cv2.destroyAllWindows()