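Drone Detection: A Step-by-Step Walkthrough of Mask R-CNN

Object detection is a computer vision technique for identifying and locating objects in an image.

Mask R-CNN extends object detection: for every object detected in an image, it produces both a bounding box and a segmentation mask.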
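
To make the relationship between the two outputs concrete, here is a minimal NumPy sketch that recovers a bounding box from a binary mask. The helper name mask_to_bbox and the toy 8 x 8 mask are made up for illustration and are not part of mrcnn.

```python
import numpy as np

def mask_to_bbox(mask):
    """Return (ymin, ymax, xmin, xmax) of the nonzero region of a 2-D mask, or None if empty."""
    rows = np.any(mask, axis=1)  # which rows contain any mask pixel
    cols = np.any(mask, axis=0)  # which columns contain any mask pixel
    if not rows.any():
        return None
    ymin, ymax = np.where(rows)[0][[0, -1]]
    xmin, xmax = np.where(cols)[0][[0, -1]]
    # +1 so the result can be used directly as slice bounds
    return int(ymin), int(ymax) + 1, int(xmin), int(xmax) + 1

# toy mask: a 3 x 4 rectangle of ones
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:5, 3:7] = 1
print(mask_to_bbox(mask))  # (2, 5, 3, 7)
```

The box is the tightest rectangle around the mask, which is why a mask carries strictly more information than a box.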

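This article shows how to train Mask R-CNN on a custom dataset, and will hopefully make the process simpler for you.

Libraries and Packages

The main package for this algorithm is mrcnn. First install the library, then import it into the environment.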

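!pip install mrcnn

import os
import re
import random

import cv2
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

from mrcnn.config import Config
from mrcnn import utils
import mrcnn.model as modellib
from mrcnn import visualize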

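For now, these are the only import statements we need.

As for TensorFlow, mrcnn is not yet compatible with TensorFlow 2.0, so make sure to revert to TensorFlow 1.x.

If I am not mistaken, tf.random_shuffle was renamed to tf.random.shuffle in TensorFlow 2.0, which causes the incompatibility. It may be possible to use TensorFlow 2.0 by changing the shuffle function in the mrcnn code.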
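
As a rough sketch of what changing the shuffle function could look like, the snippet below rewrites the old name in a piece of source text. The helper patch_shuffle_calls and the sample line are hypothetical, and it only handles this one rename; other TensorFlow 2.0 incompatibilities may remain.

```python
import re

def patch_shuffle_calls(source):
    """Rewrite TF 1.x style tf.random_shuffle references to the TF 2.x name."""
    return re.sub(r"\btf\.random_shuffle\b", "tf.random.shuffle", source)

# hypothetical line of library code, used only to show the substitution
sample = "ids = tf.random_shuffle(ids)[:count]"
print(patch_shuffle_calls(sample))  # ids = tf.random.shuffle(ids)[:count]
```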

!pip install keras==2.2.5

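Preprocessing

The mrcnn package is fairly flexible about the data formats it accepts. I therefore convert the data to NumPy arrays, because they are simple to work with.

Before doing that, I noticed that cv2 could not read video17_295 and video19_1900 correctly, so I filtered those images out and built a list of file names.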

dir = "Database1/"
# filter out images that cv2 cannot read
prob_list = ['video17_295', 'video19_1900']
txt_list = [f for f in os.listdir(dir) if f.endswith(".txt") and f[:-4] not in prob_list]
# strip the extension to get the base file names
file_list = set([re.match(r"\w+(?=\.)", f)[0] for f in txt_list])
# create data list as tuples of (jpeg, txt)
data_list = []
for f in file_list:
    data_list.append((f + ".JPEG", f + ".txt"))

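There are a few things to do next:

Check whether a label exists (some images do not contain a drone)
Read and process the image
Read and process the bounding box coordinates
Draw the bounding box for visualization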

X, y = [], []
img_box = []
DIMENSION = 128 # set low resolution to decrease training time
for i in range(len(data_list)):
    # get bounding box and check if label exists
    with open(dir + data_list[i][1], "r") as f:
        box = f.read().split()
    if len(box) != 5:
        continue # skip data if it does not contain a label
    # drop the class id; keep x_center, y_center, width, height (normalized)
    box = [float(s) for s in box[1:]]
    # read image
    img = cv2.imread(dir + data_list[i][0])
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # resize img to 128 x 128
    img = cv2.resize(img, (DIMENSION, DIMENSION), interpolation=cv2.INTER_LINEAR)
    # scale the normalized box to pixels of the resized image
    p1, p2 = int(box[0] * img.shape[1]), int(box[1] * img.shape[0])
    p3, p4 = int(box[2] * img.shape[1]), int(box[3] * img.shape[0])
    ymin, ymax, xmin, xmax = p2 - p4 // 2, p2 + p4 // 2, p1 - p3 // 2, p1 + p3 // 2
    # draw bounding box (for visualization purposes)
    draw = cv2.rectangle(img.copy(), (xmax, ymax), (xmin, ymin), color=(255, 255, 0), thickness=1)
    # store data if range of y is at least 20 pixels (remove data with small drones)
    if ymax - ymin >= 20:
        X.append(img)
        y.append([ymin, ymax, xmin, xmax])
        img_box.append(draw)
# convert to numpy arrays
X = np.array(X).astype(np.uint8)
y = np.array(y)
img_box = np.array(img_box)
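
The box arithmetic in the loop above can be isolated into a small function, which makes it easier to check by hand. This is a sketch under the assumption, inferred from the code, that each label line holds a class id followed by x_center, y_center, width, and height, all normalized to [0, 1]. The name yolo_to_corners and the clamping to the image bounds are my additions.

```python
def yolo_to_corners(box, width, height):
    """Convert normalized (x_center, y_center, w, h) to pixel (ymin, ymax, xmin, xmax)."""
    xc, yc, w, h = box
    px, py = int(xc * width), int(yc * height)   # box centre in pixels
    pw, ph = int(w * width), int(h * height)     # box size in pixels
    ymin, ymax = py - ph // 2, py + ph // 2
    xmin, xmax = px - pw // 2, px + pw // 2
    # clamp to the image so the result is always a valid slice (not in the original loop)
    return max(ymin, 0), min(ymax, height), max(xmin, 0), min(xmax, width)

# a centred box covering a quarter of the width and half of the height
print(yolo_to_corners((0.5, 0.5, 0.25, 0.5), 128, 128))  # (32, 96, 48, 80)
```

Without the clamp, a box that touches the image border can yield a negative index, which NumPy slicing would silently interpret as counting from the far edge.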

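Before converting to NumPy arrays, I took a subset of the dataset to reduce training time. If you have the computing power, you can skip that step. Here are some sample images.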

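MRCNN - Processing

Now, to work with mrcnn itself, we need to define an mrcnn Dataset class before training. The Dataset class provides information about each image, such as the class it belongs to and where the objects are located in it. The mrcnn.utils module we imported earlier contains this Dataset class.

This is the somewhat tricky part, and it requires some reading of the source code.

These are the functions that need to be modified:

add_class, which determines the number of classes for the model
add_image, where you define the image_id and the path to the image (if applicable)
load_image, where the image data is loaded
load_mask, which gets information about the mask/bounding box of the image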

# define drones dataset using mrcnn utils class
class DronesDataset(utils.Dataset):
    def __init__(self, X, y): # init with numpy X, y
        self.X = X
        self.y = y
        super().__init__()

    def load_dataset(self):
        self.add_class("dataset", 1, "drones") # only 1 class, drones
        for i in range(len(self.X)):
            self.add_image("dataset", i, path=None)

    def load_image(self, image_id):
        image = self.X[image_id] # where image_id is index of X
        return image

    def load_mask(self, image_id):
        # get details of image
        info = self.image_info[image_id]
        # each image holds a single labelled drone, so one mask channel is enough
        height, width = self.X[image_id].shape[:2]
        masks = np.zeros([height, width, 1], dtype='uint8')
        box = self.y[info["id"]]
        row_s, row_e = box[0], box[1]
        col_s, col_e = box[2], box[3]
        masks[row_s:row_e, col_s:col_e, 0] = 1 # create mask with the same boundaries as the bounding box
        class_ids = [1]
        return masks.astype(bool), np.array(class_ids, dtype=np.int32)
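
Since the dataset only has bounding boxes, load_mask has to turn a box into a rectangular mask. The standalone sketch below shows that step without mrcnn, one channel per object. The function name boxes_to_masks and the example box are illustrative assumptions.

```python
import numpy as np

def boxes_to_masks(boxes, height, width):
    """Build a [height, width, N] boolean stack with one rectangular mask per box."""
    masks = np.zeros((height, width, len(boxes)), dtype=bool)
    for i, (ymin, ymax, xmin, xmax) in enumerate(boxes):
        masks[ymin:ymax, xmin:xmax, i] = True
    return masks

# one box in the (ymin, ymax, xmin, xmax) order used for y above
masks = boxes_to_masks([(32, 96, 48, 80)], 128, 128)
print(masks.shape, int(masks.sum()))  # (128, 128, 1) 2048
```

The last axis indexes object instances, so an image with a single drone gets a single channel rather than one channel per image in the dataset.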

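Because we took the trouble to format the images as NumPy arrays, we can simply initialize the Dataset class with the arrays and load images and bounding boxes by indexing into them.

Next, do a train/test split in the conventional way: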

# train test split 80:20
np.random.seed(42) # for reproducibility
p = np.random.permutation(len(X))
X = X[p].copy()
y = y[p].copy()
split = int(0.8 * len(X))
X_train = X[:split]
y_train = y[:split]
X_val = X[split:]
y_val = y[split:]

Now load the data into the Dataset class.

# load dataset into mrcnn dataset class
train_dataset = DronesDataset(X_train, y_train)
train_dataset.load_dataset()
train_dataset.prepare()

val_dataset = DronesDataset(X_val, y_val)
val_dataset.load_dataset()
val_dataset.prepare()

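The prepare() function uses the image_ids and class_ids information to prepare the data for the mrcnn model.

Next comes a modification of the Config class we imported from mrcnn. The Config class determines the variables used in training and should be adjusted to the dataset. The variables below are not exhaustive; refer to the documentation for the complete list.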

class DronesConfig(Config):
    # Give the configuration a recognizable name
    NAME = "drones"
    # Train on 1 GPU and 2 images per GPU.
    GPU_COUNT = 1
    IMAGES_PER_GPU = 2
    # Number of classes (including background)
    NUM_CLASSES = 1 + 1  # background + drones
    # Use small images for faster training.
    IMAGE_MIN_DIM = 128
    IMAGE_MAX_DIM = 128
    # Reduce training ROIs per image because the images are small and have few objects.
    TRAIN_ROIS_PER_IMAGE = 20
    # Use smaller anchors because our image and objects are small
    RPN_ANCHOR_SCALES = (8, 16, 32, 64, 128)  # anchor side in pixels
    # set appropriate step per epoch and validation step
    STEPS_PER_EPOCH = len(X_train) // (GPU_COUNT * IMAGES_PER_GPU)
    VALIDATION_STEPS = len(X_val) // (GPU_COUNT * IMAGES_PER_GPU)
    # Skip detections with < 70% confidence
    DETECTION_MIN_CONFIDENCE = 0.7

config = DronesConfig()
config.display()

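These variables must be adjusted to match your computing power. Otherwise you may find training stuck at "Epoch 1" without any error message. There is even a GitHub issue raised for this problem, with many proposed solutions.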
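
As a worked example of the step arithmetic in the config, the sketch below computes the number of batches per epoch for a given dataset size. The dataset sizes (400 training and 100 validation images) are made-up numbers, and the floor of 1 is my own guard, not something the config above does.

```python
def steps_for(num_images, gpu_count=1, images_per_gpu=2):
    """Number of full batches per epoch, never less than 1."""
    batch_size = gpu_count * images_per_gpu
    return max(1, num_images // batch_size)

# hypothetical dataset sizes, with the batch size of 2 used in DronesConfig
print(steps_for(400), steps_for(100), steps_for(1))  # 200 50 1
```

Raising IMAGES_PER_GPU lowers the step count but increases memory use per step, so it is the first value I would reduce on a smaller machine.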

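MRCNN - Training

mrcnn has already been trained on the COCO and ImageNet datasets. To use these pretrained weights for transfer learning, we need to download them into our environment (remember to define ROOT_DIR first).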

# local path to the trained weights file
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
# download the COCO trained weights if they are not already present
if not os.path.exists(COCO_MODEL_PATH):
    utils.download_trained_weights(COCO_MODEL_PATH)

Create the model and start from the pretrained weights.

# Create model in training mode using gpu
with tf.device("/gpu:0"):
    model = modellib.MaskRCNN(mode="training", config=config,model_dir=MODEL_DIR)
# Which weights to start with?
init_with = "imagenet"  # imagenet, coco
if init_with == "imagenet":
    model.load_weights(model.get_imagenet_weights(), by_name=True)
elif init_with == "coco":
    # Load weights trained on MS COCO, but skip layers that
    # are different due to the different number of classes
    # See README for instructions to download the COCO weights
    model.load_weights(COCO_MODEL_PATH, by_name=True,exclude=["mrcnn_class_logits", "mrcnn_bbox_fc", "mrcnn_bbox", "mrcnn_mask"])

Finally, we can move on to the actual training.

model.train(train_dataset, val_dataset,learning_rate=config.LEARNING_RATE,epochs=5,layers='heads') # unfreeze head and just train on last layer

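For this exercise, only the head layers are trained to detect the drones in the dataset. If time allows, you should also fine-tune the model by training all of the preceding layers.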

model.train(train_dataset, val_dataset,
            learning_rate=config.LEARNING_RATE / 10,
            epochs=7, # mrcnn counts epochs cumulatively: 5 head epochs above + 2 fine-tuning epochs
            layers="all")
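
One detail worth checking when chaining the two training calls: as far as I can tell from the mrcnn source, the epochs argument of train() is the total epoch count reached so far, not the number of additional epochs. The sketch below only builds the argument values for such a two-stage schedule and does not call mrcnn. The helper build_schedule is hypothetical, and base_lr=0.001 is assumed to match the default LEARNING_RATE in Config.

```python
def build_schedule(stages, base_lr):
    """Turn (layers, extra_epochs, lr_factor) stages into cumulative train() arguments."""
    schedule, total = [], 0
    for layers, extra_epochs, lr_factor in stages:
        total += extra_epochs  # running total, since epochs is treated as cumulative
        schedule.append({
            "layers": layers,
            "epochs": total,
            "learning_rate": base_lr * lr_factor,
        })
    return schedule

# 5 epochs on the heads, then 2 more epochs on all layers at a tenth of the rate
plan = build_schedule([("heads", 5, 1.0), ("all", 2, 0.1)], base_lr=0.001)
for step in plan:
    print(step["layers"], step["epochs"])
```

Under that reading, two further fine-tuning epochs after five head epochs correspond to epochs=7 in the second call.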

That completes training of the mrcnn model. The model weights can be saved with these two lines of code.

# save weights
model_path = os.path.join(MODEL_DIR, "mask_rcnn_drones.h5")
model.keras_model.save_weights(model_path)

MRCNN - Inference

To run inference on other images, you need to create a new inference model with a custom Config.

# create inference class
class InferenceConfig(DronesConfig):
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

inference_config = InferenceConfig()
# recreate the model in inference mode
model = modellib.MaskRCNN(mode="inference", config=inference_config, model_dir=MODEL_DIR)
# load trained weights
model_path = os.path.join(MODEL_DIR, "mask_rcnn_drones.h5")
model.load_weights(model_path, by_name=True)

The visualize module from mrcnn comes in handy here.

def get_ax(rows=1, cols=1, size=8):
    _, ax = plt.subplots(rows, cols, figsize=(size*cols, size*rows))
    return ax

# Test on a random image
image_id = random.choice(val_dataset.image_ids)
original_image, image_meta, gt_class_id, gt_bbox, gt_mask = \
    modellib.load_image_gt(val_dataset, inference_config, image_id, use_mini_mask=False)
results = model.detect([original_image], verbose=1)
r = results[0]
visualize.display_instances(original_image, r['rois'], r['masks'], r['class_ids'],
                            val_dataset.class_names, r['scores'], ax=get_ax())

The mrcnn model has now been trained on a custom dataset, as the image above shows.
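
To go beyond a visual check, a predicted box can be compared with the ground-truth box using intersection over union (IoU). This is a minimal sketch in plain Python. The function name box_iou and the two example boxes are illustrative, and the (y1, x1, y2, x2) order is the one mrcnn uses for r['rois'] and gt_bbox.

```python
def box_iou(a, b):
    """IoU of two boxes given as (y1, x1, y2, x2)."""
    inter_h = max(0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap along y
    inter_w = max(0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap along x
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# two 10 x 10 boxes that overlap in a 5 x 5 corner
print(box_iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

An IoU of 1 means the boxes coincide and 0 means they do not overlap at all; 0.5 is a commonly used threshold for counting a detection as correct.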
