Technical exchange community of Chuanzhi Education, an A-share listed company (stock code 003032), Beijing Changping campus

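Introduction

TensorFlow provides an API for detecting the objects contained in an image or video. For details, see the following link
github.com/tensorflow/…
Object detection differs from image classification:
  • Image classification assigns the image to a single category, i.e. it picks one out of several possible classes. Even if the model outputs the few most likely classes by probability, in principle there is only one correct answer
  • Object detection finds every object that appears in the image and marks each one with a rectangle (bounding box). The objects can belong to many categories, such as people, cars, animals, and road signs, so there can be multiple correct answers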
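To make the difference concrete, here is a minimal sketch of what the two kinds of output look like. The label set, probabilities, boxes, and scores are all made-up values for illustration, not output from a real model.

```python
import numpy as np

# Hypothetical label set, for illustration only
class_names = ['person', 'car', 'dog']

# Classification: one probability vector per image, and a single answer
class_probs = np.array([0.1, 0.2, 0.7])
predicted = class_names[int(np.argmax(class_probs))]

# Detection: a variable number of (box, class, score) entries per image.
# Boxes here are normalized (ymin, xmin, ymax, xmax); values are made up.
detections = [
    {'box': (0.10, 0.05, 0.80, 0.45), 'class': 'dog', 'score': 0.94},
    {'box': (0.15, 0.50, 0.85, 0.95), 'class': 'dog', 'score': 0.91},
    {'box': (0.00, 0.00, 0.20, 0.10), 'class': 'car', 'score': 0.12},
]
# Keep only the confident detections; several answers can remain
found = [d['class'] for d in detections if d['score'] >= 0.5]

print(predicted)
print(found)
```

The classifier yields exactly one label, while the detector can report the same class several times, once per object found.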
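The examples below show how to use the TensorFlow Object Detection API
They use the pretrained ssd_mobilenet_v1_coco model (SSD, Single Shot MultiBox Detector); more available object detection models are listed here
github.com/tensorflow/…

An example

Load the libraries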
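# -*- coding: utf-8 -*-
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.GraphDef)
import matplotlib.pyplot as plt
from PIL import Image
# utils is the package in the object_detection directory of tensorflow/models
from utils import label_map_util
from utils import visualization_utils as vis_util

Define some constants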
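# Frozen inference graph of the pretrained model
PATH_TO_CKPT = 'ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb'
# Label map from class ids to class names
PATH_TO_LABELS = 'ssd_mobilenet_v1_coco_2017_11_17/mscoco_label_map.pbtxt'
NUM_CLASSES = 90

Load the pretrained model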
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    # Read the serialized graph and import it into detection_graph
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        od_graph_def.ParseFromString(fid.read())
        tf.import_graph_def(od_graph_def, name='')

Load the class label data
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

A helper function that converts an image into an array, plus the test image paths
def load_image_into_numpy_array(image):
    # Assumes a 3-channel RGB image
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)

TEST_IMAGE_PATHS = ['test_images/image1.jpg', 'test_images/image2.jpg']

Run object detection with the model
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        # Input and output tensors of the frozen graph
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        for image_path in TEST_IMAGE_PATHS:
            image = Image.open(image_path)
            image_np = load_image_into_numpy_array(image)
            # Add a batch dimension: (height, width, 3) -> (1, height, width, 3)
            image_np_expanded = np.expand_dims(image_np, axis=0)
            (boxes, scores, classes, num) = sess.run(
                [detection_boxes, detection_scores, detection_classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})

            # Draw the boxes and labels onto image_np in place
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np, np.squeeze(boxes), np.squeeze(classes).astype(np.int32),
                np.squeeze(scores), category_index,
                use_normalized_coordinates=True, line_thickness=8)
            plt.figure(figsize=[12, 8])
            plt.imshow(image_np)
            plt.show()

The detection results are shown below. In the first image, two dogs are detected


In the second image, some people and kites are detected
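If you want the detections as numbers rather than only drawn on the image, the arrays returned by sess.run can be post-processed directly. The following is a minimal sketch, assuming the boxes are normalized [ymin, xmin, ymax, xmax] rows (which is what use_normalized_coordinates=True implies); the helper name filter_detections, the 0.5 threshold, and the sample values are illustrative assumptions, not part of the original code.

```python
import numpy as np

def filter_detections(boxes, scores, classes, image_size, min_score=0.5):
    """Keep detections with score >= min_score and convert boxes to pixels.

    boxes: (N, 4) array of normalized [ymin, xmin, ymax, xmax] rows
    image_size: (width, height) in pixels
    """
    width, height = image_size
    keep = scores >= min_score
    # Scale y coordinates by the height and x coordinates by the width
    scale = np.array([height, width, height, width])
    pixel_boxes = np.round(boxes[keep] * scale).astype(np.int32)
    return pixel_boxes, scores[keep], classes[keep].astype(np.int32)

# Made-up values standing in for np.squeeze(boxes), np.squeeze(scores), np.squeeze(classes)
boxes = np.array([[0.1, 0.2, 0.5, 0.6],
                  [0.0, 0.0, 1.0, 1.0],
                  [0.3, 0.3, 0.4, 0.4]])
scores = np.array([0.9, 0.6, 0.2])
classes = np.array([18.0, 1.0, 38.0])

pixel_boxes, kept_scores, kept_classes = filter_detections(
    boxes, scores, classes, image_size=(640, 480))
print(pixel_boxes)
print(kept_classes)
```

The class ids that remain can then be looked up in category_index to get display names.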


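Camera detection

Install OpenCV, which provides computer vision functionality; the version used here is 3.3.0.10
pip install opencv-python opencv-contrib-python -i https://pypi.tuna.tsinghua.edu.cn/simple

Check that the installation succeeded; if no error is raised, it worked
import cv2
tracker = cv2.TrackerMedianFlow_create()

Modify the code above as follows: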
  • Load cv2 and open the camera
  • Keep reading images from the camera
  • Display each result after detection
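The "keep reading images" step can be sketched as a small generator that stops as soon as a read fails. cv2.VideoCapture.read() returns a (success flag, frame) pair, so the loop only relies on that contract; FakeCapture is a hypothetical stand-in used here so the sketch runs without a camera, and iter_frames is an illustrative helper name, not something from the original code.

```python
def iter_frames(cap):
    """Yield frames from a capture object until a read fails."""
    while True:
        ret, frame = cap.read()
        if not ret:
            # No frame available (camera unplugged or stream ended)
            break
        yield frame

class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture, for illustration only."""
    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        # Mimic the (success flag, frame) pair returned by VideoCapture.read()
        if self._frames:
            return True, self._frames.pop(0)
        return False, None

frames = list(iter_frames(FakeCapture(['frame0', 'frame1'])))
print(frames)
```

With a real camera, passing cv2.VideoCapture(0) instead of FakeCapture should work the same way, since only read() is used.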
The complete code is as follows
# -*- coding: utf-8 -*-
import numpy as np
import tensorflow as tf
from utils import label_map_util
from utils import visualization_utils as vis_util
import cv2

# Open the default camera
cap = cv2.VideoCapture(0)

PATH_TO_CKPT = 'ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb'
PATH_TO_LABELS = 'ssd_mobilenet_v1_coco_2017_11_17/mscoco_label_map.pbtxt'
NUM_CLASSES = 90

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        od_graph_def.ParseFromString(fid.read())
        tf.import_graph_def(od_graph_def, name='')

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        while True:
            ret, image_np = cap.read()
            # Stop if no frame could be read
            if not ret:
                break
            # Convert OpenCV's BGR frame to RGB
            image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
            image_np_expanded = np.expand_dims(image_np, axis=0)
            (boxes, scores, classes, num) = sess.run(
                [detection_boxes, detection_scores, detection_classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})

            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np, np.squeeze(boxes), np.squeeze(classes).astype(np.int32),
                np.squeeze(scores), category_index,
                use_normalized_coordinates=True, line_thickness=8)

            # Convert back to BGR for display
            cv2.imshow('object detection', cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))
            # Press q to quit
            if cv2.waitKey(25) & 0xFF == ord('q'):
                break

cap.release()
cv2.destroyAllWindows()

Video detection

Use cv2 to read the video and grab each frame, then write each frame to a new video file after detection
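The generated video file contains images only and has no sound. Handling the audio and merging video with audio are left for later exploration
The complete code is as follows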
# -*- coding: utf-8 -*-
import numpy as np
import tensorflow as tf
from utils import label_map_util
from utils import visualization_utils as vis_util
import cv2

cap = cv2.VideoCapture('绝地逃亡.mov')
# Read one frame to get the frame size (this frame is not written to the output)
ret, image_np = cap.read()
# A codec value of -1 leaves codec selection to OpenCV; behavior is platform-dependent
out = cv2.VideoWriter('output.mov', -1, cap.get(cv2.CAP_PROP_FPS), (image_np.shape[1], image_np.shape[0]))

PATH_TO_CKPT = 'ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb'
PATH_TO_LABELS = 'ssd_mobilenet_v1_coco_2017_11_17/mscoco_label_map.pbtxt'
NUM_CLASSES = 90

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        od_graph_def.ParseFromString(fid.read())
        tf.import_graph_def(od_graph_def, name='')

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        while cap.isOpened():
            ret, image_np = cap.read()
            # Stop at the end of the video
            if not ret:
                break
            image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
            image_np_expanded = np.expand_dims(image_np, axis=0)

            (boxes, scores, classes, num) = sess.run(
                [detection_boxes, detection_scores, detection_classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})

            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np, np.squeeze(boxes), np.squeeze(classes).astype(np.int32),
                np.squeeze(scores), category_index,
                use_normalized_coordinates=True, line_thickness=8)
            # Write the annotated frame, converted back to BGR
            out.write(cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))

cap.release()
out.release()
cv2.destroyAllWindows()

Play the processed video; detection results show up in many places


References

Link: https://juejin.im/post/5ba2fa4e6fb9a05cfa2fbcc4



2 replies

Nice