[Study Exchange] [Shanghai Campus] Deep and Interesting | 12 Let's Get Hands-On

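Introduction
Implement a real-time hand detector with TensorFlow.
The approach is similar to using Inception-v3 with transfer learning for a custom image classification task.
Building on the previous lesson, we add annotated hand data and fine-tune a pretrained model through transfer learning.

Data
The hand detection data comes from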
vision.soic.indiana.edu/projects/eg…
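The images were captured with Google Glass. egohands_data.zip is an archive containing 48 folders, one for each of 48 scenes (indoor, outdoor, playing chess, and so on), with 4,800 annotated images in total. Each annotation consists of the full set of hand contour points.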

There is no need to unpack the archive by hand; the extraction and clean-up are done in code.
egohands_dataset_clean.py performs the following steps in order:

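If egohands_data.zip is not in the current directory, download it by calling download_egohands_dataset()
Otherwise, extract egohands_data.zip to obtain the egohands folder and run rename_files() on the images inside it
rename_files() renames every image by adding the name of its parent folder, which avoids duplicate file names, and then calls generate_csv_files()
generate_csv_files() reads the images of each scene and calls get_bbox_visualize(), which draws and displays the hand contours and bounding boxes from the annotation file polygons.mat and converts the image annotations to csv files; once everything has been processed, it calls split_data_test_eval_train()
split_data_test_eval_train() splits the data into a training set and a test set, creating train and test folders inside the images folder to hold the corresponding images and csv annotations
Once all of this is done, the egohands folder produced by the extraction can be deleted manually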
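The conversion from a hand contour to a box, as attributed to generate_csv_files() above, comes down to taking the extremes of the contour points. The following is a minimal sketch of that idea, assuming each contour is a list of (x, y) points. The function name polygon_to_bbox and the sample coordinates are hypothetical and are not taken from egohands_dataset_clean.py.

```python
def polygon_to_bbox(points, width, height):
    """Return (xmin, ymin, xmax, ymax) enclosing the contour, clipped to the image."""
    if not points:
        return None  # an empty contour means no hand is annotated
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    xmin = max(0, int(min(xs)))
    ymin = max(0, int(min(ys)))
    xmax = min(width - 1, int(max(xs)))
    ymax = min(height - 1, int(max(ys)))
    return xmin, ymin, xmax, ymax


# Made-up contour points, for illustration only
contour = [(412.3, 300.8), (450.0, 280.1), (498.7, 340.2), (430.5, 390.9)]
print(polygon_to_bbox(contour, width=1280, height=720))  # (412, 280, 498, 390)
```

Each resulting tuple would become one row of the csv annotation, next to the file name and class.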

In short, the script turns egohands_data.zip into the images folder, which took about 6 minutes on my laptop.
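The renaming and splitting steps can be pictured with plain path manipulation. The sketch below is an illustration under assumptions: the helper names, the sample folder names, the 10% test fraction and the fixed seed are all hypothetical, and the actual script may make different choices.

```python
import random
from pathlib import PurePosixPath


def prefixed_name(path):
    """Prefix a file name with its parent folder so names stay unique."""
    p = PurePosixPath(path)
    return f"{p.parent.name}_{p.name}"


def split_train_test(names, test_fraction=0.1, seed=0):
    """Shuffle deterministically, then cut off the first part as the test set."""
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical scene folders, each containing identically named frames
paths = [f"egohands/SCENE_{s}/frame_{i:04d}.jpg" for s in ("A", "B") for i in range(10)]
names = [prefixed_name(p) for p in paths]
train, test = split_train_test(names)
print(len(train), len(test))  # 18 2
```

Without the folder prefix, the two scenes would produce colliding names such as frame_0000.jpg once the images are gathered into one folder.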
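Next, run generate_tfrecord.py to convert the training and test sets into TFRecord files.
Since only hands need to be detected here, there is a single object class, hand. To adapt the script to another detection task, modify the following code: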
    def class_text_to_int(row_label):
        if row_label == 'hand':
            return 1
        else:
            return None

Run the following two commands to generate the TFRecord files for the training and test sets:

    python generate_tfrecord.py --csv_input=images/train/train_labels.csv --output_path=retrain/train.record
    python generate_tfrecord.py --csv_input=images/test/test_labels.csv --output_path=retrain/test.record

Model
We again use ssd_mobilenet_v1_coco from the previous lesson, but because only hands need to be detected, the model has to be fine-tuned on the custom annotations through transfer learning.
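The retrain folder contains the following:

train.record and test.record: the annotated data for the custom detection task
ssd_mobilenet_v1_coco_11_06_2017: the pretrained ssd_mobilenet_v1_coco model
ssd_mobilenet_v1_coco.config: the configuration file for training the model with transfer learning
hand_label_map.pbtxt: the mapping between detection class names and ids
retrain.py: the training code for transfer learning
object_detection: some helper files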

A template for the configuration file ssd_mobilenet_v1_coco.config is available here
github.com/tensorflow/…
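Edit the configuration file as needed, mainly the entries that contain PATH_TO_BE_CONFIGURED:

num_classes: the number of object classes, 1 in this case
fine_tune_checkpoint: the checkpoint file of the pretrained model
train_input_reader: the training data input_path and the label map path label_map_path
eval_input_reader: the test data input_path and the label map path label_map_path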
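As an illustration of those four edits, the sketch below patches a heavily abbreviated stand-in for the template using plain string operations. The excerpt is not the real template, and the helper name and the relative paths are assumptions based on the retrain folder layout listed earlier. Editing the file by hand works just as well.

```python
import re

# Abbreviated stand-in for the template, for illustration only
TEMPLATE = """
model { ssd { num_classes: 90 } }
train_config { fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt" }
train_input_reader {
  tf_record_input_reader { input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record" }
  label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
}
eval_input_reader {
  tf_record_input_reader { input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record" }
  label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
}
"""


def customize(config_text, num_classes, checkpoint, train_record, test_record, label_map):
    """Set the class count and replace every placeholder path."""
    text = re.sub(r"num_classes:\s*\d+", f"num_classes: {num_classes}", config_text)
    replacements = {
        "PATH_TO_BE_CONFIGURED/model.ckpt": checkpoint,
        "PATH_TO_BE_CONFIGURED/mscoco_train.record": train_record,
        "PATH_TO_BE_CONFIGURED/mscoco_val.record": test_record,
        "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt": label_map,
    }
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text


# Paths are assumed to be relative to the retrain folder
result = customize(
    TEMPLATE,
    num_classes=1,
    checkpoint="ssd_mobilenet_v1_coco_11_06_2017/model.ckpt",
    train_record="train.record",
    test_record="test.record",
    label_map="hand_label_map.pbtxt",
)
print(result)
```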

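The label map hand_label_map.pbtxt is shown below; it contains a single class:

    item {
      id: 1
      name: 'hand'
    }

Use the following command to start transfer training; train_dir is the model output path and pipeline_config_path is the path to the configuration file:

    python retrain.py --logtostderr --train_dir=output/ --pipeline_config_path=ssd_mobilenet_v1_coco.config

When training finishes, the output folder contains the generated model files (.data, .index, .meta, and so on).
Use TensorBoard to inspect the training process, including the model's total loss:

    tensorboard --logdir='output'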

Finally, use export_inference_graph.py to package the model into a .pb file:

--pipeline_config_path: path to the configuration file
--trained_checkpoint_prefix: path prefix of the model checkpoint
--output_directory: output directory for the .pb file

    python export_inference_graph.py --input_type image_tensor --pipeline_config_path retrain/ssd_mobilenet_v1_coco.config --trained_checkpoint_prefix retrain/output/model.ckpt-153192 --output_directory hand_detection_inference_graph

Running it creates the folder hand_detection_inference_graph, which contains a frozen_inference_graph.pb file.
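The step number in model.ckpt-153192 depends on how long training ran, so it will differ from run to run. One way to find the most recent prefix is to scan the checkpoint file names. The sketch below does this with a hypothetical helper and made-up file names, assuming the files follow the model.ckpt-<step>.index naming pattern.

```python
import re


def latest_checkpoint_prefix(filenames):
    """Return the 'model.ckpt-<step>' prefix with the highest step, or None."""
    steps = []
    for name in filenames:
        match = re.fullmatch(r"model\.ckpt-(\d+)\.index", name)
        if match:
            steps.append(int(match.group(1)))
    return f"model.ckpt-{max(steps)}" if steps else None


# Made-up directory listing, for illustration only
files = [
    "checkpoint",
    "model.ckpt-150000.index",
    "model.ckpt-153192.index",
    "model.ckpt-153192.meta",
    "model.ckpt-153192.data-00000-of-00001",
]
print(latest_checkpoint_prefix(files))  # model.ckpt-153192
```

In practice the list would come from something like os.listdir('retrain/output'). TensorFlow's tf.train.latest_checkpoint serves a similar purpose.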
Application
With the trained hand detection model, we can now build a real-time hand detector.
Only the following three lines need to change:

    PATH_TO_CKPT = 'hand_detection_inference_graph/frozen_inference_graph.pb'
    PATH_TO_LABELS = 'retrain/hand_label_map.pbtxt'
    NUM_CLASSES = 1

The complete code is as follows:

    # -*- coding: utf-8 -*-

    import numpy as np
    import tensorflow as tf
    import cv2

    from utils import label_map_util
    from utils import visualization_utils as vis_util

    # Open the default camera
    cap = cv2.VideoCapture(0)

    PATH_TO_CKPT = 'hand_detection_inference_graph/frozen_inference_graph.pb'
    PATH_TO_LABELS = 'retrain/hand_label_map.pbtxt'
    NUM_CLASSES = 1

    # Load the frozen inference graph
    detection_graph = tf.Graph()
    with detection_graph.as_default():
        od_graph_def = tf.GraphDef()
        with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
            od_graph_def.ParseFromString(fid.read())
            tf.import_graph_def(od_graph_def, name='')

    # Build the class id to name index used when drawing labels
    label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
    categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)

    with detection_graph.as_default():
        with tf.Session(graph=detection_graph) as sess:
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
            detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')
            while True:
                ret, image_np = cap.read()
                if not ret:
                    break  # no frame could be read from the camera
                # OpenCV delivers BGR frames; the model expects RGB
                image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
                # Add a batch dimension: [1, height, width, 3]
                image_np_expanded = np.expand_dims(image_np, axis=0)
                (boxes, scores, classes, num) = sess.run(
                    [detection_boxes, detection_scores, detection_classes, num_detections],
                    feed_dict={image_tensor: image_np_expanded})

                vis_util.visualize_boxes_and_labels_on_image_array(
                    image_np, np.squeeze(boxes), np.squeeze(classes).astype(np.int32), np.squeeze(scores),
                    category_index, use_normalized_coordinates=True, line_thickness=8)

                cv2.imshow('hand detection', cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))
                # Press q to quit
                if cv2.waitKey(25) & 0xFF == ord('q'):
                    break

    cap.release()
    cv2.destroyAllWindows()

After running the code, the hand detection results appear on the camera feed.
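The visualization helper draws the boxes directly, but the raw outputs can also be used on their own. The detection boxes are normalized [ymin, xmin, ymax, xmax] values, which is why use_normalized_coordinates=True is passed above. The sketch below turns them into pixel rectangles for detections above a score threshold. The function name, the 0.5 threshold and the sample numbers are illustrative assumptions.

```python
def boxes_to_pixels(boxes, scores, width, height, min_score=0.5):
    """Keep confident detections and scale normalized boxes to pixel coordinates."""
    results = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue  # skip low-confidence detections
        results.append((int(xmin * width), int(ymin * height),
                        int(xmax * width), int(ymax * height)))
    return results


# Made-up detections: one confident, one below the threshold
boxes = [[0.25, 0.5, 0.75, 0.75], [0.0, 0.0, 0.5, 0.5]]
scores = [0.9, 0.2]
print(boxes_to_pixels(boxes, scores, width=640, height=480))  # [(320, 120, 480, 360)]
```

With the arrays returned by sess.run, np.squeeze(boxes) and np.squeeze(scores) would be passed in, together with the frame's width and height.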

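Custom Detection Tasks
To build your own detection task, prepare some images and annotate them by hand; a few hundred annotated images are roughly enough.
Use labelImg to annotate the images. For installation instructions, see the following link
github.com/tzutalin/la…
Go into the labelImg folder and run the following command; the two arguments are the image directory and the path to the class file:

    python labelImg.py ../imgs/ ../predefined_classes.txt

In the annotation interface, press w to start drawing a rectangle and Ctrl+S to save the annotation to the xml folder.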

Then run xml_to_csv.py to convert the .xml files into a .csv file.
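labelImg writes one Pascal VOC style XML file per image. A sketch of the kind of conversion xml_to_csv.py performs, using only the standard library, might look like the following. The column order and the sample annotation are assumptions rather than a copy of the script.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up annotation in the Pascal VOC layout, for illustration only
SAMPLE_XML = """<annotation>
  <filename>hand_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>hand</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>300</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>"""

# Assumed column order
COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def xml_to_rows(xml_text):
    """Yield one row per annotated object in a single XML annotation."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = [int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        yield [filename, width, height, obj.findtext("name")] + coords


buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(COLUMNS)
writer.writerows(xml_to_rows(SAMPLE_XML))
print(buffer.getvalue())
```

For a whole folder, the same function would be applied to every .xml file and all rows written into one .csv.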
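In summary, follow these steps to prepare the TFRecord data:

Create train and test folders and distribute the images between them
Annotate the training and test images by hand
Convert the multiple .xml files of the training set and of the test set into one .csv each
Generate the corresponding TFRecord from the original images and the .csv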
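With more than one class, the label map and class_text_to_int() have to agree on the ids. The sketch below derives both from a single list of class names, assuming ids start at 1 as in hand_label_map.pbtxt. The class name cup and the helper label_map_text are hypothetical.

```python
# Hypothetical class list, e.g. the lines of predefined_classes.txt
CLASS_NAMES = ["hand", "cup"]

# Ids start at 1, matching hand_label_map.pbtxt
CLASS_TO_ID = {name: index for index, name in enumerate(CLASS_NAMES, start=1)}


def class_text_to_int(row_label):
    """Map a class name to its id; unknown labels give None."""
    return CLASS_TO_ID.get(row_label)


def label_map_text(class_names):
    """Render the label map in the same format as hand_label_map.pbtxt."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append(f"item {{\n  id: {index}\n  name: '{name}'\n}}")
    return "\n".join(items) + "\n"


print(label_map_text(CLASS_NAMES))
```

The num_classes entry in the configuration file would then be len(CLASS_NAMES).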

References

How to Build a Real-time Hand-Detector using Neural Networks (SSD) on Tensorflow: towardsdatascience.com/how-to-buil…
EgoHands - A Dataset for Hands in Complex Egocentric Interactions: vision.soic.indiana.edu/projects/eg…
How to train your own Object Detector with TensorFlow’s Object Detector API: towardsdatascience.com/how-to-trai…

Link: https://juejin.im/post/5ba2fada6fb9a05d12280322

