The code below is adapted from existing code, and I have put the adapted version online. My main improvements are:
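- Added support for Windows;
- Replaced defaultdict with dict.get(), which resolves an encoding problem on Windows.
- Skipped the extraction step (whether direct or indirect) and operate directly on the image data (images) and the annotation data (annotations).
- Because nothing has to be extracted, the API is more convenient and more efficient to use.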
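To make the second improvement concrete, here is a minimal sketch of grouping annotations by image with a plain dict and dict.get() instead of defaultdict. The toy annotations and the helper name group_by_image are made up for illustration; this is not the actual cocoz code.

```python
# Toy annotations; the field names follow the COCO format, the values are invented.
annotations = [
    {'id': 1, 'image_id': 10, 'category_id': 3},
    {'id': 2, 'image_id': 10, 'category_id': 5},
    {'id': 3, 'image_id': 11, 'category_id': 3},
]


def group_by_image(anns):
    """Group annotations by image_id using a plain dict and dict.get()."""
    img_to_anns = {}
    for ann in anns:
        key = ann['image_id']
        # dict.get() returns the list stored so far, or an empty list for a new key.
        img_to_anns[key] = img_to_anns.get(key, []) + [ann]
    return img_to_anns


img_to_anns = group_by_image(annotations)
print({k: [a['id'] for a in v] for k, v in img_to_anns.items()})
```

Unlike a defaultdict, looking up a missing key with get() does not insert it, so the index only ever holds images that really have annotations.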
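Detailed usage of the API is described below:
0 Preparation
To use cocoz, first download the adapted cocoapi. Then either place it in the root directory of the project or program you want to run, or add it to Python's module search path (for the current session only) with the following commands: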
import sys
sys.path.append(r'D:\API\cocoapi\PythonAPI')  # path where you downloaded cocoapi (raw string, so the backslashes stay literal)
from pycocotools.cocoz import AnnZ, ImageZ, COCOZ  # load cocoz
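If the setup cell may be run more than once, a small guard avoids adding the same entry repeatedly. The helper add_search_path and the placeholder path below are hypothetical; this is only a sketch of the idea, and the change still lasts only for the current interpreter session.

```python
import sys
from pathlib import Path


def add_search_path(directory):
    """Append directory to sys.path once; return True if it was added."""
    path = str(Path(directory))
    if path in sys.path:
        return False
    sys.path.append(path)
    return True


# Placeholder location; replace it with wherever you put cocoapi.
added = add_search_path('path/to/cocoapi/PythonAPI')
print(added)
```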
Now we can use the API's cocoz.AnnZ, cocoz.ImageZ, and cocoz.COCOZ classes to work with COCO images and annotations. I use Windows in the examples below; Linux works similarly.
1 cocoz.AnnZ and cocoz.ImageZ
root = r'E:\Data\coco'  # root directory of the COCO dataset
annType = 'annotations_trainval2017'  # type of COCO annotation data
annZ = AnnZ(root, annType)
Let's see which annotation files this annotation data contains:
annZ.names
['annotations/instances_train2017.json', 'annotations/instances_val2017.json', 'annotations/captions_train2017.json', 'annotations/captions_val2017.json', 'annotations/person_keypoints_train2017.json', 'annotations/person_keypoints_val2017.json']
Load the contents of 'annotations/instances_val2017.json' as a dict:
annFile = 'annotations/instances_val2017.json'
dataset = annZ.json2dict(annFile)
Loading json in memory ...
used time: 0.890035 s
dataset.keys()
dict_keys(['info', 'licenses', 'images', 'annotations', 'categories'])
dataset['images'][0]  # some of the annotation info recorded for one image
{'license': 4, 'file_name': '000000397133.jpg', 'coco_url': 'http://images.cocodataset.org/val2017/000000397133.jpg', 'height': 427, 'width': 640, 'date_captured': '2013-11-14 17:02:52', 'flickr_url': 'http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg', 'id': 397133}
1.1 Fetching an image from the web
%pylab inline
import skimage.io as sio
coco_url = dataset['images'][0]['coco_url']
# use url to load image
I = sio.imread(coco_url)
plt.axis('off')
plt.imshow(I)
plt.show()
Populating the interactive namespace from numpy and matplotlib
1.2 Reading an image from local storage
To avoid extracting the dataset, I use the zipfile module:
imgType = 'val2017'
imgZ = ImageZ(root, imgType)
I = imgZ.buffer2array(imgZ.names[0])
plt.axis('off')
plt.imshow(I)
plt.show()
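The idea of working on a zip archive without extracting it can be sketched with the standard library alone. The archive below is a tiny in-memory stand-in with made-up member names, and the snippet does not reproduce how AnnZ or ImageZ are implemented; decoding the image bytes into an array would additionally need an image library, which is left out here.

```python
import io
import json
import zipfile

# Build a tiny archive in memory so the example runs without the real dataset.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr('annotations/toy.json', json.dumps({'images': [{'id': 1}]}))
    zf.writestr('val/0001.jpg', b'\xff\xd8 fake jpeg bytes')

with zipfile.ZipFile(buffer) as zf:
    names = zf.namelist()  # member names, nothing is extracted to disk
    with zf.open('annotations/toy.json') as f:
        toy_dataset = json.load(f)  # parse JSON straight from the archive
    image_bytes = zf.read('val/0001.jpg')  # raw bytes of one member

print(names)
print(toy_dataset['images'][0]['id'], len(image_bytes))
```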
2 cocoz.COCOZ
root = r'E:\Data\coco'  # root directory of the COCO dataset
annType = 'annotations_trainval2017'  # type of COCO annotation data
annFile = 'annotations/instances_val2017.json'
annZ = AnnZ(root, annType)
coco = COCOZ(annZ, annFile)
Loading json in memory ...
used time: 1.02004 s
Loading json in memory ...
creating index...
index created!
used time: 0.431003 s
To preview the COCO dataset you have loaded, use print():
print(coco)
description: COCO 2017 Dataset
url: http://cocodataset.org
version: 1.0
year: 2017
contributor: COCO Consortium
date_created: 2017/09/01
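A preview like this can be produced by a __str__ method that formats the dataset's info entry as "key: value" lines. The class DatasetPreview below is hypothetical and only illustrates the pattern; how COCOZ actually builds its output is not shown in this post.

```python
class DatasetPreview:
    """Hypothetical wrapper that prints the 'info' entry of a COCO-style dict."""

    def __init__(self, dataset):
        self.dataset = dataset

    def __str__(self):
        # One "key: value" line per field of the info dict, in insertion order.
        info = self.dataset.get('info', {})
        return '\n'.join('{}: {}'.format(k, v) for k, v in info.items())


preview = DatasetPreview({'info': {'description': 'Toy Dataset', 'year': 2017}})
print(preview)
```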
coco.keys()
dict_keys(['dataset', 'anns', 'imgToAnns', 'catToImgs', 'imgs', 'cats'])
2.1 Displaying COCO categories and supercategories
cats = coco.loadCats(coco.getCatIds())
nms = set([cat['name'] for cat in cats])  # collect each category's name
print('COCO categories: \n{}\n'.format(' '.join(nms)))
# ============================================================
snms = set([cat['supercategory'] for cat in cats])  # collect each category's supercategory
print('COCO supercategories: \n{}'.format(' '.join(snms)))
COCO categories:
kite potted plant handbag clock umbrella sports ball bird frisbee toilet toaster spoon car snowboard banana fire hydrant skis chair tv skateboard wine glass tie cell phone cake zebra baseball glove stop sign airplane bed surfboard cup knife apple broccoli bicycle train carrot remote cat bear teddy bear person bench horse dog couch orange hair drier backpack giraffe sandwich book donut sink oven refrigerator boat mouse laptop toothbrush keyboard truck motorcycle bottle pizza traffic light cow microwave scissors bus baseball bat elephant fork bowl tennis racket suitcase vase sheep parking meter dining table hot dog

COCO supercategories:
accessory furniture sports vehicle appliance electronic animal indoor outdoor person kitchen food
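Because a set has no defined order, the names above can appear in a different order on each run. As a small sketch on toy data (the category list and the helper names_by_supercategory are illustrative, not part of cocoz), grouping names under their supercategory and sorting gives a stable listing:

```python
# Toy category records shaped like the entries returned by loadCats().
toy_cats = [
    {'id': 1, 'name': 'person', 'supercategory': 'person'},
    {'id': 18, 'name': 'dog', 'supercategory': 'animal'},
    {'id': 17, 'name': 'cat', 'supercategory': 'animal'},
    {'id': 36, 'name': 'snowboard', 'supercategory': 'sports'},
]


def names_by_supercategory(categories):
    """Map each supercategory to a sorted list of its category names."""
    groups = {}
    for cat in categories:
        groups.setdefault(cat['supercategory'], []).append(cat['name'])
    # Sort both the supercategories and the names for reproducible output.
    return {key: sorted(names) for key, names in sorted(groups.items())}


grouped = names_by_supercategory(toy_cats)
for supercategory, names in grouped.items():
    print('{}: {}'.format(supercategory, ', '.join(names)))
```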
2.2 Getting images by given conditions
Get all images that contain the given categories:
# get all images containing given categories, select one at random
catIds = coco.getCatIds(catNms=['cat', 'dog', 'snowboar'])  # get the category ids
imgIds = coco.getImgIds(catIds=catIds)
# img = coco.loadImgs(imgIds)
Pick one image at random and look at its info:
img = coco.loadImgs(imgIds[np.random.randint(0,len(imgIds))])[0]
img
{'license': 4, 'file_name': '000000318238.jpg', 'coco_url': 'http://images.cocodataset.org/val2017/000000318238.jpg', 'height': 640, 'width': 478, 'date_captured': '2013-11-21 00:01:06', 'flickr_url': 'http://farm8.staticflickr.com/7402/9964003514_84ce7550c9_z.jpg', 'id': 318238}
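When several category ids are passed, the official pycocotools getImgIds keeps only images that contain all of them, as far as I understand it; I am assuming cocoz behaves the same way. The sketch below shows that intersection on a toy catToImgs-style index with invented ids; images_with_all is a hypothetical helper, not a cocoz function.

```python
# Toy index: category id -> ids of the images that contain that category.
cat_to_imgs = {17: [100, 101, 102], 18: [101, 102, 103], 36: [102]}


def images_with_all(cat_ids, index):
    """Return the sorted ids of images that contain every category in cat_ids."""
    if not cat_ids:
        return []
    result = set(index.get(cat_ids[0], []))
    for cat_id in cat_ids[1:]:
        # Intersect: an image survives only if it also has this category.
        result &= set(index.get(cat_id, []))
    return sorted(result)


print(images_with_all([17, 18], cat_to_imgs))
print(images_with_all([17, 18, 36], cat_to_imgs))
```

Each extra category can only shrink the result, so it is worth checking that the list is non-empty before picking a random element from it.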
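2.2.1 Getting the image
From the web:
coco_url = img['coco_url']
I = sio.imread(coco_url)
plt.axis('off')
plt.imshow(I)
plt.show()
From local storage:
One gotcha here: cv2 stores images in BGR channel order by default rather than RGB, so passing such an I straight to plt shows the image with the wrong colors. To fix this, convert it first with cv2.cvtColor(I, cv2.COLOR_BGR2RGB).
imgType = 'val2017'
imgZ = ImageZ(root, imgType)
I = imgZ.buffer2array(img['file_name'])
plt.axis('off')
plt.imshow(I)
plt.show()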
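The BGR-to-RGB conversion simply reverses the channel order of every pixel. The sketch below shows that on a tiny nested-list image so it runs without any imaging library; bgr_to_rgb is an illustrative helper. With a NumPy array, the same swap is usually written as I[..., ::-1] or cv2.cvtColor(I, cv2.COLOR_BGR2RGB).

```python
def bgr_to_rgb(image):
    """Return a copy of a nested-list image with each pixel's channels reversed."""
    return [[pixel[::-1] for pixel in row] for row in image]


# A 1x2 image in BGR order: one pure blue pixel, one pure green pixel.
bgr = [[[255, 0, 0], [0, 255, 0]]]
rgb = bgr_to_rgb(bgr)
print(rgb)
```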
2.3 Drawing an image's anns on the image
# load and display instance annotations
plt.imshow(I)
plt.axis('off')
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco.loadAnns(annIds)
coco.showAnns(anns)
2.4 Keypoint detection
# initialize COCO api for person keypoints annotations
root = r'E:\Data\coco'  # root directory of the COCO dataset
annType = 'annotations_trainval2017'  # type of COCO annotation data
annFile = 'annotations/person_keypoints_val2017.json'
annZ = AnnZ(root, annType)
coco_kps = COCOZ(annZ, annFile)
Loading json in memory ...
used time: 0.882997 s
Loading json in memory ...
creating index...
index created!
used time: 0.368036 s
First pick an image that contains a person:
catIds = coco.getCatIds(catNms=['person'])  # get the category ids
imgIds = coco.getImgIds(catIds=catIds)
img = coco.loadImgs(imgIds)[77]
# use url to load image
I = sio.imread(img['coco_url'])
plt.axis('off')
plt.imshow(I)
plt.show()
# load and display keypoints annotations
plt.imshow(I)
plt.axis('off')
ax = plt.gca()
annIds = coco_kps.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco_kps.loadAnns(annIds)
coco_kps.showAnns(anns)
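In the COCO keypoint format, each annotation stores its keypoints as one flat list of (x, y, v) triples, where v is 0 for not labeled, 1 for labeled but not visible, and 2 for labeled and visible. A minimal sketch of unpacking such a list follows; the numbers are invented and split_keypoints is an illustrative helper.

```python
def split_keypoints(flat):
    """Split a flat [x1, y1, v1, x2, y2, v2, ...] list into (x, y, v) triples."""
    if len(flat) % 3 != 0:
        raise ValueError('keypoints length must be a multiple of 3')
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# Three made-up keypoints: visible, not labeled, labeled but hidden.
flat = [120, 80, 2, 0, 0, 0, 130, 95, 1]
points = split_keypoints(flat)
visible = [(x, y) for x, y, v in points if v == 2]
labeled = sum(1 for _, _, v in points if v > 0)
print(points)
print(visible, labeled)
```

The labeled count corresponds to the num_keypoints field that each person annotation carries.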
2.5 Image captioning
# initialize COCO api for caption annotations
root = r'E:\Data\coco'  # root directory of the COCO dataset
annType = 'annotations_trainval2017'  # type of COCO annotation data
annFile = 'annotations/captions_val2017.json'
annZ = AnnZ(root, annType)
coco_caps = COCOZ(annZ, annFile)
Loading json in memory ...
used time: 0.435748 s
Loading json in memory ...
creating index...
index created!
used time: 0.0139964 s
# load and display caption annotations
annIds = coco_caps.getAnnIds(imgIds=img['id'])
anns = coco_caps.loadAnns(annIds)
coco_caps.showAnns(anns)
plt.imshow(I)
plt.axis('off')
plt.show()
Output:
A brown horse standing next to a woman in front of a house.
a person standing next to a horse next to a building
A woman stands beside a large brown horse.
The woman stands next to the large brown horse.
A woman hold a brown horse while a woman watches.
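If you need the official API instead, refer to its own documentation.
If you found this helpful, please give the project a star on GitHub. The code for this tutorial is also on GitHub.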