Classification of Religious Painting Themes

What is depicted in this painting?

Prize: 100,000 yen | Participants: 256 | Ended more than 3 years ago


Oregin

[To the organizers] Adding object detection results from a pretrained YOLOv5 model as features


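[To the organizers]

This competition permits the use of pretrained models. Would it be acceptable to add features using a pretrained YOLOv5 object detection model, as described below?
I apologize for troubling you when you are busy, and would appreciate your confirmation.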

import numpy as np
import pandas as pd
from pathlib import Path

from glob import glob
from matplotlib import pyplot as plt

import os, random, gc
import re, time, json

from tqdm.notebook import tqdm

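Object Detection with YOLOv5

Introduction

This notebook shows how to run object detection with a pretrained YOLO model and add the number of detected objects, per class, as features for each image.
Feel free to use them as input features for your own models.
Note that this notebook was created and run on Google Colab.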

Setting up the YOLOv5 environment on Colab

Clone the yolov5 repository from the GitHub URL below and install the libraries listed in requirements.txt.

https://github.com/ultralytics/yolov5

!git clone https://github.com/ultralytics/yolov5

%cd yolov5/
!pip install -qr requirements.txt

!ls "/content/yolov5"

Extracting the image data from My Drive to /content

Load the image data in NumPy format from My Drive and save it as JPEG files under the /content directory.

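## Directory on Google Drive where the image data is stored
## * Set this to match your own environment.

INPUT_DIR = "* Set this to match your own environment."

## Directory on Google Drive where the result files will be saved
## * Set this to match your own environment.

OUTPUT_DIR = "* Set this to match your own environment."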

## Get the names of the files in the image data directory.
file_list = sorted(glob(os.path.join(INPUT_DIR, '*')))

for i, file in enumerate(file_list):
    
    print(f'{i} | {file}')

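## Load the files.
## The indices follow the sorted order printed above; adjust them if your file names sort differently.
import numpy as np
train_label = np.load(file_list[2])['arr_0']
train_image = np.load(file_list[1])['arr_0']
test_image = np.load(file_list[0])['arr_0']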

# Check the loaded results
train_label.shape, train_image.shape, test_image.shape
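
The `['arr_0']` key used above is the default name NumPy gives to the first array passed positionally to `np.savez` or `np.savez_compressed`, which is presumably how the competition files were written. The sketch below illustrates this with a temporary file and lists the keys through `.files`, a useful check when the layout of an archive is unknown. The file name `demo.npz` and the array contents are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Write a throwaway archive with one positional (unnamed) array
path = os.path.join(tempfile.mkdtemp(), "demo.npz")
np.savez_compressed(path, np.zeros((4, 224, 224, 3), dtype=np.uint8))

# Inspect the keys before indexing into the archive
with np.load(path) as archive:
    print(archive.files)       # ['arr_0']
    images = archive["arr_0"]  # the array is read into memory here
print(images.shape)            # (4, 224, 224, 3)
```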

# Concatenate the training and test data.
all_image = np.concatenate([train_image, test_image], 0)
all_image.shape

# Create the directory for the converted JPEG files (-p also creates the parent and does not fail on reruns)
!mkdir -p /content/datasets/all

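# Save the images as JPEG files
import numpy as np
import matplotlib
matplotlib.use('Agg')  # non-interactive backend: render to files without a display
import matplotlib.pyplot as plt

for i, img in enumerate(all_image):
  fig = plt.figure()
  ax = fig.add_subplot(1, 1, 1)
  ax.axis("off")
  ax.imshow(img, cmap='gray')  # cmap only takes effect for single-channel images
  ax.set_title(i)

  # Save as JPEG; the index in the file name is used later to match detections to images
  fig.savefig(f'/content/datasets/all/figure-{i:03d}.jpg')
  plt.close(fig)  # release the figure so memory does not grow with the number of images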
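
As an alternative sketch (not what the notebook above does), the arrays can be written straight to JPEG with Pillow. This avoids the margins and title that matplotlib draws into each saved figure and keeps the original pixel dimensions. It assumes Pillow is installed and that the images are arrays of shape (H, W, 3) or (H, W) with values in the 0-255 range; the helper name `save_arrays_as_jpeg` and the dummy data are hypothetical.

```python
import tempfile
from pathlib import Path

import numpy as np
from PIL import Image


def save_arrays_as_jpeg(images, out_dir, prefix="figure"):
    """Write each array in `images` to out_dir as <prefix>-<index>.jpg."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, img in enumerate(images):
        # Assumption: pixel values are already in the 0-255 range.
        arr = np.asarray(img).astype(np.uint8)
        path = out_dir / f"{prefix}-{i:03d}.jpg"
        Image.fromarray(arr).save(path, quality=95)
        paths.append(path)
    return paths


# Dummy data standing in for all_image
demo_images = np.random.randint(0, 256, size=(3, 224, 224, 3), dtype=np.uint8)
demo_dir = tempfile.mkdtemp()
saved = save_arrays_as_jpeg(demo_images, demo_dir)
print([p.name for p in saved])
```

The file naming matches the `figure-{i:03d}.jpg` pattern used above, so the later step that recovers the image index from the file name would work unchanged.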

Object detection with YOLO

Run object detection with detect.py.

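About the options

--source : Path to the data to run object detection on.
--weights : Weights file to use.
- Here we use the pretrained weights yolov5x.pt.
--img : Image size.
--conf : Confidence threshold for detections.
- A large value can cause objects to be missed, so a low value of 0.05 is used.
--save-txt : Writes the detection results to txt files. It takes no argument; adding --save-txt is enough.
- After detection, the directory layout looks like the following. The jpg files are the images with the detection results drawn on them, and the txt files hold the labels and coordinates. A txt file is written only when at least one object is detected in the image.
```
/content/yolov5/runs/detect/
└── exp
    ├── filename001.jpg
    ├── filename002.jpg
    └── labels
        ├── filename001.txt
        └── filename002.txt
```
--save-conf : Also writes the confidence of each detected object to the txt files above.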

# Run object detection
!python detect.py --weights yolov5x.pt --img 224 --conf 0.05 --source /content/datasets/all --save-txt --save-conf
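
detect.py writes to `runs/detect/exp` on the first run and, by default, to `exp2`, `exp3`, and so on for later runs, so a hard-coded `exp` path can point at stale results after a rerun. The following is a small sketch for picking the most recently modified run directory. The helper name `latest_run_dir` is hypothetical, and the demo uses a temporary directory rather than the real output.

```python
import os
import tempfile
from pathlib import Path


def latest_run_dir(detect_root):
    """Return the most recently modified exp* directory under detect_root."""
    candidates = [p for p in Path(detect_root).glob("exp*") if p.is_dir()]
    if not candidates:
        raise FileNotFoundError(f"no exp* directory under {detect_root}")
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Demo with a throwaway directory tree
root = Path(tempfile.mkdtemp())
for offset, name in enumerate(["exp", "exp2", "exp3"]):
    run = root / name
    run.mkdir()
    # Set explicit modification times so the ordering is deterministic
    os.utime(run, (1_000_000 + offset, 1_000_000 + offset))

print(latest_run_dir(root).name)  # exp3
```

On Colab this could be used as `latest_run_dir('/content/yolov5/runs/detect') / 'labels'` to locate the label files of the latest run.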

Saving the number of detected objects as features

# Count the detected objects per class from the detection result txt files
diclist = []
for file_path in tqdm(glob('/content/yolov5/runs/detect/exp/labels/*.txt')):
    # File names look like figure-012.txt; the number is the image index
    file_id = os.path.splitext(os.path.basename(file_path))[0]
    file_num = file_id.split('-')[-1]
    # Each row: class ID, x-center, y-center, width, height, confidence
    with open(file_path, 'r') as f:
        data = np.array(f.read().split()).astype(np.float32).reshape(-1, 6)
    # Unique class IDs and how many times each one was detected
    class_ids, counts = np.unique(data[:, 0], return_counts=True)
    dicdic = {'index': int(file_num)}
    for class_id, count in zip(class_ids, counts):
        dicdic[int(class_id)] = count
    diclist.append(dicdic)
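
With `--save-txt --save-conf`, each line of a label file holds six whitespace-separated values: class ID, normalized x-center, y-center, width, height, and confidence. The sketch below counts detections per class using only the standard library and can optionally ignore low-confidence boxes, which may be worth trying given the low 0.05 threshold. The function name `count_classes`, the `min_conf` parameter, and the sample text are hypothetical.

```python
from collections import Counter


def count_classes(label_text, min_conf=0.0):
    """Count detections per class ID, keeping boxes with conf >= min_conf."""
    counts = Counter()
    for line in label_text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        class_id = int(float(fields[0]))
        conf = float(fields[5])
        if conf >= min_conf:
            counts[class_id] += 1
    return dict(counts)


# Made-up label file content: two boxes of class 0, one of class 16
sample = """0 0.50 0.50 0.20 0.40 0.91
0 0.30 0.60 0.10 0.30 0.08
16 0.70 0.70 0.20 0.20 0.35
"""
print(count_classes(sample))                # {0: 2, 16: 1}
print(count_classes(sample, min_conf=0.3))  # {0: 1, 16: 1}
```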

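# Build a DataFrame from the counts, with one column per detected class label
df = pd.DataFrame(diclist)
# Placeholder frame with one row per image, so images without detections are kept
yolo_tmp = pd.DataFrame({'tmp':np.zeros(len(all_image))})
yolo_result = df.set_index('index')
# Align on the image index, fill missing counts with 0, and drop the placeholder column
all_df = pd.concat([yolo_tmp,yolo_result],axis=1).fillna(0).drop('tmp',axis=1)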
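
Because label files exist only for images with at least one detection, the count table has to be aligned back to the full image index. The toy sketch below shows the same idea with `reindex`, an alternative to the concat-with-placeholder approach above. The sample counts and the `n_images` value are made up.

```python
import pandas as pd

n_images = 5  # stands in for len(all_image)

# Made-up per-image counts: only images 0 and 3 had detections
detections = [
    {"index": 0, 0: 2},
    {"index": 3, 0: 1, 16: 1},
]

counts = pd.DataFrame(detections).set_index("index")
# Add rows for images without detections and fill every gap with 0
features = counts.reindex(range(n_images)).fillna(0).astype(int)
print(features)
```

Images 1, 2, and 4 end up as all-zero rows, and class 16 is 0 for image 0, which matches what `fillna(0)` does in the code above.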

# Split into train and test and write to files
train_df = all_df.iloc[:len(train_image)]
test_df = all_df.iloc[len(train_image):]

train_df.to_csv(f'{OUTPUT_DIR}/train_feature.csv',index=False)
test_df.to_csv(f'{OUTPUT_DIR}/test_feature.csv',index=False)

That's all.

Attachment

Add_features_by_Yolov5.ipynb