Extracting image features with a pretrained Caffe model

Caffe | Deep Learning Framework

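A deep learning library developed by the Berkeley Vision and Learning Center (BVLC).

It specializes in Convolutional Neural Networks (CNNs), which have produced strong results in recent years, particularly in image recognition.

It can run on either the CPU or the GPU. *1 *2

Several pretrained reference models are publicly available, so Caffe is easy to use even for people who cannot prepare a large image dataset or who have limited computing power.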

Installation

On Ubuntu, following the installation steps on the Caffe site generally works.
The steps below are for Ubuntu 14.04.

git clone https://github.com/BVLC/Caffe.git

sudo apt-get install g++-4.6
sudo apt-get install libatlas-base-dev
sudo apt-get install python-dev python-pip

sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev protobuf-compiler

sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev

cd Caffe
cp Makefile.config.example Makefile.config

Edit the copied Makefile.config:

## Refer to http://caffe.berkeleyvision.org/installation.html

# Contributions simplifying and improving our build system are welcome!

# cuDNN acceleration switch (uncomment to build with cuDNN).

# USE_CUDNN := 1

# CPU-only switch (uncomment to build without GPU support).

CPU_ONLY := 1

# uncomment to disable IO dependencies and corresponding data layers

# USE_OPENCV := 0

# USE_LEVELDB := 0

# USE_LMDB := 0

# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)

# You should not set this flag if you will be reading LMDBs with any

# possibility of simultaneous read and write

# ALLOW_LMDB_NOLOCK := 1

# Uncomment if you're using OpenCV 3

OPENCV_VERSION := 3

# To customize your choice of compiler, uncomment and set the following.

# N.B. the default for Linux is g++ and the default for OSX is clang++

CUSTOM_CXX := g++-4.6

# CUDA directory contains bin/ and lib/ directories that we need.

#CUDA_DIR := /usr/local/cuda

# On Ubuntu 14.04, if cuda tools are installed via

# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:

CUDA_DIR := /usr

Adjust Makefile.config to match your environment.

make
make test
make runtest

If no errors appear, the installation succeeded.

To use import caffe from Python, see the Python-related installation section of the following article:

Ubuntu14.04へのCaffeのインストール - kivantium活動日記

sudo apt-get install python-dev python-numpy python-skimage

make pycaffe

Add the path to .bashrc:

export PYTHONPATH=~/Caffe/python/:$PYTHONPATH

source ~/.bashrc
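One way to confirm that the PYTHONPATH setting took effect is to look for the caffe package in Python's search path before importing it. The following is a minimal sketch; the helper name find_package_dir is hypothetical and not part of Caffe, and it only checks for a directory containing __init__.py.

```python
import os
import sys


def find_package_dir(package, search_paths=None):
    """Return the first directory on the search path that holds `package`.

    Hypothetical helper for illustration: it only looks for
    <path>/<package>/__init__.py and does not import anything.
    """
    paths = sys.path if search_paths is None else search_paths
    for base in paths:
        candidate = os.path.join(os.path.expanduser(base), package)
        if os.path.isfile(os.path.join(candidate, '__init__.py')):
            return candidate
    return None
```

Calling find_package_dir('caffe') in a new shell should return a path under ~/Caffe/python if the export line was picked up; None suggests that .bashrc was not reloaded or that the path is wrong.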

If you get an error caused by a mismatched numpy version:

RuntimeError: module compiled against API version a but this version of numpy is 9

http://nonbiri-tereka.hatenablog.com/entry/2015/04/27/114536

sudo pip install numpy --upgrade

Getting the reference model & extracting intermediate-layer features

In the directory where Caffe was built:

./data/ilsvrc12/get_ilsvrc_aux.sh

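This is where things get a little tricky. When I installed Caffe (2016/6/21), its file layout differed slightly from what the reference sites described.

Run get_caffe_reference_imagenet_model.sh in the examples/imagenet directory

→ The script does not exist there, so run the following instead: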

wget https://raw.githubusercontent.com/sguada/caffe-public/master/models/get_caffe_reference_imagenet_model.sh

chmod u+x get_caffe_reference_imagenet_model.sh

./get_caffe_reference_imagenet_model.sh

This script downloads the ImageNet model trained on the ILSVRC2012 dataset.

cd ~/Caffe/data/ilsvrc12/

./get_ilsvrc_aux.sh

cd ~/Caffe/examples/imagenet/

wget https://raw.githubusercontent.com/aybassiouny/wincaffe-cmake/master/examples/imagenet/imagenet_deploy.prototxt

cp imagenet_deploy.prototxt imagenet_feature.prototxt

emacs ./imagenet_feature.prototxt &

Edit imagenet_feature.prototxt as follows *3

In the layers entry whose name is "fc6" ⇒ change the value of top from "fc6" to "fc6wi"
In the layers entry whose name is "relu6" ⇒ change the value of bottom from "fc6" to "fc6wi"

Incidentally, the fully connected intermediate layers of the ImageNet model are fc6 and fc7.

Some papers extract features from the fc7 layer instead.

I tried running the feature.py published in Caffeで手軽に画像分類 - Yahoo! JAPAN Tech Blog, but some functions did not work because of the Caffe version, so I changed it as follows.

feature.py

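#! /usr/bin/env python
# -*- coding: utf-8 -*-
import sys, os.path, numpy, caffe

# Full paths are required; "~" is not expanded automatically, so expand it here.
CAFFE_ROOT = os.path.expanduser('~/Caffe')
MEAN_FILE = os.path.join(CAFFE_ROOT, 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
MODEL_FILE = os.path.join(CAFFE_ROOT, 'examples/imagenet/imagenet_feature.prototxt')
PRETRAINED = os.path.join(CAFFE_ROOT, 'examples/imagenet/caffe_reference_imagenet_model')
LAYER = 'fc6wi'  # blob name added in imagenet_feature.prototxt
INDEX = 4        # predict() oversamples 10 crops by default; index 4 should be the center crop

# Select CPU mode, then load the network definition and pretrained weights.
caffe.set_mode_cpu()
net = caffe.Classifier(MODEL_FILE, PRETRAINED)

# Preprocessing: mean subtraction, [0,1] -> [0,255] scaling, RGB -> BGR.
# set_mean may raise a mean-shape error depending on the Caffe version.
net.transformer.set_mean('data', numpy.load(MEAN_FILE))
net.transformer.set_raw_scale('data', 255)
net.transformer.set_channel_swap('data', (2,1,0))

# Run a forward pass on the image given as the first argument.
image = caffe.io.load_image(sys.argv[1])
net.predict([ image ])

# Print the selected layer's activations as space-separated values.
feat = net.blobs[LAYER].data[INDEX].flatten().tolist()
print(' '.join(map(str, feat)))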

Image dataset to download for testing *4

wget http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz

tar xf 101_ObjectCategories.tar.gz
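The extracted archive has one subdirectory per category, which makes it easy to collect image paths for running feature.py over many files. The following is a minimal sketch, assuming that one-directory-per-category layout; the function name list_images and the extension list are hypothetical choices for illustration.

```python
import os


def list_images(dataset_root, extensions=('.jpg', '.jpeg', '.png')):
    """Return {category: [image paths]} for a one-directory-per-category dataset.

    Hypothetical helper: files are selected by extension only, and
    categories without any matching file are skipped.
    """
    result = {}
    for category in sorted(os.listdir(dataset_root)):
        category_dir = os.path.join(dataset_root, category)
        if not os.path.isdir(category_dir):
            continue
        files = [os.path.join(category_dir, name)
                 for name in sorted(os.listdir(category_dir))
                 if name.lower().endswith(extensions)]
        if files:
            result[category] = files
    return result
```

For example, list_images('101_ObjectCategories')['airplanes'] would give the list of airplane images, each of which can be passed to feature.py as its first argument.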

Example run

python feature.py 101_ObjectCategories/airplanes/image_0001.jpg > tmp.txt

If tmp.txt contains 4096-dimensional numeric data, the extraction worked.
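The output can be checked programmatically rather than by eye. This is a small sketch, assuming the file holds whitespace-separated numbers as printed by feature.py; the function name load_feature is hypothetical.

```python
def load_feature(path, expected_dim=4096):
    """Read a whitespace-separated feature vector and check its length.

    Hypothetical helper: raises ValueError when the number of values
    differs from expected_dim or a token is not a number.
    """
    with open(path) as f:
        values = [float(token) for token in f.read().split()]
    if len(values) != expected_dim:
        raise ValueError('expected %d values, found %d'
                         % (expected_dim, len(values)))
    return values
```

Calling load_feature('tmp.txt') should return a list of 4096 floats when the extraction succeeded.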

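The following article was helpful for dealing with errors when running feature.py:

OSX10.10でCaffeをインストール、リファレンスモデルで画像を分類 - Qiita

See its section on classifying images with classify.py.

Apparently it is enough to revert the relevant part of caffe/python/caffe/io.py to the code from an older version of Caffe.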

Publicly available reference models (pretrained models) can be found at the following site:

Model Zoo · BVLC/caffe Wiki · GitHub

References

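*1: To run on the CPU, a setting is required at install time.

*2: Using the GPU requires CUDA, but if you only use the CPU you do not need to install CUDA. Note that on Ubuntu, installing the CUDA libraries can reportedly cause display-related (?) errors.

*3: This is the method used on the reference site. Whether this change is always necessary still needs to be verified.

*4: Caffe uses OpenCV functions to open images, so any image format supported by OpenCV should be usable with Caffe.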