手書き数字の画像認識 深層学習モデルの実装例

create date : Aug. 8, 2018 at 23:00:00

本チュートリアルでは【練習問題】手書き数字の画像認識コンペの簡単な深層学習モデルの実装例を示します.
本コンペは0~9の手書き数字が写った画像に対して0~9の数字を割り当てる問題です. 学習用として与えられる画像データは60,000枚で
評価用として与えられる画像データは10,000枚です. 評価指標は正解率(Accuracy)です.
7層の比較的浅いモデル(LeNet[1])に対して実験した結果、ほぼ99%の精度を出すことが可能であることがわかりました。

分析環境は以下のようなものを想定しています。

  • OS: Windows 10 Pro
  • 言語: Python==3.6.3
  • ライブラリ
    • pandas==0.23.0
    • numpy==1.14.3
    • Pillow==5.1.0
    • chainer==4.0.0
    • cupy==4.0.0
  • GPU: Quadro M1200

はじめに

まずは画像データ(train.zip, test.zip)とメタデータ(train_master.tsv)をダウンロードし, 画像データを解凍します.
下記のようにそれぞれ自身で設定したディレクトリに置きます.

In [1]:
import os

_train_images_path = os.path.join('..', 'data', 'image','mnist','open','train')
_test_images_path = os.path.join('..', 'data', 'image', 'mnist','open','test')
_train_meta_path = os.path.join('..','data','image','mnist','open','train_master.tsv')

学習用画像データと数字の対応表を確認してみます.

In [2]:
import pandas as pd
In [3]:
train_master = pd.read_csv(_train_meta_path, sep='\t')
train_master.head()
Out[3]:
file_name category_id
0 train_0.jpg 5
1 train_1.jpg 0
2 train_2.jpg 4
3 train_3.jpg 1
4 train_4.jpg 9

実際に画像を確認してみます.

In [4]:
from PIL import Image
In [5]:
image = Image.open(os.path.join(_train_images_path, 'train_0.jpg'))
print('width, height:', image.size)
image
width, height: (28, 28)
Out[5]:
In [6]:
image = Image.open(os.path.join(_train_images_path, 'train_1.jpg'))
print('width, height:', image.size)
image
width, height: (28, 28)
Out[6]:
In [7]:
image = Image.open(os.path.join(_train_images_path, 'train_2.jpg'))
print('width, height:', image.size)
image
width, height: (28, 28)
Out[7]:
In [8]:
image = Image.open(os.path.join(_train_images_path, 'train_3.jpg'))
print('width, height:', image.size)
image
width, height: (28, 28)
Out[8]:
In [9]:
image = Image.open(os.path.join(_train_images_path, 'train_4.jpg'))
print('width, height:', image.size)
image
width, height: (28, 28)
Out[9]:

それぞれ画像の大きさは横×縦が28×28で, グレースケール画像であることが確認できました.
それぞれの数字で特徴が出ています. この例だと4と9が若干似ています.

学習用画像データにおけるそれぞれの数字の分布をみてみます.

In [10]:
train_master['category_id'].value_counts().sort_index()
Out[10]:
0    5923
1    6742
2    5958
3    6131
4    5842
5    5421
6    5918
7    6265
8    5851
9    5949
Name: category_id, dtype: int64

ほぼ均等ですが, 1が多めで5が少なめでした.

手法

次に実際にモデリングを行い, 評価用画像データに対して予測をできるようにします.
今回は7層の比較的浅い畳み込みニューラルネットワークモデル(LeNet[1])によりモデリングを試みます.
実装はchainerを用います.

まず60,000枚の学習用画像データを読み込みます. その際, あらかじめデータの数値の範囲が[0, 1]となるように正規化しておきます.
chainerではモデルに学習データを渡す際は(サンプル数, チャンネル数, 縦, 横)の形でなければならないので, そのように変形しておきます.

In [11]:
import numpy as np

num_train = len(train_master)
image_size = (28,28)
train_X = np.zeros((num_train,)+(1,)+image_size)
train_Y = np.zeros((num_train,))
for data in train_master.iterrows():
    image = Image.open(os.path.join(_train_images_path, data[1]['file_name']))
    image_array = np.array(image).reshape((1,)+image_size)
    train_X[data[0],:] = image_array/255.
    train_Y[data[0]] = data[1]['category_id']
In [12]:
train_X = train_X.astype(np.float32)
train_Y = train_Y.astype(np.int32)
In [13]:
print(train_X.shape)
print(train_Y.shape)
(60000, 1, 28, 28)
(60000,)

畳み込みニューラルネットワークモデルLeNetを構築する抽象クラスを定義します.
活性化関数はtanh, relu, sigmoidを指定できるようにします. もともとはtanhです.

In [15]:
from chainer import Chain
import chainer.links as L

class LeNet(Chain):
    def __init__(self, out_size=10, act_func = ''):
        super(LeNet, self).__init__(
                    conv1 = L.Convolution2D(None, 6, 5, stride=1),
                    conv2 = L.Convolution2D(None, 16, 5, stride=1),
                    fc3 = L.Linear(None, 120),
                    fc4 = L.Linear(None, 84),
                    fc5 = L.Linear(None, out_size))
        self.train = True
        if act_func == 'sigmoid':
            print('activation function is', act_func)
            self.act_func = F.sigmoid
        elif act_func == 'relu':
            print('activation function is', act_func)
            self.act_func = F.relu
        else:
            print('activation function is', 'tanh')
            self.act_func = F.tanh
    
    def __call__(self, x):
        with chainer.using_config('enable_backprop', self.train):
            h = F.max_pooling_2d(self.act_func(self.conv1(x)), 2, stride=2)
            h = F.max_pooling_2d(self.act_func(self.conv2(x)), 2, stride=2)
            h = self.act_func(self.fc3(h))
            h = self.act_func(self.fc4(h))
            y = self.fc5(h)
        
            return y

学習データとモデルオブジェクトを渡してモデルを学習させる関数を定義します.

In [16]:
import chainer
import chainer.functions as F
import chainer.cuda as cuda
from chainer import Chain, optimizers, serializers
from time import clock

def train_model(model, train, model_name, val = False, batchsize = 128, init_lr = 0.01, num_epochs = 50, gpu = True):
    """
    train: (np.array, np.array)
    val: (np.array, np.array)
    """
    X_train, Y_train = train
    
    if val:
        X_train, X_val = np.split(X_train,[54000])
        Y_train, Y_val = np.split(Y_train,[54000])
        n_val = len(X_val)
    n_train = len(X_train)
    optimizer = optimizers.NesterovAG(lr=init_lr)
    optimizer.setup(model)
    
    
    if gpu:
        model.to_gpu()
    update_start = clock()
    for epoch in range(num_epochs):
        print('-'*20, 'epoch:', epoch+1, '-'*20)
        # training
        count = 0
        num_samples = 0
        train_loss = 0
        train_acc = 0
        train_start = clock()
        print('training...')
        for t in range(0, n_train, batchsize):
            model.cleargrads()
            minibatch, labels = X_train[t:t+batchsize], Y_train[t:t+batchsize]
            if gpu:
                minibatch = cuda.to_gpu(minibatch)
                labels = cuda.to_gpu(labels)
            y = model(minibatch)
            loss = F.softmax_cross_entropy(y, labels)
            loss.backward()
            optimizer.update()
            y_pred = F.softmax(y)
            preds = cuda.to_cpu(y_pred.data).argmax(axis=1)
            labels = cuda.to_cpu(labels)
            train_loss += float(loss.data)*len(minibatch)
            train_acc += (labels==preds).sum()
            count += 1
            num_samples += len(minibatch)
        print('train_acc:', round(train_acc/num_samples, 4), 'train_loss:', round(train_loss/num_samples, 4))
        print('Took', clock() - train_start, 'seconds.')
        
        index = np.random.permutation(np.arange(n_train))
        X_train = X_train[index]
        Y_train = Y_train[index]

        if val:
            # validation
            val_acc = 0
            val_loss = 0
            num_samples = 0
            count = 0
            model.train = False
            val_start = clock()
            print('\n')
            print('validating...')
            for v in range(0, n_val, batchsize):
                minibatch, labels = X_val[v:v+batchsize], Y_val[v:v+batchsize]
                if gpu:
                    minibatch = cuda.to_gpu(minibatch)
                    labels = cuda.to_gpu(labels)
                y = model(minibatch)
                loss = F.softmax_cross_entropy(y, labels)
                y_pred = F.softmax(y)
                preds = cuda.to_cpu(y_pred.data).argmax(axis=1)
                val_loss += float(loss.data)*len(minibatch)
                labels = cuda.to_cpu(labels)
                val_acc += (labels==preds).sum()
                count+=1
                num_samples += len(minibatch)
            val_acc /= num_samples
            print('val_acc:', round(val_acc, 4), 'val_loss:', round(val_loss/num_samples, 4))
            print('Took', clock()-val_start, 'seconds.')
            model.train = True
        serializers.save_npz(model_name+'.npz', model)
        
    print('\nTook', clock()-update_start, 'seconds.')

LeNetモデルを学習させてみます. まずは活性化関数はデフォルトで設定しておきます.

In [17]:
lenet = LeNet()
activation function is tanh
In [18]:
train = train_X, train_Y
In [19]:
train_model(lenet, train, 'lenet')
-------------------- epoch: 1 --------------------
training...
train_acc: 0.9087 train_loss: 0.3427
Took 5.036974097345552 seconds.
-------------------- epoch: 2 --------------------
training...
train_acc: 0.9656 train_loss: 0.1204
Took 3.213466973477403 seconds.
-------------------- epoch: 3 --------------------
training...
train_acc: 0.9783 train_loss: 0.078
Took 3.2095018944712077 seconds.
-------------------- epoch: 4 --------------------
training...
train_acc: 0.9838 train_loss: 0.0587
Took 3.2427041889876342 seconds.
-------------------- epoch: 5 --------------------
training...
train_acc: 0.9872 train_loss: 0.0464
Took 3.206259230760814 seconds.
-------------------- epoch: 6 --------------------
training...
train_acc: 0.9897 train_loss: 0.038
Took 3.1867558411342785 seconds.
-------------------- epoch: 7 --------------------
training...
train_acc: 0.9918 train_loss: 0.0319
Took 3.2103176658072563 seconds.
-------------------- epoch: 8 --------------------
training...
train_acc: 0.9928 train_loss: 0.0274
Took 3.206195048483149 seconds.
-------------------- epoch: 9 --------------------
training...
train_acc: 0.9942 train_loss: 0.0234
Took 3.2008657314044626 seconds.
-------------------- epoch: 10 --------------------
training...
train_acc: 0.9956 train_loss: 0.0198
Took 3.220095252334815 seconds.
-------------------- epoch: 11 --------------------
training...
train_acc: 0.9964 train_loss: 0.017
Took 3.24847731192952 seconds.
-------------------- epoch: 12 --------------------
training...
train_acc: 0.9968 train_loss: 0.0148
Took 3.211767237135284 seconds.
-------------------- epoch: 13 --------------------
training...
train_acc: 0.9978 train_loss: 0.0124
Took 3.197559614760465 seconds.
-------------------- epoch: 14 --------------------
training...
train_acc: 0.9979 train_loss: 0.0111
Took 3.1742399323168726 seconds.
-------------------- epoch: 15 --------------------
training...
train_acc: 0.9984 train_loss: 0.0098
Took 3.1816726776773265 seconds.
-------------------- epoch: 16 --------------------
training...
train_acc: 0.9988 train_loss: 0.0083
Took 3.1852198425346216 seconds.
-------------------- epoch: 17 --------------------
training...
train_acc: 0.999 train_loss: 0.0073
Took 3.219322512298561 seconds.
-------------------- epoch: 18 --------------------
training...
train_acc: 0.999 train_loss: 0.0067
Took 3.1950885970702316 seconds.
-------------------- epoch: 19 --------------------
training...
train_acc: 0.9993 train_loss: 0.0057
Took 3.193577031496716 seconds.
-------------------- epoch: 20 --------------------
training...
train_acc: 0.9994 train_loss: 0.0052
Took 3.218616871916254 seconds.
-------------------- epoch: 21 --------------------
training...
train_acc: 0.9996 train_loss: 0.0045
Took 3.174630860735391 seconds.
-------------------- epoch: 22 --------------------
training...
train_acc: 0.9996 train_loss: 0.004
Took 3.2072547854087503 seconds.
-------------------- epoch: 23 --------------------
training...
train_acc: 0.9997 train_loss: 0.0036
Took 3.223209551489859 seconds.
-------------------- epoch: 24 --------------------
training...
train_acc: 0.9997 train_loss: 0.0034
Took 3.2028251142335193 seconds.
-------------------- epoch: 25 --------------------
training...
train_acc: 0.9998 train_loss: 0.003
Took 3.178631312928715 seconds.
-------------------- epoch: 26 --------------------
training...
train_acc: 0.9998 train_loss: 0.0028
Took 3.22452127678973 seconds.
-------------------- epoch: 27 --------------------
training...
train_acc: 0.9999 train_loss: 0.0025
Took 3.1835667842126156 seconds.
-------------------- epoch: 28 --------------------
training...
train_acc: 0.9998 train_loss: 0.0023
Took 3.1941725409253223 seconds.
-------------------- epoch: 29 --------------------
training...
train_acc: 0.9999 train_loss: 0.0022
Took 3.2216815756749213 seconds.
-------------------- epoch: 30 --------------------
training...
train_acc: 0.9999 train_loss: 0.0019
Took 3.204356736768787 seconds.
-------------------- epoch: 31 --------------------
training...
train_acc: 1.0 train_loss: 0.0018
Took 3.184928469580882 seconds.
-------------------- epoch: 32 --------------------
training...
train_acc: 0.9999 train_loss: 0.0017
Took 3.200788420933634 seconds.
-------------------- epoch: 33 --------------------
training...
train_acc: 0.9999 train_loss: 0.0016
Took 3.175166563950711 seconds.
-------------------- epoch: 34 --------------------
training...
train_acc: 1.0 train_loss: 0.0015
Took 3.1989818356860837 seconds.
-------------------- epoch: 35 --------------------
training...
train_acc: 1.0 train_loss: 0.0015
Took 3.214083269211841 seconds.
-------------------- epoch: 36 --------------------
training...
train_acc: 1.0 train_loss: 0.0014
Took 3.2003026777867234 seconds.
-------------------- epoch: 37 --------------------
training...
train_acc: 1.0 train_loss: 0.0013
Took 3.192801374084226 seconds.
-------------------- epoch: 38 --------------------
training...
train_acc: 1.0 train_loss: 0.0012
Took 3.210803408954149 seconds.
-------------------- epoch: 39 --------------------
training...
train_acc: 1.0 train_loss: 0.0012
Took 3.1756395435764944 seconds.
-------------------- epoch: 40 --------------------
training...
train_acc: 1.0 train_loss: 0.0011
Took 3.214450858620296 seconds.
-------------------- epoch: 41 --------------------
training...
train_acc: 1.0 train_loss: 0.0011
Took 3.2313537719851695 seconds.
-------------------- epoch: 42 --------------------
training...
train_acc: 1.0 train_loss: 0.001
Took 3.208135103694474 seconds.
-------------------- epoch: 43 --------------------
training...
train_acc: 1.0 train_loss: 0.001
Took 3.2004011392354244 seconds.
-------------------- epoch: 44 --------------------
training...
train_acc: 1.0 train_loss: 0.0009
Took 3.2162162359282433 seconds.
-------------------- epoch: 45 --------------------
training...
train_acc: 1.0 train_loss: 0.0009
Took 3.176620511343117 seconds.
-------------------- epoch: 46 --------------------
training...
train_acc: 1.0 train_loss: 0.0009
Took 3.1763677936247916 seconds.
-------------------- epoch: 47 --------------------
training...
train_acc: 1.0 train_loss: 0.0008
Took 3.226537183783762 seconds.
-------------------- epoch: 48 --------------------
training...
train_acc: 1.0 train_loss: 0.0008
Took 3.1565894412859734 seconds.
-------------------- epoch: 49 --------------------
training...
train_acc: 1.0 train_loss: 0.0008
Took 3.176725172216379 seconds.
-------------------- epoch: 50 --------------------
training...
train_acc: 1.0 train_loss: 0.0008
Took 3.188038757343577 seconds.

Took 169.4511467112053 seconds.

活性化関数がreluのモデルを学習させてみます.

In [20]:
lenet_relu = LeNet(act_func='relu')
activation function is relu
In [21]:
train_model(lenet_relu, train, 'lenet_relu')
-------------------- epoch: 1 --------------------
training...
train_acc: 0.9028 train_loss: 0.3333
Took 3.367264121012795 seconds.
-------------------- epoch: 2 --------------------
training...
train_acc: 0.9705 train_loss: 0.1001
Took 3.2876470995810223 seconds.
-------------------- epoch: 3 --------------------
training...
train_acc: 0.9804 train_loss: 0.0671
Took 3.3144610694371295 seconds.
-------------------- epoch: 4 --------------------
training...
train_acc: 0.985 train_loss: 0.0504
Took 3.2918648963054693 seconds.
-------------------- epoch: 5 --------------------
training...
train_acc: 0.9878 train_loss: 0.0407
Took 3.275175316079526 seconds.
-------------------- epoch: 6 --------------------
training...
train_acc: 0.9897 train_loss: 0.0344
Took 3.24893388131386 seconds.
-------------------- epoch: 7 --------------------
training...
train_acc: 0.9914 train_loss: 0.0282
Took 3.261977470561874 seconds.
-------------------- epoch: 8 --------------------
training...
train_acc: 0.9925 train_loss: 0.0244
Took 3.2607306568837657 seconds.
-------------------- epoch: 9 --------------------
training...
train_acc: 0.9939 train_loss: 0.0203
Took 3.2822043695002776 seconds.
-------------------- epoch: 10 --------------------
training...
train_acc: 0.9947 train_loss: 0.0175
Took 3.282784198031436 seconds.
-------------------- epoch: 11 --------------------
training...
train_acc: 0.996 train_loss: 0.0149
Took 3.2632359537449247 seconds.
-------------------- epoch: 12 --------------------
training...
train_acc: 0.9961 train_loss: 0.0131
Took 3.2626696180789168 seconds.
-------------------- epoch: 13 --------------------
training...
train_acc: 0.997 train_loss: 0.0111
Took 3.2754415266629167 seconds.
-------------------- epoch: 14 --------------------
training...
train_acc: 0.9976 train_loss: 0.0095
Took 3.2697194578056497 seconds.
-------------------- epoch: 15 --------------------
training...
train_acc: 0.9979 train_loss: 0.008
Took 3.2461609151808943 seconds.
-------------------- epoch: 16 --------------------
training...
train_acc: 0.9982 train_loss: 0.0072
Took 3.26637468592628 seconds.
-------------------- epoch: 17 --------------------
training...
train_acc: 0.9986 train_loss: 0.0058
Took 3.2573530645213395 seconds.
-------------------- epoch: 18 --------------------
training...
train_acc: 0.9988 train_loss: 0.005
Took 3.2534375809116227 seconds.
-------------------- epoch: 19 --------------------
training...
train_acc: 0.9994 train_loss: 0.004
Took 3.29076285742417 seconds.
-------------------- epoch: 20 --------------------
training...
train_acc: 0.9993 train_loss: 0.0038
Took 3.309753518173352 seconds.
-------------------- epoch: 21 --------------------
training...
train_acc: 0.9996 train_loss: 0.0027
Took 3.2678986503487977 seconds.
-------------------- epoch: 22 --------------------
training...
train_acc: 0.9996 train_loss: 0.0023
Took 3.261799875282236 seconds.
-------------------- epoch: 23 --------------------
training...
train_acc: 0.9997 train_loss: 0.0021
Took 3.289815074812509 seconds.
-------------------- epoch: 24 --------------------
training...
train_acc: 0.9997 train_loss: 0.0019
Took 3.2548907989598774 seconds.
-------------------- epoch: 25 --------------------
training...
train_acc: 0.9998 train_loss: 0.0016
Took 3.2724260536285783 seconds.
-------------------- epoch: 26 --------------------
training...
train_acc: 0.9998 train_loss: 0.0014
Took 3.273571488481821 seconds.
-------------------- epoch: 27 --------------------
training...
train_acc: 0.9999 train_loss: 0.0012
Took 3.2665628566948044 seconds.
-------------------- epoch: 28 --------------------
training...
train_acc: 0.9999 train_loss: 0.0009
Took 3.2489349753299166 seconds.
-------------------- epoch: 29 --------------------
training...
train_acc: 0.9998 train_loss: 0.0009
Took 3.3612349983042122 seconds.
-------------------- epoch: 30 --------------------
training...
train_acc: 0.9999 train_loss: 0.0008
Took 3.3386413778767974 seconds.
-------------------- epoch: 31 --------------------
training...
train_acc: 1.0 train_loss: 0.0007
Took 3.2431202797764627 seconds.
-------------------- epoch: 32 --------------------
training...
train_acc: 1.0 train_loss: 0.0007
Took 3.2673293973065256 seconds.
-------------------- epoch: 33 --------------------
training...
train_acc: 1.0 train_loss: 0.0005
Took 3.2688114244455164 seconds.
-------------------- epoch: 34 --------------------
training...
train_acc: 0.9999 train_loss: 0.0006
Took 3.2855206969611572 seconds.
-------------------- epoch: 35 --------------------
training...
train_acc: 0.9999 train_loss: 0.0006
Took 3.298971624869182 seconds.
-------------------- epoch: 36 --------------------
training...
train_acc: 1.0 train_loss: 0.0005
Took 3.323961140548249 seconds.
-------------------- epoch: 37 --------------------
training...
train_acc: 1.0 train_loss: 0.0005
Took 3.2576477195234474 seconds.
-------------------- epoch: 38 --------------------
training...
train_acc: 1.0 train_loss: 0.0004
Took 3.2717102024295173 seconds.
-------------------- epoch: 39 --------------------
training...
train_acc: 1.0 train_loss: 0.0004
Took 3.2320838453936176 seconds.
-------------------- epoch: 40 --------------------
training...
train_acc: 1.0 train_loss: 0.0004
Took 3.2563210426702653 seconds.
-------------------- epoch: 41 --------------------
training...
train_acc: 1.0 train_loss: 0.0003
Took 3.3770289440191164 seconds.
-------------------- epoch: 42 --------------------
training...
train_acc: 1.0 train_loss: 0.0003
Took 3.3026661172274316 seconds.
-------------------- epoch: 43 --------------------
training...
train_acc: 1.0 train_loss: 0.0003
Took 3.2929734992833346 seconds.
-------------------- epoch: 44 --------------------
training...
train_acc: 1.0 train_loss: 0.0004
Took 3.2406864586334905 seconds.
-------------------- epoch: 45 --------------------
training...
train_acc: 1.0 train_loss: 0.0003
Took 3.2584959466703367 seconds.
-------------------- epoch: 46 --------------------
training...
train_acc: 1.0 train_loss: 0.0003
Took 3.395573975545176 seconds.
-------------------- epoch: 47 --------------------
training...
train_acc: 1.0 train_loss: 0.0003
Took 3.3131340279120423 seconds.
-------------------- epoch: 48 --------------------
training...
train_acc: 1.0 train_loss: 0.0002
Took 3.348850371418507 seconds.
-------------------- epoch: 49 --------------------
training...
train_acc: 1.0 train_loss: 0.0002
Took 3.3405154274503275 seconds.
-------------------- epoch: 50 --------------------
training...
train_acc: 1.0 train_loss: 0.0002
Took 3.309607284688582 seconds.

Took 171.64705144428353 seconds.

評価用画像データに対して予測値を出力する関数を実装します.

In [22]:
def predict(model, X_test, batchsize = 256, gpu=True):
    n_test = len(X_test)
    predictions = np.array([])
    model.train = False
    pred_start = clock()
    print('\n')
    print('predicting...')
    for t in range(0, n_test, batchsize):
        minibatch = X_test[t:t+batchsize]
        if gpu:
            minibatch = cuda.to_gpu(minibatch)
        y = model(minibatch)
        y_pred = F.softmax(y)
        preds = cuda.to_cpu(y_pred.data).argmax(axis=1)
        predictions = np.concatenate((predictions, preds))
    
    print('Took', clock()-pred_start, 'seconds.')
    return predictions

評価用画像データを読み込みます. 学習用画像データと同様に[0, 1]に正規化しておきます.

In [23]:
image_size = (28,28)
test_files = os.listdir(_test_images_path)
num_test = len(test_files)
test_X = np.zeros((num_test,)+(1,)+image_size)
for i, file_name in enumerate(os.listdir(_test_images_path)):
    image = Image.open(os.path.join(_test_images_path, file_name))
    image_array = np.array(image).reshape((1,)+image_size)
    test_X[i,:] = image_array/255.
In [24]:
test_X = test_X.astype(np.float32)

LeNet(tanh)に対する予測値を出力します.

In [25]:
predictions_lenet = predict(lenet, test_X)

predicting...
Took 0.14893643401808276 seconds.

LeNet(relu)に対する予測値を出力します.

In [26]:
predictions_lenet_relu = predict(lenet_relu, test_X)

predicting...
Took 0.13441337033543732 seconds.
In [27]:
pd.DataFrame({'0':test_files, '1':predictions_lenet.astype(np.int)}).to_csv('lenet.tsv', sep='\t', index=None, header=None)
pd.DataFrame({'0':test_files, '1':predictions_lenet_relu.astype(np.int)}).to_csv('lenet_relu.tsv', sep='\t', index=None, header=None)

評価

lenet.tsv(LeNet(tanh))を投稿すると, 0.9896ほどで,
lenet_relu.tsv(LeNet(relu))を投稿すると, 0.9898ほど
となります.

まとめ

今回は比較的浅いネットワークモデルであるLeNetを活性化関数を変えつつ適用してみて, ほぼ99%の正解率を出すことがわかりました.
LeNetは初期の深層学習モデルで、その後様々な新しい深層学習モデルが提案されているので, 色々試してみると精度に違いが見えると思います.
その他色々工夫をしてさらなる改善を目指してみてください.

参考文献

[1] Y. LeCun, L. Bottou, Y. Bengio and P. Haffner: Gradient-Based Learning Applied to Document Recognition, Proceedings of the IEEE, 86(11):2278-2324, November 1998

create date : Aug. 8, 2018 at 23:00:00