Building, Training, and Querying a Neural Network

Abstract: Following the code in the book on neural network programming (Tariq Rashid's Make Your Own Neural Network), these notes walk through a first encounter with neural networks, building one up step by step and using it to recognise handwritten digits.

The three main jobs of a neural network

  1. Initialisation – set the number of input, hidden and output nodes.

  2. Training – refine the weights after being shown examples from the training set.

  3. Query – given an input, return an answer from the output nodes.

The network class – initialisation

def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
    # set number of nodes in each input, hidden, output layer
    self.inodes = inputnodes
    self.hnodes = hiddennodes
    self.onodes = outputnodes

    # link weight matrices, wih and who
    # (w: weight, i: input, h: hidden, o: output; wih holds the
    #  input-to-hidden weights and who the hidden-to-output weights)
    # weights inside the arrays are w_i_j, where the link is from
    # node i to node j in the next layer
    # w11 w21
    # w12 w22 etc
    self.wih = (numpy.random.rand(self.hnodes, self.inodes) - 0.5)
    self.who = (numpy.random.rand(self.onodes, self.hnodes) - 0.5)

    # learning rate
    self.lr = learningrate

    # activation function is the sigmoid function
    # (requires import scipy.special; expit is wrapped in an
    #  anonymous lambda that takes x and returns expit(x))
    self.activation_function = lambda x: scipy.special.expit(x)

    pass

Points worth noting:

  • The method name __init__ has two underscores on each side; with only a single underscore you get TypeError: object() takes no parameters.

Reference: https://blog.csdn.net/qq_26489165/article/details/80595864

  • Link weight matrices: in wih, w stands for weight, i for input and h for hidden; who is named the same way.
self.wih = (numpy.random.rand(self.hnodes, self.inodes) - 0.5)
self.who = (numpy.random.rand(self.onodes, self.hnodes) - 0.5)
  • The activation function is created with a lambda: the anonymous function takes x and returns scipy.special.expit(x), which is the sigmoid (S-shaped) function.
self.activation_function = lambda x: scipy.special.expit(x)
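As a side note, numpy.random.rand draws the initial weights uniformly from -0.5 to +0.5. A common refinement, shown here only as an optional sketch and not used in the code above, samples each weight instead from a normal distribution centred on zero with standard deviation 1/sqrt(number of incoming links):

# optional alternative initialisation (not used in the code above):
# normally distributed weights, std dev = 1/sqrt(incoming links)
self.wih = numpy.random.normal(0.0, pow(self.inodes, -0.5), (self.hnodes, self.inodes))
self.who = numpy.random.normal(0.0, pow(self.hnodes, -0.5), (self.onodes, self.hnodes))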

The network class – query

# query the neural network
def query(self, inputs_list):
    # convert inputs list to 2d array
    inputs = numpy.array(inputs_list, ndmin=2).T

    # calculate signals into hidden layer
    hidden_inputs = numpy.dot(self.wih, inputs)
    # calculate the signals emerging from hidden layer
    hidden_outputs = self.activation_function(hidden_inputs)
    # calculate signals into final output layer
    final_inputs = numpy.dot(self.who, hidden_outputs)
    # calculate the signals emerging from final output layer
    final_outputs = self.activation_function(final_inputs)
    return final_outputs
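In matrix form, query() computes the following feedforward pass, where $I$ is the input column vector, $W_{ih}$ and $W_{ho}$ are self.wih and self.who, and $\sigma$ is the sigmoid:

$$X_{hidden} = W_{ih} \cdot I, \qquad O_{hidden} = \sigma(X_{hidden})$$
$$X_{final} = W_{ho} \cdot O_{hidden}, \qquad O_{final} = \sigma(X_{final})$$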

The network class – train

# train the neural network
def train(self, input_list, targets_list):
    # the feedforward code below is almost identical to query();
    # the extra targets array comes from the training examples
    # convert inputs list to 2d array
    inputs = numpy.array(input_list, ndmin=2).T
    targets = numpy.array(targets_list, ndmin=2).T

    # calculate signals into hidden layer
    hidden_inputs = numpy.dot(self.wih, inputs)
    # calculate the signals emerging from hidden layer
    hidden_outputs = self.activation_function(hidden_inputs)
    # calculate signals into final output layer
    final_inputs = numpy.dot(self.who, hidden_outputs)
    # calculate the signals emerging from final output layer
    final_outputs = self.activation_function(final_inputs)

    # error is the (target - actual), i.e. the error to back-propagate
    output_errors = targets - final_outputs
    # hidden layer error is the output_errors, split by weights,
    # recombined at hidden nodes
    hidden_errors = numpy.dot(self.who.T, output_errors)
    # update the weights for the links between the hidden and output layers
    # (self.lr is the learning rate; numpy.dot does the matrix multiplication)
    self.who += self.lr * numpy.dot((output_errors * final_outputs * (1.0 - final_outputs)), numpy.transpose(hidden_outputs))
    # update the weights for the links between the input and hidden layers
    self.wih += self.lr * numpy.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)), numpy.transpose(inputs))
    pass

Points to note:

  1. The feedforward code is almost identical to query(), because the signal flow from the input layer to the final output layer is exactly the same; the additional targets array is what the training examples supply.
  2. The error is (target - actual), which is the error that gets back-propagated.
  3. self.lr is the learning rate, and numpy.dot performs the matrix multiplications.
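Written out, both weight updates implement the same matrix expression, where $\alpha$ is the learning rate self.lr, $E$ is the error at a layer, $O$ its output, $O_{prev}$ the output of the preceding layer, and $*$ denotes element-wise multiplication:

$$\Delta W = \alpha \cdot \left( E * O * (1 - O) \right) \cdot O_{prev}^{T}$$

The factor $O * (1 - O)$ is the derivative of the sigmoid written in terms of its own output.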

Creating the neural network object

#number of input, hidden and output nodes
input_nodes = 784
hidden_nodes = 100
output_nodes = 10

#learning rate is 0.3
learning_rate = 0.3

#create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

Why 784 input nodes? Remember that 784 = 28 * 28, the number of pixels that make up a handwritten-digit image.

There is no fixed rule behind the choice of 100 hidden nodes. The book's reasoning is that a neural network should find features or patterns in the input that can be expressed more compactly than the input itself, which is why no number larger than 784 was chosen. Picking fewer hidden nodes than input nodes forces the network to try to summarise the key features of the input; picking too few, however, limits the network's capacity. The 10 output nodes correspond to the 10 labels, the digits 0 to 9.

One point worth stressing: there is no best method for choosing the number of hidden nodes for a given problem. The best approach is to experiment until you find a number that works well for the problem you are solving.

Training the network

  • Opening the file
#load the mnist training data CSV file into a list
training_data_file = open("TestandTrain/mnist_train_100.csv",'r')
training_data_list = training_data_file.readlines()
training_data_file.close()

With the file placed under the working directory (here in the TestandTrain folder), it can be opened directly via the relative path.
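Each line of the CSV file is one record: the first comma-separated value is the label (0 to 9), and the remaining 784 values are the pixel intensities (0 to 255) of the 28x28 image. The training loop below relies on exactly this layout.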

  • The training loop
# train the neural network
# go through all records in the training data set
for record in training_data_list:
    # split the record by the ',' commas
    all_values = record.split(',')
    # scale and shift the inputs
    inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01
    targets = numpy.zeros(output_nodes) + 0.01
    # all_values[0] is the target label for this record
    targets[int(all_values[0])] = 0.99
    n.train(inputs, targets)
    pass
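As an illustration of what the loop builds, this standalone snippet (not part of the training code) prints the target vector for a record labelled 5:

import numpy

# target vector for label 5: all 0.01 except a 0.99 at index 5
targets = numpy.zeros(10) + 0.01
targets[5] = 0.99
print(targets)   # [0.01 0.01 0.01 0.01 0.01 0.99 0.01 0.01 0.01 0.01]

The values 0.01 and 0.99 are used instead of 0 and 1 because a sigmoid output can only approach those extremes asymptotically; rescaling the inputs into the range 0.01 to 1.00 likewise avoids zero-valued inputs, which would zero out the corresponding weight updates.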

Querying the network

  • Opening the file
#load the mnist test data CSV file into a list
test_data_file = open("TestandTrain/mnist_test_10.csv",'r')
test_data_list = test_data_file.readlines()
test_data_file.close()
  • Printing the label

We then display the image with matplotlib and look at the output probabilities returned by the query.

# get the third record from the test set (index 2)
all_values = test_data_list[2].split(',')
# print the label
print('label:', all_values[0])

# reshape the 784 pixel values into a 28x28 array and plot it
image_array = numpy.asfarray(all_values[1:]).reshape((28,28))
matplotlib.pyplot.imshow(image_array, cmap='Greys', interpolation='None')

# query the network with the scaled and shifted pixel values
n.query((numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01)
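query() returns a column of ten output values between 0 and 1; the network's answer is the index of the largest one. A minimal follow-up sketch (the outputs variable is introduced here for illustration):

outputs = n.query((numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01)
print('network answer:', numpy.argmax(outputs))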

Full code

import numpy
# scipy.special for the sigmoid function expit()
import scipy.special
# plotting library
import matplotlib.pyplot
# Jupyter notebook magic to show plots inline
%matplotlib inline

# neural network class definition
class neuralNetwork:

    # initialise the neural network
    # note: __init__ has double underscores on both sides, otherwise
    # you get: TypeError: object() takes no parameters
    # reference: https://blog.csdn.net/qq_26489165/article/details/80595864
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes

        # link weight matrices, wih and who
        # (w: weight, i: input, h: hidden, o: output)
        # weights inside the arrays are w_i_j, where the link is from
        # node i to node j in the next layer
        # w11 w21
        # w12 w22 etc
        self.wih = (numpy.random.rand(self.hnodes, self.inodes) - 0.5)
        self.who = (numpy.random.rand(self.onodes, self.hnodes) - 0.5)

        # learning rate
        self.lr = learningrate

        # activation function is the sigmoid function
        # (scipy.special.expit, wrapped in an anonymous lambda)
        self.activation_function = lambda x: scipy.special.expit(x)

        pass

    # train the neural network
    def train(self, input_list, targets_list):
        # the feedforward code is almost identical to query();
        # the extra targets array comes from the training examples
        # convert inputs list to 2d array
        inputs = numpy.array(input_list, ndmin=2).T
        targets = numpy.array(targets_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)

        # error is the (target - actual), i.e. the error to back-propagate
        output_errors = targets - final_outputs
        # hidden layer error is the output_errors, split by weights,
        # recombined at hidden nodes
        hidden_errors = numpy.dot(self.who.T, output_errors)
        # update the weights for the links between the hidden and output layers
        # (self.lr is the learning rate; numpy.dot does the matrix multiplication)
        self.who += self.lr * numpy.dot((output_errors * final_outputs * (1.0 - final_outputs)), numpy.transpose(hidden_outputs))
        # update the weights for the links between the input and hidden layers
        self.wih += self.lr * numpy.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)), numpy.transpose(inputs))
        pass

    # query the neural network
    def query(self, inputs_list):
        # convert inputs list to 2d array
        inputs = numpy.array(inputs_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
        return final_outputs

# number of input, hidden and output nodes
input_nodes = 784
hidden_nodes = 100
output_nodes = 10

# learning rate is 0.3
learning_rate = 0.3

# create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

# load the mnist training data CSV file into a list
training_data_file = open("TestandTrain/mnist_train_100.csv", 'r')
training_data_list = training_data_file.readlines()
training_data_file.close()

# train the neural network
# go through all records in the training data set
for record in training_data_list:
    # split the record by the ',' commas
    all_values = record.split(',')
    # scale and shift the inputs
    inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01
    targets = numpy.zeros(output_nodes) + 0.01
    # all_values[0] is the target label for this record
    targets[int(all_values[0])] = 0.99
    n.train(inputs, targets)
    pass

# load the mnist test data CSV file into a list
test_data_file = open("TestandTrain/mnist_test_10.csv", 'r')
test_data_list = test_data_file.readlines()
test_data_file.close()

# get the third test record (index 2) and print its label
all_values = test_data_list[2].split(',')
print('label:', all_values[0])

# plot the image and query the network
image_array = numpy.asfarray(all_values[1:]).reshape((28, 28))
matplotlib.pyplot.imshow(image_array, cmap='Greys', interpolation='None')
n.query((numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01)
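The code above stops after querying a single record. As a natural next step, here is a sketch (not from the original code) that scores the network over the whole test set, reusing the n and test_data_list defined above; the scorecard variable name is illustrative:

# go through every record in the test set and keep a scorecard
scorecard = []
for record in test_data_list:
    all_values = record.split(',')
    # the first value is the correct label
    correct_label = int(all_values[0])
    # query the network with scaled and shifted inputs
    outputs = n.query((numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01)
    # the network's answer is the index of the highest output
    if numpy.argmax(outputs) == correct_label:
        scorecard.append(1)
    else:
        scorecard.append(0)

print('performance =', sum(scorecard) / len(scorecard))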
