Building, Training, and Querying a Neural Network

Abstract: Following the code in the book on neural network programming (Tariq Rashid's Make Your Own Neural Network), these notes walk through a first encounter with neural networks, building one up step by step and using it to recognise handwritten digits.

The three main jobs of a neural network

  1. Initialisation – set the number of input, hidden and output nodes.

  2. Training – refine the weights after being shown examples from the training set.

  3. Query – given an input, return an answer from the output nodes.

The network class – initialisation

def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
    # set number of nodes in each input, hidden, output layer
    self.inodes = inputnodes
    self.hnodes = hiddennodes
    self.onodes = outputnodes

    # link weight matrices, wih and who
    # (w: weight, i: input, h: hidden, o: output; wih holds the
    #  input-to-hidden weights and who the hidden-to-output weights)
    # weights inside the arrays are w_i_j, where the link is from
    # node i to node j in the next layer
    # w11 w21
    # w12 w22 etc
    self.wih = (numpy.random.rand(self.hnodes, self.inodes) - 0.5)
    self.who = (numpy.random.rand(self.onodes, self.hnodes) - 0.5)

    # learning rate
    self.lr = learningrate

    # activation function is the sigmoid function
    # (requires import scipy.special; expit is wrapped in an
    #  anonymous lambda that takes x and returns expit(x))
    self.activation_function = lambda x: scipy.special.expit(x)

    pass

Points worth noting:

  • The method name __init__ has two underscores on each side; with only a single underscore you get TypeError: object() takes no parameters.

Reference: https://blog.csdn.net/qq_26489165/article/details/80595864

  • Link weight matrices: in wih, w stands for weight, i for input and h for hidden; who is named the same way.
self.wih = (numpy.random.rand(self.hnodes, self.inodes) - 0.5)
self.who = (numpy.random.rand(self.onodes, self.hnodes) - 0.5)
  • The activation function is created with a lambda: the anonymous function takes x and returns scipy.special.expit(x), which is the sigmoid (S-shaped) function.
self.activation_function = lambda x: scipy.special.expit(x)
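As a side note, numpy.random.rand draws the initial weights uniformly from -0.5 to +0.5. A common refinement, shown here only as an optional sketch and not used in the code above, samples each weight instead from a normal distribution centred on zero with standard deviation 1/sqrt(number of incoming links):

# optional alternative initialisation (not used in the code above):
# normally distributed weights, std dev = 1/sqrt(incoming links)
self.wih = numpy.random.normal(0.0, pow(self.inodes, -0.5), (self.hnodes, self.inodes))
self.who = numpy.random.normal(0.0, pow(self.hnodes, -0.5), (self.onodes, self.hnodes))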

The network class – query

# query the neural network
def query(self, inputs_list):
    # convert inputs list to 2d array
    inputs = numpy.array(inputs_list, ndmin=2).T

    # calculate signals into hidden layer
    hidden_inputs = numpy.dot(self.wih, inputs)
    # calculate the signals emerging from hidden layer
    hidden_outputs = self.activation_function(hidden_inputs)
    # calculate signals into final output layer
    final_inputs = numpy.dot(self.who, hidden_outputs)
    # calculate the signals emerging from final output layer
    final_outputs = self.activation_function(final_inputs)
    return final_outputs
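In matrix form, query() computes the following feedforward pass, where $I$ is the input column vector, $W_{ih}$ and $W_{ho}$ are self.wih and self.who, and $\sigma$ is the sigmoid:

$$X_{hidden} = W_{ih} \cdot I, \qquad O_{hidden} = \sigma(X_{hidden})$$
$$X_{final} = W_{ho} \cdot O_{hidden}, \qquad O_{final} = \sigma(X_{final})$$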

The network class – train

# train the neural network
def train(self, input_list, targets_list):
    # the feedforward code below is almost identical to query();
    # the extra targets array comes from the training examples
    # convert inputs list to 2d array
    inputs = numpy.array(input_list, ndmin=2).T
    targets = numpy.array(targets_list, ndmin=2).T

    # calculate signals into hidden layer
    hidden_inputs = numpy.dot(self.wih, inputs)
    # calculate the signals emerging from hidden layer
    hidden_outputs = self.activation_function(hidden_inputs)
    # calculate signals into final output layer
    final_inputs = numpy.dot(self.who, hidden_outputs)
    # calculate the signals emerging from final output layer
    final_outputs = self.activation_function(final_inputs)

    # error is the (target - actual), i.e. the error to back-propagate
    output_errors = targets - final_outputs
    # hidden layer error is the output_errors, split by weights,
    # recombined at hidden nodes
    hidden_errors = numpy.dot(self.who.T, output_errors)
    # update the weights for the links between the hidden and output layers
    # (self.lr is the learning rate; numpy.dot does the matrix multiplication)
    self.who += self.lr * numpy.dot((output_errors * final_outputs * (1.0 - final_outputs)), numpy.transpose(hidden_outputs))
    # update the weights for the links between the input and hidden layers
    self.wih += self.lr * numpy.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)), numpy.transpose(inputs))
    pass

Points to note:

  1. The feedforward code is almost identical to query(), because the signal flow from the input layer to the final output layer is exactly the same; the additional targets array is what the training examples supply.
  2. The error is (target - actual), which is the error that gets back-propagated.
  3. self.lr is the learning rate, and numpy.dot performs the matrix multiplications.
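Written out, both weight updates implement the same matrix expression, where $\alpha$ is the learning rate self.lr, $E$ is the error at a layer, $O$ its output, $O_{prev}$ the output of the preceding layer, and $*$ denotes element-wise multiplication:

$$\Delta W = \alpha \cdot \left( E * O * (1 - O) \right) \cdot O_{prev}^{T}$$

The factor $O * (1 - O)$ is the derivative of the sigmoid written in terms of its own output.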

Creating the neural network object

#number of input, hidden and output nodes
input_nodes = 784
hidden_nodes = 100
output_nodes = 10

#learning rate is 0.3
learning_rate = 0.3

#create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

Why 784 input nodes? Remember that 784 = 28 * 28, the number of pixels that make up a handwritten-digit image.

There is no fixed rule behind the choice of 100 hidden nodes. The book's reasoning is that a neural network should find features or patterns in the input that can be expressed more compactly than the input itself, which is why no number larger than 784 was chosen. Picking fewer hidden nodes than input nodes forces the network to try to summarise the key features of the input; picking too few, however, limits the network's capacity. The 10 output nodes correspond to the 10 labels, the digits 0 to 9.

One point worth stressing: there is no best method for choosing the number of hidden nodes for a given problem. The best approach is to experiment until you find a number that works well for the problem you are solving.

Training the network

  • Opening the file
#load the mnist training data CSV file into a list
training_data_file = open("TestandTrain/mnist_train_100.csv",'r')
training_data_list = training_data_file.readlines()
training_data_file.close()

With the file placed under the working directory (here in the TestandTrain folder), it can be opened directly via the relative path.
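Each line of the CSV file is one record: the first comma-separated value is the label (0 to 9), and the remaining 784 values are the pixel intensities (0 to 255) of the 28x28 image. The training loop below relies on exactly this layout.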

  • The training loop
# train the neural network
# go through all records in the training data set
for record in training_data_list:
    # split the record by the ',' commas
    all_values = record.split(',')
    # scale and shift the inputs
    inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01
    targets = numpy.zeros(output_nodes) + 0.01
    # all_values[0] is the target label for this record
    targets[int(all_values[0])] = 0.99
    n.train(inputs, targets)
    pass
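As an illustration of what the loop builds, this standalone snippet (not part of the training code) prints the target vector for a record labelled 5:

import numpy

# target vector for label 5: all 0.01 except a 0.99 at index 5
targets = numpy.zeros(10) + 0.01
targets[5] = 0.99
print(targets)   # [0.01 0.01 0.01 0.01 0.01 0.99 0.01 0.01 0.01 0.01]

The values 0.01 and 0.99 are used instead of 0 and 1 because a sigmoid output can only approach those extremes asymptotically; rescaling the inputs into the range 0.01 to 1.00 likewise avoids zero-valued inputs, which would zero out the corresponding weight updates.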

Querying the network

  • Opening the file
#load the mnist test data CSV file into a list
test_data_file = open("TestandTrain/mnist_test_10.csv",'r')
test_data_list = test_data_file.readlines()
test_data_file.close()
  • Printing the label

We then display the image with matplotlib and look at the output probabilities returned by the query.

# get the third record from the test set (index 2)
all_values = test_data_list[2].split(',')
# print the label
print('label:', all_values[0])

# reshape the 784 pixel values into a 28x28 array and plot it
image_array = numpy.asfarray(all_values[1:]).reshape((28,28))
matplotlib.pyplot.imshow(image_array, cmap='Greys', interpolation='None')

# query the network with the scaled and shifted pixel values
n.query((numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01)
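query() returns a column of ten output values between 0 and 1; the network's answer is the index of the largest one. A minimal follow-up sketch (the outputs variable is introduced here for illustration):

outputs = n.query((numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01)
print('network answer:', numpy.argmax(outputs))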

Full code

import numpy
# scipy.special for the sigmoid function expit()
import scipy.special
# plotting library
import matplotlib.pyplot
# Jupyter notebook magic to show plots inline
%matplotlib inline

# neural network class definition
class neuralNetwork:

    # initialise the neural network
    # note: __init__ has double underscores on both sides, otherwise
    # you get: TypeError: object() takes no parameters
    # reference: https://blog.csdn.net/qq_26489165/article/details/80595864
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes

        # link weight matrices, wih and who
        # (w: weight, i: input, h: hidden, o: output)
        # weights inside the arrays are w_i_j, where the link is from
        # node i to node j in the next layer
        # w11 w21
        # w12 w22 etc
        self.wih = (numpy.random.rand(self.hnodes, self.inodes) - 0.5)
        self.who = (numpy.random.rand(self.onodes, self.hnodes) - 0.5)

        # learning rate
        self.lr = learningrate

        # activation function is the sigmoid function
        # (scipy.special.expit, wrapped in an anonymous lambda)
        self.activation_function = lambda x: scipy.special.expit(x)

        pass

    # train the neural network
    def train(self, input_list, targets_list):
        # the feedforward code is almost identical to query();
        # the extra targets array comes from the training examples
        # convert inputs list to 2d array
        inputs = numpy.array(input_list, ndmin=2).T
        targets = numpy.array(targets_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)

        # error is the (target - actual), i.e. the error to back-propagate
        output_errors = targets - final_outputs
        # hidden layer error is the output_errors, split by weights,
        # recombined at hidden nodes
        hidden_errors = numpy.dot(self.who.T, output_errors)
        # update the weights for the links between the hidden and output layers
        # (self.lr is the learning rate; numpy.dot does the matrix multiplication)
        self.who += self.lr * numpy.dot((output_errors * final_outputs * (1.0 - final_outputs)), numpy.transpose(hidden_outputs))
        # update the weights for the links between the input and hidden layers
        self.wih += self.lr * numpy.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)), numpy.transpose(inputs))
        pass

    # query the neural network
    def query(self, inputs_list):
        # convert inputs list to 2d array
        inputs = numpy.array(inputs_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
        return final_outputs

# number of input, hidden and output nodes
input_nodes = 784
hidden_nodes = 100
output_nodes = 10

# learning rate is 0.3
learning_rate = 0.3

# create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

# load the mnist training data CSV file into a list
training_data_file = open("TestandTrain/mnist_train_100.csv", 'r')
training_data_list = training_data_file.readlines()
training_data_file.close()

# train the neural network
# go through all records in the training data set
for record in training_data_list:
    # split the record by the ',' commas
    all_values = record.split(',')
    # scale and shift the inputs
    inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01
    targets = numpy.zeros(output_nodes) + 0.01
    # all_values[0] is the target label for this record
    targets[int(all_values[0])] = 0.99
    n.train(inputs, targets)
    pass

# load the mnist test data CSV file into a list
test_data_file = open("TestandTrain/mnist_test_10.csv", 'r')
test_data_list = test_data_file.readlines()
test_data_file.close()

# get the third test record (index 2) and print its label
all_values = test_data_list[2].split(',')
print('label:', all_values[0])

# plot the image and query the network
image_array = numpy.asfarray(all_values[1:]).reshape((28, 28))
matplotlib.pyplot.imshow(image_array, cmap='Greys', interpolation='None')
n.query((numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01)
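The code above stops after querying a single record. As a natural next step, here is a sketch (not from the original code) that scores the network over the whole test set, reusing the n and test_data_list defined above; the scorecard variable name is illustrative:

# go through every record in the test set and keep a scorecard
scorecard = []
for record in test_data_list:
    all_values = record.split(',')
    # the first value is the correct label
    correct_label = int(all_values[0])
    # query the network with scaled and shifted inputs
    outputs = n.query((numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01)
    # the network's answer is the index of the highest output
    if numpy.argmax(outputs) == correct_label:
        scorecard.append(1)
    else:
        scorecard.append(0)

print('performance =', sum(scorecard) / len(scorecard))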
