In Detail: Basic Concepts of Building Neural Networks with TensorFlow, an Overview

1#: placeholder

x = tf.compat.v1.placeholder(tf.float32, shape=[None, w, h, c], name='x')
y_ = tf.compat.v1.placeholder(tf.int32, shape=[None, ], name='y_')

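The placeholder function is defined as follows:

tf.placeholder(dtype, shape=None, name=None). A placeholder reserves a slot in the graph. In TensorFlow it plays a role similar to a function parameter: a value must be fed in at run time. In TensorFlow 2.x the same function is available as tf.compat.v1.placeholder, as used above.

dtype: the data type, usually a numeric type such as tf.float32 or tf.float64.
shape: the shape of the data. The default is None, which means a tensor of any shape can be fed, so the argument does not have to be given. A shape can also be stated explicitly, for example [2, 3]; [None, 3] means 3 columns and an unspecified number of rows.
name: a name for the operation, such as the 'x' and 'y_' used above.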
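
To make the meaning of None in a shape concrete, here is a minimal pure-Python sketch of the compatibility rule. The helper name matches_placeholder_shape is hypothetical and is not part of TensorFlow; it only mirrors the rule described above.

```python
def matches_placeholder_shape(declared, actual):
    """Return True if a concrete shape can be fed to a declared placeholder shape.

    `declared` uses None for a dimension of any size; `declared=None`
    means the placeholder accepts a tensor of any shape.
    (Hypothetical helper for illustration, not a TensorFlow function.)
    """
    if declared is None:
        return True
    if len(declared) != len(actual):
        return False
    return all(d is None or d == a for d, a in zip(declared, actual))


print(matches_placeholder_shape([None, 3], (5, 3)))   # True
print(matches_placeholder_shape([None, 3], (5, 4)))   # False
print(matches_placeholder_shape(None, (2, 2, 2)))     # True
```

In other words, [None, 3] fixes the number of columns and leaves the batch dimension open.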

For example, to compute 3 * 4 = 12:

import tensorflow as tf
import numpy as np
input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)
output = tf.multiply(input1, input2)
with tf.Session() as sess:
    print(sess.run(output, feed_dict={input1: [3.], input2: [4.]}))  # prints [12.]

Multiplying a matrix by itself (x * x):

import tensorflow as tf
import numpy as np
x = tf.placeholder(tf.float32, shape=(1024, 1024))
y = tf.matmul(x, x)
with tf.Session() as sess:
  # print(sess.run(y))  # ERROR: fails because x was not fed.
  rand_array = np.random.rand(1024, 1024)
  print(sess.run(y, feed_dict={x: rand_array}))  # Will succeed.

Matrix operations with library functions:

import tensorflow as tf
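# Define the placeholders
input1 = tf.placeholder(tf.float32, shape=(1, 2), name="input-1")
input2 = tf.placeholder(tf.float32, shape=(2, 1), name="input-2")
# Define the matrix multiplication (note the difference: matmul is the matrix product, multiply is element-wise multiplication)
output = tf.matmul(input1, input2)
# Run the multiplication in a session
with tf.Session() as sess:
    # Values for the placeholders must be fed at run time, and their shapes must match (1, 2) and (2, 1)
    print(sess.run(output, feed_dict={input1: [[1, 2]], input2: [[3], [4]]}))
    # Result: [[11.]]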
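
The difference between matmul and multiply can be seen without TensorFlow. The following is a plain-Python sketch on nested lists; the helper names are made up for illustration, and unlike tf.multiply the element-wise helper does no broadcasting.

```python
def elementwise_multiply(a, b):
    """Multiply matching entries of two equally shaped matrices."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def matrix_multiply(a, b):
    """Row-by-column product of an (m x k) matrix and a (k x n) matrix."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]


row = [[1, 2]]      # shape (1, 2)
col = [[3], [4]]    # shape (2, 1)
print(matrix_multiply(row, col))                 # [[11]]
print(elementwise_multiply([[1, 2]], [[3, 4]]))  # [[3, 8]]
```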

2#: Convolution and pooling

Convolutional layer

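tf.nn.conv2d(input, filter, strides=, padding=, name=None)
Computes a 2-D convolution given a 4-D input tensor and a 4-D filter tensor.
	* input: the input tensor, with shape [batch, height, width, channel] and type float32 or float64
	* filter: the filter tensor, with shape [filter_height, filter_width, in_channels, out_channels], where out_channels is the number of filters
	* strides: the step size, strides = [1, stride, stride, 1]
	* padding: the padding algorithm, either "SAME" or "VALID". "VALID" discards the part where the window would slide past the edge; "SAME" pads with zeros so that, with a stride of 1, the output height and width equal the input's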
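
As a rough illustration of what the sliding window computes, here is a single-channel sketch in plain Python with no padding (the VALID case). It ignores the batch and channel dimensions that tf.nn.conv2d handles, and the function name conv2d_valid is an assumption made for this example. Like tf.nn.conv2d, it does not flip the kernel.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` (both 2-D lists) without padding."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Multiply the window by the kernel entry by entry and sum the result.
            row.append(sum(image[top + di][left + dj] * kernel[di][dj]
                           for di in range(k_h) for dj in range(k_w)))
        output.append(row)
    return output


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]
print(conv2d_valid(image, kernel))  # [[6, 8], [12, 14]]
```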

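A newer activation function: ReLU, f(x) = max(0, x)
tf.nn.relu(features, name=None)
features: the result of the convolution plus the bias
return: the activated tensor
	1. With functions such as sigmoid, computing the error gradient during backpropagation is relatively expensive; with the ReLU activation the whole process needs far less computation.
	2. In deep networks, sigmoid easily causes vanishing gradients during backpropagation, so the weights and biases of the early layers can barely be updated.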
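
The vanishing-gradient point can be checked with a few lines of arithmetic. This sketch multiplies the activation slopes along a chain of 10 layers and deliberately ignores the weights, which is a simplification; the layer count is an arbitrary choice for illustration.

```python
import math


def relu(x):
    return max(0.0, x)


def relu_grad(x):
    return 1.0 if x > 0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)


layers = 10
# Best case for sigmoid: every pre-activation is 0, where its slope peaks at 0.25.
sigmoid_chain = sigmoid_grad(0.0) ** layers
# For ReLU, any positive pre-activation has slope 1.
relu_chain = relu_grad(1.0) ** layers

print(sigmoid_chain)  # 0.25 ** 10, roughly 9.5e-07
print(relu_chain)     # 1.0
```

Even in the most favourable case, the sigmoid slopes alone shrink the gradient by about six orders of magnitude over 10 layers, while the ReLU slopes leave it unchanged for active units.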

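Pooling layer computation

The main role of the pooling layer is feature extraction: it discards the less important samples in the feature map, which further reduces the amount of data and the number of parameters in later layers. (Deciding which samples are unimportant is a hard question.) There are many pooling methods; the most common is max pooling.

tf.nn.max_pool(value, ksize=, strides=, padding=, name=None)

Performs max pooling on the input.
	* value: a 4-D tensor of shape [batch, height, width, channels]
	* ksize: the pooling window size, [1, ksize, ksize, 1]
	* strides: the step size, [1, strides, strides, 1]
	* padding: the padding algorithm, "SAME" or "VALID"; "SAME" is used here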
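
A 2x2 max pooling step on a single feature map can be sketched in plain Python as follows. The helper max_pool_2d is hypothetical, works on one 2-D list only, and uses no padding, so it corresponds to the VALID case rather than the SAME case mentioned above.

```python
def max_pool_2d(feature_map, ksize=2, stride=2):
    """Max pooling over a 2-D list with no padding (VALID behaviour)."""
    out_h = (len(feature_map) - ksize) // stride + 1
    out_w = (len(feature_map[0]) - ksize) // stride + 1
    return [[max(feature_map[i * stride + di][j * stride + dj]
                 for di in range(ksize) for dj in range(ksize))
             for j in range(out_w)]
            for i in range(out_h)]


feature_map = [[1, 3, 2, 4],
               [5, 6, 1, 2],
               [7, 2, 9, 0],
               [1, 8, 3, 4]]
print(max_pool_2d(feature_map))  # [[6, 4], [8, 9]]
```

Each 2x2 window is replaced by its largest value, so a 4x4 map becomes 2x2.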

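Fully connected layer

The convolution and pooling layers above are comparable to feature engineering, and the fully connected layers that follow are comparable to feature weighting. The final fully connected layer acts as the "classifier" of the whole convolutional neural network.

tf.reshape(tensor, shape) reshapes tensor into the form given by shape.

shape is a list. It may contain -1, which means that dimension's size does not have to be specified: the function works it out automatically. Only one -1 is allowed in the list (with more than one, the equation would have multiple solutions).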
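
How the -1 dimension gets filled in can be sketched in plain Python. The function resolve_shape is a made-up helper that reproduces the rule described above, not the TensorFlow implementation.

```python
def resolve_shape(total_elements, shape):
    """Replace a single -1 in `shape` with the size that makes the product fit."""
    if shape.count(-1) > 1:
        raise ValueError("only one dimension may be -1")
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    if -1 not in shape:
        if known != total_elements:
            raise ValueError("shape does not match the number of elements")
        return list(shape)
    if known == 0 or total_elements % known != 0:
        raise ValueError("cannot infer the -1 dimension")
    return [total_elements // known if dim == -1 else dim for dim in shape]


# A batch of 100 flattened 28x28 images holds 100 * 784 values.
print(resolve_shape(100 * 784, [-1, 28, 28, 1]))          # [100, 28, 28, 1]
print(resolve_shape(100 * 7 * 7 * 64, [-1, 7 * 7 * 64]))  # [100, 3136]
```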

# TensorFlow 1.x API (tensorflow.examples.tutorials was removed in TensorFlow 2.x)
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Input layer
# Prepare the placeholders
with tf.variable_scope('data'):
    x = tf.placeholder(tf.float32, [None, 784])
    y_true = tf.placeholder(tf.float32, [None, 10])

# Convolutional layers
# Convolution 1
with tf.variable_scope('conv1'):
    # Initialise the weights: 3x3 window, stride 1, 32 filters
    weight1 = tf.Variable(tf.random_normal(shape=[3, 3, 1, 32]))
    bias1 = tf.Variable(tf.constant(1.0, shape=[32]))
    x_reshaped = tf.reshape(x, [-1, 28, 28, 1])
    # x [None, 28, 28, 1] ----> [None, 28, 28, 32]
    conved1 = tf.nn.conv2d(input=x_reshaped, filter=weight1, strides=[1, 1, 1, 1], padding='SAME')
    print(conved1)
    # Add the bias before applying the activation
    relu1 = tf.nn.relu(conved1 + bias1)
    print(relu1)

# Pooling 1
with tf.variable_scope('pool'):
    # 2x2 window, stride 2     x [None, 28, 28, 32] ----> [None, 14, 14, 32]
    pool1 = tf.nn.max_pool(value=relu1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    print(pool1)

# Convolution 2
with tf.variable_scope('conv2'):
    # 3x3 window, stride 1, 64 filters
    weight2 = tf.Variable(tf.random_normal([3, 3, 32, 64]))
    bias2 = tf.Variable(tf.constant(1.0, shape=[64]))
    # x [None, 14, 14, 32] ----> [None, 14, 14, 64]
    conved2 = tf.nn.conv2d(input=pool1, filter=weight2, strides=[1, 1, 1, 1], padding='SAME')
    print(conved2)
    relu2 = tf.nn.relu(conved2 + bias2)

# Pooling 2
with tf.variable_scope('pool2'):
    # 2x2 window, stride 2     x [None, 14, 14, 64] ----> [None, 7, 7, 64]
    pool2 = tf.nn.max_pool(value=relu2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# Fully connected layer  x[None, 7, 7, 64] -> x[None, 7*7*64] * ([7*7*64, 10]) ----> y[None, 10]
with tf.variable_scope('full_conn'):
    x_fc = tf.reshape(pool2, [-1, 7*7*64])
    weight_fc = tf.Variable(tf.random_normal([7*7*64, 10]))
    bias_fc = tf.Variable(tf.constant(1.0, shape=[10]))
    y_predict = tf.matmul(x_fc, weight_fc) + bias_fc

# Cross-entropy loss
with tf.variable_scope('loss'):
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_predict))
    train_op = tf.train.GradientDescentOptimizer(0.001).minimize(loss)

# Accuracy
with tf.variable_scope('acc'):
    equal_list = tf.equal(tf.argmax(y_true, 1), tf.argmax(y_predict, 1))
    accuracy = tf.reduce_mean(tf.cast(equal_list, tf.float32))

init_var = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init_var)
    mnist = input_data.read_data_sets('./input_data', one_hot=True)
    for i in range(1000):
        image, label = mnist.train.next_batch(100)
        sess.run(train_op, feed_dict={x: image, y_true: label})
        print('Step %s, accuracy: %s' % (i, sess.run(accuracy, feed_dict={x: image, y_true: label})))
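
The loss and accuracy at the end of the script can be reproduced by hand on a tiny batch. The following plain-Python sketch computes a softmax cross-entropy and an arg-max accuracy on made-up labels and logits; it is meant to show the arithmetic, not to replace the TensorFlow ops.

```python
import math


def softmax(logits):
    """Numerically stable softmax for one row of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot_label, logits):
    """Cross-entropy between a one-hot label and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))


def accuracy(labels, batch_logits):
    """Fraction of rows whose arg-max logit matches the arg-max label."""
    hits = sum(row.index(max(row)) == lab.index(max(lab))
               for lab, row in zip(labels, batch_logits))
    return hits / len(labels)


# Made-up values for illustration: two samples, three classes.
labels = [[0, 1, 0], [1, 0, 0]]
logits = [[0.5, 2.0, 0.1], [0.2, 1.5, 0.3]]
mean_loss = sum(cross_entropy(l, z) for l, z in zip(labels, logits)) / len(labels)
print(round(mean_loss, 4))
print(accuracy(labels, logits))  # 0.5
```

The first sample is classified correctly and the second is not, so the accuracy is 0.5.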

A model example:

def model(input_tensor, train, regularizer):
    with tf.variable_scope('layer1-conv1'):  # Variable scope layer1-conv1 (for variables); the same variable name can be reused under different scopes
        conv1_weights = tf.get_variable("weight", [5, 5, 3, 32],
                                        initializer=tf.truncated_normal_initializer(stddev=0.1))
        # Weight variable named "weight": 5, 5 is the kernel size, 3 the number of input channels, 32 the number of output channels; the initializer draws the initial values from a truncated normal distribution with a standard deviation of 0.1
        conv1_biases = tf.get_variable("bias", [32], initializer=tf.constant_initializer(0.0))
        # Bias variable named "bias": [32] is the depth of this layer; tf.constant_initializer initialises it to 0 (tf.zeros_initializer or tf.ones_initializer can also be used)
        conv1 = tf.nn.conv2d(input_tensor, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
        # The convolutional layer: input_tensor is the node matrix of the current layer; conv1_weights holds the layer's weights; strides gives the step in each dimension; padding selects the padding mode: 'SAME' pads with zeros, 'VALID' does not pad
        relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_biases))  # Activation: bias_add adds the bias to every node, then relu introduces the non-linearity
    with tf.name_scope("layer2-pool1"):  # Name scope layer2-pool1 (for ops)
        # The pooling layer effectively shrinks the matrix, which reduces the number of parameters in the final fully connected layers; it both speeds up computation and helps prevent overfitting
        pool1 = tf.nn.max_pool(relu1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="VALID")
        # ksize is the size of the pooling window; its first and last values must be 1, and the most common choices are [1, 2, 2, 1] and [1, 3, 3, 1]. strides is the step of the window; its first and last values must be 1. padding is the padding mode
    with tf.variable_scope("layer3-conv2"):  # Variable scope (for variables)
        # Weights
        conv2_weights = tf.get_variable("weight", [5, 5, 32, 64],
                                        initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv2_biases = tf.get_variable("bias", [64], initializer=tf.constant_initializer(0.0))  # Bias
        conv2 = tf.nn.conv2d(pool1, conv2_weights, strides=[1, 1, 1, 1], padding='SAME')  # Convolutional layer
        relu2 = tf.nn.relu(tf.nn.bias_add(conv2, conv2_biases))  # Activation
    with tf.name_scope("layer4-pool2"):  # Name scope (for ops)
        pool2 = tf.nn.max_pool(relu2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')  # Pooling layer
    with tf.variable_scope("layer5-conv3"):  # Variable scope (for variables)
        # Weights
        conv3_weights = tf.get_variable("weight", [3, 3, 64, 128],
                                        initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv3_biases = tf.get_variable("bias", [128], initializer=tf.constant_initializer(0.0))  # Bias
        conv3 = tf.nn.conv2d(pool2, conv3_weights, strides=[1, 1, 1, 1], padding='SAME')  # Convolutional layer
        relu3 = tf.nn.relu(tf.nn.bias_add(conv3, conv3_biases))  # Activation
    with tf.name_scope("layer6-pool3"):  # Name scope (for ops)
        pool3 = tf.nn.max_pool(relu3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')  # Pooling layer
    with tf.variable_scope("layer7-conv4"):  # Variable scope (for variables)
        # Weights
        conv4_weights = tf.get_variable("weight", [3, 3, 128, 128],
                                        initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv4_biases = tf.get_variable("bias", [128], initializer=tf.constant_initializer(0.0))  # Bias
        conv4 = tf.nn.conv2d(pool3, conv4_weights, strides=[1, 1, 1, 1], padding='SAME')  # Convolutional layer
        relu4 = tf.nn.relu(tf.nn.bias_add(conv4, conv4_biases))  # Activation
    with tf.name_scope("layer8-pool4"):  # Name scope (for ops)
        pool4 = tf.nn.max_pool(relu4, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')  # Pooling layer

        nodes = 6 * 6 * 128  # The 6 x 6 spatial size holds for inputs such as 100 x 100
        reshaped = tf.reshape(pool4, [-1, nodes])
        print("shape of reshaped:", reshaped.shape)  # reshape flattens the output of pool4 into one vector per sample
    # Variable scope:
    with tf.variable_scope('layer9-fc1'):
        # Weights of the fully connected layer:
        fc1_weights = tf.get_variable("weight", [nodes, 1024], initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None: tf.add_to_collection('losses', regularizer(fc1_weights))
        # Add a regularisation term for the fully connected weights; tf.add_to_collection puts a value into a named collection (a list), so many terms can be gathered together
        fc1_biases = tf.get_variable("bias", [1024], initializer=tf.constant_initializer(0.1))  # Bias of the fully connected layer
        fc1 = tf.nn.relu(tf.matmul(reshaped, fc1_weights) + fc1_biases)  # Activation
        if train: fc1 = tf.nn.dropout(fc1, 0.5)  # During training only, add dropout after the fully connected layer to reduce overfitting
    with tf.variable_scope('layer10-fc2'):
        fc2_weights = tf.get_variable("weight", [1024, 512], initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None: tf.add_to_collection('losses', regularizer(fc2_weights))
        fc2_biases = tf.get_variable("bias", [512], initializer=tf.constant_initializer(0.1))
        fc2 = tf.nn.relu(tf.matmul(fc1, fc2_weights) + fc2_biases)
        if train: fc2 = tf.nn.dropout(fc2, 0.5)
    with tf.variable_scope('layer11-fc3'):
        fc3_weights = tf.get_variable("weight", [512, 5],
                                      initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None: tf.add_to_collection('losses', regularizer(fc3_weights))
        fc3_biases = tf.get_variable("bias", [5], initializer=tf.constant_initializer(0.1))
        logit = tf.matmul(fc2, fc3_weights) + fc3_biases
    return logit
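
The constant nodes = 6 * 6 * 128 depends on the input size, which the function does not state. As a sanity check, this plain-Python sketch follows one spatial dimension through the four conv + pool pairs, assuming a 100x100 input; that input size is an assumption chosen because it yields 6, and the helper names are hypothetical.

```python
def same_conv(size, stride=1):
    """Spatial size after a SAME-padded convolution: ceil(size / stride)."""
    return -(-size // stride)


def valid_pool(size, ksize=2, stride=2):
    """Spatial size after a VALID pooling window: floor((size - ksize) / stride) + 1."""
    return (size - ksize) // stride + 1


def trace_spatial_size(input_size, num_blocks=4):
    """Follow one spatial dimension through `num_blocks` conv + pool pairs."""
    sizes = [input_size]
    size = input_size
    for _ in range(num_blocks):
        size = valid_pool(same_conv(size))
        sizes.append(size)
    return sizes


sizes = trace_spatial_size(100)      # the 100x100 input is an assumption
print(sizes)                         # [100, 50, 25, 12, 6]
print(sizes[-1] * sizes[-1] * 128)   # 4608, i.e. 6 * 6 * 128
```

With a different input size the final spatial size changes, and nodes would have to be adjusted accordingly.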

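import logging


# The same network written against the tf.compat.v1 API. Here `regularizer` is a scalar L2 coefficient (or None).
def inference(input_tensor, train, regularizer):
    with tf.compat.v1.variable_scope('layer1-conv1'):
        # Weight variable named "weight": 5, 5 is the kernel size, 3 the number of input channels, 32 the number of output channels.
        # The initializer draws the initial values from a truncated normal distribution with a standard deviation of 0.1.
        conv1_weights = tf.compat.v1.get_variable("weight", [5, 5, 3, 32],
                                                  initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1))
        # Initialise the bias to 0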
        conv1_biases = tf.compat.v1.get_variable("bias", [32], initializer=tf.constant_initializer(0.0))
        conv1 = tf.nn.conv2d(input_tensor, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_biases))


    with tf.name_scope("layer2-pool1"):
        pool1 = tf.compat.v1.nn.max_pool(relu1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="VALID")
        logging.info(f"First convolutional layer:{pool1}")


    with tf.compat.v1.variable_scope("layer3-conv2"):
        conv2_weights = tf.compat.v1.get_variable("weight", [5, 5, 32, 64],
                                                  initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1))
        conv2_biases = tf.compat.v1.get_variable("bias", [64], initializer=tf.compat.v1.constant_initializer(0.0))
        conv2 = tf.compat.v1.nn.conv2d(pool1, conv2_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu2 = tf.compat.v1.nn.relu(tf.nn.bias_add(conv2, conv2_biases))


    with tf.name_scope("layer4-pool2"):
        pool2 = tf.compat.v1.nn.max_pool(relu2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
        logging.info(f"Second convolutional layer:{pool2}")


    with tf.compat.v1.variable_scope("layer5-conv3"):
        conv3_weights = tf.compat.v1.get_variable("weight", [3, 3, 64, 128],
                                                  initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1))
        conv3_biases = tf.compat.v1.get_variable("bias", [128], initializer=tf.compat.v1.constant_initializer(0.0))
        conv3 = tf.compat.v1.nn.conv2d(pool2, conv3_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu3 = tf.compat.v1.nn.relu(tf.nn.bias_add(conv3, conv3_biases))


    with tf.compat.v1.name_scope("layer6-pool3"):
        pool3 = tf.compat.v1.nn.max_pool(relu3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
        logging.info(f"Third convolutional layer:{pool3}")


    with tf.compat.v1.variable_scope("layer7-conv4"):
        conv4_weights = tf.compat.v1.get_variable("weight", [3, 3, 128, 128],
                                                  initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1))
        conv4_biases = tf.compat.v1.get_variable("bias", [128], initializer=tf.compat.v1.constant_initializer(0.0))
        conv4 = tf.compat.v1.nn.conv2d(pool3, conv4_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu4 = tf.compat.v1.nn.relu(tf.nn.bias_add(conv4, conv4_biases))


    with tf.compat.v1.name_scope("layer8-pool4"):
        pool4 = tf.compat.v1.nn.max_pool(relu4, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
        logging.info(f"The fourth convolutional layer:{pool4}")
        nodes = 6 * 6 * 128
        # Flatten
        reshaped = tf.compat.v1.reshape(pool4, [-1, nodes])


    with tf.compat.v1.variable_scope('layer9-fc1'):
        fc1_weights = tf.compat.v1.get_variable("weight", [nodes, 1024],
                                                initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None: tf.compat.v1.add_to_collection('losses', regularizer * tf.nn.l2_loss(fc1_weights))
        fc1_biases = tf.compat.v1.get_variable("bias", [1024], initializer=tf.compat.v1.constant_initializer(0.1))


        fc1 = tf.compat.v1.nn.relu(tf.compat.v1.matmul(reshaped, fc1_weights) + fc1_biases)
        logging.info(f"The first fully connected layer:{fc1}")
        if train: fc1 = tf.compat.v1.nn.dropout(fc1, 0.5)


    with tf.compat.v1.variable_scope('layer10-fc2'):
        fc2_weights = tf.compat.v1.get_variable("weight", [1024, 512],
                                                initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None: tf.compat.v1.add_to_collection('losses', regularizer * tf.nn.l2_loss(fc2_weights))
        fc2_biases = tf.compat.v1.get_variable("bias", [512], initializer=tf.compat.v1.constant_initializer(0.1))


        fc2 = tf.compat.v1.nn.relu(tf.matmul(fc1, fc2_weights) + fc2_biases)
        if train: fc2 = tf.compat.v1.nn.dropout(fc2, 0.5)
        logging.info(f"The second fully connected layer:{fc2}")


    with tf.compat.v1.variable_scope('layer11-fc3'):
        fc3_weights = tf.compat.v1.get_variable("weight", [512, 5],
                                                initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None: tf.compat.v1.add_to_collection('losses', regularizer * tf.nn.l2_loss(fc3_weights))
        fc3_biases = tf.compat.v1.get_variable("bias", [5], initializer=tf.compat.v1.constant_initializer(0.1))
        logit = tf.compat.v1.matmul(fc2, fc3_weights) + fc3_biases
    return logit
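
The regularisation terms added to the 'losses' collection are each a coefficient times tf.nn.l2_loss, which computes sum(w ** 2) / 2. The sketch below reproduces that arithmetic in plain Python. It assumes the collected terms are later summed with the data loss, a step the function above does not show, and all weight values are made up.

```python
def l2_loss(weights):
    """Half the sum of squared entries, matching tf.nn.l2_loss: sum(w ** 2) / 2."""
    return sum(w * w for row in weights for w in row) / 2.0


def total_loss(data_loss, weight_matrices, coefficient):
    """Data loss plus coefficient * l2_loss for every regularised matrix."""
    penalties = [coefficient * l2_loss(w) for w in weight_matrices]
    return data_loss + sum(penalties)


# Made-up weights standing in for two fully connected layers.
fc1 = [[1.0, -2.0], [0.5, 0.0]]
fc2 = [[3.0], [1.0]]
print(l2_loss(fc1))  # 2.625
print(total_loss(0.8, [fc1, fc2], coefficient=0.01))
```

A larger coefficient pushes the optimiser towards smaller weights at the expense of the data loss.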

conv1 = tf.nn.conv2d(input_tensor,conv1_weights,strides=[1,1,1,1],padding='SAME')

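This is a typical convolution operation: strides=[1, 1, 1, 1] means a sliding stride of 1, and padding='SAME' means zero padding.

To set the stride to 2, use strides=[1, 2, 2, 1]. Many readers may wonder what these four numbers stand for.

In the official definition, strides is a one-dimensional tensor with four elements whose first and last values must be 1, so only the middle two can be changed. With the [batch, height, width, channels] layout, they are the vertical (height) and horizontal (width) sliding steps, in that order. As the kernel moves across the image, the stride may leave a remainder too small for another kernel position. For example, with a 5*5 image, a 2*2 kernel and a stride of 2, the kernel fits twice along a row and one element is left over. Padding zeros after it so that the kernel can scan once more is the SAME mode; discarding the leftover element is the VALID mode.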
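
The 5*5 example can be checked with the usual output-size formulas for the two padding modes. This is a small sketch for one spatial dimension; the function names are chosen for illustration.

```python
import math


def output_size(input_size, kernel, stride, padding):
    """Output length along one dimension for 'SAME' or 'VALID' padding."""
    if padding == "SAME":
        return math.ceil(input_size / stride)
    if padding == "VALID":
        return (input_size - kernel) // stride + 1
    raise ValueError("padding must be 'SAME' or 'VALID'")


def same_padding_total(input_size, kernel, stride):
    """Total number of zeros 'SAME' adds along one dimension."""
    out = output_size(input_size, kernel, stride, "SAME")
    return max((out - 1) * stride + kernel - input_size, 0)


print(output_size(5, 2, 2, "VALID"))   # 2
print(output_size(5, 2, 2, "SAME"))    # 3
print(same_padding_total(5, 2, 2))     # 1
```

VALID yields 2 positions and drops the last element, while SAME adds one zero so that a third position fits.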

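Explanation of the function parameters:

tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)

Apart from name, which sets the name of the operation, five parameters affect the computation:

input:

The input images to convolve. It must be a Tensor of shape [batch, in_height, in_width, in_channels], that is, [number of images in a training batch, image height, image width, number of image channels]. Note that this is a 4-D Tensor, and its type must be float32 or float64.

filter:

The equivalent of the convolution kernel in a CNN (sometimes also called ksize). It must be a Tensor of shape [filter_height, filter_width, in_channels, out_channels], that is, [kernel height, kernel width, number of image channels, number of kernels], with the same type as input. Note that the third dimension, in_channels, must equal the fourth dimension of input.

strides:

The step along each dimension of the image during convolution: a one-dimensional vector of length 4. The two outer values are normally 1, as in [1, stride_height, stride_width, 1].

padding:

A string, either "SAME" or "VALID", which selects the convolution mode: SAME pads when the remainder is too small, VALID discards it.

use_cudnn_on_gpu:

A bool that controls whether cuDNN acceleration is used; the default is true.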
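
The filter shape also determines how many trainable values a convolutional layer has and whether it fits a given input. A brief plain-Python sketch, with hypothetical helper names and the shapes used in the model above:

```python
def conv_param_count(filter_shape):
    """Weights and biases for a filter of shape [height, width, in_ch, out_ch]."""
    height, width, in_channels, out_channels = filter_shape
    weights = height * width * in_channels * out_channels
    biases = out_channels  # one bias per output channel
    return weights, biases


def check_channels(input_shape, filter_shape):
    """The input's last dimension must equal the filter's third dimension."""
    return input_shape[3] == filter_shape[2]


print(conv_param_count([5, 5, 3, 32]))                       # (2400, 32)
print(check_channels([None, 100, 100, 3], [5, 5, 3, 32]))    # True
print(check_channels([None, 100, 100, 3], [5, 5, 32, 64]))   # False
```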

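Batch and Epoch

What is the difference between a batch and an epoch in a neural network? Stochastic gradient descent is a learning algorithm with a number of hyperparameters. Two of them are the batch size and the number of epochs, both integer values. "Batch" literally means a group; in deep learning it refers to the number of input samples used to compute the cost once.
Sample: one element of the data set, a single record.

Example 1: in a convolutional neural network, one image is one sample.

Example 2: in a speech recognition model, one audio clip is one sample.
Batch: a set of N samples. The samples in a batch are processed independently and in parallel. During training, the result of one batch is used to update the model only once.

The samples in a batch usually approximate the distribution of the whole input data better than a single input does, and the larger the batch, the better the approximation. However, each batch takes longer to process and still updates the model only once. For inference (evaluation/prediction), choose a batch as large as resources allow, because larger batches usually evaluate and predict faster.
Epoch: usually defined as "one pass over the entire data set". Epochs split training into distinct phases, which is useful for logging and periodic evaluation.

When validation_data or validation_split is used with the fit method of a Keras model, evaluation runs at the end of every epoch.

In Keras, callbacks designed to run at the end of an epoch can be added, for example for learning-rate changes and model checkpoints (saving).

A small example illustrates this.

Suppose you have a data set with 200 samples (rows of data) and you choose a batch size of 5 and 1,000 epochs.

The data set is then split into 40 batches of 5 samples each, and the model weights are updated after every batch of five samples.

This also means that one epoch involves 40 batches, or 40 model updates.

With 1,000 epochs, the model is exposed to the whole data set 1,000 times, for a total of 40,000 batches over the entire training process.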
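
The numbers in the 200-sample example follow from simple arithmetic, sketched below. Rounding up for a final partial batch is an assumption of this sketch; the example above divides evenly.

```python
import math


def training_schedule(num_samples, batch_size, epochs):
    """Return (batches per epoch, total weight updates)."""
    # Assumption: a final partial batch still counts as one batch.
    batches_per_epoch = math.ceil(num_samples / batch_size)
    return batches_per_epoch, batches_per_epoch * epochs


print(training_schedule(200, 5, 1000))   # (40, 40000)
```
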
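Pooling

Pooling normally follows a convolution. In essence, pooling is sampling: for the input feature map, it applies some rule to reduce its dimensions and compress it, which speeds up computation.
What pooling does:

(1) It keeps the main features while reducing the parameters and the amount of computation, which helps prevent overfitting.

(2) It provides a degree of invariance to small translations, rotations and changes of scale.

In the end, the pooling layer is a process of feature selection and information filtering. In other words, some information is lost, as a trade-off against computing performance. As computing speed keeps rising, I think this trade-off will matter less and less.

Some networks have already started to use fewer pooling layers, or none at all.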

Please credit the source when reposting. Link to this article: https://www.uj5u.com/qita/327858.html
