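This article uses the TensorFlow deep learning framework to demonstrate how a CNN convolutional layer works. First, import the required dependencies and check the TensorFlow version.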
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
tf.__version__
'2.7.4'
Define the weight matrix.
W = np.array([[0, 0, -1], [0, 1, 0], [-2, 0, 2]], dtype=np.int32)
W
array([[ 0,  0, -1],
       [ 0,  1,  0],
       [-2,  0,  2]], dtype=int32)
Define the bias term.
b = np.array([1], dtype=np.int32)
b
array([1], dtype=int32)
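Define a Sequential model that contains a single 2D convolutional layer, and initialize the layer's parameters with W and b. The input_shape argument is supplied so that the model summary can be printed. The summary reports 10 parameters in total, which is exactly the 9 weights in W plus the 1 bias in b.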
model = models.Sequential(
    [
        layers.Conv2D(
            input_shape=(7, 7, 1),
            filters=1,
            kernel_size=[3, 3],
            kernel_initializer=tf.constant_initializer(W),
            bias_initializer=tf.constant_initializer(b),
        )
    ]
)
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 conv2d (Conv2D)             (None, 5, 5, 1)           10
=================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
_________________________________________________________________
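Simulate a single-channel image, such as a grayscale image. Before it is passed to model, its dimensions must be expanded, because the convolutional layer requires an input with at least 4 dimensions: the first dimension is the batch and the last is the channel.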
image = np.array(
    [
        [0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 1, 2, 1, 0],
        [0, 0, 2, 2, 0, 1, 0],
        [0, 1, 1, 0, 2, 1, 0],
        [0, 0, 2, 1, 1, 0, 0],
        [0, 2, 1, 1, 2, 0, 0],
        [0, 0, 0, 0, 0, 0, 0],
    ],
    dtype=np.int32,
)
image.shape
(7, 7)
image = np.expand_dims(image, axis=0)
image.shape
(1, 7, 7)
image = np.expand_dims(image, axis=-1)
image.shape
(1, 7, 7, 1)
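The two expand_dims calls above can also be collapsed into a single indexing expression. The following is a minimal sketch that assumes a zero-filled 7×7 stand-in array; the names placeholder, batched, and stepwise are illustrative and do not appear in the original walkthrough.

```python
import numpy as np

# Placeholder 7x7 array standing in for the grayscale image (illustrative only).
placeholder = np.zeros((7, 7), dtype=np.int32)

# np.newaxis inserts a size-1 axis: the batch axis in front, the channel axis at the end.
batched = placeholder[np.newaxis, :, :, np.newaxis]
print(batched.shape)  # (1, 7, 7, 1)

# The same result obtained with the two expand_dims calls used in the walkthrough.
stepwise = np.expand_dims(np.expand_dims(placeholder, axis=0), axis=-1)
print(np.array_equal(batched, stepwise))  # True
```

Both forms give the (batch, height, width, channels) layout, so the choice between them is mostly a matter of readability.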
Pass the image to model to perform one convolution, then squeeze the size-1 dimensions out of the result so that it is easier to display.
output = model(image)
tf.squeeze(output)
<tf.Tensor: shape=(5, 5), dtype=float32, numpy=
array([[ 6.,  5., -2.,  1.,  2.],
       [ 3.,  0.,  3.,  2., -2.],
       [ 4.,  2., -1.,  0.,  0.],
       [ 2.,  1.,  2., -1., -3.],
       [ 1.,  1.,  1.,  3.,  1.]], dtype=float32)>
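To see where these numbers come from, the same result can be reproduced by hand with plain NumPy. The sketch below assumes what the default Conv2D settings imply: stride 1, no padding, and a kernel that is slid over the image without being flipped (cross-correlation). The helper conv2d_valid and the names image_2d and manual are hypothetical and introduced only for this illustration.

```python
import numpy as np

def conv2d_valid(image, kernel, bias):
    """Slide `kernel` over `image` with stride 1 and no padding, then add `bias`."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the current 3x3 window and the kernel, summed.
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel) + bias
    return out

# Same weights and bias as in the walkthrough.
W = np.array([[0, 0, -1], [0, 1, 0], [-2, 0, 2]], dtype=np.int32)
bias = 1

# The original 7x7 image, before the batch and channel axes were added.
image_2d = np.array(
    [
        [0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 1, 2, 1, 0],
        [0, 0, 2, 2, 0, 1, 0],
        [0, 1, 1, 0, 2, 1, 0],
        [0, 0, 2, 1, 1, 0, 0],
        [0, 2, 1, 1, 2, 0, 0],
        [0, 0, 0, 0, 0, 0, 0],
    ],
    dtype=np.int32,
)

manual = conv2d_valid(image_2d, W, bias)
print(manual)
```

For example, the top-left output value is 1×1 + 2×2 + 1 = 6: only the center weight and the bottom-right weight meet non-zero pixels in the first window, and the bias contributes the final 1.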
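Now suppose the image is in color, for example with three RGB channels, and consider a convolutional layer with 2 kernels of size 3×3. Each kernel spans all 3 input channels and has its own bias, so the parameter count is (3×3×3+1)×2=56. Because there are 2 kernels, the last dimension of the output has size 2.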
model = models.Sequential(
    [layers.Conv2D(input_shape=(7, 7, 3), filters=2, kernel_size=[3, 3])]
)
model.summary()
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 conv2d_1 (Conv2D)           (None, 5, 5, 2)           56
=================================================================
Total params: 56
Trainable params: 56
Non-trainable params: 0
_________________________________________________________________
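The parameter counts and output sizes reported in both summaries follow from simple arithmetic. The sketch below spells that arithmetic out in plain Python, assuming stride 1, no padding, and one bias per kernel, as in the layers above. The helper names conv2d_param_count and conv2d_output_size are hypothetical and are not part of TensorFlow.

```python
def conv2d_param_count(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Count the trainable parameters of a 2D convolutional layer."""
    # Each kernel has one weight per spatial position per input channel.
    weights_per_filter = kernel_h * kernel_w * in_channels
    bias_per_filter = 1 if use_bias else 0
    return (weights_per_filter + bias_per_filter) * filters

def conv2d_output_size(input_size, kernel_size, stride=1):
    """Spatial output size along one axis when no padding is applied."""
    return (input_size - kernel_size) // stride + 1

print(conv2d_param_count(3, 3, in_channels=1, filters=1))  # 10
print(conv2d_param_count(3, 3, in_channels=3, filters=2))  # 56
print(conv2d_output_size(7, 3))                            # 5
```

The first call matches the single-channel model (9 weights plus 1 bias), the second matches the RGB model with 2 kernels, and the last explains why a 7×7 input shrinks to 5×5.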