Deep Learning (8): Neural Networks: Convolution Layers and How to Apply Them

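# 1. Introduction to the Convolution Layers functions

> Official documentation: torch.nn.functional (PyTorch 2.0 documentation)

Because these notes deal with image processing, this section focuses on Conv2d.

    class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)

The parameters are also explained in the previous note:

- in_channels (int): number of channels in the input image; color images usually have 3 (the RGB channels).
- out_channels (int): number of channels in the output produced by the convolution.
- kernel_size (int or tuple): a number or a tuple that defines the size of the convolution kernel. For example, kernel_size=3 defines a 3×3 kernel, and kernel_size=(1,2) defines a 1×2 kernel.
- stride (int or tuple, optional): default 1. The step size of the kernel in the horizontal and vertical directions.
- padding (int or tuple, optional): default 0. How much padding is added around the edges of the image.
- padding_mode (string, optional): default 'zeros'. How the padded border is filled. The options are 'zeros', 'reflect', 'replicate' or 'circular'.
- dilation (int or tuple, optional): default 1. The spacing between the elements of the kernel during convolution. This is known as dilated (atrous) convolution and is not used often.
- groups (int, optional): default 1. Used for grouped convolution; it is usually left at 1 and rarely changed.
- bias (bool, optional): default True. Whether a learnable bias term is added to the result of the convolution; it is almost always left as True.

The official documentation describes the convolution operation as follows:

> In the simplest case, the output value of the layer with input size \((N, C_{in}, H, W)\) and output \((N, C_{out}, H_{out}, W_{out})\) can be precisely described as:
>
> \[\text{out}(N_i, C_{out_j}) = \text{bias}(C_{out_j}) + \sum_{k=0}^{C_{in}-1} \text{weight}(C_{out_j}, k) \star \text{input}(N_i, k)\]
>
> where \(\star\) is the valid 2D cross-correlation operator, \(N\) is a batch size, \(C\) denotes a number of channels, \(H\) is a height of input planes in pixels, and \(W\) is width in pixels.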
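To make the formula concrete, here is a minimal pure-Python sketch of one output channel for one sample: the bias plus the sum, over input channels, of a valid 2D cross-correlation. The helper names `cross_correlate2d` and `conv_output_channel` are hypothetical and chosen only for this illustration. It assumes stride 1, no padding and no dilation, and it is not how PyTorch implements Conv2d.

```python
def cross_correlate2d(image, kernel):
    """Valid 2D cross-correlation of one channel (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def conv_output_channel(channels, kernels, bias):
    """One output channel: bias + sum over input channels k of kernel[k] * input[k]."""
    maps = [cross_correlate2d(c, k) for c, k in zip(channels, kernels)]
    height, width = len(maps[0]), len(maps[0][0])
    return [
        [bias + sum(m[i][j] for m in maps) for j in range(width)]
        for i in range(height)
    ]


# Two 3x3 input channels (C_in = 2) and one 2x2 kernel slice per input channel.
channels = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
kernels = [
    [[1, 0], [0, 1]],  # adds the top-left and bottom-right values of each window
    [[1, 1], [1, 1]],  # adds all four values of each window
]
result = conv_output_channel(channels, kernels, bias=1)
print(result)  # [[9, 11], [15, 17]]
```

A 3×3 input with a 2×2 kernel gives a 2×2 output, which is the same shrinking effect seen later with the 32×32 CIFAR images and a 3×3 kernel.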
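(1) Notes on the kernel_size parameter

kernel_size sets the size of the convolution kernel. Given a kernel_size, the model creates a kernel of that size. The values in the kernel are initialized by sampling from a distribution and are then adjusted continually during training.

(2) Notes on the out_channels parameter

If the input image has in_channels=1 and there is only one kernel, the output of the convolution also has out_channels=1.

If the input image has in_channels=2 and there are two kernels, the convolution produces two matrices. Treating these two matrices as one output gives out_channels=2.

# 2. Worked example

This example uses image data from CIFAR to demonstrate Conv2d.

    import torch
    from torch import nn
    import torchvision
    from torch.utils.data import DataLoader
    from torch.nn import Conv2d
    from torch.utils.tensorboard import SummaryWriter

    # Load the image data
    dataset = torchvision.datasets.CIFAR10(root="./dataset", train=False,
                                           transform=torchvision.transforms.ToTensor(),
                                           download=True)
    # Batch the data
    dataloader = DataLoader(dataset, batch_size=64)

    # Build the neural network
    class Demo(nn.Module):
        def __init__(self):
            super().__init__()
            # The images are in color, so in_channels=3; for the output we try out_channels=6;
            # kernel_size=3 (a 3x3 convolution); stride and padding keep their defaults, 1 and 0.
            # self.conv1 is a convolution layer; Conv2d is the class that creates it.
            self.conv1 = Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0)

        def forward(self, x):
            # Pass the input x through the convolution layer self.conv1 and return the result
            x = self.conv1(x)
            return x

    demo = Demo()
    print(demo)
    """
    [Run]
    Demo(
      (conv1): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
    )

    [Explanation]
    Demo contains one convolution layer, conv1, which takes a 3-channel image and
    outputs a 6-channel image. The kernel size is 3x3 and the stride is 1x1.
    """

    # Feed every batch of images through the network and check the sizes
    for data in dataloader:
        imgs, targets = data   # get one batch of images
        output = demo(imgs)    # forward() applies the convolution
        # Example result: [Run] torch.Size([64, 3, 32, 32]),
        # i.e. batch_size=64 (set in the DataLoader), 3 input channels, 32x32 pixels
        print(imgs.shape)
        # Example result: [Run] torch.Size([64, 6, 30, 30]),
        # i.e. batch_size=64, 6 output channels, size reduced to 30x30 by the convolution
        print(output.shape)

    # Visualize the images before and after the convolution
    writer = SummaryWriter("nn_logs")
    step = 0
    for data in dataloader:
        imgs, targets = data
        output = demo(imgs)
        # imgs: torch.Size([64, 3, 32, 32])
        writer.add_images("input", imgs, step)
        # output: torch.Size([64, 6, 30, 30])
        # add_images cannot display 6 channels, so as a not very rigorous workaround
        # we reshape the output to [xxx, 3, 30, 30]; the extra channels go into the batch dimension.
        # The first dimension is unknown, so we pass -1 and it is computed from the others.
        output = torch.reshape(output, (-1, 3, 30, 30))
        writer.add_images("output", output, step)
        step += 1
    writer.close()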
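The `-1` passed to `torch.reshape` deserves a quick check. Reshaping a `[64, 6, 30, 30]` output to `[-1, 3, 30, 30]` keeps the total number of elements, so the missing dimension is the element count divided by the product of the known dimensions. The sketch below reproduces only that arithmetic in plain Python; `infer_missing_dim` is a hypothetical helper for this note, not a PyTorch function.

```python
from math import prod


def infer_missing_dim(original_shape, target_shape):
    """Return target_shape with its single -1 replaced by the inferred size."""
    if target_shape.count(-1) != 1:
        raise ValueError("target_shape must contain exactly one -1")
    total = prod(original_shape)
    known = prod(d for d in target_shape if d != -1)
    if total % known != 0:
        raise ValueError("shapes are not compatible")
    return tuple(total // known if d == -1 else d for d in target_shape)


# A full batch: 64 images with 6 channels become 128 images with 3 channels.
full_batch = infer_missing_dim((64, 6, 30, 30), (-1, 3, 30, 30))
print(full_batch)  # (128, 3, 30, 30)

# The last batch of the 10000-image CIFAR10 test set holds only 16 images.
last_batch = infer_missing_dim((16, 6, 30, 30), (-1, 3, 30, 30))
print(last_batch)  # (32, 3, 30, 30)
```

This is why the "output" panel in TensorBoard shows twice as many images per step as the "input" panel.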
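In the result the images become smaller. To keep the image size unchanged, consider using padding.

Note that to display all of the images in the browser, TensorBoard has to be launched with:

    tensorboard --logdir=<path> --samples_per_plugin=images=1000

# 3. Formula for converting input size to output size

Symbols:

- \(N\): the batch_size of the images
- \(C\): the number of channels
- \(H\): the image height
- \(W\): the image width

Calculation:

Input: \((N, C_{in}, H_{in}, W_{in})\) or \((C_{in}, H_{in}, W_{in})\)

Output: \((N, C_{out}, H_{out}, W_{out})\) or \((C_{out}, H_{out}, W_{out})\)

where:

\[H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1 \right\rfloor\]

\[W_{out} = \left\lfloor \frac{W_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1 \right\rfloor\]

When reading a paper, some parameters such as padding may not be stated. In that case they can be derived from this formula.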
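The formula can be checked with a small pure-Python sketch for one spatial dimension. The function names `conv2d_output_size` and `find_padding` are hypothetical helpers for this note. `find_padding` simply tries padding values from 0 up to an assumed upper bound, which is one way to recover a padding value that a paper leaves out.

```python
def conv2d_output_size(size_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output height or width of a convolution along one spatial dimension."""
    numerator = size_in + 2 * padding - dilation * (kernel_size - 1) - 1
    # Adding 1 after integer floor division equals the floor of (numerator / stride + 1).
    return numerator // stride + 1


def find_padding(size_in, size_out, kernel_size, stride=1, dilation=1, max_padding=16):
    """Return every padding in [0, max_padding] that yields size_out."""
    return [
        p for p in range(max_padding + 1)
        if conv2d_output_size(size_in, kernel_size, stride, p, dilation) == size_out
    ]


# The CIFAR example above: 32x32 input, 3x3 kernel, stride 1, padding 0.
print(conv2d_output_size(32, kernel_size=3))             # 30
# With padding=1 the output keeps the input size.
print(conv2d_output_size(32, kernel_size=3, padding=1))  # 32
# Deriving an unstated padding: 32 -> 32 with a 5x5 kernel and stride 1.
print(find_padding(32, 32, kernel_size=5))               # [2]
```

When the stride is greater than 1, several padding values can give the same output size because of the floor, which is why `find_padding` returns a list rather than a single value.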
