Model Compression: Pruning and Global Unstructured Pruning - CSDN Blog

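Pruning: deep neural networks usually rely on a very large number of parameters to reach SOTA results. Taking inspiration from biological neural networks, however, the dense connections can be made sparse, and the sparse network can often still reach SOTA-level performance.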

Example model

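import torch
import torch.nn as nn
import torch.nn.functional as F

class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        # 1: input channels (1 means a grayscale image), 6: output channels, 3: a 3 x 3 convolution kernel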
        self.conv1 = nn.Conv2d(1, 6, 3)
        self.conv2 = nn.Conv2d(6, 16, 3)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, int(x.nelement() / x.shape[0]))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
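
The `16 * 5 * 5` input size of `fc1` follows from the layer shapes if the input is a 28 x 28 single-channel image. That input size is an assumption here, since the post does not state it, and the helper names `conv_out` and `pool_out` are hypothetical. A small sketch of the arithmetic:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Spatial size after max pooling with stride equal to the kernel size."""
    return size // kernel


size = 28                  # assumed input height/width (not stated in the post)
size = conv_out(size, 3)   # conv1: 28 -> 26
size = pool_out(size, 2)   # pool:  26 -> 13
size = conv_out(size, 3)   # conv2: 13 -> 11
size = pool_out(size, 2)   # pool:  11 -> 5
flat_features = 16 * size * size  # 16 channels from conv2, flattened
print(size, flat_features)
```

With a different input resolution the flattened size changes, and `fc1` would need to be resized accordingly.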

1. Unstructured pruning (random pruning):

import torch.nn.utils.prune as prune

model = LeNet()
module = model.conv1
prune.random_unstructured(module, name="weight", amount=0.3)
print(list(module.named_parameters()))
print(list(module.named_buffers()))

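This method prunes at random: it masks out 30% of the weights of the chosen layer, with the entries picked without regard to their values.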
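
A quick way to check how many entries were actually masked is to count the zeros in `weight_mask`. The following is a minimal sketch on a standalone layer with the same shape as `conv1`; the fixed seed is only there to make the run repeatable.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)
conv = nn.Conv2d(1, 6, 3)  # same shape as conv1 in LeNet: 1 * 6 * 3 * 3 = 54 weights
prune.random_unstructured(conv, name="weight", amount=0.3)

mask = conv.weight_mask            # 1 = kept, 0 = pruned
total = mask.nelement()
pruned = int((mask == 0).sum())
print(f"{pruned} of {total} weights masked ({100.0 * pruned / total:.1f}%)")
```

Because the number of pruned entries is rounded to a whole number, the measured sparsity is close to, but not necessarily exactly, 30%.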

2. Unstructured pruning (L1 pruning):

# An integer amount is an absolute count: prune the 3 bias entries with the smallest absolute value
prune.l1_unstructured(module, name="bias", amount=3)

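This method ranks entries by their L1 norm (absolute value) and prunes the smallest ones first, that is, the part assumed to contribute least to the model.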
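
To make the selection rule concrete, here is a small sketch with hand-picked bias values (made up for illustration, not taken from the post). With `amount=3`, the three entries with the smallest absolute value should be the ones masked.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(4, 5)
with torch.no_grad():
    # Illustrative values; absolute values are 0.5, 0.1, 0.05, 2.0, 0.3
    layer.bias.copy_(torch.tensor([0.5, -0.1, 0.05, -2.0, 0.3]))

prune.l1_unstructured(layer, name="bias", amount=3)

mask = layer.bias_mask.tolist()  # 1.0 = kept, 0.0 = pruned
print(mask)
print(layer.bias)
```

Only 0.5 and -2.0 survive, since the sign is ignored and the ranking uses magnitude alone.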

3. Serializing the pruning (making it permanent):

prune.remove(module, 'weight')

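After pruning, the module gains weight_orig (a parameter holding the original, unpruned weights) and weight_mask (a buffer holding the binary mask that marks which entries are pruned). weight itself is no longer a parameter: it is recomputed as the elementwise product of the two, which costs extra computation and memory. After prune.remove, weight_orig and weight_mask no longer exist, and weight is an ordinary parameter again with the pruned entries fixed at zero.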
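
The change in the module's bookkeeping can be observed directly by listing its parameters and buffers before and after `prune.remove`. A minimal sketch on a standalone layer:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(1, 6, 3)
prune.l1_unstructured(conv, name="weight", amount=0.3)

# While the pruning is still a reparametrization
params_before = sorted(name for name, _ in conv.named_parameters())
buffers_before = sorted(name for name, _ in conv.named_buffers())
zeros_before = int((conv.weight == 0).sum())

prune.remove(conv, "weight")

# After the pruning has been made permanent
params_after = sorted(name for name, _ in conv.named_parameters())
buffers_after = sorted(name for name, _ in conv.named_buffers())
zeros_after = int((conv.weight == 0).sum())

print(params_before, buffers_before)
print(params_after, buffers_after)
print(zeros_before, zeros_after)
```

The zeros stay in place after removal, but the original values are gone, so the pruning can no longer be undone from the module alone.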

4. Structured pruning (pruning multiple parameters across modules):

for name, module in model.named_modules():
    # Apply l1_unstructured pruning to every convolutional layer, pruning 20% of its weights
    if isinstance(module, torch.nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.2)
    # Apply ln_structured pruning to every fully connected layer, pruning 40% of its rows by L2 norm
    elif isinstance(module, torch.nn.Linear):
        prune.ln_structured(module, name="weight", amount=0.4, n=2, dim=0)  # bias is 1-D, so structured pruning cannot be applied to it

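This handles every layer of the model in one loop instead of writing each one out by hand. Note that structured pruning needs a tensor with more than one dimension; it does not work on a 1-D tensor such as bias.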
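
The difference from unstructured pruning shows up in the mask: with `dim=0`, whole rows of a linear layer's weight are zeroed. The sketch below uses a small standalone layer and also tries the 1-D bias, assuming `ln_structured` rejects it with a `ValueError`, as current PyTorch releases do.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

fc = nn.Linear(4, 5)  # weight shape: (5, 4)
prune.ln_structured(fc, name="weight", amount=0.4, n=2, dim=0)

# With dim=0, entire rows (output neurons) are masked, not individual entries.
row_is_pruned = fc.weight_mask.sum(dim=1) == 0
print("pruned rows:", row_is_pruned.tolist())

# The 1-D bias has no channels to remove, so structured pruning is rejected.
try:
    prune.ln_structured(fc, name="bias", amount=0.4, n=2, dim=0)
    bias_rejected = False
except ValueError as err:
    bias_rejected = True
    print("bias:", err)
```

Which rows are pruned depends on the random initialization; only the count (40% of 5 rows, i.e. 2) is fixed.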

5. Global pruning:

# Build the collection of (module, parameter name) pairs that take part in pruning
parameters_to_prune = (
    (model.conv1, 'weight'),
    (model.conv2, 'weight'),
    (model.fc1, 'weight'),
    (model.fc2, 'weight'),
    (model.fc3, 'weight'))

# Call the global pruning function global_unstructured from prune; here 20% of all listed weights are pruned
prune.global_unstructured(parameters_to_prune, pruning_method=prune.L1Unstructured, amount=0.2)

# Finally, print the keys of the pruned model's state dict
print(model.state_dict().keys())

# Report the sparsity of each layer, then the sparsity over all layers together
total_zeros, total_elements = 0, 0
for name, module in model.named_children():
    zeros = float(torch.sum(module.weight == 0))
    elements = float(module.weight.nelement())
    total_zeros += zeros
    total_elements += elements
    print("Sparsity in {}.weight: {:.2f}%".format(name, 100. * zeros / elements))

print("Global sparsity: {:.2f}%".format(100. * total_zeros / total_elements))

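Global pruning removes 20% of the weights overall. How much is removed from each individual layer is determined by the parameter values themselves, because the magnitudes are ranked across all listed layers together, so the per-layer sparsity can differ from 20%.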
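
An extreme case makes the uneven per-layer split visible. In this sketch the two tiny layers and their weight values are made up for illustration: one layer holds only small weights, the other only large ones, and half of all weights are pruned globally.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

small = nn.Linear(2, 2, bias=False)
large = nn.Linear(2, 2, bias=False)
with torch.no_grad():
    # Illustrative values: every weight in `small` is smaller in magnitude than any in `large`
    small.weight.copy_(torch.tensor([[0.01, -0.02], [0.03, -0.04]]))
    large.weight.copy_(torch.tensor([[1.0, -2.0], [3.0, -4.0]]))

prune.global_unstructured(
    [(small, "weight"), (large, "weight")],
    pruning_method=prune.L1Unstructured,
    amount=0.5,
)


def sparsity(module):
    """Percentage of zero entries in the module's weight."""
    return 100.0 * float((module.weight == 0).sum()) / module.weight.nelement()


print(f"small: {sparsity(small):.0f}%, large: {sparsity(large):.0f}%")
```

The global sparsity is 50%, yet all of it lands on the layer with small weights. In a real network the split is less extreme, but the same mechanism can leave some layers much sparser than others.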

6. Custom pruning