4lqedz posted on 2024-8-31 06:07:53

Hyperparameter Optimization for Deep Neural Networks with Bayesian Optimization


    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;"><span style="color: black;">点击上方“</span><span style="color: black;">Deephub Imba</span><span style="color: black;">”,关注公众号,好<span style="color: black;">文案</span>不<span style="color: black;">错失</span> </span><span style="color: black;">!</span></strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">在本文中,<span style="color: black;">咱们</span>将深入<span style="color: black;">科研</span>超参数优化。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">为了方便起见本文将<span style="color: black;">运用</span> Tensorflow 中<span style="color: black;">包括</span>的 Fashion MNIST 数据集。该数据集在训练集中<span style="color: black;">包括</span> 60,000 张灰度图像,在测试集中<span style="color: black;">包括</span> 10,000 张图像。每张<span style="color: black;">照片</span><span style="color: black;">表率</span>属于 10 个类别之一的单品(“T 恤/上衣”、“裤子”、“套头衫”等)。<span style="color: black;">因此呢</span>这是一个多类<span style="color: black;">归类</span>问题。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">这儿</span>简单介绍准备数据集的<span style="color: black;">过程</span>,<span style="color: black;">由于</span>本文的<span style="color: black;">重点</span>内容是超参数的优化,<span style="color: black;">因此</span>这部分只是简单介绍流程,<span style="color: black;">通常</span><span style="color: black;">状况</span>下,流程如下:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">加载数据。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">分为训练集、验证集和测试集。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">将像素值从 0–255 标准化到 0–1 范围。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">One-hot 编码<span style="color: black;">目的</span>变量。</span></p><span style="color: black;">#load data</span><span style="color: black;">(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()</span><span style="color: black;"># split into train, validation and test sets</span><span style="color: black;">train_x, val_x, train_y, val_y = train_test_split(train_images, train_labels, stratify=train_labels, random_state=48, test_size=0.05)</span><span style="color: black;">(test_x, test_y)=(test_images, test_labels)</span><span style="color: black;"># normalize pixels to range 0-1</span><span style="color: black;">train_x = train_x / 255.0</span><span style="color: black;">val_x = val_x / 255.0</span><span style="color: black;">test_x = test_x / 255.0</span><span style="color: black;">#one-hot encode target variable</span><span style="color: black;">train_y = to_categorical(train_y)</span><span style="color: black;">val_y = to_categorical(val_y)</span><span style="color: black;">test_y = to_categorical(test_y)</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">咱们</span>所有训练、验证和测试集的形状是:</span></p><span style="color: black;">print(train_x.shape) #(57000, 28, 28)</span><span style="color: black;">print(train_y.shape) #(57000, 10)</span><span style="color: black;">print(val_x.shape) &nbsp; #(3000, 28, 28)</span><span style="color: black;">print(val_y.shape) &nbsp; #(3000, 10)</span><span style="color: black;">print(test_x.shape) &nbsp; #(10000, 28, 28)</span><span style="color: black;">print(test_y.shape) &nbsp; #(10000, 10)</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">此刻</span>,<span style="color: black;">咱们</span>将<span style="color: black;">运用</span> Keras Tuner 库 :它将<span style="color: black;">帮忙</span><span style="color: black;">咱们</span><span style="color: black;">容易</span><span style="color: black;">调节</span>神经网络的超参数:</span></p><span style="color: black;">pip install keras-tuner</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Keras Tuner 需要 Python 3.6+ 和 TensorFlow 2.0+</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">超参数<span style="color: black;">调节</span>是机器学习项目的<span style="color: black;">基本</span>部分。有两种类型的超参数:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">结构超参数:定义模型的整体架构(例如<span style="color: black;">隐匿</span>单元的数量、层数)</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">优化器超参数:影响训练速度和质量的参数(例如学习率和优化器类型、批量<span style="color: black;">体积</span>、轮次数等)</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">为何</span>需要超参数调优库?<span style="color: black;">咱们</span><span style="color: black;">不可</span>尝试所有可能的组合,<span style="color: black;">瞧瞧</span>验证集上什么是最好的吗?</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">这肯定是不行的<span style="color: black;">由于</span>深度神经网络需要大量时间来训练,<span style="color: black;">乃至</span>几天。<span style="color: black;">倘若</span>在云服务器上训练大型模型,<span style="color: black;">那样</span><span style="color: black;">每一个</span>实验实验都需要花<span style="color: black;">非常多</span>的钱。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">因此呢</span>,需要一种限制超参数搜索空间的剪枝策略。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">keras-tuner<span style="color: black;">供给</span>了贝叶斯优化器。它搜索<span style="color: black;">每一个</span>可能的组合,而是随机<span style="color: black;">选取</span>前几个。<span style="color: black;">而后</span><span style="color: black;">按照</span>这些超参数的性能,<span style="color: black;">选取</span>下一个可能的最佳值。<span style="color: black;">因此呢</span><span style="color: black;">每一个</span>超参数的<span style="color: black;">选取</span>都取决于之前的尝试。<span style="color: black;">按照</span>历史记录<span style="color: black;">选取</span>下一组超参数并<span style="color: black;">评定</span>性能,直到找到最佳组合或到达最大<span style="color: black;">实验</span>次数。<span style="color: black;">咱们</span><span style="color: black;">能够</span><span style="color: black;">运用</span>参数“max_trials”来配置它。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">除了贝叶斯优化器之外,keras-tuner还<span style="color: black;">供给</span>了<span style="color: black;">另一</span>两个<span style="color: black;">平常</span>的<span style="color: black;">办法</span>:RandomSearch 和 Hyperband。<span style="color: black;">咱们</span>将在本文末尾讨论它们。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">接下来<span style="color: black;">便是</span>对<span style="color: black;">咱们</span>的网络应用超参数<span style="color: black;">调节</span>。<span style="color: black;">咱们</span>尝试两种网络架构,标准多层感知器(MLP)和卷积神经网络(CNN)。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">首要</span>让<span style="color: black;">咱们</span><span style="color: black;">瞧瞧</span>基线 MLP 模型是什么:</span></p><span style="color: black;">model_mlp = Sequential()</span><span style="color: black;">model_mlp.add(Flatten(input_shape=(28, 28)))</span><span style="color: black;">model_mlp.add(Dense(350, activation=relu))</span><span style="color: black;">model_mlp.add(Dense(10, activation=softmax))</span><span style="color: black;">print(model_mlp.summary())</span><span style="color: black;">model_mlp.compile(optimizer="adam",loss=categorical_crossentropy)</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">调优过程需要两种<span style="color: black;">重点</span><span style="color: black;">办法</span>:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"> hp.Int():设置超参数的范围,其值为整数 - 例如,密集层中<span style="color: black;">隐匿</span>单元的数量:</span></p><span style="color: black;">model.add(Dense(units = hp.Int(dense-bot, min_value=50, max_value=350, step=50))</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">hp.Choice():为超参数<span style="color: black;">供给</span>一组值——例如,Adam 或 SGD <span style="color: black;">做为</span>最佳优化器?</span></p><span style="color: black;">hp_optimizer=hp.Choice(Optimizer, values=)</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">在<span style="color: black;">咱们</span>的 MLP 示例中,<span style="color: black;">咱们</span>测试了以下超参数:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"> <span style="color: black;">隐匿</span>层数:1-3</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">第1</span>密集层<span style="color: black;">体积</span>:50–350</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">第二和第三密集层<span style="color: black;">体积</span>:50–350</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Dropout:0、0.1、0.2</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">优化器:SGD(nesterov=True,momentum=0.9) 或 Adam</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">学习率:0.1、0.01、0.001</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">代码如下:</span></p><span style="color: black;">model = Sequential()</span><span style="color: black;">model.add(Dense(units = hp.Int(dense-bot, min_value=50, max_value=350, step=50), input_shape=(784,), activation=relu))</span><span style="color: black;">for i in range(hp.Int(num_dense_layers, 1, 2)):</span><span style="color: black;">model.add(Dense(units=hp.Int(dense_ + str(i), min_value=50, max_value=100, step=25), activation=relu))</span><span style="color: black;"> model.add(Dropout(hp.Choice(dropout_+ str(i), values=)))</span><span style="color: black;">model.add(Dense(10,activation="softmax"))</span><span style="color: black;">hp_optimizer=hp.Choice(Optimizer, values=)</span><span style="color: black;">if hp_optimizer == Adam:</span><span style="color: black;">hp_learning_rate = hp.Choice(learning_rate, values=)</span><span style="color: black;">elif hp_optimizer == SGD:</span><span style="color: black;"> &nbsp; hp_learning_rate = hp.Choice(learning_rate, values=)</span><span style="color: black;"> &nbsp; nesterov=True</span><span style="color: black;">momentum=0.9</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">这儿</span>需要<span style="color: black;">重视</span>第 5 行的 for 循环:让模型决定网络的深度!</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">最后,<span style="color: black;">便是</span>运行了。请<span style="color: black;">重视</span><span style="color: black;">咱们</span>之前<span style="color: black;">说到</span>的 max_trials 参数。</span></p><span style="color: black;">model.compile(optimizer = hp_optimizer, loss=categorical_crossentropy, metrics=)</span><span style="color: black;">tuner_mlp = kt.tuners.BayesianOptimization(</span><span style="color: black;"> &nbsp; model,</span><span style="color: black;"> &nbsp; seed=random_seed,</span><span style="color: black;"> &nbsp; objective=val_loss,</span><span style="color: black;"> &nbsp; max_trials=30,</span><span style="color: black;"> &nbsp; directory=.,</span><span style="color: black;"> &nbsp; project_name=tuning-mlp)</span><span style="color: black;">tuner_mlp.search(train_x, train_y, epochs=50, batch_size=32, validation_data=(dev_x, dev_y), callbacks=callback)</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">咱们</span>得到结果</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/mmbiz_png/6wQyVOrkRNI28YtRibhRqr9gbjZG0uobMUDBdyuq3837tlccEKAew3TOqFZPeB4cGKDDzicXzhP9JRVdrfwrm8Lg/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">这个过程用尽了迭代次数,大约需要 1 小时<span style="color: black;">才可</span>完成。<span style="color: black;">咱们</span>还<span style="color: black;">能够</span><span style="color: black;">运用</span>以下命令打印模型的最佳超参数:</span></p><span style="color: black;">best_mlp_hyperparameters = tuner_mlp.get_best_hyperparameters(1)</span><span style="color: black;">print("Best Hyper-parameters")</span><span style="color: black;">best_mlp_hyperparameters.values</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/mmbiz_png/6wQyVOrkRNI28YtRibhRqr9gbjZG0uobMz8814jYnpMjL1sdM5GFmXnuPC0ntMS2Ix6mic6lqQ0icBrhaIqey8XGA/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">此刻</span><span style="color: black;">咱们</span><span style="color: black;">能够</span><span style="color: black;">运用</span>最优超参数重新训练<span style="color: black;">咱们</span>的模型:</span></p><span style="color: black;">model_mlp = Sequential()</span><span style="color: black;">model_mlp.add(Dense(best_mlp_hyperparameters, input_shape=(784,), activation=relu))</span><span style="color: black;">for i in range(best_mlp_hyperparameters):</span><span style="color: black;"> model_mlp.add(Dense(units=best_mlp_hyperparameters, activation=relu))</span><span style="color: black;">model_mlp.add(Dropout(rate=best_mlp_hyperparameters))</span><span style="color: black;">model_mlp.add(Dense(10,activation="softmax"))</span><span style="color: black;">model_mlp.compile(optimizer=best_mlp_hyperparameters, loss=categorical_crossentropy,metrics=)</span><span style="color: black;">history_mlp= model_mlp.fit(train_x, train_y, epochs=100, batch_size=32, validation_data=(dev_x, dev_y), callbacks=callback)</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">或</span>,<span style="color: black;">咱们</span><span style="color: black;">能够</span>用这些参数重新训练<span style="color: black;">咱们</span>的模型:</span></p><span style="color: black;">model_mlp=tuner_mlp.hypermodel.build(best_mlp_hyperparameters)</span><span style="color: black;">history_mlp=model_mlp.fit(train_x, train_y, epochs=100, batch_size=32,</span><span style="color: black;"> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; validation_data=(dev_x, dev_y), callbacks=callback)</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">而后</span>测试准确率</span></p><span style="color: black;">mlp_test_loss, mlp_test_acc = model_mlp.evaluate(test_x, test_y, verbose=2)</span><span style="color: black;">print(\nTest accuracy:, mlp_test_acc)</span><span style="color: black;"># Test accuracy: 0.8823</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">与基线的模型测试精度相比:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">基线 MLP 模型:86.6 %最佳 MLP 模型:88.2 %。测试准确度的差异约为 3%!</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">下面<span style="color: black;">咱们</span><span style="color: black;">运用</span>相同的流程,将MLP改为CNN,<span style="color: black;">这般</span><span style="color: black;">能够</span>测试<span style="color: black;">更加多</span>参数。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">首要</span>,这是<span style="color: black;">咱们</span>的基线模型:</span></p><span style="color: black;">model_cnn = Sequential()</span><span style="color: black;">model_cnn.add(Conv2D(32, (3, 3), activation=relu, input_shape=(28, 28, 1)))</span><span style="color: black;">model_cnn.add(MaxPooling2D((2, 2)))</span><span style="color: black;">model_cnn.add(Flatten())</span><span style="color: black;">model_cnn.add(Dense(100, activation=relu))</span><span style="color: black;">model_cnn.add(Dense(10, activation=softmax))</span><span style="color: black;">model_cnn.compile(optimizer="adam", loss=categorical_crossentropy, metrics=)</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">基线模型 <span style="color: black;">包括</span>卷积和池化层。<span style="color: black;">针对</span>调优,<span style="color: black;">咱们</span>将测试以下内容:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"> 卷积、MaxPooling 和 Dropout 层的“块”数</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">每一个</span>块中 Conv 层的过滤器<span style="color: black;">体积</span>:32、64</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">转换层上的有效或相同填充</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">最后一个额外层的<span style="color: black;">隐匿</span>层<span style="color: black;">体积</span>:25-150,乘以 25</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">优化器:SGD(nesterov=True,动量=0.9)或 Adam</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">学习率:0.01、0.001</span></p><span style="color: black;">model = Sequential()</span><span style="color: black;">model = Sequential()</span><span style="color: black;">model.add(Input(shape=(28, 28, 1)))</span><span style="color: black;">for i in range(hp.Int(num_blocks, 1, 2)):</span><span style="color: black;"> &nbsp; hp_padding=hp.Choice(padding_+ str(i), values=)</span><span style="color: black;">hp_filters=hp.Choice(filters_+ str(i), values=)</span><span style="color: black;"> &nbsp; model.add(Conv2D(hp_filters, (3, 3), padding=hp_padding, activation=relu, kernel_initializer=he_uniform, input_shape=(28, 28, 1)))</span><span style="color: black;">model.add(MaxPooling2D((2, 2)))</span><span style="color: black;"> &nbsp; model.add(Dropout(hp.Choice(dropout_+ str(i), values=)))</span><span style="color: black;">model.add(Flatten())</span><span style="color: black;">hp_units = hp.Int(units, min_value=25, max_value=150, step=25)</span><span style="color: black;">model.add(Dense(hp_units, activation=relu, kernel_initializer=he_uniform))</span><span style="color: black;">model.add(Dense(10,activation="softmax"))</span><span style="color: black;">hp_learning_rate = hp.Choice(learning_rate, values=)</span><span style="color: black;">hp_optimizer=hp.Choice(Optimizer, values=)</span><span style="color: black;">if hp_optimizer == Adam:</span><span style="color: black;">hp_learning_rate = hp.Choice(learning_rate, values=)</span><span style="color: black;">elif hp_optimizer == SGD:</span><span style="color: black;"> &nbsp; hp_learning_rate = hp.Choice(learning_rate, values=)</span><span style="color: black;"> &nbsp; nesterov=True</span><span style="color: black;">momentum=0.9</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">像以前<span style="color: black;">同样</span>,<span style="color: black;">咱们</span>让网络决定它的深度。最大迭代次数设置为 100:</span></p><span style="color: black;">model.compile( optimizer=hp_optimizer,loss=categorical_crossentropy, metrics=)</span><span style="color: black;">tuner_cnn = kt.tuners.BayesianOptimization(</span><span style="color: black;"> &nbsp; model,</span><span style="color: black;"> &nbsp; objective=val_loss,</span><span style="color: black;"> &nbsp; max_trials=100,</span><span style="color: black;"> &nbsp; directory=.,</span><span style="color: black;"> &nbsp; project_name=tuning-cnn)</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">结果如下:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/mmbiz_png/6wQyVOrkRNI28YtRibhRqr9gbjZG0uobMUyWFZUcy8vkNLR5tLB9esOzE1V5TAVXAx168bbJv6tYmNMJLjqDdyQ/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">得到的超参数</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/mmbiz_png/6wQyVOrkRNI28YtRibhRqr9gbjZG0uobMGIljp4CmJ0zFpoB18lGQKKQ7M1tXJO3c4skveIpibKTdJnaEoEeoeHw/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">最后<span style="color: black;">运用</span>最佳超参数训练<span style="color: black;">咱们</span>的 CNN 模型:</span></p><span style="color: black;">model_cnn = Sequential()</span><span style="color: black;">model_cnn.add(Input(shape=(28, 28, 1)))</span><span style="color: black;">for i in range(best_cnn_hyperparameters):</span><span style="color: black;"> hp_padding=best_cnn_hyperparameters</span><span style="color: black;">hp_filters=best_cnn_hyperparameters</span><span style="color: black;">model_cnn.add(Conv2D(hp_filters, (3, 3), padding=hp_padding, activation=relu, kernel_initializer=he_uniform, input_shape=(28, 28, 1)))</span><span style="color: black;"> model_cnn.add(MaxPooling2D((2, 2)))</span><span style="color: black;"> model_cnn.add(Dropout(best_cnn_hyperparameters))</span><span style="color: black;">model_cnn.add(Flatten())</span><span style="color: black;">model_cnn.add(Dense(best_cnn_hyperparameters, activation=relu, kernel_initializer=he_uniform))</span><span style="color: black;">model_cnn.add(Dense(10,activation="softmax"))</span><span style="color: black;">model_cnn.compile(optimizer=best_cnn_hyperparameters, </span><span style="color: black;"> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; loss=categorical_crossentropy, </span><span style="color: black;"> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; metrics=)</span><span style="color: black;">print(model_cnn.summary())</span><span style="color: black;">history_cnn= model_cnn.fit(train_x, train_y, epochs=50, batch_size=32, validation_data=(dev_x, dev_y), callbacks=callback)</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">检测</span>测试集的准确率:</span></p><span style="color: black;">cnn_test_loss, cnn_test_acc = model_cnn.evaluate(test_x, test_y, verbose=2)</span><span style="color: black;">print(\nTest accuracy:, cnn_test_acc)</span><span style="color: black;"># Test accuracy: 0.92</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">与基线的 CNN 模型测试精度相比:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">基线 CNN 模型:90.8 %</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">最佳 CNN 模型:92%</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">咱们</span>看到优化模型的性能<span style="color: black;">提高</span>!</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">除了准确性之外,<span style="color: black;">咱们</span>还<span style="color: black;">能够</span>看到优化的效果很好,<span style="color: black;">由于</span>:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">在每种<span style="color: black;">状况</span>下都<span style="color: black;">选取</span>了一个非零的 Dropout 值,即使<span style="color: black;">咱们</span><span style="color: black;">亦</span><span style="color: black;">供给</span>了零 Dropout。这是意料之中的,<span style="color: black;">由于</span> Dropout 是一种减少过拟合的机制。有趣的是,最好的 CNN 架构是标准CNN,其中过滤器的数量在每一层中<span style="color: black;">逐步</span><span style="color: black;">增多</span>。这是意料之中的,<span style="color: black;">由于</span>随着后续层的<span style="color: black;">增多</span>,模式变得更加<span style="color: black;">繁杂</span>(这<span style="color: black;">亦</span>是<span style="color: black;">咱们</span>在学习<span style="color: black;">各样</span>模型和论文时被证明的结果)需要<span style="color: black;">更加多</span>的过滤器<span style="color: black;">才可</span><span style="color: black;">捕捉</span>这些模式组合。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">以上例子<span style="color: black;">亦</span>说明Keras Tuner 是<span style="color: black;">运用</span> Tensorflow 优化深度神经网络的很好用的工具。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">咱们</span>上面<span style="color: black;">亦</span>说了本文<span style="color: black;">选取</span>是贝叶斯优化器。<span style="color: black;">然则</span>还有两个其他的选项:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">RandomSearch:随机<span style="color: black;">选取</span>其中的<span style="color: black;">有些</span>来避免探索超参数的<span style="color: black;">全部</span>搜索空间。<span style="color: black;">然则</span>,它<span style="color: black;">不可</span><span style="color: black;">保准</span>会找到最佳超参数</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"> Hyperband:<span style="color: black;">选取</span><span style="color: black;">有些</span>超参数的随机组合,并仅<span style="color: black;">运用</span>它们来训练模型几个 epoch。<span style="color: black;">而后</span><span style="color: black;">运用</span>这些超参数来训练模型,直到用尽所有 epoch 并从中<span style="color: black;">选取</span>最好的。</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">最后数据集<span style="color: black;">位置</span>和keras_tuner的文档如下</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Fashion MNIST dataset by Zalando, </span><span style="color: black;">https://www.kaggle.com/datasets/zalando-research/fashionmnist</span><span style="color: black;">, MIT Licence (MIT) Copyright © </span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">Keras Tuner, </span><span style="color: black;">https://keras.io/keras_tuner/</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;">作者:Nikos Kafritsas</span></p>MOREkaggle比赛交流和组队加我的<span style="color: black;">微X</span>,邀你进群<p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/mmbiz_jpg/6wQyVOrkRNJNialnOvBon7lqSKbxEoMPRYmgZcJianH8SkXpRTjlH4yAGafAS59m5kPNJZEibA4P84hj8sH6uRADw/640?wx_fmt=jpeg&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><span style="color: black;"><span style="color: black;">爱好</span>就关注一下吧:</span></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/mmbiz_gif/mmF0rTjMmmKZIUILXTItEet6ibM90lIWhKibHGnNdyQicw5zHCjmjItuImUHHgB2oNJ5cq41p9xolv3cD3puJNZ5w/640?wx_fmt=gif&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1" style="width: 50%; margin-bottom: 20px;"><span style="color: black;">点个&nbsp;</span><span style="color: black;"><strong style="color: blue;"><span style="color: black;">在看</span></strong></span><span style="color: black;">&nbsp;你最好看!</span></p>




4zhvml8 posted on 2024-10-15 10:36:02

OP, you really know your stuff! I have to give you props for this one!

1fy07h posted 3 days ago

Your perspective is really original; I learned a lot from this.