4zhvml8 posted on 2024-8-25 21:51:15

Submitting a Website's 404 Dead Links with a Shell Script


    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><img src="https://mmbiz.qpic.cn/mmbiz_gif/K0TMNq37VN3BeaWMVgeHu3fjzTfia8o2tUx09tRdfaNaibgic33QRul5H0ClZcGRYX73WnwDcqog5Jts8edicDWqDg/640?wx_fmt=1&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1" style="width: 50%; margin-bottom: 20px;"></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Anyone who runs a website will be familiar with dead links. Deleting content or redesigning pages easily creates them. Besides hurting the user experience, too many dead links can also drag down the site's overall weight or ranking.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The dead link submission tool on the Baidu Webmaster Platform lets you submit the dead links on your site (protocol dead links, 404 pages) so that they are removed quickly, which helps the site's SEO. Filling in the dead link file by hand, one URL at a time, is tedious, and at work we prefer to automate anything that is cumbersome. This article therefore shows how to collect a site's dead links under Apache with a shell script, so that they are ready to submit.</p>
    <img src="https://mmbiz.qpic.cn/mmbiz_png/K0TMNq37VN3vXnvmx01reEcxE9yX710bVphjUngqvkFL8jDBdB9vKs1M8K32ru9nxKQfVtt0hXmRwf0CedXH2w/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"><span style="color: black;">1. Configuring Apache to record search engine crawlers</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Apache is one of the most widely used web servers for building websites. With the default common log format, however, its access log does not record the User-Agent header, so requests from the crawlers of Baidu, Google and other search engines cannot be told apart from ordinary visits. The first step is therefore to change the Apache configuration file.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Open the Apache configuration file httpd.conf and find the following two lines:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">CustomLog "logs/access_log" common</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">#CustomLog "logs/access_log" combined</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The default is common. Comment out the common line by putting # in front of it, remove the # in front of the combined line, then save the file and restart the Apache service.</p>
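    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">To see what the switch changes, here is a small sketch that splits one request on whitespace the way awk does. Both log lines are made-up samples (the IP address, timestamp and /old-page.html are assumptions, not taken from a real log). The common line ends after the response size, while the combined line also carries the referer and the user agent, which puts the Baiduspider URL in field 15.</p>

```shell
#!/bin/bash
# Two hypothetical log lines for the same request, one per log format.
common_line='203.0.113.5 - - [25/Aug/2024:10:00:00 +0800] "GET /old-page.html HTTP/1.1" 404 196'
combined_line='203.0.113.5 - - [25/Aug/2024:10:00:00 +0800] "GET /old-page.html HTTP/1.1" 404 196 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"'

# Print the field count plus the request path ($7), status code ($9)
# and the 15th whitespace-separated field, which is empty in the common format.
show_fields() {
  printf '%s\n' "$1" |
    awk '{ printf "fields=%d url=%s status=%s ua_token=%s\n", NF, $7, $9, $15 }'
}

show_fields "$common_line"
show_fields "$combined_line"
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">If your own LogFormat differs from the stock combined definition, the field numbers shift, so it is worth running a real line from your log through the same check.</p>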
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Note: if your server hosts several sites and each site has its own configuration file, you only need to set the CustomLog directive in the configuration file of the site in question, for example:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">vim /usr/local/apache/conf/vhost/www.chanzhi.org.conf</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp; &nbsp;ServerAdmin</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp; &nbsp;DocumentRoot "/data/wwwroot/www.chanzhi.org"</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp; &nbsp;ServerName www.chanzhi.org</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp; &nbsp;ServerAlias chanzhi.org</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp; &nbsp;ErrorLog "/data/wwwlogs/www.chanzhi.org_error_apache.log"</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp; &nbsp;CustomLog "/data/wwwlogs/www.chanzhi.org_apache.log" combined</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp; &nbsp;SetOutputFilter DEFLATE</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp; &nbsp;Options FollowSymLinks ExecCGI</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp; &nbsp;Require all granted</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp; &nbsp;AllowOverride All</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp; &nbsp;Order allow,deny</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp; &nbsp;Allow from all</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp; &nbsp;DirectoryIndex index.html index.php</p>
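    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">With many per-site files it is easy to miss one that still logs in the common format. The sketch below lists every active CustomLog directive under a directory so the format of each site can be checked at a glance. The function name list_customlog and the two demo files are made up for illustration, and it relies on a grep that supports -r (GNU and BSD grep both do).</p>

```shell
#!/bin/bash
# Print every active (uncommented) CustomLog directive under a directory,
# prefixed with the file name and line number.
list_customlog() {
  grep -rnE '^[[:space:]]*CustomLog[[:space:]]' "$1"
}

# Demo on a throwaway directory with two hypothetical site files.
demo_dir=$(mktemp -d)
printf '%s\n' \
  '    CustomLog "/data/wwwlogs/site-a.log" common' \
  '    #CustomLog "/data/wwwlogs/site-a.log" combined' > "$demo_dir/site-a.conf"
printf '%s\n' \
  '    CustomLog "/data/wwwlogs/site-b.log" combined' > "$demo_dir/site-b.conf"

# site-a shows up with "common", site-b with "combined"; the commented line is skipped.
list_customlog "$demo_dir"
rm -rf "$demo_dir"
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">On the server from the example above you would pass the vhost directory instead, i.e. list_customlog /usr/local/apache/conf/vhost.</p>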
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Here is what the site's log entries look like before and after the change:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Before:</strong></p><img src="https://mmbiz.qpic.cn/mmbiz_png/K0TMNq37VN3vXnvmx01reEcxE9yX710bIIKXYzB6kaoRUktgJGLwibGGFA4EPhYBQxHvBSWeDhrbKSO5uYlPwdg/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;">
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">After:</strong></p><img src="https://mmbiz.qpic.cn/mmbiz_png/K0TMNq37VN3vXnvmx01reEcxE9yX710bl4WpsmAjYdNwxHHfJGLM81FOtdmK5LlX38w7EAzJFKNKbp0ECgIfSQ/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;"><span style="color: black;">2. Writing the shell script</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">We use a shell script to pull the chosen crawler's requests out of the site log and collect them in one file for later use. The code is below. Save it as, for example, deathlink.sh:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">#!/bin/bash</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"># Initialize variables</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"># Spider UA string (Baiduspider by default)</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">UA='+http://www.baidu.com/search/spider.html'</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"># Yesterday's date (used in the Apache log file name)</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">DATE=$(date +%Y%m%d -d "1 day ago")</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"># Log file path</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">logfile=/data/wwwlogs/www.chanzhi.org_apache.log-${DATE}.log</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"># Path of the dead link file</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">deathfile=/data/wwwroot/www.chanzhi.org/deathlink.txt</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"># Site URL</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">website=http://www.chanzhi.org</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"># Parse the log and save the dead links; index() compares the UA as plain text, so its leading + is not read as regex syntax</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">for url in $(awk -v str="${UA}" '$9=="404" &amp;&amp; index($15, str) {print $7}' "${logfile}")</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">do</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"> &nbsp;grep -qxF -- "${website}${url}" "${deathfile}" || echo "${website}${url}" &gt;&gt;"${deathfile}"</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">done</p>
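    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">To try the approach without touching live logs, the following sketch wraps the same logic in a function and runs it on a throwaway sample log. It is a variant under stated assumptions rather than the script above verbatim: the function name collect_dead_links, the sample log lines and www.example.com are invented for the demo, the user agent is matched as a plain substring of the whole line with index(), and grep -qxF compares whole lines as fixed strings so that /a.html is not mistaken for /a.html.bak.</p>

```shell
#!/bin/bash
# Append to the dead link file every 404 URL requested by a client whose
# log line contains the given user-agent substring.
# Arguments: log file, dead link file, site base URL, user-agent substring.
collect_dead_links() {
  local logfile="$1"
  local deathfile="$2"
  local website="$3"
  local ua="$4"
  touch "$deathfile"
  # index() is a plain substring test, so "+" and "." in the UA are literal.
  awk -v ua="$ua" '$9 == "404" && index($0, ua) { print $7 }' "$logfile" |
  while IFS= read -r url; do
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF -- "${website}${url}" "$deathfile" ||
      printf '%s\n' "${website}${url}" >> "$deathfile"
  done
}

# Demo: four made-up combined-format lines. Only /a.html is a 404 seen by Baiduspider.
demo_dir=$(mktemp -d)
cat > "$demo_dir/access.log" <<'EOF'
203.0.113.5 - - [25/Aug/2024:10:00:00 +0800] "GET /a.html HTTP/1.1" 404 196 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
203.0.113.5 - - [25/Aug/2024:10:05:00 +0800] "GET /a.html HTTP/1.1" 404 196 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
203.0.113.5 - - [25/Aug/2024:10:06:00 +0800] "GET /ok.html HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
198.51.100.7 - - [25/Aug/2024:10:07:00 +0800] "GET /b.html HTTP/1.1" 404 196 "-" "Mozilla/5.0 (X11; Linux x86_64)"
EOF
collect_dead_links "$demo_dir/access.log" "$demo_dir/deathlink.txt" \
  "http://www.example.com" "+http://www.baidu.com/search/spider.html"
cat "$demo_dir/deathlink.txt"
rm -rf "$demo_dir"
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Reading the URLs with while read instead of a for loop over command output also keeps a URL containing shell wildcard characters from being expanded by the shell.</p>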
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">When you use the script, adjust the paths and field numbers to match your own server, then run it:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">bash deathlink.sh</p><span style="color: black;">3. Submitting the dead links</span>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">After the script has run, the specified directory contains a file listing every 404 link it found, one link per line. For example:</p>
    <img src="https://mmbiz.qpic.cn/mmbiz_png/K0TMNq37VN3vXnvmx01reEcxE9yX710bv08QKqJPUqktj83RvVrM5Wl7hDgicBEQ3C18VPVEVia6YAtvJtf3z0SA/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;">
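    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Before submitting, it can be worth a quick sanity check that every line really is a full link on your own site. The sketch below counts the lines and flags any that do not start with the site URL. The function name check_dead_link_file and the sample entries are assumptions made for illustration.</p>

```shell
#!/bin/bash
# Report how many lines the dead link file holds and list any line
# that does not start with the site URL.
# Arguments: dead link file, site base URL.
check_dead_link_file() {
  awk -v site="$2" '
    index($0, site) != 1 { bad[++n] = $0 }
    END {
      printf "%d links, %d unexpected\n", NR, n
      for (i = 1; i <= n; i++) print "unexpected: " bad[i]
    }
  ' "$1"
}

# Demo with a throwaway file: two well-formed links and one bare path.
demo_file=$(mktemp)
printf '%s\n' \
  'http://www.example.com/a.html' \
  'http://www.example.com/b.html' \
  '/c.html' > "$demo_file"
check_dead_link_file "$demo_file" "http://www.example.com"
rm -f "$demo_file"
```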
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Finally, on the Webmaster Platform's dead link submission page, enter the URL of your dead link file. For example:</p>
    <img src="https://mmbiz.qpic.cn/mmbiz_png/K0TMNq37VN3vXnvmx01reEcxE9yX710bEDVQwD6BjxFib3NdYJdibictv0Eq1uMTaoyeYLwY7iayib7mHtDbFpabA0Q/640?wx_fmt=png&amp;tp=webp&amp;wxfrom=5&amp;wx_lazy=1&amp;wx_co=1" style="width: 50%; margin-bottom: 20px;">
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Once Baidu has reviewed and approved the submission, it removes the dead links it had already indexed, so that the broken pages no longer harm the site.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">Summary:</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">This article showed how, in an Apache environment, a shell script can automatically collect the dead links fetched by Baiduspider and other crawlers and write them to a single file for submission to the search engine. If you know a better method or have questions, you are welcome to share and discuss them.</p>




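4lqedz posted on 2024-10-8 01:27:36

"Sofa" (SF, forum slang for the first person to reply)

qzmjef posted on 2024-10-8 13:17:15

A forum for posting external links and learning web optimization and SEO.

qzmjef posted on 2024-10-21 15:18:14

The original poster's article is very worthwhile and improved my knowledge.

j8typz posted on 2024-10-31 08:48:50

I completely agree with your view; it is very well thought through.

7wu1wm0 posted on 2024-11-13 10:52:59

Thanks, thank you, much gratitude, appreciate the hard work, glad to have you, and so on.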