lbk60ox posted on 2024-8-18 05:56:20

Python 3: a guide to requests


    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">requests is a widely used module for making HTTP requests. It is written in Python, makes fetching web pages straightforward, and is a good HTTP library to start with when learning web scraping in Python.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">1. Installing the requests module</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Press WIN+R, run cmd, and enter pip install requests.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">If you run into a read timeout, see the linked article: <a style="color: black;">An introduction to you-get and how to use it from Python to download web videos</a>, which describes installing you-get through the Douban mirror; the same approach applies here.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">2. Using the requests module</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">2.1 The main methods of the requests library:</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/pgc-image/c484fd6ca66b4fa1b7e00c4e6fe7fe38~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723900321&amp;x-signature=FPIDMcIAykN2BEOgabgkxPr%2FhQ4%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">(1) requests.get()</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">This is one of the methods we use most often, and understanding it makes the other methods easy to follow, so we describe it in detail. Its parameters are:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">r = requests.get(url, params=None, **kwargs)</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">url: the address of the page to fetch.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">params: optional; a dict or byte sequence of extra parameters that is appended to the URL. Its key-value pairs are added in the form ?key1=value1&amp;key2=value2. For example: kw = {"key1": "value1", "key2": "value2"}; r = requests.get("http://www.python123.io/ws", params=kw)</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">**kwargs: 12 optional parameters that control the request.</p>
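    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">To see how params ends up in the URL without touching the network, the request can be built and prepared but never sent. This is only a sketch: it uses requests.Request(...).prepare() rather than requests.get(), and the URL and keys are placeholder values.</p>

```python
import requests

# Build the request without sending it, so the final URL can be inspected offline.
kw = {"key1": "value1", "key2": "value2"}
prepared = requests.Request("GET", "http://python123.io/ws", params=kw).prepare()

# prepared.url holds the address with the query string already appended.
print(prepared.url)
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Calling requests.get() with the same url and params should request that same address.</p>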
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">a. Sending a GET request without parameters:</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/pgc-image/f292599f299645e1b81922a55aba8f97~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723900321&amp;x-signature=b5V3bUvntqlQ1L04bTFLReujcFg%3D" style="width: 50%; margin-bottom: 20px;"></div>
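    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">In text form, a parameterless GET might look like the sketch below. The helper name fetch_status and the timeout value are assumptions made for illustration, not taken from the screenshot.</p>

```python
import requests


def fetch_status(url):
    """Send a GET request with no parameters and return the HTTP status code."""
    # timeout is optional, but it avoids waiting forever on an unresponsive server.
    r = requests.get(url, timeout=10)
    return r.status_code


# Example call (needs network access):
# print(fetch_status("http://httpbin.org/get"))
```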
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">b. Sending a GET request with parameters:</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/pgc-image/8eeebe237c6a4618a0ce45ac435babb6~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723900321&amp;x-signature=VVZVIXIc%2BQpZxmSMTw4Khl2mw0o%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">As shown above, GET parameters are passed with the params keyword argument.</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">In addition,</strong> a list can be passed as the value of a request parameter:</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/pgc-image/06bb146a479e4713832619e5e3485cdc~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723900321&amp;x-signature=RryTqCpIQe9aRe0YpRUKT2txUJw%3D" style="width: 50%; margin-bottom: 20px;"></div>
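    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">A list value is expanded into one key=value pair per item. The sketch below prepares the request without sending it so the resulting URL can be checked offline; the keys and values are placeholders.</p>

```python
import requests

# key2 carries a list, so it should appear once per item in the query string.
payload = {"key1": "value1", "key2": ["value2", "value3"]}
prepared = requests.Request("GET", "http://httpbin.org/get", params=payload).prepare()

print(prepared.url)
```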
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">That is the basic form of a GET request.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">**kwargs accepts the following parameters:</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">data: a dict, byte sequence, or file object (a string is also accepted), sent as the content of the request, mainly when providing or submitting a resource to the server. Unlike params, data is not placed in the URL; it is carried in the request body.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">json: data in JSON format, submitted to the server as the request body. JSON is very common in HTML- and HTTP-related web development and is one of the data formats most often used over HTTP. For example: kv = {"key1": "value1"}; r = requests.post("http://python123.io/ws", json=kv)</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">headers: a dict of HTTP header fields sent with the request. It lets you customize the headers, for instance to imitate a particular browser when accessing a URL. For example: hd = {"user-agent": "Chrome/10"}; r = requests.post("http://python123.io/ws", headers=hd)</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">cookies: a dict or CookieJar holding the cookies for the request.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">auth: a tuple, used for HTTP authentication.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">files: a dict, used when uploading files to the server. For example: fs = {"files": open("data.txt", "rb")}; r = requests.post("http://python123.io/ws", files=fs)</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">timeout: the timeout in seconds. If the request gets no response within this time, a timeout exception is raised.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">proxies: a dict, used to route the request through proxy servers.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">allow_redirects: a switch indicating whether redirects are followed; defaults to True.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">stream: a switch controlling whether the response body is downloaded immediately. It defaults to False, which downloads the body right away; with True the download is deferred until the content is accessed.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">verify: a switch for verifying the SSL certificate; defaults to True.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">cert: the path to a local SSL certificate.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The response object (r) has the following attributes:</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/pgc-image/acd5ebb996974180b3e85da2022e830a~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723900321&amp;x-signature=aWy%2F3pSOYIV0uNBo%2BF6fc86NHdk%3D" style="width: 50%; margin-bottom: 20px;"></div>
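    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">As a text companion to the table, the sketch below gathers several attributes that requests.Response exposes. The helper name summarize_response is made up for illustration, and the attributes chosen here may not match the screenshot exactly.</p>

```python
import requests


def summarize_response(r):
    """Collect commonly used attributes of a requests.Response into a dict."""
    return {
        "status_code": r.status_code,  # HTTP status, e.g. 200 or 404
        "encoding": r.encoding,        # encoding used to decode r.text
        "text": r.text,                # body decoded to str
        "content": r.content,          # raw body as bytes
        "headers": dict(r.headers),    # response headers as a plain dict
    }


# Example call (needs network access):
# print(summarize_response(requests.get("http://httpbin.org/get", timeout=10)))
```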
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">You can click the WIN button at the lower left of the desktop, find the Python installation, and open IDLE to try out the response attributes yourself.</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/pgc-image/28a1b793045b476b8bc53cca44378a98~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723900321&amp;x-signature=z7xBjS7XiXj9iOHGoXEIzwoZvVU%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p26-sign.toutiaoimg.com/pgc-image/f3bfd3102d7f44c096365e6c0c0881e6~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723900321&amp;x-signature=KtzUD7K3hyxfyO8uftyHdBMS1c4%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Exceptions in the requests library</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Note that requests sometimes raises exceptions, such as network connection errors, HTTP errors, redirect errors, and request timeouts. We therefore need to check whether r.status_code is 200. So how do we catch these exceptions?</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">We can call r.raise_for_status(). It checks r.status_code internally and raises requests.HTTPError when the status indicates an error (a 4xx or 5xx code); otherwise it does nothing.</p>
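    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Which status codes make raise_for_status() raise can be checked offline by setting status_code on an empty Response object. This is a sketch for experimentation only; the helper name is_error_status is an assumption, and real responses come from an actual request.</p>

```python
import requests


def is_error_status(status_code):
    """Return True if raise_for_status() raises for the given status code."""
    r = requests.Response()
    r.status_code = status_code
    try:
        r.raise_for_status()
    except requests.HTTPError:
        return True
    return False


for code in (200, 302, 404, 500):
    print(code, is_error_status(code))
```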
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">This gives us a general-purpose code framework for fetching a web page:</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p26-sign.toutiaoimg.com/pgc-image/c145a47fea214bfdae6dcd6d0f55a2da~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723900321&amp;x-signature=Pxt6tH1scCS31zEHLOvQVXvypYE%3D" style="width: 50%; margin-bottom: 20px;"></div>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">return cannot be used here to hand back the output; doing so raises a SyntaxError, because <strong style="color: blue;">return can only be used inside a function definition</strong>.</p>
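    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Once the framework is wrapped in a function, return becomes available. The version below is a sketch under assumptions: the function name get_html_text, the 30-second timeout, and the error message are illustrative choices and may differ from the screenshot.</p>

```python
import requests


def get_html_text(url):
    """Fetch url and return its text, or a short message if anything goes wrong."""
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()              # raises requests.HTTPError on 4xx/5xx
        r.encoding = r.apparent_encoding  # guess the encoding from the body
        return r.text
    except requests.RequestException as exc:
        # RequestException is the base class for connection errors,
        # timeouts, HTTP errors, and too many redirects.
        return f"request failed: {exc}"


# Example call (needs network access):
# print(get_html_text("http://httpbin.org/get")[:200])
```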
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">(2) requests.head()</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">See the code:</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/pgc-image/9ceb45d4e8bb4a988b15303e3616c795~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723900321&amp;x-signature=qi9%2FwAu8YMRpcRYNQOWGXOkyypw%3D" style="width: 50%; margin-bottom: 20px;"></div>
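    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">In text form, a HEAD request could be sketched as follows. The helper name fetch_headers and the timeout are assumptions for illustration.</p>

```python
import requests


def fetch_headers(url):
    """Send a HEAD request and return the response headers as a plain dict."""
    r = requests.head(url, timeout=10)
    # A HEAD response carries headers only, so r.text is expected to be empty.
    return dict(r.headers)


# Example call (needs network access):
# print(fetch_headers("http://httpbin.org/get"))
```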
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">(3) requests.post()</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">1. POST a dict to a URL:</p>
    <div style="color: black; text-align: left; margin-bottom: 10px;"><img src="https://p3-sign.toutiaoimg.com/pgc-image/9c1fe498c1e5413f9b3ee3ea95506b77~noop.image?_iz=58558&amp;from=article.pc_detail&amp;lk3s=953192f4&amp;x-expires=1723900321&amp;x-signature=Mf5IimadiW2HkH%2BYJr%2FbtrzYj6A%3D" style="width: 50%; margin-bottom: 20px;"></div>
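    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">When data is a dict, requests encodes it as a form. The sketch below prepares such a POST without sending it, so the encoded body and the Content-Type header can be inspected offline; the payload is a placeholder.</p>

```python
import requests

payload = {"key1": "value1", "key2": "value2"}
prepared = requests.Request("POST", "http://httpbin.org/post", data=payload).prepare()

# A dict passed as data is form-encoded into the request body.
print(prepared.body)
print(prepared.headers["Content-Type"])
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">Sending the same request with requests.post(url, data=payload) to httpbin.org should echo these pairs back under "form".</p>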
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">2. POST a string to a URL; it is sent as the request data:</p>
    &gt;&gt;&gt; import requests
    &gt;&gt;&gt; r = requests.post("http://httpbin.org/post", data="hello python")
    &gt;&gt;&gt; print(r.text)
    {
      "args": {},
      "data": "hello python",
      "files": {},
      "form": {},
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Content-Length": "12",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.22.0",
        "X-Amzn-Trace-Id": "Root=1-5e510c68-12ee4eec533847d89a06d184"
      },
      "json": null,
      "origin": "36.47.128.206",
      "url": "http://httpbin.org/post"
    }
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">3. POST a file to a URL:</p>
    &gt;&gt;&gt; import requests
    &gt;&gt;&gt; files = {"files": open("C:\\Users\\Think\\Desktop\\test_requests\\test.txt", "rb")}
    &gt;&gt;&gt; r = requests.post("https://httpbin.org/post", files=files)
    &gt;&gt;&gt; print(r.text)
    {
      "args": {},
      "data": "",
      "files": {
        "files": "hello worle!"
      },
      "form": {},
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Connection": "close",
        "Content-Length": "158",
        "Content-Type": "multipart/form-data; boundary=d2fb307f28aeb57b932d867f80f2f600",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.19.1"
      },
      "json": null,
      "origin": "113.65.2.187",
      "url": "https://httpbin.org/post"
    }
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">As shown above, POST request data is passed with the data keyword argument, and file uploads with the files keyword argument.</p>
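    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The difference between a string body and a file upload can also be seen offline by preparing both requests without sending them. In this sketch io.BytesIO stands in for a real file opened with open(path, "rb"), and the file name test.txt is only a label.</p>

```python
import io

import requests

url = "http://httpbin.org/post"

# A string passed as data is sent as the body unchanged.
text_req = requests.Request("POST", url, data="hello python").prepare()

# io.BytesIO stands in for a file opened with open(path, "rb").
fake_file = io.BytesIO(b"hello world!")
file_req = requests.Request("POST", url, files={"files": ("test.txt", fake_file)}).prepare()

print(text_req.body)
print(text_req.headers["Content-Length"])
print(file_req.headers["Content-Type"])
```

    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The file upload is encoded as multipart/form-data with a generated boundary, which is why its Content-Type differs from run to run.</p>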
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">(5) requests.put()</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">See the code:</p>
    &gt;&gt;&gt; payload = {"key1": "value1", "key2": "value2"}
    &gt;&gt;&gt; r = requests.put("http://httpbin.org/put", data=payload)
    &gt;&gt;&gt; print(r.text)
    {
      "args": {},
      "data": "",
      "files": {},
      "form": {
        "key1": "value1",
        "key2": "value2"
      },
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Connection": "close",
        "Content-Length": "23",
        "Content-Type": "application/x-www-form-urlencoded",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.18.4"
      },
      "json": null,
      "origin": "218.197.153.150",
      "url": "http://httpbin.org/put"
    }
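    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The Content-Length of 23 in the output can be reproduced offline by preparing the same PUT request without sending it. This is a sketch using requests.Request(...).prepare() in place of requests.put().</p>

```python
import requests

payload = {"key1": "value1", "key2": "value2"}
prepared = requests.Request("PUT", "http://httpbin.org/put", data=payload).prepare()

# The form-encoded body is 23 characters long, matching the echoed header.
print(prepared.method)
print(prepared.body)
print(prepared.headers["Content-Length"])
```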
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">(6) requests.patch()</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">requests.patch is similar to requests.put. The difference is that with patch we only submit the fields that need to change, whereas with put all of the resource's fields (say, 20 of them) must be submitted to the URL together, and any field left out is deleted. The advantage of patch is that it saves network bandwidth.</p>
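    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">The difference is easiest to see by simulating what a server might do with each request. The two helpers below are hypothetical and run entirely locally; a real server decides for itself how it handles PUT and PATCH.</p>

```python
def apply_put(stored, submitted):
    """PUT replaces the whole resource: fields that are not submitted are lost."""
    return dict(submitted)


def apply_patch(stored, submitted):
    """PATCH updates only the submitted fields and keeps the rest."""
    merged = dict(stored)
    merged.update(submitted)
    return merged


user = {"name": "alice", "email": "alice@example.com", "city": "Xi'an"}
change = {"name": "bob"}

print(apply_put(user, change))
print(apply_patch(user, change))
```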
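    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;"><strong style="color: blue;">(7) requests.request()</strong></p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">requests.request() can send any of the requests that the other methods send. requests.request(method, url, **kwargs)</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">method: "GET", "HEAD", "POST", "PUT", "PATCH", and so on.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">url: the address to request.</p>
    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">**kwargs: the parameters that control the request.</p>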
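    <p style="font-size: 16px; color: black; line-height: 40px; text-align: left; margin-bottom: 15px;">A thin wrapper shows how one function can stand in for the individual methods. The helper name call and its default timeout are assumptions made for this sketch.</p>

```python
import requests


def call(method, url, **kwargs):
    """Send a request with the given HTTP method through requests.request()."""
    # Apply a default timeout unless the caller supplied one.
    kwargs.setdefault("timeout", 10)
    return requests.request(method, url, **kwargs)


# Example calls (need network access):
# call("GET", "http://httpbin.org/get", params={"key1": "value1"})
# call("PUT", "http://httpbin.org/put", data={"key1": "value1"})
```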




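星☆雨 posted on 2024-8-29 21:40:48

Well said! After all these years knocking around backlink forums I have seen my share of people; even if I have never seen a pig walk, I at least know what pork tastes like.

qzmjef posted on 2024-10-1 07:22:31

Looking forward to the original poster's next post!

j8typz posted on 2024-10-15 09:17:18

I really learned a lot from this article. Thanks for sharing!

nykek5i posted on 2024-10-22 04:10:05

The original poster has dropped their integrity; better pick it up quickly!

nqkk58 posted on 2024-10-28 16:18:46

Discussion shines like starlight, lighting up the night sky of thought.

4lqedz posted on 2024-11-13 16:29:15

"Bench" (the third person to reply)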