Why does my program produce no output when I run it? (python 吧, 百度贴吧)
By rights it should print "adult", shouldn't it? Or is it just that it can't run here?
I ran your program, and it works fine.
In the IDLE shell, only one statement runs at a time — the first one — because it is a single-line command window. To run multiple statements, you can run them one at a time, nest them inside a loop, or define a function and then call it. Normally you would create a new file from the File menu, edit and save it, and then run the program, since that window is an editor window.
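To illustrate the reply above, here is a minimal sketch of the "define a function, then call it" approach; the function, variable, and threshold are hypothetical, since the OP's program itself was not preserved in the thread:

def check_age(age):
    # several statements grouped into one function, so the whole
    # block can be run from the IDLE shell with a single call
    if age >= 18:
        print("adult")
    else:
        print("minor")

check_age(20)  # prints: adult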
I wrote Conway's Game of Life in Python; could someone help me figure out why the result is wrong? (python 吧, 百度贴吧)
It runs without errors; the problem is that the final pattern differs from what I expected, so it must be a logic error. Could someone look over the source for me? Many cells that should die off never disappear.

import random
import time
import os

class Worm(object):
    def __init__(self, posx, posy):
        self.posx = posx
        self.posy = posy
        self.stat = random.randint(0, 1)

    def birth(self):
        self.stat = 1

    def death(self):
        self.stat = 0

    def envir(self):
        self.life_num = 0
        for x in range(self.posx - 1, self.posx + 2):
            for y in range(self.posy - 1, self.posy + 2):
                if(x >= 0 and x < xmax and y >= 0 and y < ymax):
                    if(x != self.posx and y != self.posy):
                        self.life_num += worm_list[y][x].stat

    def update(self):
        if(self.stat):
            if(self.life_num < 2):
                self.death()
            elif(self.life_num > 3):
                self.death()
        if(self.life_num == 3):
            self.birth()

    def show(self):
        if(self.stat):
            print("▇▇", end='')
        else:
            print("  ", end='')

xmax = int(input("input the width for the game:"))
ymax = int(input("\ninput the height for the game:"))
worm_list = [[0 for lie in range(xmax)] for row in range(ymax)]
for y in range(0, ymax):
    for x in range(0, xmax):
        worm_list[y][x] = Worm(x, y)
while 1:
    for y in range(0, ymax):
        for x in range(0, xmax):
            worm_list[y][x].show()
            worm_list[y][x].envir()
        print('')
    print('')
    for y in range(0, ymax):
        for x in range(0, xmax):
            worm_list[y][x].update()
    #worm_list[random.randint(0,ymax-1)][random.randint(0,xmax-1)].stat = random.randint(0,1)
    time.sleep(0.5)
    i = os.system('clear')
Bump!
Finally found the problem: in the neighbourhood check, one line had or written as and. My logic clearly still needs work.
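For reference, a minimal sketch of the fix the OP describes, applied to the neighbour check in envir() above: to skip only the centre cell itself, the two inequality tests must be joined with or (not (x == posx and y == posy) is equivalent to x != posx or y != posy); with and, every cell sharing a row or column with the centre is skipped too, so neighbour counts come out too low and cells that should die survive.

if x >= 0 and x < xmax and y >= 0 and y < ymax:
    if x != self.posx or y != self.posy:  # 'or' here, not 'and'
        self.life_num += worm_list[y][x].stat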
Python-OpenCV Video Processing (1): Input and Output - Python - 伯乐在线
Processing video is similar to processing images; the difference is that video means processing a continuous series of images.
There are generally two video sources: loading a video file from disk, or capturing frames from a camera.
0x00. Reading Video from a File
cv.CaptureFromFile()
Code example:
import cv2.cv as cv

capture = cv.CaptureFromFile('myvideo.avi')

nbFrames = int(cv.GetCaptureProperty(capture, cv.CV_CAP_PROP_FRAME_COUNT))

#CV_CAP_PROP_FRAME_WIDTH Width of the frames in the video stream
#CV_CAP_PROP_FRAME_HEIGHT Height of the frames in the video stream

fps = cv.GetCaptureProperty(capture, cv.CV_CAP_PROP_FPS)

wait = int(1/fps * 1000/1) #Time to wait between frame queries, in ms

duration = nbFrames / fps #Total length of the video in seconds

print 'Num. Frames = ', nbFrames
print 'Frame Rate = ', fps, 'fps'
print 'Duration = ', duration, 'sec'

for f in xrange(nbFrames):
    frameImg = cv.QueryFrame(capture)
    print cv.GetCaptureProperty(capture, cv.CV_CAP_PROP_POS_FRAMES)
    cv.ShowImage("The Video", frameImg)
    cv.WaitKey(wait)
import numpy as np
import cv2

cap = cv2.VideoCapture('vtest.avi')

while(cap.isOpened()):
    ret, frame = cap.read()
    if not ret:  # stop once the video runs out of frames
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    cv2.imshow('frame', gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
0x01. Reading Video from a Camera
Core function:
cv.CaptureFromCAM()
Sample code:
import cv2.cv as cv

capture = cv.CaptureFromCAM(0)

while True:
    frame = cv.QueryFrame(capture)
    cv.ShowImage("Webcam", frame)
    c = cv.WaitKey(1)
    if c == 27: #Esc on Windows
        break
import numpy as np
import cv2

cap = cv2.VideoCapture(0)

while(True):
    # Capture frame-by-frame
    ret, frame = cap.read()

    # Our operations on the frame come here
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Display the resulting frame
    cv2.imshow('frame', gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()
0x02. Writing Video
Recording video from the camera:
import cv2.cv as cv

capture = cv.CaptureFromCAM(0)
temp = cv.QueryFrame(capture)
writer = cv.CreateVideoWriter("output.avi", cv.CV_FOURCC("D", "I", "B", " "), 5, cv.GetSize(temp), 1)
#On linux I used to take "M","J","P","G" as fourcc

count = 0
while count < 50:
    print count
    image = cv.QueryFrame(capture)
    cv.WriteFrame(writer, image)
    cv.ShowImage('Image_Window', image)
    cv.WaitKey(1)
    count += 1
Reading a video from a file and re-saving it:
import cv2.cv as cv

capture = cv.CaptureFromFile('img/mic.avi')

nbFrames = int(cv.GetCaptureProperty(capture, cv.CV_CAP_PROP_FRAME_COUNT))
width = int(cv.GetCaptureProperty(capture, cv.CV_CAP_PROP_FRAME_WIDTH))
height = int(cv.GetCaptureProperty(capture, cv.CV_CAP_PROP_FRAME_HEIGHT))
fps = cv.GetCaptureProperty(capture, cv.CV_CAP_PROP_FPS)
codec = cv.GetCaptureProperty(capture, cv.CV_CAP_PROP_FOURCC)

wait = int(1/fps * 1000/1) #Compute the time to wait between each frame query

duration = nbFrames / fps #Compute duration in seconds

print 'Num. Frames = ', nbFrames
print 'Frame Rate = ', fps, 'fps'

writer = cv.CreateVideoWriter("img/new.avi", int(codec), int(fps), (width, height), 1) #Create writer with same parameters

cv.SetCaptureProperty(capture, cv.CV_CAP_PROP_POS_FRAMES, 80) #Jump to frame 80

for f in xrange(nbFrames - 80): #Record everything after the first 80 frames
    frame = cv.QueryFrame(capture)
    print cv.GetCaptureProperty(capture, cv.CV_CAP_PROP_POS_FRAMES)
    cv.WriteFrame(writer, frame)
    cv.WaitKey(wait)
import numpy as np
import cv2

cap = cv2.VideoCapture(0)

# Define the codec and create VideoWriter object
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 20.0, (640, 480))

while(cap.isOpened()):
    ret, frame = cap.read()
    if ret == True:
        frame = cv2.flip(frame, 0)

        # write the flipped frame
        out.write(frame)

        cv2.imshow('frame', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break

# Release everything if job is finished
cap.release()
out.release()
cv2.destroyAllWindows()
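One caveat about the recording examples above: cv2.VideoWriter expects every frame written to it to match the (width, height) passed to its constructor, and a mismatch typically yields an empty or unplayable file. A minimal sketch (again assuming OpenCV 3+ names) that queries the camera's actual frame size instead of hard-coding (640, 480):

import cv2

cap = cv2.VideoCapture(0)
# use the camera's real frame size so written frames match the writer
size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 20.0, size)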
Reader comment: OP, are there any format requirements on the video file being read? I downloaded an .avi file that plays fine in a media player, but when I read it in code I can't get any of its properties. Any ideas?
The examples in this Python series are all based on the Python 3.4 environment.
---------@_@? --------------------------------------------------------------------
The question: what is the simplest way to fetch a web page's source code?
The approach: use the urllib library to fetch the page's source.
------------------------------------------------------------------------------------
#python3.4
import urllib.request
response = urllib.request.urlopen("/b")
print(response.read())
b'\n<!DOCTYPE html>\n<html>\n<head>\n
    <meta charset="utf-8"/>\n
    <title>\xe6\x89\xbe\xe6\x89\xbe\xe7\x9c\x8b - \xe5\x8d\x9a\xe5\xae\xa2\xe5\x9b\xad</title>
    <link rel="shortcut icon" href="/Content/Images/favicon.ico" type="image/x-icon"/>\n
    ... (rest of the page omitted) ...
</body>\n</html>\n'
For comparison, the Python 2.7 version:
#python2.7
import urllib2
response = urllib2.urlopen("/b")
print response.read()
As you can see, the code differs between Python 3.4 (urllib.request) and Python 2.7 (urllib2).
----------@_@? A problem appears! ----------------------------------------------------------------------
The problem: in the output above, the Chinese text is not displayed properly.
The fix: handle the character encoding.
--------------------------------------------------------------------------------------------------
Handling the Chinese text in the page source!!!
Modify the code as follows:
#python3.4
import urllib.request
response = urllib.request.urlopen("/b")
print(response.read().decode('UTF-8'))
Run it, and the output is:
C:\Python34\python.exe E:/pythone_workspace/mydemo/spider/demo.py
<!DOCTYPE html>
<head>
    <meta charset="utf-8"/>
    <title>找找看 - 博客园</title>
    <meta content="技术搜索,IT搜索,程序搜索,代码搜索,程序员搜索引擎" name="keywords" />
    <meta content="面向程序员的专业搜索引擎。遇到技术问题怎么办,到博客园找找看..." name="description" />
    ... (rest of the page omitted) ...
</body>
</html>

Process finished with exit code 0
Result: once the bytes are decoded, the Chinese in the page source displays correctly.
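Hard-coding 'UTF-8' works for this site. As a more general sketch, the charset can be read from the response's Content-Type header when the server declares one (the truncated "/b" URL is carried over from the code above):

#python3.4
import urllib.request

response = urllib.request.urlopen("/b")  # truncated URL, as in the code above
# use the charset declared in the Content-Type header, falling back to UTF-8
charset = response.headers.get_content_charset() or 'utf-8'
print(response.read().decode(charset))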
-----------@_@! A new Chinese-encoding question ----------------------------------------------------------
   The question: "If Chinese characters appear in the URL itself, how should that be handled?"
   For example: url = "/s?w=python爬虫&t=b"
-----------------------------------------------------------------------------------------------------
Next, let's solve the problem of Chinese characters in the URL!!!
(1) Test 1: keep the URL exactly as it is and request it directly, with no processing.
Code example:
#python3.4
import urllib.request
url="/s?w=python爬虫&t=b"
resp = urllib.request.urlopen(url)
print(resp.read().decode('UTF-8'))
Output:
C:\Python34\python.exe E:/pythone_workspace/mydemo/spider/demo.py
Traceback (most recent call last):
  File "E:/pythone_workspace/mydemo/spider/demo.py", line 9, in <module>
    response = urllib.request.urlopen(url)
  File "C:\Python34\lib\urllib\request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python34\lib\urllib\request.py", line 463, in open
    response = self._open(req, data)
  File "C:\Python34\lib\urllib\request.py", line 481, in _open
    '_open', req)
  File "C:\Python34\lib\urllib\request.py", line 441, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 1210, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "C:\Python34\lib\urllib\request.py", line 1182, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "C:\Python34\lib\http\client.py", line 1088, in request
    self._send_request(method, url, body, headers)
  File "C:\Python34\lib\http\client.py", line 1116, in _send_request
    self.putrequest(method, url, **skips)
  File "C:\Python34\lib\http\client.py", line 973, in putrequest
    self._output(request.encode('ascii'))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 15-16: ordinal not in range(128)

Process finished with exit code 1
Sure enough, it fails!!!
(2) Test 2: quote only the Chinese part.
Code example:
import urllib.request
import urllib.parse
url = "/s?w=python"+ urllib.parse.quote("爬虫")+"&t=b"
resp = urllib.request.urlopen(url)
print(resp.read().decode('utf-8'))
Output:
C:\Python34\python.exe E:/pythone_workspace/mydemo/spider/demo.py
<!DOCTYPE html>
<meta charset="utf-8" />
<title>python爬虫-博客园找找看</title>
    ...
博客园找找看,找到相关内容<b id="CountOfResults">1491</b>篇,用时132毫秒
    ...
<h3 class="searchItemTitle">
    <a target="_blank" href="/hearzeus/p/5238867.html"><strong>Python 爬虫</strong>入门——小项目实战(自动私信博客园某篇博客下的评论人,随机发送一条笑话,完整代码在博文最后)</a>
</h3>
    ... (remaining search results and page markup omitted) ...

Process finished with exit code 0
Result: with the Chinese part of the URL quoted separately, the page is fetched successfully.
------@_@! Yet another new question -----------------------------------------------------------
The question: what if we quote the Chinese and English parts of the URL together? Will the fetch still succeed?
----------------------------------------------------------------------------------------
(3) And so, Test 3: quote the whole URL, Chinese and English together.
Code example:
#python3.4
import urllib.request
import urllib.parse
url = urllib.parse.quote("/s?w=python爬虫&t=b")
resp = urllib.request.urlopen(url)
print(resp.read().decode('utf-8'))
Output:
C:\Python34\python.exe E:/pythone_workspace/mydemo/spider/demo.py
Traceback (most recent call last):
  File "E:/pythone_workspace/mydemo/spider/demo.py", line 21, in <module>
    resp = urllib.request.urlopen(url)
  File "C:\Python34\lib\urllib\request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python34\lib\urllib\request.py", line 448, in open
    req = Request(fullurl, data)
  File "C:\Python34\lib\urllib\request.py", line 266, in __init__
    self.full_url = url
  File "C:\Python34\lib\urllib\request.py", line 292, in full_url
    self._parse()
  File "C:\Python34\lib\urllib\request.py", line 321, in _parse
    raise ValueError("unknown url type: %r" % self.full_url)
ValueError: unknown url type: 'http%3A///s%3Fw%3Dpython%E7%88%AC%E8%99%AB%26t%3Db'

Process finished with exit code 1
Result: a ValueError! The page cannot be fetched!
Putting Tests 1, 2, and 3 together, we can conclude:
(1) In Python 3.4, Chinese characters in a URL can be handled with urllib.parse.quote("爬虫").
(2) Only the Chinese portion of the URL should be quoted. Quoting the whole URL fails because quote() also escapes structural characters such as : and ?, leaving no recognizable URL scheme.
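A common alternative worth noting, though not used in this post: urllib.parse.urlencode builds the entire query string and percent-encodes every value (Chinese included) while leaving the URL's structure untouched. A minimal sketch reusing this post's search parameters:

#python3.4
import urllib.parse

# encode only the query parameters, never the scheme/host/path
params = urllib.parse.urlencode({'w': 'python爬虫', 't': 'b'})
url = "/s?" + params  # "/s" is the truncated search path from the post above
print(url)            # /s?w=python%E7%88%AC%E8%99%AB&t=b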
Tip: if you want to see what parameters a function accepts:
#python3.4
import urllib.request
help(urllib.request.urlopen)
Running the code above prints:
C:\Python34\python.exe E:/pythone_workspace/mydemo/spider/demo.py
Help on function urlopen in module urllib.request:
urlopen(url, data=None, timeout=<object object at 0x00A50490>, *, cafile=None, capath=None, cadefault=False, context=None)
Process finished with exit code 0
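A related trick: inspect.signature returns just the parameter list, without the surrounding help text. A minimal sketch:

#python3.4
import inspect
import urllib.request

# prints the parameter list, e.g. (url, data=None, timeout=..., *, cafile=None, ...)
print(inspect.signature(urllib.request.urlopen))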
@_@)Y That's the end of this post. To be continued~