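After studying a lot of Python and typing out countless lines of code, learning the language can start to feel dry and tedious. Still, on the principle that pretty girls are the best study motivation (haha), let's write a program that downloads their photos.
Today we'll use Python to scrape the photos on 唯一圖庫 (http://www.mmonly.cc/mmtp/) as a little treat for everyone. O(∩_∩)O
The overall picture quality on the site is quite good. Here are three pictures in different styles to give you a feel for it. O(∩_∩)O haha~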
import urllib.request
from bs4 import BeautifulSoup
import os

def Download(url,picAlt,name):
    ...

def run(targetUrl, beginNUM ,endNUM):
    ...
    if beginNUM ==endNUM
    ...

if __name__ == '__main__':
The program is built on Beautiful Soup, a Python library that parses HTML so you can extract data from web pages. For a detailed introduction, see this article (https://cuiqingcai.com/1319.html/comment-page-1#comments).
Install Beautiful Soup
pip install beautifulsoup4
Import the package
from bs4 import BeautifulSoup
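To make the parsing step concrete, here is a minimal sketch of how Beautiful Soup can pull out the pieces the scraper needs. The HTML fragment and the helper name parse_page are made up for illustration; only the element names (the big-pic div and the nowpage/totalpage spans) come from the program in this post, and the real page may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment, modeled on the elements the scraper looks for.
SAMPLE_HTML = """
<div id="big-pic">
  <a href="237269_2.html"><img src="http://example.com/pic/1.jpg" alt="sample album"></a>
</div>
<span class="nowpage">1</span>/<span class="totalpage">8</span>
"""


def parse_page(html):
    """Return the picture link, album name, next-page href and page counters."""
    soup = BeautifulSoup(html, "html.parser")
    # The picture sits inside <div id="big-pic"><a ...><img ...></a></div>.
    link = soup.find("div", attrs={"id": "big-pic"}).find("a")
    img = link.find("img")
    return {
        "pic_link": img["src"],
        "pic_alt": img["alt"],
        "next_href": link["href"],
        "nowpage": int(soup.find("span", attrs={"class": "nowpage"}).get_text()),
        "totalpage": int(soup.find("span", attrs={"class": "totalpage"}).get_text()),
    }


if __name__ == "__main__":
    print(parse_page(SAMPLE_HTML))
```

Trying the selectors on a small inline string like this is a cheap way to check them before pointing the program at the live site.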
Create the save path
def Download(url, picAlt, name):
    path = 'D:\\pythonD爬蟲妹子圖\\' + picAlt + '\\'
    if not os.path.exists(path):
        os.makedirs(path)
    urllib.request.urlretrieve(url, '{0}{1}.jpg'.format(path, name))
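Download uses the image's alt text directly as a folder name, which can fail on Windows if the text contains characters such as : or ?. The sketch below shows one possible way to build the path more defensively. The helper build_save_path and its sanitizing rule are assumptions added for illustration, not part of the original program.

```python
import os
import re
import tempfile


def build_save_path(base_dir, album, name):
    """Create base_dir/album if needed and return the path of name.jpg inside it."""
    # Replace characters that Windows does not allow in file names.
    safe_album = re.sub(r'[\\/:*?"<>|]', "_", album).strip() or "untitled"
    folder = os.path.join(base_dir, safe_album)
    # exist_ok=True makes a separate os.path.exists check unnecessary.
    os.makedirs(folder, exist_ok=True)
    return os.path.join(folder, "{0}.jpg".format(name))


if __name__ == "__main__":
    # A temporary directory stands in for the D:\ folder used in the post.
    demo_base = tempfile.mkdtemp()
    print(build_save_path(demo_base, "sample: album?", 1))
```

The returned path could then be passed to urllib.request.urlretrieve in place of the hand-built string.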
Full code
import os
import urllib.request

from bs4 import BeautifulSoup


def Download(url, picAlt, name):
    # Each album gets its own folder, named after the image's alt text.
    path = 'D:\\pythonD爬蟲妹子圖\\' + picAlt + '\\'
    if not os.path.exists(path):
        os.makedirs(path)
    urllib.request.urlretrieve(url, '{0}{1}.jpg'.format(path, name))


# Browser-like request headers.
header = {
    "User-Agent": 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36',
    'Accept': '*/*',
    'Accept-Language': 'en-US,en;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive'
}


def run(targetUrl, beginNUM, endNUM):
    req = urllib.request.Request(url=targetUrl, headers=header)
    response = urllib.request.urlopen(req)
    # The page is decoded as gb2312; undecodable bytes are ignored.
    html = response.read().decode('gb2312', 'ignore')
    soup = BeautifulSoup(html, 'html.parser')
    Divs = soup.find_all('div', attrs={'id': 'big-pic'})
    nowpage = soup.find('span', attrs={'class': 'nowpage'}).get_text()
    totalpage = soup.find('span', attrs={'class': 'totalpage'}).get_text()
    # Stop once the requested number of pictures has been downloaded.
    if beginNUM == endNUM:
        return
    for div in Divs:
        beginNUM = beginNUM + 1
        link = div.find("a")
        if link is None:
            print("No next picture")
            return
        elif not link.get('href'):
            # .get avoids a KeyError when the href attribute is missing.
            print("No next picture (empty link)")
            return
        print("Download info: overall progress:", beginNUM, "/", endNUM,
              ", downloading album page: (", nowpage, "/", totalpage, ")")
        if int(nowpage) < int(totalpage):
            # Inside an album the href is relative to the category URL.
            nextPageLink = "http://www.mmonly.cc/mmtp/qcmn/" + link['href']
        else:
            # On the last page the href is used as-is.
            nextPageLink = link['href']
        picLink = link.find('img')['src']
        picAlt = link.find('img')['alt']
        print('Image link:', picLink)
        print('Album name: [ ', picAlt, ' ] ')
        print('Starting download...........')
        Download(picLink, picAlt, nowpage)
        print("Download succeeded!")
        print('Next page link:', nextPageLink)
        # Continue with the next page; only the first matching div is used.
        run(nextPageLink, beginNUM, endNUM)
        return


if __name__ == '__main__':
    targetUrl = "http://www.mmonly.cc/mmtp/qcmn/237269.html"
    run(targetUrl, beginNUM=0, endNUM=70)
    print(" OVER")
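In run, the next-page link is built by prefixing a hard-coded base URL while more pages remain in the album, and used unchanged on the last page. As an alternative sketch, urllib.parse.urljoin from the standard library can resolve both cases against the current page URL. The helper name next_page_url and the example href values are hypothetical; the real hrefs on the site may look different.

```python
from urllib.parse import urljoin


def next_page_url(current_url, href):
    """Resolve href against the page it appeared on.

    A relative href is joined onto the current page's directory;
    an absolute href is returned unchanged.
    """
    return urljoin(current_url, href)


if __name__ == "__main__":
    page = "http://www.mmonly.cc/mmtp/qcmn/237269.html"
    # Hypothetical relative link to the next page of the same album.
    print(next_page_url(page, "237269_2.html"))
    # Hypothetical absolute link to the next album.
    print(next_page_url(page, "http://www.mmonly.cc/mmtp/qcmn/237270.html"))
```

With this approach the nowpage/totalpage comparison would no longer be needed just to decide how to build the link, and the category path would not have to be hard-coded.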