
List comprehensions

    urls = ["http://www.hostloc.com/forum-45-{}.html".format(str(i)) for i in range(1, 15, 1)]
    print(urls)

Output:

    ['http://www.hostloc.com/forum-45-1.html', 'http://www.hostloc.com/forum-45-2.html',
     'http://www.hostloc.com/forum-45-3.html', 'http://www.hostloc.com/forum-45-4.html',
     'http://www.hostloc.com/forum-45-5.html', 'http://www.hostloc.com/forum-45-6.html',
     'http://www.hostloc.com/forum-45-7.html', 'http://www.hostloc.com/forum-45-8.html',
     'http://www.hostloc.com/forum-45-9.html', 'http://www.hostloc.com/forum-45-10.html',
     'http://www.hostloc.com/forum-45-11.html', 'http://www.hostloc.com/forum-45-12.html',
     'http://www.hostloc.com/forum-45-13.html', 'http://www.hostloc.com/forum-45-14.html']
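For comparison, here is a small sketch that builds the same list twice, once with an explicit loop and once with a comprehension, and checks that the two agree. The function names `build_urls_loop` and `build_urls_comprehension` are made up for illustration. Note that `format()` converts the integer itself, so the `str(i)` call is optional.

```python
BASE = "http://www.hostloc.com/forum-45-{}.html"


def build_urls_loop(first, last):
    """Build page URLs with an explicit for loop (last is inclusive)."""
    urls = []
    for i in range(first, last + 1):
        urls.append(BASE.format(i))
    return urls


def build_urls_comprehension(first, last):
    """Build the same URLs with a list comprehension (last is inclusive)."""
    return [BASE.format(i) for i in range(first, last + 1)]


loop_urls = build_urls_loop(1, 14)
comp_urls = build_urls_comprehension(1, 14)

# Both approaches should produce identical lists.
print(loop_urls == comp_urls)
print(len(comp_urls), comp_urls[0], comp_urls[-1])
```

The comprehension is shorter, but the loop form is easier to extend when each page needs extra handling.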


My first self-written image scraper

The same old image-scraping example. I wrote it on and off over several days; it was not easy, but it finally works. The tricky parts were creating a separate folder for each set of images and getting past hotlink protection while scraping. I did not define any functions, so it looks a bit ugly, haha.

    from bs4 import BeautifulSoup
    import requests
    import urllib.request
    import os
    import re

    website = "http://www.sucaibar.com/image/meinv/"
    web_data = requests.get(website)
    soup = BeautifulSoup(web_data.text, 'lxml')
    urls = soup.select('#pic-list > li > a')
    url_list = []
    for url in urls:
        url_list.append(url.get('href'))
    print(url_list)
    for url in url_list:
        web_data = requests.get(url)
        soup = BeautifulSoup(web_data.text, 'lxml')
        links = soup.select('body > div.content > […]
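The two tricky parts mentioned above can be sketched without touching the network. This is only an illustration, assuming the site's hotlink protection checks the Referer header. The helper names (`safe_folder_name`, `make_gallery_folder`, `build_image_request`), the example.com URLs, and the `Mozilla/5.0` User-Agent value are placeholders, not taken from the original script.

```python
import os
import re
import tempfile
import urllib.request


def safe_folder_name(title):
    """Replace characters that Windows does not allow in folder names."""
    return re.sub(r'[\\/:*?"<>|]', "_", title).strip() or "untitled"


def make_gallery_folder(root, title):
    """Create (if needed) one folder per gallery and return its path."""
    folder = os.path.join(root, safe_folder_name(title))
    os.makedirs(folder, exist_ok=True)
    return folder


def build_image_request(image_url, page_url):
    """Build a request that sends the gallery page as the Referer."""
    headers = {"User-Agent": "Mozilla/5.0", "Referer": page_url}
    return urllib.request.Request(image_url, headers=headers)


# Demo only: a temporary directory and placeholder URLs, no download.
root = tempfile.mkdtemp()
folder = make_gallery_folder(root, "gallery: 01/02")
req = build_image_request(
    "http://example.com/img/1.jpg",
    "http://example.com/gallery/1.html",
)
print(folder)
print(req.get_header("Referer"))
```

Passing `req` to `urllib.request.urlopen` and writing the response bytes into `folder` would complete the download step.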


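Installing lxml for Python 3.5 on Windows

First install wheel:

    pip install wheel

Go to http://www.lfd.uci.edu/~gohlke/pythonlibs/ and find the file that matches your Python build:

    lxml-3.6.4-cp35-cp35m-win32.whl
    lxml-3.6.4-cp35-cp35m-win_amd64.whl

Put it in the Python directory, then install it (64-bit shown here):

    pip install lxml-3.6.4-cp35-cp35m-win_amd64.whl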


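火车头 (LocoySpider) to WordPress: fixing missed scheduled posts in bulk with MySQL

I collect content with 火车头 and publish it to WordPress. With thousands of scheduled times, many scheduled posts fail, that is, they miss their schedule. They can be fixed in bulk from the mysql command line. First find the ID of the most recent post that has not been published yet (23790 here), then publish every scheduled post below it:

    UPDATE wp_posts SET post_status = 'publish' WHERE ID < 23790 AND post_status = 'future';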
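To try the statement safely before running it on a live site, the sketch below replays it against an in-memory SQLite table. This is an assumption-heavy stand-in: the table has only the two columns the query uses, the rows are made up, and SQLite is not MySQL, though this particular UPDATE is plain SQL that both accept.

```python
import sqlite3

# In-memory stand-in for wp_posts with only the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_status TEXT)")
conn.executemany(
    "INSERT INTO wp_posts (ID, post_status) VALUES (?, ?)",
    [(23788, "future"), (23789, "publish"), (23790, "future"), (23791, "future")],
)

# Step 1: list the posts that are still scheduled, to pick the cut-off ID.
pending = [row[0] for row in conn.execute(
    "SELECT ID FROM wp_posts WHERE post_status = 'future' ORDER BY ID")]
print(pending)

# Step 2: publish every scheduled post below the cut-off ID.
cur = conn.execute(
    "UPDATE wp_posts SET post_status = 'publish' "
    "WHERE ID < ? AND post_status = 'future'",
    (23790,),
)
conn.commit()
updated = cur.rowcount

# Posts at or above the cut-off stay scheduled.
remaining = [row[0] for row in conn.execute(
    "SELECT ID FROM wp_posts WHERE post_status = 'future' ORDER BY ID")]
print(updated, remaining)
```

Running the SELECT first shows exactly which rows the UPDATE will touch.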


Python snippets

Print a page's source:

    import urllib.request
    response = urllib.request.urlopen("http://027886.xyz/")
    print(response.read().decode('utf-8'))

Set request headers:

    head = {}
    head['User-Agent'] = 'Mozilla ……'
    head['Referer'] = 'http://027886.xyz'

Send data with POST:

    import urllib.parse
    data = {}
    data['f'] = 'undefined'
    data['t'] = 'undefined'
    data['w'] = content

词霸 (iCIBA) translation example:

    import urllib.request
    import urllib.parse
    import json
    import time

    while True:
        content = input('Enter the Chinese text to translate (enter "q!" to quit): ')
        if content == 'q!':
            break
        url = […]
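The header and POST fragments fit together roughly as follows. This sketch only builds the request object and does not send it. The target URL, the `Mozilla/5.0` User-Agent value, and the `content` string are placeholders, since the excerpt elides the real ones.

```python
import urllib.parse
import urllib.request

# Placeholder values: the original excerpt elides the real ones.
head = {"User-Agent": "Mozilla/5.0", "Referer": "http://027886.xyz"}
content = "hello"
data = {"f": "undefined", "t": "undefined", "w": content}

# urlopen expects POST data as bytes, so URL-encode the dict, then encode it.
encoded = urllib.parse.urlencode(data).encode("utf-8")

# Supplying data makes urllib send a POST instead of a GET.
req = urllib.request.Request(
    "http://example.com/translate", data=encoded, headers=head
)

print(req.get_method())
print(req.data)
print(req.get_header("Referer"))
```

Calling `urllib.request.urlopen(req)` would then send the request with both the headers and the form data attached.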


Python notes

1. The sum() function takes a list as its argument and returns the sum of all its elements. Compute 1*1 + 2*2 + 3*3 + … + 100*100.

    L = []
    x = 1
    while x <= 100:
        L.append(x * x)
        x += 1
    print(sum(L))

The append() method adds a new object to the end of a list. The following example shows how to use append():

    #!/usr/bin/python3
    aList = [123, 'xyz', 'zara', 'abc']
    aList.append(2009)
    print("Updated List :", aList)

The example above prints:

    Updated List : [123, 'xyz', 'zara', 'abc', 2009]

[…]
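As a cross-check on the while-loop version, the same total can be computed with a generator expression and with the closed-form formula n(n+1)(2n+1)/6. Both are shown here as a short sketch; the variable names are illustrative.

```python
# Generator expression: sum() consumes the squares without building a list.
total_generator = sum(x * x for x in range(1, 101))

# Closed form for 1^2 + 2^2 + ... + n^2.
n = 100
total_formula = n * (n + 1) * (2 * n + 1) // 6

print(total_generator, total_formula)  # both are 338350
```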


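Installing LAMP with yum on CentOS 6.x

The software in the default CentOS yum repositories is too old. This install ends up with:

    CentOS release 6.8 (Final)
    PHP 5.4.45

1. Configure the firewall to open ports 80 and 3306

    vi /etc/sysconfig/iptables
    -A INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT
    -A INPUT -m state --state NEW -m tcp -p tcp --dport 3306 -j ACCEPT
    :wq!  # save and exit

Note: these rules need to be added next to the existing port 22 rule, i.e.

    -A INPUT -m state --state NEW -m tcp -p tcp --dport […]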


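Checking versions on CentOS

1. Check the Linux kernel version

    $ uname -r

2. Check the CentOS version

    $ cat /etc/redhat-release

3. Check the PHP version

    $ php -v

4. Check the MySQL version (capital -V; lowercase -v means verbose)

    $ mysql -V

5. Check the Apache version

    $ rpm -qa httpd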


Installing Python 3.5 on CentOS 6

CentOS 6.6 ships with Python 2.6, and Python 3.5 cannot be installed with yum, so it has to be compiled from source.

    yum install gcc -y
    yum install openssl-devel

If openssl-devel is not installed, pip cannot be installed during the build.

    wget https://www.python.org/ftp/python/3.5.1/Python-3.5.1.tgz
    tar zxvf Python-3.5.1.tgz
    cd Python-3.5.1
    ./configure && make && make install

Link python3.5 to the python3 command. Running

    whereis python

lists all the Python locations, including

    /usr/local/bin/python3.5

Then

    sudo ln -s /usr/local/bin/python3.5 /usr/bin/python3

links Python 3.5 to python3. Once that is done, check that the install worked:

    python3 --version

If you need to point the python command back at python2.6:

    sudo rm /usr/bin/python
    sudo ln -s /usr/bin/python2.6 /usr/bin/python

    sudo rm /usr/bin/python
    sudo ln -s /usr/local/bin/python3.5 […]
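One way to confirm that the build picked up OpenSSL (which pip depends on) is to ask the new interpreter directly. The sketch below is a hypothetical check script, not part of the original post; save it and run it with the freshly built python3.

```python
import sys

# The ssl module is only built when the OpenSSL headers were available.
try:
    import ssl
    openssl_version = ssl.OPENSSL_VERSION
except ImportError:
    openssl_version = None


def describe_build():
    """Summarize the interpreter version and whether ssl is usable."""
    version = "Python {}.{}.{}".format(*sys.version_info[:3])
    if openssl_version is None:
        return version + " (ssl module missing: install openssl-devel and rebuild)"
    return version + " (ssl: " + openssl_version + ")"


report = describe_build()
print(report)
```

If the report says the ssl module is missing, install openssl-devel and run the configure and make steps again.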


Python 3: my first little crawler, fetching pages

See: first-python-script

    #encoding:utf-8
    import urllib.request as request  # import the urllib module
    import urllib.parse as parse
    import string

    print("""
    +++++++++++++++++++++++
    For test only
    version: python3.3
    +++++++++++++++++=++++
    """)

    def baidu_tieba(url, begin_page, end_page):
        for i in range(begin_page, end_page + 1):
            # zero-pad the page number so the saved files sort correctly
            sName = 'c:/ChromeDL/PPyy/000/test/' + str(i).zfill(5) + '.html'
            print('Downloading page ' + str(i) + ', saving as ' + sName)
            m = request.urlopen(url + str(i)).read()
            # the with block closes the file, so no explicit close() is needed
            with open(sName, 'wb') as file:
                file.write(m)

    if __name__ == "__main__":
        url = "http://tieba.baidu.com/p/"
        begin_page […]
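The URL and file-name logic of the crawler can be separated from the download so it can be checked offline. The sketch below is one possible arrangement; the function `page_targets` and the `pages` output directory are hypothetical names, not from the original script.

```python
import os


def page_targets(base_url, out_dir, begin_page, end_page):
    """Return (url, file path) pairs for pages begin_page..end_page inclusive."""
    targets = []
    for i in range(begin_page, end_page + 1):
        # zfill(5) pads the page number to five digits, e.g. 7 -> "00007".
        file_name = str(i).zfill(5) + ".html"
        targets.append((base_url + str(i), os.path.join(out_dir, file_name)))
    return targets


targets = page_targets("http://tieba.baidu.com/p/", "pages", 1, 3)
for url, path in targets:
    print(url, "->", path)
```

Each pair can then be passed to `urllib.request.urlopen` and `open(path, 'wb')` as in the script above.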
