Multithreading Example


Mar 28 2016


# -*- coding: utf-8 -*-
from BeautifulSoup import BeautifulSoup
import requests
import threading
import Queue
import time

with open('url.txt') as f:
    l = f.readlines()

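def btdk(url):
    """Return the (title, description, keywords) of the page at url."""
    try:
        html = requests.get(url, timeout=10).text
    except requests.RequestException:
        # On a network error, fall back to a stub page whose title is the URL.
        html = '<html><title>%s</title><meta name="keywords" content="" /><meta name="description" content="" /></html>' % url
    # Lowercasing makes the meta-name lookups case-insensitive,
    # but it also lowercases the extracted text.
    soup = BeautifulSoup(html.lower())
    # A page without a <title> tag yields an empty title instead of raising.
    t = soup.title.text.encode('utf8', 'ignore') if soup.title is not None else ""
    try:
        k = soup.find(attrs={"name": "keywords"})['content'].encode('utf8', 'ignore')
    except (TypeError, KeyError):
        # No keywords meta tag, or the tag has no content attribute.
        k = ""
    try:
        d = soup.find(attrs={"name": "description"})['content'].encode('utf8', 'ignore')
    except (TypeError, KeyError):
        # No description meta tag, or the tag has no content attribute.
        d = ""

    return t, d, k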


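# Serializes appends to tdk.txt so lines from different threads cannot interleave.
write_lock = threading.Lock()


class MyThread(threading.Thread):
    """Worker thread that takes URLs off a shared queue until the program exits."""

    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            url = self.queue.get()
            try:
                # btdk returns (title, description, keywords), in that order.
                t, d, k = btdk(url)
                with write_lock:
                    with open('tdk.txt', 'a+') as s:
                        # Only the title is written; d and k are currently unused.
                        s.write(url + '#' + t + '#' + '\n')
            except Exception as e:
                # Report the failure and keep this worker alive for the next URL.
                print 'failed: %s (%s)' % (url, e)
            finally:
                # Always mark the item done, otherwise queue.join() blocks forever.
                self.queue.task_done()


def test(l, ts=4):
    """Start ts daemon workers and block until every URL in l is processed."""
    # Strip whitespace and skip blank lines from the input file.
    ll = [i.strip() for i in l if i.strip()]
    # queue is the module-level Queue created in the __main__ block.
    for j in range(ts):
        t = MyThread(queue)
        t.setDaemon(True)
        t.start()
    for url in ll:
        queue.put(url)
    queue.join()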


if __name__ == '__main__':
    queue = Queue.Queue()
    start = time.time()
    test(l, 4)
    end = time.time()
    print 'Total time: %s seconds' % (end - start)
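
The listing above targets Python 2 (the `Queue` module, the `print` statement, BeautifulSoup 3). For readers on Python 3, the same queue-plus-worker-threads pattern might look roughly like the sketch below. It is an illustration under stated assumptions, not a port of the original: `run_workers` and `fake_title` are hypothetical names, results are collected in memory instead of being appended to `tdk.txt`, and the network fetch is replaced by a stand-in so the example runs offline.

```python
import queue
import threading


def run_workers(urls, handle, num_threads=4):
    """Process every URL with `handle` on a pool of daemon threads.

    Returns a list of (url, outcome) pairs in completion order, where
    outcome is either the return value of `handle` or the exception it raised.
    """
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()  # protects `results` across threads

    def worker():
        while True:
            url = tasks.get()
            try:
                outcome = handle(url)
            except Exception as exc:  # record the failure instead of dying
                outcome = exc
            with results_lock:
                results.append((url, outcome))
            # Reached even when handle() fails, so tasks.join() cannot hang.
            tasks.task_done()

    for _ in range(num_threads):
        threading.Thread(target=worker, daemon=True).start()
    for url in urls:
        tasks.put(url)
    tasks.join()  # block until every queued URL has been marked done
    return results


def fake_title(url):
    """Hypothetical stand-in for a real page fetch; performs no network access."""
    if not url.startswith("http"):
        raise ValueError("unsupported URL: %s" % url)
    return "title of " + url


if __name__ == "__main__":
    for url, outcome in run_workers(["http://a.example", "ftp://b.example"], fake_title):
        print(url, "#", outcome)
```

To fetch real pages, a function that downloads and parses each URL would be passed in place of `fake_title`. Catching the exception inside the worker matters here: if a worker thread died on a bad page, its queue item would never be marked done and `join()` would wait forever.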


Copyright notice: unless otherwise noted, this post is original work by mOon. Please keep the source attribution when reposting.
