
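May 27

Testing data loading from MySQL into Redis

Posted on May 27, 2012 by 张子萌

Some tests show that Redis gets slower when persistence is enabled, so persistence is turned off here. This test measures how fast data can be pulled from MySQL and loaded into Redis.
The server has an 8-core 2.6 GHz CPU, 8 GB of RAM, and SAS disks, and runs 64-bit CentOS 5.6 with Python 2.6 and Redis 2.4.13.
The test code is below. It pulls two columns from the photo table in MySQL and loads them into Redis. Both columns are indexed, and the table holds 280,000 rows.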

#!/bin/env python
# -------------------------------------------------
# Filename:    
# Revision:    
# Date:        2012-05-27
# Author:      simonzhang
# Email:       simon-zzm@163.com
# -------------------------------------------------
import MySQLdb
import redis


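def redis_run(sql_data):
    # redis.Redis() connects lazily, so connection errors only show up on the
    # first command; the writes are what need the try block.
    r = redis.Redis(host='192.168.1.100', password='123456', port=6379, db=0)
    try:
        for row in sql_data:
            r.set(str(row[0]), row[1])
    except redis.RedisError as e:
        print("Error %s" % e)
        exit(1)
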

def mysql_run(sql):
    try:
        db = MySQLdb.connect(host='192.168.1.100', user='test', passwd='123456', db='photo')
        cursor = db.cursor()
    except MySQLdb.Error as e:
        print("Error %d:%s" % (e.args[0], e.args[1]))
        exit(1)
    try:
        cursor.execute(sql)
        return cursor.fetchall()
    except MySQLdb.Error as e:
        print("Error %d:%s" % (e.args[0], e.args[1]))
        # Stop here instead of returning None, which would break len() in main()
        exit(1)
    finally:
        # Runs on both the success and the error path
        cursor.close()
        db.close()

def main():
    _loop = 0
    _limit_start = 0
    # Rows fetched per query; this is the value varied between test runs
    _limit_span = 10000
    while True:
        # Note: without ORDER BY, MySQL does not guarantee a stable row order across pages
        sql = "select id as pid, userid as uid from photo LIMIT %s,%s" % (_limit_start + _limit_span * _loop, _limit_span)
        result_data = mysql_run(sql)
        # An empty result means every row has been read
        if not result_data:
            break
        redis_run(result_data)
        _loop += 1


if __name__ == '__main__':
    main()
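
Separate from the script above, which is what the timings below measure, one variation may be worth trying: redis-py can queue commands in a pipeline and send them in a single round trip, which usually reduces network overhead for bulk loads. The following is only a minimal sketch. It assumes `client` is a `redis.Redis` instance; `load_pairs` and `batch_size` are names made up here for illustration, and no speedup has been measured for this setup.

```python
def load_pairs(client, rows, batch_size=1000):
    """Write (key, value) rows through a pipeline, flushing every batch_size rows.

    `client` is assumed to behave like redis.Redis: pipeline() returns an
    object with set() and execute(). Returns the number of rows written.
    """
    pipe = client.pipeline()
    pending = 0
    total = 0
    for key, value in rows:
        pipe.set(str(key), value)
        pending += 1
        total += 1
        if pending >= batch_size:
            pipe.execute()   # send the queued SET commands in one round trip
            pending = 0
    if pending:
        pipe.execute()       # flush the final partial batch
    return total
```

In the script above this would replace the body of redis_run, for example `load_pairs(r, sql_data)`.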

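The test was run with batch sizes of 500,000, 100,000, 50,000, and 10,000 rows per fetch (with 280,000 rows in the table, the 500,000 batch reads all the data in one query). Results:

500,000 rows
real 0m26.239s
user 0m16.816s
sys 0m5.745s

100,000 rows
real 0m24.019s
user 0m15.670s
sys 0m4.932s

50,000 rows
real 0m26.061s
user 0m15.789s
sys 0m4.674s

10,000 rows
real 0m28.705s
user 0m15.778s
sys 0m4.913s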

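Conclusion: fetching 100,000 rows per batch gave the best time (about 24 seconds), although the spread across batch sizes is under 5 seconds. The load on the operating system is light, so hardware is not a concern.
Both columns hold IDs. Suppose the user ID and the photo ID are each 9 digits long, so one pair is 18 digits. 100 million pairs then need roughly 2 GB of memory for the raw data; Redis's per-key overhead comes on top of that, so actual usage will be higher.
Extrapolating from 280,000 rows in 24 seconds, loading 100 million rows would take about two and a half hours. So memory capacity is not the problem; reload time is. I do not know whether an SSD would be faster, since I do not have one to test. That leaves three things to do. First, set up a cluster and sync the data to other data centers promptly, using a self-written program that syncs on a schedule; with master-slave replication, a master that restarts comes back empty, which is real trouble. Second, use Redis persistence, which is surely faster than pulling directly from MySQL. Third, burn incense every day and pray the server never goes down.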
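
The two estimates above are simple linear scaling and can be checked with a few lines. This is a rough sketch: the function name `estimate` and its parameters are invented for illustration, the 18 bytes per pair and the 24-second sample come from the post, and the memory figure counts raw digits only.

```python
def estimate(pairs, bytes_per_pair=18, sample_rows=280000, sample_seconds=24.0):
    """Scale the post's figures linearly to `pairs` key/value pairs.

    Returns (raw_gigabytes, load_hours). The raw size counts only the digits
    stored, not Redis's per-key overhead, so real usage will be higher.
    """
    raw_gb = pairs * bytes_per_pair / 1e9
    load_hours = pairs / float(sample_rows) * sample_seconds / 3600.0
    return raw_gb, load_hours


raw_gb, load_hours = estimate(100 * 1000 * 1000)
print("raw data: %.1f GB, load time: %.1f hours" % (raw_gb, load_hours))
```

For 100 million pairs this gives about 1.8 GB of raw data and about 2.4 hours of load time, in line with the rounded figures in the text.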

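Apr 08

A "smart combination" script in Python

Posted on April 8, 2012 by 张子萌

I keep buying Double Color Ball (双色球) lottery tickets and have never won. After some pointers from an expert, I wrote this script based on the "smart combination" (聪明组合) method. There is nothing technically difficult in it; it is all grunt work. Posted here for everyone's convenience.
Run the script and enter 12 red ball numbers; it combines them into 10 groups. Then add the blue ball yourself.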

#!/bin/env python
# -*- coding:utf-8 -*-
# -------------------------------------------
# Filename:    clever12.py
# Revision:    1.0
# Date:        2012-3-13
# Author:      simonzhang
# WEB:         www.simonzhang.net
# Email:       simon-zzm@163.com
# -------------------------------------------

def run_group(di):
    fen = di.split(' ')
    A = fen[0]
    B = fen[1]
    C = fen[2]
    D = fen[3]
    E = fen[4]
    F = fen[5]
    G = fen[6]
    H = fen[7]
    I = fen[8]
    J = fen[9]
    K = fen[10]
    L = fen[11]
    print("%s %s %s %s %s %s"%(A,B,D,E,K,L))
    print("%s %s %s %s %s %s"%(A,B,E,F,H,I))
    print("%s %s %s %s %s %s"%(A,B,E,G,I,K))
    print("%s %s %s %s %s %s"%(A,B,E,I,J,L))
    print("%s %s %s %s %s %s"%(A,C,D,E,F,L))
    print("%s %s %s %s %s %s"%(A,C,D,G,H,J))
    print("%s %s %s %s %s %s"%(A,C,D,I,K,L))
    print("%s %s %s %s %s %s"%(A,C,F,G,H,L))
    print("%s %s %s %s %s %s"%(A,C,F,H,J,K))
    print("%s %s %s %s %s %s"%(A,D,F,G,J,L))

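def main():
    get_list = raw_input("12 number :")
    # Count the numbers instead of the characters, so input such as
    # "1 2 3" or extra spaces between numbers is handled too
    numbers = get_list.split()
    if len(numbers) == 12:
        run_group(' '.join(numbers))
    else:
        print("input error")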

if __name__ == "__main__":
    main()
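
The ten print lines in the script can also be written as a table of letter patterns, which makes the combinations easier to check and reuse. A minimal sketch: `TEMPLATES` is copied from the print lines above (A is the first number entered, L the twelfth), while `build_groups` and the distinctness check are additions made here for illustration.

```python
# Letter patterns copied from the ten print lines (A = index 0 ... L = index 11)
TEMPLATES = [
    "ABDEKL", "ABEFHI", "ABEGIK", "ABEIJL", "ACDEFL",
    "ACDGHJ", "ACDIKL", "ACFGHL", "ACFHJK", "ADFGJL",
]


def build_groups(line):
    """Return the ten 6-number groups for a line of 12 space-separated numbers."""
    numbers = line.split()
    if len(numbers) != 12:
        raise ValueError("expected 12 numbers, got %d" % len(numbers))
    if len(set(numbers)) != 12:
        raise ValueError("numbers must be distinct")
    # Map each letter to its position and pick that number
    return [[numbers[ord(ch) - ord("A")] for ch in tpl] for tpl in TEMPLATES]


for group in build_groups("01 02 03 04 05 06 07 08 09 10 11 12"):
    print(" ".join(group))
```

Returning lists instead of printing keeps the function usable from other code, for example to append a blue ball to each group.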

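Mar 22

A multithreaded Python program that controls the number of threads

Posted on March 22, 2012 by 张子萌

Controlling the thread count in Python

When writing a multithreaded script in Python, the number of threads needs to be controlled, and the script has to check for itself whether a thread has finished. Test code: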

#!/bin/env python
# -------------------------------------------------------------------------------
# Filename:    my-thread.py
# Revision:
# Date:        2012-12-19
# Author:      simonzhang
# web :        www.simonzhang.net
# Email:       simon-zzm@163.com
# ------------------------------------------------------------------------------- 

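#### Load the threading module
import threading
#### Random numbers and delays, used only for testing
import random
from time import sleep

#### The test workload run in each thread: loop 3 times, sleeping a random
#### 0 to 2 seconds each time, then print the task sequence number,
#### the thread slot number, and the loop counter
def test_func(thread_number,sequence):
    for i in range(3):
        sleep(random.randint(0,2))
        print('run sequence:%s thread %d is running %d ' % (sequence,thread_number,i))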

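def main():
    #### The list of threads, which serves as the thread pool
    threads = []
    #### Total number of tasks to run
    all_number = 5
    #### Number of threads to use
    thread_lines = 3
    #### Sequence number of the first task
    start_line = 0
    #### Build the thread pool first
    for i in range(0,thread_lines):
        t = threading.Thread(target=test_func, args=(i,start_line,))
        threads.append(t)
        start_line +=1
    #### Start the first batch of tasks
    for t in threads:
        t.start()
    #### Loop over the remaining tasks
    for number_line in xrange(start_line,all_number):
        #### Whether the current task has been handed to a thread yet
        thread_status = False
        #### Pool index at which to start checking
        loop_line = 0
        #### Keep checking the state of the threads in the pool
        while not thread_status:
            #### If the thread being checked has stopped, its task is done, so give
            #### that slot the new task. If it is still running, check the next
            #### thread. If every thread in the pool is busy, start over from the first.
            if not threads[loop_line].is_alive():
                t = threading.Thread(target=test_func, args=(loop_line,number_line,))
                threads[loop_line]=t
                threads[loop_line].start()
                thread_status = True
            else:
                if loop_line >= thread_lines-1 :
                    #### Every slot was busy: pause briefly instead of spinning
                    sleep(0.1)
                    loop_line=0
                else:
                    loop_line+=1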
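    #### Wait for the timeout
    sleep(30)
    #### A Thread object has no exit() method and cannot be stopped from
    #### outside, so just report any thread still running after the timeout
    for number_line in xrange(0,thread_lines):
        if threads[number_line].is_alive():
            print('thread %d is still running' % number_line)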

if __name__ == "__main__":
    main()

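This code has one problem: once all tasks have been dispatched, the main thread moves on to the next step while worker threads are still running. So after dispatching everything, there should be a step that checks that all threads have finished; blocking on join() is enough. Adapt this to your actual needs.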
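
The join() step mentioned above can be sketched as follows. This is not the script's slot-polling approach: as a swapped-in technique it uses a `threading.BoundedSemaphore` to cap how many tasks run at once, then joins every thread at the end. The names `run_limited`, `worker`, and `record` are made up for illustration.

```python
import threading


def run_limited(func, jobs, max_threads=3):
    """Run func(job) for every job, with at most max_threads calls in progress."""
    slots = threading.BoundedSemaphore(max_threads)
    threads = []

    def worker(job):
        try:
            func(job)
        finally:
            slots.release()   # free the slot even if func raises

    for job in jobs:
        slots.acquire()       # blocks while max_threads calls are in progress
        t = threading.Thread(target=worker, args=(job,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()              # block until every thread has finished


results = []
lock = threading.Lock()


def record(n):
    with lock:
        results.append(n * n)


run_limited(record, range(5), max_threads=3)
print(sorted(results))
```

Because run_limited only returns after the final join(), code that follows it can rely on all tasks being complete, with no fixed sleep needed.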

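Mar 19

Keeping a Python script from running in parallel

Posted on March 19, 2012 by 张子萌

I wrote a script whose data has to be processed on a recurring schedule, but never by two runs at once. In other words, the script must run as a single process. It is scheduled through crontab. Writing a status file when the script starts and clearing it when the script finishes would also work, but if the script exits midway, the status file is never updated and the next run cannot easily tell what happened. Checking the process ID is more reliable, so I wrote the code below.
The script below is for Linux.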

#!/usr/local/bin/python
# -------------------------------------------------------------------------------
# Filename:    my-pid.py
# Revision:
# Date:        2012-12-19
# Author:      simonzhang
# Email:       simon-zzm@163.com
# -------------------------------------------------------------------------------
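import os
import sys


def main():
    # Placeholder: the script's real work goes here
    pass


if __name__ == '__main__':
    try:
        # Check for an existing pid file and read the pid from it
        now_pid = open("./my_pid.pid", "rb").read().strip()
        # Use OS commands to count the running processes with that pid
        get_count = os.popen("ps -ef|awk '{print $2}'|grep -w %s|wc -l" % now_pid).read()
    except IOError:
        # No pid file: treat the count as zero
        get_count = 0
    # Is the recorded pid among the running processes?
    if int(get_count) > 0:
        # A process with that pid exists: print a message and exit
        print("run....")
        sys.exit()
    else:
        # No such process: record our own pid in the pid file
        w_pid = os.getpid()
        w_pid_file = open("./my_pid.pid", "wb")
        w_pid_file.write("%s" % w_pid)
        w_pid_file.close()
        # Run the script
        main()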
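
The same check can be done without shelling out to ps. On POSIX systems, sending signal 0 with os.kill tests whether a pid exists without affecting the process. A minimal sketch of that alternative, not the script's method; `pid_is_running`, `read_pid`, and `already_running` are names invented here.

```python
import errno
import os


def pid_is_running(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    if pid <= 0:
        return False         # 0 and negative values address process groups
    try:
        os.kill(pid, 0)      # signal 0: existence check only, nothing is sent
    except OSError as e:
        # EPERM means the process exists but belongs to another user
        return e.errno == errno.EPERM
    return True


def read_pid(path):
    """Return the pid stored in path, or None if the file is missing or invalid."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (IOError, ValueError):
        return None


def already_running(pid_file):
    """True if pid_file names a process that is still alive."""
    pid = read_pid(pid_file)
    return pid is not None and pid_is_running(pid)


print(pid_is_running(os.getpid()))
```

Like the ps-based version, this cannot tell whether a recorded pid has since been reused by an unrelated process.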

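Mar 15

Calling Python from crontab

Posted on March 15, 2012 by 张子萌

Calling Python from crontab

I wrote a script in Python that runs fine when started by hand. It needs to run automatically, so it has to go into crontab. As an example, the script is at "/Data/script/test.py"; it reads information from the operating system, reads a config file, and writes logs, and it should run as root every 5 minutes. The initial configuration was: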

*/5 * * * * /usr/local/bin/python /Data/script/test.py >/dev/null

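Note that the fields in "*/5 * * * *" must be separated by spaces, and that "/usr/local/bin/python" has to match where the Python interpreter is actually installed. Watching the job showed that it failed.
The cause: the environment variables in effect when crontab runs a job differ from those of an ssh login, so reading the config directory and files failed. The fix is to write a shell wrapper script and put that in crontab instead. The script takes one argument, "start".
The shell script, test.sh: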

#! /bin/bash
#
# crontab shell python
#
# www.simonzhang.net
# email:simon-zzm@163.com
#
### END INIT INFO

. /etc/profile
cd /Data/script/

case "$1" in
  start)
      /usr/local/bin/python /Data/script/test.py start &
      ;;
  test)
      /usr/local/bin/python /Data/script/test.py test &
      ;;
  *)
      echo $"Usage: $0 {start|test}"
      exit 1
      ;;
esac

# Exit with success; a trailing "exit 1" would report failure on every run
exit 0

With the wrapper script configured in crontab, the job ran successfully. The configuration:

*/5 * * * * /bin/sh /Data/script/test.sh start >/dev/null
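
The wrapper fixes the problem from the shell side by loading /etc/profile and changing into the script directory. The Python script can also be made less dependent on the working directory by building file paths from its own location. This is only a sketch, assuming the failure came from relative paths being resolved against cron's working directory, which the post does not state explicitly; `path_near_script` and the file name "conf/test.ini" are hypothetical.

```python
import os


def script_dir(script_file):
    """Absolute path of the directory that contains script_file."""
    return os.path.dirname(os.path.abspath(script_file))


def path_near_script(script_file, *parts):
    """Join parts onto the script's own directory, ignoring the current directory."""
    return os.path.join(script_dir(script_file), *parts)


# Inside the real script, __file__ would be passed instead of a literal path.
# "conf/test.ini" is a made-up config location used only for this example.
print(path_near_script("/Data/script/test.py", "conf", "test.ini"))
```

This only addresses file paths; anything the script reads from environment variables would still need the `. /etc/profile` line in the wrapper or explicit settings in the crontab.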
