Apr 19

Hadoop operations issue 1

2013-04-19 21:18:29,171 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.util.DiskChecker$DiskErrorException: Invalid value for volsFailed : 1 , Volumes tolerated : 0
at org.apache.hadoop.hdfs.server.datanode.FSDataset.<init>(FSDataset.java:975)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:389)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)

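Cause: a disk is damaged or its connection has failed. The DataNode found one failed volume while it is configured to tolerate none, so it refuses to start.
Solution: the HDFS web page shows which server is affected; on that server, check which disk is at fault. As an emergency fix, stop the HDFS service on that machine, remove the damaged disk from the data directories in Hadoop's hdfs-site.xml, then start HDFS again. Replace the disk as soon as possible afterwards.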
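
To find which disk is at fault on the problem server, a rough check along the following lines may help. It is a minimal sketch, not Hadoop's own DiskChecker: it only verifies that each data directory exists and accepts a small write. The function name and the two paths are hypothetical; substitute the directories listed in hdfs-site.xml.

```python
import os
import tempfile


def check_data_dirs(data_dirs):
    """Return (directory, reason) pairs for directories that fail a basic check."""
    failed = []
    for d in data_dirs:
        if not os.path.isdir(d):
            failed.append((d, "not a directory or not mounted"))
            continue
        try:
            # Creating and removing a temporary file exercises the disk for writes.
            with tempfile.NamedTemporaryFile(dir=d):
                pass
        except OSError as exc:
            failed.append((d, "write test failed: %s" % exc))
    return failed


if __name__ == "__main__":
    # Hypothetical paths; replace with the DataNode's configured data directories.
    for path, reason in check_data_dirs(["/data1/dfs/data", "/data2/dfs/data"]):
        print("%s: %s" % (path, reason))
```

Any directory it reports is a candidate for removal from the configuration until the disk is replaced.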

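During recovery, run "hadoop fsck /" to check the current state. The main figure to watch is the "Missing replicas" percentage: once it reaches 0%, every block is back at the configured number of replicas.
Minimally replicated blocks: blocks that have at least the minimum required number of replicas
Over-replicated blocks: blocks with more replicas than the configured replication factor
Under-replicated blocks: blocks with fewer replicas than the configured replication factor
Mis-replicated blocks: blocks whose replicas violate the block placement policy (for example, all replicas on the same rack)
Default replication factor: the default number of replicas
Average block replication: the actual average number of replicas per block
Corrupt blocks: number of corrupt blocks
Missing replicas: number of missing replicas
Number of data-nodes: number of DataNodes
Number of racks: number of racks; if no rack configuration is set, this is 1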
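
If you want to watch the "Missing replicas" figure from a script rather than by eye, something like the following could work. It is a minimal sketch that assumes the summary lines have the form "Label: value (percent %)"; the sample text is made up for illustration and the function names are hypothetical.

```python
import re


def parse_fsck_summary(text):
    """Collect 'Label: value' lines from fsck output into a dict of strings."""
    summary = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([A-Za-z][A-Za-z \-]*?):\s*(.+?)\s*$", line)
        if match:
            summary[match.group(1)] = match.group(2)
    return summary


def missing_replica_percent(summary):
    """Return the percentage in parentheses after 'Missing replicas', or None."""
    value = summary.get("Missing replicas", "")
    match = re.search(r"\(([\d.]+)\s*%\)", value)
    return float(match.group(1)) if match else None


# Illustrative sample only; real numbers come from running "hadoop fsck /".
SAMPLE = """\
 Minimally replicated blocks:   120 (100.0 %)
 Under-replicated blocks:       6 (5.0 %)
 Default replication factor:    3
 Missing replicas:              12 (3.4 %)
 Number of data-nodes:          4
"""

if __name__ == "__main__":
    print(missing_replica_percent(parse_fsck_summary(SAMPLE)))
```

In practice the text would come from the output of "hadoop fsck /", and recovery is complete when the returned value is 0.0.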

Apr 18

Script for monitoring MySQL slave replication status

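  Our MySQL databases run in a master/slave setup. To keep track of replication on the slave, I wrote a script that runs from crontab and sends an email alert if replication fails. I wrote it last year and am posting it here as a note to myself.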

#!/usr/local/bin/python
# -------------------------------------------------------------------------------
# Filename:    .py
# Revision:    1.0
# Date:        2012-03-20
# Author:      simonzhang
# Web:         www.simonzhang.net
# Email:       simon-zzm@163.com
# -------------------------------------------------------------------------------
import os
import smtplib
from email.mime.text import MIMEText

#### base settings
mysql_bin = '/mysql5/bin/mysql'
mysql_user = ''
mysql_pass = ''
mail_host = 'smtp.exmail.qq.com'
mail_user = 'XXX@XXX.net'
mail_pwd = 'XXXX'
mail_to = "xxxxx@xxx.com"
####

def mail_warn(error_ip):
    # Build a plain-text alert naming the affected host.
    content = 'MySQL slave on IP %s reports an error!' % error_ip
    msg = MIMEText(content)
    msg['From'] = mail_user
    msg['Subject'] = 'mysql warning %s' % error_ip
    msg['To'] = mail_to
    try:
        s = smtplib.SMTP()
        s.connect(mail_host)
        s.login(mail_user, mail_pwd)
        s.sendmail(mail_user, [mail_to], msg.as_string())
        s.close()
    except Exception as e:
        print(e)

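def main():
    # Query replication status; \G prints one "Field: value" pair per line.
    status = os.popen("%s -u%s -p%s -e 'show slave status\\G'" %
                      (mysql_bin, mysql_user, mysql_pass)).read()
    # Both threads must report "Yes"; anything else, including a failed
    # query that returns no such field, is treated as an error.
    io_ok = 'Slave_IO_Running: Yes' in status
    sql_ok = 'Slave_SQL_Running: Yes' in status
    if not (io_ok and sql_ok):
        ip = os.popen("/sbin/ifconfig|grep 'inet addr'|awk '{print $2}'").read()
        # Output looks like "addr:<ip>\n..."; keep the first address only.
        get_local_ip = ip[ip.find(':')+1:ip.find('\n')]
        mail_warn("%s" % get_local_ip)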

if __name__ == "__main__":
    main()
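
The status check in the script above can also be written by parsing the vertical-format output into a dictionary first, which makes it easy to look at other fields later. This is only a sketch with hypothetical function names, and the sample text is illustrative rather than real output from my servers.

```python
def parse_slave_status(text):
    """Turn 'Field: value' lines from vertical-format SHOW SLAVE STATUS output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def replication_ok(fields):
    """True only when both replication threads report 'Yes'."""
    return (fields.get("Slave_IO_Running") == "Yes"
            and fields.get("Slave_SQL_Running") == "Yes")


# Illustrative sample only.
SAMPLE = """\
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
        Seconds_Behind_Master: NULL
"""

if __name__ == "__main__":
    print(replication_ok(parse_slave_status(SAMPLE)))
```

An empty or failed query yields an empty dictionary, so it counts as "not OK" and would trigger the alert.
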
Mar 29

Comparing image-resize performance of Python, Go, and a C library on Linux

  I found some Go source code for this:
https://github.com/nfnt/resize
http://code.google.com/p/gorilla/source/browse/lib/appengine/example/moustachio/resize/resize.go?r=3dbce6e267e9d497dffbce31220a059f02c4e99d

  Installing with 'go get' requires git. On CentOS 6 you can simply run 'yum install git', but my machine runs CentOS 5, so git has to be built by hand.
# yum install -y gcc make curl curl-devel zlib-devel openssl-devel perl perl-devel cpio expat-devel gettext-devel
# wget http://codemonkey.org.uk/projects/git-snapshots/git/git-latest.tar.gz
# tar zxvf git-latest.tar.gz
# cd git-2013-03-28/
# autoconf
# ./configure
# make && make install
# git --version

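  With git installed, the test can start. My environment: Python 2.6 and Go 1.0.2. ImageMagick, which is written in C/C++, is also widely used, so I tested it directly from the command line. The test image is a 150 KB, 510×382 JPEG, scaled down proportionally to a width of 300.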

# go get github.com/nfnt/resize

Go test code

package main

import (
    "image/jpeg"
    "log"
    "os"

    "github.com/nfnt/resize"
)

func main() {
    // open "test.jpg"
    file, err := os.Open("test.jpg")
    if err != nil {
        log.Fatal("Open File Error: ", err)
    }

    // decode jpeg into image.Image
    img, err := jpeg.Decode(file)
    file.Close()
    if err != nil {
        log.Fatal("Not image file: ", err)
    }

    // resize to width 300 using Lanczos resampling
    // and preserve aspect ratio (height 0 means "derive from width")
    m := resize.Resize(300, 0, img, resize.Lanczos3)

    out, err := os.Create("test_go.jpg")
    if err != nil {
        log.Fatal("Save Image Error: ", err)
    }

    // write new image to file (nil options select the default JPEG quality)
    err = jpeg.Encode(out, m, nil)
    out.Close()
    if err != nil {
        log.Fatal("Encode Error: ", err)
    }
}

Python test code

#!/bin/env python
# -*- coding:utf-8 -*-
# --------------------------------
# Filename:    cut_image.py
# Revision:    1.1
# Author:      simon-zzm
# Web:         www.simonzhang.net
# Email:       simon-zzm@163.com
# --------------------------------
from PIL import Image  # a bare "import Image" only works with old PIL installs


def main():
    img = Image.open('test.jpg')
    w, h = img.size
    # scale to width 300, keeping the aspect ratio
    re_data = img.resize((300, int(h/(float(w)/300))))
    re_data.save('test_py.jpg', 'JPEG')


if __name__ == '__main__':
    main()
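
For reference, the target height computed in the Python script above can be checked with integer arithmetic. This is a small sketch; the helper name is hypothetical.

```python
def scaled_height(width, height, target_width):
    """Height that keeps the aspect ratio when the width becomes target_width (rounded down)."""
    return height * target_width // width


if __name__ == "__main__":
    # The 510x382 test image scaled to width 300.
    print(scaled_height(510, 382, 300))  # 224
```

So the 510×382 test image becomes 300×224.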

  Results measured with time on Linux:
# time python cut_image.py

real 0m0.051s
user 0m0.040s
sys 0m0.009s

# time go run cut_image.go

real 0m2.736s
user 0m2.695s
sys 0m0.039s

# time convert -resize 300x test.jpg test_c.jpg

real 0m0.073s
user 0m0.070s
sys 0m0.002s

-rw-r--r-- 1 root root 150332 Jul 16 2012 test.jpg
-rw-r--r-- 1 root root 12929 Mar 28 23:13 test_go.jpg
-rw-r--r-- 1 root root 13087 Mar 28 23:13 test_py.jpg
-rw-r--r-- 1 root root 58591 Mar 28 23:14 test_c.jpg
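
A single run under time is fairly noisy, so repeating each command and keeping the best time may give a steadier comparison. Below is a minimal sketch assuming Python 3; the helper name is hypothetical, and the placeholder command only exists so the sketch runs anywhere.

```python
import subprocess
import sys
import time


def best_time(cmd, repeats=5):
    """Run cmd several times and return the shortest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.check_call(cmd)
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    # Placeholder command; replace it with e.g. ["python", "cut_image.py"]
    # or a binary produced by "go build".
    print("%.3fs" % best_time([sys.executable, "-c", "pass"]))
```

Timing a binary built with "go build" instead of "go run" would also keep compilation out of the Go measurement.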

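Summary: with this method, Go's image-resize speed lags far behind Python's. One caveat: "go run" compiles the program before executing it, so the 2.7 s above includes compile time and is not a pure measure of the resize itself. I would appreciate pointers from anyone more experienced on how to handle this. Go's syntax is fairly simple, though, and worth learning. In my earlier multi-language test of simple accumulation, Go performed quite well, so it should still be efficient for business logic.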
Source code download