
Using shell to analyze Baidu web spider visits to list pages in nginx logs

2014-12-17 11:19, 232 views
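#!/bin/bash
# desc: this script analyzes Baidu web spider (Baiduspider/2.0) visits in an nginx access log
# date: 2014.02.25
# tested in CentOS 5.9 x86_64
# saved in /usr/local/bin/baidu-web.sh
# written by coralzd@gmail.com www.zjyxh.com

# Yesterday's date as MMDD, used as the suffix of the output files (needs GNU date)
dt=$(date -d "yesterday" +%m%d)
if [ -n "$1" ]; then
    if [ -e "$1" ]; then
        # Keep only the requests made by the Baidu web spider
        grep -i "Baiduspider/2.0" "$1" > "baiduspider-${dt}.txt"
        num=$(wc -l < "baiduspider-${dt}.txt")
        echo "baiduspider number is ${num}, file is baiduspider-${dt}.txt"
        # Count hits per request path (field 7), most visited first; the report is
        # named after the first 10 characters of the log file name
        awk '{print $7}' "baiduspider-${dt}.txt" | sort | uniq -c | sort -rn > "$(basename "$1" | cut -c 1-10)-${dt}.txt"
        echo "$1 was done"
    else
        echo "$1 does not exist!"
    fi
else
    echo "usage: $0 file_path"
fi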
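The script takes the request path from the seventh whitespace-separated field of each log line. That should hold when nginx writes its default "combined" log format; a custom log_format may put the path in a different field. A minimal sketch of the field lookup, where the sample line and the helper name extract_path are made up for illustration and are not part of the original script:

```shell
# Hypothetical log line in nginx's default "combined" format (not from a real log)
sample_line='192.0.2.1 - - [24/Feb/2014:10:00:00 +0800] "GET /list/1.html HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"'

# Print the 7th whitespace-separated field of each input line (the request path)
extract_path() {
    awk '{print $7}'
}

printf '%s\n' "$sample_line" | extract_path
```

If the path lands in another field in your logs, only the field number in the awk command needs to change.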
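Analyzing the Baidu web spider with shell works exactly the same way as analyzing the Baidu news spider; the only change is that the keyword baiduspider-news is replaced with baiduspider/2.0.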
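Since only the keyword differs between the two analyses, one way to avoid keeping two near-identical scripts is to pass the keyword as an argument. The following is a rough sketch under that assumption; the function name count_spider_urls is hypothetical and does not appear in the original script, and the path is still assumed to be field 7:

```shell
# count_spider_urls KEYWORD LOGFILE
# Prints "count path" lines for requests whose log line contains KEYWORD
# (case-insensitive), most visited path first.
count_spider_urls() {
    grep -i -- "$1" "$2" | awk '{print $7}' | sort | uniq -c | sort -rn
}

# Example calls (access.log is a placeholder file name):
#   count_spider_urls "Baiduspider/2.0"  access.log   # web spider
#   count_spider_urls "Baiduspider-news" access.log   # news spider
```

Redirecting the output to a dated file, as the script above does, would work unchanged with this function.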