[Matplotlib] Data Visualization: Worked Examples
2017-07-19 11:07
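Data Visualization: Worked Examples
Author: Bai Ningchao (白宁超), July 19, 2017, 09:09:07
Abstract: The main goal of data visualization is to communicate information clearly and effectively through graphical means. That does not mean a visualization has to look dull in order to be functional, or extremely complex in order to look beautiful. To convey ideas effectively, aesthetic form and function need to go hand in hand, giving insight into a sparse and complex data set by communicating its key aspects and features intuitively. Designers, however, often fail to balance design and function, and produce showy visualizations that miss their main purpose, which is to communicate information. Data visualization is closely related to information graphics, information visualization, scientific visualization, and statistical graphics. It is currently a very active and important area of research, teaching, and development. The term "data visualization" unites the mature field of scientific visualization with the younger field of information visualization. (Original work; please credit the source when reposting: Data Visualization: Worked Examples)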
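1 Creating a line chart
1.1 Requirements
Use matplotlib to draw a simple line chart, then customize it to make the visualization more informative. The example plots the squares of (1, 2, 3, 4, 5).
1.2 Source code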
    # coding=utf-8
    import matplotlib as mpl
    import matplotlib.pyplot as plt

    # Use a font with Chinese glyphs so the Chinese labels below render correctly
    mpl.rcParams['font.sans-serif'] = ['SimHei']
    mpl.rcParams['axes.unicode_minus'] = False

    # squares = [1, 35, 43, 3, 56, 7]
    input_values = [1, 2, 3, 4, 5]
    squares = [1, 4, 9, 16, 25]

    # Set the line width
    plt.plot(input_values, squares, linewidth=5)

    # Set the title and the axis labels
    plt.title('平方数图', fontsize=24)      # "Square numbers"
    plt.xlabel('值', fontsize=14)           # "Value"
    plt.ylabel('平方值', fontsize=14)       # "Square of value"

    # Set the size of the tick labels
    plt.tick_params(axis='both', labelsize=14)
    plt.show()
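The squares in the listing above are typed in by hand. As a minimal sketch (not the author's code), they can be derived from the input values instead. The helper names `square_series` and `plot_squares` and the output path `squares.png` are assumptions made for illustration, and matplotlib is imported inside the plotting function so that the data step runs even where matplotlib is not installed.

```python
def square_series(values):
    """Return the square of each input value, in order."""
    return [v ** 2 for v in values]


def plot_squares(values, out_path="squares.png"):
    """Plot values against their squares and save the figure.

    Hypothetical helper; assumes matplotlib is installed.
    """
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, no window needed
    import matplotlib.pyplot as plt

    squares = square_series(values)
    fig, ax = plt.subplots()
    ax.plot(values, squares, linewidth=5)
    ax.set_title("Square Numbers", fontsize=24)
    ax.set_xlabel("Value", fontsize=14)
    ax.set_ylabel("Square of Value", fontsize=14)
    ax.tick_params(axis="both", labelsize=14)
    fig.savefig(out_path, bbox_inches="tight")
    plt.close(fig)
    return out_path


input_values = [1, 2, 3, 4, 5]
squares = square_series(input_values)
print(squares)
```

Calling `plot_squares(input_values)` would then write the chart to a file rather than opening a window, which is convenient on a machine without a display.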
1.3 Output
AKDT,Max TemperatureF,Mean TemperatureF,Min TemperatureF,Max Dew PointF,MeanDew PointF,Min DewpointF,Max Humidity, Mean Humidity, Min Humidity, Max Sea Level PressureIn, Mean Sea Level PressureIn, Min Sea Level PressureIn, Max VisibilityMiles, Mean VisibilityMiles, Min VisibilityMiles, Max Wind SpeedMPH, Mean Wind SpeedMPH, Max Gust SpeedMPH,PrecipitationIn, CloudCover, Events, WindDirDegrees
2014-7-1,64,56,50,53,51,48,96,83,58,30.19,30.00,29.79,10,10,10,7,4,,0.00,7,,337
2014-7-2,71,62,55,55,52,46,96,80,51,29.81,29.75,29.66,10,9,2,13,5,,0.14,7,Rain,327
2014-7-3,64,58,53,55,53,51,97,85,72,29.88,29.86,29.81,10,10,8,15,4,,0.01,6,,258
2014-7-4,59,56,52,52,51,50,96,88,75,29.91,29.89,29.87,10,9,2,9,2,,0.07,7,Rain,255
2014-7-5,69,59,50,52,50,46,96,72,49,29.88,29.82,29.79,10,10,10,13,5,,0.00,6,,110
2014-7-6,62,58,55,51,50,46,80,71,58,30.13,30.07,29.89,10,10,10,20,10,29,0.00,6,Rain,213
2014-7-7,61,57,55,56,53,51,96,87,75,30.10,30.07,30.05,10,9,4,16,4,25,0.14,8,Rain,211
2014-7-8,55,54,53,54,53,51,100,94,86,30.10,30.06,30.04,10,6,2,12,5,23,0.84,8,Rain,159
2014-7-9,57,55,53,56,54,52,100,96,83,30.24,30.18,30.11,10,7,2,9,5,,0.13,8,Rain,201
2014-7-10,61,56,53,53,52,51,100,90,75,30.23,30.17,30.03,10,8,2,8,3,,0.03,8,Rain,215
2014-7-11,57,56,54,56,54,51,100,94,84,30.02,30.00,29.98,10,5,2,12,5,,1.28,8,Rain,250
2014-7-12,59,56,55,58,56,55,100,97,93,30.18,30.06,29.99,10,6,2,15,7,26,0.32,8,Rain,275
2014-7-13,57,56,55,58,56,55,100,98,94,30.25,30.22,30.18,10,5,1,8,4,,0.29,8,Rain,291
2014-7-14,61,58,55,58,56,51,100,94,83,30.24,30.23,30.22,10,7,0,16,4,,0.01,8,Fog,307
2014-7-15,64,58,55,53,51,48,93,78,64,30.27,30.25,30.24,10,10,10,17,12,,0.00,6,,318
2014-7-16,61,56,52,51,49,47,89,76,64,30.27,30.23,30.16,10,10,10,15,6,,0.00,6,,294
2014-7-17,59,55,51,52,50,48,93,84,75,30.16,30.04,29.82,10,10,6,9,3,,0.11,7,Rain,232
2014-7-18,63,56,51,54,52,50,100,84,67,29.79,29.69,29.65,10,10,7,10,5,,0.05,6,Rain,299
2014-7-19,60,57,54,55,53,51,97,88,75,29.91,29.82,29.68,10,9,2,9,2,,0.00,8,,292
2014-7-20,57,55,52,54,52,50,94,89,77,29.92,29.87,29.78,10,8,2,13,4,,0.31,8,Rain,155
2014-7-21,69,60,52,53,51,50,97,77,52,29.99,29.88,29.78,10,10,10,13,4,,0.00,5,,297
2014-7-22,63,59,55,56,54,52,90,84,77,30.11,30.04,29.99,10,10,10,9,3,,0.00,6,Rain,240
2014-7-23,62,58,55,54,52,50,87,80,72,30.10,30.03,29.96,10,10,10,8,3,,0.00,7,,230
2014-7-24,59,57,54,54,52,51,94,84,78,29.95,29.91,29.89,10,9,3,17,4,28,0.06,8,Rain,207
2014-7-25,57,55,53,55,53,51,100,92,81,29.91,29.87,29.83,10,8,2,13,3,,0.53,8,Rain,141
2014-7-26,57,55,53,57,55,54,100,96,93,29.96,29.91,29.87,10,8,1,15,5,24,0.57,8,Rain,216
2014-7-27,61,58,55,55,54,53,100,92,78,30.10,30.05,29.97,10,9,2,13,5,,0.30,8,Rain,213
2014-7-28,59,56,53,57,54,51,97,94,90,30.06,30.00,29.96,10,8,2,9,3,,0.61,8,Rain,261
2014-7-29,61,56,51,54,52,49,96,89,75,30.13,30.02,29.95,10,9,3,14,4,,0.25,6,Rain,153
2014-7-30,61,57,54,55,53,52,97,88,78,30.31,30.23,30.14,10,10,8,8,4,,0.08,7,Rain,160
2014-7-31,66,58,50,55,52,49,100,86,65,30.31,30.29,30.26,10,9,3,10,4,,0.00,3,,217
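Before plotting, it helps to know which column holds which measurement. The sketch below is one possible way to build a name-to-index lookup from the header row with the standard `csv` module. The sample uses only the first four columns of the header and two rows from the data above; `column_index`, `high_col`, and `low_col` are illustrative names. Several header names in the full file carry a leading space (for example ` Mean Humidity`), so each name is stripped before use.

```python
import csv
import io

# First four columns of the header and of two rows from the data above.
sample = io.StringIO(
    "AKDT,Max TemperatureF,Mean TemperatureF,Min TemperatureF\n"
    "2014-7-1,64,56,50\n"
    "2014-7-2,71,62,55\n"
)

reader = csv.reader(sample)
header_row = next(reader)

# Map each (stripped) column name to its position in the row.
column_index = {name.strip(): i for i, name in enumerate(header_row)}
for name, i in column_index.items():
    print(i, name)

high_col = column_index["Max TemperatureF"]
low_col = column_index["Min TemperatureF"]
```

Looking columns up by name this way avoids hard-coding positions such as `row[1]` and `row[3]`, at the cost of one extra dictionary.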
Contents of highs_lows.py
    import csv
    from datetime import datetime

    from matplotlib import pyplot as plt
    import matplotlib as mpl

    # Use a font with Chinese glyphs so the Chinese labels below render correctly
    mpl.rcParams['font.sans-serif'] = ['SimHei']
    mpl.rcParams['axes.unicode_minus'] = False

    # Get dates, high, and low temperatures from file.
    filename = 'death_valley_2014.csv'
    with open(filename) as f:
        reader = csv.reader(f)
        header_row = next(reader)
        # print(header_row)
        # for index, column_header in enumerate(header_row):
        #     print(index, column_header)

        dates, highs, lows = [], [], []
        for row in reader:
            try:
                current_date = datetime.strptime(row[0], "%Y-%m-%d")
                high = int(row[1])
                low = int(row[3])
            except ValueError:
                # Report rows with missing data and skip them
                print(current_date, 'missing data')
            else:
                dates.append(current_date)
                highs.append(high)
                lows.append(low)

    # Plot the data
    fig = plt.figure(dpi=120, figsize=(10, 6))
    plt.plot(dates, highs, c='red', alpha=0.5)   # alpha sets the transparency
    plt.plot(dates, lows, c='blue', alpha=0.5)
    # fill_between takes one series of x values and two series of y values
    # and shades the area between the two y series
    plt.fill_between(dates, highs, lows, facecolor='orange', alpha=0.1)

    # Format the figure
    # Title: "Daily high and low temperatures, Death Valley, California, 2014"
    plt.title('2014年加利福尼亚死亡谷日气温最高最低图', fontsize=24)
    plt.xlabel('日(D)', fontsize=16)      # "Day (D)"
    fig.autofmt_xdate()                    # draw the date labels at a slant
    plt.ylabel('温度(F)', fontsize=16)     # "Temperature (F)"
    plt.tick_params(axis='both', which='major', labelsize=16)
    # plt.axis([0, 31, 54, 72])            # custom axis ranges
    plt.savefig('highs_lows.png', bbox_inches='tight')
    plt.show()
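The `try`/`except ValueError`/`else` pattern in the script is what keeps one incomplete row from aborting the whole run: `int('')` raises `ValueError`, so a row with an empty temperature field is reported and skipped. The following is a small self-contained sketch of that idea, not the author's code. It reads from an in-memory sample instead of a file; `read_highs_lows` is a hypothetical helper name, three rows are taken (first four columns only) from the data shown earlier, and the `2014-8-1` row is invented purely to exercise the error path.

```python
import csv
import io
from datetime import datetime

sample = io.StringIO(
    "AKDT,Max TemperatureF,Mean TemperatureF,Min TemperatureF\n"
    "2014-7-1,64,56,50\n"
    "2014-7-2,71,62,55\n"
    "2014-8-1,,58,53\n"   # hypothetical row with a missing high temperature
    "2014-7-4,59,56,52\n"
)


def read_highs_lows(f):
    """Return (dates, highs, lows, skipped) parsed from an open CSV file."""
    reader = csv.reader(f)
    next(reader)  # skip the header row
    dates, highs, lows, skipped = [], [], [], []
    for row in reader:
        try:
            current_date = datetime.strptime(row[0], "%Y-%m-%d")
            high = int(row[1])  # int('') raises ValueError
            low = int(row[3])
        except ValueError:
            skipped.append(row[0])  # remember which row was dropped
        else:
            dates.append(current_date)
            highs.append(high)
            lows.append(low)
    return dates, highs, lows, skipped


dates, highs, lows, skipped = read_highs_lows(sample)
print(len(dates), "rows kept; skipped:", skipped)
```

Collecting the skipped dates in a list, rather than only printing them, makes it easy to check afterwards how much of the file was actually plotted.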
6.3 Output
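7 Building a world population map: JSON format
7.1 Requirements
Download population data in JSON format and process it with the json module.
7.2 Source code
Excerpt of the JSON data in population_data.json
countries.py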
    from pygal.maps.world import COUNTRIES

    for country_code in sorted(COUNTRIES.keys()):
        print(country_code, COUNTRIES[country_code])
country_codes.py
    from pygal.maps.world import COUNTRIES


    def get_country_code(country_name):
        """Return the Pygal 2-letter country code for the given country."""
        for code, name in COUNTRIES.items():
            if name == country_name:
                return code
        # If the country wasn't found, return None.
        return None


    if __name__ == '__main__':
        # Quick check; guarded so it does not run when the module is imported
        print(get_country_code('Thailand'))
        # print(get_country_code('Andorra'))
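The lookup above scans every entry on each call. When many names have to be resolved, one alternative is to invert the mapping once and use a dictionary lookup. The sketch below illustrates this under stated assumptions: the small `COUNTRIES` dictionary is a hypothetical stand-in for `pygal.maps.world.COUNTRIES` (so the example runs without Pygal installed), and `NAME_TO_CODE` is an illustrative name.

```python
# Hypothetical excerpt standing in for pygal.maps.world.COUNTRIES,
# which maps two-letter codes to country names.
COUNTRIES = {
    "ad": "Andorra",
    "th": "Thailand",
    "us": "United States",
}

# Invert the mapping once: country name -> two-letter code.
NAME_TO_CODE = {name: code for code, name in COUNTRIES.items()}


def get_country_code(country_name):
    """Return the two-letter code for country_name, or None if unknown."""
    return NAME_TO_CODE.get(country_name)


print(get_country_code("Thailand"))
print(get_country_code("Atlantis"))
```

Names in a population data set do not always match the map's names exactly, so a `None` result usually means the spelling differs rather than that the country is absent from the map.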
americas.py
    import pygal

    wm = pygal.maps.world.World()
    wm.title = 'North, Central, and South America'

    wm.add('North America', ['ca', 'mx', 'us'])
    wm.add('Central America', ['bz', 'cr', 'gt', 'hn', 'ni', 'pa', 'sv'])
    wm.add('South America', ['ar', 'bo', 'br', 'cl', 'co', 'ec', 'gf',
                             'gy', 'pe', 'py', 'sr', 'uy', 've'])
    wm.add('Asia', ['cn', 'jp', 'th'])

    wm.render_to_file('americas.svg')
world_population.py
    # coding=utf-8
    import json

    from matplotlib import pyplot as plt
    import matplotlib as mpl
    from country_codes import get_country_code
    import pygal
    from pygal.style import RotateStyle
    from pygal.style import LightColorizedStyle

    # Use a font with Chinese glyphs for matplotlib output
    mpl.rcParams['font.sans-serif'] = ['SimHei']
    mpl.rcParams['axes.unicode_minus'] = False

    # Load the JSON data
    filename = 'population_data.json'
    with open(filename) as f:
        pop_data = json.load(f)
    # print(pop_data[1])

    # Build a dictionary of populations keyed by country code
    cc_populations = {}
    # cc1_populations = {}

    # Collect each country's population for 2010
    for pop_dict in pop_data:
        if pop_dict['Year'] == '2010':
            country_name = pop_dict['Country Name']
            # Convert the numeric string to an integer
            population = int(float(pop_dict['Value']))
            # print(country_name + ":" + str(population))
            code = get_country_code(country_name)
            if code:
                cc_populations[code] = population
        # elif pop_dict['Year'] == '2009':
        #     country_name = pop_dict['Country Name']
        #     population = int(float(pop_dict['Value']))
        #     # print(country_name + ":" + str(population))
        #     code = get_country_code(country_name)
        #     if code:
        #         cc1_populations[code] = population

    # Group the countries into three population levels
    cc_pops_1, cc_pops_2, cc_pops_3 = {}, {}, {}
    for cc, pop in cc_populations.items():
        if pop < 10000000:
            cc_pops_1[cc] = pop
        elif pop < 1000000000:
            cc_pops_2[cc] = pop
        else:
            cc_pops_3[cc] = pop
    # print(len(cc_pops_1), len(cc_pops_2), len(cc_pops_3))

    wm_style = RotateStyle('#336699', base_style=LightColorizedStyle)
    wm = pygal.maps.world.World(style=wm_style)
    wm.title = '2010年世界各国人口统计图'   # "World population by country, 2010"
    wm.add('0-10m', cc_pops_1)
    wm.add('10m-1bn', cc_pops_2)
    wm.add('>1bn', cc_pops_3)
    # wm.add('2009', cc1_populations)

    wm.render_to_file('world_populations.svg')
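Two steps in the script are easy to get wrong: converting the `Value` strings and assigning each country to a population band. A decimal string such as `'5000000.4'` cannot be passed to `int()` directly, which is why the script goes through `float()` first. The sketch below isolates both steps. The records are invented for illustration (only the keys `Country Name`, `Year`, and `Value` come from the script), `bucket` is a hypothetical helper, and country names are used as keys instead of Pygal codes so that the example needs only the standard library.

```python
import json

# Hypothetical records using the keys the script reads.
raw = """[
  {"Country Name": "Smallland", "Year": "2010", "Value": "5000000.4"},
  {"Country Name": "Midland",   "Year": "2010", "Value": "250000000"},
  {"Country Name": "Midland",   "Year": "2009", "Value": "240000000"},
  {"Country Name": "Bigland",   "Year": "2010", "Value": "1300000000.0"}
]"""


def bucket(population):
    """Return the label of the population band for one country."""
    if population < 10_000_000:
        return "0-10m"
    elif population < 1_000_000_000:
        return "10m-1bn"
    return ">1bn"


groups = {"0-10m": {}, "10m-1bn": {}, ">1bn": {}}
for record in json.loads(raw):
    if record["Year"] == "2010":
        # int("5000000.4") would raise ValueError, so convert via float.
        population = int(float(record["Value"]))
        groups[bucket(population)][record["Country Name"]] = population

for label, members in groups.items():
    print(label, members)
```

Each dictionary in `groups` has the same shape as `cc_pops_1`, `cc_pops_2`, and `cc_pops_3`, so with country codes as keys it could be passed to `wm.add` in the same way.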
7.3 Output
[Figures: output of countries.py; output of world_population.py]
8 Visualizing GitHub repositories with Pygal
8.1 Requirements
Call a web API and visualize the data it returns about GitHub repositories: https://api.github.com/search/repositories?q=language:python&sort=stars
8.2 Source code
python_repos.py
    # coding=utf-8
    import requests
    import pygal
    from pygal.style import LightColorizedStyle as LCS, LightenStyle as LS

    # Make an API call, and store the response.
    url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
    r = requests.get(url)
    print("Status code:", r.status_code)  # 200 means the request succeeded
    response_dict = r.json()
    # print(response_dict.keys())
    print("Total repositories:", response_dict['total_count'])

    # Explore information about the repositories.
    repo_dicts = response_dict['items']
    print("Repositories returned:", len(repo_dicts))

    # Inspect the information for each project
    # repo_dict = repo_dicts[0]
    # print('\n\neach repository:')
    # for repo_dict in repo_dicts:
    #     print("\nName:", repo_dict['name'])
    #     print("Owner:", repo_dict['owner']['login'])
    #     print("Stars:", repo_dict['stargazers_count'])
    #     print("Repository:", repo_dict['html_url'])
    #     print("Description:", repo_dict['description'])

    # Inspect the keys available for each project
    # print('\nKeys:', len(repo_dict))
    # for key in sorted(repo_dict.keys()):
    #     print(key)

    names, plot_dicts = [], []
    for repo_dict in repo_dicts:
        names.append(repo_dict['name'])
        plot_dicts.append(repo_dict['stargazers_count'])

    # Build the visualization
    my_style = LS('#333366', base_style=LCS)
    my_config = pygal.Config()            # create a Pygal Config instance
    my_config.x_label_rotation = 45       # rotate the x-axis labels by 45 degrees
    my_config.show_legend = False         # hide the legend
    my_config.title_font_size = 24        # font sizes for the title, minor labels, and major labels
    my_config.label_font_size = 14
    my_config.major_label_font_size = 18
    my_config.truncate_label = 15         # truncate long project names to 15 characters
    my_config.show_y_guides = False       # hide the horizontal guide lines
    my_config.width = 1000                # custom chart width

    chart = pygal.Bar(my_config, style=my_style)
    chart.title = 'Most-Starred Python Projects on GitHub'
    chart.x_labels = names
    chart.add('', plot_dicts)
    chart.render_to_file('python_repos.svg')
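In the script, `plot_dicts` ends up holding plain star counts. Pygal also accepts a dictionary per data point, with a `value` key and optional keys such as `xlink`, which turns each bar into a link. The sketch below shows one way the chart data could be prepared in that form; it is an illustration under stated assumptions, not the author's code. The `response_dict` here is a made-up, trimmed stand-in for `r.json()` containing only keys that appear in the script, the URLs are placeholders, and `build_chart_data` is a hypothetical helper. No network request is made.

```python
# Hypothetical, trimmed stand-in for the parsed API response.
response_dict = {
    "total_count": 3,
    "items": [
        {"name": "alpha", "stargazers_count": 120,
         "html_url": "https://example.com/alpha"},
        {"name": "beta", "stargazers_count": 340,
         "html_url": "https://example.com/beta"},
        {"name": "a-very-long-repository-name", "stargazers_count": 45,
         "html_url": "https://example.com/long"},
    ],
}


def build_chart_data(response):
    """Return (names, plot_dicts) sorted by star count, highest first."""
    repos = sorted(
        response["items"],
        key=lambda repo: repo["stargazers_count"],
        reverse=True,
    )
    names = [repo["name"] for repo in repos]
    # One dict per bar: the bar height plus the link to open on click.
    plot_dicts = [
        {"value": repo["stargazers_count"], "xlink": repo["html_url"]}
        for repo in repos
    ]
    return names, plot_dicts


names, plot_dicts = build_chart_data(response_dict)
print(names)
print(plot_dicts[0])
```

The returned lists can be used exactly as in the script: `chart.x_labels = names` followed by `chart.add('', plot_dicts)`. Sorting locally is optional, since the request already asks the API to sort by stars.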
8.3 Output