Visualizations&plot&pivot_table

2016-04-22 09:28

Dataset

The dataset forest_fires.csv contains information about forest fires in a national park in Portugal. Some of its features are described below.

X – The X position on the grid where the fire occurred.

Y – The Y position on the grid where the fire occurred.

month – the month the fire occurred.

day – the day of the week the fire occurred.

temp – the temperature in Celsius when the fire occurred.

wind – the wind speed when the fire occurred.

rain – the rainfall when the fire occurred.

area – the area the fire consumed.

Example rows

X,Y,month,day,FFMC,DMC,DC,ISI,temp,RH,wind,rain,area

7,5,mar,fri,86.2,26.2,94.3,5.1,8.2,51,6.7,0,0

7,4,oct,tue,90.6,35.4,669.1,6.7,18,33,0.9,0,0
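
The sample rows can be parsed directly to check the column names and types. This is a minimal sketch that assumes only the header and two rows shown above; the in-memory string `sample_csv` is a stand-in for the real forest_fires.csv file.

```python
import io

import pandas as pd

# Header and two rows copied from the example above (stand-in for forest_fires.csv).
sample_csv = """X,Y,month,day,FFMC,DMC,DC,ISI,temp,RH,wind,rain,area
7,5,mar,fri,86.2,26.2,94.3,5.1,8.2,51,6.7,0,0
7,4,oct,tue,90.6,35.4,669.1,6.7,18,33,0.9,0,0
"""

sample = pd.read_csv(io.StringIO(sample_csv))

# The columns described in the feature list above.
described = ["X", "Y", "month", "day", "temp", "wind", "rain", "area"]
print(sample[described])
print(sample.dtypes)
```

With the real file, `pd.read_csv("forest_fires.csv")` replaces the `io.StringIO` call.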

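Scatter Plots

Most plotting functions in Python come from matplotlib's pyplot module. Drawing a scatter plot usually takes three steps:

Initialize a figure

Draw the points on the figure

Show the figure

Here is an example that uses the scatter() function to draw a scatter plot: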

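import pandas
import matplotlib.pyplot as plt  # provides the plt name used below

forest_fires = pandas.read_csv("forest_fires.csv")
# Wind speed on the x-axis, burned area on the y-axis.
plt.scatter(forest_fires["wind"], forest_fires["area"])
plt.show()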
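
The three steps can also be written out explicitly. The sketch below uses made-up wind and area values instead of the CSV file, and it assumes a headless run: it selects the non-interactive Agg backend and renders to an in-memory PNG, where an interactive session would simply call plt.show().

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in an interactive session
import matplotlib.pyplot as plt

# Hypothetical wind speeds and burned areas, not taken from forest_fires.csv.
wind = [0.9, 2.7, 4.0, 5.4, 6.7]
area = [0.0, 1.2, 0.5, 8.3, 3.1]

fig = plt.figure()                 # step 1: initialize a figure
points = plt.scatter(wind, area)   # step 2: draw the points on it
buffer = io.BytesIO()
fig.savefig(buffer, format="png")  # step 3: render; plt.show() would open a window instead
```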


The resulting plot is shown below. Note that the points are drawn on a coordinate grid.



Line Charts

age = [5, 10, 15, 20, 25, 30]
height = [25, 45, 65, 75, 75, 75]
plt.plot(age, height)
plt.show()


The resulting plot is shown below:



Bar Graphs

pivot_table() is the aggregation function mentioned earlier. Here it groups the rows by "Y" and computes the mean "area" for each value of "Y". It returns a DataFrame.

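# Mean burned area for each grid row (Y) and each grid column (X).
area_by_y = forest_fires.pivot_table(index="Y", values="area", aggfunc="mean")
area_by_x = forest_fires.pivot_table(index="X", values="area", aggfunc="mean")
# pivot_table returns a DataFrame, so pass its "area" column as the bar heights.
plt.bar(area_by_y.index, area_by_y["area"])
plt.show()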
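
To see what the pivot table holds before plotting it, the grouping can be checked on a tiny frame. This sketch uses a hypothetical four-row dataset (not the real file) and compares pivot_table with an equivalent groupby.

```python
import pandas as pd

# Hypothetical miniature of the dataset: two grid rows (Y), two fires each.
fires = pd.DataFrame({"Y": [4, 4, 5, 5], "area": [0.0, 2.0, 1.0, 3.0]})

# Group by Y and average the area, once with pivot_table and once with groupby.
by_pivot = fires.pivot_table(index="Y", values="area", aggfunc="mean")
by_group = fires.groupby("Y")[["area"]].mean()

print(type(by_pivot).__name__)  # DataFrame
print(by_pivot)
print(by_group)

# Selecting the single column gives a 1-D Series, which suits plt.bar heights.
heights = by_pivot["area"]
```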




Here is an example of pivot_table:

>>> df
     A    B      C  D
0  foo  one  small  1
1  foo  one  large  2
2  foo  one  large  2
3  foo  two  small  3
4  foo  two  small  3
5  bar  one  large  4
6  bar  one  small  5
7  bar  two  small  6
8  bar  two  large  7

>>> table = pivot_table(df, values='D', index=['A', 'B'],
...                     columns=['C'], aggfunc=np.sum)
>>> table
          small  large
foo  one  1      4
     two  6      NaN
bar  one  5      4
     two  6      7
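
The listing above does not show how df is built, so here is a runnable sketch of the same example. The DataFrame is reconstructed from the printed rows, and aggfunc is given as the string "sum" rather than np.sum, which is an equivalent spelling. The row and column order that pandas prints may differ from the listing.

```python
import pandas as pd

# Rebuilt from the nine rows printed above.
df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# Rows are (A, B) pairs, columns are the values of C, cells are sums of D.
table = pd.pivot_table(df, values="D", index=["A", "B"],
                       columns=["C"], aggfunc="sum")
print(table)

# A combination with no rows, such as foo/two/large, is left as NaN.
print(table.loc[("foo", "two"), "large"])
```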


Horizontal Bar Graphs

A horizontal bar graph is similar to a bar graph, except that the bars run horizontally rather than vertically.

area_by_month = forest_fires.pivot_table(index="month", values="area", aggfunc="mean")
# One bar per month; the bar length is the mean burned area.
plt.barh(range(len(area_by_month)), area_by_month["area"])
plt.show()
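
Because the bars above are positioned with range(...), the axis shows numbers rather than month names. One possible variation, sketched here on a hypothetical five-row dataset, passes the month names themselves as the bar positions so that matplotlib labels each bar. The Agg backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in an interactive session
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical fires, not taken from forest_fires.csv.
fires = pd.DataFrame({
    "month": ["mar", "mar", "aug", "aug", "oct"],
    "area": [0.0, 4.0, 10.0, 20.0, 6.0],
})
area_by_month = fires.pivot_table(index="month", values="area", aggfunc="mean")

plt.figure()
# String positions are treated as categories, so each bar is labeled by its month.
bars = plt.barh(list(area_by_month.index), area_by_month["area"])
plt.xlabel("Mean area")
```

Note that the pivot table sorts months alphabetically, not in calendar order.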




Labels

The following methods must be called before show() for their output to appear on the plot.

Title – the title() method.

X axis label – the xlabel() method.

Y axis label – the ylabel() method.

plt.scatter(forest_fires["wind"], forest_fires["area"])
plt.xlabel('Wind speed when fire started')
plt.ylabel('Area consumed by fire')
plt.title("Wind speed vs fire area")
plt.show()
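
The labels are stored on the current axes, so they can be read back to confirm they were set before the figure is rendered. A minimal sketch with made-up data points; the Agg backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in an interactive session
import matplotlib.pyplot as plt

# Hypothetical wind speeds and burned areas.
wind = [0.9, 4.0, 6.7]
area = [0.0, 0.5, 3.1]

plt.figure()
plt.scatter(wind, area)
plt.xlabel("Wind speed when fire started")
plt.ylabel("Area consumed by fire")
plt.title("Wind speed vs fire area")

# Read the labels back from the axes that pyplot is currently drawing on.
ax = plt.gca()
print(ax.get_xlabel(), "|", ax.get_ylabel(), "|", ax.get_title())
```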




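Style

The plots above are not very attractive because they use matplotlib's default style. To make a chart more appealing, call style.use() to apply a different style. Here are a few of the built-in styles: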

fivethirtyeight – the style of the plots on the site fivethirtyeight.com.

ggplot – the style of the popular R plotting library ggplot.

dark_background – will give the plot a darker background.

bmh – the style used in a popular online statistics book.

Here is an example that uses fivethirtyeight:

plt.style.use('fivethirtyeight')
plt.scatter(forest_fires["rain"], forest_fires["area"])
plt.show()
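
style.use() changes the style for every plot drawn afterwards. As a sketch of two related calls, plt.style.available lists the style names installed with matplotlib, and plt.style.context() applies a style only inside a with-block. The data points are hypothetical, and the Agg backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in an interactive session
import matplotlib.pyplot as plt

# Check that the four styles described above are installed.
builtin = ["fivethirtyeight", "ggplot", "dark_background", "bmh"]
print([name for name in builtin if name in plt.style.available])

# The ggplot style applies only to plots created inside this block.
with plt.style.context("ggplot"):
    fig = plt.figure()
    plt.scatter([0.0, 0.2, 1.0], [0.0, 3.1, 0.5])  # hypothetical rain and area values
```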

