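Google open-sources Facets, a visualization tool to be used in its human + AI collaboration research project: essentially a plotting toolkit for exploring data during feature engineering, which is something pandas can also do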
2017-07-24 10:57
See: http://www.infoq.com/cn/news/2017/07/goole-sight-facets-ai
https://github.com/PAIR-code/facets/blob/master/facets_dive/README.md
Introduction
The Facets project contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview and Facets Dive. The visualizations are implemented as Polymer web components backed by TypeScript code, and can be easily embedded into Jupyter notebooks or webpages.
Live demos of the visualizations can be found on the Facets project description page.
Facets Overview
Overview gives a high-level view of one or more data sets. It produces a visual feature-by-feature statistical analysis, and can also be used to compare statistics across two or more data sets. The tool can process both numeric and string features, including multiple instances of a number or string per feature.
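As a rough illustration of what a feature-by-feature statistical summary involves, here is a minimal sketch in plain Python. It is not the Facets API: the function names `summarize_feature` and `summarize_dataset`, the list-of-dicts data layout, and the use of `None` to mark a missing value are all assumptions made for this example, and it handles only one value per feature per example.

```python
from collections import Counter
from statistics import mean, median


def summarize_feature(values):
    """Summarize one feature column; None marks a missing value (assumption)."""
    present = [v for v in values if v is not None]
    summary = {"count": len(values), "missing": len(values) - len(present)}
    if present and all(isinstance(v, (int, float)) for v in present):
        # Numeric feature: basic descriptive statistics.
        summary.update(kind="numeric", min=min(present), max=max(present),
                       mean=mean(present), median=median(present))
    else:
        # String feature (or an all-missing column): count distinct values.
        counts = Counter(str(v) for v in present)
        top = counts.most_common(1)
        summary.update(kind="string", unique=len(counts),
                       top=top[0][0] if top else None)
    return summary


def summarize_dataset(rows):
    """Summarize every feature of a dataset given as a list of dicts."""
    names = sorted({name for row in rows for name in row})
    return {name: summarize_feature([row.get(name) for row in rows])
            for name in names}


# Hypothetical toy data, made up for this sketch.
train = [
    {"age": 39, "occupation": "Sales"},
    {"age": 50, "occupation": None},
    {"age": None, "occupation": "Sales"},
    {"age": 28, "occupation": "Tech-support"},
]
for name, stats in summarize_dataset(train).items():
    print(name, stats)
```

Running the same summary over two or more datasets and placing the results side by side gives the kind of comparison described above, only without the visualization.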
Overview can help uncover issues with datasets, including the following:
Unexpected feature values
Missing feature values for a large number of examples
Training/serving skew
Training/test/validation set skew
Key aspects of the visualization are outlier detection and distribution comparison across multiple datasets. Interesting values (such as a high proportion of missing data, or very different distributions of a feature across multiple datasets) are highlighted in red. Features can be sorted by values of interest such as the number of missing values or the skew between the different datasets.
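The checks described above (a high proportion of missing data, a feature distributed very differently across datasets, sorting by a value of interest) can be sketched in plain Python. The README text quoted here does not say which metric Facets uses, so this example assumes total variation distance as the skew measure; the 0.25 threshold, the function names, and the toy data are likewise hypothetical.

```python
from collections import Counter


def missing_fraction(values):
    """Fraction of entries that are None (treated as missing)."""
    return sum(v is None for v in values) / len(values) if values else 0.0


def distribution(values):
    """Relative frequency of each non-missing value."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {k: c / len(present) for k, c in counts.items()} if present else {}


def total_variation(p, q):
    """Total variation distance between two discrete distributions (0 to 1)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def compare(train, test, threshold=0.25):
    """Per-feature report, sorted so the largest train/test skew comes first."""
    report = []
    for name in sorted(set(train) | set(test)):
        a, b = train.get(name, []), test.get(name, [])
        skew = total_variation(distribution(a), distribution(b))
        worst_missing = max(missing_fraction(a), missing_fraction(b))
        report.append({
            "feature": name,
            "missing": worst_missing,
            "skew": skew,
            # Stand-in for the red highlighting: exceeds the assumed threshold.
            "flag": worst_missing > threshold or skew > threshold,
        })
    return sorted(report, key=lambda r: r["skew"], reverse=True)


# Hypothetical toy datasets, stored as one list of values per feature.
train = {"color": ["red", "red", "blue", "blue"], "size": ["S", "M", None, None]}
test = {"color": ["red", "red", "red", "blue"], "size": ["S", "M", "S", "M"]}
for row in compare(train, test):
    print(row)
```

Changing the `key` passed to `sorted` (for example to `r["missing"]`) reorders the report by a different value of interest, mirroring the sort options mentioned above.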
Details about Overview usage can be found in its README.
Facets Dive