
《数据仓库与数据挖掘》(Data Warehouse and Data Mining), Chapter 8.ppt


Chapter 6: Association Rule Mining
Association rule mining
Algorithms for scalable mining of (single-dimensional Boolean) association rules in transactional databases
Mining various kinds of association/correlation rules
Constraint-based association mining
Sequential pattern mining
Applications/extensions of frequent pattern mining
Summary
2017/11/10
1
Data Mining: Concepts and Techniques
What Is Association Mining?
Association rule mining:
Finding frequent patterns, associations, correlations, or causal structures among sets of items or objects in transaction databases, relational databases, and other information repositories.
Frequent pattern: pattern (set of items, sequence, etc.) that occurs frequently in a database [AIS93]
Motivation: finding regularities in data
What products were often purchased together? — Beer and diapers?!
What are the subsequent purchases after buying a PC?
What kinds of DNA are sensitive to this new drug?
Can we automatically classify web documents?
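Basic Concepts of Association Rule Mining
Market basket analysis: the example that motivated association rule mining
Question: "Which groups or sets of items are customers likely to buy together in a single shopping trip?"
Market basket analysis: let the universe be the set of all items sold by the store (the full item set). The items bought in one shopping trip (a transaction) form a subset of that full item set. If each item is represented by a Boolean variable indicating whether it is present, each basket can be represented as a Boolean vector. Analyzing these Boolean vectors yields purchase patterns that reflect items which are frequently associated or bought together. These patterns can be described by association rules.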
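The Boolean-vector encoding described above can be sketched in a few lines of Python. The item names and baskets here are invented for illustration and do not come from the slides.

```python
# Hypothetical item universe (the "full item set"); invented for illustration.
ITEMS = ["bread", "milk", "beer", "diapers"]

def to_boolean_vector(basket, items=ITEMS):
    """Return one Boolean per item: True if the basket contains that item."""
    contents = set(basket)
    return [item in contents for item in items]

# Hypothetical transactions, each a subset of ITEMS.
baskets = [
    ["bread", "milk"],
    ["beer", "diapers", "bread"],
    ["milk", "beer", "diapers"],
]
vectors = [to_boolean_vector(b) for b in baskets]

# Count the baskets in which beer and diapers appear together.
beer, diapers = ITEMS.index("beer"), ITEMS.index("diapers")
both = sum(1 for v in vectors if v[beer] and v[diapers])
print(vectors[0], both)
```

Counting co-occurrences over such vectors is the raw material from which the purchase patterns are derived.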
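Example: the association rule between buying a computer and buying financial management software can be written as:
computer ⇒ financial_management_software
[support = 2%, confidence = 60%]
Here support is the rule's support and confidence is its confidence.
The rule states that 2% of all analyzed transactions include both a computer and financial management software, and that 60% of the customers who bought a computer also bought financial management software.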
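As a minimal sketch of how these two measures are computed, the following Python assumes each transaction is a set of item names. The toy data is hypothetical and chosen only so the arithmetic is easy to follow; it does not reproduce the 2% and 60% figures of the example.

```python
def support(transactions, itemset):
    """Fraction of all transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Among transactions containing antecedent, the fraction that also contain consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    if not with_antecedent:
        return 0.0
    hits = sum(1 for t in with_antecedent if consequent <= set(t))
    return hits / len(with_antecedent)

# Hypothetical data: 10 transactions, 5 with a computer, 3 of those with the software.
transactions = (
    [{"computer", "financial_management_software"}] * 3
    + [{"computer", "printer"}] * 2
    + [{"printer"}] * 5
)

s = support(transactions, {"computer", "financial_management_software"})
c = confidence(transactions, {"computer"}, {"financial_management_software"})
print(f"support={s:.0%}, confidence={c:.0%}")
```

Support is measured against all transactions, while confidence is measured only against the transactions that contain the rule's left-hand side, which is why the two percentages differ.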
Why Is Frequent Pattern or Association Mining an Essential Task in Data Mining?
Foundation for many essential data mining tasks
Association, correlation, causality
Sequential patterns, temporal or cyclic association, partial periodicity, spatial and multimedia association
Associative classification, cluster analysis, iceberg cube, fascicles (semantic data compression)
Broad applications
Basket data analysis, cross-marketing, catalog design, sale campaign analysis
Web log (click stream) analysis, DNA sequence analysis, etc.
