Data Science for Business

  • Word count: 509,000
  • Binding: Paperback
  • Publisher: Southeast University Press
  • Authors: Foster Provost (US), Tom Fawcett (US)
  • Publication date: 2018-02-01
  • Barcode: 9787564175283
  • Edition: 1
  • Format: 16mo
  • Pages: 386
  • Publication year: 2018
List price: ¥98
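Description
Data Science for Business (Reprint Edition, English) is a broad and deep yet not overly technical guide. It introduces the fundamental principles of data science and walks you through the "data-analytic thinking" needed to extract useful knowledge and business value from the data you collect. By learning these principles, you will come to understand many of the data mining techniques in use today. More importantly, these principles underpin the processes and strategies needed to solve business problems with data mining techniques.
Table of Contents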
Preface
1. Introduction: Data-Analytic Thinking
The Ubiquity of Data Opportunities
Example: Hurricane Frances
Example: Predicting Customer Churn
Data Science, Engineering, and Data-Driven Decision Making
Data Processing and "Big Data"
From Big Data 1.0 to Big Data 2.0
Data and Data Science Capability as a Strategic Asset
Data-Analytic Thinking
This Book
Data Mining and Data Science, Revisited
Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist
Summary
2. Business Problems and Data Science Solutions
Fundamental concepts: A set of canonical data mining tasks; The data mining process; Supervised versus unsupervised data mining.
From Business Problems to Data Mining Tasks
Supervised Versus Unsupervised Methods
Data Mining and Its Results
The Data Mining Process
Business Understanding
Data Understanding
Data Preparation
Modeling
Evaluation
Deployment
Implications for Managing the Data Science Team
Other Analytics Techniques and Technologies
Statistics
Database Querying
Data Warehousing
Regression Analysis
Machine Learning and Data Mining
Answering Business Questions with These Techniques
Summary
3. Introduction to Predictive Modeling: From Correlation to Supervised Segmentation
Fundamental concepts: Identifying informative attributes; Segmenting data by progressive attribute selection.
Exemplary techniques: Finding correlations; Attribute/variable selection; Tree induction.
Models, Induction, and Prediction
Supervised Segmentation
Selecting Informative Attributes
Example: Attribute Selection with Information Gain
Supervised Segmentation with Tree-Structured Models
Visualizing Segmentations
Trees as Sets of Rules
Probability Estimation
Example: Addressing the Churn Problem with Tree Induction
Summary
4. Fitting a Model to Data
Fundamental concepts: Finding "optimal" model parameters based on data; Choosing the goal for data mining; Objective functions; Loss functions.
Exemplary techniques: Linear regression; Logistic regression; Support-vector machines.
Classification via Mathematical Functions
Linear Discriminant Functions
Optimizing an Objective Function
An Example of Mining a Linear Discriminant from Data
Linear Discriminant Functions for Scoring and Ranking Instances
Support Vector Machines, Briefly
Regression via Mathematical Functions
Class Probability Estimation and Logistic "Regression"
*Logistic Regression: Some Technical Details
Example: Logistic Regression versus Tree Induction
Nonlinear Functions, Support Vector Machines, and Neural Networks
Summary
5. Overfitting and Its Avoidance
Fundamental concepts: Generalization; Fitting and overfitting; Complexity control.
Exemplary techniques: Cross-validation; Attribute selection; Tree pruning; Regularization.
Generalization
Overfitting
Overfitting Examined
Holdout Data and Fitting Graphs
Overfitting in Tree Induction
Overfitting in Mathematical Functions
Example: Overfitting Linear Functions
*Example: Why Is Overfitting Bad?
From Holdout Evaluation to Cross-Validation
The Churn Dataset Revisited
Learning Curves
Overfitting Avoidance and Complexity Control
Avoiding Overfitting with Tree Induction
A General Method for Avoiding Overfitting
*Avoiding Overfitting for Parameter Optimization
Summary
6. Similarity, Neighbors, and Clusters
Fundamental concepts: Calculating similarity of objects described by data; Using similarity for prediction; Clustering as similarity-based segmentation.
Exemplary techniques: Searching for similar entities; Nearest neighbor methods; Clustering methods; Distance metrics for calculating similarity.
Similarity and Distance
Nearest-Neighbor Reasoning
Example: Whiskey Analytics
Nearest Neighbors for Predictive Modeling
How Many Neighbors and How Much Influence?
Geometric Interpretation, Overfitting, and Complexity Control
Issues with Nearest-Neighbor Methods
Some Important Technical Details Relating to Similarities and Neighbors
Heterogeneous Attributes
*Other Distance Functions
*Combining Functions: Calculating Scores from Neighbors
Clustering
Example: Whiskey Analytics Revisited
Hierarchical Clustering
Nearest Neighbors Revisited: Clustering Around Centroids
Example: Clustering Business News Stories
Understanding the Results of Clustering
*Using Supervised Learning to Generate Cluster Descriptions
Stepping Back: Solving a Business Problem Versus Data Exploration
Summary
7. Decision Analytic Thinking I: What Is a Good Model?
Fundamental concepts: Careful consideration of what is desired from data science results; Expected value as a key evaluation framework; Consideration of appropriate comparative baselines.
Exemplary techniques: Various evaluation metrics; Estimating costs and benefits; Calculating expected profit; Creating baseline methods for comparison.
Evaluating Classifiers
Plain Accuracy and Its Problems
The Confusion Matrix
Problems with Unbalanced Classes
Problems with Unequal Costs and Benefits
Generalizing Beyond Classification
A Key Analytical Framework: Expected Value
Using Expected Value to Frame Classifier Use
Using Expected Value to Frame Classifier Evaluation
Evaluation, Baseline Performance, and Implications for Investments in Data
Summary
8. Visualizing Model Performance
Fundamental concepts: Visualization of model performance under various kinds of uncertainty; Further consideration of what is desired from data mining results.
Exemplary techniques: Profit curves; Cumulative response curves; Lift curves; ROC curves.
Ranking Instead of Classifying
Profit Curves
ROC Graphs and Curves
The Area Under the ROC Curve (AUC)
Cumulative Response and Lift Curves
Example: Performance Analytics for Churn Modeling
Summary
9. Evidence and Probabilities
Fundamental concepts: Explicit evidence combination with Bayes' Rule; Probabilistic reasoning via assumptions of conditional independence.
Exemplary techniques: Naive Bayes classification; Evidence lift.
Example: Targeting Online Consumers With Advertisements
Combining Evidence Probabilistically
Joint Probability and Independence
Bayes' Rule
Applying Bayes' Rule to Data Science
Conditional Independence and Naive Bayes
Advantages and Disadvantages of Naive Bayes
A Model of Evidence "Lift"
Example: Evidence Lifts from Facebook "Likes"
Evidence in Action: Targeting Consumers with Ads
Summary
……
10. Representing and Mining Text
11. Decision Analytic Thinking II: Toward Analytical Engineering
12. Other Data Science Tasks and Techniques
13. Data Science and Business Strategy
14. Conclusion
A. Proposal Review Guide
B. Another Sample Proposal
Glossary
Bibliography
Index
