Data Wrangling with Python (Reprint Edition, English)

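  • Word count: 622
  • Publisher: Southeast University (东南大学)
  • Authors: (US) Jacqueline Kazil, Katharine Jarmul
  • Product barcode: 9787564170035
  • Edition: 1
  • Format: 16mo
  • Pages: 488
  • Year of publication: 2017
  • Printing: 1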
List price: ¥96
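Description
How can you take your data analysis skills beyond Excel to the next level? By learning Python in depth to get the job done. Data Wrangling with Python (Reprint Edition, English), by Jacqueline Kazil and Katharine Jarmul, shows non-programmers how to process information that is inherently too messy or difficult to handle. You do not need to know the basics of the Python programming language to get started.

Through step-by-step exercises, you will learn how to acquire, clean, analyze, and present data efficiently. You will also learn how to automate your data processing, schedule file-editing and cleanup tasks, work with larger datasets, and tell compelling stories with the data you obtain.

  • Quickly learn basic Python syntax, data types, and language concepts
  • Work with both machine-readable and human-consumable data
  • Scrape websites and APIs to find a wealth of useful information
  • Clean and format data to eliminate duplicates and errors in your datasets
  • Learn when to standardize data and when to test and script data cleanup
  • Explore and analyze your datasets with new Python libraries and techniques
  • Use Python solutions to automate your entire data-wrangling process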
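About the Authors
Katharine Jarmul is a Python developer who enjoys data analysis and acquisition, web scraping, and teaching Python and Unix. As one of the founders of PyLadies (http://pyladies.org/), she has led numerous workshops and tutorials covering topics that range from beginner to advanced Python.

Jacqueline Kazil is a data scientist and software developer who is passionate about human behavior and open data; she teaches Python and data science courses. She is a former Presidential Innovation Fellow and a cofounder of 18F. Before that, she worked as a data journalist at the Washington Post.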
Table of Contents
Preface

1. Introduction to Python
  • Why Python
  • Getting Started with Python
  • Which Python Version
  • Setting Up Python on Your Machine
  • Test Driving Python
  • Install pip
  • Install a Code Editor
  • Optional: Install IPython
  • Summary

2. Python Basics
  • Basic Data Types
  • Strings
  • Integers and Floats
  • Data Containers
  • Variables
  • Lists
  • Dictionaries
  • What Can the Various Data Types Do?
  • String Methods: Things Strings Can Do
  • Numerical Methods: Things Numbers Can Do
  • List Methods: Things Lists Can Do
  • Dictionary Methods: Things Dictionaries Can Do
  • Helpful Tools: type, dir, and help
  • type
  • dir
  • help
  • Putting It All Together
  • What Does It All Mean?
  • Summary

3. Data Meant to Be Read by Machines
  • CSV Data
  • How to Import CSV Data
  • Saving the Code to a File; Running from Command Line
  • JSON Data
  • How to Import JSON Data
  • XML Data
  • How to Import XML Data
  • Summary

4. Working with Excel Files
  • Installing Python Packages
  • Parsing Excel Files
  • Getting Started with Parsing
  • Summary

5. PDFs and Problem Solving in Python
  • Avoid Using PDFs!
  • Programmatic Approaches to PDF Parsing
  • Opening and Reading Using slate
  • Converting PDF to Text
  • Parsing PDFs Using pdfminer
  • Learning How to Solve Problems
  • Exercise: Use Table Extraction, Try a Different Library
  • Exercise: Clean the Data Manually
  • Exercise: Try Another Tool
  • Uncommon File Types
  • Summary

6. Acquiring and Storing Data
  • Not All Data Is Created Equal
  • Fact Checking
  • Readability, Cleanliness, and Longevity
  • Where to Find Data
  • Using a Telephone
  • US Government Data
  • Government and Civic Open Data Worldwide
  • Organization and Non-Government Organization (NGO) Data
  • Education and University Data
  • Medical and Scientific Data
  • Crowdsourced Data and APIs
  • Case Studies: Example Data Investigation
  • Ebola Crisis
  • Train Safety
  • Football Salaries
  • Child Labor
  • Storing Your Data: When, Why, and How?
  • Databases: A Brief Introduction
  • Relational Databases: MySQL and PostgreSQL
  • Non-Relational Databases: NoSQL
  • Setting Up Your Local Database with Python
  • When to Use a Simple File
  • Cloud-Storage and Python
  • Local Storage and Python
  • Alternative Data Storage
  • Summary

7. Data Cleanup: Investigation, Matching, and Formatting
  • Why Clean Data?
  • Data Cleanup Basics
  • Identifying Values for Data Cleanup
  • Formatting Data
  • Finding Outliers and Bad Data
  • Finding Duplicates
  • Fuzzy Matching
  • RegEx Matching
  • What to Do with Duplicate Records
  • Summary

8. Data Cleanup: Standardizing and Scripting
  • Normalizing and Standardizing Your Data
  • Saving Your Data
  • Determining What Data Cleanup Is Right for Your Project
  • Scripting Your Cleanup
  • Testing with New Data
  • Summary

9. Data Exploration and Analysis
  • Exploring Your Data
  • Importing Data
  • Exploring Table Functions
  • Joining Numerous Datasets
  • Identifying Correlations
  • Identifying Outliers
  • Creating Groupings
  • Further Exploration
  • Analyzing Your Data
  • Separating and Focusing Your Data
  • What Is Your Data Saying?
  • Drawing Conclusions
  • Documenting Your Conclusions
  • Summary

10. Presenting Your Data
  • Avoiding Storytelling Pitfalls
  • How Will You Tell the Story?
  • Know Your Audience
  • Visualizing Your Data
  • Charts
  • Time-Related Data
  • Maps
  • Interactives
  • Words
  • Images, Video, and Illustrations
  • Presentation Tools
  • Publishing Your Data
  • Using Available Sites
  • Open Source Platforms: Starting a New Site
  • Jupyter (Formerly Known as IPython Notebooks)
  • Summary

11. Web Scraping: Acquiring and Storing Data from the Web
  • What to Scrape and How
  • Analyzing a Web Page
  • Inspection: Markup Structure
  • Network/Timeline: How the Page Loads
  • Console: Interacting with JavaScript
  • In-Depth Analysis of a Page
  • Getting Pages: How to Request on the Internet
  • Reading a Web Page with Beautiful Soup
  • Reading a Web Page with LXML
  • A Case for XPath
  • Summary

12. Advanced Web Scraping: Screen Scrapers and Spiders
  • Browser-Based Parsing
  • Screen Reading with Selenium
  • Screen Reading with Ghost.Py
  • Spidering the Web
  • Building a Spider with Scrapy
  • Crawling Whole Websites with Scrapy
  • Networks: How the Internet Works and Why It's Breaking Your Script
  • The Changing Web (or Why Your Script Broke)
  • A (Few) Word(s) of Caution
  • Summary

13. APIs
  • API Features
  • REST Versus Streaming APIs
  • Rate Limits
  • Tiered Data Volumes
  • API Keys and Tokens
  • A Simple Data Pull from Twitter's REST API
  • Advanced Data Collection from Twitter's REST API
  • Advanced Data Collection from Twitter's Streaming API
  • Summary

14. Automation and Scaling
  • Why Automate?
  • Steps to Automate
  • What Could Go Wrong?
  • Where to Automate
  • Special Tools for Automation
  • Using Local Files, argv, and Config Files
  • Using the Cloud for Data Processing
  • Using Parallel Processing
  • Using Distributed Processing
  • Simple Automation
  • CronJobs
  • Web Interfaces
  • Jupyter Notebooks
  • Large-Scale Automation
  • Celery: Queue-Based Automation
  • Ansible: Operations Automation
  • Monitoring Your Automation
  • Python Logging
  • Adding Automated Messaging
  • Uploading and Other Reporting
  • Logging and Monitoring as a Service
  • No System Is Foolproof
  • Summary

15. Conclusion
  • Duties of a Data Wrangler
  • Beyond Data Wrangling
  • Become a Better Data Analyst
  • Become a Better Developer
  • Become a Better Visual Storyteller
  • Become a Better Systems Architect
  • Where Do You Go from Here?

A. Comparison of Languages Mentioned
B. Python Resources for Beginners
C. Learning the Command Line
D. Advanced Python Setup
E. Python Gotchas
F. IPython Hints
G. Using Amazon Web Services
Index
