Genomics in the Cloud (Reprint Edition) (English Edition)


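  • Word count: 607
  • Publisher: Southeast University Press
  • Authors: Geraldine A. Van der Auwera (USA), Brian D. O'Connor | Editor in charge: Zhang Ye
  • Barcode: 9787564195878
  • Edition: 1
  • Format: 16mo (16开)
  • Pages: 467
  • Publication year: 2021
  • Printing: 1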
List price: ¥148
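Description
The volume of data in genomics is growing explosively. In just a few years, organizations such as the US National Institutes of Health (NIH) have come to host more than 50 PB (50 million GB) of genomic data, and they are moving to cloud infrastructure to make that data available to the research community. How do you adapt your analysis tools and protocols to access and analyze data at that scale in the cloud?

With this practical book, researchers will learn how to work with genomics algorithms using open source tools including the Genome Analysis Toolkit (GATK), Docker, WDL, and Terra. Geraldine Van der Auwera, longtime custodian of the GATK user community, and Brian O'Connor of the UC Santa Cruz Genomics Institute guide you through the process. You will learn by working with real data and genomics algorithms from the field.

This book covers:
  • Essential genomics and computing technology background
  • Basic cloud computing operations
  • Getting started with GATK, plus three major GATK Best Practices
  • Automating analysis with scripted workflows written in WDL and run with Cromwell
  • Scaling up workflow execution in the cloud, including parallelization and cost optimization
  • Interactive analysis in the cloud using Jupyter notebooks
  • Secure collaboration and computational reproducibility using Terra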
Table of Contents
Foreword
Preface
1. Introduction
  • The Promises and Challenges of Big Data in Biology and Life Sciences
  • Infrastructure Challenges
  • Toward a Cloud-Based Ecosystem for Data Sharing and Analysis
  • Cloud-Hosted Data and Compute
  • Platforms for Research in the Life Sciences
  • Standardization and Reuse of Infrastructure
  • Being FAIR
  • Wrap-Up and Next Steps
2. Genomics in a Nutshell: A Primer for Newcomers to the Field
  • Introduction to Genomics
  • The Gene as a Discrete Unit of Inheritance (Sort Of)
  • The Central Dogma of Biology: DNA to RNA to Protein
  • The Origins and Consequences of DNA Mutations
  • Genomics as an Inventory of Variation in and Among Genomes
  • The Challenge of Genomic Scale, by the Numbers
  • Genomic Variation
  • The Reference Genome as Common Framework
  • Physical Classification of Variants
  • Germline Variants Versus Somatic Alterations
  • High-Throughput Sequencing Data Generation
  • From Biological Sample to Huge Pile of Read Data
  • Types of DNA Libraries: Choosing the Right Experimental Design
  • Data Processing and Analysis
  • Mapping Reads to the Reference Genome
  • Variant Calling
  • Data Quality and Sources of Error
  • Functional Equivalence Pipeline Specification
  • Wrap-Up and Next Steps
3. Computing Technology Basics for Life Scientists
  • Basic Infrastructure Components and Performance Bottlenecks
  • Types of Processor Hardware: CPU, GPU, TPU, FPGA, OMG
  • Levels of Compute Organization: Core, Node, Cluster, and Cloud
  • Addressing Performance Bottlenecks
  • Parallel Computing
  • Parallelizing a Simple Analysis
  • From Cores to Clusters and Clouds: Many Levels of Parallelism
  • Trade-Offs of Parallelism: Speed, Efficiency, and Cost
  • Pipelining for Parallelization and Automation
  • Workflow Languages
  • Popular Pipelining Languages for Genomics
  • Workflow Management Systems
  • Virtualization and the Cloud
  • VMs and Containers
  • Introducing the Cloud
  • Categories of Research Use Cases for Cloud Services
  • Wrap-Up and Next Steps
4. First Steps in the Cloud
  • Setting Up Your Google Cloud Account and First Project
  • Creating a Project
  • Checking Your Billing Account and Activating Free Credits
  • Running Basic Commands in Google Cloud Shell
  • Logging in to the Cloud Shell VM
  • Using gsutil to Access and Manage Files
  • Pulling a Docker Image and Spinning Up the Container
  • Mounting a Volume to Access the Filesystem from Within the Container
  • Setting Up Your Own Custom VM
  • Creating and Configuring Your VM Instance
  • Logging into Your VM by Using SSH
  • Checking Your Authentication
  • Copying the Book Materials to Your VM
  • Installing Docker on Your VM
  • Setting Up the GATK Container Image
……
6. GATK Best Practices for Germline Short Variant Discovery
7. GATK Best Practices for Somatic Variant Discovery
8. Automating Analysis Execution with Workflows
9. Deciphering Real Genomics Workflows
10. Running Single Workflows at Scale with Pipelines API
11. Running Many Workflows Conveniently in Terra
12. Interactive Analysis in Jupyter Notebook
13. Assembling Your Own Workspace in Terra
14. Making a Fully Reproducible Paper
Glossary
Index


Copyright © jvwen.com 聚文网. All rights reserved.