Biomedical Data Science Textbook Available

[This article was first published on R – r4stats.com, and kindly contributed to R-bloggers.]

By Bob Hoyt & Bob Muenchen

Data science is being used in many ways to improve healthcare and reduce costs. We have written a textbook, Introduction to Biomedical Data Science, to help healthcare professionals understand the topic and to work more effectively with data scientists. The textbook content and data exercises do not require programming skills or higher math. We introduce open source tools such as R and Python, as well as easy-to-use interfaces to them such as BlueSky Statistics, jamovi, R Commander, and Orange. Chapter exercises are based on healthcare data, and supplemental YouTube videos are available in most chapters.

For instructors, we provide PowerPoint slides for each chapter, exercises, quiz questions, and solutions. Instructors can download an electronic copy of the book, the Instructor Manual, and PowerPoints after first registering on the instructor page.

The book is available in print and various electronic formats. Because it is self-published, we plan to update it more rapidly than would be possible through traditional publishers.

Below you will find a detailed table of contents and a list of the textbook authors.

Table of Contents

OVERVIEW OF BIOMEDICAL DATA SCIENCE

  1. Introduction
  2. Background and history
  3. Conflicting perspectives
    1. the statistician’s perspective
    2. the machine learner’s perspective
    3. the database administrator’s perspective
    4. the data visualizer’s perspective
  4. Data analytical processes
    1. raw data
    2. data pre-processing
    3. exploratory data analysis (EDA)
    4. predictive modeling approaches
    5. types of models
    6. types of software
  5. Major types of analytics
    1. descriptive analytics
    2. diagnostic analytics
    3. predictive analytics (modeling)
    4. prescriptive analytics
    5. putting it all together
  6. Biomedical data science tools
  7. Biomedical data science education
  8. Biomedical data science careers
  9. Importance of soft skills in data science
  10. Biomedical data science resources
  11. Biomedical data science challenges
  12. Future trends
  13. Conclusion
  14. References

SPREADSHEET TOOLS AND TIPS

  1. Introduction
    1. basic spreadsheet functions
    2. download the sample spreadsheet
  2. Navigating the worksheet
  3. Clinical application of spreadsheets
    1. formulas and functions
    2. filter
    3. sorting data
    4. freezing panes
    5. conditional formatting
    6. pivot tables
    7. visualization
    8. data analysis
  4. Tips and tricks
    1. Microsoft Excel shortcuts – Windows users
    2. Google Sheets tips and tricks
  5. Conclusions
  6. Exercises
  7. References

BIOSTATISTICS PRIMER

  1. Introduction
  2. Measures of central tendency & dispersion
    1. the normal and log-normal distributions
  3. Descriptive and inferential statistics
  4. Categorical data analysis
  5. Diagnostic tests
  6. Bayes’ theorem
  7. Types of research studies
    1. observational studies
    2. interventional studies
    3. meta-analysis
    4. correlation
  8. Linear regression
  9. Comparing two groups
    1. the independent-samples t-test
    2. the Wilcoxon-Mann-Whitney test
  10. Comparing more than two groups
  11. Other types of tests
    1. generalized tests
    2. exact or permutation tests
    3. bootstrap or resampling tests
  12. Stats packages and online calculators
    1. commercial packages
    2. non-commercial or open source packages
    3. online calculators
  13. Challenges
  14. Future trends
  15. Conclusion
  16. Exercises
  17. References

DATA VISUALIZATION

  1. Introduction
    1. historical data visualizations
    2. visualization frameworks
  2. Visualization basics
  3. Data visualization software
    1. Microsoft Excel
    2. Google Sheets
    3. Tableau
    4. R programming language
    5. other visualization programs
  4. Visualization options
    1. visualizing categorical data
    2. visualizing continuous data
  5. Dashboards
  6. Geographic maps
  7. Challenges
  8. Conclusion
  9. Exercises
  10. References

INTRODUCTION TO DATABASES

  1. Introduction
  2. Definitions
  3. A brief history of database models
    1. hierarchical model
    2. network model
    3. relational model
  4. Relational database structure
  5. Clinical data warehouses (CDWs)
  6. Structured query language (SQL)
  7. Learning SQL
  8. Conclusion
  9. Exercises
  10. References

BIG DATA

  1. Introduction
  2. The seven V’s of big data related to health care data
  3. Technical background
  4. Application
  5. Challenges
    1. technical
    2. organizational
    3. legal
    4. translational
  6. Future trends
  7. Conclusion
  8. References

BIOINFORMATICS AND PRECISION MEDICINE

  1. Introduction
  2. History
  3. Definitions
  4. Biological data analysis – from data to discovery
  5. Biological data types
    1. genomics
    2. transcriptomics
    3. proteomics
    4. bioinformatics data in public repositories
    5. biomedical cancer data portals
  6. Tools for analyzing bioinformatics data
    1. command line tools
    2. web-based tools
  7. Genomic data analysis
  8. Genomic data analysis workflow
    1. variant calling pipeline for whole exome sequencing data
    2. quality check
    3. alignment
    4. variant calling
    5. variant filtering and annotation
    6. downstream analysis
    7. reporting and visualization
  9. Precision medicine – from big data to patient care
  10. Examples of precision medicine
  11. Challenges
  12. Future trends
  13. Useful resources
  14. Conclusion
  15. Exercises
  16. References

PROGRAMMING LANGUAGES FOR DATA ANALYSIS

  1. Introduction
  2. History
  3. R language
    1. installing R & RStudio
    2. an example R program
    3. getting help in R
    4. user interfaces for R
    5. R’s default user interface: RGui
    6. RStudio
    7. menu & dialog GUIs
    8. some popular R GUIs
    9. R graphical user interface comparison
    10. R resources
  4. Python language
    1. installing Python
    2. an example Python program
    3. getting help in Python
    4. user interfaces for Python
  5. Reproducibility
  6. R vs. Python
  7. Future trends
  8. Conclusion
  9. Exercises
  10. References

MACHINE LEARNING

  1. Brief history
  2. Introduction
    1. data refresher
    2. training vs test data
    3. bias and variance
    4. supervised and unsupervised learning
  3. Common machine learning algorithms
  4. Supervised learning
  5. Unsupervised learning
    1. dimensionality reduction
    2. reinforcement learning
    3. semi-supervised learning
  6. Evaluation of predictive analytical performance
    1. classification model evaluation
    2. regression model evaluation
  7. Machine learning software
    1. Weka
    2. Orange
    3. RapidMiner Studio
    4. KNIME
    5. Google TensorFlow
    6. honorable mention
    7. summary
  8. Programming languages and machine learning
  9. Machine learning challenges
  10. Machine learning examples
    1. example 1 classification
    2. example 2 regression
    3. example 3 clustering
    4. example 4 association rules
  11. Conclusion
  12. Exercises
  13. References

ARTIFICIAL INTELLIGENCE

  1. Introduction
    1. definitions
  2. History
  3. AI architectures
  4. Deep learning
  5. Image analysis (computer vision)
    1. Radiology
    2. Ophthalmology
    3. Dermatology
    4. Pathology
    5. Cardiology
    6. Neurology
    7. Wearable devices
    8. Image libraries and packages
  6. Natural language processing
    1. NLP libraries and packages
    2. Text mining and medicine
    3. Speech recognition
  7. Electronic health record data and AI
  8. Genomic analysis
  9. AI platforms
    1. deep learning platforms and programs
  10. Artificial intelligence challenges
    1. General
    2. Data issues
    3. Technical
    4. Socioeconomic and legal
    5. Regulatory
    6. Adverse unintended consequences
    7. Need for more ML and AI education
  11. Future trends
  12. Conclusion
  13. Exercises
  14. References

Authors

Brenda Griffith
Technical Writer
Data.World
Austin, TX

Robert Hoyt MD, FACP, ABPM-CI, FAMIA
Associate Clinical Professor
Department of Internal Medicine
Virginia Commonwealth University
Richmond, VA

David Hurwitz MD, FACP, ABPM-CI
Associate CMIO
Allscripts Healthcare Solutions
Chicago, IL

Madhurima Kaushal MS
Bioinformatics
Washington University at St. Louis, School of Medicine
St. Louis, MO

Robert Leviton MD, MPH, FACEP, ABPM-CI, FAMIA
Assistant Professor
New York Medical College
Department of Emergency Medicine
Valhalla, NY

Karen A. Monsen PhD, RN, FAMIA, FAAN
Professor
School of Nursing
University of Minnesota
Minneapolis, MN

Robert Muenchen MS, PSTAT
Manager, Research Computing Support
University of Tennessee
Knoxville, TN

Dallas Snider PhD
Chair, Department of Information Technology
University of West Florida
Pensacola, FL

A special thanks to Ann Yoshihashi MD for her help with the publication of this textbook.
