![](img/chapter_gp_plots/gp_plot_20.svg)
<!-- {width=100%} -->
<!-- TODO: Quarto (actually krantz.cls) bug prints this twice in ToC; tried various header tricks to no avail. Also, not unnumbered by default for krantz.cls. Also, parts don't work in ToC for krantz.cls-->
# Preface {.unnumbered #sec-preface}
Hello and welcome! This book is your companion to exploring the realm of modeling in data science. It is designed to provide you with something useful whether you're a beginner looking to learn some fundamentals or an experienced practitioner seeking a fresh perspective. Our goal is to equip you with a better understanding of how models work and how to use them, covering both basic and more advanced techniques, from linear regression to deep learning. We'll also show how different models relate to one another, to better empower you to apply them successfully in your own data-driven projects. We aim to provide an overview of how to use both machine learning and traditional statistical modeling in a practical fashion, with a balanced emphasis on interpretability and predictive power. Join us on this exciting journey as we explore the world of models in data science!
>This is still a work in progress, with more to come and plenty still to clean up. We hope to have the print version out with CRC Press by the end of 2024. We welcome any feedback as it develops, so please feel free to create an issue. For contributions, please see the [contributing](https://github.com/m-clark/book-of-models/blob/main/CONTRIBUTING.md) page. Thanks for reading!
## What Will You Get Out of This Book? {#sec-preface-what}
We're hoping for a couple of things for you as you read through this book. In particular, if you're starting your journey into data science, we hope you'll leave with:
- A firm understanding of modeling basics from a practical perspective
- A toolset of models and related ideas that you can instantly apply for competent modeling
If you're already familiar with modeling, we hope you'll leave with:
- Additional context for the models you already know
- Some introduction to models you don't know
- Additional understanding of how to choose the right model for the job and what to focus on
For anyone reading this book, we especially hope you get a sense of the commonalities between different models and a good sense of how they work.
## Brief Prerequisites {#sec-preface-prereqs}
You'll definitely want to have some familiarity with R or Python (both are used for examples), and some very basic knowledge of statistics will be helpful. We'll try to explain things as we go, but we won't be able to cover everything. If you're looking for a good introduction, we recommend [R for Data Science](https://r4ds.had.co.nz/) for R, or [Python for Data Analysis](https://wesmckinney.com/book/) for Python. Beyond that, we'll try to provide the context you need so that you can be comfortable trying things out.
Also, if you happen to be reading this book in print, you can find the book in web form at https://m-clark.github.io/book-of-models. There you'll find all the code, figures, and other content that you can interact with more easily, as well as the most up-to-date content, fixes, etc.
## Data & Code {#sec-preface-data}
All the data and code used in this book are available in the book's [GitHub repository](https://github.com/m-clark/book-of-models/tree/main/data). See the data descriptions in the [data section](#sec-data-descript) for more information on each of the datasets used. [Notebooks with chapter code](https://github.com/m-clark/book-of-models/tree/main/chapter-notebooks/) are also available there (where applicable).
## About the Authors {#sec-preface-authors}
**Michael** is a senior machine learning scientist for [Strong Analytics](https://strong.io)[^onesix]. Prior to industry he honed his chops in academia, earning a PhD in Experimental Psychology before turning to data science full-time as a consultant. His models have been used in production across a variety of industries, and can be seen in dozens of publications across several disciplines. He has a passion for helping others learn difficult stuff, and has taught a variety of data science courses and workshops for people of all skill levels in many different contexts.
He also maintains a [blog](https://m-clark.github.io) covering many aspects of statistical and machine learning modeling, with several posts and long-form documents on a variety of data science topics. He lives in Ann Arbor, Michigan, with his wife and his dog, where they all enjoy long walks around the neighborhood. During the course of writing this book, he became a father to Juni, and is now learning the joys of sleep deprivation.
:::{.content-visible when-format='html'}
![Michael](img/me_for_web.jpeg){width=1in}
:::
---
**Seth** is the Academic Co-Director of the [Master of Science in Business Analytics (MSBA)](https://mendoza.nd.edu/graduate-programs/business-analytics-msba/) and Associate Teaching Professor in the IT, Analytics, and Operations Department at the University of Notre Dame. He likewise has a PhD in Applied Experimental Psychology and has been teaching and consulting in data science for over a decade. He is an excellent instructor, and teaches several data science courses at the undergraduate and graduate levels.
Seth lives in South Bend, Indiana, with his wife and three kids, and spends his free time lifting more weights than he should, playing guitar, and chopping wood.
:::{.content-visible when-format='html'}
![Seth](img/seth.png){width=1in}
:::
[^onesix]: By the time you're reading this, Strong's merger with [OneSix](https://www.onesixsolutions.com/) should be complete (2025).
<!-- © 2024 by the authors. Web-version is CC-BY-NC-SA. -->