index.hbs

<!DOCTYPE html>
<html lang="en">
  <head>
      <!-- Google Tag Manager -->
      <script>(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':
      new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
      j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
      'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
      })(window,document,'script','dataLayer','GTM-5QMT3JGN');</script>
      <!-- End Google Tag Manager -->

    <meta charset="utf-8">
    <title>Foundations of Machine Learning</title>
    <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
    <link rel="stylesheet" href="styles/style.css">
    <link rel="stylesheet" media="only screen and (max-width: 770px)" href="styles/tablet-and-phone.css">
    <link rel="stylesheet" media="only screen and (max-width: 420px)" href="styles/phone.css">
    <link rel="icon" href="favicon.ico" type="image/vnd.microsoft.icon">
    <link rel="canonical" href="https://bloomberg.github.io/foml/">
  </head>
  <body>
      <!-- Google Tag Manager (noscript) -->
      <noscript><iframe src="https://www.googletagmanager.com/ns.html?id=GTM-5QMT3JGN"
                        height="0" width="0" style="display:none;visibility:hidden"></iframe></noscript>
      <!-- End Google Tag Manager (noscript) -->
    <nav>
        <a href="#home">Home</a>
        <a href="#about">About</a>
        <a href="#lectures">Lectures</a>
        <a href="#assignments">Assignments</a>
        <a href="#resources">Resources</a>
        <a href="#people">People</a>
    </nav>

    <section id="home">
        <a href="https://www.techatbloomberg.com/post-topic/data-science/"><img src="images/mlbanner.jpg" alt="Bloomberg ML EDU presents:"></a>
        <h1>Foundations of Machine Learning</h1>
        <table id="course-info">
            <tr>
                <th>Instructor</th>
                <td><a href="#people">David S. Rosenberg</a>, Office of the CTO at Bloomberg</td>
            </tr>
        </table>

        <p id="course-pitch"><strong>Understand the Concepts, Techniques and Mathematical Frameworks Used by Experts in Machine Learning</strong></p>
    </section>
    <section id="about">
        <h1>About This Course</h1>

        <div class="module">
            <p>Bloomberg presents "Foundations of Machine Learning," a training course that was initially delivered internally to the company's software engineers as part of its "Machine Learning EDU" initiative. This course covers a wide variety of topics in machine learning and statistical modeling. The primary goal of the class is to help participants gain a deep understanding of the concepts, techniques and mathematical frameworks used by experts in machine learning. It is designed to make valuable machine learning skills more accessible to individuals with a strong math background, including software developers, experimental scientists, engineers and financial professionals.</p>

            <p>The 30 lectures in the course are embedded below, and may also be viewed in this <a href="https://www.youtube.com/playlist?list=PLnZuxOufsXnvftwTB1HL6mel1V32w0ThI">YouTube playlist</a>. The course includes a complete set of homework assignments, each containing a theoretical element and an implementation challenge with support code in Python, which is rapidly becoming the prevailing programming language for data science and machine learning in both academia and industry. This course also serves as a foundation on which more specialized courses and further independent study can build.</p>

            <p>Please fill out <a href="https://docs.google.com/forms/d/e/1FAIpQLSeyq3l0U3SOX5km78Bg_JcRZWg5XtWpy3n5dEw3kbt3YudIZw/viewform?usp=sf_link">this short online form</a> to register for access to our course's <a href="https://piazza.com/">Piazza</a> discussion board. Applications are processed manually, so please be patient.  You should receive an email directly from Piazza when you are registered. Common questions from this and previous editions of the course are posted in our <a href="https://github.com/davidrosenberg/mlcourse/blob/gh-pages/course-faq.md">FAQ</a>.</p>

            <!-- Without registering, you can also view an <a href="https://piazza.com/class/i2jg9qgaxwr5fq?cid=14">anonymized version of our Piazza board</a>.</p> -->

            <p>The first lecture, <a href="#lecture-black-box-machine-learning">Black Box Machine Learning</a>, gives a quick-start introduction to practical machine learning and requires only familiarity with basic programming concepts.</p>

<!--  
            <section>
                <h1>Highlights and Distinctive Features of the Course Lectures, Notes, and Assignments</h1>

                <ul>
                    <li>Geometric explanation for what happens with ridge, lasso, and elastic net regression in the case of correlated random variables.</li>
                    <li>Investigation of when the penalty (Tikhonov) and constraint (Ivanov) forms of regularization are equivalent.</li>
                    <li>Concise summary of what we really learn about SVMs from Lagrangian duality.</li>
                    <li>Proof of representer theorem with simple linear algebra, emphasizing it as a way to reparametrize certain objective functions.</li>
                    <li>Guided derivation of the math behind the classic diamond/circle/ellipsoids picture that "explains" why L1 regularization gives sparsity (Homework 2, Problem 5)</li>
                    <li>From-scratch (in numpy) implementations of almost all major ML algorithms we discuss: ridge regression with SGD and GD (Homework 1, Problems 2.5 and 2.6, page 4), lasso regression with the shooting algorithm (Homework 2, Problem 3, page 4), kernel ridge regression (Homework 4, Problem 3, page 2), kernelized SVM with kernelized Pegasos (Homework 4, Problem 6.4, page 9), L2-regularized logistic regression (Homework 5, Problem 3.3, page 4), Bayesian linear regression (Homework 5, Problem 5, page 6), multiclass SVM (Homework 6, Problem 4.2, page 3), classification and regression trees (without pruning) (Homework 6, Problem 6), gradient boosting with trees for classification and regression (Homework 6, Problem 8), and a multilayer perceptron for regression (Homework 7, Problem 4, page 3).</li>
                    <li>Repeated use of a simple 1-dimensional regression dataset, so it's easy to visualize the effect of various hypothesis spaces and regularizations that we investigate throughout the course.</li>
                    <li>Investigation of how to derive a conditional probability estimate from a predicted score for various loss functions, and why it's not so straightforward for the hinge loss (i.e. the SVM) (Homework 5, Problem 2, page 1)</li>
                    <li>Discussion of numerical overflow issues and the log-sum-exp trick (Homework 5, Problem 3.2)</li>
                    <li>Self-contained introduction to the expectation maximization (EM) algorithm for latent variable models.</li>
                    <li>Develop a general computation graph framework from scratch, using numpy, and implement your neural networks in it.</li>
                </ul>
            </section>
-->
            <section>
                <h1>Prerequisites</h1>

                <p>The quickest way to see if the mathematics level of the course is for you is to take a look at this <a href="https://davidrosenberg.github.io/mlcourse/Notes/prereq-questions/math-questions.pdf">mathematics assessment</a>, which is a preview of some of the math concepts that show up in the first part of the course.</p>

                <ul>
                    <li><strong>Solid mathematical background</strong>, equivalent to a 1-semester undergraduate course in each of the following: linear algebra, multivariate differential calculus, probability theory, and statistics. The content of NYU's <a href="http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall15/index.html"><strong>DS-GA-1002: Statistical and Mathematical Methods</strong></a> would be more than sufficient, for example.</li>
                    <li><strong>Python programming required</strong> for most homework assignments.</li>
                    <li><em>Recommended:</em> At least one advanced, proof-based mathematics course</li>
                    <li><em>Recommended:</em> Computer science background up to a "data structures and algorithms" course</li>
                </ul>
            </section>
        </div>
    </section>

    <section id="lectures">
        <h1>Lectures</h1>

        <ul class="abbreviations">
            <li>(HTF) refers to Hastie, Tibshirani, and Friedman's book <a href="https://web.stanford.edu/~hastie/ElemStatLearn/"><cite>The Elements of Statistical Learning</cite></a></li>
            <li>(SSBD) refers to Shalev-Shwartz and Ben-David's book <a href="http://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/"><cite>Understanding Machine Learning: From Theory to Algorithms</cite></a></li>
            <li>(JWHT) refers to James, Witten, Hastie, and Tibshirani's book <a href="http://www-bcf.usc.edu/~gareth/ISL"><cite>An Introduction to Statistical Learning</cite></a></li>
        </ul>

        {{> lectures-new lecturesNew }}

    </section>
    <section id="assignments">
        <h1>Assignments</h1>

        {{> assignments assignments }}
    </section>

    <section id="resources">
        <h1>Resources</h1>

        <section id="textbooks">
            <h1>Textbooks</h1>

            <a href="https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/"><img src="images/geron-2nd-ed.jpg" alt="The cover of Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition"></a>

            <a href="https://web.stanford.edu/~hastie/ElemStatLearn/"><img src="images/hastie-1x.png" srcset="images/hastie-1x.png 1x, images/hastie-2x.jpg 2x, images/hastie-3x.jpg 3x" alt="The cover of Elements of Statistical Learning"></a>

            <a href="http://www-bcf.usc.edu/~gareth/ISL/"><img src="images/james-1x.jpg" srcset="images/james-1x.jpg 1x, images/james-2x.jpg 2x, images/james-3x.jpg 3x" alt="The cover of An Introduction to Statistical Learning"></a>

            <a href="http://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/"><img src="images/shalev-shwartz-1x.jpg" srcset="images/shalev-shwartz-1x.jpg 1x, images/shalev-shwartz-2x.jpg 2x, images/shalev-shwartz-3x.jpg 3x" alt="The cover of Understanding Machine Learning: From Theory to Algorithms"></a>

            <a href="https://research.microsoft.com/en-us/um/people/cmbishop/PRML/"><img src="images/bishop-1x.jpg" srcset="images/bishop-1x.jpg 1x, images/bishop-2x.jpg 2x, images/bishop-3x.jpg 3x" alt="The cover of Pattern Recognition and Machine Learning"></a>

            <a href="http://www.data-science-for-biz.com/"><img src="images/provost-fawcett-original.jpg" alt="The cover of Data Science for Business"></a>

            <dl>
                <dt><a href="https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/"><cite>Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition</cite> (Aurélien Géron)</a></dt>
                <dd>This is a practical guide to machine learning that corresponds fairly well with the content and level of our course.  While most of our homework is about coding ML from scratch with numpy, this book makes heavy use of scikit-learn and TensorFlow. We'll use the first two chapters of this book in the first two weeks of the course, when we cover "black-box machine learning."  It'll also be a handy reference for your projects and beyond this course, when you'll want to make use of existing ML packages, rather than rolling your own.</dd>

                <dt><a href="https://web.stanford.edu/~hastie/ElemStatLearn/"><cite>The Elements of Statistical Learning</cite> (Hastie, Tibshirani, and Friedman)</a></dt>
                <dd>This will be our main textbook for L1 and L2 regularization, trees, bagging, random forests, and boosting.  It's written by three statisticians who invented many of the techniques discussed. There's an easier version of this book that covers many of the same topics, described below. (Available for free as a PDF.)</dd>

                <dt><a href="http://www-bcf.usc.edu/~gareth/ISL/"><cite>An Introduction to Statistical Learning</cite> (James, Witten, Hastie, and Tibshirani)</a></dt>
                <dd>This book is written by two of the same authors as <cite>The Elements of Statistical Learning</cite>. It's much less intense mathematically, and it's good for a lighter introduction to the topics. It uses R as the language of instruction.  (Available for free as a PDF.)</dd>

                <dt><a href="http://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/"><cite>Understanding Machine Learning: From Theory to Algorithms</cite> (Shalev-Shwartz and Ben-David)</a></dt>
                <dd>This is our primary reference for kernel methods and multiclass classification, and possibly more towards the end of the course.  Covers a lot of theory that we don't go into, but it would be a good supplemental resource for a more theoretical course, such as Mohri's <a href="http://www.cs.nyu.edu/~mohri/ml16/">Foundations of Machine Learning</a> course. (Available for free as a PDF.)</dd>

                <dt><a href="https://www.microsoft.com/en-us/research/people/cmbishop/"><cite>Pattern Recognition and Machine Learning</cite> (Christopher Bishop)</a></dt>
                <dd>Our primary reference for probabilistic methods, including Bayesian regression, latent variable models, and the EM algorithm.  (Available for free as a PDF.)</dd>

                <dt><a href="http://www.data-science-for-biz.com"><cite>Data Science for Business</cite> (Provost and Fawcett)</a></dt>
                <dd>Ideally, this would be everybody's first book on machine learning.  The intended audience is both the ML practitioner and the ML product manager.  It's full of important core concepts and practical wisdom.  The math is so minimal that it's perfect for reading on your phone, and I encourage you to read it in parallel with this class.  Have your managers read it too.</dd>
            </dl>
        </section>

        <section id="references">
            <h1>Other tutorials and references</h1>

            <ul>
                <li><a href="http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall15/notes.html">Carlos Fernandez-Granda's lecture notes</a> provide a comprehensive review of the prerequisite material in linear algebra, probability, statistics, and optimization.</li>
                <li><a href="http://nbviewer.ipython.org/github/briandalessandro/DataScienceCourse/tree/master/ipython/">Brian Dalessandro's IPython notebooks</a> from <a href="https://github.com/briandalessandro/DataScienceCourse/blob/master/ipython/references/Syllabus_2017.pdf"><strong>DS-GA-1001: Intro to Data Science</strong></a></li>
                <li><a href="http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=3274">The Matrix Cookbook</a> has lots of facts and identities about matrices and certain probability distributions.</li>
                <li><a href="http://cs229.stanford.edu/section/cs229-prob.pdf">Stanford CS229: "Review of Probability Theory"</a></li>
                <li><a href="http://cs229.stanford.edu/section/cs229-linalg.pdf">Stanford CS229: "Linear Algebra Review and Reference"</a></li>
                <li><a href="http://www.umiacs.umd.edu/~hal/courses/2013S_ML/math4ml.pdf">Math for Machine Learning</a> by Hal Daumé III</li>
            </ul>
        </section>

    </section>


    <section id="people">
        <h1>People</h1>

        <section>
            <h1>Instructor</h1>

            <div class="person module instructor">
                <img src="images/people/david-1x.jpg" srcset="images/people/david-1x.jpg 1x, images/people/david-2x.jpg 2x, images/people/david-3x.jpg 3x" alt="A photo of David Rosenberg">
                <div class="info">
                    <p class="name"><a href="http://www.linkedin.com/pub/david-rosenberg/4/241/598">David S. Rosenberg</a></p>
                    <p class="email"><a href="mailto:drosenberg44@bloomberg.net">Email</a></p>
                    <p class="email"><a href="https://twitter.com/drosen">Twitter</a></p>
                    <p class="bio">David Rosenberg is a data scientist in the data science group in the Office of the CTO at <a href="https://www.techatbloomberg.com/post-topic/data-science/">Bloomberg</a>, and an adjunct associate professor at the Center for Data Science at New York University, where he has repeatedly received NYU's Center for Data Science "Professor of the Year" award. He received his Ph.D. in statistics from UC Berkeley, where he worked on statistical learning theory and natural language processing. David received a Master of Science in applied mathematics, with a focus on computer science, from Harvard University, and a Bachelor of Science in mathematics from Yale University.</p>
                </div>
            </div>
        </section>

<!--
        <section class="multiple-people">
            <h1>Teaching Assistants</h1>

            <ul>

            </ul>
        </section>
-->
    </section>
    <script async defer src="scripts/navigation.js"></script>
    <footer>
        <p>This website is developed <a href="https://github.com/bloomberg/foml/">on GitHub</a>.  Feel free to <a href="https://github.com/bloomberg/foml/issues">report issues or make suggestions</a>.</p>
    </footer>
  </body>
</html>