[{"authors":null,"categories":null,"content":"MSAIL is a student organization devoted to artificial intelligence research. We strive to spread our passion for artificial intelligence throughout the University of Michigan student body, regardless of demographic or academic standing. As of Winter 2024, our main operations are education, projects, and reading groups.\n","date":1711663200,"expirydate":-62135596800,"kind":"term","lang":"en","lastmod":1711663200,"objectID":"2525497d367e79493fd32b198b28f040","permalink":"https://MSAIL.github.io/author/msail/","publishdate":"0001-01-01T00:00:00Z","relpermalink":"/author/msail/","section":"authors","summary":"MSAIL is a student organization devoted to artificial intelligence research. We strive to spread our passion for artificial intelligence throughout the University of Michigan student body, regardless of demographic or academic standing.","tags":null,"title":"MSAIL","type":"authors"},{"authors":["MSAIL"],"categories":null,"content":"University of Michigan alumni and former MSAIL member Wesley Tian, the Co-founder and CEO of Aragon.ai, will talk about his journey, share what he learned from starting his AI company, and offer career advice.\nRSVP for this event here!\n","date":1711663200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1711663200,"objectID":"e78f3a55e8a055eaa09d929ffe0ec61d","permalink":"https://MSAIL.github.io/talk/wesleytian_240324/","publishdate":"2024-03-24T22:00:00-04:00","relpermalink":"/talk/wesleytian_240324/","section":"talk","summary":"Speaker(s): Wesley Tian","tags":["MSAIL TECH TALK","alumni"],"title":"MSAIL TECH TALK w/ Wesley Tian","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"University of Michigan and Carnegie Mellon alumni Kiran Prasad will share his professional journey, thoughts about Grad School, and give a talk about machine learning concepts that he uses in industry. Kiran is a Senior ML Engineer at Gather, where he works on end-to-end AI system design. He previously was an Applied Scientist on the Microsoft Turing team (Microsoft\u0026rsquo;s NLP v-team that created CoPilot and spearheaded collaboration with OpenAI).\nRSVP for this event here!\n","date":1711062000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1711062000,"objectID":"fe8d9626b74df74bc320bad36752c9b8","permalink":"https://MSAIL.github.io/talk/prasad_210324/","publishdate":"2024-03-16T14:27:00-04:00","relpermalink":"/talk/prasad_210324/","section":"talk","summary":"Speaker(s): Kiran Prasad","tags":["MSAIL TECH TALK","alumni"],"title":"MSAIL TECH TALK w/ Kiran Prasad","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Binarized neural networks (BNNs) are an extreme version of quantized neural networks where all weights and activations are quantized to +/- 1. A key motivation for such a network is to enable one to run powerful neural networks on small battery-powered devices. This talk introduced BNNs, explained how one can train such a network and reviewed some recent work in the area.\nSupplemental Resources Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1, by Courbariaux et al. A review of Binarized Neural Networks, by Simons et al. An Empirical Study of Binarized Neural Networks\u0026rsquo; Optimization, by Alizadeh et al. 
","date":1649800800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1649800800,"objectID":"6b231168c8b430672ad932153b716fc3","permalink":"https://MSAIL.github.io/talk/bnn_041222/","publishdate":"2022-04-30T18:00:38-04:00","relpermalink":"/talk/bnn_041222/","section":"talk","summary":"Speaker(s): Timothy Baker","tags":["Neural Networks","Binarized Neural Networks"],"title":"An Overview of Binarized Neural Networks","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Supplemental Resources Paper 1: Near real-time intraoperative brain tumor diagnosis using stimulated Raman histology and deep neural networks\nPaper 2: Rapid Automated Analysis of Skull Base Tumor Specimens Using Intraoperative Optical Imaging and Artificial Intelligence\n","date":1649196000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1649196000,"objectID":"6415d8b541f3a43fe40193e224258933","permalink":"https://MSAIL.github.io/talk/brain_tumor_040522/","publishdate":"2022-04-30T18:00:38-04:00","relpermalink":"/talk/brain_tumor_040522/","section":"talk","summary":"Speaker(s): Cheng Jiang","tags":["Medical Imaging","Representation Learning","Computer Vision","Convolutional Neural Networks"],"title":"Machine Learning for Intraoperative Diagnosis of Brain Tumors Imaged using Stimulated Raman Histology","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Physics-based global climate simulations are computationally expensive and limited to low spatial and temporal resolutions, making it difficult to predict and track highly localized extreme weather phenomena. To overcome these limitations, we present a novel application of super-resolution using deep convolutional generative adversarial networks (GANs) to increase the resolution of global climate models in both space and time. In this project, we demonstrate the potential to reduce climate simulation computation and storage requirements by two orders of magnitude, as well as democratize relevant and actionable climate information for disaster responses. This work won the Best Paper Award in the 2020 ProjectX international ML research competition hosted by the University of Toronto.\nSupplemental Resources Paper, by Chen et al.\n","date":1648591200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1648591200,"objectID":"cb4d47e6ca41ecc59edc94391bee2206","permalink":"https://MSAIL.github.io/talk/climate_adversarial_032922/","publishdate":"2022-04-30T18:00:38-04:00","relpermalink":"/talk/climate_adversarial_032922/","section":"talk","summary":"Speaker(s): Sanjeev Raja","tags":["Climate","Generative Models","Super-Resolution"],"title":"The Devil is in the Details: Spatial and Temporal Super-Resolution of Global Climate Models using Adversarial Deep Learning","type":"talk"},{"authors":["Mukundh Murthy","Nikhil Devraj"],"categories":null,"content":"This post was an accepted submission from MSAIL to the ICLR 2022 Blog Track. 
You can find the original post here.\nIn this post, we provide an in-depth overview of methods outlined in the paper \u0026ldquo;Learning Neural Generative Dynamics for Molecular Conformation Generation,\u0026rdquo; discuss the impact of the work in the context of other conformation generation approaches, and additionally discuss future potential applications to improve the diversity and stability of generated conformations.\nIntroduction An Overview of the Deep Generative Approach Modeling distributions of distances Modeling distributions of conformations Sampling Future Work References Introduction In drug discovery, generating molecular conformations is useful across a variety of applications. For example, docking of various molecular 3D conformations to a specific protein allows drug hunters to decide whether a small molecule binds to a specific pocket in a multitude of conformations or a select few.\nFigure 1: Autodock Vina is a computer program that takes a given 3D conformation of a molecule and protein and predicts the binding free energy. An algorithm like the one discussed in this blog could generate a wide variety of conformations for Autodock Vina to test. (Source) It may be helpful to define what we mean when we talk about conformations, whether we are talking about a small organic molecule or a macromolecule like a protein. We start off with a graph, with atoms as nodes connected by bonds as edges that represent intramolecular interactions. In essence, we are starting with a specified connectivity defining how atoms are connected to each other. This two-dimensional representation, however, doesn\u0026rsquo;t capture the three-dimensional coordinates of the atoms and how they are spatially arranged.\nTherefore, in theory, one molecular graph could capture an astronomical number of conformations capturing all possible permutations and combinations of spatial arrangements of atoms. However, not all of these possible spatial arrangements are relevant as some may be so unstable that they may not occur. The spatial proximity of bulky organic groups – more formally known as “steric clashing” – reduces the number of degrees of freedom when it comes down to which bonds can rotate and how much they can rotate. Therefore, we are only interested in conformations that fall in stable low energy minima.\nLevinthal’s Paradox is a principle stating that if a protein were to sample all of its possible molecular conformations before arriving in its native state, it would take longer than the age of the universe. Though it may seem excessive to directly extend this analogy to small molecules, which are orders of magnitude less complex than proteins, it becomes intuitive that computationally simulating all of the possible conformations for a large molecule with a large number of rotatable bonds is highly infeasible. For every single bond and the associated substituents, if there are three stable conformations, then there is a maximum bound of $3^n$ stable conformations for a molecule with $n$ bonds. For example, a molecule with ten rotatable bonds could have a maximum of 59,049 conformations. Now, we’ve arrived at the question that drives this blog post and the work that we’re about to discuss: Given a molecular graph and its associated connectivity constraints, can we generate a set of low energy stable molecular conformations (a multimodal distribution) that capture the relative spatial positions of atoms in three-dimensional space? 
There are two subtle components to the question above that address some deficiencies in prior attempts to solve this problem:\nA multimodal distribution – there are multiple low energy minima when it comes to the joint distribution of distances between atoms that defines a conformation. In approaches where distances between pairs of atoms or 3D coordinates are randomly sampled to construct a conformation, dependencies and correlations between atomic spatial positions are not captured and the corresponding joint distribution is inaccurate.\nRelative spatial positions – some approaches use graph neural networks directly on molecular graphs to compute representations for the individual nodes (atoms). These nodes can be further fed into other feedforward networks to predict the 3D coordinates of the atoms in a specified conformation. However, directly predicting the 3D coordinates does not capture the idea that a conformation is defined by the relative spatial arrangement and distances between atoms in 3D space. Put another way, if a rotation or translation transformation was applied to the 3D coordinates, the model should not classify that as an entirely different conformation (rotation/translation invariance is not captured). Distances, rather than 3D coordinates could also be predicted; however (mirroring the bullet point above), since distances are predicted independently of each other, there could only be one predicted conformational mode.\nAn Overview of the Deep Generative Approach In “Learning Neural Generative Dynamics for Molecular Conformation Generation,” Xu et. al approach the above deficiencies, generating low energy conformations while modeling dependencies between atoms.\nLet’s keep in mind – the final goal is to optimize a set of parameters $\\theta$ to predict the likelihood of a conformation $R$ given a graph $G$. (i.e. to find $ p_\\theta(R|G) $).\nTo model this distribution, it is necessary to model intermediate distributions and marginalize over one of the variables:\nWe also need to find $p_\\theta(d|G)$ (the distribution of distances $d_{uv}$ between pairs of atoms $u$ and $v$ in the graph).\nFinally, we need to find $p_\\theta(\\boldsymbol{R}|d,G)$ – the probability of a conformation (specified by a set of 3D coordinates given a set of intramolecular distances and an underlying graph).\nWith these two distributions, we can find our intended distribution by integrating over the possible distances.\n$$\\int{p(\\boldsymbol{R}|d,G)*p(d|G)dd}$$\nLet’s walk through the approaches to modeling each of these individual distributions.\nModeling Distributions of Distances In this approach, the distribution of distances given a graph is modeled using a continuous normalizing flow. To understand this approach, we need to define its sub-techniques and understand how they interact with each other.\nNormalizing flows: We initially sample $z_0$ from a starting distribution $p(z_0)$ and a series of invertible transformations transform the initial density function. Here’s a strong primer on flows.\nIn this work, $z(t)$ represents our distances between pairs of atoms $d(t)$. The initial distances are pulled from a normal distribution with mean zero and variance one (for all distances). Correspondingly, the initial probability density function $p(z_0)$ is represented by the initial distribution of distances $N(0, \\mathbf{I})$. 
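As background, these identities are standard for flows rather than specific to this paper: a discrete normalizing flow with invertible steps $z_k = f_k(z_{k-1})$ tracks the density via the change-of-variables rule $$\\textrm{log}(p(z_K)) = \\textrm{log}(p(z_0)) - \\sum_{k=1}^{K}\\textrm{log}\\left|\\det\\frac{\\partial f_k}{\\partial z_{k-1}}\\right|$$ and the continuous-time analogue used by continuous normalizing flows (Chen et al.) replaces the sum with an integral of a trace: $$\\frac{\\partial\\, \\textrm{log}(p(z(t)))}{\\partial t} = -\\textrm{tr}\\left(\\frac{\\partial f_\\theta}{\\partial z(t)}\\right)$$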
Neural ODE systems: In a neural ODE, we specify an initial value problem that uses a neural network to specify the “dynamics” of the system (or the derivative of the “state” with respect to time). More concretely, we have that $y(0) = y_0$ and that $\\frac{dy}{dt} = f(y(t), t, \\theta)$. Using an ODE solver such as odeint, we can calculate the value of $y$ at any time $t$ as in any initial value problem. In fact, y can be thought of as a residual network where we take the limit with respect to the number of layers.\nCorrespondingly, in this work, the purpose of instantiating an ODE is to be able to predict $d(t)$ – the distances between each pair of atoms at any time point. $\\frac{d\\mathbf{d}}{dt}$ can be predicted at any time point given $d(t)$, the time point, $t$, the molecular graph, and the parameters of the assigned neural network (in our case an MPNN). $$\\boldsymbol{d} = F_\\theta(\\boldsymbol{d}(t_0), \\mathcal{G}) = \\boldsymbol{d}(t_0) + \\int_{t_0}^{t_1} f_\\theta(\\boldsymbol{d}(t), t; \\mathcal{G})dt$$\nTo combine the two methods above: We take $z_0$ and define it as the initial value. $\\frac{dz}{dt}$ is calculated using a neural network that takes in $z(t)$, $t$, and $\\theta$. With $z_0$ and a function $f$ to calculate $\\frac{dz}{dt}$ at any time point, $z(t)$ can be calculated as per the traditional initial value problem formulation. The ODESolver also predicts the $\\textrm{log}(p(z(t))$ at any time point, thereby encoding the density function for $z(t)$ in addition to just the values of $z(t)$ alone (Figure 2).\nFigure 2: The neural ODE system computes $\\boldsymbol{d}(t)$ and $\\textrm{log}(p(\\boldsymbol{d}(t))$ at various time points in order to try and approximate the actual functions for $\\boldsymbol{d}(t)$ and $\\textrm{log}(p(\\boldsymbol{d}(t))$. In this case, our $z(t)$ is $\\boldsymbol{d}(t)$, a function that outputs a vector with pairwise intramolecular distances. The “continuous-time dynamics\u0026quot; is a function that takes in neural network parameters, the time, and the current state to output the derivative of the distances with respect to time. The neural network is a graph message passing neural network (MPNN) that calculates node and edge representations and aggregates the node and edge representations for each bond to calculate $\\frac{dd_{uv}}{dt}$ – the change of the distance between two atoms with respect to time (Figure 3).\nFigure 3: First, the individual nodes and edges are embedded using feedforward networks and sent through message passing layers. For every single bond, the final embeddings for the edge and atoms on each (atoms $u$ and $v$) end are concatenated and sent into a final feedforward network to result in a prediction for $\\frac{dd_{uv}}{dt}$. At a higher level, by combining normalizing flows (Figure 4a) with an ODE system, the authors intended to effectively create a normalizing flow with an infinite number of transformations (in the limit) that can therefore model very long-range dependencies between atoms in all the transformations that occur from time $t_0$ to $t_1$ (Figure 4b).\nFigure 4a (Left): Traditional normalizing flow. Figure 4b (Right): Continuous normalizing flow with $z(t)$ as $d(t)$. 
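To make the initial value problem framing concrete, here is a minimal sketch using scipy's odeint, with a toy dynamics function standing in for the paper's learned MPNN (the function f_theta below is a made-up placeholder, not the actual model):

```python
import numpy as np
from scipy.integrate import odeint

def f_theta(d, t):
    # Placeholder "continuous-time dynamics": in the paper this is an MPNN
    # conditioned on the molecular graph; here it is just a toy function
    # returning dd/dt for a vector of pairwise distances d at time t.
    return -0.5 * (d - 1.5) + 0.1 * np.sin(t)

d0 = np.random.randn(3) + 2.0      # initial distances d(t_0), e.g. drawn around a simple prior
ts = np.linspace(0.0, 1.0, 11)     # integration times from t_0 to t_1

# Solve the initial value problem: d(t_1) = d(t_0) + integral of f_theta(d(t), t) dt
trajectory = odeint(f_theta, d0, ts)
print(trajectory[-1])              # the distances at t_1
```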
Modeling Distributions of Conformations After the distances are sampled and predicted based on the graph, the conformations can be sampled so as to minimize the difference between the a priori distances generated by the continuous graph normalizing flow (CGNF) and the pairwise distances in the sampled conformation.\n$$p(\\boldsymbol{R}|d, \\mathcal{G}) = \\frac{1}{Z}\\textrm{exp}\\{-\\sum_{e_{uv}\\in{\\mathcal{E}}} a_{uv}(\\lVert r_u - r_v \\rVert_2 - d_{uv})^2\\}$$\nThe Euclidean norm of the difference between the position vectors represents the distance between two atoms in a sampled conformation ($\\lVert r_u - r_v \\rVert_2$). The distance associated with the edge between atoms $u$ and $v$ from the distribution modeled using the CGNF is ($d_{uv}$). The lower the difference between these two values, the higher the numerator. The higher the numerator, the higher the probability of the conformation given the proposed distances and molecular graph.\nIn their original description of energy-based models, LeCun et al. describe the energy function $E(X, Y)$ as calculating the “goodness” or the “badness” of the possible configurations of $X$ and $Y$, or the “degree of compatibility” between the values of $X$ and $Y$. The same idea can be applied when considering the meaning of the energy function taking in a molecular conformation and a graph as input.\nThe loss function with which the energy-based model (EBM) is optimized provides additional insight into how it helps guide the generation of conformations.\nHere, $p_{data}$ and $p_{\\theta}$ are two different distributions that generate distances between pairs of atoms. $p_{data}$ pulls from vectors of true distances between atoms in actual conformations, while $p_{\\theta}$ pulls from vectors of generated distances from the continuous flow. Therefore, the conformations represented in the second term on the right-hand side of this equation are noisier than the conformations represented in the first term. By being trained against this objective function, the model learns to distinguish real conformations based on true distances from unreal noisy conformations.\nSampling Conformations are sampled by pulling an initial vector of distances from a normal distribution, passing it through the continuous graph normalizing flow, and finding an initial conformation $R_0$ that minimizes the energy. Then, conformations are sampled using two steps of stochastic gradient Langevin dynamics. As in traditional stochastic gradient descent, we subtract the gradient of a secondary energy function (one that uses both the EBM parameters and the CGNF parameters) from the coordinates of the prior iteration. The “Langevin” part of this stochastic gradient descent implies there is a noise term ($\\omega$) added, the standard deviation of which is equal to the square root of the step size ($\\epsilon$). This noise term, and Langevin dynamics more generally, are inspired by modeling Brownian motion in particles and have been repurposed for sampling in molecular dynamics.\nThe secondary function takes into account both the initial energy function and the $\\textrm{log}(p(\\boldsymbol{R}|\\mathcal{G}))$. 
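As a concrete illustration of this two-step sampler (the update it implements is formalized immediately below), here is a minimal numpy sketch with a made-up quadratic energy standing in for the learned $E_{\\theta, \\phi}$:

```python
import numpy as np

def grad_energy(R):
    # Placeholder for the gradient of E_{theta, phi}(R | G); in the actual method this
    # combines the EBM energy with the log-likelihood under the flow. Here we use a
    # toy quadratic energy 0.5 * ||R||^2, whose gradient is simply R.
    return R

eps = 1e-2                       # step size
R = np.random.randn(5, 3)        # initial 3D coordinates for 5 atoms (the initial conformation R_0)

for _ in range(2):               # the sampler described above runs two Langevin steps
    noise = np.random.randn(*R.shape)
    R = R - (eps / 2.0) * grad_energy(R) + np.sqrt(eps) * noise
```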
Minimizing $E_{\\theta, \\phi}(R|\\mathcal{G})$ therefore involves minimizing $E_{\\phi}(R|\\mathcal{G})$ while simultaneously maximizing $\\textrm{log}(p(\\boldsymbol{R}|\\mathcal{G}))$.\n$$R_k = R_{k-1} - \\frac{\\epsilon}{2}\\nabla_R E_{\\theta, \\phi}(R|\\mathcal{G}) + \\sqrt{\\epsilon}\\omega, \\quad \\omega \\sim \\mathcal{N}(0, \\mathbf{I})$$\nFuture Work One could explore different variations on the approach used to compute the continuous-time dynamics – for example, large-scale pretrained transformers applied to SMILES strings – to compare how different architectures that are also able to capture long-range dependencies between atoms perform in generating distance distributions and subsequently conformations. Similar to the way that message passing allows for encoding of long-range dependencies, attention also allows for the same. In fact, attention applied to protein sequences has been shown to recover high-level elements of three-dimensional structural organization; attention weights are a well-calibrated estimator of the probability that two amino acids are in contact in three-dimensional space (Vig et al.).\nOne caveat to note concerning the idea above is that many models pretrained on protein sequences include evolutionary information regarding the sequences through featurizations such as multiple sequence alignments (MSA) and position-specific scoring matrices (PSSM) (Rao et al.). There are currently no featurizations for small molecules that encode their “structural evolution.”\nOne could also verify the ability of the different molecular conformation generation methods to generate more stable conformations. Towards the end of the paper, the authors proposed that the EBM shifts generation towards more stable conformations. Developing a metric or computational experiment – for example, calculating the free energy of generated molecules – would verify whether this is the case. Or we could potentially even ask the question – is there an architectural or algorithmic knob that we could turn to control the tradeoff the algorithm makes between conformational stability and diversity? To evaluate the model’s ability to generate especially stable, low-energy conformations, one could re-calculate all metrics solely across reference conformations for molecules bound to a protein in the Protein Data Bank (PDB) (Figure 5) or in a solid-state crystal structure in the Cambridge Structural Database (CSD).\nFigure 5: Example of conformational variability for a single PDB ligand between different protein structures (Source: Hawkins et al.). Finally, Hawkins et al. make the distinction between systematic methods and stochastic methods for molecular conformation generation. Systematic methods involve a deterministic brute-force search through all possible pairwise distances and torsion angles, while stochastic methods involve random sampling and are not deterministic. Rather, in stochastic methods, the final generated conformation is in part determined by some initially sampled random variable. Under these definitions, the current method proposed in this work is stochastic, as the generated conformations are a function of the initial $d(t_0)$’s sampled from a normal distribution.\nFor stochastic approaches to finding multiple local minima, it is necessary to have multiple “starts” in order to cover all local minima. To evaluate the efficiency of the approach, one could measure the number of starts it takes to get a certain threshold of coverage over significant low-energy conformations.\nAll in all, the approach that Xu et al. 
employ to generate 3D conformers from a 2D molecular graph is part of a recent frontier in research that involves fewer brute-force physical simulations and more convenient ML-guided predictions that can help accelerate drug discovery.\nReferences Chen, R. T. Q., Rubanova, Y., Bettencourt, J., \u0026amp; Duvenaud, D. (2019). Neural Ordinary Differential Equations. arXiv [cs.LG]. Retrieved from http://arxiv.org/abs/1806.07366\nHawkins, P. C. D. (2017). Conformation Generation: The State of the Art. Journal of Chemical Information and Modeling, 57(8), 1747–1756. doi:10.1021/acs.jcim.7b00221\nMadani, A., Krause, B., Greene, E. R., Subramanian, S., Mohr, B. P., Holton, J. M., … Naik, N. (2021). Deep neural language modeling enables functional protein generation across families. bioRxiv. doi:10.1101/2021.07.18.452833\nRao, R., Bhattacharya, N., Thomas, N., Duan, Y., Chen, X., Canny, J., … Song, Y. S. (2019). Evaluating Protein Transfer Learning with TAPE. arXiv [cs.LG]. Retrieved from http://arxiv.org/abs/1906.08230\nVig, J., Madani, A., Varshney, L. R., Xiong, C., Socher, R., \u0026amp; Rajani, N. F. (2020). BERTology Meets Biology: Interpreting Attention in Protein Language Models. bioRxiv. doi:10.1101/2020.06.26.174417\nWeng, L. (2018). Flow-based Deep Generative Models. lilianweng.github.io/lil-log. Retrieved from http://lilianweng.github.io/lil-log/2018/10/13/flow-based-deep-generative-models.html\nXu, M., Luo, S., Bengio, Y., Peng, J., \u0026amp; Tang, J. (2021). Learning Neural Generative Dynamics for Molecular Conformation Generation. arXiv [cs.LG]. Retrieved from http://arxiv.org/abs/2102.10240\nLeCun, Y., Chopra, S., Hadsell, R., Ranzato, M., \u0026amp; Huang, F. (2006). A tutorial on energy-based learning. Predicting Structured Data, 1(0).\n","date":1648166400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1648166400,"objectID":"64a5ff66e47ae98c4689966695831d48","permalink":"https://MSAIL.github.io/post/molecular_iclr_2022/","publishdate":"2022-03-25T00:00:00Z","relpermalink":"/post/molecular_iclr_2022/","section":"post","summary":"Post accepted to the ICLR 2022 Blog Track. In this post, we provide an in-depth overview of methods outlined in the paper “Learning Neural Generative Dynamics for Molecular Conformation Generation,” discuss the impact of the work in the context of other conformation generation approaches, and additionally discuss future potential applications to improve the diversity and stability of generated conformations.","tags":["Cheminformatics","Drug Discovery","Normalizing Flows","Neural ODEs","Generative Models"],"title":"Generating Molecular Conformations via Normalizing Flows and Neural ODEs","type":"post"},{"authors":["MSAIL"],"categories":null,"content":"The purpose of this presentation was to introduce everyone to fairness aspects of machine learning and discuss Serafina\u0026rsquo;s (award-winning!) 
research in the area.\nSupplemental Resources Robustness of Fairness: An Experimental Analysis, by Kamp et al.\nSlides with additional links and resources\n","date":1647986400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1647986400,"objectID":"dc35b13f7b0e2b0a7705fc8c3a8b58dd","permalink":"https://MSAIL.github.io/talk/fairness_032222/","publishdate":"2022-04-30T18:00:38-04:00","relpermalink":"/talk/fairness_032222/","section":"talk","summary":"Speaker(s): Serafina Kamp","tags":["Fairness"],"title":"Fairness in Machine Learning","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Nisreen explained the technical aspects of attention and self attention mechanisms, as well as explored how attention is used in the transformer architecture in order to aid in machine translation tasks.\nSupplemental Resources Neural Machine Translation by Jointly Learning to Align and Translate, Bahdanau et al.\nAttention is all you Need, Vaswani et al.\n","date":1647381600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1647381600,"objectID":"8ca059ed89389b7a286e48b95ce368a6","permalink":"https://MSAIL.github.io/talk/attention_031522/","publishdate":"2022-04-30T18:00:38-04:00","relpermalink":"/talk/attention_031522/","section":"talk","summary":"Speaker(s): Nisreen Bahrainwala","tags":["Natural Language Processing","Transformers","Attention","Neural Networks","Machine Translation"],"title":"An Overview of Attention and Transformer Mechanisms for NLP","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"A recent area of study in AI has been focused on the problem of cooperation amongst machine learning agents. These cooperation problems are widespread, from routine challenges such as driving on highways and working collaboratively, all the way up to global challenges like commerce, peace, and pandemic preparedness. If AI is to play a larger role in society, it is important that AI agents will be able to cooperate effectively with other agents (other AI, humans, etc).\nSupplemental Resources Open Problems in Cooperative AI, Dafoe et al.\nCooperative AI: machines must learn to find common ground\n","date":1645567200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1645567200,"objectID":"97a17f07249c214fb45ec0a6afddd1a9","permalink":"https://MSAIL.github.io/talk/cooperative_022222/","publishdate":"2022-04-30T18:00:38-04:00","relpermalink":"/talk/cooperative_022222/","section":"talk","summary":"Speaker(s): Ashwin Sreevatsa","tags":["Cooperative AI"],"title":"Open Problems in Cooperative AI","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Kevin presented Scaling Neural Tangent Kernels via Sketching and Random Features, which uses sketching and random feature generation to speed up neural tangent kernels (NTKs). This presentation was really all about introducing the NTK, a mechanism for analyzing the behavior of very wide / infinitely wide neural networks. 
NTKs made a huge splash in machine learning theory in 2018 for offering a novel approach to analyzing the behavior of neural networks, and there\u0026rsquo;s plenty of ground left to cover with them in research.\nSupplemental Resources Kernel trick\nProfessor Feizi\u0026rsquo;s lecture\nRajat\u0026rsquo;s blog post\nPaper #1, introduces NTKs\nPaper #2, polynomial bounds NTK complexity and introduces CNTK\nPaper #3, enhanced CNTK\n","date":1643752800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1643752800,"objectID":"882e8d5d972fa5759187e2c4f82871b5","permalink":"https://MSAIL.github.io/talk/ntk_020122/","publishdate":"2022-04-30T18:00:38-04:00","relpermalink":"/talk/ntk_020122/","section":"talk","summary":"Speaker(s): Kevin Wang","tags":["Neural Tangent Kernel","Neural Networks"],"title":"Scaling Neural Tangent Kernels via Sketching and Random Features","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Ashwin gave an introduction to the field of AI safety, which studies how to ensure that AI, especially artificial general intelligence and super intelligence, will be safe and trustworthy.\nSupplemental Resources Concrete Problems in AI Safety, by Amodei et al.\n","date":1643148000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1643148000,"objectID":"5d0bc11359effc014185e362113953a2","permalink":"https://MSAIL.github.io/talk/aisafety_012522/","publishdate":"2022-04-30T18:00:38-04:00","relpermalink":"/talk/aisafety_012522/","section":"talk","summary":"Speaker(s): Ashwin Sreevatsa","tags":["AI Safety","AGI"],"title":"Concrete Problems in AI Safety","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Dr. Jonathan Kummerfeld\nTopic: Human-in-the-Loop Natural Language Processing\nDr. Kummerfeld works on NLP, with many projects crossing into HCI, either in the process of creating datasets or developing systems. In this talk, he gave a brief introduction to NLP and Crowdsourcing + Human Computation, and then dove into two research projects. First, he discussed work on task-oriented dialogue (e.g. Siri), where his team developed new ways to collect more diverse data, which in turn leads to more robust models. Second, he discussed work on understanding a set of conversations occurring in a shared channel (e.g. in Slack).\nSupplemental Resources Dr. Kummerfeld\u0026rsquo;s Website\n","date":1638835200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1638835200,"objectID":"9f9e943d0d3acff93bfb3f9f021ed3dd","permalink":"https://MSAIL.github.io/talk/jkk_120621/","publishdate":"2021-12-06T20:00:00-04:00","relpermalink":"/talk/jkk_120621/","section":"talk","summary":"Speaker(s): Dr. 
Jonathan Kummerfeld","tags":["Natural Language Processing","Jonathan Kummerfeld","Human-Computer Interaction","Crowdsourcing"],"title":"Human-in-the-Loop Natural Language Processing","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Anthony Zheng, Kiran Prasad, Andong Li Zhao, Christian Kavouras\nTopic: Michigan AI Alumni Panel\nWe hosted a virtual speaker panel with a few UMich alumni who are currently doing very exciting work with AI in industry and academia.\nAnthony Zheng is an Apple research engineer working on search algorithms for Apple Media Products Kiran Prasad is an Applied Scientist at Microsoft working on NLP models for Microsoft products and was at CMU for his MS in AI and Innovation Andong Li Zhao is a CS PhD student at Northwestern working on making information more democratically accessible Christian Kavouras is a former Applied Scientist intern at Amazon working on ML/NLP applications and graduated from UWashington for his MS in Computational Linguistics Supplemental Resources Andong\u0026rsquo;s Slides\nAnthony\u0026rsquo;s Slides\nChristian\u0026rsquo;s Slides\nKiran\u0026rsquo;s Slides\nCheck Slack or contact us if you\u0026rsquo;re interested in getting their contact info!\n","date":1637020800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1637020800,"objectID":"35627e2787b5123d2ba2ca3d81517f9e","permalink":"https://MSAIL.github.io/talk/alumni_panel111521/","publishdate":"2021-11-15T20:00:00-04:00","relpermalink":"/talk/alumni_panel111521/","section":"talk","summary":"Speaker(s): Anthony Zheng, Kiran Prasad, Andong Li Zhao, Christian Kavouras","tags":["Alumni Panel","Industry","Datature"],"title":"Michigan AI Alumni Panel","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Datature Team\nTopic: Understanding MLOps for Computer Vision Pipelines\nWe hosted an industry talk with Datature, a no-code platform that allows teams and enterprises to build computer vision models. In this session, they covered key MLOps practices and the shift from \u0026lsquo;model-centric AI\u0026rsquo; development to a \u0026lsquo;data-centric\u0026rsquo; approach in the context of computer vision. There was also a \u0026lsquo;hands-on\u0026rsquo; aspect where students were able to build a facemask detection / chess piece detection model in under 30 minutes using Datature\u0026rsquo;s no-code platform.\nSupplemental Resources We have a link with a tutorial for using Datature\u0026rsquo;s MLOps platform, but it is UMich only. If you are a UMich student interested in seeing it, please reach out to the MSAIL admin team and we will happily pass it along.\nMLOps Website\nNVIDIA MLOps Blog Post\n","date":1636414200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1636414200,"objectID":"5bb938a119ac183ae3b4344374cce6a1","permalink":"https://MSAIL.github.io/talk/datature_110821/","publishdate":"2021-11-08T19:30:00-04:00","relpermalink":"/talk/datature_110821/","section":"talk","summary":"Speaker(s): Datature Team","tags":["MLOps","Computer Vision","Datature","Industry"],"title":"Understanding MLOps for Computer Vision Pipelines","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Kevin Wang\nTopic: Contrastive Learning with Hard Negative Samples\nKevin talked about the paper Contrastive Learning with Hard Negative Samples, by Joshua Robinson, Ching-Yao Chuang, Suvrit Sra, Stefanie Jegelka. 
In the past two years, contrastive learning has emerged as a powerful unsupervised computer vision technique for learning effective representations of data for downstream tasks. This theory-focused paper proposes a technique for sampling \u0026ldquo;hard\u0026rdquo; negative examples in contrastive learning. The authors note improved performance on downstream tasks compared to SimCLR and faster training.\nSupplemental Resources Lilian Weng\u0026rsquo;s blog post on contrastive representation learning\nEkin Tiu\u0026rsquo;s post on contrastive learning\nGoogle SimCLR\n","date":1635804000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1635804000,"objectID":"7ebf23d04054037c75f57d155f0693f8","permalink":"https://MSAIL.github.io/talk/contrastive_hns_110121/","publishdate":"2021-11-01T18:00:00-04:00","relpermalink":"/talk/contrastive_hns_110121/","section":"talk","summary":"Speaker(s): Kevin Wang","tags":["Contrastive Learning","Representation Learning","Computer Vision","Unsupervised Learning"],"title":"Contrastive Learning with Hard Negative Samples","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Nisreen Bahrainwala\nTopic: Latent Semantic Analysis: An Overview\nLatent Semantic Analysis is one of the many methods used to help computers \u0026ldquo;understand\u0026rdquo; meaning behind words and phrases, aiding with tasks such as search response relevance. This discussion will introduce the concept of LSA, some of the methods used during its development, and then explore how this technology has shaped modern NLP methods.\nSupplemental Resources Paper(s):\nA Solution to Plato\u0026rsquo;s Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge\nAn Introduction to Latent Semantic Analysis\n","date":1635199200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1635199200,"objectID":"2588ac7f32d28531f23db60c752d1aa3","permalink":"https://MSAIL.github.io/talk/lsa_102521/","publishdate":"2021-10-25T18:00:00-04:00","relpermalink":"/talk/lsa_102521/","section":"talk","summary":"Speaker(s): Nisreen Bahrainwala","tags":["Natural Language Processing","NLP","Latent Semantic Analysis","NLU"],"title":"Latent Semantic Analysis","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Lance Ying\nTopic: Speech Emotion Recognition with Machine Learning\nThis talk begins with a brief introduction of speech emotion recognition (SER) with machine learning and its applications. A few challenges in SER tasks and existing solutions are discussed. The second half of the talk focused on a recent paper and methods (Nonparametric Hierarchical Neural Network) to account for variations in emotional expression due to demographic and contextual factors for SER tasks.\nYou can find a link to the recording here. 
(UM only)\nSupplemental Resources Accounting for Variations in Speech Emotion Recognition with Nonparametric Hierarchical Neural Network\nEECS 498 - ML and Affective Computing\n","date":1633388400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1633388400,"objectID":"84b0f0e7099c750f328e0cff5c438cca","permalink":"https://MSAIL.github.io/talk/speech_emotion_recog_100421/","publishdate":"2021-10-04T18:00:38-04:00","relpermalink":"/talk/speech_emotion_recog_100421/","section":"talk","summary":"Speaker(s): Lance Ying","tags":["Speech","Emotion Recognition","Hierarchical NN","Affective Computing"],"title":"Speech Emotion Recognition with ML","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Mukundh Murthy\nTopic: Bioactivity Descriptors for Uncharacterized Chemical Compounds\nQuantitative structure-activity modeling (QSAR) in computational chemistry is a task that involves predicting the binding affinity of a small molecule to a protein target given solely its molecular structure. Now, however, we are also interested in predicting more downstream properties including toxicity, side effects, and effects on gene expression – properties that concern both the biological and chemical properties of a molecule. This talk discussed the paper \u0026ldquo;Bioactivity Descriptors for uncharacterized chemical compounds,\u0026rdquo; which revolves around learning a generalizable and multi-modal representation for small molecules that can be applied across a large array of drug-discovery related tasks through integration of 25 small molecule datasets and a triplet network training task.\nYou can find a recording of this talk here (UM only).\nSupplemental Resources Bioactivity Descriptors for Uncharacterized Chemical Compounds\nComputational Biochemistry Primer by Mukundh Murthy and Michael Trinh\nML for Molecular Property Prediction\nMoleculeNet\n","date":1632783600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1632783600,"objectID":"266e0da019bff3d0763cefd28783f12b","permalink":"https://MSAIL.github.io/talk/chemdesc_092721/","publishdate":"2021-09-27T18:00:38-04:00","relpermalink":"/talk/chemdesc_092721/","section":"talk","summary":"Speaker(s): Mukundh Murthy","tags":["Computational Biochemistry","Representation Learning"],"title":"Learning Effective Representations for Small Molecules","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Ashwin Sreevatsa\nTopic: AlphaFold 2\nThe protein folding problem is one of the central challenges of biology over the past 50 years. The challenge is to identify the 3D structure of a protein given its amino acid sequence. Recently, DeepMind released a deep learning model called AlphaFold 2 that outperformed the state-of-the-art computational methods and predicted the 3D structures of proteins so accurately that many in the field now consider protein folding to be \u0026lsquo;solved\u0026rsquo;. 
This talk discussed a brief history of the protein folding problem, the architecture behind AlphaFold 2, and the next steps for protein folding and computational biology as a whole.\nSupplemental Resources AlphaFold 2 Paper in Nature\nAlphaFold 2 Source Code on Github\nDeepMind blog post on the initial AlphaFold\nDeepMind blog post on AlphaFold 2\nVideo: AlphaFold and the Grand Challenge to solve protein folding by Arxiv Insights\nAlphaFold 2 Example on Google Colab\n","date":1632178800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1632178800,"objectID":"6fd332c27e32addc5f21d5d607b6479e","permalink":"https://MSAIL.github.io/talk/alphafold_2_092021/","publishdate":"2021-04-13T18:00:38-04:00","relpermalink":"/talk/alphafold_2_092021/","section":"talk","summary":"Speaker(s): Ashwin Sreevatsa","tags":["DeepMind","Computational Biology"],"title":"AlphaFold 2 and the Protein Folding Problem","type":"talk"},{"authors":["Yash Gambhir"],"categories":null,"content":"We interact with machine learning algorithms every day\u0026ndash;from scrolling through social media, to navigating around town, to selecting word recommendations when we text. But what do we know about these machine learning methods? They’re data hungry. More specifically, they’re hungry for OUR data. Many of our interactions with technology must be tracked and stored to develop and improve these algorithms, and this need comes with an inherent privacy risk. By giving up complete access to our sensitive data, we allow ourselves to be vulnerable to both the companies that collect our information and third parties who could be interested in exploiting us. Luckily, there is a set of machine learning and statistical techniques that allow model developers to learn from our data while protecting our privacy first. In this post, we’ll specifically discuss how large companies like Google, Apple, and Microsoft use federated learning and differential privacy to develop the state of art AI algorithms and relevant insights that we benefit from every day.\nFederated Learning The first concept we’ll discuss is called federated learning, and it\u0026rsquo;s vital for making sure your data stays local to your device.\nNow let\u0026rsquo;s say we wanted to train a next word prediction model to give word recommendations to users as they type. This task involves predicting the next word in a sentence given the previous words, and these predictions can be seen above the keyboard as you type.\nNext word predictions are shown above your keyboard as your type. Machine learning models are trained to predict the next word you will type given the previous words in the sentence. If we had access to all of our users\u0026rsquo; mobile devices, we would need to access samples of text messages and collect them in a central server for us to train our model. But wait\u0026ndash;text messages contain extremely sensitive data, and this would be an extremely invasive data collection process. Instead of bringing all of the data to a central server with the model, we can use federated learning to learn a global model without a user’s data leaving their device.\nIn a typical federated learning setting, one central server communicates with several clients (in this case, mobile phones) and trains a global model in several rounds. 
Each round consists of the following steps:\nSelect a sample of available phones for training and send the current global model to each phone\nIn parallel, train the global model on each phone on their local text messages for one (or a few) epoch(s). Now each phone contains model updates to the global model that were calculated using their own small datasets\nThe model updates from each phone are sent back to the central server and averaged to calculate the gradient update for the global model in this training round.\nAn illustration of how a central server communicates with client servers in federated learning. Produced by ML@CMU. For next word prediction, some of the text data from the messages you send everyday are accessed for federated learning. Your phone might be selected for federated learning when it is being charged (to ensure your phone’s performance doesn’t drop) and you have strong wifi bandwidth. Next word prediction is a semi-supervised task (meaning we can automatically construct labels from the raw text data), so there is no additional need for labelling your local data. In a 2018 paper, Google showcased a 1.4 million parameter LSTM model that was trained for 5 days (3000 total rounds!) on over a million users that matched the performance of a centralized server trained model.\nNow, let’s examine how Apple uses a similar technique to personalize their news recommendation algorithm. As discussed in a recent paper, they use federated learning to not only learn a global model, but to personalize their models to individual users in a privacy-preserving manner. Apple news is an app that curates personalized news article feeds for users to select and read articles from. These recommendations are based on your interactions with specific articles (e.g what articles you click on, how long you read them, etc.). Article-label pairs can be easily constructed and stored on your device (positive label if read, negative if not read). After learning a global recommendation model, the model can be tested on each client device and remotely fine tuned based on the prediction loss on each user’s specific dataset. The model first takes advantage of the huge knowledge base of users to create a powerful global model, and then can further personalize to each user without ever collecting any data.\nAn example of how news is broadcasted on the Apple News interface. Now federated learning itself has some privacy limitations\u0026ndash;although Google or Apple might not have access to your specific data, they could theoretically learn about your data from the model updates sent back to the server. For better privacy guarantees, Google implements a method called Secure Aggregation. This method encrypts each individual client model update and sends them to a trusted, third party server for encrypted aggregation\u0026ndash;the central server can now only decrypt the aggregated data and has no access to individual model updates.\nThe left picture represents data being sent to a central server to model training. The middle showcases how models are sent to local devices to be trained, and aggregate together on the central server. The right demonstrates how FL can be better secured by sending encrypted updates to a third party server for aggregation, and the decrypted result is made available to the central server. Figure used from this paper. 
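To make the round structure concrete, here is a toy numpy sketch of one federated averaging round following the three steps above; the linear model, synthetic local datasets, and unweighted average are simplified placeholders, not Google's or Apple's production systems:

```python
import numpy as np

rng = np.random.default_rng(0)
global_w = np.zeros(3)                         # global model weights held by the server

# Synthetic "on-device" datasets: each phone holds its own (X, y) pairs that never leave the device.
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(100)]

def local_update(w, X, y, lr=0.1, epochs=1):
    # One (or a few) epochs of gradient descent on a linear least-squares model, run locally.
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

# One communication round:
selected = rng.choice(len(clients), size=10, replace=False)                     # 1. sample available phones
local_models = [local_update(global_w.copy(), *clients[i]) for i in selected]   # 2. local training on-device
global_w = np.mean(local_models, axis=0)                                        # 3. server averages the updates
```

In practice the server-side average is usually weighted by each client's dataset size; the equal-sized toy datasets above make a plain mean equivalent.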
Differential Privacy The next technique that companies use to protect privacy aims to prevent your individual information from leaking through statistical queries, model predictions, and other analyses. This technique is called differential privacy.\nThe simplest definition of differential privacy from its original authors is: “the data holder makes a promise to the data subject that they will not be affected adversely or otherwise by allowing their data to be used in the study/analysis, no matter what other sources are available”. So if you participate in a dataset, differential privacy gives you some mathematical certainty that any statistical query or model trained on it will not reveal your personal data. From a researcher’s perspective, we want to learn about a dataset without learning about individuals that participate in that dataset. Let’s look at a quick example often used in surveys that collect sensitive information.\nSince the 1960s, sociologists have used a technique called randomized response in order to get statistics of a population regarding a sensitive topic while protecting each individual\u0026rsquo;s privacy. If a researcher wanted to know what percentage of a population has jaywalked, for example, they could give each participant in the study the following procedure:\nFlip a coin without me seeing it. If it lands on Heads, answer truthfully about whether you’ve jaywalked previously.\nIf it lands on Tails, answer your question according to the next coin flip. “Yes” if Heads, and “No” if tails.\nNow, every participant can have plausible deniability if it is discovered they answered yes or no to the question. Furthermore, researchers can still get a rough idea of the true global statistic to the question they asked: if the final amount of “Yes” answers occurs 70% of the time, we know half answered with the probability of the coin flip (50%), and the other half must have answered with the probability of 90%. So in this population, the true statistic must be around 90%. This still may not be exactly accurate due to randomness of the coin flip\u0026ndash;there is some tradeoff between privacy and accuracy when applying random noise to each data point.\nThis procedure above is an example of local differential privacy, where the goal is to add random noise to each individual data point before it is entered into a database. The great thing here is that each individual participant does not need to trust the central data curator\u0026ndash;an ideal setting for the relationship between millions of users and big tech companies.\nNow, let’s take a look at how Apple uses this technique when collecting statistics on user activity. Apple must collect user data to determine what emojis are most popular among users or what specific online domains drain the most energy on Safari. The first step is encoding their data of interest (in this example, let’s say an emoji you used recently) into a fixed size matrix representation using a hash function. Then, each bit of this matrix representation is changed to an incorrect value with some tunable probability value (anywhere from 1-25%).\nAn example graph of top emojis of US speakers from Apple’s differential privacy overview. Read more about how Apple obtains this visualization here. After this noise is added to each individual’s records, IP and other personal identifiers are stripped before the data is sent to the server. 
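A small simulation of the coin-flip procedure described above shows how the population rate can be recovered from the noisy answers even though no individual answer can be trusted (the 30% true rate is an arbitrary example value):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_answers = rng.random(n) < 0.30        # arbitrary example: 30% of people have jaywalked

first_coin = rng.random(n) < 0.5           # heads: answer truthfully
second_coin = rng.random(n) < 0.5          # tails: answer "yes" iff a second coin lands heads
reported = np.where(first_coin, true_answers, second_coin)

# P(yes) = 0.5 * p_true + 0.5 * 0.5, so invert the relationship to estimate the population rate:
p_hat = 2 * (reported.mean() - 0.25)
print(round(p_hat, 3))                     # close to 0.30
```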
The final statistics are then aggregated on Apple’s servers for their internal use\u0026ndash;for example, Apple can identify the most popular emojis being used and design better ways of accessing/recommending them. If a user’s specific activity data was leaked from Apple’s central server, each user could have some level of plausible deniability that it wasn’t their correct data.\nLocal DP is used by Google in order to track changes to user\u0026rsquo;s Chrome settings and combat malicious software that changes these settings without user permission. Google also employs DP in user facing analysis features like Google Search Trends and Google Maps\u0026rsquo; \u0026ldquo;busyness\u0026rdquo; feature, which tells you how busy a place may be at any given time. Whether your data is being used by these companies to improve products or collected and aggregated for users to see, differential privacy is a useful technique that prevents malicious actors from personally identifying your data from a dataset.\nOther interesting notes Large tech companies have also built open-source tools that support federated learning and differential privacy, which opens the door for researchers and developers to easily adopt these techniques in their applications. Google has built RAPPOR for differential privacy and TensorFlow Federated for federated learning. Microsoft is using their differential privacy library SmartNoise with nonprofits and health care companies to provide privacy protections to the most sensitive personal data domains.\nConcluding remarks In this post, we learned about the fundamental concepts of federated learning and differential privacy and how Google and Apple access our data while protecting our individual privacy and ownership of that data. The examples discussed are just early use cases of these tools, and future applications are likely to arise. The discussion of data privacy is a very complex one, and these techniques by themselves won’t be the end all solution. At the end of the day, companies control these algorithms and protocols, and can manipulate them however they choose to. But in the world of big data, privacy preserving machine learning techniques can be the technical gateway to allowing users to regain control of their most personal and sensitive data, while maintaining the utility we gain from powerful machine learning models.\nWhat other data hungry applications or research projects do you think could take advantage of these useful techniques?\nReferences Communication-efficient learning of deep networks from decentralized data\nTowards federated learning at scale: system design\nThe algorithmic foundations of differential privacy\nRandomized response\nApple\u0026rsquo;s differential privacy overview\nHow Google anonymizes data\nUdacity\u0026rsquo;s Secure and Private AI course\n","date":1621382400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1621382400,"objectID":"b70b6dafeea4e2bdbb5529dabab143f5","permalink":"https://MSAIL.github.io/post/federated_learning/","publishdate":"2021-05-19T00:00:00Z","relpermalink":"/post/federated_learning/","section":"post","summary":"We interact with machine learning algorithms every day\u0026ndash;from scrolling through social media, to navigating around town, to selecting word recommendations when we text. 
But what do we know about these machine learning methods?","tags":["Data Privacy","Big Tech","Federated Learning","Differential Privacy"],"title":"Data privacy and AI","type":"post"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Dan Hepp and John Dillon\nTopic: How We Built This: TDM Studio and Sentiment Analysis\nDan Hepp is a Data Scientist Lead at ProQuest. Dan has thirty years of experience in research and production settings developing complex systems. He has a demonstrated track record of finding creative solutions to difficult technical problems and making them effective in real-world situations. Dan has expertise in machine learning, data mining, information extraction, pattern recognition, information retrieval, natural language processing, computer vision, artificial intelligence, and optical character recognition.\nJohn Dillon, Ph.D., is the Text and Data Mining Product Manager at ProQuest. His work focuses on pairing computational text analysis methods with traditional Humanities and Cultural Studies disciplines. He has published papers on Machine Learning and Sentiment Analysis and has worked previously as a postdoctoral researcher with the University of Notre Dame, USAID, and IBM Research.\nThis presentation consisted of two parts: The first part provided a history and overview of what it took to build TDM Studio from a product development standpoint. TDM Studio is a text and data mining solution offered by ProQuest. In the first part of the presentation, they gave us some practical insights into what to do and what not to do when trying to create a startup-esque product within a mid-sized company. The second portion of the presentation dug a little deeper into one aspect of TDM Studio, sentiment analysis. They discussed their work with the 2020 MDP Sentiment Analysis team and the results of their approach to the problem.\nYou can view a recording of his talk here.\nSupplemental Resources TDM Studio\nMDP Team Description\n","date":1618351200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1618351200,"objectID":"a9c9a4d023cd8514ebc1f732e17364cf","permalink":"https://MSAIL.github.io/talk/proquest_041321/","publishdate":"2021-04-13T18:00:38-04:00","relpermalink":"/talk/proquest_041321/","section":"talk","summary":"Speaker(s): Dan Hepp and John Dillon","tags":["ProQuest","Sentiment Analysis","Datasets","Data Mining","Production"],"title":"How We Built This: TDM Studio and Sentiment Analysis","type":"talk"},{"authors":["Robert Aung"],"categories":null,"content":"Richard Feynman once said “What I cannot create I do not understand.” Therefore, to truly understand the human visual system, we must learn to create it. One of the most effective forms of such creation is the Convolutional Neural Network (CNN) system to mimic the human visual system. Computer Vision models that use the CNN system have achieved near human-level performances on tasks such as image classification and object detection. There is no question the CNNs have shown us mind-blowing performance, but the question is: do they actually resemble the biological visual system? Are we really creating it? As a quest to answer this question, in this article we will explore the similarities and differences between the CNNs and the biological visual system.\nTo start off, let’s explore the roots of the CNN - how does a CNN function and what are its capabilities? 
In simple terms, the CNN is able to learn features from images: for example, it can deduce whether an image contains a dog, whether it shows a desert, or whether it is a painting. Let’s take a look at some example features that CNNs can learn to extract from an image. In the image underneath, we can see that for a baseball, the CNN combines the distinctive features of a baseball, such as its rounded shape and stripes, into a filter image. This filter image is then used to cross-check against input images to determine whether each input image contains a baseball. The same is done for dogs, clouds, buildings, and so on, each with its own filter image. This way, we can use CNNs for tasks such as classification (assigning an image to a group/class), object detection (detecting the presence of certain objects in an image), and image generation (imitating certain styles and patterns to generate unique images), among many other tasks.\nUse of filters for detecting image features [1] So now that we have an idea of how CNNs function, we can move on to discussing how their design compares to the biological visual system. The CNN is composed of image processing layers that deduce and pass down information from one layer to the next. At each layer, information at a different level of abstraction is deduced. Generally, the earlier layers deduce simpler, more basic ideas, while later layers use the information gathered from previous layers to deduce more complex ideas. The following figure lays out an example CNN: a boat image is passed from the left-most layer to the next until it reaches the right-most layer, which outputs the class of the image; in this case, it should be “Boat”.\nLayer Structure of a CNN [6] Because the layers learn increasingly complex features the deeper they sit in the network, we can call this a hierarchical learning structure. To see clearly what we mean by the complexity of features, we can observe the following figure. The first group shows the detection of edges, which are simpler features than the textures in the second group. The CNN layers detect such “textures” through the combination of “edges” detected in the previous layer. Likewise, the following layers learn patterns by combining textures, and so on and so forth, until the last layers learn the presence of certain objects. In this way, the complexity of features increases as we go deeper down the network, demonstrating a hierarchical learning mechanism.\nDifferent complexity of learned features [1] This hierarchical layer structure is also found in the visual cortex ventral pathway, a layer-like pathway consisting of the sequence LGN-V1-V2-V4-IT, where each stage represents an information-processing layer (Figure 4). As we proceed through the visual pathway, the features learned become more complex, just as in the CNN. The receptive visual field size increases as well, since a larger receptive field corresponds to a more holistic, general feature of the image. In a way, this makes sense: to recognize a baseball, the network has to learn stripes and some circular shapes. Likewise, for a dog, the features might be the dog’s snout, black-and-white eyes, furry texture, etc.
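As a concrete illustration of the filter idea described above, here is a minimal sketch in PyTorch of a single hand-written edge filter being slid over a toy image. In a real CNN the filter values are learned during training rather than hand-set, and this is only an illustration, not code from the cited papers.

```python
import torch
import torch.nn.functional as F

# A toy 3x3 "edge" filter: responds strongly where dark pixels sit next to bright ones.
edge_filter = torch.tensor([[-1., 0., 1.],
                            [-1., 0., 1.],
                            [-1., 0., 1.]]).reshape(1, 1, 3, 3)

image = torch.zeros(1, 1, 8, 8)
image[..., :, 4:] = 1.0                  # right half bright -> a vertical edge at column 4

response = F.conv2d(image, edge_filter, padding=1)
print(response[0, 0])                    # large values along the edge, near zero elsewhere
```

Stacking many such learned filters, layer after layer, is what produces the edge-to-texture-to-object hierarchy described above.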
Features like these cannot be detected in a single step; rather, they are accumulated across the layers’ learning, which forms the basis of hierarchical learning.\nHuman Visual Cortex Ventral Pathway [3] In addition to the concept of hierarchical information processing in CNNs, another fundamental concept called pooling is utilized. Pooling is basically the idea of generalizing or approximating a set of values in an area into a smaller set of values. This concept is explained in the following figure. The input image is a grid of 16 values, and pooling is applied to the image to produce a grid of 4 values; each of the 4 values is the maximum taken from its respective area, indicated by color. By taking the maximum, we reduce the size of the information we are looking at and keep only the most important values.\nPooling Mechanism [8] At a higher level, pooling aggregates the information gathered into a summarized form. This aggregation is what enables hierarchical information processing: the basic features learned are combined, the fine details are discarded, and higher-level features are learned from what remains. Pooling reduces the dimension of the representation and “creates an invariance to small shifts and distortions”. In other words, shifting an image around by a few pixels will not change the information deduced from it. Through pooling, we eliminate repeated learning of similar features that sit right next to each other in the image feature representation. Interestingly, the idea behind the pooling layer is also found in the relation between simple and complex cells in the biological neural system, where simple cells respond at their particular spatial locations, while complex cells appear to pool over the responses of the simple cells and thus show more spatial invariance in their responses.\nHowever neatly the CNN captures visual information like the biological system does, it has some outright flaws. One is the possibility of adversarial attacks, which involve hacking the CNN by slightly changing the pixel values of images in a way that is undetectable to a human eye but enough to fool the CNN into drawing faulty conclusions. An example is shown below, where the panda image is altered to be recognized as a gibbon by a CNN although there seems to be no difference to the human eye (Figure 6). This example shows how the CNN perceives ideas through meticulous attention to every single pixel in an image, which is likely not how human vision works; humans probably perceive patterns and lines directly rather than individual pixels.\nAdversarial Attack on CNN (OpenAI) On the other hand, this makes us wonder, “Can the human visual system be hacked as well? Are there ways to fool our eyes even if, to another species, there is no noticeable change?” It turns out that there are ways to fool our visual perception through small image changes as well. Researchers have found a way to generate images that are designed to tip perception toward a different interpretation even though the image composition barely changes. Look at the example below - the left image looks like a cat, but when altered slightly to form the right image, it starts to look more like a dog.
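For the curious, perturbations like the panda-to-gibbon example are commonly generated with gradient-based methods such as the fast gradient sign method (FGSM). The sketch below assumes a trained PyTorch classifier `model`, a batched input `image` tensor scaled to [0, 1], and its integer class `label`; these names are placeholders, and FGSM is only one standard approach, not necessarily the method behind the figures shown in this post.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, epsilon=0.01):
    """Nudge every pixel slightly in the direction that increases the classifier's loss.

    image: tensor of shape (1, C, H, W) with values in [0, 1]; label: long tensor of shape (1,).
    """
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adversarial = image + epsilon * image.grad.sign()   # tiny, near-invisible per-pixel change
    return adversarial.clamp(0, 1).detach()
```

A perturbation budget of a percent or so per pixel is typically invisible to us, yet it can flip the CNN's prediction entirely.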
A perceptual hack like the cat-to-dog example above is akin to the idea of subliminal stimuli - visual or auditory stimuli that the conscious mind cannot detect but that the brain subconsciously processes. Perhaps adversarial attacks are subliminal messages for the CNN.\nAdversarial Attack on Human Visual System [9] While it’s quite interesting to ponder such ideas and even question our sensory perception, the conclusion is that there are evident parallels between the way the CNN works and the way the human visual system works. However, there are also some fundamental differences between them - although these differences could possibly be reduced through more complex layers and architectural changes in the CNN design.\nReferences [1] Feature Visualization - What are CNNs learning?\n[2] Interpretation with building blocks\n[3] Neural networks as models of the visual system\n[4] LeCun - Nature Deep Learning Review\n[5] How Conv layers work\n[6] Fundamentals of Deep Neural Networks\n[7] Simple and Complex cells in the Human visual system\n[8] Understanding Convolutions and Pooling in Neural Networks\n[9] Hacking the Brain with Adversarial Images\n","date":1618099200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1618099200,"objectID":"79627741575f586b130a654963a91509","permalink":"https://MSAIL.github.io/post/cnn_human_visual/","publishdate":"2021-04-11T00:00:00Z","relpermalink":"/post/cnn_human_visual/","section":"post","summary":"Richard Feynman once said “What I cannot create I do not understand.” Therefore, to truly understand the human visual system, we must learn to create it. One of the most effective forms of such creation is the Convolutional Neural Network (CNN) system to mimic the human visual system.","tags":["Deep Learning","Computer Vision","Convolutional Neural Network","CNNs","Adversarial Attacks"],"title":"Do convolutional neural networks mimic the human visual system?","type":"post"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Andong Luis Li Zhao\nTopic: Intelligent Politics: How AI Can Improve Our Political Institutions and Systems\nAndong Luis Li Zhao is a Computer Science PhD student at Northwestern University, working in the C3 Lab under Prof. Kristian Hammond. His main research focus is modernizing our political systems through AI. He is currently working on making political information more transparent by building systems that can understand vaguely-articulated questions, obtain the correct data analysis, and identify the most appropriate representation of that analysis.\nWhile his specific focus is currently on providing the public with access to information about our political system, this work is part of a broader goal of improving how society functions through socially-conscious AI grounded in real systems. Too often technologists abdicate their social responsibility by focusing on technical development.
Instead, by developing human-centered AI technology that helps inform people and uncover novel insights, we can focus on the betterment of social, political, and economic systems and their impact.\nYou can view a recording of his talk here.\nSupplemental Resources SCALES: Transforming the Accessibility and Transparency of Federal Courts\nC3 Lab\n","date":1617746400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1617746400,"objectID":"7dcadba630665a58e8cc41538068392d","permalink":"https://MSAIL.github.io/talk/politics_ai_systems_040521/","publishdate":"2021-04-06T22:35:38-04:00","relpermalink":"/talk/politics_ai_systems_040521/","section":"talk","summary":"Speaker(s): Andong Luis Li Zhao","tags":["Politics","Human-centered AI","Transparency","Public Information"],"title":"Intelligent Politics: How AI Can Improve Our Political Institutions and Systems","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Yashmeet Gambhir\nTopic: Harmful Bias in Natural Language Generation\nLarge language models have taken over the NLP scene and have led to a surge of state-of-art development in natural language generation tasks (machine translation, story generation, chatbots, etc.). However, these models have been shown to reflect many harmful societal biases that exist in text around the internet. This talk will go over two major papers studying harmful bias in large LMs: the first identifies and quantifies this bias, the second will attempt to mitigate bias.\nSupplemental Resources Papers:\nThe Woman worked as a Babysitter: On Biases in Language Generation Towards Controllable Biases in Language Generation\n","date":1617141600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1617141600,"objectID":"3964067f171a427a6334d7ebc44d239e","permalink":"https://MSAIL.github.io/talk/harmful_bias_nlg/","publishdate":"2021-03-30T22:35:38-04:00","relpermalink":"/talk/harmful_bias_nlg/","section":"talk","summary":"Speaker(s): Yashmeet Gambhir","tags":["NLP","Natural Language Generation","Neural Network","Language Models","Bias","Ethical AI"],"title":"Harmful Bias in Natural Language Generation","type":"talk"},{"authors":["Kierra Davis"],"categories":null,"content":"10% of the world’s population lives in extreme poverty. This means that around 700 million people are living on less than $1.90 a day (1). And while $1.90 is the standard set by the World Bank defining the international poverty line (2), by no means does it imply that a person earning more than this amount has anywhere near suitable standards of living. Poorer populations in countries such as South Sudan, Bolivia, and India (2) still struggle to access basic features that are usually taken for granted in more developed countries, such as clean water, sanitation, education, and healthcare. 2.1 billion people in the world do not have access to safe drinking water (3). Around 1.8 billion people do not have access to adequate sanitation (4). And a report from the World Health Organization and the World Bank found that 400 million do not have access to essential health services (5).\nMuch work is already being undertaken in attempts to address these key issues, but more is still needed. Artificial intelligence (AI) is one such area where there is potential in alleviating some of the problems accompanying poverty. AI encompasses areas such as natural language processing and robotics, and has seen success in recent years with technology like autonomous vehicles, disease diagnosis, and recommendation algorithms. 
It is a powerful tool that could also be applied in ways that benefit those living with the circumstances and standards of living that typically accompany poverty, especially in the realms of sanitation and education.\nSanitation Issues of sanitation for impoverished regions encompass topics such as lack of clean water, proper toilet systems, and waste management.\nOne initiative attempting to address the problem of unsafe drinking water comes from an organization called Clean Water AI. Developers created a device that uses a “deep learning neural network to detect dangerous bacteria and harmful particles in water” (6). This technology employs computer vision to continuously monitor water quality and observe it at a microscopic level. Widespread adoption of such an application could be beneficial for avoiding illnesses that arise from consumption of unsafe drinking water, such as cholera, typhoid, and polio (11).\nHere\u0026rsquo;s a short video detailing what Clean Water AI does: Lack of proper waste management is another huge issue for underdeveloped regions. Without systems in place to take care of waste, it has become common for garbage to be thrown out in the surrounding areas or tossed into nearby water sources, where it sometimes becomes a playground for the neighboring children (7). While the establishment and utilization of governmental systems and resources could help manage this waste problem, there are also innovations in the AI sector that could be applied to target the problem. Recycling is an important process for these communities, and a robot designed by engineers from Simon Fraser University could help distinguish waste from recyclable materials (8). The growing company CleanRobotics is also taking steps to aid the waste management process with an autonomous system called TrashBot, which uses “robotics, computer vision and artificial intelligence to detect and separate landfill from recyclables” (12). Establishing automated systems for removing trash build-up could make the process easier in underdeveloped regions that lack the proper systems, and this could be crucial for lowering rates of disease caused by unhygienic environments. Such use of artificial intelligence could also potentially improve financial outlooks for the citizens in these areas through reuse and repurposing of recyclable materials.\nCredit: Arvinder Singh Education The standard of education and the availability of educational resources also suffer in impoverished regions. Improved educational resources correlate with reduced poverty, as better education can benefit a region overall and positively impact its economy; “Education promotes economic growth because it provides skills that increase employment opportunities and income” (13)(14). It is no surprise, then, that poorer regions have less access to proper education and that fewer members of the community acquire an education. Lack of nearby schools, teachers, transportation, and time are all contributors to this problem.\nUniversal education is a popular initiative, and there is work being done within the AI field to advance this goal. The rise of the internet has provided opportunities for bringing education into the home with the growing format of online schooling. Having the capability to view pre-taught lessons online is an important development in tackling the aforementioned impediments to education in underdeveloped regions.
While there already exist many online resources that provide information on a variety of subjects for anyone to browse, language barriers still persist. The Presentation Translator from Microsoft is one such tool involving AI that is attempting to address language barriers (15). It uses speech recognition and translation to create subtitles of presentations in the desired language, which users can access through the tool’s affiliated app or browser. Google’s translation application can also be used for reading, writing, and speaking translations. There are many other recent developments in the field of natural language processing that can help tackle language barriers, and further expansion of available languages in these applications would be advantageous for poorer regions with lesser-known tongues.\nCredit: Arvinder Singh Challenges to AI Access There is much more growth happening in the AI field than what has been discussed here, and these technological advancements may be useful for targeting the persisting problems of poverty. But a major obstacle of using these tools to benefit developing countries is that many of these regions do not have the necessary structures in place to even implement such technologies.\nIf specialists are needed for the establishment and maintenance of these AI applications, it is unlikely there will be someone in these areas who can fulfill that role, especially with the lack of access to education. It could become even more costly to hire outside help for maintaining these applications than it already is just to instate them. It must also be determined who will fund these tools – if the creators of these applications are not providing them for free, who will the responsibility fall upon for payment? The impoverished region presumably cannot bear that burden. Governmental assistance is often what is needed in these situations, but these are developing areas and governmental instability is a common feature, as well as the fact that the developing economy often does not yet supply the financial resources needed to establish such systems. Lack of internet access is another deterrent to providing these AI tools. When the poorer regions of Asia and Africa are the ones more likely to have fewer people using the internet (16), and less than 5% of people are online in the poorest countries (10), this creates difficulty in making use of resources such as online education.\nSome initiatives seeking to resolve these obstacles are in the works, but they encounter their own obstacles in the sheer size of the issue and lack of financial resources (like Project Loon, which was attempting to provide worldwide internet access through essential parts of cell towers attached to balloons (10) but has since shut down in January of 2021, stating they “haven’t found a way to get the costs low enough to build a long-term, sustainable business” (17)). The barriers to technological access in underdeveloped areas are widespread and persistent, and it is difficult for one well-meaning company or organization to solve them. 
These significant concerns will need to be addressed if artificial intelligence is to play a role in reducing poverty in our world.\nReferences [1]: World Poverty Statistics via lifewater.org\n[2]: World Bank\n[3]: Water Access Data\n[4]: CDC Wash Statistics\n[5]: New report shows that 400 million do not have access to essential health services via WHO\n[6]: Clean Water AI\n[7]: via PBS\n[8]: Artificial intelligence robot reduces waste contamination at SFU\n[9]: Flight systems via Loon\n[10]: Internet Access\n[11]: Contaminated water can transmit diseases\n[12]: TrashBot via CleanRobotics\n[13]: Poverty Education Statistics\n[14]: Millions could escape poverty by finishing secondary education\n[15]: Microsoft Presentation Translator\n[16]: Internet access growing worldwide but remains higher in advanced economies via Pew Research\n[17]: Loon for all\n","date":1615680000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1615680000,"objectID":"bed14115cf0946c7093ba6a2b273332e","permalink":"https://MSAIL.github.io/post/developing_countries/","publishdate":"2021-03-14T00:00:00Z","relpermalink":"/post/developing_countries/","section":"post","summary":"10% of the world’s population lives in extreme poverty. This means that around 700 million people are living on less than $1.90 a day (1). And while $1.90 is the standard set by the World Bank defining the international poverty line (2), by no means does it imply that a person earning more than this amount has anywhere near suitable standards of living.","tags":["Impacts","Education","Environment"],"title":"What could AI do for developing countries?","type":"post"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Andrew Awad and Drake Svoboda\nTopic: Using Transformers for Computer Vision\nIn recent years we\u0026rsquo;ve seen the rise of transformers in natural language processing research, burgeoning the field to incredible heights. However, these very same transformers were seldom applied to computer vision tasks until recently. Andrew and Drake discussed how transformers have been used in vision tasks in recent years in a presentation covering two papers. The first, An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (via Google Brain), is the \u0026ldquo;Attention is All You Need\u0026rdquo; of vision. Namely, this paper covers how one can construct a vision architecture devoid of the commonly applied CNN and still achieve comparable or better performance results while possibly cutting down computing resources. 
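As a rough illustration of the "image as 16x16 words" idea behind the first paper, the sketch below splits an image into non-overlapping 16x16 patches and flattens each one into a token vector; in the actual ViT these tokens are then linearly projected, given position embeddings, and fed to a standard transformer encoder. The helper name is hypothetical, and this is not code from the paper.

```python
import torch

def patchify(images: torch.Tensor, patch: int = 16) -> torch.Tensor:
    """Split (B, C, H, W) images into a sequence of flattened patch 'tokens'."""
    b, c, h, w = images.shape
    x = images.unfold(2, patch, patch).unfold(3, patch, patch)   # (B, C, H/p, W/p, p, p)
    x = x.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * patch * patch)
    return x                                                      # (B, num_patches, p*p*C)

tokens = patchify(torch.randn(1, 3, 224, 224))
print(tokens.shape)    # torch.Size([1, 196, 768]) -- 196 "words", each of dimension 768
```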
The second paper, End-to-End Object Detection with Transformers (via FAIR), formalizes the object detection task in a unique way that affords the usage of transformers.\nSupplemental Resources Papers:\nAN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE\nEnd-to-End Object Detection with Transformers\n","date":1615327200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1615327200,"objectID":"4ab8cc4c6b55780e52ade57c6274a792","permalink":"https://MSAIL.github.io/talk/image-worth-16x16-words/","publishdate":"2021-03-09T22:35:38-04:00","relpermalink":"/talk/image-worth-16x16-words/","section":"talk","summary":"Speaker(s): Andrew Awad and Drake Svoboda","tags":["Transformers","Image Recognition","Neural Network","Computer Vision","Language Models","Deep Learning"],"title":"Using Transformers for Vision","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Ashwin Sreevatsa\nTopic: Generative Language Modeling for Automated Theorem Proving Presentation\nIn the past decade, deep learning and artificial neural networks have been incredibly successful at a variety of tasks such as computer vision, translation, game playing, and robotics among others. However, there have been less examples of deep learning making progress with reasoning related tasks- such as automated theorem proving, the task of proving mathematical theorems using computer programs. This paper explores the use of transformer-based models to automated theorem proving and presents GPT-f, a deep learning-based automated prover and proof assistant.\nSupplemental Resources Papers:\nGenerative Language Modeling for Automated Theorem Proving Presentation\n","date":1614636000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1614636000,"objectID":"6fda3fa743d82f9d19cd89667fdec252","permalink":"https://MSAIL.github.io/talk/generative_language_modeling/","publishdate":"2021-03-01T22:35:38-04:00","relpermalink":"/talk/generative_language_modeling/","section":"talk","summary":"Speaker(s): Ashwin Sreevatsa","tags":["Automated Prover","Deep Learning","Neural Network","GPT","Language Models"],"title":"Proving Theorems with Generative Language Models","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Topic: Lightning Round \u0026ndash; Assorted AI Topics\nPresenter: Kevin Wang\nHere, we talk about a wide range of topics in AI that haven\u0026rsquo;t received their own slide decks \u0026ndash; the list includes reinforcement learning, optimization, adversarial machine learning, meta learning, active learning, multi-agent systems, and more. 
We hope that showcasing the breadth of AI research inspires you to dig deeper on your own and find what interests you!\nYou can view a recording of this lesson here.\nSupplemental Resources Lesson slides\n","date":1614438000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1614438000,"objectID":"837229da7a92c9f0dc6cc53ff8084e68","permalink":"https://MSAIL.github.io/previous_material/lightning/","publishdate":"2021-02-27T15:00:00Z","relpermalink":"/previous_material/lightning/","section":"previous_material","summary":"Presented by Kevin Wang","tags":["Reinforcement Learning","Deep Learning","Meta Learning","Adversarial ML","Optimization","ML"],"title":"Lightning Round -- Assorted AI Topics","type":"previous_material"},{"authors":["MSAIL"],"categories":null,"content":"Topic: Ethics in AI Research\nPresenter: Kevin Wang\nWe discuss the various ethical problems AI research presents, including well-known problems like bias and weaponized AI and less publicized problems like interpretability and environmental impact of large machine learning models. We also talk about some of the solutions that researchers are attempting to implement and what we can do to contribute.\nYou can view a recording of this lesson here.\nSupplemental Resources Lesson slides\n","date":1614358800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1614358800,"objectID":"dde8f7c95a38e906223149f3473102de","permalink":"https://MSAIL.github.io/previous_material/ethics/","publishdate":"2021-02-26T17:00:00Z","relpermalink":"/previous_material/ethics/","section":"previous_material","summary":"Presented by Kevin Wang","tags":["Ethics","Neural Networks","Deep Learning","ML"],"title":"Ethics","type":"previous_material"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Patrick Morgan\nTopic: Cognitive Load Estimation\nCognitive load has been shown, over hundreds of studies, to be an important variable for understanding human performance. However, establishing practical, non-contact, automated methods for estimating cognitive loads under real-world conditions is an un-solved problem. In this paper, Fridman et. al. proposes two novel vison-based methods for cognitive-load estimation. These methods address a important and challenging problem that has huge implications and can be used to ensure safety in tasks ranging from driving cars to operating machinery.\nSupplemental Resources Papers:\nCognitive Load Estimation in the Wild\n","date":1614117600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1614117600,"objectID":"3be6be4245392b85b8a503b5ce511f29","permalink":"https://MSAIL.github.io/talk/cognitive_load_estimation/","publishdate":"2021-02-23T22:35:38-04:00","relpermalink":"/talk/cognitive_load_estimation/","section":"talk","summary":"Speaker(s): Patrick Morgan","tags":["Machine Learning","Cognitive Science","Pose Estimation","CNN"],"title":"Cognitive Load Estimation","type":"talk"},{"authors":["Vaarun Muthappan"],"categories":null,"content":"Employing forest guards to protect endangered species. Organising volunteers for beach cleanups. Encouraging people to lead greener lifestyles. These are just some of the traditional methods that have been used to protect our environment. But, as Alan Turing once said, \u0026lsquo;At some stage,\u0026hellip; we should have to expect the machines to take control\u0026rsquo; [1] and this couldn’t be more true in these conservation efforts. 
Be it preserving wildlife, cleaning up the environment, or reducing greenhouse gas emissions, AI has started playing an increasingly dominant role as it has become good at numerous tasks, such as visually identifying and tracking poachers or designing the most energy-efficient buildings.\nGoogle, for example, has used AI to increase the efficiency and significantly reduce the carbon footprint of its data centers. Data centers are massive energy consumers, being turned on 24/7 and requiring significant cooling. In 2016 it was reported that data centers used more electricity than Britain, producing roughly the same carbon footprint as the aviation industry [2]. A Google search of the video \u0026ldquo;Despacito\u0026rdquo; activates six to eight of Google\u0026rsquo;s data centers [3]. In line with Google’s effort to stay carbon neutral (which they achieved in 2007 [3]), Google implemented a reinforcement learning algorithm in 2016 to determine what “cooling configurations (in its data centers) would reduce energy consumption” [5]. Reinforcement learning is a subset of machine learning in which the environment sends a ‘state’ and a ‘reward’ to the ‘agent’ - here, the reinforcement learning algorithm - which in turn tries to maximise the reward [4] (a toy sketch of this state-reward-action loop appears a little further below). The data centers using this reinforcement algorithm now consistently use 30% less energy than expected [6].\nA Google data center in Iowa (MIT Technology Review) Big companies are not the only ones taking advantage of AI to be more environmentally friendly. Small startups are using AI to tackle environmental problems as well. In a remote reserve in Ecuador, Topher White - founder of Rainforest Connection (a startup that aims to combat deforestation) - climbs a tree using a small harness and installs a small box containing an old cell phone and solar cells [7]. These phones record the natural rainforest sounds and use AI to detect logging noises, upon which a notification is sent to a ranger. He had started his firm after learning of the severity of deforestation: the number of trees has fallen by almost 50% since the beginning of human civilisation, and over 15 billion trees are still cut down every year [8]. Topher White\u0026rsquo;s first encounter with deforestation came, ironically, on a trip to the rainforests of Borneo, where he was shocked to find illegal loggers just a hundred meters from the ranger station. Monitoring forests with AI that identifies deforestation in satellite imagery often comes too late, since some amount of deforestation must occur before it becomes visible from space. The sound of the logging, however, he recalls, was deafening, which gave him the idea to mount these small phones on the treetops to identify logging. The very next day after the first installation, a logger was apprehended, and the system is now helping prevent deforestation in 10 countries [7].\nTopher White (Rolex/Stefan Walter via FastCompany) In fact, even established industries have turned to AI to reduce costs and increase their sustainability. One example is in Norway, where AI is being used to increase the efficiency of salmon farms [9]. Telenor Research - the research wing of the Norwegian telecoms company Telenor, which is studying the use of AI in fish farms - has come up with a neural network that uses a video feed from underwater cameras to determine when the salmon have finished feeding.
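Before going further into the fish-farming example, here is the toy illustration of the state-reward-action loop promised in the data-centre discussion above. It is a deliberately simplified, stateless bandit-style sketch with an invented energy model and made-up setpoints, far simpler than the controller Google actually deployed.

```python
import random

# Toy stand-in for the cooling problem: pick one of a few setpoints; lower "energy"
# (i.e. higher reward) is better, and the best setpoint is initially unknown.
SETPOINTS = [18, 20, 22, 24]

def energy_used(setpoint: int) -> float:
    # Hypothetical plant model: 21C is optimal, with some measurement noise.
    return (setpoint - 21) ** 2 + random.gauss(0, 0.5)

value = {s: 0.0 for s in SETPOINTS}   # running estimate of each action's reward
count = {s: 0 for s in SETPOINTS}

for step in range(5000):
    if random.random() < 0.1:                      # explore occasionally
        s = random.choice(SETPOINTS)
    else:                                          # otherwise exploit what we know
        s = max(SETPOINTS, key=lambda k: value[k])
    reward = -energy_used(s)                       # the agent tries to maximise reward
    count[s] += 1
    value[s] += (reward - value[s]) / count[s]     # running average of observed reward

print(max(SETPOINTS, key=lambda k: value[k]))      # settles on 20 or 22, the closest to 21
```

The point is only the shape of the loop: try an action, observe a reward, and shift future choices toward whatever the feedback says works best.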
The salmon swim in schools which disperse when food is thrown, but when the salmon are full, there are tiny deviations in this behaviour which the neural network picks up with 80% accuracy. Identifying these cues to better feed the right amount of food is beneficial economically (40% of the cost of salmon farms comes from fish food), and also helps prevent low oxygen levels, algae blooms and high nitrate levels which are toxic to fish, among other problems [12].\nThere are numerous other AI applications in this area: IBM has come up with a machine learning tool named AquaCloud to predict lice outbreaks in Salmon with 70% accuracy using a random forest algorithm [10]. The industries\u0026rsquo; increasingly intensive salmon and rainbow trout fishing encourages lice growth, which in turn makes them unsellable because lice feed on salmon skin [9].\nThe results of AquaCloud: the green line shows the lice outbreak predicted two weeks in advance. (NCE Seafood Innovation Center) The use of AI on fish farms is especially beneficial as fish farms are sustainable and help to reduce the dependency on precious ocean fish populations. Wild salmon are being depleted at an alarming rate, yet Norway’s salmon exports are only rising. Even here, AI is also being used to track the wild salmon population in efforts to sustain its population [9].\nThese examples are but a small subset of AI-related projects that have been done. A brief search of new competitions regarding environmental solutions (most of which use AI, even if the competition implies only a possible technological solution), confirms the increasing interest in sustainability. The Xprize - a nonprofit that runs public competitions - is currently running a $10 million prize rainforest competition to “develop novel technologies to rapidly and comprehensively survey rainforest biodiversity and use that data to improve our understanding of this complex ecosystem” [11]. Microsoft’s AI for Good grants are given to projects that enhance climate, water, and biodiversity. Prince William recently launched the Earthshot prize - a series of annual one million pound prize competitions in five categories such as reviving the oceans, cleaning the air, and protecting nature. There are countless other competitions and the examples discussed in this article are just a taste of the AI environmental projects to come.\nReferences [1]: Ten Famous Quotes about Artificial Intelligence\n[2]: Why Data Centres are the New Frontier in the Fight Against Climate Change\n[3]: The Internet Cloud has a Dirty Secret\n[4]: Introduction to Various Reinforcement Learning Algorithms\n[5]: Google Just Gave Control Over Data Center Cooling to an AI\n[6]: Google 2019 Environmental Report\n[7]: This Network of Microphones Listens for the Chainsaws of Illegal Loggers in the Rainforest\n[8]: Crowther et al., Mapping tree density at a global scale\n[9]: AI Could Help Find Cheaper and Smarter Ways to Raise Fish\n[10]: The Seafood Innovation Cluster\n[11]: Rainforest XPrize\n[12]: Why overfeeding fish is a problem and how to avoid it\n","date":1613952000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1613952000,"objectID":"af6501b9bf4c0c790d3fb7f661b156ca","permalink":"https://MSAIL.github.io/post/ai_environment/","publishdate":"2021-02-22T00:00:00Z","relpermalink":"/post/ai_environment/","section":"post","summary":"Employing forest guards to protect endangered species. Organising volunteers for beach cleanups. Encouraging people to lead greener lifestyles. 
These are just some of the traditional methods that have been used to protect our environment.","tags":["Environment","Climate Change"],"title":"Protecting the environment with AI","type":"post"},{"authors":["MSAIL"],"categories":null,"content":"Topic: Unsupervised Learning\nPresenter: Kevin Wang\nThis lesson went over the unsupervised side of AI, where labels don\u0026rsquo;t exist and models are left on their own to learn useful information. We presented machine learning approaches with and without deep learning that tackle unsupervised problems.\nYou can view a recording of this lesson here.\nSupplemental Resources Lesson slides\n","date":1613833200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1613833200,"objectID":"37761432a0fc47aab46a29dc89a958d9","permalink":"https://MSAIL.github.io/previous_material/unsupervised/","publishdate":"2021-02-20T15:00:00Z","relpermalink":"/previous_material/unsupervised/","section":"previous_material","summary":"Presented by Kevin Wang","tags":["Unsupervised Learning","Clustering","Generative adversarial network","Deep Learning","ML"],"title":"Unsupervised Learning","type":"previous_material"},{"authors":["MSAIL"],"categories":null,"content":"Topic: Natural Language Processing\nPresenter: Kevin Wang\nThis lesson gave a high level overview of NLP (natural language processing) and how AI can be used to work with text and speech data. Points of discussion included recurrent neural networks, LSTMs/GRUs, and GPT-3 and other transformer models.\nYou can view a recording of this lesson here.\nSupplemental Resources Lesson slides\n","date":1613754000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1613754000,"objectID":"2ab1a6fc0070b086fd1d7783e96270b7","permalink":"https://MSAIL.github.io/previous_material/nlp/","publishdate":"2021-02-19T17:00:00Z","relpermalink":"/previous_material/nlp/","section":"previous_material","summary":"Presented by Kevin Wang","tags":["Natural Language Processing","Recurrent neural network","LSTM","Transformer","GPT-3","Deep Learning","ML"],"title":"Natural Language Processing","type":"previous_material"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Nikhil Devraj\nTopic: A Brief Overview of Deep RL in Robotics\nDeep reinforcement learning (RL) has emerged as a promising approach for autonomously acquiring complex behaviors from low level sensor observations. 
Although a large portion of deep RL research has focused on applications in video games and simulated control, which does not connect with the constraints of learning in real environments, deep RL has also demonstrated promise in enabling physical robots to learn complex skills in the real world.\nThis discussion focused predominantly on the following questions: (1) What is deep RL and how does it relate to robotics?\n(2) What are some examples of studies done with Deep RL in robotics?\n(3) What are major challenges faced by researchers who apply deep RL to robotics?\nThis discussion is heavily inspired by Ibarz et al., although it does not dive into that level of detail.\nYou can find the recording of this talk here.\nSupplemental Resources Papers:\nHow to Train Your Robot with Deep Reinforcement Learning - Lessons We\u0026rsquo;ve Learned\nDeep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates\nReinforcement learning in robotics: a survey\nArticles:\nMedium: Reinforcement learning in robotics\nOther:\nBetter slides (in our presenter\u0026rsquo;s opinion)\nDeep RL Towards Robotics by Shane Gu (Google Brain)\nDeep RL in Robotics with NVIDIA Jetson\n","date":1613512800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1613512800,"objectID":"bef970daee88186ddb8cce5e1cfa7888","permalink":"https://MSAIL.github.io/talk/deeprlrobotics_021621/","publishdate":"2021-02-16T22:35:38-04:00","relpermalink":"/talk/deeprlrobotics_021621/","section":"talk","summary":"Speaker(s): Nikhil Devraj","tags":["Reinforcement Learning","Deep Learning","Robotics","Skill Learning"],"title":"Deep RL for Robotics: A Short Overview","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":" Topic: An Overview of Computer Vision\nPresenter: Kevin Wang\nThis lesson gave a basic overview of the computer vision problem space. We discussed historically significant developments including convolutional neural networks, AlexNet, ResNet, and more, and we gave a glimpse at ongoing research.\nYou can view a recording of this lesson here.\nSupplemental Resources Lesson slides\n","date":1613228400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1613228400,"objectID":"24c070e3ed144245018016ba03504db4","permalink":"https://MSAIL.github.io/previous_material/computer_vision/","publishdate":"2021-02-13T15:00:00Z","relpermalink":"/previous_material/computer_vision/","section":"previous_material","summary":"Presented by Kevin Wang","tags":["Convolutional Neural Networks","Deep Learning","Computer Vision","ML"],"title":"Computer Vision","type":"previous_material"},{"authors":["MSAIL"],"categories":null,"content":" Topic: Introduction to AI Research and Basics of Deep Learning\nPresenter: Kevin Wang \nThis lesson introduced the format of lessons for the winter 2021 semester, briefly introducing the topics to be presented in the coming weeks. We then gave a high-level overview of neural networks, which form the basis of deep learning and drive much of AI research today. 
\nYou can view a recording of this lesson here.\nSupplemental Resources Lesson slides\n","date":1613149200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1613149200,"objectID":"db0fa17fe315cf1d3624080a78dfee34","permalink":"https://MSAIL.github.io/previous_material/intro_dl_basics/","publishdate":"2021-02-12T17:00:00Z","relpermalink":"/previous_material/intro_dl_basics/","section":"previous_material","summary":"Presented by Kevin Wang","tags":["Neural Networks","Deep Learning","Introduction","ML"],"title":"Introduction and Basics of Deep Learning","type":"previous_material"},{"authors":["Nikhil Devraj"],"categories":null,"content":"If you\u0026rsquo;re involved in research, you\u0026rsquo;re probably going to give a reading group presentation at some point. Many professors push their PhD students to give talks. Giving these talks helps researchers build the ability to read and understand papers quickly, and the ability to communicate findings effectively.\nVolunteer or be volunTOLD.\n— Prof. David Fouhey\nDr. Fouhey told this joke multiple times during the computer vision reading group last semester and the other professors agreed. It succinctly summarizes the emphasis placed on giving these talks.\nWhat\u0026rsquo;s a reading group? Reading groups regularly meet to discuss topics in research. Most of the time, the group will focus on one specific paper detailing an important finding. In AI, many of these reading groups may be focused on award-winning papers from recent conferences or on methods relevant to the participants\u0026rsquo; research.\nReading groups exist mainly to enrich the participants\u0026rsquo; knowledge. Sometimes the talks will focus on a broader topic as an introduction and sometimes the talks will focus on a specific method in a specific paper. The more niche the audience of the reading group, the more advanced the topic tends to be. At MSAIL, we try to strike a balance between our younger, less-experienced audience (i.e. underclassmen) and our older, experienced audience (upperclassmen, graduate students, etc.).\nChoosing a topic Choosing a topic may be the hardest part of the presentation process. Generally, you can present any topic you want, given that it hasn\u0026rsquo;t already been presented recently. Present on something that grabs your interest immediately, or something you have some familiarity with - it\u0026rsquo;ll make the preparation process more bearable. If you\u0026rsquo;re open to topics and are confident you can adapt, then just go to a conference or journal page and search through some of the accepted papers that catch your eye (at the time of writing, I\u0026rsquo;ve been looking at ICLR 2021 papers).\nIn my opinion, the main question you should ask yourself when you\u0026rsquo;ve identified a potential topic is:\nAm I willing to read about this topic in depth, even to the extent of falling into a rabbit hole?\nYou obviously don\u0026rsquo;t need to know everything about the topic you choose (no one does), but persistence is the key to having a strong presentation. The more comfy you are with the overall subject area, the more natural your presentation will flow and the less likely you are to trip up. (For MSAIL, however, if you\u0026rsquo;re a newcomer and haven\u0026rsquo;t really done a reading group presentation before, we\u0026rsquo;ll help you out!)\nHere are some questions you should ask yourself when looking at a paper or topic that you\u0026rsquo;re about to choose:\nWhy is the topic important? 
What do I hope to get out of it? If it\u0026rsquo;s not clearly important or you don\u0026rsquo;t gain anything from knowledge of the topic itself, you\u0026rsquo;ll just be wasting time. Is this interesting to the naked eye? Important for getting people to attend your talk, and also helpful in gauging whether your audience will stay engaged. If they aren\u0026rsquo;t engaged in the beginning how can they be expected to in the end? Can I learn about this topic within a reasonable amount of time? A \u0026ldquo;reasonable amount of time\u0026rdquo; is generally a week or so. You need to choose papers of reasonable length. We often suggest presenting on conference papers because they\u0026rsquo;re less than 10 pages on average. Longer papers and topics are more feasible down the line when you\u0026rsquo;ve become comfortable with these types of presentations. For the examples in this post, I will go through the process of choosing a topic for one of my previous talks. I\u0026rsquo;ve given plenty of talks on uninteresting topics and papers, but some were received particularly well. I will talk about my process for presenting VideoBERT, which I presented way back in Fall 2019. This was actually my first ever MSAIL talk, and at the time I had only recently become acquainted with AI research. The talk had plenty of faults, which I\u0026rsquo;ll try to use as examples.\nThe central flow diagram from the VideoBERT paper by Sun et al. Reading relevant sources For a specific paper Even if you choose one paper, that paper is probably not the only source you\u0026rsquo;re looking at to understand all the content. When you first read through the paper itself, you should annotate the key points (this is just a common reading skill, but it\u0026rsquo;s easy to forget!) and note the portions that confuse you. Depending on your background, you may or may not be able to finish the first pass. You should aim to have a big picture understanding of the paper, so maybe about 30%. If you can\u0026rsquo;t reach that level on your first read - don\u0026rsquo;t fret. You need to go read some supplementary materials. In particular, any decent paper will reference prior/related work in a section near the introduction - this is where you can dive into their citations and read up on the things that confuse you. Alternatively, I\u0026rsquo;ve found that Medium posts are particularly helpful as well for understanding more basic content.\nRelated Work section in VideoBERT paper. Note the underlined portions here from the Related Work section of the VideoBERT paper. These highlight topics that might be worth searching up. You don\u0026rsquo;t need to dive into everything, but having a general understanding of what cross-modal learning and BERT are would help to better understand this paper.\nIf after all that, you still can\u0026rsquo;t understand 30% of the material in the paper, then I\u0026rsquo;m afraid you probably need to read further on basic material and possibly postpone your talk. I don\u0026rsquo;t expect this to happen because the pool of people who choose to present is self-selecting (as in, you\u0026rsquo;re more likely to want to present in a reading group if you already have basic background), but just in case, don\u0026rsquo;t be afraid to start at the beginning. I too have had to withdraw after signing up for a reading group before because I just did not understand what I was reading at all.\nAfter the first pass, you have an idea of what the paper\u0026rsquo;s central ideas are. 
You can then start outlining what you want to talk about. Any subsequent passes will simply be to reinforce your understanding of the paper.\nFor a broader topic For a broader topic, you should still choose to focus on a few papers in order to narrow the scope of your presentation. If you choose this, you likely have an idea in mind for how you wish to synthesize the ideas in the papers. Knowing this, you should focus your reading based on which points you hope to elucidate most. The process will very much feel like the process in the above section, except you\u0026rsquo;ll spend less time focusing on the intricate details of any one paper and you\u0026rsquo;ll focus more on the key ideas that you can use to build toward whatever main idea you\u0026rsquo;re focusing on.\nIn general, giving these types of talks is difficult. Even professors struggle to present so much content in a clear way. If you intend to give a talk like this, make sure to spend extra time in advance to really nail a cohesive argument. Otherwise, just stick to one paper since usually the time you have is only enough for one.\nAn example of a decent talk that synthesizes ideas in multiple papers is Justin Johnson\u0026rsquo;s lecture on Object Segmentation. This is obviously not a reading group talk and is an entire course lecture - but the principles are relatively similar since the topics presented here are from recent papers. Another good example is the talk Dr. Chai gave us in Fall 2020.\nSome of our own, more tame talks presenting multiple papers include John Day\u0026rsquo;s Brain-Inspired AI talk, Yash Gambhir\u0026rsquo;s Text Summarization talk, and my talk on using reinforcement learning for optimization in COVID-19 problems. If you watch them you\u0026rsquo;ll notice some of the difficulties we had with balancing our content and finishing in time.\nCreating slides Most of the time you\u0026rsquo;ll be preparing slides to assist you in your talk. Organizing your slides properly is the key to getting a good presentation going.\nSomething that helps me is using a general slide outline and then identifying where in the paper I can get the information for a specific section. Then I fill in the sections and occasionally add subsections based on the subtitles in the paper.\nIn general, you want to introduce the following points in any regular paper presentation. You can change the order to suit your preferred flow, but the one presented here works well normally. Note that you can use any number of slides for each section:\nMotivation Why did the authors explore this topic? Who and how does it help solve a big problem? Major Contributions What are the authors proposing or introducing? Make this clear at the beginning. Then your audience will know what to expect. Background What does your audience need to know (at a high level) before you dive into the details of the topic? This is not always necessary, but if you\u0026rsquo;re presenting something technically challenging you may want to briefly introduce this. Method/Theory This is the novel part of the paper. What did the authors do and how did they do it? Results/Experiments How did they validate their methods and what did they compare it to? What are the deliverables? Discussion/Takeaways/Future Work Restate the major contributions. Also, talk about the implications for the future. General Principles You\u0026rsquo;ve probably presented to someone before. 
In that case, you should be well aware of standard principles, but I\u0026rsquo;ll write some in case you aren\u0026rsquo;t:\nLow amount of text on any one slide This is a technical talk. Please don\u0026rsquo;t make your readers lose you. Personally, I tend to put around 2 lines of text on a slide and then explain the rest verbally. Putting less text and explaining it instead helps me better understand the content too! Tables, images, diagrams, and videos wherever possible I don\u0026rsquo;t need to tell you that a picture is worth a thousand words, but they\u0026rsquo;ll help a ton. You can usually just steal these from the paper and its supplementary materials. If they don\u0026rsquo;t have any and you feel that one would be appropriate, don\u0026rsquo;t be afraid to create one! Avoid equations when you can Sometimes the talk is devoted to an equation or the theory you\u0026rsquo;re discussing is heavily reliant on equations (I can\u0026rsquo;t imagine some reinforcement learning papers without Bellman\u0026rsquo;s equation.). But if the paper has a lot of equations, try only to include the most important ones. Take a look at my VideoBERT slides and note that I absolutely did not follow these principles and the above listed structure during that talk. I consider my VideoBERT talk to be of poor quality. Don\u0026rsquo;t worry about the technical content. (Note that this link is Michigan only)\nHere are a few sample slides depicting how I would\u0026rsquo;ve roughly modified my VideoBERT talk to be easier to follow and listen to. I only wrote up to the methods section, because I just wanted to depict some of the principles in action. Again, don\u0026rsquo;t worry about the technical content. (This link is open to everyone)\nAlso, feel free to take a look at this slidedeck for general tips.\nPresenting your slides Presentation is very important for a technical talk. I\u0026rsquo;m pretty sure most presenters don\u0026rsquo;t want to bore their audience. During one reading group a while ago, I delivered a one hour talk that included even professors in the audience. After that talk I didn\u0026rsquo;t receive a single question. I can only speculate whether they got lost, whether we were out of time, or whether I just completely bored them. Let\u0026rsquo;s hope that doesn\u0026rsquo;t happen to you.\nHere are some steps you can take to reduce the chance of losing your audience:\nReiterating the importance of preparing your slides properly. Prepare them as if you were presenting them, and then practice presenting them at least once before your talk. This is a given - you should be speaking and never reading. Don\u0026rsquo;t go on diversions. Save them till the end. Leave room for questions during your presentation. I doubt most people will remember their questions by the end. A good rule might be to ask for questions every 5 minutes. Similarly, you should be gauging understanding as you go along. If the audience can attest to understand what you\u0026rsquo;re saying, you\u0026rsquo;re fine. Speak slowly. I don\u0026rsquo;t know about you, but I\u0026rsquo;d rather have my entire audience understand 80% of my presentation and not finish within time than finishing and not having anyone understand anything. I sometimes break this rule without realizing. There are probably many more principles to follow, but in reading groups these are the ones I\u0026rsquo;ve found to be the most blatant errors that I wish I corrected.\nCan I forgo preparing slides? 
If you don\u0026rsquo;t want to prepare slides, you can either:\nWalk through the paper itself Prepare questions and facilitate a discussion rather than giving a talk I wouldn\u0026rsquo;t advise a newcomer (or anyone, for that matter) to choose the first option. The point of preparing slides is to make material more presentable and to help you, the presenter, understand the paper better. I\u0026rsquo;ve only experienced people presenting straight from the paper when they knew what they were talking about but had last minute obligations come up. For reference, the last two times I saw this done were from a student who wrote the paper he was presenting on, and from a senior research scientist at Google Brain. It is generally okay, however, to supplement your slides during your talk by briefly visiting the paper to discuss something like a figure or a table, or to answer questions.\nThe second option is far more feasible, and at MSAIL we actually recommend this format. Discussion questions help the audience engage with the material more. However, good discussions usually occur around people with background, so be wary of your audience. You\u0026rsquo;ll usually be presenting something in addition to the questions - for example, last Winter we had a discussion about using vision to analyze CT scans for the purpose of detecting COVID-19. All we did was play a video prepared by another organization and then discussed it in detail. This is perfectly fine, given that you have an interesting topic.\nGoing forward Yeah, preparing to present at a reading group is a lot of work the first time around. After a while, you\u0026rsquo;ll be comfortable enough with both approaching novel technical content and with your presentation skills, so you\u0026rsquo;ll be able to take shortcuts and structure things as you wish. You\u0026rsquo;ll also just become faster. In the long term, this skill will certainly help you as a researcher.\nGone are the days when the MSAIL Admin team was scrambling to prepare entire talks within 5 hours on the day of (we were quite notorious for this during the \u0026lsquo;19-\u0026lsquo;20 school year). This happened because we had very few speakers, but we\u0026rsquo;re much better off now. I hope you never prepare a talk within such a constraint because I can guarantee that the talk will fail miserably. The further along you go as a researcher, the later you\u0026rsquo;ll be able to start preparing reading group presentations, but you\u0026rsquo;ll still wish you started earlier.\nIf you\u0026rsquo;re ready to try your hand at a talk, sign up with your reading group(s). For University of Michigan students, here are some reading groups you might be interested in:\nGroup Name Page MSAIL Reading Group https://msail.github.io/join/ Computer Vision Reading Group https://sites.google.com/umich.edu/cv-reading-group/home Natural Language Processing Reading Group https://lit.eecs.umich.edu/reading_group.html Reach out to us at [email protected] if you\u0026rsquo;re interested in giving a talk at MSAIL or for help with preparing a talk. 
Happy presenting!\n","date":1610841600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1610841600,"objectID":"42caca902c662b18bab13cc56fe09b35","permalink":"https://MSAIL.github.io/post/presenting_reading_group/","publishdate":"2021-01-17T00:00:00Z","relpermalink":"/post/presenting_reading_group/","section":"post","summary":"If you\u0026rsquo;re involved in research, you\u0026rsquo;re probably going to give a reading group presentation at some point. Many professors push their PhD students to give talks. Giving these talks helps researchers build the ability to read and understand papers quickly, and the ability to communicate findings effectively.","tags":["Reading Group","Presenting","Meta-skills"],"title":"Presenting in a Reading Group","type":"post"},{"authors":["Kevin Wang"],"categories":null,"content":" Say you are a researcher in autonomous vehicles, and as part of a project for a funder, you want to make an algorithm to automatically tell if a frame of video has a sidewalk or not. So you get a dataset of driver-view video that you recently collected, hand it off to your labelers, and a month or so later train the model on the labeled data. You get about 90% accuracy on both your train and test sets, and so you push the code to your team’s GitHub, tell your team you have a working algorithm, and celebrate. When the team uses it, however, they make an unpleasant discovery: the algorithm does not detect sidewalks at all! Your troubleshooting brings you to that labeled dataset, and you hold your head in your hands: 90% of the frames in the dataset don’t have sidewalks in them! Instead of learning meaningful decision-making rules, your model has learned to predict that all images do not contain sidewalks.\nOr let’s say you’re working at a security company, and your team is tasked with creating a facial recognition algorithm. You spend a few months labeling a dataset of human faces, where the distribution of ethnicities in the dataset matches the distribution of ethnicities in the United States. According to the US Census from 2010, the dataset would be roughly 75% “white alone” faces and 13% “black or African American” faces (as seen in [1], Tab. P3). Then you tuck the dataset away into some folder somewhere and move on to designing the algorithm. When it comes to test time, your algorithm does significantly worse at recognizing dark-skinned faces. This reflects findings from a paper published in 2018 by Buolamwini and Gebru [2], where three commercial facial recognition algorithms trained on two benchmark datasets consisting of roughly 80% / 85% “white” faces had an astonishing 35% error rate on dark-skinned female faces. The consequences of an algorithm biased like this are not hypothetical; less than an hour away from Ann Arbor, a dark-skinned man in Farmington Hills was wrongfully arrested based on a mistake made by a facial recognition algorithm, forcing him to spend thousands of dollars on an attorney to defend himself for a crime he did not commit [3].\nObama being upsampled into a light-skinned man (The Verge) In either case, you have a critical failure, and your project manager firmly tells you that the product needs to ship in three weeks. One possible solution is to add more data to “balance out” the different image classes: have your labelers find and label more images with sidewalks, or have them find and label more images of people of color. 
But labeling the original dataset took months, and with the degree of imbalance that you have, it would take a similar amount of resources to successfully address that. And you only have three weeks! Even if you had more time, how do you tell your boss that you now need the team to spend more resources on more labeling work? If only there were a way that you could have balanced the dataset earlier!\nBoth of these are motivating examples for the problem I am working on in my lab, the Data Elements from Video using Impartial Algorithm Tools for Extraction lab at the University of Michigan Transportation Research Institute (UMTRI-DEVIATE). This is a subproblem we are trying to solve as part of our broader objective to create tools for the Federal Highway Administration (FHWA) to assist labelers with video data. Specifically in this subproblem, we want to save FHWA labelers time with future research tasks, and we want the models they train to avoid bias. We call this problem the “human sampling problem”.\nLabeling lots of data is costly. Hiring labelers requires money, and those labelers need time to accurately label a dataset. When it comes to a non-trivial image/video labeling task (for example, drawing bounding boxes around cell phones in videos of someone driving a vehicle), you also need to create a rubric so that all of your labelers are on the same page.\nIn addition, you may not want to use all that data. If a large dataset is skewed, an algorithm trained on that dataset will in turn be skewed towards making a particular decision. This is the root of both of the failures in the examples presented above – a sidewalk detector trained on skewed data cannot recognize sidewalks, and a facial recognition algorithm trained on skewed data struggles to identify the faces of dark-skinned women.\nConsidering both of these problems, one possible remedy would be to sample a subset of your original dataset before handing it over to the labelers such that the subset we select is as informative as possible. We hypothesize that a subset with maximum information will not be skewed, i.e., we think a dataset of 10 cats and 90 dogs will probably have less information than a dataset with 50 cats and 50 dogs. We have framed this as maximizing the gain in information per unit of cost, measured in labeler time.\nIn short, we want to reduce labeling time as much as possible while minimizing loss in accuracy by sampling a subset of our dataset that contains as much information as possible.\nAt this point, we are still investigating possible solutions. We are currently testing our ideas on images, with plans to transfer our work to applications on video data. Here’s what we’ve already done and what we’re working on right now:\nSanity checks: We first confirmed that models trained on skewed datasets tend to have lower accuracy. Using the Stanford Cars and the FGVC-Aircraft datasets, we looked at samples of 50 of one class versus 150 of the other and compared the results of a model trained on those samples to a model trained on 100 of each. 
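To make the setup concrete, here is a minimal, purely illustrative sketch of this kind of skewed-versus-balanced comparison using scikit-learn on synthetic data. It is not our actual pipeline (which used deep models on the image datasets above), and every number and helper name in it is made up for demonstration:

```python
# Illustrative sketch only: compare a classifier trained on a skewed subset
# (50 vs. 150 examples) with one trained on a balanced subset (100 vs. 100).
# Synthetic data stands in for the image datasets; this is not the DEVIATE code.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=4000, n_features=20, n_informative=5,
                           class_sep=0.8, flip_y=0.05, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)

def sample_subset(n_class0, n_class1):
    # Draw a labeled subset with a fixed number of examples per class.
    idx0 = rng.choice(np.flatnonzero(y_pool == 0), n_class0, replace=False)
    idx1 = rng.choice(np.flatnonzero(y_pool == 1), n_class1, replace=False)
    idx = np.concatenate([idx0, idx1])
    return X_pool[idx], y_pool[idx]

for name, counts in [("skewed 50/150", (50, 150)), ("balanced 100/100", (100, 100))]:
    X_sub, y_sub = sample_subset(*counts)
    clf = LogisticRegression(max_iter=1000).fit(X_sub, y_sub)
    acc = balanced_accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: balanced test accuracy = {acc:.3f}")
```

In our setting the classifier is a deep image model and the subset is what would actually be sent to labelers, but the comparison logic is the same.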
We confirmed that the model trained on the balanced sample tended to outperform the model trained on the skewed samples.\nSome examples from the Stanford Cars dataset The rest of our techniques essentially attempt to learn some representation of each example image in our dataset, then find a sampling strategy such that the examples we sample into our subset are as dissimilar as possible.\nSSD: We tried using SSD (sum of squared differences) as a distance between every pair of images in our dataset and sampling a subset that maximizes the sum of SSDs between images in the dataset. We did not find a correlation between sum of SSDs in a dataset and model accuracy, because SSD does not encode relevant information. Moreover, we would have to solve an NP-hard problem to accurately maximize the sum of SSDs in a sampled subset, making this strategy unviable on large datasets.\nKeypoint/descriptors-based methods: Algorithms like SIFT, SURF, and ORB detect some number of keypoints that describe an image, along with descriptors for each of those keypoints. We first tried getting a single value for each sample image by summing the squared sums of descriptor components, but this did not produce correlations because we lost the spatial information from the keypoints. We are currently using k-means clustering to separate example images into k different image classes determined via unsupervised learning.\nThe scale-invariant feature transform (SIFT) algorithm in action. Edge detection: Edges can imply what content is in an image, and a dataset whose images have varying amounts of horizontal and vertical edges may have more information than a dataset whose images’ horizontal and vertical edges are not as varied. We are currently investigating ways to implement and evaluate this idea.\nUnsupervised feature encoding with neural networks: If we have a neural network (possibly a sequence model or a GAN) attempt to perform a task, it will learn features of the data it performs that task on; we can use those learned features to describe our examples. This is likely appropriate for our problem, but we have not started investigating this yet. I personally think that this approach has merit simply because of how much recent work there has been on this subject, but I wonder if using a learning algorithm to address a problem with a learning algorithm may simply reintroduce the original problem into the solution.\nFor DEVIATE, successfully solving the human sampling problem means we’ll have found a way to reduce labeler workload while also mitigating problems with accuracy and bias that accompany skewed datasets. Our hope is that creating such a tool will let organizations train more effective models with fewer resources, allowing them to invest effort elsewhere in their projects.\nIf working on this problem interests you, consider following up with me ( [email protected]) or the advisor for the subteam working on this problem, Dr. Carol Flannagan ( [email protected]).\nCitations: [1]\tU.S. Government, “2010 Census Summary File 1,” U.S. Census Bureau, United States, SF1/10-4 RV, 2010.\n[2] J. Buolamwini and T. Gebru, “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification”, presented at the 1st Conference on Fairness, Accountability, and Transparency, New York, NY, United States, February 23-24, 2018.\n[3]\tA. Winn, “A Local Case Amplifies Opposition To Facial Recognition Technology”, Hour Detroit, Sept. 14, 2020. [Online]. Available: https://www.hourdetroit.com. [Accessed Oct. 
25, 2020].\n","date":1609545600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1609545600,"objectID":"69412b8d30e722978d9920c52bd4fc83","permalink":"https://MSAIL.github.io/post/conserving_label_musicer/","publishdate":"2021-01-02T00:00:00Z","relpermalink":"/post/conserving_label_musicer/","section":"post","summary":"Say you are a researcher in autonomous vehicles, and as part of a project for a funder, you want to make an algorithm to automatically tell if a frame of video has a sidewalk or not.","tags":["Dataset Bias"],"title":"Conserving Labeling Resources and Mitigating Bias from Skewed Datasets","type":"post"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Prof. Joyce Chai\nTopic: Situated Language Processing Towards Interactive Task Learning\nProf. Chai discussed some of her research on situated language processing, which is a field describing the interaction of language and visual/motor processing in embodied, situated, and language-for-action research traditions. This research also aims to unite converging and complementary evidence from behavioral, neuroscientific, neuropsychological and computational methods.\nYou can find a recording of her talk here.\nSupplemental Resources Language to Action: Towards Interactive Task Learning with Physical Agents\nSituated Language and Embodied Dialogue (SLED) Group\n","date":1606860000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1606860000,"objectID":"41ead330fc9bdcbd70be1c2aa9e2e65e","permalink":"https://MSAIL.github.io/talk/chai_120120/","publishdate":"2020-12-01T18:00:00-04:00","relpermalink":"/talk/chai_120120/","section":"talk","summary":"Speaker(s): Prof. Joyce Chai","tags":["Natural Language Processing","Situated Language Processing","Embodied AI","Grounded Language","Joyce Chai"],"title":"Faculty Talk: Situated Language Processing and Embodied Dialogue","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Dr. Sindhu Kutty\nTopic: Prediction Markets, Recommender Systems, Fairness in AI\nDr. Kutty discussed some of her research on prediction markets, recommender systems, and fairness in AI. The talk mostly focused on some derivations for prediction markets, such as scoring functions for data collection.\nDr. Kutty asked us not to post the recording for this talk. We apologize for any inconvenience.\nSupplemental Resources Scoring Rules\nInformation Aggregation in Exponential Family Markets\n","date":1605650400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1605650400,"objectID":"9f183718edcc51e5dfed886c688c5f8a","permalink":"https://MSAIL.github.io/talk/kutty_111720/","publishdate":"2020-11-17T18:00:00-04:00","relpermalink":"/talk/kutty_111720/","section":"talk","summary":"Speaker(s): Dr. Sindhu Kutty","tags":["Fairness","Prediction Markets","Sindhu Kutty","Recommender Systems"],"title":"Faculty Talk: Prediction Markets and More","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Yashmeet Gambhir\nTopic: Text Summarization with Deep Learning\nYash discussed text summarization, where the goal is to\u0026hellip; summarize text. More specifically, he discussed abstractive summarization, of which the goal is to generate novel sentences using natural language generation techniques. One such method for doing this is using pointer-generator networks. After discussing PGNs, he went on to discuss a paper describing extreme summarization to combat model hallucination for this task. 
The papers discussed are linked below.\nYou can find a recording of his talk here. Unfortunately this only includes the second half of the talk about abstractive summarization, because we forgot to record starting at the beginning.\nSupplemental Resources Get To The Point: Summarization with Pointer-Generator Networks\nOn Faithfulness and Factuality in Abstractive Summarization\n","date":1605045600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1605045600,"objectID":"7798e6a29e3c4df8124f97067baeebfc","permalink":"https://MSAIL.github.io/talk/textsummarization_111020/","publishdate":"2020-11-10T18:00:00-04:00","relpermalink":"/talk/textsummarization_111020/","section":"talk","summary":"Speaker(s): Yashmeet Gambhir","tags":["Transformers","Pointer-Generator Networks","Natural Language Processing","Text Summarization","Natural Language Generation"],"title":"Text Summarization with Deep Learning","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Topic: Convolutional Neural Networks\nPresenter: Kevin Wang\nThis lesson covered convolutional neural networks, which serve as the backbone for many modern-day deep learning applications. Most commonly, convolutional neural networks are used for vision tasks (although not exclusively).\nSupplemental Resources Slides on CNNs\nSlides on Neural Networks\n","date":1604534400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1604534400,"objectID":"2640836ac41575a069394bff4e6ca51d","permalink":"https://MSAIL.github.io/previous_material/cnn/","publishdate":"2020-11-05T00:00:00Z","relpermalink":"/previous_material/cnn/","section":"previous_material","summary":"Presented by Kevin Wang","tags":["Convolutional Neural Network","CNN","Neural Nets","ML","Computer Vision","Deep Learning"],"title":"Convolutional Neural Networks","type":"previous_material"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): John Day\nTopic: Brain-inspired AI\nJohn started his talk by discussing brain-inspired AI in general, which involves studies like neural modeling, artificial consciousness, spiking neural nets, and cognitive architectures. Afterwards, he focused on deep predictive coding networks.\nYou can find a recording of his talk here.\nSupplemental Resources Brain-inspired AI\nDeep Predictive Coding Networks for Video Prediction and Unsupervised Learning\nDeep Predictive Coding Network for Object Recognition\n","date":1604440800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1604440800,"objectID":"c037a102bdd46d876c4d49ce9f20fcb0","permalink":"https://MSAIL.github.io/talk/brain_insp_110320/","publishdate":"2020-11-03T18:00:00-04:00","relpermalink":"/talk/brain_insp_110320/","section":"talk","summary":"Speaker(s): John Day","tags":["Brain-inspired AI","Cognitive Architecture","Deep Predictive Coding","Computer Vision"],"title":"Brain-Inspired AI","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Prof. John Laird\nTopic: Cognitive Architecture\nProf. Laird discussed cognitive architecture - more specifically, he discussed SOAR, a cognitive architecture that his research group has been developing and maintaining for decades. SOAR is simultaneously a theory of cognition and an architecture, with the ultimate goal of enabling general intelligent agents to realize the full cognitive capabilities of humans.\nYou can find a recording of his talk here.\nSupplemental Resources SOAR Group Homepage\nThe SOAR Cognitive Architecture, written by John E. 
Laird\nA similar talk given to MSAIL back in 2011!\n","date":1603836000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1603836000,"objectID":"8cf2f88e1e6595347a067b174955e578","permalink":"https://MSAIL.github.io/talk/laird_102720/","publishdate":"2020-10-27T18:00:00-04:00","relpermalink":"/talk/laird_102720/","section":"talk","summary":"Speaker(s): Professor John Laird","tags":["Cognitive Architecture","SOAR","John Laird"],"title":"Faculty Talk: Cognitive Architecture","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Andrew Awad\nTopic: Image-to-Image Translation with Conditional Adversarial Networks\nAndrew presented on a CVPR 2017 paper by Isola et al. This paper aimed to investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. The networks in question were CGANs, proposed earlier by Mirza et al. Isola et al. also proposed the PatchGAN discriminator.\nA certain forgetful lead admin forgot to record this discussion. We apologize for the inconvenience.\nSupplemental Resources Paper(s):\nImage-to-Image Translation with Conditional Adversarial Networks\nCGAN\nOther:\nPatchGAN\npix2pix GitHub Page\nMedium article on pix2pix\n","date":1603231200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1603231200,"objectID":"164e78d089c81834e3cac9a4bbc072c8","permalink":"https://MSAIL.github.io/talk/cgan_102020/","publishdate":"2020-10-20T18:00:00-04:00","relpermalink":"/talk/cgan_102020/","section":"talk","summary":"Speaker(s): Andrew Awad","tags":["GAN","Conditional Generative Adversarial Networks","Computer Vision","Image Translation","im2im","pix2pix"],"title":"Image-to-Image Translation with Conditional Adversarial Networks","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Nikhil Devraj\nTopic: Reinforcement Learning for COVID-19 Optimization Problems\nIf you\u0026rsquo;re not living under a rock, you know that COVID-19 is ravaging current-day society and requires monumental efforts on all scales, be it from individuals or from entire governments. In particular, governments play a major role in helping control the spread of COVID-19 by instituting policies to help with efforts such as lockdown enforcement and vaccine distribution. 
During this talk Nikhil talked about some previously proposed approaches to modeling such policy problems as control problems that could be solved with reinforcement learning.\nYou can find a recording of this discussion here.\nSupplemental Resources Paper(s):\nCOVID-19 Pandemic Cyclic Lockdown Optimization Using Reinforcement Learning\nOptimal policy learning for COVID-19 prevention using reinforcement learning\nVacSIM: LEARNING EFFECTIVE STRATEGIES FOR COVID-19 VACCINE DISTRIBUTION USING REINFORCEMENT LEARNING\nArticle(s):\nReinforcement learning for Covid-19: Simulation and Optimal Policy\n","date":1602626400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1602626400,"objectID":"01ff83653b235e73fe4e4f88298cbda9","permalink":"https://MSAIL.github.io/talk/rlcovid_101320/","publishdate":"2020-10-13T18:00:00-04:00","relpermalink":"/talk/rlcovid_101320/","section":"talk","summary":"Speaker(s): Nikhil Devraj","tags":["Reinforcement Learning","COVID-19","Deep Q-Network","Q-Learning","Optimization","Control"],"title":"Reinforcement Learning Applied to COVID-19 Optimization Problems","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Topic: Applications of Regression Presenter: Robert Aung\nThis lesson covered some interesting applications of regression.\nSupplemental Resources Colab notebook\n","date":1602115200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1602115200,"objectID":"feee9af1e86a169eedb32865c00d3134","permalink":"https://MSAIL.github.io/previous_material/regression_2/","publishdate":"2020-10-08T00:00:00Z","relpermalink":"/previous_material/regression_2/","section":"previous_material","summary":"Presented by Robert Aung","tags":["Linear Regression","Gradient Descent","Introduction","ML","Application"],"title":"Regression, Part 2 (Application)","type":"previous_material"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Patrick Morgan\nTopic: Human-Centered Autonomous Vehicles\nPatrick focused on discussing Human-Centered Autonomous Vehicle Systems: Principles of Effective Shared Autonomy. This paper proposes that we should build autonomous vehicles with humans in mind, and that getting humans and artificial intelligence systems to collaborate effectively is an achievable and worthy goal. In this light, they propose a human-centered paradigm for engineering shared autonomy systems in the car that erase the boundary between human and machine in the way the driving task is experienced. The researchers propose a 7 principle engineering design process that will make autonomous vehicles safer and greatly lower the cost of development. 
This discussion also ended up touching on other fundamental issues in AI, such as data privacy.\nYou can find a recording of this discussion here.\nSupplemental Resources Paper(s):\nHuman-Centered Autonomous Vehicle Systems: Principles of Effective Shared Autonomy\n","date":1602021600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1602021600,"objectID":"f0eca5e52d979e191cfb8b132015849a","permalink":"https://MSAIL.github.io/talk/av_100620/","publishdate":"2020-10-06T18:00:00-04:00","relpermalink":"/talk/av_100620/","section":"talk","summary":"Speaker(s): Patrick Morgan","tags":["Autonomous Vehicles","Robotics","Human-centered AI","AI Safety"],"title":"Human-Centered Autonomous Vehicles","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Kevin Wang\nTopic: AlphaZero and its Impact on Chess\nThe world was astounded when AlphaGo first played Lee Sedol in Go, winning 4 matches to 1. DeepMind subsequently released AlphaGo Zero, an iteration on AlphaGo that beat it 100 games to 0. Going even further, they released AlphaZero, which learned how to play games such as Shogi and Chess. Kevin, an avid chess enthusiast, wanted to discuss what this meant for the Chess world.\nYou can find a recording of this discussion here.\nSupplemental Resources Paper(s):\nAssessing Game Balance with AlphaZero: Exploring Alternative Rule Sets in Chess\nMastering Chess and Shogi by Self-Play with a General RL Algorithm\nVideo(s):\nVideo From DeepMind\nAlphaZero VS AlphaZero || THE PERFECT GAME\n","date":1601416800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1601416800,"objectID":"9ded540f887d048074d8ff623ae47655","permalink":"https://MSAIL.github.io/talk/alphazero_chess_092920/","publishdate":"2020-09-29T18:00:00-04:00","relpermalink":"/talk/alphazero_chess_092920/","section":"talk","summary":"Speaker(s): Kevin Wang","tags":["Reinforcement Learning","AlphaZero","Chess","AlphaGo"],"title":"AlphaZero and its Impact on the World of Chess","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Topic: Theory and Implementation of Regression\nPresenter: Robert Aung\nThis lesson covered the theory behind and implementation of a linear regression model with gradient descent.\nYou can view a recording of this lesson here.\nSupplemental Resources Lesson slides\nLesson Colab notebook\n","date":1600905600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1600905600,"objectID":"4241dd26505a3e380b9791ad1a53d2a9","permalink":"https://MSAIL.github.io/previous_material/regression_1/","publishdate":"2020-09-24T00:00:00Z","relpermalink":"/previous_material/regression_1/","section":"previous_material","summary":"Presented by Robert Aung","tags":["Linear Regression","Gradient Descent","Introduction","ML"],"title":"Regression, Part 1 (Theory and Implementation)","type":"previous_material"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Prof. Michael Wellman\nTopic: Strategic Reasoning in Dynamic Environments\nDr. Wellman presented on his group\u0026rsquo;s research, which generally focuses on game theory and multi-agent reasoning in dynamic environments. Much of his work lies in the domain of markets and commerce. You can find his group\u0026rsquo;s page here.\nProf. Wellman asked us not to post the recording publicly. A recording is available within our Slack channel, so please search in there if you\u0026rsquo;re interested. 
We apologize for any inconvenience.\nSupplemental Resources Article(s):\nEmpirical Game-Theoretic Analysis\nComputational Finance\nWorld with Autonomous Agents\nVideo(s):\n\u0026ldquo;Artificially Intelligent Decision Makers in the Real World\u0026rdquo; with Michael Wellman \u0026amp; Bill Powers\n","date":1600812000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1600812000,"objectID":"6f9879747500d58fdd26c287b9e84d4f","permalink":"https://MSAIL.github.io/talk/wellman_092220/","publishdate":"2020-09-22T18:00:00-04:00","relpermalink":"/talk/wellman_092220/","section":"talk","summary":"Speaker(s): Professor Michael Wellman","tags":["Reinforcement Learning","Michael Wellman","Strategic Reasoning Group","Game Theory"],"title":"Faculty Talk: Strategic Reasoning in Dynamic Environments","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Topic: Classification with Logistic Regression\nPresenter: Kevin Wang\nKevin taught some members of MSAIL some basics of machine learning, culminating in building out a classification model for MNIST from scratch using logistic regression. Classification is the process of categorizing data into predetermined groups, and logistic regression is a means to build a classifier (though certainly not the only means to do so).\nWe couldn\u0026rsquo;t find the recording of this session, but be sure to check out the supplemental materials.\nSupplemental Resources Lesson slides\nLesson Colab notebook\nBasic matrix operations\nBasic Python programming\n","date":1600300800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1600300800,"objectID":"e81b67dd6c2c953097c644d750037676","permalink":"https://MSAIL.github.io/previous_material/classification_logreg/","publishdate":"2020-09-17T00:00:00Z","relpermalink":"/previous_material/classification_logreg/","section":"previous_material","summary":"Presented by Kevin Wang","tags":["Classification","Logistics Regression","Introduction","ML","Linear Classifier"],"title":"Classification with Logistic Regression","type":"previous_material"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Sean Stapleton\nTopic: GPT-3 and its Implications\nIn recent years, we’ve seen natural language processing (NLP) performance accelerate drastically across a number of tasks, including text completion, machine translation, and question answering. Much of this performance gain has been attributed to two trends in the NLP community, namely the introduction of transformers, and the increase in model size (and consequent need for intense computational power). Capitalizing on these trends, OpenAI recently released a transformer-based model called GPT-3 with 175 billion parameters, that was trained on roughly 500 billion tokens scraped from the internet. This MSAIL discussion focused predominantly on three questions addressed in the paper:\nDoes a substantial increase in model size actually lead to better performance in downstream tasks? Can language models effectively model intelligent and adaptable thought? What are the biases and risks associated with training a language model on the entire internet? 
Sean also covered the transformer and GPT-3 model architectures, though the focus of the discussion was not on this aspect of the paper.\nYou can find the recording of this talk here.\nSupplemental Resources Papers:\nLanguage Models are Few-Shot Learners (Brown et al.)\nArticles:\nThe Illustrated Transformer\nThe Illustrated GPT-2\n","date":1600207200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1600207200,"objectID":"7b271409ec365353689a1f93fd5c9aca","permalink":"https://MSAIL.github.io/talk/gpt3_091520/","publishdate":"2020-09-09T22:35:38-04:00","relpermalink":"/talk/gpt3_091520/","section":"talk","summary":"Speaker(s): Sean Stapleton","tags":["GPT-3","Natural Language Processing","Transformers","Language Models","BERT","OpenAI"],"title":"The Trend Towards Large Language Models","type":"talk"},{"authors":["MSAIL"],"categories":null,"content":"Speaker(s): Daniel Preotiuc-Pietro and Mayank Kulkarni\nTopic: Applied Named Entity Recognition\nDuring this talk, two senior research scientists from Bloomberg\u0026rsquo;s AI Group presented on some of their work on Applied Named Entity Recognition. Their discussion focused on applications of NER at Bloomberg, multi-domain NER, and analysis of NER using temporal data.\nDue to restrictions from Bloomberg, we were unable to record this session. We apologize for the inconvenience.\nSupplemental Resources Papers from Bloomberg AI:\nTemporally-Informed Analysis of Named Entity Recognition, ACL 2020\nMulti-Domain Named Entity Recognition with Genre-Aware and Agnostic Inference, ACL 2020\nA Semi-Markov Structured Support Vector Machine Model for High-Precision Named Entity Recognition, ACL 2019\nSlides from Bloomberg AI:\nMulti-Domain NER\n","date":1599688800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1599688800,"objectID":"f561dfc79210630f618d6d6bce2f1d2e","permalink":"https://MSAIL.github.io/talk/ner_090920/","publishdate":"2020-09-09T22:35:38-04:00","relpermalink":"/talk/ner_090920/","section":"talk","summary":"Speaker(s): Daniel Preotiuc-Pietro and Mayank Kulkarni","tags":["Named Entity Recognition","NER","Bloomberg"],"title":"Bloomberg Tech Talk: Applied Named Entity Recognition","type":"talk"},{"authors":null,"categories":null,"content":"MSAIL dates back to 2008 and has a history of many students who went on to do great things in the tech industry. We\u0026rsquo;d like to showcase as many of our alumni here as possible. 
If you are a previous member of MSAIL and would like to be featured on this page, please let the admin team know!\nRobert Aung Fall 2020 Andrew Awad Winter 2021 Kierra Davis Winter 2021 Nikhil Devraj Fall 2020 Isaac Fung Winter 2022 Abhay Shakhapur Winter 2024 Andrew Li Winter 2024 Anthony Liang Winter 2020 (BS), Winter 2021 (MS) Patrick Morgan Winter 2021 Ashwin Sreevatsa Winter 2022 Sean Stapleton Fall 2020 Kevin Wang Fall 2021 Nina Li Winter 2023 Chloe Snyders Winter 2023 William Wang Winter 2024 ","date":-62135596800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":-62135596800,"objectID":"c61bd17da54e6598c71ab43046ec8671","permalink":"https://MSAIL.github.io/alumni/","publishdate":"0001-01-01T00:00:00Z","relpermalink":"/alumni/","section":"","summary":"MSAIL Alumni","tags":null,"title":"Alumni","type":"page"},{"authors":null,"categories":null,"content":"Contact us at [email protected]!\nYou can find our professors\u0026rsquo; and admin team\u0026rsquo;s individual emails here.\n","date":-62135596800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":-62135596800,"objectID":"6d99026b9e19e4fa43d5aadf147c7176","permalink":"https://MSAIL.github.io/contact/","publishdate":"0001-01-01T00:00:00Z","relpermalink":"/contact/","section":"","summary":"Contact for MSAIL","tags":null,"title":"Contact Us","type":"page"},{"authors":null,"categories":null,"content":"Add yourself to the email list on MCommunity by logging in and then clicking \u0026ldquo;Join Group\u0026rdquo; in the top-left corner of the panel (see the image below). After doing this, you will receive emails from us.\nAlso, don\u0026rsquo;t forget to join our Slack group!\nLeaving MSAIL If you later wish to leave, simply go to the MCommunity page and click \u0026ldquo;Resign\u0026rdquo;, which will be in the top left corner in place of \u0026ldquo;Join Group\u0026rdquo; in the above photo.\n","date":-62135596800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":-62135596800,"objectID":"8795eba9bd19b87b0616d17da3c16590","permalink":"https://MSAIL.github.io/join/","publishdate":"0001-01-01T00:00:00Z","relpermalink":"/join/","section":"","summary":"Here we describe how to join MSAIL.","tags":null,"title":"Joining MSAIL","type":"page"},{"authors":null,"categories":null,"content":"Building a community to debate all the exciting news, from novel models to disruptive applications to potential regulation. Hone your presentation skills on challenging topics with optional paper presentations.\nMeeting Times Thursday from 7:00 - 8:00 at NUB 1567 Previous Material Winter 2021\n","date":-62135596800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":-62135596800,"objectID":"cbbb9a333220e2d9aaa04e916be12c4f","permalink":"https://MSAIL.github.io/education/","publishdate":"0001-01-01T00:00:00Z","relpermalink":"/education/","section":"","summary":"Building a community to debate all the exciting news, from novel models to disruptive applications to potential regulation. Hone your presentation skills on challenging topics with optional paper presentations.\nMeeting Times Thursday from 7:00 - 8:00 at NUB 1567 Previous Material Winter 2021","tags":null,"title":"ML Discussion","type":"page"},{"authors":null,"categories":null,"content":"[Last updated on 2017-10-02]\nTerminology Cosine: The lead admin.\nSine: A member of the admin team.\nArticle 0: MSAIL seeks to increase its members' Machine Learning knowledge. Indeed. 
MSAIL will maintain a list of Members, defined as participants in at least one major communications channels such as Slack or a mailing list. MSAIL will strive to discuss Machine Learning literature regularly throughout each school year. Upon joining the organization, all members agree not to undermine the purpose or mission of MSAIL. It falls to the Cosine to execute this Article. Article 1: Legislative powers lie in the Sines. The body of Sines possesses complete power: a simple majority of Sines will suffice to amend this Constitution, to appoint a Cosine, to override any decision of the Cosine, or to act in place of the Cosine. Sines who are informed of a decision to be made but offer no prompt response are not counted in the denominator of the \"simple majority\". The Sines will not allow the number of Cosines to fall below 1. By default, the Sines lie dormant and all powers and responsibilities lie with the Cosine. Article 2: Executive powers lie in the Cosine. Specifically, the Cosine is responsible for the day-to-day functioning of MSAIL, and to that end may act and delegate arbitrarily within Constitutional bounds. It is traditional for the Cosine to distribute significant short-term tasks among the Sines. The Cosine will report to the Sines, and the Sines will vote promptly on presented issues. The Cosine may appoint new Sines with the advice and consent of the old Sines. The Cosine will not allow the number of Sines to fall below 3. The Cosine shall break ties among the Sines. The Cosine may be a Sine. Article 3: MSAIL may also have Faculty Mentors. A faculty mentor can help us just by association. MSAIL may mention their names on its official communications. MSAIL shall inform each Faculty Mentor of its activities via brief weekly emails with no reply needed. Faculty mentors can point us to literature. Faculty mentors are always welcome to share cool papers. MSAIL may also request recommendations within a specific topic. A Faculty Mentor need not take on additional responsibilities, but may choose to do so if requested. The Sines and Cosine will endeavor to use Faculty Mentors’ time effectively. Faculty Mentors need not make any administrative decisions. Faculty Mentors are always welcome but never obliged to attend MSAIL meetings. Article 4: MSAIL is committed to inclusivity and transparency. MSAIL will not discriminate based on academic affiliation(s) or lack thereof, age, breastfeeding or lack thereof, career status, color, criminal record, disability or lack thereof, ethnicity, employment status, gender expression, gender identity, HIV status, marital status, national origin, parental status, personal association, physical features such as height and weight, political activity, pregnancy or lack thereof, race, religion or lack thereof, sex, sexual orientation, socioeconomic background, or veteran status. MSAIL will, moreover, actively include members, no matter the properties listed above. The creation and maintenance of an inclusive environment touches all aspects of our activities, from communications to recruitment and from discussion topics to leadership positions. MSAIL’s motto will be \"the more, the merrier\"; a corollary is that information such as planning discussions will be available to all members, so long as it does not conflict with privacy concerns. 
To rephrase a subset of the above in a university-required formula: MSAIL is committed to a policy of equal opportunity for all persons and does not discriminate on the basis of race, color, national origin, age, marital status, sex, sexual orientation, gender identity, gender expression, disability, religion, height, weight, or veteran status in its membership or activities unless permitted by university policy for gender specific organizations. It falls to the Cosine to execute this Article. Article 5: This Constitution may be amended by the Sines. Any Member may propose an amendment’s text. See Article 1 for voting details. Article 6: This Constitution will be MSAIL's supreme law. (Modulo University Policy.) Article 7: This Constitution will be ratified by the Sines. This Constitution will be re-written and ratified at least once in any 1024-day window. See Article 1 for voting details. ","date":-62135596800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":-62135596800,"objectID":"d14bb2106071d93c9047331396e5072a","permalink":"https://MSAIL.github.io/constitution/","publishdate":"0001-01-01T00:00:00Z","relpermalink":"/constitution/","section":"","summary":"MSAIL's Constitution","tags":null,"title":"MSAIL Governance: A Brief Constitution","type":"page"},{"authors":null,"categories":null,"content":"Winter 2024: Building MLP (Multi-layer Perceptron) from scratch using Python Learn how to use python Learn about various machine learning techniques (Perceptron, GD) Have more questions? Contact [email protected]! Meeting time: Projects 7-8pm at NUB 1567 Fall 2023: Car accident prediction Learn how to predict the next car accident in a specific area of the U.S. You can check out the project material here Have more questions? Contact kevindw@ or andson@ ","date":-62135596800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":-62135596800,"objectID":"6087c0ef875554f4409ac52928d79279","permalink":"https://MSAIL.github.io/projects/","publishdate":"0001-01-01T00:00:00Z","relpermalink":"/projects/","section":"","summary":"Winter 2024: Building MLP (Multi-layer Perceptron) from scratch using Python Learn how to use python Learn about various machine learning techniques (Perceptron, GD) Have more questions? Contact [email protected]! Meeting time: Projects 7-8pm at NUB 1567 Fall 2023: Car accident prediction Learn how to predict the next car accident in a specific area of the U.","tags":null,"title":"Projects","type":"page"},{"authors":null,"categories":null,"content":"This page will be updated periodically with new resources. Keep an eye out!\nFeel free to send any resource requests that you\u0026rsquo;d like listed here to [email protected].\nFurthermore, if you are a resource owner and would like links to your work removed from this page, contact us at the email above.\nTable of Contents Learning Concepts Intro to Deep Learning Computer Vision Natural Language Processing Reinforcement Learning Campus Involvement Conferences Past Talks Medium/Blog articles Meta-skills and Mindset About this page This page serves as a conglomeration of resources for MSAIL members to gain access to knowledge regarding AI. 
This includes technical content, ways to get involved on campus, opportunities for research, jobs, and networking, and other advice regarding involvement in the field.\nWe intend to continually update this page with more resources and will occasionally restructure it to provide better depth.\nLearning Concepts It\u0026rsquo;s important to know that AI is a broad field spanning the past century. We shouldn\u0026rsquo;t simply think of AI as \u0026ldquo;deep learning\u0026rdquo;, because it\u0026rsquo;s not. We\u0026rsquo;re listing resources regarding the most popular topics first because we understand they\u0026rsquo;re likely what initially grabbed your interest, but do note that there\u0026rsquo;s plenty of research being carried out on topics that we\u0026rsquo;ve probably never even heard of.\nSam Finlayson from Harvard/MIT has a fantastic page on resources you can use to dive into ML. It\u0026rsquo;s quite advanced but provides a good starting point for those looking for a comprehensive list. We\u0026rsquo;ll be using some of these resources in our own lists.\nKeep in mind that just because we link a large list doesn\u0026rsquo;t mean you should be going through the entire thing. There\u0026rsquo;s way too much stuff to look at. The links within links within links are all meant to provide options; choose a specific topic that interests you (for example, generative models) and slowly explore it.\nIntro to Deep Learning Understanding deep learning to a satisfactory degree requires working familiarity (but not necessarily mastery) with the following prerequisite topics:\nVector and Matrix Operations Calculus (Partial and total derivatives, gradients) Probability Basic statistics Check out the slides/recordings from our education sessions, where some of these concepts are explained.\nFor a more thorough introduction to the field, we suggest the following resources:\nFast AI\u0026rsquo;s Deep Learning course EECS 498/598 - Deep Learning for Computer Vision @ University of Michigan CS231n - Convolutional Neural Networks for Visual Recognition @ Stanford University In particular, the second listed resource (EECS 498/598) is a course offered here every Fall. It is very similar to CS231n, so just one of the two would be satisfactory. These two courses are extremely well designed and we recommend them as a starting point.\nComputer Vision EECS 598 and CS231n (linked above) are a good start for getting involved with vision. These courses are heavily focused on deep learning, so if you want to learn about some of the methods that were popular before deep learning took off, try materials from EECS 442.\nHere\u0026rsquo;s a massive resource list called Awesome Computer Vision\nThe reason we link those courses above is because they cover a good breadth of topics in vision. You will know what you need to in order to make proper searches for your own research and projects once you\u0026rsquo;ve gone through one of them.\nNatural Language Processing NLP also has a few courses worth looking at:\nCS224n - NLP with Deep Learning @ Stanford University EECS 598 - NLP with Deep Learning @ University of Michigan This class was a seminar. It was less focused on educational material and more focused on contemporary research. So scroll this page if you\u0026rsquo;re looking for interesting papers. Other resources:\nAwesome NLP Similar to Awesome CV linked in the previous section, this is a massive list of resources to get acquainted with the field. 
NLP Textbook by Jacob Eisenstein Jay Alammar\u0026rsquo;s Blog The Annotated Transformer This is a really nice guide going through a \u0026ldquo;line by line\u0026rdquo; implementation of Attention is All You Need (the seminal transformer paper) Reinforcement Learning Here are some courses you can look at to learn about reinforcement learning:\nCS 285 - Deep Reinforcement Learning @ UC Berkeley CS 234 - Reinforcement Learning @ Stanford University This links to a series of lecture videos because the course website was taken down for some reason. Open AI put a ton of effort into creating a comprehensive resource for people to learn RL:\nSpinning Up in Deep RL Other resources:\nAwesome RL Similar to Awesome CV/NLP linked in the previous sections, this is a massive list of resources to get acquainted with the field. Resources from DeepMind Sutton \u0026amp; Barto - Intro to RL Textbook This is the de facto textbook for people to self-study RL. We can\u0026rsquo;t guarantee that this link will always work, but if it\u0026rsquo;s taken down, \u0026ldquo;Sutton and Barto\u0026rdquo; is all you\u0026rsquo;d need to search up. Lilian Weng\u0026rsquo;s Lil\u0026rsquo; Log Her blog contains more than just RL, but her RL posts are thorough and accessible (provided you have a basic ML background). In general, we really recommend blog posts from professionals because they\u0026rsquo;re easy to read yet rife with information. Finally, we also have the reinforcement learning theory course (EECS 598) here at Michigan. However, materials aren\u0026rsquo;t posted online and enrollment is, as usual, heavily limited - so we recommend looking at the materials from other courses in the meantime.\nWe\u0026rsquo;re in the process of adding more learning resources!\nCampus involvement Getting involved during your time on campus is the fastest way to learn about AI. You should definitely take relevant courses, but we also recommend joining a research group or relevant team to get more practice and familiarity with relevant topics. This includes participating in MSAIL-sponsored projects. MSAIL has a reading group, but it\u0026rsquo;s hard to balance all the different subfields of AI in just ~15 sessions in a semester. We highly recommend joining reading groups for more depth regarding the topics you\u0026rsquo;re interested in.\nClasses Last updated on 9/23/21. This is a listing of courses related to AI (from a technical perspective) here at the University of Michigan. We tried to be as comprehensive as possible, but there are far too many courses to sift through, so we may have missed some. More information is available on the LSA course guide.\nAs a side note, there are many courses that we can argue are related to AI from a less technical perspective. For example, take classes in the cognitive sciences - the development of human-like AI is heavily motivated by studies in this field. We leave these classes out for brevity\u0026rsquo;s sake.\nUndergraduate-level Classes:\nClass Code Class Name Last offered? EECS 442 Computer Vision F21 EECS 445 Machine Learning F21 EECS 492 Intro to AI F21 EECS 467 Autonomous Robotics (MDE) F21 LING 441 Introduction to Computational Linguistics F21 ROB 102 Introduction to Robotics Algorithms and Programming F21 EECS 367 Introduction to Autonomous Robotics F20 ROB 464 Hands-on Robotics W20 Graduate-level Classes:\nClass Code Class Name Last offered? 
EECS 505 Computational Data Science and Machine Learning F21 EECS 545 Machine Learning F21 EECS 592 Foundations of AI F21 EECS 542 Advanced Topics in Computer Vision F21 EECS 551 Matrix Methods for Signal Processing, Data Analysis, and Machine Learning F21 EECS 595/LING 541 Natural Language Processing F21 ROB 535 Self Driving Cars: Perception and Control F21 AEROSP 567 AEROSP 567: Inference Estimation and Learning F21 EECS 568/ROB 530 Mobile Robotics W21 EECS 692 Advanced Artificial Intelligence W21 EECS 504 Foundations of Computer Vision F20 Special Topics Classes:\nEach of these classes is listed under EECS 498, 598 or both - you will need to select the relevant section when registering.\nClass Code Class Name Last offered? EECS 498 Principles of Machine Learning F21 EECS 598 Randomized Numerical Linear Algebra for Machine Learning F21 EECS 498 Intro to Algorithmic Robotics F21 EECS 498 Conversational AI F21 EECS 598 Human-Computer Interaction F21 EECS 498 Intro to Natural Language Processing W21 EECS 498/598 Ethics for AI and Robotics W21 EECS 498/598 Applied Machine Learning for Affective Computing W21 EECS 598 Statistical Learning Theory W21 EECS 598 Unsupervised Visual Learning W21 EECS 598 Adversarial Machine Learning W21 EECS 598 Systems for AI W21 EECS 556/598 Image Processing W21 EECS 498/598 Deep Learning for Computer Vision F20 EECS 598 Reinforcement Learning Theory F20 EECS 598 Deep Learning for NLP F20 EECS 598 Situated Language Processing for Embodied AI Agents W20 EECS 598 The Ecological Approach to Vision W20 Research Labs A list of professors is available on the Michigan AI Lab faculty page. Each professor\u0026rsquo;s lab will be linked to on their homepage.\nReading Groups Right now, we are aware of three relevant reading groups that allow for public participation. Many research labs have internal reading groups as well.\nGroup Name Page Computer Vision Reading Group https://sites.google.com/umich.edu/cv-reading-group/home Natural Language Processing Reading Group https://lit.eecs.umich.edu/reading_group.html Reinforcement Learning Reading Group https://sites.google.com/umich.edu/rl-reading-group Conferences AI researchers nowadays usually write papers with the goal of submitting them to a conference. Conferences are a great way to meet other people in the field, get feedback on your work, and discuss ideas about further research. These are usually good places to start if you want to look for recent literature on a given topic.\nListed below are links to the pages of some highly-ranked machine learning conferences. Note that most of these links are for specific years (mostly 2021 because that\u0026rsquo;s when this list was first made); use Google to look up pages for other years.\nThis is not intended to be a complete list of AI/ML conferences. Usually, your first paper will be at a lower-tier conference as you get used to publishing.\nGeneral ML Conferences ICML\nICLR\nNeurIPS\nComputer Vision Conferences CVPR\nICCV\nNatural Language Processing Conferences EMNLP\nACL\nNAACL\nPrevious Talks MSAIL has hosted a number of talks over the years given by Michigan professors and students. You might find them insightful in providing a survey of current and past research.\nMedium/Blog Articles Medium articles are nice because they tend to be much shorter and easier to read than bona fide research papers. However, not all Medium articles are high quality. As such, we have provided a selection of high-quality Medium articles and blog posts in the vein of a Medium article. 
MSAIL also publishes its own blog.\nUnder construction \u0026ndash; we will be adding more articles in the near future!\nIntroductory \u0026ldquo;A Gentle Introduction to Machine Learning Concepts (Robbie Allen)\u0026rdquo;\nComputer Vision Overview of GANS (Zak Jost): \u0026ldquo;Part 1 (GAN)\u0026rdquo;, \u0026ldquo;Part 2 (DCGAN)\u0026rdquo;, \u0026ldquo;Part 3 (InfoGAN)\u0026rdquo;\n\u0026ldquo;Understanding Variational Autoencoders (VAES) (Joseph Rocca)\u0026rdquo;\nNLP \u0026ldquo;The Illustrated Transformer (Jay Alammar)\u0026rdquo;\n\u0026ldquo;How GPT3 Works - Visualizations and Animations (Jay Alammar)\u0026rdquo;\n\u0026ldquo;Transformer Architecture: The Positional Encoding\u0026rdquo;\nUncategorized \u0026ldquo;How Graph Neural Networks (GNN) work: introduction to graph convolutions from scratch (Nikolas Adaloglou)\u0026rdquo;\n\u0026ldquo;Understanding Latent Space in Machine Learning (Ekin Tiu)\u0026rdquo;\n\u0026ldquo;Papers we love\u0026rdquo; repository\nMeta-skills and Mindset Conducting Research Richard Hamming: “You and your research”\nMichael Nielsen: “Principles of Effective Research”\nJohn Schulman: “An Opinionated Guide to ML Research”\nReading research papers https://web.stanford.edu/class/ee384m/Handouts/HowtoReadPaper.pdf\nhttps://www.eecs.harvard.edu/~michaelm/postscripts/ReadPaper.pdf\nGiving Talks https://web.eecs.umich.edu/~cscott/talk_advice.htm\nGrad School Mor Harschol-Balter (CMU): Applying to CS PhD programs Eric Gilbert (Umich CSE, SI): Advice to his students Sebastian Ruder (Deepmind): 10 Tips for Research and a PhD Ronald Azuma (UNC): “So long, and thanks for the Ph.D.!” Andrej Karpathy (Tesla, OpenAI): A Survival Guide to a PhD Philip Guo (UCSD): Advice for early-stage Ph.D. students Andrey Kurenkov (Stanford): Lessons Learned the Hard Way in Grad School ","date":-62135596800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":-62135596800,"objectID":"c6625328e7b2e36b114847f299065f54","permalink":"https://MSAIL.github.io/resources/","publishdate":"0001-01-01T00:00:00Z","relpermalink":"/resources/","section":"","summary":"Resources for MSAIL members.","tags":null,"title":"Resources for MSAIL Members","type":"page"},{"authors":[],"categories":[],"content":"VideoBERT C. Sun et al. Google Research Presented by: Nikhil Devraj\nMSAIL Motivation Representations of video data generally capture only low-level features and not semantic data BERT performs really well on language modeling tasks Contributions Combined ASR, Vector Quantization, and BERT to learn high-level features over long time spans in video tasks A first step in the direction of learning high-level joint representations Background BERT Pretrained language model used to generate a probability distribution of tokens Obtained by training model on \u0026ldquo;masking\u0026rdquo; task Supervised Learning Expensive to get labeled data Short term events in video data Unsupervised Learning Learns from unlabeled data Normal approaches used latent variables (i.e. 
GAN, VAE) differ from BERT Self-supervised Learning More on self supervised learning\nCross-Modal Learning Synchronized audio and visual signals allow them to supervise each other Use ASR as a source of crossmodal supervision Instructional Video Datasets Papers used LMs to analyze these videos with manually provided data Datasets were too small Method Omitted the rest You get the principles I\u0026rsquo;m getting at though right?\n","date":-62135596800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":-62135596800,"objectID":"9e04e6fc6f1f1d87a0af5a917b8016ab","permalink":"https://MSAIL.github.io/slides/videobert/","publishdate":"0001-01-01T00:00:00Z","relpermalink":"/slides/videobert/","section":"slides","summary":"VideoBERT presentation if it were prepared better","tags":[],"title":"VideoBERT Revised","type":"slides"},{"authors":null,"categories":null,"content":"MSAIL is a large organization of over 400 members, and as such requires our Admin Team to help keep operations running smoothly. Our Admin Team is advised by faculty mentors involved in AI research at the University of Michigan.\nFaculty Mentors These astounding professors make MSAIL possible with their advice and support. Sindhu Kutty Lecturer III, EECS skutty@ Laura Balzano Associate Professor, EECS girasole@ Danai Koutra Assistant Professor, EECS dkoutra@ Admin Team Our administrative team is responsible for planning MSAIL's activities and holding the organization together. Our constitution codifies our roles. The following details our current admin team's roles and emails (at umich.edu). Alex Ji ajys@ Asad Khan asadk@ Michael Moffatt mmoffatt@ Terry Shi weiceica@ Andrew Carlson andson@ Hemil Shah heshah@ Chinmay Purushottam chinzo@\tAman Nagesh amannag@\tJaiden Schraut jaidenxs@\tHere are some of our graduated members.\n","date":-62135596800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":-62135596800,"objectID":"58368f3a1e154afb0e61aff0e497556d","permalink":"https://MSAIL.github.io/aboutus/","publishdate":"0001-01-01T00:00:00Z","relpermalink":"/aboutus/","section":"","summary":"MSAIL Mentors and Admin Team","tags":null,"title":"Who are we?","type":"page"}]