diff --git a/README.md b/README.md index 06b5634b50..2f729cf979 100644 --- a/README.md +++ b/README.md @@ -1,47 +1,42 @@ # Machine Learning for .NET -[ML.NET](https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet) is a cross-platform open-source machine learning framework which makes machine learning accessible to .NET developers with the same code that powers machine learning across many Microsoft products, including Power BI, Windows Defender, and Azure. +[ML.NET](https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet) is a cross-platform open-source machine learning (ML) framework for .NET. -ML.NET allows .NET developers to develop/train their own models and infuse custom machine learning into their applications using .NET, even without prior expertise in developing or tuning machine learning models. It provides data loading from files and databases, enables data transformations and includes many ML algorithms. +ML.NET allows developers to easily build, train, deploy, and consume custom models in their .NET applications without requiring prior expertise in developing machine learning models or experience with other programming languages like Python or R. The framework provides data loading from files and databases, enables data transformations, and includes many ML algorithms. -ML.NET enables machine learning (ML) tasks like classification (for example, text classification, sentiment analysis), regression (for example, price prediction), and many other ML tasks such as anomaly detection, time-series-forecast, clustering, ranking, etc. +With ML.NET, you can train models for a [variety of scenarios](https://docs.microsoft.com/dotnet/machine-learning/resources/tasks), like classification, forecasting, and anomaly detection. -## Getting started with machine learning by using ML.NET +You can also consume both TensorFlow and ONNX models within ML.NET which makes the framework more extensible and expands the number of supported scenarios. 
-If you are new to machine learning, start by learning the basics from this collection of resources targeting ML.NET: +## Getting started with machine learning and ML.NET -[Learn ML.NET](https://dotnet.microsoft.com/learn/ml-dotnet) +- Learn more about the [basics of ML.NET](https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet). +- Build your first ML.NET model by following our [ML.NET Getting Started tutorial](https://dotnet.microsoft.com/learn/ml-dotnet/get-started-tutorial/intro). +- Check out our [documentation and tutorials](https://docs.microsoft.com/dotnet/machine-learning/). +- See the [API Reference documentation](https://docs.microsoft.com/dotnet/api/?view=ml-dotnet). +- Clone our [ML.NET Samples GitHub repo](https://github.com/dotnet/machinelearning-samples) and run some sample apps. +- Take a look at some [ML.NET Community Samples](https://github.com/dotnet/machinelearning-samples/blob/main/docs/COMMUNITY-SAMPLES.md). +- Watch some videos on the [ML.NET videos YouTube playlist](https://aka.ms/mlnetyoutube). -## ML.NET Documentation, tutorials and reference +## Roadmap -Please check our [documentation and tutorials](https://docs.microsoft.com/en-us/dotnet/machine-learning/). - -See the [API Reference documentation](https://docs.microsoft.com/en-us/dotnet/api/?view=ml-dotnet). - -## Sample apps - -We have a GitHub repo with [ML.NET sample apps](https://github.com/dotnet/machinelearning-samples) with many scenarios such as Sentiment analysis, Fraud detection, Product Recommender, Price Prediction, Anomaly Detection, Image Classification, Object Detection and many more. 
- -In addition to the ML.NET samples provided by Microsoft, we're also highlighting many more samples created by the community showcased in this separate page [ML.NET Community Samples](https://github.com/dotnet/machinelearning-samples/blob/main/docs/COMMUNITY-SAMPLES.md) - - -## ML.NET videos playlist at YouTube - -The [ML.NET videos playlist](https://aka.ms/mlnetyoutube) on YouTube contains several short videos. Each video focuses on a particular topic of ML.NET. +Take a look at ML.NET's [Roadmap](ROADMAP.md) to see what the team plans to work on in the next year. ## Operating systems and processor architectures supported by ML.NET -ML.NET runs on Windows, Linux, and macOS using [.NET Core](https://github.com/dotnet/core), or Windows using .NET Framework. +ML.NET runs on Windows, Linux, and macOS using .NET Core, or Windows using .NET Framework. + +ML.NET also runs on ARM64, Apple M1, and Blazor Web Assembly. However, there are some [limitations](docs/project-docs/platform-limitations.md). -64 bit is supported on all platforms. 32 bit is supported on Windows, except for TensorFlow and LightGBM related functionality. +64-bit is supported on all platforms. 32-bit is supported on Windows, except for TensorFlow and LightGBM related functionality. -## ML.NET Nuget packages status +## ML.NET NuGet packages status [![NuGet Status](https://img.shields.io/nuget/vpre/Microsoft.ML.svg?style=flat)](https://www.nuget.org/packages/Microsoft.ML/) ## Release notes -Check out the [release notes](docs/release-notes) to see what's new. +Check out the [release notes](docs/release-notes) to see what's new. You can also read the [blog posts](https://devblogs.microsoft.com/dotnet/category/ml-net/) for more details about each release. 
## Using ML.NET packages @@ -52,7 +47,7 @@ Once you have an app, you can install the ML.NET NuGet package from the .NET Cor dotnet add package Microsoft.ML ``` -or from the NuGet package manager: +or from the NuGet Package Manager: ``` Install-Package Microsoft.ML ``` @@ -65,7 +60,7 @@ Daily NuGet builds of the project are also available in our Azure DevOps feed: ## Building ML.NET (For contributors building ML.NET open source code) -To build ML.NET from source please visit our [developers guide](docs/project-docs/developer-guide.md). +To build ML.NET from source please visit our [developer guide](docs/project-docs/developer-guide.md). [![codecov](https://codecov.io/gh/dotnet/machinelearning/branch/main/graph/badge.svg?flag=production)](https://codecov.io/gh/dotnet/machinelearning) @@ -81,7 +76,9 @@ To build ML.NET from source please visit our [developers guide](docs/project-doc ## Release process and versioning -Check out the [release process documentation](docs/release-notes) to understand the different kinds of ML.NET releases. +Major releases of ML.NET are shipped once a year with the major .NET releases, starting with ML.NET 1.7 in November 2021 with .NET 6, then ML.NET 2.0 with .NET 7, etc. We will maintain release branches to optionally service ML.NET with bug fixes and/or minor features on the same cadence as .NET servicing. + +Check out the [Release Notes](docs/release-notes) to see all of the past ML.NET releases. ## Contributing @@ -89,15 +86,15 @@ We welcome contributions! Please review our [contribution guide](CONTRIBUTING.md ## Community -Please join our community on Gitter [![Join the chat at https://gitter.im/dotnet/mlnet](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/dotnet/mlnet?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) +- Join our community on [Discord](https://aka.ms/dotnet-discord). 
+- Tune into the [.NET Machine Learning Community Standup](https://dotnet.microsoft.com/live/community-standup) every other Wednesday at 10AM Pacific Time. This project has adopted the code of conduct defined by the [Contributor Covenant](https://contributor-covenant.org/) to clarify expected behavior in our community. For more information, see the [.NET Foundation Code of Conduct](https://dotnetfoundation.org/code-of-conduct). - ## Code examples -Here is a snippet code for training a model to predict sentiment from text samples. You can find complete samples in [samples repo](https://github.com/dotnet/machinelearning-samples). +Here is a code snippet for training a model to predict sentiment from text samples. You can find complete samples in the [samples repo](https://github.com/dotnet/machinelearning-samples). ```C# var dataPath = "sentiment.csv"; @@ -125,16 +122,11 @@ var prediction = predictionEngine.Predict(new SentimentData }); Console.WriteLine("prediction: " + prediction.Prediction); ``` -A cookbook that shows how to use these APIs for a variety of existing and new scenarios can be found [here](docs/code/MlNetCookBook.md). ## License -ML.NET is licensed under the [MIT license](LICENSE) and it is free to use commercially. +ML.NET is licensed under the [MIT license](LICENSE), and it is free to use commercially. ## .NET Foundation -ML.NET is a [.NET Foundation](https://www.dotnetfoundation.org/projects) project. - -There are many .NET related projects on GitHub. - -- [.NET home repo](https://github.com/Microsoft/dotnet) - links to 100s of .NET projects, from Microsoft and the community. +ML.NET is a part of the [.NET Foundation](https://www.dotnetfoundation.org/projects). diff --git a/ROADMAP.md b/ROADMAP.md index 9dbc2d8016..d7556716c0 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -1,21 +1,131 @@ # The ML.NET Roadmap -The goal of the ML.NET project is to make .NET developers great at machine learning. This document describes the plan for the project. 
+The goal of ML.NET is to democratize machine learning for .NET developers. This document outlines the current roadmap for the ML.NET framework and APIs. -ML.NET is a community effort and we welcome community feedback on our plans. The best way to give feedback is to open an issue in this repo. +To see the plans for ML.NET tooling, check out the [Model Builder repo](https://github.com/dotnet/machinelearning-modelbuilder/issues/1707). -We also invite contributions. The [up-for-grabs issues](https://github.com/dotnet/machinelearning/issues?q=is%3Aopen+is%3Aissue+label%3Aup-for-grabs) on GitHub are a good place to start. +## Feedback and contributions -## Goals through June 30, 2020 -### Test stability -Continuous integration builds currently have a 30% pass rate. We aim to get this pass rate up to at least 80%. +ML.NET is a community effort and we welcome community feedback on our plans. The best way to give feedback is to [open an issue](https://github.com/dotnet/machinelearning/issues/new/choose) in this repo. -### Streaming metrics -Currently, the way ML.NET computes [metrics](https://docs.microsoft.com/dotnet/machine-learning/resources/metrics) is memory-intensive. We will compute metrics in a streaming fashion instead, thereby reducing memory consumption. +We also invite contributions. The [good first issue](https://github.com/dotnet/machinelearning/labels/good%20first%20issue) and [up-for-grabs issues](https://github.com/dotnet/machinelearning/issues?q=is%3Aopen+is%3Aissue+label%3Aup-for-grabs) on GitHub are a good place to start. You can also help work on any of the features we've listed below or work on features that you want to add to the framework. -### Multivariate anomaly detection -ML.NET already supports [univariate anomaly detection](https://docs.microsoft.com/dotnet/api/microsoft.ml.timeseriescatalog.detectanomalybysrcnn?view=ml-dotnet), but we will add the ability to detect anomalies in multiple variables over time. 
+## Goals through June 2022 -### ONNX Runtime exportability +The following sections outline the major areas and features we plan to work on in the next year. -We will expand the number of ML.NET transforms and estimators that are exportable to the [ONNX Runtime](https://github.com/Microsoft/onnxruntime). +Note that this is an aspirational list of what we hope to get to. Many of the items on this list will require more investigation and design, which can result in changes in our plans. We may have to cut things as we go, or we may be able to add more things. + +As we prioritize, cost, and continue planning, we will try to keep the Roadmap up to date to reflect our progress and learnings. + +### Keep docs, samples, and repo up to date + +We heard your feedback loud and clear that our outdated docs and samples were a top pain point when learning and using ML.NET. + +We have invested more resources into content development to make sure our Docs stay relevant and that we add documentation for new features faster as well as add more relevant samples. + +You can file issues and make suggestions for ML.NET documentation in the [dotnet/docs repo](https://github.com/dotnet/docs) and for ML.NET samples in the [dotnet/machinelearning-samples](https://github.com/dotnet/machinelearning-samples) repo. + +We are also taking steps to organize the [dotnet/machinelearning](https://github.com/dotnet/machinelearning) repo and update our triage processes so that we can address your issues and feedback faster. Issues will be linked to version releases in the [Projects](https://github.com/dotnet/machinelearning/projects) section of the repo so you can see what we're actively working on and when we plan to release. + +### Get on the .NET release schedule + +ML.NET is .NET, and to make it feel more like a part of .NET, we've decided to align with the .NET release schedule. + +This means that we will ship our next version of ML.NET (v1.7.0) with .NET 6.0 in November 2021. 
+ +While we'll have major releases of ML.NET once a year with the major .NET releases, we will maintain release branches to optionally service ML.NET with bug fixes and/or minor features on the same cadence as .NET servicing. + +### Deep learning + +This past year we've been working on our plan for deep learning in .NET, and now we are ready to execute that plan to expand ML.NET's deep learning support. + +As part of this plan, we will: + +1. Make it easier to consume ONNX models in ML.NET using the ONNX Runtime (RT) +2. Fully support and productionize [TorchSharp](https://github.com/xamarin/TorchSharp) for building neural networks in .NET +3. Build a bridge between TorchSharp and ML.NET + +Read more about the deep learning plan and leave your feedback in this [tracking issue](https://github.com/dotnet/machinelearning/issues/5918). + +### Move from System.Drawing to ImageSharp + +Starting in .NET 6, System.Drawing.Common will only be supported on Windows (you can read more about this decision in this [design doc](https://github.com/dotnet/designs/blob/main/accepted/2021/system-drawing-win-only/system-drawing-win-only.md)). + +To ensure ML.NET works great on all platforms, we will replace System.Drawing with the [ImageSharp](https://github.com/SixLabors/ImageSharp) graphics library. + +*Related issues*: + +- [#3154](https://github.com/dotnet/machinelearning/issues/3154) + +### New features and scenarios + +#### Named Entity Recognition (NER) + +Named Entity Recognition, or NER, is the process of identifying and classifying/tagging information in text. For example, an NER model might look at a block of text and pick out "Seattle" and "Space Needle" and categorize them as locations or might find and tag "Microsoft" as a company. + +Currently you can consume a pre-trained ONNX model in ML.NET for NER, but it is not possible to train a custom NER model in ML.NET which has been a highly requested feature for several years. 
+ +This year, we will work on adding support for training custom NER models in ML.NET. + +*Related issues*: + +- [#630](https://github.com/dotnet/machinelearning/issues/630) + +#### Dynamic IDataView + +In ML.NET, you must first define your model input and output schemas as new classes before loading data into an IDataView. + +This year, we will work on adding a way to create dynamic IDataViews, meaning that you don't have to define your schemas beforehand and instead the shape of the training data defines the schemas. + +*Related issues*: + +- [#5895](https://github.com/dotnet/machinelearning/issues/5895) + +#### Multivariate time series forecasting + +Currently ML.NET only supports univariate time series forecasting with the [SSA algorithm](https://docs.microsoft.com/dotnet/api/microsoft.ml.transforms.timeseries.ssaforecastingestimator?view=ml-dotnet) which is currently being [added to Model Builder](https://github.com/dotnet/machinelearning-modelbuilder/issues/1750). + +Univariate time series has one time-dependent variable whose values only depend on its past values through time. Multivariate time series has more than one time-dependent variable where each variable depends on its past values as well as the other variables. + +This year, we will work on adding support for multivariate time series forecasting to ML.NET. + +*Related issues*: + +- [#5638](https://github.com/dotnet/machinelearning/issues/5638) +- [#1696](https://github.com/dotnet/machinelearning/issues/1696) + +#### Multilabel Classification + +Currently, ML.NET's classification algorithms will return one Predicted Label as well as an array of Scores which correspond to each possible class. However, mapping each label to the Score is currently not a great experience. + +This year we will work on making the prediction info more user-friendly so that it is easy to assign multiple classes to one prediction. 
+ +*Related issues*: + +- [#3909](https://github.com/dotnet/machinelearning/issues/3909) +- [#2278](https://github.com/dotnet/machinelearning/issues/2278) + +### Model explainability & Responsible AI + +Model Explainability and Responsible AI are becoming increasingly important areas of focus in the Machine Learning space and at Microsoft. Model explainability and fairness features are important because they let you debug and improve your models and answer questions about bias, building trust, and complying with regulations. + +ML.NET currently offers two main model explainability features: [Permutation Feature Importance](https://docs.microsoft.com/dotnet/api/microsoft.ml.permutationfeatureimportanceextensions?view=ml-dotnet) (PFI) and the [Feature Contribution Calculator](https://docs.microsoft.com/dotnet/api/microsoft.ml.transforms.featurecontributioncalculatingestimator?view=ml-dotnet) (FCC). + +We got a lot of feedback that the PFI API was difficult to use, so our first step is to improve the current experience in ML.NET. These improvements can be tracked in this [issue](https://github.com/dotnet/machinelearning/issues/5625) which will be merged soon. + +This year we also plan to expand the number of model explainability and fairness features. We are currently working on this plan and will update the roadmap as we finalize which model explainability and fairness techniques we will bring into ML.NET. + +### Define the plan for data prep + +While we are working on developing the features mentioned above, we will also be working on our plan for data preparation and wrangling in .NET. + +#### DataFrame API + +The plan for data prep will include the roadmap for the DataFrame API (Microsoft.Data.Analysis) which we will add and update to this Roadmap doc. 
+ +*Related issues*: + +- [#5870](https://github.com/dotnet/machinelearning/issues/5870) +- [#5716](https://github.com/dotnet/machinelearning/issues/5716) +- [#1696](https://github.com/dotnet/machinelearning/issues/1696) diff --git a/docs/README.md b/docs/README.md index da5ab98236..3cce623465 100644 --- a/docs/README.md +++ b/docs/README.md @@ -2,12 +2,12 @@ Documents Index =============== Intro to ML.NET -=============== +--------------- -ML.NET provides state-of-the-art ML algorithms, transforms and components, aiming to make them useful for all developers, data scientists, and information workers and helpful in all products, services and devices. +ML.NET provides state-of-the-art ML algorithms, transforms, and components, aiming to make them useful for all developers, data scientists, and information workers and helpful in all products, services, and devices. Project Docs -============ +------------ - [Developer Guide](project-docs/developer-guide.md) - [Contributing to ML.NET](project-docs/contributing.md) @@ -16,21 +16,15 @@ Project Docs - [Project NuGet Dependencies](https://github.com/dotnet/buildtools/blob/master/Documentation/project-nuget-dependencies.md) - [ML.NET Roadmap](https://github.com/dotnet/machinelearning/blob/main/README.md) - [ML.NET Cookbook](code/MlNetCookBook.md) -- [ML.NET API Reference Documentation](https://docs.microsoft.com/en-us/dotnet/api/?view=ml-dotnet) +- [ML.NET API Reference Documentation](https://docs.microsoft.com/dotnet/api/?view=ml-dotnet) Building from Source -==================== +-------------------- - [Building ML.NET on Linux and OS X](building/unix-instructions.md) - [Building ML.NET on Windows](building/windows-instructions.md) Repo of Samples -==================== +--------------- - [ML.NET Samples](https://github.com/dotnet/machinelearning-samples/blob/main/README.md) - -Extensions for ML.NET -==================== - -- [Infer.NET - Bayesian / Probabilistic inference for 
ML.NET](https://github.com/dotnet/infer/blob/master/README.md) -- [NimbusML - Python bindings for ML.NET](https://github.com/Microsoft/NimbusML/blob/master/README.md) diff --git a/docs/project-docs/platform-limitations.md b/docs/project-docs/platform-limitations.md new file mode 100644 index 0000000000..28b9cc6e7d --- /dev/null +++ b/docs/project-docs/platform-limitations.md @@ -0,0 +1,16 @@ +Platform limitations +====================== + +While ML.NET is cross-platform, there are some limitations for specific platforms as outlined in the chart below. + +| | Training | Inference | +| :---- | :-----: | :---: | +| **Windows** | Yes | Yes | +| **Linux** | Yes | Yes | +| **macOS** | Yes | Yes | +| **ARM64** / **Apple M1** | Yes, with **limitations**.The following are *not supported*:
`EnableMLUnsupportedPlatformTargetCheck` flag to false to use ML.NET in Blazor.* | Yes, with **limitations**. The following are *not supported*:
`DLL not found` exception.
+
+If you are blocked by any of these limitations or would like to see different behavior when hitting them, please let us know by [filing an issue](https://github.com/dotnet/machinelearning/issues/new?assignees=&labels=&template=suggest-a-feature.md&title=).