Companies are building machine learning (ML) models to make better use of their data and gain insights into their customers’ needs, supply chains, pricing and other factors important to their success. They often want to include people in ML development who have various levels of expertise, from data scientists to business analysts. Software vendors offer tools that help these companies build ML models, automate their construction and deployment, collaborate across teams and optimize the models for their specific needs. Here, we look at some of the top products that help companies create ML models.
1. Azure Machine Learning
Microsoft offers Azure Machine Learning to help companies build ML models at scale. It allows users to streamline deployment and management of thousands of ML models. Users can prepare their data on Apache Spark clusters. A feature store makes features discoverable and reusable. Azure uses repeatable pipelines to automate ML workflows. It also continually monitors various metrics in the models, detects data drift and triggers retraining of models to improve their performance. It lets users rapidly create models for tasks such as classification, regression, vision and natural language processing (NLP). Azure builds in security and compliance for the models created.
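To make the idea of drift-triggered retraining concrete, here is a minimal sketch in plain Python. It is not Azure's API or its drift algorithm; the function names, the threshold and the sample values are all assumptions chosen for illustration. It measures how far the mean of recent data has moved from the training baseline, in units of the baseline's standard deviation, and flags retraining when that shift exceeds a threshold.

```python
from statistics import mean, stdev


def drift_score(baseline, recent):
    """Shift of the recent mean from the baseline mean, in baseline standard deviations."""
    spread = stdev(baseline)
    if spread == 0:
        # A constant baseline: any change in the mean counts as unbounded drift.
        return 0.0 if mean(recent) == mean(baseline) else float("inf")
    return abs(mean(recent) - mean(baseline)) / spread


def should_retrain(baseline, recent, threshold=2.0):
    """Hypothetical trigger: retrain when the drift score exceeds the threshold."""
    return drift_score(baseline, recent) > threshold


# Illustrative values for a single numeric feature (made up for this sketch).
baseline = [10.0, 11.0, 9.0, 10.5, 9.5]   # feature values seen at training time
stable = [10.2, 9.8, 10.1]                # recent values close to the baseline
shifted = [14.0, 15.0, 13.5]              # recent values that have moved away

print(should_retrain(baseline, stable))
print(should_retrain(baseline, shifted))
```

A production monitor would typically compare whole distributions across many features rather than a single mean, but the control flow is the same: compute a score, compare it with a threshold, then start a retraining pipeline.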
2. Vertex AI
Vertex AI is an ML platform that allows data scientists and engineers to create, train, test, monitor, tune and deploy ML and artificial intelligence (AI) models. It includes Model Garden, a collection of more than 130 foundation models from Google and its partners, letting companies choose an existing model that fits their needs, which they can then customize with their own data. AutoML lets developers who have limited expertise in ML train models specific to their needs. Custom training gives complete control over the training process, allowing users to select their preferred framework, write their own code and choose hyperparameter tuning options.
3. Watson Machine Learning
Watson Machine Learning by IBM runs on the company’s Cloud Pak for Data, a set of integrated software components for data analysis, organization and management. Watson Machine Learning allows users to build, train and deploy models. It provides a variety of tools that allow users to choose an appropriate level of automation for their needs. They can automatically process structured data to create pipelines of candidate models, then select the best-performing pipeline. Deep learning (DL) experiments automate hundreds of training runs. Federated learning allows users to train models on disconnected data sources. Data scientists, developers and domain experts can also collaborate on managing model data.
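As a rough illustration of how federated learning can train on data that never leaves its source, the sketch below shows federated averaging, one common aggregation scheme. It is not taken from IBM's product and may differ from what Watson Machine Learning does; the function name and the data layout are assumptions. Each party trains locally and shares only its model weights and sample count, and a coordinator combines them into a sample-weighted average.

```python
def federated_average(client_updates):
    """Combine locally trained weights into one model.

    client_updates is a list of (weights, n_samples) pairs, where weights is a
    list of floats produced by one party's local training run. Only these
    weights are shared; the raw data stays with each party.
    """
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("at least one client must contribute samples")

    size = len(client_updates[0][0])
    averaged = [0.0] * size
    for weights, n in client_updates:
        if len(weights) != size:
            raise ValueError("all clients must share the same model shape")
        for i, w in enumerate(weights):
            # Parties with more samples pull the average further toward their weights.
            averaged[i] += w * n / total
    return averaged


# Two hypothetical parties: the second holds three times as much data.
global_weights = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
print(global_weights)
```

In a full training loop, the averaged weights would be sent back to each party for another round of local training, and the cycle would repeat until the model converges.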
4. SageMaker
SageMaker is a fully managed service from Amazon Web Services (AWS) that enables high-performance ML. Its integrated development environment (IDE) brings together tools such as notebooks, debuggers, profilers and pipelines. It supports various ML frameworks, toolkits and programming languages, including TensorFlow, PyTorch and Jupyter. It offers a choice of tools, including an IDE for data scientists and a no-code interface for business analysts. Users can standardize the deployment and operation of ML systems (MLOps).
6. Datalore
Datalore by JetBrains allows teams in an organization to collaborate on a private platform to develop ML and deep learning models. It allows them to start with pre-configured and isolated development environments and customize them as they go along. It uses Jupyter-compatible notebooks and allows them to be made interactive with dropdowns, sliders, inputs and widgets. It offers smart coding assistance for the Python, SQL, R, Scala and Kotlin languages. It also provides automatic visualization to help users understand their data with various types of charts. The notebooks can run on GPUs and CPUs.
7. Dataiku AutoML
Dataiku’s AutoML provides builders of ML models with automatic feature generation. It lets them find reference sets in the company’s feature store and integrate them into their projects. It is designed to be used by people of varying levels of expertise, allowing users to accept default settings or modify any part for their needs. It offers a guided methodology for developing models, with built-in guardrails and explainability that allow users to compare models. Advanced developers can create custom models using languages such as Python, R, Scala, Julia and PySpark or import models developed with MLflow. The collaborative visual flow shows everything that’s occurred along the data pipeline and allows users to search for previous relevant projects, so they can reuse best practices.
10. Databricks
Databricks is a platform for building ML models that covers taking in data, selecting features, tuning and turning the model into a product. It allows users to automatically track experiments, code, results and artifacts in one central hub. Fine-grained access control and data lineage help users meet compliance needs. It allows a team to work together by providing collaborative notebooks that support several languages, including Python, R, Scala and SQL. Team members can also use tools such as JupyterLab, PyCharm, IntelliJ or RStudio. AutoML also provides automatic hyperparameter tuning and model search with Hyperopt, Apache Spark and MLflow integrations.
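For readers unfamiliar with automatic hyperparameter tuning, the following sketch shows the basic loop using plain random search from the Python standard library. It does not call Hyperopt, Spark or MLflow, and tools like Hyperopt can use more sophisticated search strategies. The objective function, the parameter names and the search ranges are hypothetical stand-ins for a real train-and-validate step.

```python
import random


def validation_loss(learning_rate, depth):
    """Hypothetical stand-in for training a model and scoring it on held-out data.

    This toy objective is lowest at learning_rate=0.1 and depth=6.
    """
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 6) ** 2


def random_search(objective, n_trials=50, seed=0):
    """Try n_trials random parameter settings and keep the one with the lowest loss."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.001, 0.5),  # assumed search range
            "depth": rng.randint(2, 12),               # assumed search range
        }
        loss = objective(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


best_params, best_loss = random_search(validation_loss, n_trials=200)
print(best_params, best_loss)
```

An experiment tracker would log each trial's parameters and loss inside the loop, which is what makes it possible to compare runs later from a central hub.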