This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[Development] MXNet 2.0 Update #18931

Open
@szha

Description

Overview

As MXNet development approaches its 2.0 major milestone, we would like to update our community on roadmap status, and highlight new and upcoming features.

Motivation

The deep learning community has largely evolved independently of the data science and machine learning (ML) user base built around NumPy. While most deep learning frameworks now implement NumPy-like math and array libraries, their APIs differ, which creates confusion and steepens the learning curve of deep learning for ML practitioners and data scientists. This not only divides the skill sets of the two communities, but also hinders knowledge sharing and code interoperability. MXNet 2.0 seeks to unify the deep learning and machine learning ecosystems.

What's new in version 2.0?

MXNet 2.0 is a major version upgrade of MXNet that provides a NumPy-compatible programming interface, integrated with the new, easy-to-use Gluon 2.0 interface. Under the hood, we provide a deep-learning-enhanced implementation of the NumPy interface. As a result, NumPy users can easily adopt MXNet. Version 2.0 incorporates the cumulative lessons of MXNet 1.x and focuses on usability, extensibility, and developer experience.

What's coming next?

We plan to make a series of beta releases of MXNet 2.0 in lockstep with the migration schedules of downstream projects. The first release is tracked in #19139. Also, subscribe to [email protected] for additional announcements.

How do I get started?

As a developer of MXNet, you can check out our main 2.0 branch. MXNet 2.0 nightly builds are available for download.

How can I help?

There are many ways you can contribute:

  • By submitting bug reports, you can help us identify issues and fix them.
  • If there are issues you would like to help with, let us know in the issue comments and one of the committers will help provide suggestions and pointers.
  • If you have a project that you would like to build on top of MXNet 2.0, post an RFC and let the MXNet developers know.
  • Looking for ideas to get started with developing MXNet? Check out the good-first-issue labels for Python developers and C++ developers.

Highlights

Below are the highlights of new features that are available now in the MXNet 2.0 nightly build.

NumPy-compatible Array and Math Library

NumPy has long been established as the standard array and math library in Python, and the MXNet community recognizes significant benefits in bridging the existing NumPy machine learning community and the growing deep learning community. In #14253, the MXNet community reached consensus on moving towards a NumPy-compatible programming experience, and committed to a major effort on providing a NumPy-compatible array library and operators.

To see what the new programming experience is like, check out the Dive into Deep Learning book, the most comprehensive interactive deep learning book with code, math, and a forum. The latest version has an MXNet implementation with the new MXNet np, the NumPy-compatible math and array interface.
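As a taste of what the NumPy-compatible experience looks like, here is a hedged sketch. It is executed with NumPy itself, whose semantics `mxnet.np` is designed to follow; under MXNet you would import `np` from `mxnet` instead (shown in the comments).

```python
# The MXNet np module is designed to match NumPy semantics, so a program
# like this reads the same under either library. In MXNet 2.0 you would
# write (approximately):
#   from mxnet import np, npx
#   npx.set_np()
import numpy as np  # NumPy itself; mxnet.np mirrors this API

a = np.ones((2, 3))
b = np.arange(6).reshape(2, 3)

c = a + b            # element-wise addition with NumPy broadcasting rules
d = np.dot(a, b.T)   # matrix product: (2, 3) x (3, 2) -> (2, 2)
mask = b > 2         # boolean masking works as in NumPy

print(c.shape, d.shape, int(mask.sum()))  # (2, 3) (2, 2) 3
```

Because the interface matches NumPy, existing NumPy idioms (broadcasting, boolean indexing, ufunc-style operators) carry over without retraining.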

Gluon 2.0

Since its introduction in MXNet 1.x, the Gluon API has superseded MXNet's other model-development APIs, such as the symbolic, module, and model APIs. Conceptually, Gluon was the first attempt in the deep learning community to unify the flexibility of imperative programming with the performance benefits of symbolic programming, through just-in-time compilation.

In Gluon 2.0, we are extending support to MXNet np with a simplified interface and new functionality:

  • Simplified hybridization with deferred compute and tracing: deferred compute allows imperative execution to be used for graph construction, letting us unify the historic divergence of NDArray and Symbol. Hybridization now works through a simplified hybrid forward interface: users only need to specify the computation through imperative programming. Hybridization also works through tracing.
  • Data 2.0: the new design for data loading in Gluon allows data processing pipelines to be hybridized and deployed in the same way as models. The new C++ data loader improves data loading efficiency on CIFAR-10 by 50%.
  • Distributed 2.0: The new distributed-training design in Gluon 2.0 provides a unified distributed data parallel interface across native Parameter Server, BytePS, and Horovod, and is extensible for supporting custom distributed training libraries.
  • Gluon Probability: parameterizable probability distributions and sampling functions to facilitate more areas of research such as Bayesian methods and AutoML.
  • Gluon Metrics and Optimizers: refactored around the MXNet np interface, addressing legacy issues.
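To make the deferred-compute idea above concrete, here is a minimal conceptual sketch in pure Python. This is not MXNet's implementation; it only illustrates the tracing principle: you run the computation imperatively once, the operations are recorded as a graph, and that graph can then be reused (in MXNet's case, optimized and compiled).

```python
# Conceptual sketch of deferred compute / tracing (pure Python, not MXNet
# code): ops executed imperatively are recorded on a tape, which plays the
# role of the constructed graph.
class Traced:
    def __init__(self, value, tape):
        self.value, self.tape = value, tape

    def _binop(self, other, fn, name):
        other_v = other.value if isinstance(other, Traced) else other
        out = Traced(fn(self.value, other_v), self.tape)
        self.tape.append((name, self, other, out))  # record the op
        return out

    def __add__(self, other):
        return self._binop(other, lambda a, b: a + b, "add")

    def __mul__(self, other):
        return self._binop(other, lambda a, b: a * b, "mul")

def trace(fn, x):
    """Run fn once imperatively, capturing the op sequence (the 'graph')."""
    tape = []
    out = fn(Traced(x, tape))
    return [name for name, *_ in tape], out.value

# The user writes ordinary imperative code; the graph falls out of running it.
ops, result = trace(lambda x: x * 2 + 1, 5)
print(ops, result)  # ['mul', 'add'] 11
```

The key property, as in Gluon 2.0's hybridization, is that the user only ever writes imperative code; graph construction is a by-product of execution rather than a separate symbolic API.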

Third-party Plugin Support

Extensibility is important for both academia and industry users who want to develop new and customized capabilities. In MXNet 2.0, we added support for plugging in third-party functionality at runtime.

Developer Experiences

In MXNet 2.0, we are making the development process more efficient.

  • New CMake build system: improved CMake build system for compiling the most performant MXNet backend library based on the available environment, as well as cross-compilation support.
  • Memory profiler: the goal is to provide visibility and insight into the memory consumption of the MXNet backend.
  • Pythonic exception type in backend: updated error reporting in MXNet backend that allows directly defining exception types with Python exception classes to enable Pythonic error handling.
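The Pythonic-exception point above can be illustrated with a small hedged example. It is executed here with NumPy, but the spirit is the same: when backend errors map onto standard Python exception classes, callers can catch a specific type instead of string-matching one generic backend error.

```python
# Illustration of Pythonic error handling (run with NumPy; MXNet 2.0's
# backend similarly raises standard Python exception types rather than a
# single generic error that callers must parse).
import numpy as np

try:
    np.ones((2,)).reshape((3,))  # invalid reshape: 2 elements cannot fill 3
except ValueError as e:
    print("caught ValueError:", e)
```

Catching `ValueError` directly keeps error handling composable with the rest of the Python ecosystem, which is the goal of the backend change described above.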

Documentation for Developers

We are improving the documentation for MXNet and deep learning developers.

  • CWiki for developers: reorganized and improved the development section in MXNet CWiki.
  • Developer Guide: new developer guides on how to develop and improve deep learning applications with MXNet.

Ecosystem: GluonNLP NumPy

We are refactoring GluonNLP with the NumPy interface for the next generation of GluonNLP. The initial version is available on the dmlc/gluon-nlp master branch:

  • NLP models with NumPy: we support a large number of state-of-the-art backbone networks in GluonNLP, including BERT, ALBERT, ELECTRA, MobileBERT, RoBERTa, XLM-R, Transformer, and Transformer-XL.
  • New data processing CLI: consolidated data processing scripts into one CLI.

API Deprecation

As described in #17676, we are taking this major version upgrade as an opportunity to address the legacy issues in MXNet 1.x. Most notably, we are deprecating the following APIs:

  • Model, Module, Symbol: we are deprecating the legacy modeling and graph construction API in favor of automated graph tracing through deferred compute and Gluon.
  • mx.rnn: we are deprecating the symbolic RNN API in favor of the Gluon RNN API.
  • NDArray: we are deprecating NDArray and the old nd API in favor of the NumPy-compatible np and npx. The NDArray operators will be provided as an optional feature potentially in a separate repo. This will enable existing users who rely on MXNet 1.x for inference to have an easy upgrade path as old models will continue to work.
  • Caffe converter and Torch plugin: both extensions see low usage nowadays. We are instead extending support in DLPack to better support interoperability with PyTorch and TensorFlow.
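For users migrating off the deprecated `nd` API, the sketch below gives a hedged feel for the change. The legacy `mx.nd` calls are shown only as comments; the code runs with NumPy, whose semantics `mxnet.np` follows.

```python
# Hedged nd -> np migration sketch. Legacy mx.nd calls appear as comments;
# the NumPy-compatible counterparts are executed with NumPy itself.
import numpy as np

a = np.ones((2, 3))   # legacy: a = nd.ones((2, 3))
b = np.zeros((2, 3))  # legacy: b = nd.zeros((2, 3))

# legacy: nd.concat(a, b, dim=0)
c = np.concatenate([a, b], axis=0)

# np also brings NumPy conveniences the old nd API lacked, such as
# boolean indexing:
nonzero = c[c > 0]

print(c.shape, nonzero.size)  # (4, 3) 6
```

Operator names and argument conventions (e.g. `axis` instead of `dim`) follow NumPy, so most migrations amount to mechanical renames plus adopting NumPy idioms.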

Related Projects

Below is a list of project trackers for MXNet 2.0.

@apache/mxnet-committers feel free to comment or directly edit this post for updates in additional areas.

Labels: RFC (post requesting for comments)