Author

Manish Amde

Bio: Manish Amde is an academic researcher from University of California, San Diego. The author has contributed to research in topics: Spread spectrum & Network packet. The author has an h-index of 6 and has co-authored 13 publications receiving 1,620 citations. Previous affiliations of Manish Amde include University of California & Indian Institute of Technology Bombay.

Papers
Journal Article
TL;DR: MLlib as mentioned in this paper is an open-source distributed machine learning library for Apache Spark that provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives.
Abstract: Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning tasks. In this paper we present MLlib, Spark's open-source distributed machine learning library. MLlib provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives. Shipped with Spark, MLlib supports several languages and provides a high-level API that leverages Spark's rich ecosystem to simplify the development of end-to-end machine learning pipelines. MLlib has experienced rapid growth due to its vibrant open-source community of over 140 contributors, and includes extensive documentation to support further growth and to let users quickly get up to speed.

1,551 citations
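
The abstract above describes a high-level pipeline API usable from several languages; the following is a minimal illustrative sketch of such a pipeline in PySpark. The tiny in-memory dataset, the column names, and the choice of logistic regression are assumptions made purely for illustration, not details taken from the paper.

```python
# Minimal sketch of an end-to-end MLlib pipeline (assumes PySpark is installed).
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("mllib-sketch").getOrCreate()

# Hypothetical training data: two numeric features and a binary label.
df = spark.createDataFrame(
    [(1.0, 0.5, 1.0), (0.2, 1.3, 0.0), (2.1, 0.1, 1.0), (0.3, 2.0, 0.0)],
    ["f1", "f2", "label"],
)

# Assemble raw columns into a feature vector, then fit a classifier;
# both stages are chained in a single Pipeline.
assembler = VectorAssembler(inputCols=["f1", "f2"], outputCol="features")
lr = LogisticRegression(featuresCol="features", labelCol="label")
model = Pipeline(stages=[assembler, lr]).fit(df)

model.transform(df).select("label", "prediction").show()
spark.stop()
```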

Posted Content
TL;DR: MLlib as discussed by the authors is an open-source distributed machine learning library for Apache Spark that provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives.
Abstract: Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning tasks. In this paper we present MLlib, Spark's open-source distributed machine learning library. MLlib provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives. Shipped with Spark, MLlib supports several languages and provides a high-level API that leverages Spark's rich ecosystem to simplify the development of end-to-end machine learning pipelines. MLlib has experienced rapid growth due to its vibrant open-source community of over 140 contributors, and includes extensive documentation to support further growth and to let users quickly get up to speed.

84 citations

Journal ArticleDOI
25 Jul 2005
TL;DR: The authors survey methodologies for leveraging asynchronous on-chip communication and investigate GALS-based implementations, desynchronisation strategies, and asynchronous network-on-chip (NoC) designs.
Abstract: Various kinds of asynchronous interconnect and synchronisation mechanisms are being proposed for designing low-power, low-emission and high-speed SoCs. They facilitate modular design and possess greater resilience to fabrication-time inter-chip and run-time intra-chip process variability. They can reduce power consumption and simplify global timing assumptions, e.g. on clock skew, by allowing asynchronous communication between modules. A few methodologies, including globally asynchronous, locally synchronous (GALS) design and desynchronisation, aim at leveraging the benefits of both the synchronous and asynchronous design paradigms. The authors survey the methodologies used for leveraging asynchronous on-chip communication and investigate GALS-based implementations, desynchronisation strategies and asynchronous network-on-chip (NoC) designs.

63 citations

Proceedings ArticleDOI
02 Jun 2003
TL;DR: An asynchronous DLX microprocessor has been designed automatically, beginning with a standard RTL-like Verilog specification; the Pipefitter design flow was used to generate both the specification for the direct implementation of the Control Unit and a synthesisable Verilog specification of the Data Path.
Abstract: In this paper, the automated design of an asynchronous DLX microprocessor is presented. The microprocessor has been designed beginning with a standard RTL-like Verilog specification, and the Pipefitter design flow has been used to automatically generate both the specification for the direct implementation of the Control Unit and a synthesizable Verilog specification of the Data Path. The architecture of the DLX is locally synchronous and globally asynchronous, and the delay elements for the generation of the local clock signal are automatically produced by Pipefitter as well. The following steps of the design flow (i.e., logic synthesis, technology mapping, placement and routing) have been completed using standard tools, leading to the final layout of the circuit. The final microprocessor implements all the functionality of a standard DLX (with the exception of the floating point unit) and supports its whole instruction set. Some considerations on the area occupation of the microcontroller are presented in the last section of this paper.

20 citations

Patent
12 Feb 2014
TL;DR: In this article, a churn predictor is built and trained from data collected from multiple customers, including static configuration data and dynamic measured data, and the churn predictor processes the customer instance and generates a churn likelihood score.
Abstract: A churn predictor predicts whether a customer is likely to churn. The churn predictor is built and trained from data collected from multiple customers; the data can include static configuration data and dynamic measured data. A churn predictor builder generates multiple customer instances, processes them using the collected data, and separates the instances into one or more training subsets. Based on this processing, the builder generates and saves a churn predictor. The churn predictor can access data for a customer and generate a customer instance for evaluation against the training data. The churn predictor processes the customer instance and generates a churn likelihood score. Based on the churn type, the churn predictor system can generate a preventive action for the customer.

16 citations
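
The patent abstract above is architectural rather than algorithmic, so the sketch below only illustrates the general pattern of scoring churn likelihood from per-customer features; it is not the patented system. The feature names, the synthetic data, and the use of scikit-learn's random forest are all assumptions made for illustration.

```python
# Illustrative churn-likelihood sketch (assumes scikit-learn is installed);
# not the patented method, only the general pattern described in the abstract.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 500
# Hypothetical features: a static configuration field (plan tier) and a
# dynamic measurement (weekly logins), one row per customer instance.
plan_tier = rng.integers(0, 3, n)
weekly_logins = rng.poisson(5, n)
X = np.column_stack([plan_tier, weekly_logins]).astype(float)
y = (weekly_logins < 3).astype(int)  # synthetic "churned" label

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Churn likelihood score for each held-out customer instance.
scores = model.predict_proba(X_test)[:, 1]
print("example churn likelihood scores:", scores[:5].round(2))
```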


Cited by
Proceedings ArticleDOI
13 Aug 2016
TL;DR: XGBoost, as discussed by the authors, is a scalable end-to-end tree boosting system that introduces a sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning, achieving state-of-the-art results on many machine learning challenges.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

14,872 citations
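
Since the abstract describes XGBoost as an end-to-end tree boosting system with a widely used library interface, here is a minimal sketch of training a booster with the xgboost Python package. The synthetic data and the parameter values are assumptions chosen only to make the example self-contained.

```python
# Minimal XGBoost training sketch (assumes the xgboost package is installed).
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic binary labels

# DMatrix is XGBoost's internal data structure for training data.
dtrain = xgb.DMatrix(X, label=y)
params = {"objective": "binary:logistic", "max_depth": 3, "eta": 0.1}
booster = xgb.train(params, dtrain, num_boost_round=50)

preds = booster.predict(dtrain)  # predicted probabilities in [0, 1]
print("training accuracy:", ((preds > 0.5) == y).mean())
```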

Proceedings ArticleDOI
TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning, and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

13,333 citations

Journal ArticleDOI
TL;DR: This open source computing framework unifies streaming, batch, and interactive big data workloads to unlock new applications.
Abstract: This open source computing framework unifies streaming, batch, and interactive big data workloads to unlock new applications.

1,776 citations

Journal ArticleDOI
TL;DR: The research shows that NoC constitutes a unification of current trends of intrachip communication rather than an explicit new alternative.
Abstract: The scaling of microchip technologies has enabled large-scale systems-on-chip (SoCs). Network-on-chip (NoC) research addresses global communication in SoCs, involving (i) a move from computation-centric to communication-centric design and (ii) the implementation of scalable communication structures. This survey presents a perspective on existing NoC research. We define the following abstractions: system, network adapter, network, and link to explain and structure the fundamental concepts. First, research relating to the actual network design is reviewed. Then system level design and modeling are discussed. We also evaluate performance analysis techniques. The research shows that NoC constitutes a unification of current trends of intrachip communication rather than an explicit new alternative.

1,720 citations

Journal ArticleDOI
TL;DR: It is found that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art.
Abstract: Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms have recently shown impressive results across a variety of domains. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well suited to solve problems of these fields. We examine applications of deep learning to a variety of biomedical problems-patient classification, fundamental biological processes and treatment of patients-and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges. Following from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art. Even though improvements over previous baselines have been modest in general, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation. Though progress has been made linking a specific neural network's prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge. Furthermore, the limited amount of labelled data for training presents problems in some domains, as do legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning enabling changes at both bench and bedside with the potential to transform several areas of biology and medicine.

1,491 citations