Author

Pratyush Kumar

Bio: Pratyush Kumar is an academic researcher from Indian Institute of Technology Madras. The author has contributed to research in topics: Scheduling (computing) & Cyber-physical system. The author has an h-index of 20 and has co-authored 110 publications receiving 1389 citations. Previous affiliations of Pratyush Kumar include École Polytechnique Fédérale de Lausanne & University of California, Santa Barbara.


Papers
Posted Content
TL;DR: The IndicNLP corpus, a large-scale, general-domain corpus containing 2.7 billion words for 10 Indian languages from two language families, is presented, and the IndicNLP embeddings are shown to significantly outperform publicly available pre-trained embeddings on multiple evaluation tasks.
Abstract: We present the IndicNLP corpus, a large-scale, general-domain corpus containing 2.7 billion words for 10 Indian languages from two language families. We share pre-trained word embeddings trained on these corpora. We create news article category classification datasets for 9 languages to evaluate the embeddings. We show that the IndicNLP embeddings significantly outperform publicly available pre-trained embeddings on multiple evaluation tasks. We hope that the availability of the corpus will accelerate Indic NLP research. The resources are available at this https URL.

38 citations

Proceedings ArticleDOI
12 Oct 2020
TL;DR: This work presents the Indian Lexicon Sign Language Dataset - INCLUDE - an ISL dataset that contains 0.27 million frames across 4,287 videos over 263 word signs from 15 different word categories and evaluates several deep neural networks combining different methods for augmentation, feature extraction, encoding and decoding.
Abstract: Indian Sign Language (ISL) is a complete language with its own grammar, syntax, vocabulary and several unique linguistic attributes. It is used by over 5 million deaf people in India. Currently, there is no publicly available dataset on ISL to evaluate Sign Language Recognition (SLR) approaches. In this work, we present the Indian Lexicon Sign Language Dataset - INCLUDE - an ISL dataset that contains 0.27 million frames across 4,287 videos over 263 word signs from 15 different word categories. INCLUDE is recorded with the help of experienced signers to provide close resemblance to natural conditions. A subset of 50 word signs is chosen across word categories to define INCLUDE-50 for rapid evaluation of SLR methods with hyperparameter tuning. As the first large scale study of SLR on ISL, we evaluate several deep neural networks combining different methods for augmentation, feature extraction, encoding and decoding. The best performing model achieves an accuracy of 94.5% on the INCLUDE-50 dataset and 85.6% on the INCLUDE dataset. This model uses a pre-trained feature extractor and encoder and only trains a decoder. We further explore generalisation by fine-tuning the decoder for an American Sign Language dataset. On the ASLLVD with 48 classes, our model has an accuracy of 92.1%; improving on existing results and providing an efficient method to support SLR for multiple languages.

37 citations

Proceedings ArticleDOI
25 Jan 2011
TL;DR: For stop-go scheduling of task-graphs under a makespan constraint, this work derives the optimal schedule for a given static ordering, referred to as the JUST schedule, and proves that for periodic task-graphs the optimal temperature is independent of the chosen static ordering when following the JUST schedule.
Abstract: Dynamic thermal management (DTM) techniques to manage the load on a system to avoid thermal hazards are soon becoming mainstream in today's systems. With the increasing percentage of leakage power, switching off the processors is becoming a viable alternative technique to speed scaling. For real-time applications, it is crucial that under such techniques the system still meets the performance constraints. In this paper we study stop-go scheduling to minimize peak temperature when scheduling an application, modeled as a task-graph, within a given makespan constraint. For a given static-ordering of execution of the tasks, we derive the optimal schedule referred to as the JUST schedule. We prove that for periodic task-graphs, the optimal temperature is independent of the chosen static-ordering when following the proposed JUST schedule. Simulation experiments validate the theoretical results.

37 citations

Proceedings ArticleDOI
05 Jun 2011
TL;DR: This work derives the shaper such that no job misses its real-time deadline and the peak temperature is optimally reduced, for the class of leaky bucket shapers which have a light-weight implementation.
Abstract: With increasing power densities, managing on-chip temperatures has become an important design challenge. We propose a novel approach to this problem with the use of shapers to dynamically and selectively insert idle times during the execution of hard real-time jobs on a single speed processor. For the class of leaky bucket shapers which have a light-weight implementation, we derive the shaper such that no job misses its real-time deadline and the peak temperature is optimally reduced. The analysis and design of such shapers allows for dynamically variable streams of jobs; for instance, periodic streams with jitter. We extend our results to consider non-zero power and timing overhead in transitioning to the idle mode. With experimental results, we demonstrate that the proposed approach provides a large improvement: on average 8K peak temperature reduction or 40% increase in utilization for a given peak temperature.

35 citations
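
The shaper idea in the abstract above can be illustrated with a generic leaky-bucket shaper. This is only a minimal sketch, not the paper's derivation: the function name `shape` and the parameters `burst` and `rate` are hypothetical, and the sketch does not compute the optimal shaper or check deadlines. It only shows how a leaky bucket delays job releases, which is where idle time gets inserted.

```python
def shape(arrivals, burst, rate):
    """Delay jobs so that releases conform to a leaky bucket.

    arrivals: non-decreasing job arrival times
    burst:    bucket size (jobs that may be released back to back)
    rate:     token refill rate (jobs per time unit)
    Both parameters are illustrative assumptions, not values from the paper.
    """
    tokens = float(burst)   # bucket starts full
    clock = 0.0             # time of the most recent release
    releases = []
    for t in arrivals:
        start = float(max(t, clock))                       # FIFO: never overtake
        tokens = min(burst, tokens + (start - clock) * rate)  # refill, capped
        if tokens < 1.0:
            start += (1.0 - tokens) / rate                 # idle until one token
            tokens = 1.0
        tokens -= 1.0                                      # releasing costs one token
        clock = start
        releases.append(start)
    return releases


# Four jobs arriving at once, bucket of 2, refill of one job per 2 time units.
release_times = shape([0, 0, 0, 0], burst=2, rate=0.5)
print(release_times)
```

With these example numbers the first two jobs pass immediately and the remaining two are spaced out by the refill rate, so the processor idles between them.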

Posted Content
TL;DR: Samanantar is the largest publicly available parallel corpora collection for Indic languages, containing 46.9 million sentence pairs between English and 11 Indic languages (from two language families).
Abstract: We present Samanantar, the largest publicly available parallel corpora collection for Indic languages. The collection contains a total of 46.9 million sentence pairs between English and 11 Indic languages (from two language families). In particular, we compile 12.4 million sentence pairs from existing, publicly-available parallel corpora, and we additionally mine 34.6 million sentence pairs from the web, resulting in a 2.8X increase in publicly available sentence pairs. We mine the parallel sentences from the web by combining many corpora, tools, and methods. In particular, we use (a) web-crawled monolingual corpora, (b) document OCR for extracting sentences from scanned documents, (c) multilingual representation models for aligning sentences, and (d) approximate nearest neighbor search for searching in a large collection of sentences. Human evaluation of samples from the newly mined corpora validates the high quality of the parallel sentences across 11 language pairs. Further, we extracted 82.7 million sentence pairs between all 55 Indic language pairs from the English-centric parallel corpus using English as the pivot language. We trained multilingual NMT models spanning all these languages on Samanantar and compared with other baselines and previously reported results on publicly available benchmarks. Our models outperform existing models on these benchmarks, establishing the utility of Samanantar. Our data (this https URL) and models (this https URL) will be available publicly and we hope they will help advance research in Indic NMT and multilingual NLP for Indic languages.

32 citations
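
The figures quoted in the Samanantar abstract above can be cross-checked with a few lines of arithmetic. This is a sketch under two assumptions that the abstract does not state: that the "2.8X" figure compares mined pairs to previously existing pairs, and that the 55 Indic language pairs are the unordered pairs among the 11 Indic languages. Variable names are illustrative.

```python
from math import comb

indic_languages = 11
existing_pairs_m = 12.4   # million sentence pairs from existing corpora
mined_pairs_m = 34.6      # million sentence pairs newly mined from the web

# Unordered pairs among 11 languages.
indic_pairs = comb(indic_languages, 2)

# Ratio of newly mined to previously available pairs (assumed reading of "2.8X").
growth = mined_pairs_m / existing_pairs_m

# Sum of the two rounded figures; the abstract reports 46.9 million in total,
# so the small gap is presumably due to rounding of the components.
total_m = existing_pairs_m + mined_pairs_m

print(indic_pairs, round(growth, 1), round(total_m, 1))
```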


Cited by
Book ChapterDOI
Robin Burke
01 Jan 2007
TL;DR: This chapter surveys the space of two-part hybrid recommender systems, comparing four different recommendation techniques and seven different hybridization strategies and finds that cascade and augmented hybrids work well, especially when combining two components of differing strengths.
Abstract: Adaptive web sites may offer automated recommendations generated through any number of well-studied techniques including collaborative, content-based and knowledge-based recommendation. Each of these techniques has its own strengths and weaknesses. In search of better performance, researchers have combined recommendation techniques to build hybrid recommender systems. This chapter surveys the space of two-part hybrid recommender systems, comparing four different recommendation techniques and seven different hybridization strategies. Implementations of 41 hybrids including some novel combinations are examined and compared. The study finds that cascade and augmented hybrids work well, especially when combining two components of differing strengths.

1,104 citations

Posted Content
TL;DR: This work proposes the Learning without Forgetting method, which uses only new task data to train the network while preserving the original capabilities, and performs favorably compared to commonly used feature extraction and fine-tuning adaption techniques.
Abstract: When building a unified vision system or gradually adding new capabilities to a system, the usual assumption is that training data for all tasks is always available. However, as the number of tasks grows, storing and retraining on such data becomes infeasible. A new problem arises where we add new capabilities to a Convolutional Neural Network (CNN), but the training data for its existing capabilities are unavailable. We propose our Learning without Forgetting method, which uses only new task data to train the network while preserving the original capabilities. Our method performs favorably compared to commonly used feature extraction and fine-tuning adaption techniques and performs similarly to multitask learning that uses original task data we assume unavailable. A more surprising observation is that Learning without Forgetting may be able to replace fine-tuning with similar old and new task datasets for improved new task performance.

1,037 citations

Journal ArticleDOI
TL;DR: Semisolid metal (SSM) processing is a relatively new technology for metal forming that deals with semisolid slurries, in which non-dendritic solid particles are dispersed in a liquid matrix.
Abstract: Semisolid metal (SSM) processing is a relatively new technology for metal forming. Different from the conventional metal forming technologies which use either solid metals (solid state processing) or liquid metals (casting) as starting materials, SSM processing deals with semisolid slurries, in which non-dendritic solid particles are dispersed in a liquid matrix. Semisolid metal slurries exhibit distinctive rheological characteristics: the steady state behaviour is pseudoplastic (or shear thinning), while the transient state behaviour is thixotropic. All the currently available technologies for SSM processing have been developed based on those unique rheological properties, which in turn originate from their non-dendritic microstructures. Year 2001 marks the 30th anniversary of the concept of SSM processing. Today, SSM processing has established itself as a scientifically sound and commercially viable technology for production of metallic components with high integrity, improved mechanical properti...

813 citations

Journal ArticleDOI
23 Apr 2014-Chance
TL;DR: Cressie and Wikle present a reference book about spatial and spatio-temporal statistical modeling.
Abstract: Noel Cressie and Christopher Wikle. Hardcover: 624 pages. Year: 2011. Publisher: John Wiley. ISBN-13: 978-0471692744. Here is the new reference book about spatial and spatio-temporal statistical modeling! No...

680 citations

Journal ArticleDOI
TL;DR: The aim of this survey is to enable researchers and system designers to get insights into the working and applications of CPSs and motivate them to propose novel solutions for making wide-scale adoption of CPS a tangible reality.
Abstract: Cyberphysical systems (CPSs) are a new class of engineered systems that offer close interaction between cyber and physical components. The field of CPS has been identified as a key area of research, and CPSs are expected to play a major role in the design and development of future systems. In this paper, we survey recent advancements made in the development and applications of CPSs. We classify the existing research work based on their characteristics and identify the future challenges. We also discuss the examples of prototypes of CPSs. The aim of this survey is to enable researchers and system designers to get insights into the working and applications of CPSs and motivate them to propose novel solutions for making wide-scale adoption of CPS a tangible reality.

653 citations