Author

Mohit Jain

Bio: Mohit Jain is an academic researcher from Microsoft. The author has contributed to research in topics: Population & Chatbot. The author has an h-index of 17 and has co-authored 64 publications receiving 872 citations. Previous affiliations of Mohit Jain include University of Toronto & Hewlett-Packard.


Papers
Proceedings Article
08 Jun 2018
TL;DR: A study with 16 first-time chatbot users interacting with eight chatbots over multiple sessions on the Facebook Messenger platform revealed that users preferred chatbots that provided either a 'human-like' natural language conversation ability, or an engaging experience that exploited the benefits of the familiar turn-based messaging interface.
Abstract: Text messaging-based conversational agents (CAs), popularly called chatbots, have received significant attention in the last two years. However, chatbots are still in their nascent stage: they have a low penetration rate, as 84% of Internet users have not used a chatbot yet. Hence, understanding the usage patterns of first-time users can potentially inform and guide the design of future chatbots. In this paper, we report the findings of a study with 16 first-time chatbot users interacting with eight chatbots over multiple sessions on the Facebook Messenger platform. Analysis of chat logs and user interviews revealed that users preferred chatbots that provided either a 'human-like' natural language conversation ability, or an engaging experience that exploited the benefits of the familiar turn-based messaging interface. We conclude with implications for evolving the design of chatbots, such as: clarify chatbot capabilities, sustain conversation context, handle dialog failures, and end conversations gracefully.

213 citations

Proceedings Article
02 May 2019
TL;DR: The study finds that providing options and explanations was generally favored, as these strategies manifest initiative from the chatbot and are actionable for recovering from breakdowns. It also provides a nuanced understanding of the strengths and weaknesses of each repair strategy.
Abstract: Text-based conversational systems, also referred to as chatbots, have grown widely popular. Current natural language understanding technologies are not yet ready to tackle the complexities in conversational interactions. Breakdowns are common, leading to negative user experiences. Guided by communication theories, we explore user preferences for eight repair strategies, including ones that are common in commercially-deployed chatbots (e.g., confirmation, providing options), as well as novel strategies that explain characteristics of the underlying machine learning algorithms. We conducted a scenario-based study to compare repair strategies with Mechanical Turk workers (N=203). We found that providing options and explanations was generally favored, as these strategies manifest initiative from the chatbot and are actionable for recovering from breakdowns. Through detailed analysis of participants' responses, we provide a nuanced understanding of the strengths and weaknesses of each repair strategy.

137 citations

Proceedings Article
02 Sep 2018
TL;DR: An end-to-end deep learning framework for audio replay attack detection using a novel visual attention mechanism on time-frequency representations of utterances based on group delay features, via deep residual learning (an adaptation of ResNet-18 architecture).
Abstract: With automatic speaker verification (ASV) systems becoming increasingly popular, the development of robust countermeasures against spoofing is needed. Replay attacks pose a significant threat to the reliability of ASV systems because of the relative difficulty in detecting replayed speech and the ease with which such attacks can be mounted. In this paper, we propose an end-to-end deep learning framework for audio replay attack detection. Our proposed approach uses a novel visual attention mechanism on time-frequency representations of utterances based on group delay features, via deep residual learning (an adaptation of ResNet-18 architecture). Using a single model system, we achieve a perfect Equal Error Rate (EER) of 0% on both the development and evaluation sets of the ASVspoof 2017 dataset, against a previous best of 0.12% on the development set and 2.76% on the evaluation set reported in the literature. This highlights the efficacy of our feature representation and attention-based architecture in tackling the challenging task of audio replay attack detection.

88 citations
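
Note on the metric in the replay-attack abstract above: the Equal Error Rate (EER) is the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how it can be approximated with a threshold sweep, assuming higher scores indicate genuine speech. The function name and the toy scores are illustrative assumptions and are not taken from the paper or from the ASVspoof evaluation tooling.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption: a trial is accepted as genuine when its score >= threshold.
    """
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False rejection: a genuine utterance scores below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance: a replayed utterance scores at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical toy scores, for illustration only.
separable = equal_error_rate([0.9, 0.8, 0.7], [0.1, 0.2, 0.3])
overlapping = equal_error_rate([0.9, 0.6, 0.4], [0.1, 0.5, 0.3])
```

An EER of 0%, as reported in the abstract, corresponds to the first case: some threshold separates every genuine score from every replayed score.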

Journal Article
27 Dec 2018
TL;DR: It is found that a conversational agent has the potential to effectively meet the information needs of farmers at scale and could inform future work on designing conversational agents for user populations with limited literacy and technology experience.
Abstract: Farmers constitute 54.6% of the Indian population, but earn only 13.9% of the national GDP. This gross mismatch can be alleviated by improving farmers' access to information and expert advice (e.g., knowing which seeds to sow and how to treat pests can significantly impact yield). In this paper, we report our experience of designing a conversational agent, called FarmChat, to meet the information needs of farmers in rural India. We conducted an evaluative study with 34 farmers near Ranchi in India, focusing on assessing the usability of the system, acceptability of the information provided, and understanding the user population's unique preferences, needs, and challenges in using the technology. We performed a comparative study with two different modalities: audio-only and audio+text. Our results provide a detailed understanding of how literacy level, digital literacy, and other factors impact users' preferences for the interaction modality. We found that a conversational agent has the potential to effectively meet the information needs of farmers at scale. More broadly, our results could inform future work on designing conversational agents for user populations with limited literacy and technology experience.

80 citations

Journal Article
11 Sep 2017
TL;DR: DigiTouch is presented, a reconfigurable glove-based input device that enables thumb-to-finger touch interaction by sensing continuous touch position and pressure and improves the reliability of continuous touch tracking and estimating pressure on resistive fabric interfaces.
Abstract: Input is a significant problem for wearable systems, particularly for head mounted virtual and augmented reality displays. Existing input techniques either lack expressive power or may not be socially acceptable. As an alternative, thumb-to-finger touches present a promising input mechanism that is subtle yet capable of complex interactions. We present DigiTouch, a reconfigurable glove-based input device that enables thumb-to-finger touch interaction by sensing continuous touch position and pressure. Our novel sensing technique improves the reliability of continuous touch tracking and estimating pressure on resistive fabric interfaces. We demonstrate DigiTouch’s utility by enabling a set of easily reachable and reconfigurable widgets such as buttons and sliders. Since DigiTouch senses continuous touch position, widget layouts can be customized according to user preferences and application needs. As an example of a real-world application of this reconfigurable input device, we examine a split-QWERTY keyboard layout mapped to the user’s fingers. We evaluate DigiTouch for text entry using a multi-session study. With our continuous sensing method, users reliably learned to type and achieved a mean typing speed of 16.0 words per minute at the end of ten 20-minute sessions, an improvement over similar wearable touch systems.

71 citations


Cited by
Book Chapter
01 Jan 1982
TL;DR: In this article, the authors discuss leading problems linked to energy that the world is now confronting and propose some ideas concerning possible solutions, and conclude that it is necessary to pursue actively the development of coal, natural gas, and nuclear power.
Abstract: This chapter discusses leading problems linked to energy that the world is now confronting and proposes some ideas concerning possible solutions. Oil deserves special attention among all energy sources. Since the beginning of 1981, it has merely been continuing and enhancing the downward movement in consumption and prices caused by excessive rises, especially for light crudes such as those from Africa, and the slowing down of worldwide economic growth. Densely populated oil-producing countries need to produce to live, to pay for their food and their equipment. If the economic growth of the industrialized countries were to be 4%, even if investment in the rational use of energy were pushed to the limit and the development of nonpetroleum energy sources were also pursued actively, it would be extremely difficult to prevent a sharp rise in prices. It is evident that it is absolutely necessary to pursue actively the development of coal, natural gas, and nuclear power if a physical shortage of energy is not to block economic growth.

2,283 citations

Proceedings Article
04 Nov 2009
TL;DR: The results show that the ST-Matching algorithm significantly outperforms the incremental algorithm in terms of matching accuracy for low-sampling-rate trajectories and, when compared with the AFD-based global algorithm, ST-Matching also improves accuracy as well as running time.
Abstract: Map-matching is the process of aligning a sequence of observed user positions with the road network on a digital map. It is a fundamental pre-processing step for many applications, such as moving object management, traffic flow analysis, and driving directions. In practice there exists a huge amount of low-sampling-rate (e.g., one point every 2-5 minutes) GPS trajectories. Unfortunately, most current map-matching approaches only deal with high-sampling-rate (typically one point every 10-30 s) GPS data, and become less effective for low-sampling-rate points as the uncertainty in data increases. In this paper, we propose a novel global map-matching algorithm called ST-Matching for low-sampling-rate GPS trajectories. ST-Matching considers (1) the spatial geometric and topological structures of the road network and (2) the temporal/speed constraints of the trajectories. Based on spatio-temporal analysis, a candidate graph is constructed from which the best matching path sequence is identified. We compare ST-Matching with the incremental algorithm and the Average-Frechet-Distance (AFD) based global map-matching algorithm. The experiments are performed on both synthetic and real datasets. The results show that our ST-Matching algorithm significantly outperforms the incremental algorithm in terms of matching accuracy for low-sampling-rate trajectories. Meanwhile, when compared with the AFD-based global algorithm, ST-Matching also improves accuracy as well as running time.

817 citations

01 Jan 2010
TL;DR: In this article, the authors present the design and implementation of a presence sensor platform that can be used for accurate occupancy detection at the level of individual offices, which is low-cost, wireless, and incrementally deployable within existing buildings.
Abstract: Buildings are among the largest consumers of electricity in the US. A significant portion of this energy use in buildings can be attributed to HVAC systems used to maintain comfort for occupants. In most cases these building HVAC systems run on fixed schedules and do not employ any fine-grained control based on detailed occupancy information. In this paper we present the design and implementation of a presence sensor platform that can be used for accurate occupancy detection at the level of individual offices. Our presence sensor is low-cost, wireless, and incrementally deployable within existing buildings. Using a pilot deployment of our system across ten offices over a two-week period, we identify significant opportunities for energy savings due to periods of vacancy. Our energy measurements show that our presence node has an estimated battery lifetime of over five years, while detecting occupancy accurately. Furthermore, using a building simulation framework and the occupancy information from our testbed, we show potential energy savings from 10% to 15% using our system.

489 citations

Patent
12 Sep 2012
TL;DR: The patent discloses an ontology-driven portal that organizes public, group, and private/personal data according to various facets, using underlying ontologies to define each facet, so that any type of information can be classified and linked to other types of information.
Abstract: The patent describes a single location and application on a network where a user can organize public, group, and private/personal information and have this single location accessible to the public. A new, ontology-driven portal that organizes all three categories of data according to various “facets” using underlying ontologies to define each “facet” and wherein any type of information can be classified and linked to other types of information is disclosed. An application that enables a user to effectively utilize and manage knowledge and data the user possesses and allows other users to effectively and seamlessly benefit from the user's knowledge and data over a computer network is also disclosed. A method of processing content created by a user utilizing a semantic, ontology-driven portal on a computer network is described. The semantic portal application provides the user with a content base, such as a semantic form or meta-form, for creating a semantic posting. The semantic portal utilizes a knowledge data structure, such as a taxonomy or ontology, in preparing a semantic posting based on the information provided by the user via the content base. The semantic portal application prepares a preview of a semantic posting for evaluation by the user. The semantic posting is then either modified by the user or accepted and posted by the user for external parties to view.

452 citations