
Showing papers by "Amazon.com published in 2016"


Journal ArticleDOI
TL;DR: The authors deployed the reconfigurable fabric in a bed of 1,632 servers and FPGAs in a production datacenter and successfully used it to accelerate the ranking portion of the Bing Web search engine by nearly a factor of two.
Abstract: Datacenter workloads demand high computational capabilities, flexibility, power efficiency, and low cost. It is challenging to improve all of these factors simultaneously. To advance datacenter capabilities beyond what commodity server designs can provide, we designed and built a composable, reconfigurable hardware fabric based on field-programmable gate arrays (FPGAs). Each server in the fabric contains one FPGA, and all FPGAs within a 48-server rack are interconnected over a low-latency, high-bandwidth network. We describe a medium-scale deployment of this fabric on a bed of 1,632 servers, and measure its effectiveness in accelerating the ranking component of the Bing web search engine. We describe the requirements and architecture of the system, detail the critical engineering challenges and solutions needed to make the system robust in the presence of failures, and measure the performance, power, and resilience of the system. Under high load, the large-scale reconfigurable fabric improves the ranking throughput of each server by 95% at a desirable latency distribution, or reduces tail latency by 29% at a fixed throughput. In other words, the reconfigurable fabric enables the same throughput using only half the number of servers.

835 citations


Book ChapterDOI
01 Jan 2016
TL;DR: In this article, the authors survey recent advances in algorithms for route planning in transportation networks, showing that for road networks one can compute driving directions in milliseconds or less even at continental scale; some algorithms answer queries in a fraction of a microsecond, while others deal efficiently with real-time traffic.
Abstract: We survey recent advances in algorithms for route planning in transportation networks. For road networks, we show that one can compute driving directions in milliseconds or less even at continental scale. A variety of techniques provide different trade-offs between preprocessing effort, space requirements, and query time. Some algorithms can answer queries in a fraction of a microsecond, while others can deal efficiently with real-time traffic. Journey planning on public transportation systems, although conceptually similar, is a significantly harder problem due to its inherent time-dependent and multicriteria nature. Although exact algorithms are fast enough for interactive queries on metropolitan transit systems, dealing with continent-sized instances requires simplifications or heavy preprocessing. The multimodal route planning problem, which seeks journeys combining schedule-based transportation (buses, trains) with unrestricted modes (walking, driving), is even harder, relying on approximate solutions even for metropolitan inputs.

618 citations
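The speedup techniques surveyed above (preprocessing-based methods such as contraction hierarchies and hub labels) are all measured against the textbook Dijkstra baseline query. As context, here is a minimal sketch of that baseline; the adjacency-list graph format is an assumption for illustration:

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from source in a weighted digraph.

    graph: dict mapping node -> list of (neighbor, edge_weight) pairs.
    The surveyed techniques add preprocessing on top of this baseline
    to trade space and preprocessing effort for query time.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Toy road network: A -> B -> C is shorter than the direct A -> C edge.
g = {"A": [("B", 1), ("C", 5)], "B": [("C", 2)], "C": []}
print(dijkstra(g, "A"))  # {'A': 0, 'B': 1, 'C': 3}
```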


Proceedings ArticleDOI
10 Apr 2016
TL;DR: This work formulates bitrate adaptation as a utility maximization problem and devises an online control algorithm called BOLA that uses Lyapunov optimization techniques to minimize rebuffering and maximize video quality, and proves that BOLA achieves a time-average utility that is within an additive term O(1/V) of the optimal value.
Abstract: Modern video players employ complex algorithms to adapt the bitrate of the video that is shown to the user. Bitrate adaptation requires a tradeoff between reducing the probability that the video freezes and enhancing the quality of the video shown to the user. A bitrate that is too high leads to frequent video freezes (i.e., rebuffering), while a bitrate that is too low leads to poor video quality. Video providers segment the video into short chunks and encode each chunk at multiple bitrates. The video player adaptively chooses the bitrate of each chunk that is downloaded, possibly choosing different bitrates for successive chunks. While bitrate adaptation holds the key to a good quality of experience for the user, current video players use ad-hoc algorithms that are poorly understood. We formulate bitrate adaptation as a utility maximization problem and devise an online control algorithm called BOLA that uses Lyapunov optimization techniques to minimize rebuffering and maximize video quality. We prove that BOLA achieves a time-average utility that is within an additive term O(1/V) of the optimal value, for a control parameter V related to the video buffer size. Further, unlike prior work, our algorithm does not require any prediction of available network bandwidth. We empirically validate our algorithm in a simulated network environment using an extensive collection of network traces. We show that our algorithm achieves near-optimal utility and in many cases significantly higher utility than current state-of-the-art algorithms. Our work has immediate impact on real-world video players and BOLA is part of the reference player implementation for the evolving DASH standard for video transmission.

508 citations
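As context for the abstract above: the heart of a Lyapunov-style scheme like BOLA is a per-chunk rule that scores each available bitrate against the current buffer level, without any bandwidth prediction. The sketch below follows that general recipe, but the exact objective form, log utility, and parameter values are simplified assumptions for illustration, not the paper's formulation:

```python
import math

def choose_bitrate(bitrates, buffer_level, V=0.9, gamma=5.0):
    """Pick a bitrate index maximizing (V * (utility + gamma * p) - Q) / size.

    bitrates: chunk sizes in Mbit, one per quality level (ascending).
    buffer_level: current buffer occupancy Q, measured in chunks.
    V: control parameter trading video quality against buffer slack.
    Utility of a level is the log of its size relative to the lowest level
    (an assumed concave utility; BOLA allows other choices).
    """
    p = 1.0  # chunk duration in seconds (assumed)
    best, best_score = 0, -float("inf")
    for m, size in enumerate(bitrates):
        utility = math.log(size / bitrates[0])
        score = (V * (utility + gamma * p) - buffer_level) / size
        if score > best_score:
            best, best_score = m, score
    return best

levels = [0.5, 1.0, 2.5, 5.0]                    # Mbit per chunk
print(choose_bitrate(levels, buffer_level=1.0))  # 0: low buffer -> low quality
print(choose_bitrate(levels, buffer_level=8.0))  # 3: full buffer -> high quality
```

Note how the decision depends only on the buffer level Q: a nearly empty buffer penalizes large chunks, while a full buffer lets the utility term dominate.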


Journal ArticleDOI
24 Oct 2016-PeerJ
TL;DR: The empirical analysis indicates that the formal facts of a case are the most important predictive factor, consistent with the theory of legal realism suggesting that judicial decision-making is significantly affected by the stimulus of the facts.
Abstract: Recent advances in Natural Language Processing and Machine Learning provide us with the tools to build predictive models that can be used to unveil patterns driving judicial decisions. This can be useful, for both lawyers and judges, as an assisting tool to rapidly identify cases and extract patterns which lead to certain decisions. This paper presents the first systematic study on predicting the outcome of cases tried by the European Court of Human Rights based solely on textual content. We formulate a binary classification task where the input of our classifiers is the textual content extracted from a case and the target output is the actual judgment as to whether there has been a violation of an article of the convention of human rights. Textual information is represented using contiguous word sequences, i.e., N-grams, and topics. Our models can predict the court’s decisions with a strong accuracy (79% on average). Our empirical analysis indicates that the formal facts of a case are the most important predictive factor. This is consistent with the theory of legal realism suggesting that judicial decision-making is significantly affected by the stimulus of the facts. We also observe that the topical content of a case is another important feature in this classification task and explore this relationship further by conducting a qualitative analysis.

412 citations
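The contiguous-word-sequence (N-gram) representation mentioned in the abstract can be sketched as follows. The whitespace tokenization and the choice of unigrams plus bigrams are assumptions for illustration; the paper's full pipeline also includes topic features and a trained binary classifier:

```python
from collections import Counter

def ngram_features(text, n_values=(1, 2)):
    """Bag of contiguous word sequences (N-grams) from case text.

    Returns a Counter mapping each n-gram (joined with spaces) to its
    frequency. Vectors like this feed a binary violation/no-violation
    classifier over the textual content of a case.
    """
    tokens = text.lower().split()
    feats = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats

doc = "the applicant alleged a violation of the convention"
feats = ngram_features(doc)
print(feats["the"])           # 2
print(feats["violation of"])  # 1
```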


Journal ArticleDOI
26 Feb 2016-Science
TL;DR: In this article, the authors show that synchronization of new leaf growth with dry season litterfall shifts canopy composition toward younger, more light-use efficient leaves, explaining large seasonal increases in ecosystem photosynthesis.
Abstract: In evergreen tropical forests, the extent, magnitude, and controls on photosynthetic seasonality are poorly resolved and inadequately represented in Earth system models. Combining camera observations with ecosystem carbon dioxide fluxes at forests across rainfall gradients in Amazonia, we show that aggregate canopy phenology, not seasonality of climate drivers, is the primary cause of photosynthetic seasonality in these forests. Specifically, synchronization of new leaf growth with dry season litterfall shifts canopy composition toward younger, more light-use efficient leaves, explaining large seasonal increases (~27%) in ecosystem photosynthesis. Coordinated leaf development and demography thus reconcile seemingly disparate observations at different scales and indicate that accounting for leaf-level phenology is critical for accurately simulating ecosystem-scale responses to climate change.

323 citations


Journal ArticleDOI
TL;DR: In this paper, the authors extract feature point matches between frames using SURF descriptors and dense optical flow, and use the matches to estimate a homography with RANSAC.
Abstract: This paper introduces a state-of-the-art video representation and applies it to efficient action recognition and detection. We first propose to improve the popular dense trajectory features by explicit camera motion estimation. More specifically, we extract feature point matches between frames using SURF descriptors and dense optical flow. The matches are used to estimate a homography with RANSAC. To improve the robustness of homography estimation, a human detector is employed to remove outlier matches from the human body as human motion is not constrained by the camera. Trajectories consistent with the homography are considered as due to camera motion, and thus removed. We also use the homography to cancel out camera motion from the optical flow. This results in significant improvement on motion-based HOF and MBH descriptors. We further explore the recent Fisher vector as an alternative feature encoding approach to the standard bag-of-words (BOW) histogram, and consider different ways to include spatial layout information in these encodings. We present a large and varied set of evaluations, considering (i) classification of short basic actions on six datasets, (ii) localization of such actions in feature-length movies, and (iii) large-scale recognition of complex events. We find that our improved trajectory features significantly outperform previous dense trajectories, and that Fisher vectors are superior to BOW encodings for video recognition tasks. In all three tasks, we show substantial improvements over the state-of-the-art results.

282 citations
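The final step described in the abstract, using the estimated homography to cancel camera motion from the optical flow, amounts to warping each point through H and subtracting the predicted displacement. A minimal sketch of just that step; the hard parts (SURF matching, RANSAC estimation, the human detector) are omitted, and the homography below is a hypothetical example:

```python
def apply_homography(H, x, y):
    """Map point (x, y) through a 3x3 homography H (row-major nested lists)."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w

def residual_flow(H, points, flow):
    """Subtract camera-induced motion from observed optical flow.

    points: list of (x, y); flow: list of (dx, dy) observed at each point.
    The residual is what motion descriptors (HOF, MBH) should see after
    the camera motion predicted by H is cancelled.
    """
    out = []
    for (x, y), (dx, dy) in zip(points, flow):
        cx, cy = apply_homography(H, x, y)
        out.append((x + dx - cx, y + dy - cy))
    return out

# Hypothetical pure-translation camera motion of (2, 0):
# a static scene point's observed flow cancels to zero.
H = [[1, 0, 2], [0, 1, 0], [0, 0, 1]]
print(residual_flow(H, [(10, 10)], [(2, 0)]))  # [(0.0, 0.0)]
```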


Journal ArticleDOI
TL;DR: This review examines the three phases of morphological research that have occurred in the study of Sertoli cells, because microscopic anatomy was essentially the only scientific discipline available for about the first 75 years after the discovery of the testicular ‘nurse cell’.
Abstract: It has been one and a half centuries since Enrico Sertoli published the seminal discovery of the testicular 'nurse cell', not only a key cell in the testis, but indeed one of the most amazing cells in the vertebrate body. In this review, we begin by examining the three phases of morphological research that have occurred in the study of Sertoli cells, because microscopic anatomy was essentially the only scientific discipline available for about the first 75 years after the discovery. Biochemistry and molecular biology then changed all of biological sciences, including our understanding of the functions of Sertoli cells. Immunology and stem cell biology were not even topics of science in 1865, but they have now become major issues in our appreciation of Sertoli cell's role in spermatogenesis. We end with the universal importance and plasticity of function by comparing Sertoli cells in fish, amphibians, and mammals. In these various classes of vertebrates, Sertoli cells have quite different modes of proliferation and epithelial maintenance, cystic vs. tubular formation, yet accomplish essentially the same function but in strikingly different ways.

269 citations


Journal ArticleDOI
TL;DR: In this paper, the authors analyzed the zero-deforestation cattle agreements signed by major meatpacking companies in the Brazilian Amazon state of Para using property-level data on beef supply chains.
Abstract: New supply chain interventions offer promise to reduce deforestation from expansion of commercial agriculture, as more multinational companies agree to stop sourcing from farms with recent forest clearing. We analyzed the zero-deforestation cattle agreements signed by major meatpacking companies in the Brazilian Amazon state of Para using property-level data on beef supply chains. Our panel analysis of daily purchases by slaughterhouses before and after the agreements demonstrates that they now avoid purchasing from properties with deforestation, which was not the case prior to the agreements. Supplying ranchers registered their properties in a public environmental registry nearly 2 years before surrounding non-supplying properties, and 85% of surveyed ranchers indicated that the agreements were the driving force. In addition, supplying properties had significantly reduced deforestation rates following the agreements. Our results demonstrate important changes in the beef supply chain, but the agreements’ narrow scope and implementation diminish outcomes for forest conservation.

228 citations


Posted Content
TL;DR: This article proposed a transfer framework for the scenario where the reward function changes between tasks but the environment's dynamics remain the same, and derived two theorems that set their approach in firm theoretical ground and present experiments that show that it successfully promotes transfer in practice, significantly outperforming alternative methods in a sequence of navigation tasks and in the control of a simulated robotic arm.
Abstract: Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks. We propose a transfer framework for the scenario where the reward function changes between tasks but the environment's dynamics remain the same. Our approach rests on two key ideas: "successor features", a value function representation that decouples the dynamics of the environment from the rewards, and "generalized policy improvement", a generalization of dynamic programming's policy improvement operation that considers a set of policies rather than a single one. Put together, the two ideas lead to an approach that integrates seamlessly within the reinforcement learning framework and allows the free exchange of information across tasks. The proposed method also provides performance guarantees for the transferred policy even before any learning has taken place. We derive two theorems that set our approach in firm theoretical ground and present experiments that show that it successfully promotes transfer in practice, significantly outperforming alternative methods in a sequence of navigation tasks and in the control of a simulated robotic arm.

223 citations
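The "generalized policy improvement" idea from the abstract can be sketched on its own: given value estimates for several known policies, each re-evaluated under the new task's reward (which successor features make a cheap dot product), act greedily with respect to their pointwise maximum. The tabular representation below is an assumption for illustration:

```python
def gpi_action(q_tables, state):
    """Generalized policy improvement over a set of known policies.

    q_tables: list of dicts, one per policy, mapping (state, action) ->
    Q-value of that policy evaluated under the *current* task's reward.
    Returns the action maximizing max_i Q_i(state, a); the resulting
    policy is guaranteed to do at least as well as every policy in the set.
    """
    actions = {a for q in q_tables for (s, a) in q if s == state}
    return max(actions,
               key=lambda a: max(q.get((state, a), -float("inf"))
                                 for q in q_tables))

# Two previously learned policies, re-evaluated on a new task:
q1 = {("s0", "left"): 1.0, ("s0", "right"): 0.2}
q2 = {("s0", "left"): 0.1, ("s0", "right"): 0.8}
print(gpi_action([q1, q2], "s0"))  # left (best action under the max of both)
```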


Patent
Jonathan Evan Cohn1
28 Jan 2016
TL;DR: In this article, the authors present a system for analyzing the network traffic health of an inventory management system that includes an autonomous vehicle and a plurality of access points using a graphical user interface.
Abstract: A system for analyzing the network traffic health of an inventory management system that includes an autonomous vehicle and a plurality of access points. The autonomous vehicle interacts with access points in an inventory management system, and network traffic information related to network connectivity between the autonomous vehicle and the access points is obtained. The autonomous vehicle or the access points transmit(s) the network traffic information to a computer system that can generate a graphical user interface that represents the network traffic information for the inventory management system. The network traffic information can include a variety of information about the interactions between autonomous vehicles and access points, such as roam time of the autonomous vehicles between access points as the autonomous vehicles navigate within the inventory management system.

215 citations


Proceedings Article
19 Jun 2016
TL;DR: A robust random cut data structure that can be used as a sketch or synopsis of the input stream is investigated, and it is shown how the sketch can be efficiently updated in a dynamic data stream.
Abstract: In this paper we focus on the anomaly detection problem for dynamic data streams through the lens of random cut forests. We investigate a robust random cut data structure that can be used as a sketch or synopsis of the input stream. We provide a plausible definition of non-parametric anomalies based on the influence of an unseen point on the remainder of the data, i.e., the externality imposed by that point. We show how the sketch can be efficiently updated in a dynamic data stream. We demonstrate the viability of the algorithm on publicly available real data.
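A rough intuition behind the random-cut sketch above: points that are easy to isolate with axis-aligned cuts, where the cut dimension is chosen in proportion to the bounding box's side lengths as in robust random cut trees, are anomalous. The depth-based score below is a simplified illustrative proxy, not the paper's displacement-based anomaly definition:

```python
import random

def isolation_depth(point, data, depth=0, max_depth=10):
    """Depth at which `point` is isolated by robust random cuts.

    Each cut picks a dimension with probability proportional to the
    bounding box's side length, then a uniform split in that dimension.
    Shallow depth -> easy to isolate -> more anomalous.
    """
    if len(data) <= 1 or depth >= max_depth:
        return depth
    dims = len(point)
    box = data + [point]
    lo = [min(p[d] for p in box) for d in range(dims)]
    hi = [max(p[d] for p in box) for d in range(dims)]
    spans = [hi[d] - lo[d] for d in range(dims)]
    if sum(spans) == 0:
        return depth  # all points coincide; cannot cut further
    d = random.choices(range(dims), weights=spans)[0]
    cut = random.uniform(lo[d], hi[d])
    same_side = [p for p in data if (p[d] <= cut) == (point[d] <= cut)]
    return isolation_depth(point, same_side, depth + 1, max_depth)

def anomaly_score(point, data, trees=50):
    """Average isolation depth over many random trees (lower = more anomalous)."""
    return sum(isolation_depth(point, data) for _ in range(trees)) / trees

random.seed(0)
cluster = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100)]
# The far-away point isolates in far fewer cuts than a cluster member.
print(anomaly_score((8.0, 8.0), cluster) < anomaly_score(cluster[0], cluster))  # True
```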

Journal ArticleDOI
TL;DR: Evaluating occupancy trends for 511 populations of terrestrial mammals and birds, representing 244 species from 15 tropical forest protected areas on three continents, finds that occupancy declined in 22%, increased in 17%, and exhibited no change in 22% of populations during the last 3–8 years, while 39% of populations were detected too infrequently to assess occupancy changes.
Abstract: Extinction rates in the Anthropocene are three orders of magnitude higher than background and disproportionately occur in the tropics, home of half the world’s species. Despite global efforts to combat tropical species extinctions, lack of high-quality, objective information on tropical biodiversity has hampered quantitative evaluation of conservation strategies. In particular, the scarcity of population-level monitoring in tropical forests has stymied assessment of biodiversity outcomes, such as the status and trends of animal populations in protected areas. Here, we evaluate occupancy trends for 511 populations of terrestrial mammals and birds, representing 244 species from 15 tropical forest protected areas on three continents. For the first time to our knowledge, we use annual surveys from tropical forests worldwide that employ a standardized camera trapping protocol, and we compute data analytics that correct for imperfect detection. We found that occupancy declined in 22%, increased in 17%, and exhibited no change in 22% of populations during the last 3–8 years, while 39% of populations were detected too infrequently to assess occupancy changes. Despite extensive variability in occupancy trends, these 15 tropical protected areas have not exhibited systematic declines in biodiversity (i.e., occupancy, richness, or evenness) at the community level. Our results differ from reports of widespread biodiversity declines based on aggregated secondary data and expert opinion and suggest less extreme deterioration in tropical forest protected areas. We simultaneously fill an important conservation data gap and demonstrate the value of large-scale monitoring infrastructure and powerful analytics, which can be scaled to incorporate additional sites, ecosystems, and monitoring methods. 
In an era of catastrophic biodiversity loss, robust indicators produced from standardized monitoring infrastructure are critical to accurately assess population outcomes and identify conservation strategies that can avert biodiversity collapse.

Proceedings Article
13 Aug 2016
TL;DR: The 2016 ACM Conference on Knowledge Discovery and Data Mining (KDD'16) attracted a significant number of submissions from countries all over the world; in particular, the research track attracted 784 submissions and the applied data science track 331.
Abstract: It is our great pleasure to welcome you to the 2016 ACM Conference on Knowledge Discovery and Data Mining -- KDD'16. We hope that the content and the professional network at KDD'16 will help you succeed professionally by enabling you to: identify technology trends early; make new/creative contributions; increase your productivity by using newer/better tools, processes or ways of organizing teams; identify new job opportunities; and hire new team members. We are living in an exciting time for our profession. On the one hand, we are witnessing the industrialization of data science, and the emergence of the industrial assembly line processes characterized by the division of labor, integrated processes/pipelines of work, standards, automation, and repeatability. Data science practitioners are organizing themselves in more sophisticated ways, embedding themselves in larger teams in many industry verticals, improving their productivity substantially, and achieving a much larger scale of social impact. On the other hand we are also witnessing astonishing progress from research in algorithms and systems -- for example the field of deep neural networks has revolutionized speech recognition, NLP, computer vision, image recognition, etc. By facilitating interaction between practitioners at large companies & startups on the one hand, and the algorithm development researchers including leading academics on the other, KDD'16 fosters technological and entrepreneurial innovation in the area of data science. This year's conference continues its tradition of being the premier forum for presentation of results in the field of data mining, both in the form of cutting edge research, and in the form of insights from the development and deployment of real world applications. Further, the conference continues with its tradition of a strong tutorial and workshop program on leading edge issues of data mining. 
The mission of this conference has broadened in recent years even as we placed a significant amount of focus on both the research and applied aspects of data mining. As an example of this broadened focus, this year we have introduced a strong hands-on tutorial program during the conference in which participants will learn how to use practical tools for data mining. KDD'16 also gives researchers and practitioners a unique opportunity to form professional networks, and to share their perspectives with others interested in the various aspects of data mining. For example, we have introduced office hours for budding entrepreneurs from our community to meet leading Venture Capitalists investing in this area. We hope that the KDD 2016 conference will serve as a meeting ground for researchers, practitioners, funding agencies, and investors to help create new algorithms and commercial products. The call for papers attracted a significant number of submissions from countries all over the world. In particular, the research track attracted 784 submissions and the applied data science track attracted 331 submissions. Papers were accepted either as full papers or as posters. The overall acceptance rate either as full papers or posters was less than 20%. For full papers in the research track, the acceptance rate was lower than 10%. This is consistent with the fact that the KDD Conference is a premier conference in data mining and the acceptance rates historically tend to be low. It is noteworthy that the applied data science track received a larger number of submissions compared to previous years. We view this as an encouraging sign that research in data mining is increasingly becoming relevant to industrial applications. All papers were reviewed by at least three program committee members and then discussed by the PC members in a discussion moderated by a meta-reviewer. Borderline papers were thoroughly reviewed by the program chairs before final decisions were made.

Journal ArticleDOI
TL;DR: Improving monitoring strategies will allow a better understanding of the role of forest dynamics in climate-change mitigation, adaptation, and carbon cycle feedbacks, thereby reducing uncertainties in models of the key processes in the carbon cycle.
Abstract: Tropical forests harbor a significant portion of global biodiversity and are a critical component of the climate system. Reducing deforestation and forest degradation contributes to global climate-change mitigation efforts, yet emissions and removals from forest dynamics are still poorly quantified. We reviewed the main challenges to estimate changes in carbon stocks and biodiversity due to degradation and recovery of tropical forests, focusing on three main areas: (1) the combination of field surveys and remote sensing; (2) evaluation of biodiversity and carbon values under a unified strategy; and (3) research efforts needed to understand and quantify forest degradation and recovery. The improvement of models and estimates of changes of forest carbon can foster process-oriented monitoring of forest dynamics, including different variables and using spatially explicit algorithms that account for regional and local differences, such as variation in climate, soil, nutrient content, topography, biodiversity, disturbance history, recovery pathways, and socioeconomic factors. Generating the data for these models requires affordable large-scale remote-sensing tools associated with a robust network of field plots that can generate spatially explicit information on a range of variables through time. By combining ecosystem models, multiscale remote sensing, and networks of field plots, we will be able to evaluate forest degradation and recovery and their interactions with biodiversity and carbon cycling. Improving monitoring strategies will allow a better understanding of the role of forest dynamics in climate-change mitigation, adaptation, and carbon cycle feedbacks, thereby reducing uncertainties in models of the key processes in the carbon cycle, including their impacts on biodiversity, which are fundamental to support forest governance policies, such as Reducing Emissions from Deforestation and Forest Degradation.

Journal ArticleDOI
TL;DR: It is shown that NOMP achieves near-optimal performance under a variety of conditions, and is compared with classical algorithms such as MUSIC and more recent Atomic norm Soft Thresholding and Lasso algorithms, both in terms of frequency estimation accuracy and run time.
Abstract: We propose a fast sequential algorithm for the fundamental problem of estimating frequencies and amplitudes of a noisy mixture of sinusoids. The algorithm is a natural generalization of Orthogonal Matching Pursuit (OMP) to the continuum using Newton refinements, and hence is termed Newtonized OMP (NOMP). Each iteration consists of two phases: detection of a new sinusoid, and sequential Newton refinements of the parameters of already detected sinusoids. The refinements play a critical role in two ways: 1) sidestepping the potential basis mismatch from discretizing a continuous parameter space and 2) providing feedback for locally refining parameters estimated in previous iterations. We characterize convergence and provide a constant false alarm rate (CFAR) based termination criterion. By benchmarking against the Cramer–Rao Bound, we show that NOMP achieves near-optimal performance under a variety of conditions. We compare the performance of NOMP with classical algorithms such as MUSIC and more recent Atomic norm Soft Thresholding (AST) and Lasso algorithms, both in terms of frequency estimation accuracy and run time.
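A single NOMP-style iteration, coarse detection of the strongest sinusoid on an oversampled DFT grid followed by Newton refinement of its frequency off the grid, can be sketched as follows. This is a simplified illustration for one noiseless complex sinusoid; amplitude refinement, cyclic refinement of earlier detections, and the CFAR stopping rule are omitted:

```python
import cmath

def detect_and_refine(x, grid_size=256, newton_steps=3):
    """Detect the strongest sinusoid in samples x, then Newton-refine it.

    Maximizes G(w) = |X(w)|^2 where X(w) = sum_n x[n] * exp(-i w n).
    The coarse grid sidesteps most of the basis mismatch; Newton steps
    remove the rest (the role the refinements play in NOMP).
    """
    def derivs(w):
        X = Xp = Xpp = 0j
        for n, xn in enumerate(x):
            e = cmath.exp(-1j * w * n)
            X += xn * e
            Xp += -1j * n * xn * e       # dX/dw
            Xpp += -(n * n) * xn * e     # d2X/dw2
        g1 = 2 * (Xp * X.conjugate()).real                     # G'(w)
        g2 = 2 * ((Xpp * X.conjugate()).real + abs(Xp) ** 2)   # G''(w)
        return g1, g2

    # Coarse detection on an oversampled DFT grid.
    grid = [2 * cmath.pi * k / grid_size for k in range(grid_size)]
    w = max(grid, key=lambda wk: abs(sum(
        xn * cmath.exp(-1j * wk * n) for n, xn in enumerate(x))))
    # Newton refinement off the grid (step only where G is locally concave).
    for _ in range(newton_steps):
        g1, g2 = derivs(w)
        if g2 < 0:
            w -= g1 / g2
    return w

true_w = 1.2345
x = [cmath.exp(1j * true_w * n) for n in range(64)]
w_hat = detect_and_refine(x)
print(abs(w_hat - true_w) < 1e-6)  # True: refined well past the grid spacing
```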

Proceedings ArticleDOI
08 Sep 2016
TL;DR: It is shown that the combination of three techniques (LVCSR-initialization, multi-task training, and weighted cross-entropy) gives the best results, with a significantly lower False Alarm Rate than the LVCSR-initialization technique alone, across a wide range of Miss Rates.
Abstract: We propose improved Deep Neural Network (DNN) training loss functions for more accurate single keyword spotting on resource-constrained embedded devices. The loss function modifications consist of a combination of multi-task training and weighted cross entropy. In the multi-task architecture, the keyword DNN acoustic model is trained with two tasks in parallel: the main task of predicting the keyword-specific phone states, and an auxiliary task of predicting LVCSR senones. We show that multi-task learning leads to comparable accuracy over a previously proposed transfer learning approach where the keyword DNN training is initialized by an LVCSR DNN of the same input and hidden layer sizes. The combination of LVCSR-initialization and multi-task training gives improved keyword detection accuracy compared to either technique alone. We also propose modifying the loss function to give a higher weight to input frames corresponding to keyword phone targets, with a motivation to balance the keyword and background training data. We show that weighted cross-entropy results in additional accuracy improvements. Finally, we show that the combination of three techniques (LVCSR-initialization, multi-task training, and weighted cross-entropy) gives the best results, with a significantly lower False Alarm Rate than the LVCSR-initialization technique alone, across a wide range of Miss Rates.
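The weighted cross-entropy modification described above can be sketched as follows. The dictionary-based frame representation and the weight value of 5.0 are assumptions for illustration, not the paper's settings:

```python
import math

def weighted_cross_entropy(probs, targets, keyword_states, kw_weight=5.0):
    """Frame-level cross-entropy with extra weight on keyword frames.

    probs: per-frame dicts mapping phone-state -> predicted probability.
    targets: per-frame target state labels.
    keyword_states: set of states belonging to the keyword. Frames whose
    target is a keyword state get weight kw_weight, rebalancing scarce
    keyword frames against abundant background frames.
    """
    total = 0.0
    for p, t in zip(probs, targets):
        w = kw_weight if t in keyword_states else 1.0
        total += -w * math.log(p[t])
    return total / len(targets)

# One keyword frame and one background frame:
probs = [{"kw1": 0.7, "bg": 0.3}, {"kw1": 0.2, "bg": 0.8}]
targets = ["kw1", "bg"]
loss = weighted_cross_entropy(probs, targets, {"kw1"})
print(round(loss, 4))  # 1.0033
```

With kw_weight=1.0 this reduces to ordinary frame-level cross-entropy, so the modification is a strict generalization of the standard loss.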

Journal ArticleDOI
Michelle O. Johnson1, David W. Galbraith1, Manuel Gloor1, Hannes De Deurwaerder2, Matthieu Guimberteau3, Matthieu Guimberteau4, Anja Rammig5, Anja Rammig6, Kirsten Thonicke5, Hans Verbeeck2, Celso von Randow7, Abel Monteagudo, Oliver L. Phillips1, Roel J. W. Brienen1, Ted R. Feldpausch8, Gabriela Lopez Gonzalez1, Sophie Fauset1, Carlos A. Quesada, Bradley O. Christoffersen9, Bradley O. Christoffersen10, Philippe Ciais4, Gilvan Sampaio7, Bart Kruijt11, Patrick Meir10, Patrick Meir12, Paul R. Moorcroft13, Ke Zhang14, Esteban Álvarez-Dávila, Atila Alves de Oliveira, Iêda Leão do Amaral, Ana Andrade, Luiz E. O. C. Aragão, Alejandro Araujo-Murakami15, Eric Arets11, Luzmila Arroyo15, Gerardo Aymard, Christopher Baraloto16, Jocely Barroso17, Damien Bonal18, René G. A. Boot19, José Luís Camargo, Jérôme Chave20, Álvaro Cogollo, Fernando Cornejo Valverde21, Antonio Carlos Lola da Costa22, Anthony Di Fiore23, Leandro Valle Ferreira24, Niro Higuchi, Euridice Honorio, Timothy J. Killeen25, Susan G. Laurance26, William F. Laurance26, Juan Carlos Licona, Thomas E. Lovejoy27, Yadvinder Malhi28, Bia Marimon29, Ben Hur Marimon Junior29, Darley C.L. Matos24, Casimiro Mendoza, David A. Neill, Guido Pardo, Marielos Peña-Claros11, Nigel C. A. Pitman30, Lourens Poorter11, Adriana Prieto31, Hirma Ramírez-Angulo32, Anand Roopsind33, Agustín Rudas31, Rafael de Paiva Salomão24, Marcos Silveira17, Juliana Stropp34, Hans ter Steege35, John Terborgh30, Raquel Thomas33, Marisol Toledo, Armando Torres-Lezama32, Geertje M. F. van der Heijden36, Rodolfo Vasquez8, Ima Célia Guimarães Vieira24, Emilio Vilanova32, Vincent A. Vos, Timothy R. Baker1 
TL;DR: It is found that woody NPP is not correlated with stem mortality rates and is weakly positively correlated with AGB, and across the four models, basin‐wide average AGB is similar to the mean of the observations.
Abstract: Understanding the processes that determine above-ground biomass (AGB) in Amazonian forests is important for predicting the sensitivity of these ecosystems to environmental change and for designing and evaluating dynamic global vegetation models (DGVMs). AGB is determined by inputs from woody productivity [woody net primary productivity (NPP)] and the rate at which carbon is lost through tree mortality. Here, we test whether two direct metrics of tree mortality (the absolute rate of woody biomass loss and the rate of stem mortality) and/or woody NPP, control variation in AGB among 167 plots in intact forest across Amazonia. We then compare these relationships and the observed variation in AGB and woody NPP with the predictions of four DGVMs. The observations show that stem mortality rates, rather than absolute rates of woody biomass loss, are the most important predictor of AGB, which is consistent with the importance of stand size structure for determining spatial variation in AGB. The relationship between stem mortality rates and AGB varies among different regions of Amazonia, indicating that variation in wood density and height/diameter relationships also influences AGB. In contrast to previous findings, we find that woody NPP is not correlated with stem mortality rates and is weakly positively correlated with AGB. Across the four models, basin-wide average AGB is similar to the mean of the observations. However, the models consistently overestimate woody NPP and poorly represent the spatial patterns of both AGB and woody NPP estimated using plot data. In marked contrast to the observations, DGVMs typically show strong positive relationships between woody NPP and AGB. Resolving these differences will require incorporating forest size structure, mechanistic models of stem mortality and variation in functional composition in DGVMs.

Journal ArticleDOI
TL;DR: An innovative spatially explicit modelling approach capable of representing alternative pathways of clear-cut deforestation, secondary vegetation dynamics, and old-growth forest degradation is developed, and net deforestation-driven carbon emissions are estimated for the different scenarios.
Abstract: Following an intense occupation process that was initiated in the 1960s, deforestation rates in the Brazilian Amazon have decreased significantly since 2004, stabilizing around 6000 km² yr⁻¹ in the last 5 years. A convergence of conditions contributed to this, including the creation of protected areas, the use of effective monitoring systems, and credit restriction mechanisms. Nevertheless, other threats remain, including the rapidly expanding global markets for agricultural commodities, large-scale transportation and energy infrastructure projects, and weak institutions. We propose three updated qualitative and quantitative land-use scenarios for the Brazilian Amazon, including a normative 'Sustainability' scenario in which we envision major socio-economic, institutional, and environmental achievements in the region. We developed an innovative spatially explicit modelling approach capable of representing alternative pathways of clear-cut deforestation, secondary vegetation dynamics, and old-growth forest degradation. We use the computational models to estimate net deforestation-driven carbon emissions for the different scenarios. The region would become a sink of carbon after 2020 in a scenario of residual deforestation (~1000 km² yr⁻¹) combined with a change in the current dynamics of the secondary vegetation, i.e. a forest-transition scenario. However, our results also show that a continuation of the current situation of relatively low deforestation rates and a short life cycle of the secondary vegetation would maintain the region as a source of CO2, even if a large portion of the deforested area is covered by secondary vegetation. For the old-growth forest degradation process, we estimated average gross emissions corresponding to 47% of clear-cut deforestation from 2007 to 2013 (using the DEGRAD system data), although the aggregate effects of post-disturbance regeneration can partially offset these emissions. Both processes (secondary vegetation and forest degradation) need to be better understood, as they will potentially play a decisive role in the future regional carbon balance.

Proceedings ArticleDOI
01 Dec 2016
TL;DR: This work proposes a max-pooling based loss function for training Long Short-Term Memory networks for small-footprint keyword spotting (KWS), with low CPU, memory, and latency requirements; results show that LSTM models trained using cross-entropy loss or max-pooling loss outperform a cross-entropy loss trained baseline feed-forward Deep Neural Network (DNN).
Abstract: We propose a max-pooling based loss function for training Long Short-Term Memory (LSTM) networks for small-footprint keyword spotting (KWS), with low CPU, memory, and latency requirements. The max-pooling loss training can be further guided by initializing with a cross-entropy loss trained network. A posterior smoothing based evaluation approach is employed to measure keyword spotting performance. Our experimental results show that LSTM models trained using cross-entropy loss or max-pooling loss outperform a cross-entropy loss trained baseline feed-forward Deep Neural Network (DNN). In addition, a max-pooling loss trained LSTM with a randomly initialized network performs better than a cross-entropy loss trained LSTM. Finally, the max-pooling loss trained LSTM initialized with a cross-entropy pre-trained network shows the best performance, which yields a 67.6% relative reduction compared to the baseline feed-forward DNN in Area Under the Curve (AUC) measure.
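The max-pooling loss in the abstract above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's implementation: the function name, the sigmoid output layer, and the numerical-stability constant are all assumptions.

```python
import numpy as np

def max_pooling_loss(frame_logits, contains_keyword):
    """Sketch of a max-pooling loss for keyword spotting.

    frame_logits: (T,) raw per-frame scores for the keyword class.
    contains_keyword: True if the utterance contains the keyword.
    For positive utterances, the loss depends only on the single frame
    with the highest keyword posterior (the "max pool"); for negative
    utterances, every frame is penalized.
    """
    probs = 1.0 / (1.0 + np.exp(-frame_logits))       # sigmoid posteriors
    if contains_keyword:
        return float(-np.log(probs.max() + 1e-12))    # best frame only
    return float(-np.log(1.0 - probs + 1e-12).mean()) # all frames penalized
```

Intuitively, the network is free to fire anywhere within a positive utterance, which suits unaligned keyword data better than per-frame cross-entropy.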

Patent
29 Mar 2016
TL;DR: In this paper, a system that is capable of controlling multiple entertainment systems and/or speakers using voice commands is described, where the system receives voice commands and may determine audio sources and speakers indicated by voice commands.
Abstract: A system that is capable of controlling multiple entertainment systems and/or speakers using voice commands. The system receives voice commands and may determine audio sources and speakers indicated by the voice commands. The system may generate audio data from the audio sources and may send the audio data to the speakers using multiple interfaces. For example, the system may send the audio data directly to the speakers using a network address, may send the audio data to the speakers via a voice-enabled device or may send the audio data to the speakers via a speaker controller. The system may generate output zones including multiple speakers and may associate input devices with speakers within the output zones. For example, the system may receive a voice command from an input device in an output zone and may reduce output audio generated by speakers in the output zone.
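The output-zone bookkeeping described in this patent abstract can be illustrated with a toy sketch; every name and value below is invented, and a real controller would of course manage network interfaces rather than a dictionary.

```python
# Zones group speakers, input devices are associated with a zone, and a
# voice command arriving from an input device reduces the output audio
# of the speakers in that device's zone.
zones = {"kitchen": ["spk1", "spk2"], "den": ["spk3"]}
inputs = {"echo1": "kitchen"}                     # input device -> its zone
volume = {"spk1": 1.0, "spk2": 1.0, "spk3": 1.0}

def on_voice_command(device, duck=0.3):
    """Duck every speaker in the zone the commanding device belongs to."""
    for spk in zones[inputs[device]]:
        volume[spk] = duck

on_voice_command("echo1")
print(volume)  # kitchen speakers ducked, den speaker untouched
```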

Book ChapterDOI
08 Oct 2016
TL;DR: A novel approach to mutually voting for relevant Web images and video frames, where two forces are balanced, i.e. aggressive matching and passive video frame selection is proposed and validated on three large-scale video recognition datasets.
Abstract: Video recognition usually requires a large amount of training samples, which are expensive to collect. An alternative and cheap solution is to draw from the large-scale images and videos on the Web. With modern search engines, the top-ranked images or videos are usually highly correlated with the query, implying the potential to harvest labeling-free Web images and videos for video recognition. However, there are two key difficulties that prevent us from using the Web data directly. First, they are typically noisy and may be from a completely different domain from that of users' interest (e.g. cartoons). Second, Web videos are usually untrimmed and very lengthy, where some query-relevant frames are often hidden in between the irrelevant ones. A question thus naturally arises: to what extent can such noisy Web images and videos be utilized for labeling-free video recognition? In this paper, we propose a novel approach to mutually voting for relevant Web images and video frames, where two forces are balanced, i.e. aggressive matching and passive video frame selection. We validate our approach on three large-scale video recognition datasets.

Proceedings ArticleDOI
26 Jun 2016
TL;DR: This paper presents ERMIA, a memory-optimized database system built from scratch to cater to the needs of heterogeneous workloads that include read-mostly transactions; it adopts snapshot isolation concurrency control to coordinate heterogeneous transactions and provides serializability when desired.
Abstract: Large main memories and massively parallel processors have triggered not only a resurgence of high-performance transaction processing systems optimized for this hardware, but also an increasing demand for processing heterogeneous workloads that include read-mostly transactions. Many modern transaction processing systems adopt a lightweight optimistic concurrency control (OCC) scheme to leverage its low overhead in low-contention workloads. However, we observe that lightweight OCC is not suitable for heterogeneous workloads, causing significant starvation of read-mostly transactions and overall performance degradation. In this paper, we present ERMIA, a memory-optimized database system built from scratch to cater to the need of handling heterogeneous workloads. ERMIA adopts snapshot isolation concurrency control to coordinate heterogeneous transactions and provides serializability when desired. Its physical layer supports the concurrency control schemes in a scalable way. Experimental results show that ERMIA delivers comparable or superior performance and near-linear scalability in a variety of workloads, compared to a recent lightweight OCC-based system. At the same time, ERMIA maintains high throughput on read-mostly transactions when the performance of the OCC-based system drops by orders of magnitude.
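Snapshot isolation, the concurrency scheme the abstract builds on, can be sketched with a minimal multi-version store. This is a generic textbook illustration under simplifying assumptions (a single global clock, no write-conflict detection), not ERMIA's engine.

```python
class MVStore:
    """Toy multi-version store: each transaction reads the newest version
    committed before it began, so long readers never block writers."""

    def __init__(self):
        self.versions = {}   # key -> list of (commit_ts, value), ts ascending
        self.clock = 0

    def write(self, key, value):
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))

    def begin(self):
        return self.clock    # the transaction's snapshot timestamp

    def read(self, key, snapshot_ts):
        # newest version committed at or before the snapshot
        visible = [v for ts, v in self.versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None

db = MVStore()
db.write("x", 1)
ts = db.begin()            # a read-mostly transaction takes its snapshot
db.write("x", 2)           # a concurrent writer commits a newer version
print(db.read("x", ts))    # the reader still sees the value 1
```

This is why read-mostly transactions do not starve under snapshot isolation: they read a consistent snapshot instead of repeatedly validating against concurrent writers as in OCC.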

Proceedings ArticleDOI
08 Sep 2016
TL;DR: Two ways to improve deep neural network acoustic models for keyword spotting without increasing CPU usage by using low-rank weight matrices throughout the DNN and knowledge distilled from an ensemble of much larger DNNs used only during training are investigated.
Abstract: Several consumer speech devices feature voice interfaces that perform on-device keyword spotting to initiate user interactions. Accurate on-device keyword spotting within a tight CPU budget is crucial for such devices. Motivated by this, we investigated two ways to improve deep neural network (DNN) acoustic models for keyword spotting without increasing CPU usage. First, we used low-rank weight matrices throughout the DNN. This allowed us to increase representational power by increasing the number of hidden nodes per layer without changing the total number of multiplications. Second, we used knowledge distilled from an ensemble of much larger DNNs used only during training. We systematically evaluated these two approaches on a massive corpus of far-field utterances. Each technique alone improves performance, and together they give significant reductions in false alarms and misses without increasing CPU or memory usage.
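The low-rank trick in the abstract can be sketched numerically. The dimensions and rank below are invented for illustration, not the paper's configuration: factoring a dense layer's m×n weight matrix W into A (m×r) times B (r×n) cuts the multiply count from m·n to r·(m+n) when r is small.

```python
import numpy as np

m, n, r = 512, 512, 64
rng = np.random.default_rng(0)
A = rng.standard_normal((m, r))   # left factor
B = rng.standard_normal((r, n))   # right factor
x = rng.standard_normal(n)        # layer input

full_multiplies = m * n           # dense layer: W @ x
lowrank_multiplies = r * (n + m)  # factored layer: A @ (B @ x)

y = A @ (B @ x)                   # forward pass through the factored layer
print(full_multiplies, lowrank_multiplies)  # the factored layer is 4x cheaper
```

The saved multiplications can then be spent on wider layers, which is how the paper increases representational power at constant CPU cost.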

Proceedings Article
19 Jun 2016
TL;DR: In this article, an adaptive online gradient descent algorithm was proposed to solve online convex optimization problems with long-term constraints, which are constraints that need to be satisfied when accumulated over a finite number of rounds T, but can be violated in intermediate rounds.
Abstract: We present an adaptive online gradient descent algorithm to solve online convex optimization problems with long-term constraints, which are constraints that need to be satisfied when accumulated over a finite number of rounds T, but can be violated in intermediate rounds. For some user-defined trade-off parameter β ∈ (0, 1), the proposed algorithm achieves cumulative regret bounds of O(T^max{β, 1−β}) and O(T^(1−β/2)), respectively, for the loss and the constraint violations. Our results hold for convex losses, can handle arbitrary convex constraints and rely on a single computationally efficient algorithm. Our contributions generalize over the best known cumulative regret bounds of Mahdavi et al. (2012a), which are respectively O(T^(1/2)) and O(T^(3/4)) for general convex domains, and respectively O(T^(2/3)) and O(T^(2/3)) when the domain is further restricted to be a polyhedral set. We supplement the analysis with experiments validating the performance of our algorithm in practice.
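The mechanism behind long-term constraints can be illustrated with a toy 1-D primal-dual gradient descent. The step sizes, the specific loss, and the constraint below are invented for illustration and do not reproduce the paper's algorithm or its regret rates; they only show how a dual variable that accumulates violation pulls the iterate back toward feasibility over time.

```python
import numpy as np

# Loss f_t(x) = (x - 1)^2 pushes x toward 1; the long-term constraint
# g(x) = x - 0.5 <= 0 must hold on average over T rounds, not per round.
T = 2000
eta = 1.0 / np.sqrt(T)
x, lam = 0.0, 0.0
total_violation = 0.0
for t in range(T):
    grad_f = 2.0 * (x - 1.0)
    grad_g = 1.0                       # gradient of g(x) = x - 0.5
    x = float(np.clip(x - eta * (grad_f + lam * grad_g), -2.0, 2.0))
    g = x - 0.5
    total_violation += max(g, 0.0)
    lam = max(lam + eta * g, 0.0)      # dual ascent on accumulated violation
print(total_violation / T)             # average violation is small
```

Early rounds may violate the constraint freely; as `lam` grows, the primal step is increasingly penalized and the iterate settles near the constraint boundary x = 0.5.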

Journal ArticleDOI
TL;DR: Leaf phenology drives the dry season green-up detected by MODIS in the Central Amazon and is consistent with evolutionary strategies to couple photosynthetic efficiency with light availability and to avoid predation and disease on vulnerable young leaves.

Posted Content
TL;DR: This article study the problem of measuring group differences in choices when the dimensionality of the choice set is large and propose an estimator that applies recent advances in machine learning to address this bias.
Abstract: We study the problem of measuring group differences in choices when the dimensionality of the choice set is large. We show that standard approaches suffer from a severe finite-sample bias, and we propose an estimator that applies recent advances in machine learning to address this bias. We apply this method to measure trends in the partisanship of congressional speech from 1873 to 2016, defining partisanship to be the ease with which an observer could infer a congressperson’s party from a single utterance. Our estimates imply that partisanship is far greater in recent years than in the past, and that it increased sharply in the early 1990s after remaining low and relatively constant over the preceding century.
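The partisanship notion above, the ease with which an observer infers a speaker's party from one utterance, can be illustrated with a toy plug-in calculation. The phrase counts below are invented; the paper's actual estimator exists precisely to correct the severe finite-sample bias of naive versions like this one.

```python
# Invented phrase-by-party counts for illustration only.
counts = {
    "death tax":    {"R": 90, "D": 10},   # strongly partisan phrase
    "middle class": {"R": 48, "D": 52},   # nearly neutral phrase
}

def prob_correct_guess(phrase):
    """Probability an observer guesses the party correctly after hearing
    one phrase, assuming the observer picks the majority party."""
    c = counts[phrase]
    return max(c["R"], c["D"]) / (c["R"] + c["D"])

print(prob_correct_guess("death tax"), prob_correct_guess("middle class"))
```

A corpus dominated by phrases like the first yields high measured partisanship; one dominated by phrases like the second yields a value near the chance level of 0.5.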

Patent
28 Dec 2016
TL;DR: In this paper, the authors describe an unmanned aerial vehicle (UAV) configured to autonomously deliver items of inventory to various destinations, where the UAV may receive inventory information and a destination location and autonomously retrieve the inventory from a location within a materials handling facility, compute a route from the materials handling facility to a destination and travel to the destination to deliver the inventory.
Abstract: This disclosure describes an unmanned aerial vehicle ("UAV") configured to autonomously deliver items of inventory to various destinations. The UAV may receive inventory information and a destination location and autonomously retrieve the inventory from a location within a materials handling facility, compute a route from the materials handling facility to a destination, and travel to the destination to deliver the inventory.

Patent
29 Jun 2016
TL;DR: In this article, a system configured to process speech commands may classify incoming audio as desired speech, undesired speech, or non-speech, where desired speech is speech that is from a same speaker as reference speech.
Abstract: A system configured to process speech commands may classify incoming audio as desired speech, undesired speech, or non-speech. Desired speech is speech that is from a same speaker as reference speech. The reference speech may be obtained from a configuration session or from a first portion of input speech that includes a wakeword. The reference speech may be encoded using a recurrent neural network (RNN) encoder to create a reference feature vector. The reference feature vector and incoming audio data may be processed by a trained neural network classifier to label the incoming audio data (for example, frame-by-frame) as to whether each frame is spoken by the same speaker as the reference speech. The labels may be passed to an automatic speech recognition (ASR) component which may allow the ASR component to focus its processing on the desired speech.
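The frame-labeling step in this patent abstract can be sketched as a similarity test against the reference feature vector. The patent uses an RNN encoder and a trained neural classifier; here the embeddings are simply given, and the cosine-similarity threshold is an invented stand-in for the classifier.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def label_frames(reference_vec, frame_vecs, threshold=0.7):
    """Label each incoming frame embedding 'desired' if it is similar
    enough to the reference speaker's feature vector, else 'other'."""
    return ["desired" if cosine(reference_vec, f) >= threshold else "other"
            for f in frame_vecs]
```

Downstream, an ASR component could then restrict its processing to the frames labeled 'desired', as the abstract describes.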

Journal ArticleDOI
TL;DR: Eulerian Video Magnification is a computational technique for visualizing subtle color and motion variations in ordinary videos by making the variations larger, a microscope for small changes that are hard or impossible for us to see by ourselves.
Abstract: The world is filled with important, but visually subtle signals. A person's pulse, the breathing of an infant, the sag and sway of a bridge---these all create visual patterns, which are too difficult to see with the naked eye. We present Eulerian Video Magnification, a computational technique for visualizing subtle color and motion variations in ordinary videos by making the variations larger. It is a microscope for small changes that are hard or impossible for us to see by ourselves. In addition, these small changes can be quantitatively analyzed and used to recover sounds from vibrations in distant objects, characterize material properties, and remotely measure a person's pulse.
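The Eulerian idea can be sketched on a single pixel's intensity over time: temporally band-pass the signal and add the amplified band back. The published method operates on spatial pyramids with more careful temporal filtering; the FFT mask and the parameters below are illustrative assumptions.

```python
import numpy as np

def magnify(signal, fps, lo, hi, alpha):
    """Band-pass a 1-D intensity time series between lo and hi Hz with an
    FFT mask, then add the amplified band back to the original signal."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    band = spectrum * ((freqs >= lo) & (freqs <= hi))  # keep only the band of interest
    subtle = np.fft.irfft(band, n=len(signal))         # the invisible variation
    return signal + alpha * subtle                     # magnified output
```

For a pulse-like signal (a faint ~1 Hz oscillation riding on a large constant intensity), the band-passed component is tiny, but multiplying it by alpha makes it plainly visible.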

Proceedings Article
01 Dec 2016
TL;DR: Phonological features are shown to outperform character-based models when drawn from PanPhon, a database relating over 5,000 IPA segments to 21 subsegmental articulatory features, which boosts performance in various NER-related tasks.
Abstract: This paper contributes to a growing body of evidence that, when coupled with appropriate machine-learning techniques, linguistically motivated, information-rich representations can outperform one-hot encodings of linguistic data. In particular, we show that phonological features outperform character-based models. PanPhon is a database relating over 5,000 IPA segments to 21 subsegmental articulatory features. We show that this database boosts performance in various NER-related tasks. Phonologically aware, neural CRF models built on PanPhon features are able to perform better on monolingual Spanish and Turkish NER tasks than character-based models. They have also been shown to work well in transfer models (as between Uzbek and Turkish). PanPhon features also contribute measurably to Orthography-to-IPA conversion tasks.
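The core idea behind PanPhon-style representations can be shown with a toy feature table. The three features and values below are a tiny hand-made subset for illustration, not PanPhon's actual 21-feature inventory.

```python
# Each IPA segment maps to a vector of articulatory features instead of
# a one-hot character ID. Feature values here: (voiced, nasal, labial).
FEATURES = {
    "p": (0, 0, 1),
    "b": (1, 0, 1),
    "m": (1, 1, 1),
    "t": (0, 0, 0),
}

def hamming(a, b):
    """Number of articulatory features on which two segments differ."""
    return sum(x != y for x, y in zip(FEATURES[a], FEATURES[b]))

# 'p' and 'b' differ in one feature (voicing) while 'p' and 'm' differ
# in two; a one-hot encoding would treat all segments as equidistant.
print(hamming("p", "b"), hamming("p", "m"))
```

This graded similarity is what lets a model generalize across related segments, e.g. in transfer between Uzbek and Turkish.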