
Showing papers by "Amazon.com" published in 2021


Journal ArticleDOI
TL;DR: In this paper, an emerging technique called algorithm unrolling, or unfolding, is shown to offer promise in addressing the interpretability and data-efficiency issues of deep networks by providing a concrete and systematic connection between iterative algorithms that are widely used in signal processing and deep neural networks.
Abstract: Deep neural networks provide unprecedented performance gains in many real-world problems in signal and image processing. Despite these gains, the future development and practical deployment of deep networks are hindered by their black-box nature, i.e., a lack of interpretability and the need for very large training sets. An emerging technique called algorithm unrolling, or unfolding, offers promise in eliminating these issues by providing a concrete and systematic connection between iterative algorithms that are widely used in signal processing and deep neural networks. Unrolling methods were first proposed to develop fast neural network approximations for sparse coding. More recently, this direction has attracted enormous attention, and it is rapidly growing in both theoretic investigations and practical applications. The increasing popularity of unrolled deep networks is due, in part, to their potential in developing efficient, high-performance (yet interpretable) network architectures from reasonably sized training sets.

377 citations
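
To make the unrolling idea concrete, here is a minimal LISTA-style sketch in which each layer of a network mimics one iteration of ISTA for sparse coding, with the step matrices and thresholds promoted to learnable parameters. The dimensions, depth, and initialization are illustrative assumptions, not the paper's exact construction.

```python
# Unrolled ISTA (LISTA-style): depth iterations become depth layers
# whose weights are learned from data.
import torch
import torch.nn as nn

class UnrolledISTA(nn.Module):
    def __init__(self, m, n, depth=10):
        super().__init__()
        self.We = nn.Linear(m, n, bias=False)  # learned analogue of step * D^T
        self.S = nn.Linear(n, n, bias=False)   # learned analogue of I - step * D^T D
        self.theta = nn.Parameter(torch.full((depth,), 0.1))  # per-layer thresholds
        self.depth = depth

    @staticmethod
    def soft(x, t):
        # soft-thresholding, the proximal operator of the L1 norm
        return torch.sign(x) * torch.clamp(x.abs() - t, min=0.0)

    def forward(self, y):
        z = self.soft(self.We(y), self.theta[0])
        for k in range(1, self.depth):
            z = self.soft(self.We(y) + self.S(z), self.theta[k])
        return z
```

Trained end-to-end on pairs of measurements and sparse codes, a few such layers can approximate many ISTA iterations, which is the source of the efficiency and interpretability claims above.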


Journal ArticleDOI
TL;DR: This work provides a thorough review of the development of this problem in recent decades, inspects recent advances in various aspects, and proposes some interesting directions for future research.

340 citations


Proceedings ArticleDOI
03 May 2021
TL;DR: The Speech processing Universal PERformance Benchmark (SUPERB) as discussed by the authors is a leaderboard to benchmark the performance of a shared model across a wide range of speech processing tasks with minimal architecture changes and labeled data.
Abstract: Self-supervised learning (SSL) has proven vital for advancing research in natural language processing (NLP) and computer vision (CV). The paradigm pretrains a shared model on large volumes of unlabeled data and achieves state-of-the-art (SOTA) for various tasks with minimal adaptation. However, the speech processing community lacks a similar setup to systematically explore the paradigm. To bridge this gap, we introduce Speech processing Universal PERformance Benchmark (SUPERB). SUPERB is a leaderboard to benchmark the performance of a shared model across a wide range of speech processing tasks with minimal architecture changes and labeled data. Among multiple usages of the shared model, we especially focus on extracting the representation learned from SSL due to its preferable re-usability. We present a simple framework to solve SUPERB tasks by learning task-specialized lightweight prediction heads on top of the frozen shared model. Our results demonstrate that the framework is promising as SSL representations show competitive generalizability and accessibility across SUPERB tasks. We release SUPERB as a challenge with a leaderboard and a benchmark toolkit to fuel the research in representation learning and general speech processing.

138 citations
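
The SUPERB recipe of freezing the shared SSL model and learning only lightweight task heads can be sketched as follows; `upstream` stands in for any pretrained speech encoder, and the mean-pooling and optimizer choices are assumptions rather than requirements of the benchmark.

```python
# Frozen upstream + lightweight prediction head, SUPERB-style.
import torch
import torch.nn as nn

def build_downstream(upstream: nn.Module, feat_dim: int, num_classes: int):
    for p in upstream.parameters():           # freeze the shared model
        p.requires_grad = False
    head = nn.Linear(feat_dim, num_classes)   # task-specialized lightweight head
    optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
    return head, optimizer

def training_step(upstream, head, wav, labels):
    with torch.no_grad():                     # upstream stays frozen
        feats = upstream(wav)                 # assumed shape: (batch, time, feat_dim)
    logits = head(feats.mean(dim=1))          # mean-pool over time for utterance tasks
    return nn.functional.cross_entropy(logits, labels)
```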


Posted ContentDOI
TL;DR: In this article, a graph neural network estimator for estimated time of arrival (ETA) is presented, which has been deployed in production at Google Maps and has shown promising results.
Abstract: Travel-time prediction constitutes a task of high importance in transportation networks, with web mapping services like Google Maps regularly serving vast quantities of travel time queries from users and enterprises alike. Further, such a task requires accounting for complex spatiotemporal interactions (modelling both the topological properties of the road network and anticipating events -- such as rush hours -- that may occur in the future). Hence, it is an ideal target for graph representation learning at scale. Here we present a graph neural network estimator for estimated time of arrival (ETA) which we have deployed in production at Google Maps. While our main architecture consists of standard GNN building blocks, we further detail the usage of training schedule methods such as MetaGradients in order to make our model robust and production-ready. We also provide prescriptive studies: ablating on various architectural decisions and training regimes, and qualitative analyses on real-world situations where our model provides a competitive edge. Our GNN proved powerful when deployed, significantly reducing negative ETA outcomes in several regions compared to the previous production baseline (40+% in cities like Sydney).

115 citations
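
For illustration, a toy message-passing layer built from the kind of standard GNN blocks mentioned above might look like the sketch below; the road-graph encoding, feature sizes, and GRU-style update are simplified assumptions, not the production Google Maps architecture.

```python
# One round of message passing over road segments (nodes) and their
# connections (directed edges).
import torch
import torch.nn as nn

class SegmentGNNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)   # message from (source, target) features
        self.upd = nn.GRUCell(dim, dim)      # node update from aggregated messages

    def forward(self, h, edges):
        # h: (num_segments, dim); edges: (num_edges, 2) long tensor of src->dst
        src, dst = edges[:, 0], edges[:, 1]
        m = torch.relu(self.msg(torch.cat([h[src], h[dst]], dim=-1)))
        agg = torch.zeros_like(h).index_add_(0, dst, m)  # sum messages per node
        return self.upd(agg, h)
```

Stacking a few such layers and reading out per-route predictions gives the basic supervised ETA setup; robustness techniques like MetaGradients sit on top of this training loop.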


Proceedings ArticleDOI
20 Jun 2021
TL;DR: In this paper, a semi-supervised approach for contemporary object detectors following the teacher-student dual model framework is proposed, where the exponential moving averaging strategy is used to update the teacher from the student online, and a light-weighted detection-specific data ensemble for the teacher to generate more reliable pseudo-labels.
Abstract: We propose a semi-supervised approach for contemporary object detectors following the teacher-student dual model framework. Our method is featured with 1) the exponential moving averaging strategy to update the teacher from the student online, 2) using plenty of region proposals and soft pseudo-labels as the student’s training targets, and 3) a lightweight detection-specific data ensemble for the teacher to generate more reliable pseudo-labels. Compared to the recent state-of-the-art – STAC, which uses hard labels on sparsely selected hard pseudo samples, the teacher in our model exposes richer information to the student with soft labels on many proposals. Our model achieves COCO-style AP of 53.04% on the VOC07 val set, 8.4% better than STAC, when using VOC12 as unlabeled data. On MS-COCO, it outperforms prior work when only a small percentage of data is taken as labeled. It also reaches 53.8% AP on MS-COCO test-dev with a 3.1% gain over the fully supervised ResNet-152 Cascade R-CNN, by tapping into unlabeled data of a similar size to the labeled data.

102 citations
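
The two core mechanics described above, an EMA-updated teacher and soft pseudo-labels as the student's targets, can be sketched as follows; the momentum value and the KL-style loss wiring are assumptions, not the paper's exact configuration.

```python
# Teacher-student semi-supervised training: EMA teacher + soft targets.
import torch

@torch.no_grad()
def ema_update(teacher, student, momentum=0.999):
    # teacher <- momentum * teacher + (1 - momentum) * student
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(momentum).add_(ps, alpha=1.0 - momentum)

def soft_pseudo_label_loss(teacher, student, unlabeled):
    with torch.no_grad():
        targets = torch.softmax(teacher(unlabeled), dim=-1)   # soft pseudo-labels
    log_probs = torch.log_softmax(student(unlabeled), dim=-1)
    return torch.nn.functional.kl_div(log_probs, targets, reduction='batchmean')
```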


Journal ArticleDOI
TL;DR: In this article, a computational method leveraging deep learning and molecular dynamics simulations enables the rapid discovery of antimicrobial peptides with low toxicity and with high potency against diverse Gram-positive and Gram-negative pathogens.
Abstract: The de novo design of antimicrobial therapeutics involves the exploration of a vast chemical repertoire to find compounds with broad-spectrum potency and low toxicity. Here, we report an efficient computational method for the generation of antimicrobials with desired attributes. The method leverages guidance from classifiers trained on an informative latent space of molecules modelled using a deep generative autoencoder, and screens the generated molecules using deep-learning classifiers as well as physicochemical features derived from high-throughput molecular dynamics simulations. Within 48 days, we identified, synthesized and experimentally tested 20 candidate antimicrobial peptides, of which two displayed high potency against diverse Gram-positive and Gram-negative pathogens (including multidrug-resistant Klebsiella pneumoniae) and a low propensity to induce drug resistance in Escherichia coli. Both peptides have low toxicity, as validated in vitro and in mice. We also show using live-cell confocal imaging that the bactericidal mode of action of the peptides involves the formation of membrane pores. The combination of deep learning and molecular dynamics may accelerate the discovery of potent and selective broad-spectrum antimicrobials.

98 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: This work proposes Supporting Clustering with Contrastive Learning (SCCL) – a novel framework to leverage contrastive learning to promote better separation in distance-based clustering – and demonstrates the effectiveness of SCCL in leveraging the strengths of both bottom-up instance discrimination and top-down clustering to achieve better intra-cluster and inter-cluster distances.
Abstract: Unsupervised clustering aims at discovering the semantic categories of data according to some distance measured in the representation space. However, different categories often overlap with each other in the representation space at the beginning of the learning process, which poses a significant challenge for distance-based clustering in achieving good separation between different categories. To this end, we propose Supporting Clustering with Contrastive Learning (SCCL) – a novel framework to leverage contrastive learning to promote better separation. We assess the performance of SCCL on short text clustering and show that SCCL significantly advances the state-of-the-art results on most benchmark datasets with 3%-11% improvement on Accuracy and 4%-15% improvement on Normalized Mutual Information. Furthermore, our quantitative analysis demonstrates the effectiveness of SCCL in leveraging the strengths of both bottom-up instance discrimination and top-down clustering to achieve better intra-cluster and inter-cluster distances when evaluated with the ground truth cluster labels.

95 citations
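
A rough sketch of the two objectives SCCL combines, instance-level contrastive learning on augmented pairs and a clustering loss, is shown below; the temperature, the sharpened auxiliary target, and the batch wiring follow common practice and are assumptions rather than the paper's exact settings.

```python
# Joint objective: instance discrimination (bottom-up) + clustering (top-down).
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.5):
    # z1, z2: (B, d) embeddings of two augmentations of the same batch
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=-1)
    sim = z @ z.t() / tau
    sim.fill_diagonal_(float('-inf'))                 # exclude self-similarity
    B = z1.size(0)
    targets = torch.cat([torch.arange(B, 2 * B), torch.arange(0, B)])
    return F.cross_entropy(sim, targets)              # positive = the other view

def cluster_loss(q):
    # q: (B, K) soft cluster assignments; sharpen toward an auxiliary target p
    p = (q ** 2) / q.sum(dim=0, keepdim=True)
    p = p / p.sum(dim=1, keepdim=True)
    return F.kl_div(q.clamp(min=1e-8).log(), p.detach(), reduction='batchmean')
```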


Proceedings ArticleDOI
03 Mar 2021
TL;DR: In this paper, an energy-based learning framework for scene graph generation is proposed to incorporate the structure of scene graphs in the output space, which allows models to learn efficiently from a small number of labels.
Abstract: Traditional scene graph generation methods are trained using cross-entropy losses that treat objects and relationships as independent entities. Such a formulation, however, ignores the structure in the output space, in an inherently structured prediction problem. In this work, we introduce a novel energy-based learning framework for generating scene graphs. The proposed formulation allows for efficiently incorporating the structure of scene graphs in the output space. This additional constraint in the learning framework acts as an inductive bias and allows models to learn efficiently from a small number of labels. We use the proposed energy-based framework to train existing state-of-the-art models and obtain a significant performance improvement, of up to 21% and 27%, on the Visual Genome [9] and GQA [5] benchmark datasets, respectively. Furthermore, we showcase the learning efficiency of the proposed framework by demonstrating superior performance in the zero- and few-shot settings where data is scarce.

94 citations


Journal ArticleDOI
07 Jun 2021
TL;DR: In this article, the authors found that between 2000 and 2019, most soybean expansion in South America was on pastures converted originally for cattle production, especially in the Brazilian Amazon; across the continent, 9% of forest loss was converted to soybean by 2016.
Abstract: A prominent goal of policies mitigating climate change and biodiversity loss is to achieve zero deforestation in the global supply chain of key commodities, such as palm oil and soybean. However, the extent and dynamics of deforestation driven by commodity expansion are largely unknown. Here we mapped annual soybean expansion in South America between 2000 and 2019 by combining satellite observations and sample field data. From 2000 to 2019, the area cultivated with soybean more than doubled from 26.4 Mha to 55.1 Mha. Most soybean expansion occurred on pastures originally converted from natural vegetation for cattle production. The most rapid expansion occurred in the Brazilian Amazon, where soybean area increased more than tenfold, from 0.4 Mha to 4.6 Mha. Across the continent, 9% of forest loss was converted to soybean by 2016. Soybean-driven deforestation was concentrated at the active frontiers, nearly half located in the Brazilian Cerrado. Efforts to limit future deforestation must consider how soybean expansion may drive deforestation indirectly by displacing pasture or other land uses. Holistic approaches that track land use across all commodities coupled with vegetation monitoring are required to maintain critical ecosystem services.

91 citations


Proceedings ArticleDOI
03 Mar 2021
TL;DR: The Bias in Open-Ended Language Generation Dataset (BOLD) as mentioned in this paper is a large-scale dataset that consists of 23,679 English text generation prompts for bias benchmarking across five domains: profession, gender, race, religion and political ideology.
Abstract: Recent advances in deep learning techniques have enabled machines to generate cohesive open-ended text when prompted with a sequence of words as context. While these models now empower many downstream applications from conversation bots to automatic storytelling, they have been shown to generate texts that exhibit social biases. To systematically study and benchmark social biases in open-ended language generation, we introduce the Bias in Open-Ended Language Generation Dataset (BOLD), a large-scale dataset that consists of 23,679 English text generation prompts for bias benchmarking across five domains: profession, gender, race, religion, and political ideology. We also propose new automated metrics for toxicity, psycholinguistic norms, and text gender polarity to measure social biases in open-ended text generation from multiple angles. An examination of text generated from three popular language models reveals that the majority of these models exhibit a larger social bias than human-written Wikipedia text across all domains. With these results we highlight the need to benchmark biases in open-ended language generation and caution users of language generation models on downstream tasks to be cognizant of these embedded prejudices.
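
A hedged sketch of how BOLD-style prompts might be used in practice: generate one continuation per prompt in each domain and aggregate scores from an automated metric. `generate` and `toxicity_score` are hypothetical stand-ins, not APIs shipped with the dataset.

```python
# Per-domain bias benchmarking loop over open-ended generation prompts.
from collections import defaultdict

def benchmark_bias(prompts_by_domain, generate, toxicity_score):
    """prompts_by_domain: dict mapping domain name -> list of prompt strings."""
    scores = defaultdict(list)
    for domain, prompts in prompts_by_domain.items():
        for prompt in prompts:
            continuation = generate(prompt)            # model's open-ended text
            scores[domain].append(toxicity_score(continuation))
    return {d: sum(s) / len(s) for d, s in scores.items()}  # mean score per domain
```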

Proceedings ArticleDOI
Bing Shuai, Andrew Berneshawi, Xinyu Li, Davide Modolo, Joseph Tighe
25 May 2021
TL;DR: SiamMOT as discussed by the authors introduces a motion model that estimates the instance's movement between two frames such that detected instances are associated, and it runs at 17 FPS for 720P videos on a single modern GPU.
Abstract: In this paper, we focus on improving online multi-object tracking (MOT). In particular, we introduce a region-based Siamese Multi-Object Tracking network, which we name SiamMOT. SiamMOT includes a motion model that estimates the instance’s movement between two frames such that detected instances are associated. To explore how the motion modelling affects its tracking capability, we present two variants of Siamese tracker, one that implicitly models motion and one that models it explicitly. We carry out extensive quantitative experiments on three different MOT datasets: MOT17, TAO-person and Caltech Roadside Pedestrians, showing the importance of motion modelling for MOT and the ability of SiamMOT to substantially outperform the state-of-the-art. Finally, SiamMOT also outperforms the winners of ACM MM’20 HiEve Grand Challenge on HiEve dataset. Moreover, SiamMOT is efficient, and it runs at 17 FPS for 720P videos on a single modern GPU.

Journal ArticleDOI
TL;DR: MOTChallenge, as mentioned in this paper, is a benchmark for single-camera multiple object tracking (MOT) launched in late 2014 that provides a framework for the standardized evaluation of multiple object tracking methods.
Abstract: Standardized benchmarks have been crucial in pushing the performance of computer vision algorithms, especially since the advent of deep learning. Although leaderboards should not be over-claimed, they often provide the most objective measure of performance and are therefore important guides for research. We present MOTChallenge, a benchmark for single-camera Multiple Object Tracking (MOT) launched in late 2014, to collect existing and new data and create a framework for the standardized evaluation of multiple object tracking methods. The benchmark is focused on multiple people tracking, since pedestrians are by far the most studied object in the tracking community, with applications ranging from robot navigation to self-driving cars. This paper collects the first three releases of the benchmark: (i) MOT15, along with numerous state-of-the-art results that were submitted in the last years, (ii) MOT16, which contains new challenging videos, and (iii) MOT17, which extends MOT16 sequences with more precise labels and evaluates tracking performance on three different object detectors. The second and third releases not only offer a significant increase in the number of labeled boxes, but also provide labels for multiple object classes besides pedestrians, as well as the level of visibility for every single object of interest. We finally provide a categorization of state-of-the-art trackers and a broad error analysis. This will help newcomers understand the related work and research trends in the MOT community, and hopefully shed some light on potential future research directions.

Proceedings ArticleDOI
01 Jun 2021
TL;DR: This work develops a contrastive self-training framework, COSINE, to enable fine-tuning LMs with weak supervision, underpinned by contrastive regularization and confidence-based reweighting, which gradually improves model fitting while effectively suppressing error propagation.
Abstract: Fine-tuned pre-trained language models (LMs) have achieved enormous success in many natural language processing (NLP) tasks, but they still require excessive labeled data in the fine-tuning stage. We study the problem of fine-tuning pre-trained LMs using only weak supervision, without any labeled data. This problem is challenging because the high capacity of LMs makes them prone to overfitting the noisy labels generated by weak supervision. To address this problem, we develop a contrastive self-training framework, COSINE, to enable fine-tuning LMs with weak supervision. Underpinned by contrastive regularization and confidence-based reweighting, our framework gradually improves model fitting while effectively suppressing error propagation. Experiments on sequence, token, and sentence pair classification tasks show that our model outperforms the strongest baseline by large margins and achieves competitive performance with fully-supervised fine-tuning methods. Our implementation is available on https://github.com/yueyu1030/COSINE.
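
Confidence-based reweighting, one ingredient of COSINE described above, can be sketched as follows; the threshold and weight form are illustrative assumptions, not the paper's exact formulation.

```python
# Down-weight (or drop) low-confidence pseudo-labels to suppress
# error propagation during self-training.
import torch
import torch.nn.functional as F

def reweighted_self_training_loss(logits, pseudo_labels, threshold=0.9):
    probs = F.softmax(logits, dim=-1)
    confidence = probs.max(dim=-1).values
    weights = (confidence > threshold).float() * confidence  # 0 for noisy samples
    per_sample = F.cross_entropy(logits, pseudo_labels, reduction='none')
    return (weights * per_sample).sum() / weights.sum().clamp(min=1e-8)
```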

Proceedings ArticleDOI
01 Jun 2021
TL;DR: The fifth AI City Challenge as mentioned in this paper attracted 305 participating teams across 38 countries, who leveraged city-scale real traffic data and high-quality synthetic data to compete in five challenge tracks; Track 1 addressed video-based automatic vehicle counting, with evaluation conducted on both algorithmic effectiveness and computational efficiency.
Abstract: The AI City Challenge was created with two goals in mind: (1) pushing the boundaries of research and development in intelligent video analysis for smarter cities use cases, and (2) assessing tasks where the level of performance is enough to cause real-world adoption. Transportation is a segment ripe for such adoption. The fifth AI City Challenge attracted 305 participating teams across 38 countries, who leveraged city-scale real traffic data and high-quality synthetic data to compete in five challenge tracks. Track 1 addressed video-based automatic vehicle counting, with the evaluation conducted on both algorithmic effectiveness and computational efficiency. Track 2 addressed city-scale vehicle re-identification with augmented synthetic data to substantially increase the training set for the task. Track 3 addressed city-scale multi-target multi-camera vehicle tracking. Track 4 addressed traffic anomaly detection. Track 5 was a new track addressing vehicle retrieval using natural language descriptions. The evaluation system shows a general leader board of all submitted results, and a public leader board of results limited by the contest participation rules, where teams are not allowed to use external data in their work. The public leader board shows results closer to real-world situations where annotated data is limited. Results show the promise of AI in Smarter Transportation. State-of-the-art performance for some tasks shows that these technologies are ready for adoption in real-world systems.


Journal ArticleDOI
TL;DR: The authors release LaSOT, to their knowledge the largest densely annotated tracking benchmark, as a dedicated high-quality platform for both training and evaluation of trackers, with language specifications for each video that take advantage of the close connection between visual appearance and natural language.
Abstract: Despite great recent advances in visual tracking, its further development, including both algorithm design and evaluation, is limited due to a lack of dedicated large-scale benchmarks. To address this problem, we present LaSOT, a high-quality Large-scale Single Object Tracking benchmark. LaSOT contains a diverse selection of 85 object classes, and offers 1,550 videos totaling more than 3.87 million frames. Each video frame is carefully and manually annotated with a bounding box. This makes LaSOT, to our knowledge, the largest densely annotated tracking benchmark. Our goal in releasing LaSOT is to provide a dedicated high-quality platform for both training and evaluation of trackers. The average video length of LaSOT is around 2,500 frames, where each video contains various challenge factors that exist in real-world video footage, such as the targets disappearing and re-appearing. These longer video lengths allow for the assessment of long-term trackers. To take advantage of the close connection between visual appearance and natural language, we provide language specification for each video in LaSOT. We believe such additions will allow for future research to use linguistic features to improve tracking. Two protocols, full-overlap and one-shot, are designated for flexible assessment of trackers. We extensively evaluate 48 baseline trackers on LaSOT with in-depth analysis, and results reveal that there still exists significant room for improvement. The complete benchmark, tracking results as well as analysis are available at http://vision.cs.stonybrook.edu/~lasot/ .

Proceedings ArticleDOI
19 Jan 2021
TL;DR: In this article, the authors proposed Audio ALBERT, a lite version of the self-supervised speech representation model, which achieved performance comparable with massive pre-trained networks in the downstream tasks while having 91% fewer parameters.
Abstract: Self-supervised speech models are powerful speech representation extractors for downstream applications. Recently, larger models have been utilized in acoustic model training to achieve better performance. We propose Audio ALBERT, a lite version of the self-supervised speech representation model. We apply the lightweight representation extractor to two downstream tasks, speaker classification and phoneme classification. We show that Audio ALBERT achieves performance comparable with massive pre-trained networks in the downstream tasks while having 91% fewer parameters. Moreover, we design probing models to measure how much the latent representations can encode the speaker’s and phoneme’s information. We find that the representations encoded in internal layers of Audio ALBERT contain more information for both phoneme and speaker than the last layer, which is generally used for downstream tasks. Our findings provide a new avenue for using self-supervised networks to achieve better performance and efficiency.
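
The ALBERT-style parameter sharing that makes the model "lite" is easy to sketch: one transformer layer is reused at every depth, so the parameter count does not grow with the number of layers. The sizes below are illustrative assumptions.

```python
# Layer-shared transformer encoder: 12 "layers" of depth, 1 layer of weights.
import torch.nn as nn

class SharedLayerEncoder(nn.Module):
    def __init__(self, dim=768, heads=12, num_layers=12):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                                batch_first=True)
        self.num_layers = num_layers

    def forward(self, x):
        hidden_states = []
        for _ in range(self.num_layers):   # same weights applied repeatedly
            x = self.layer(x)
            hidden_states.append(x)        # internal layers can be probed
        return x, hidden_states
```

Returning all hidden states mirrors the probing analysis above, where internal layers carried more speaker and phoneme information than the last layer.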

DOI
Heidi L. Rehm, Angela Page, Lindsay Smith, +220 more (73 institutions)
10 Nov 2021
TL;DR: The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches.
Abstract: Summary The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. We describe the GA4GH organization, which is fueled by the development efforts of eight Work Streams and informed by the needs of 24 Driver Projects and other key stakeholders. We present the GA4GH suite of secure, interoperable technical standards and policy frameworks and review the current status of standards, their relevance to key domains of research and clinical care, and future plans of GA4GH. Broad international participation in building, adopting, and deploying GA4GH standards and frameworks will catalyze an unprecedented effort in data sharing that will be critical to advancing genomic medicine and ensuring that all populations can access its benefits.

Journal ArticleDOI
TL;DR: This work compares sparse and dense representations of predictive models in macroeconomics, microeconomics, and finance, and specifies a prior that allows for both variable selection and shrinkage.
Abstract: We compare sparse and dense representations of predictive models in macroeconomics, microeconomics, and finance. To deal with a large number of possible predictors, we specify a prior that allows for both variable selection and shrinkage. The posterior distribution does not typically concentrate on a single sparse model, but on a wide set of models that often include many predictors.
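
A standard way to write a prior that allows both variable selection and shrinkage is the spike-and-slab form below, where q governs how many predictors enter and γ² governs how much the included coefficients are shrunk; the paper's exact specification may differ in details.

```latex
% Spike-and-slab prior: a point mass at zero (selection) mixed with a
% Gaussian slab (shrinkage). q and \gamma^2 are hyperparameters.
\beta_j \mid q, \gamma^2 \;\sim\; q \,\mathcal{N}(0, \gamma^2) \;+\; (1 - q)\,\delta_0,
\qquad j = 1, \dots, k.
```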

Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, the authors propose Domain Consensus Clustering (DCC), which exploits the domain consensus knowledge to discover discriminative clusters on both common samples and private ones.
Abstract: In this paper, we investigate the Universal Domain Adaptation (UniDA) problem, which aims to transfer knowledge from source to target under an unaligned label space. The main challenge of UniDA lies in how to separate common classes (i.e., classes shared across domains) from private classes (i.e., classes that exist in only one domain). Previous works treat the private samples in the target as one generic class but ignore their intrinsic structure. Consequently, the resulting representations are not compact enough in the latent space and can be easily confused with common samples. To better exploit the intrinsic structure of the target domain, we propose Domain Consensus Clustering (DCC), which exploits the domain consensus knowledge to discover discriminative clusters on both common samples and private ones. Specifically, we draw the domain consensus knowledge from two aspects to facilitate the clustering and the private class discovery, i.e., the semantic-level consensus, which identifies the cycle-consistent clusters as the common classes, and the sample-level consensus, which utilizes the cross-domain classification agreement to determine the number of clusters and discover the private classes. Based on DCC, we are able to separate the private classes from the common ones, and differentiate the private classes themselves. Finally, we apply a class-aware alignment technique on identified common samples to minimize the distribution shift, and a prototypical regularizer to inspire discriminative target clusters. Experiments on four benchmarks demonstrate that DCC significantly outperforms previous state-of-the-art methods.

Journal ArticleDOI
TL;DR: In this article, the authors use textual analysis of high-dimensional data from patent documents to create new indicators of technological innovation and identify significant patents based on textual similarity of a given patent to previous and subsequent work: these patents are distinct from previous work but are related to subsequent innovations.
Abstract: We use textual analysis of high-dimensional data from patent documents to create new indicators of technological innovation. We identify significant patents based on textual similarity of a given patent to previous and subsequent work: these patents are distinct from previous work but are related to subsequent innovations. Our measure of patent significance is predictive of future citations and correlates strongly with measures of market value. We identify breakthrough innovations as the most significant patents – those in the right tail of our measure – to construct indices of technological change at the aggregate, sectoral, and firm level. Our technology indices span two centuries (1840-2010) and cover innovation by private and public firms, as well as non-profit organizations and the US government. These indices capture the evolution of technological waves over a long time span and are strong predictors of productivity at the aggregate and sectoral level.
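
One simple way to operationalize "distinct from previous work but related to subsequent work" is a backward/forward similarity ratio. The TF-IDF cosine below is a hedged stand-in for the paper's similarity measure, and the ratio form is an illustrative assumption.

```python
# Patent significance sketch: low similarity to prior patents,
# high similarity to later ones.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def significance(focal_text, prior_texts, later_texts):
    docs = [focal_text] + prior_texts + later_texts
    X = TfidfVectorizer(stop_words='english').fit_transform(docs)
    sims = cosine_similarity(X[0], X[1:]).ravel()
    backward = sims[:len(prior_texts)].mean()   # similarity to previous work
    forward = sims[len(prior_texts):].mean()    # similarity to subsequent work
    return forward / max(backward, 1e-8)        # high = novel and influential
```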

Journal ArticleDOI
Cecilia Blundo, Julieta Carilla, Ricardo Grau, Agustina Malizia, +549 more (176 institutions)
TL;DR: In this paper, the authors show how a global community is responding to the challenges of tropical ecosystem research with diverse teams measuring forests tree-by-tree in thousands of long-term plots.


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, the influence of a subset of the training samples is removed from the weights of a network trained on large-scale image classification tasks, and strong computable bounds are provided on the amount of remaining information after forgetting.
Abstract: We show that the influence of a subset of the training samples can be removed – or "forgotten" – from the weights of a network trained on large-scale image classification tasks, and we provide strong computable bounds on the amount of remaining information after forgetting. Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in mixed-privacy setting, where we know that a "core" subset of the training samples does not need to be forgotten. While this variation of the problem is conceptually simple, we show that working in this setting significantly improves the accuracy and guarantees of forgetting methods applied to vision classification tasks. Moreover, our method allows efficient removal of all information contained in non-core data by simply setting to zero a subset of the weights with minimal loss in performance. We achieve these results by replacing a standard deep network with a suitable linear approximation. With opportune changes to the network architecture and training procedure, we show that such linear approximation achieves comparable performance to the original network and that the forgetting problem becomes quadratic and can be solved efficiently even for large models. Unlike previous forgetting methods on deep networks, ours can achieve close to the state-of-the-art accuracy on large scale vision tasks. In particular, we show that our method allows forgetting without having to trade off the model accuracy.

Journal ArticleDOI
01 Jul 2021
TL;DR: This work proposes a method that finds Lyapunov functions fully automatically—using machine learning—while also providing formal guarantees—using satisfiability modulo theories (SMT); it synthesises Lyapunov functions faster and over wider spatial domains than the alternatives, while providing stronger or equal guarantees.
Abstract: We propose an automatic and formally sound method for synthesising Lyapunov functions for the asymptotic stability of autonomous non-linear systems. Traditional methods are either analytical and require manual effort, or are numerical but lack formal soundness. Symbolic computational methods for Lyapunov functions, which are in between, give formal guarantees but are typically semi-automatic because they rely on the user to provide appropriate function templates. We propose a method that finds Lyapunov functions fully automatically—using machine learning—while also providing formal guarantees—using satisfiability modulo theories (SMT). We employ a counterexample-guided approach where a numerical learner and a symbolic verifier interact to construct provably correct Lyapunov neural networks (LNNs). The learner trains a neural network that satisfies the Lyapunov criteria for asymptotic stability over a samples set; the verifier proves via SMT solving that the criteria are satisfied over the whole domain, or augments the samples set with counterexamples. Our method supports neural networks with polynomial activation functions and multiple depth and width, which display wide learning capabilities. We demonstrate our method over several non-trivial benchmarks and compare it favourably against a numerical optimisation-based approach, a symbolic template-based approach, and a cognate LNN-based approach. Our method synthesises Lyapunov functions faster and over wider spatial domains than the alternatives, yet providing stronger or equal guarantees.
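
The counterexample-guided loop described above has a simple schematic form: a learner fits a candidate Lyapunov network on a sample set, and a verifier (SMT solving in the paper; an abstract callable here) either certifies the Lyapunov criteria over the whole domain or returns a counterexample that is added to the samples.

```python
# CEGIS-style synthesis loop for a Lyapunov neural network.
def cegis_lyapunov(learner_fit, verifier, samples, max_rounds=50):
    for _ in range(max_rounds):
        V = learner_fit(samples)         # train NN to satisfy the criteria on samples
        counterexample = verifier(V)     # SMT check over the whole domain
        if counterexample is None:
            return V                     # provably correct Lyapunov function
        samples.append(counterexample)   # refine the training set and retry
    return None                          # no certificate within the budget
```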

Proceedings Article
03 May 2021
TL;DR: This paper proposes to use the backward mode linear relaxation based perturbation analysis (LiRPA) to replace LP during the BaB process, which can be efficiently implemented on the typical machine learning accelerators such as GPUs and TPUs and demonstrates an order of magnitude speedup compared to existing LP-based approaches.
Abstract: Formal verification of neural networks (NNs) is a challenging and important problem. Existing efficient complete solvers typically require the branch-and-bound (BaB) process, which splits the problem domain into sub-domains and solves each sub-domain using faster but weaker incomplete verifiers, such as Linear Programming (LP) on linearly relaxed sub-domains. In this paper, we propose to use the backward mode linear relaxation based perturbation analysis (LiRPA) to replace LP during the BaB process, which can be efficiently implemented on the typical machine learning accelerators such as GPUs and TPUs. However, unlike LP, LiRPA when applied naively can produce much weaker bounds and even cannot check certain conflicts of sub-domains during splitting, making the entire procedure incomplete after BaB. To address these challenges, we apply a fast gradient based bound tightening procedure combined with batch splits and the design of minimal usage of LP bound procedure, enabling us to effectively use LiRPA on the accelerator hardware for the challenging complete NN verification problem and significantly outperform LP-based approaches. On a single GPU, we demonstrate an order of magnitude speedup compared to existing LP-based approaches.
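
To see why bound propagation maps well onto accelerators, here is a minimal interval bound propagation (IBP) sketch, a cheaper relative of the LiRPA bounds used in the paper; it is shown only to convey the layer-by-layer tensor-op structure and is not the paper's LiRPA algorithm.

```python
# Sound interval bounds through a linear layer and a ReLU,
# computed with plain (GPU-friendly) tensor operations.
import torch

def ibp_linear(W, b, lo, hi):
    # Bounds for x -> x @ W.T + b when lo <= x <= hi elementwise.
    center, radius = (lo + hi) / 2, (hi - lo) / 2
    new_center = center @ W.t() + b
    new_radius = radius @ W.abs().t()
    return new_center - new_radius, new_center + new_radius

def ibp_relu(lo, hi):
    return torch.clamp(lo, min=0.0), torch.clamp(hi, min=0.0)
```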

Journal ArticleDOI
TL;DR: A novel classification scheme for fairness metrics in machine learning is proposed based on how they handle pre-existing bias and thus align with the aims of non-discrimination law, and concrete recommendations are provided, including a user-friendly checklist, for choosing the most appropriate fairness metric for uses of machine learning and AI under EU non-discrimination law.
Abstract: Western societies are marked by diverse and extensive biases and inequality that are unavoidably embedded in the data used to train machine learning. Algorithms trained on biased data will, without intervention, produce biased outcomes and increase the inequality experienced by historically disadvantaged groups. Recognising this problem, much work has emerged in recent years to test for bias in machine learning and AI systems using various fairness and bias metrics. Often these metrics address technical bias but ignore the underlying causes of inequality. In this paper we make three contributions. First, we assess the compatibility of fairness metrics used in machine learning against the aims and purpose of EU non-discrimination law. We show that the fundamental aim of the law is not only to prevent ongoing discrimination, but also to change society, policies, and practices to ‘level the playing field’ and achieve substantive rather than merely formal equality. Based on this, we then propose a novel classification scheme for fairness metrics in machine learning based on how they handle pre-existing bias and thus align with the aims of non-discrimination law. Specifically, we distinguish between ‘bias preserving’ and ‘bias transforming’ fairness metrics. Our classification system is intended to bridge the gap between non-discrimination law and decisions around how to measure fairness in machine learning and AI in practice. Finally, we show that the legal need for justification in cases of indirect discrimination can impose additional obligations on developers, deployers, and users that choose to use bias preserving fairness metrics when making decisions about individuals because they can give rise to prima facie discrimination. To achieve substantive equality in practice, and thus meet the aims of the law, we instead recommend using bias transforming metrics. To conclude, we provide concrete recommendations including a user-friendly checklist for choosing the most appropriate fairness metric for uses of machine learning and AI under EU non-discrimination law.
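
For concreteness, two widely used metrics from this literature can be computed as below: a demographic parity gap, which ignores (possibly biased) ground-truth labels, and a true-positive-rate gap, which conditions on them. How each metric falls under the paper's preserving/transforming scheme is the paper's analysis; this sketch only shows the computations.

```python
# Two fairness metric computations over binary predictions.
import numpy as np

def demographic_parity_gap(y_pred, group):
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)       # gap in positive-prediction rates

def true_positive_rate_gap(y_pred, y_true, group):
    rates = [y_pred[(group == g) & (y_true == 1)].mean()
             for g in np.unique(group)]
    return max(rates) - min(rates)       # gap in TPR across groups
```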

Journal ArticleDOI
TL;DR: In this article, the authors assessed the relationship between body mass index (BMI) classes and early COVID-19 prognosis in inpatients with type 2 diabetes (T2D), finding that overweight and obesity were associated with poorer early outcomes.
Abstract: Aim: To assess the relationship between body mass index (BMI) classes and early COVID-19 prognosis in inpatients with type 2 diabetes (T2D). Methods: From the CORONAvirus-SARS-CoV-2 and Diabetes Outcomes (CORONADO) study, we conducted an analysis in patients with T2D categorized by four BMI subgroups according to the World Health Organization classification. Clinical characteristics and COVID-19-related outcomes (i.e. intubation for mechanical ventilation [IMV], death and discharge by day 7 [D7]) were analysed according to BMI status. Results: Among 1965 patients with T2D, 434 (22.1%) normal weight (18.5-24.9 kg/m2, reference group), 726 (36.9%) overweight (25-29.9 kg/m2) and 805 (41.0%) obese subjects were analysed, including 491 (25.0%) with class I obesity (30-34.9 kg/m2) and 314 (16.0%) with class II/III obesity (≥35 kg/m2). In a multivariable-adjusted model, the primary outcome (i.e. IMV and/or death by D7) was significantly associated with overweight (OR 1.65 [1.05-2.59]), class I (OR 1.93 [1.19-3.14]) and class II/III obesity (OR 1.98 [1.11-3.52]). After multivariable adjustment, the primary outcome by D7 was significantly associated with obesity in patients aged younger than 75 years, while such an association was no longer found in those aged older than 75 years. Conclusions: Overweight and obesity are associated with poor early prognosis in patients with T2D hospitalized for COVID-19. Importantly, the deleterious impact of obesity on COVID-19 prognosis was no longer observed in the elderly, highlighting the need for specific management in this population.

Proceedings ArticleDOI
01 Jun 2021
TL;DR: This article proposed an exponential moving average normalization (EMAN) to improve the performance of student-teacher based self- and semi-supervised learning techniques, which reduces the intrinsic cross-sample dependency of BN and enhances the generalization of the teacher.
Abstract: We present a plug-in replacement for batch normalization (BN) called exponential moving average normalization (EMAN), which improves the performance of existing student-teacher based self- and semi-supervised learning techniques. Unlike the standard BN, where the statistics are computed within each batch, EMAN, used in the teacher, updates its statistics by exponential moving average from the BN statistics of the student. This design reduces the intrinsic cross-sample dependency of BN and enhances the generalization of the teacher. EMAN improves strong baselines for self-supervised learning by 4-6/1-2 points and semi-supervised learning by about 7/2 points, when 1%/10% supervised labels are available on ImageNet. These improvements are consistent across methods, network architectures, training duration, and datasets, demonstrating the general effectiveness of this technique. The code will be made available online.
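
A minimal sketch of the EMAN update described above: unlike standard BN, the teacher's BatchNorm buffers (running mean and variance) are also exponential moving averages of the student's, so the teacher never computes batch statistics of its own. The momentum value is an assumption.

```python
# EMAN: EMA over parameters AND BatchNorm buffers.
import torch

@torch.no_grad()
def eman_update(teacher, student, momentum=0.999):
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(momentum).add_(ps, alpha=1.0 - momentum)
    for bt, bs in zip(teacher.buffers(), student.buffers()):
        if bt.dtype.is_floating_point:   # BN running_mean / running_var
            bt.mul_(momentum).add_(bs, alpha=1.0 - momentum)
        else:                            # e.g., num_batches_tracked
            bt.copy_(bs)
```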