
Showing papers on "Concatenation published in 2018"


Proceedings ArticleDOI
01 Jun 2018
TL;DR: Deep Back-Projection Networks (DBPN) as discussed by the authors exploit iterative up- and down-sampling layers, providing an error feedback mechanism for projection errors at each stage, and construct mutually-connected up- and down-sampling stages, each of which represents different types of image degradation and high-resolution components.
Abstract: The feed-forward architectures of recently proposed deep super-resolution networks learn representations of low-resolution inputs, and the non-linear mapping from those to high-resolution output. However, this approach does not fully address the mutual dependencies of low- and high-resolution images. We propose Deep Back-Projection Networks (DBPN), which exploit iterative up- and down-sampling layers, providing an error feedback mechanism for projection errors at each stage. We construct mutually-connected up- and down-sampling stages, each of which represents different types of image degradation and high-resolution components. We show that extending this idea to allow concatenation of features across up- and down-sampling stages (Dense DBPN) allows us to further improve super-resolution, yielding superior results and in particular establishing new state-of-the-art results for large scaling factors such as 8× across multiple data sets.
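To make the error-feedback mechanism concrete, here is a minimal NumPy sketch of a single up-projection unit, with nearest-neighbour upsampling and average pooling standing in for the paper's learned (de)convolutional layers:

```python
import numpy as np

def upsample(x, s=2):
    # Nearest-neighbour upsampling as a stand-in for a learned deconvolution.
    return x.repeat(s, axis=0).repeat(s, axis=1)

def downsample(x, s=2):
    # Average pooling as a stand-in for a learned strided convolution.
    h, w = x.shape[0] // s, x.shape[1] // s
    return x[:h * s, :w * s].reshape(h, s, w, s).mean(axis=(1, 3))

def up_projection(lr, s=2):
    """One DBPN-style up-projection unit with error feedback."""
    hr0 = upsample(lr, s)             # initial high-resolution estimate
    err = downsample(hr0, s) - lr     # projection error back in LR space
    return hr0 + upsample(err, s)     # correct the HR estimate with the error

lr = np.random.rand(16, 16)
print(up_projection(lr).shape)  # (32, 32)
```

A down-projection unit mirrors this with the roles of the two scales swapped, and Dense DBPN concatenates the outputs of all preceding units before each projection.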

1,269 citations


Posted Content
Xing Wu1, Shangwen Lv1, Liangjun Zang1, Jizhong Han1, Songlin Hu1 
TL;DR: The authors proposed a conditional BERT contextual augmentation method for text classification, which replaces words with more varied substitutions predicted by a language model, and showed that a deep bidirectional language model is more powerful than either a unidirectional language model or the shallow concatenation of a forward and a backward model.
Abstract: We propose a novel data augmentation method for labeled sentences called conditional BERT contextual augmentation. Data augmentation methods are often applied to prevent overfitting and improve the generalization of deep neural network models. Recently proposed contextual augmentation augments labeled sentences by randomly replacing words with more varied substitutions predicted by a language model. BERT demonstrates that a deep bidirectional language model is more powerful than either a unidirectional language model or the shallow concatenation of a forward and a backward model. We retrofit BERT to conditional BERT by introducing a new conditional masked language model\footnote{The term "conditional masked language model" appeared once in the original BERT paper; there it indicates context-conditioning and is equivalent to the term "masked language model". In our paper, "conditional masked language model" indicates that we apply an extra label-conditional constraint to the "masked language model".} task. The well-trained conditional BERT can be applied to enhance contextual augmentation. Experiments on six different text classification tasks show that our method can be easily applied to both convolutional and recurrent neural network classifiers to obtain clear improvements.
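As an illustration of the augmentation loop only, here is a hedged sketch; `predict_masked` is a hypothetical interface to a label-conditioned masked LM (one way to realize the label-conditional constraint the abstract describes), not an API from the paper:

```python
import random

def conditional_augment(tokens, label, predict_masked, p_mask=0.15, k=5):
    """Sketch of conditional contextual augmentation. `predict_masked(tokens,
    i, label)` is a hypothetical interface returning candidate fillers for
    position i, ranked by probability under the label-conditioned masked LM."""
    out = list(tokens)
    for i in range(len(tokens)):
        if random.random() < p_mask:
            masked = out[:i] + ["[MASK]"] + out[i + 1:]
            # A label-compatible substitution keeps the sentence's class intact.
            out[i] = random.choice(predict_masked(masked, i, label)[:k])
    return out
```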

150 citations


Posted Content
TL;DR: The authors proposed a densely-connected co-attentive recurrent neural network (C-RNN), which uses concatenated information of attentive features as well as hidden features of all the preceding recurrent layers.
Abstract: Sentence matching is widely used in various natural language tasks such as natural language inference, paraphrase identification, and question answering. For these tasks, understanding the logical and semantic relationship between two sentences is required, but it remains challenging. Although attention mechanisms are useful for capturing the semantic relationship and properly aligning the elements of two sentences, previous attention mechanisms simply use a summation operation, which does not sufficiently retain the original features. Inspired by DenseNet, a densely connected convolutional network, we propose a densely-connected co-attentive recurrent neural network, each layer of which uses concatenated information of attentive features as well as hidden features of all the preceding recurrent layers. It enables preserving the original and the co-attentive feature information from the bottommost word embedding layer to the uppermost recurrent layer. To alleviate the problem of an ever-increasing size of feature vectors due to dense concatenation operations, we also propose to use an autoencoder after dense concatenation. We evaluate our proposed architecture on highly competitive benchmark datasets related to sentence matching. Experimental results show that our architecture, which retains recurrent and attentive features, achieves state-of-the-art performance on most of the tasks.
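A rough NumPy sketch of one densely-connected co-attentive layer; random matrices stand in for the trained recurrent and autoencoder weights, since only the wiring matters here:

```python
import numpy as np

def co_attention(a, b):
    # Soft alignment of each position of sentence a against sentence b.
    w = np.exp(a @ b.T)
    w /= w.sum(axis=1, keepdims=True)
    return w @ b

def dense_coattentive_layer(feats_a, b, d_out, rng):
    # Concatenate the hidden features of all preceding layers with the
    # attentive features, preserving the original information...
    x = np.concatenate(feats_a + [co_attention(feats_a[-1], b)], axis=1)
    # ...then rein in the ever-growing width with an autoencoder bottleneck.
    enc = rng.normal(size=(x.shape[1], d_out)) / np.sqrt(x.shape[1])
    return np.tanh(x @ enc)

rng = np.random.default_rng(0)
a, b = rng.normal(size=(5, 8)), rng.normal(size=(6, 8))
feats = [a]
for _ in range(3):                       # stack three dense layers
    feats.append(dense_coattentive_layer(feats, b, 8, rng))
```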

107 citations


Posted Content
TL;DR: It is shown that the concatenation of different types of power mean word embeddings considerably closes the gap to state-of-the-art methods monolingually and substantially outperforms these more complex techniques cross-lingually.
Abstract: Average word embeddings are a common baseline for more sophisticated sentence embedding techniques. However, they typically fall short of the performance of more complex models such as InferSent. Here, we generalize the concept of average word embeddings to power mean word embeddings. We show that the concatenation of different types of power mean word embeddings considerably closes the gap to state-of-the-art methods monolingually and substantially outperforms these more complex techniques cross-lingually. In addition, our proposed method outperforms different recently proposed baselines such as SIF and Sent2Vec by a solid margin, thus constituting a much harder-to-beat monolingual baseline.
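A compact sketch of concatenated power mean sentence embeddings; typical choices include $p = 1$ and $p = \pm\infty$, and odd integer powers keep negative coordinates well-defined:

```python
import numpy as np

def power_mean(word_vecs, p):
    """Element-wise power mean over a sentence's word vectors.
    p = 1 is the arithmetic mean; p = +/-inf give the element-wise max/min."""
    v = np.asarray(word_vecs, dtype=float)
    if p == np.inf:
        return v.max(axis=0)
    if p == -np.inf:
        return v.min(axis=0)
    m = (v ** p).mean(axis=0)            # odd integer p keeps signs well-defined
    return np.sign(m) * np.abs(m) ** (1.0 / p)

def sentence_embedding(word_vecs, ps=(-np.inf, 1.0, np.inf)):
    # Concatenating several power means (optionally across embedding spaces)
    # yields the stronger baseline described in the abstract.
    return np.concatenate([power_mean(word_vecs, p) for p in ps])

vecs = np.random.randn(6, 50)            # 6 words, 50-dim embeddings
print(sentence_embedding(vecs, ps=(-np.inf, 1.0, 3.0, np.inf)).shape)  # (200,)
```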

101 citations


Book ChapterDOI
10 Apr 2018
TL;DR: This article proposes a framework for rotation and translation covariant deep learning using SE(2) group convolutions, which encode this geometric structure into convolutional neural networks (CNNs) via SE(2) group convolutional layers that fit into the standard 2D CNN framework and allow rotated input samples to be handled generically, without the need for data augmentation.
Abstract: We propose a framework for rotation and translation covariant deep learning using SE(2) group convolutions. The group product of the special Euclidean motion group SE(2) describes how a concatenation of two roto-translations results in a net roto-translation. We encode this geometric structure into convolutional neural networks (CNNs) via SE(2) group convolutional layers, which fit into the standard 2D CNN framework, and which allow us to deal generically with rotated input samples without the need for data augmentation.
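A minimal sketch of the lifting layer of such a group CNN, built from SciPy primitives: the image is correlated with rotated copies of one kernel, so the output gains an orientation axis and a rotation of the input becomes a shift along that axis (the property the paper builds on):

```python
import numpy as np
from scipy import ndimage, signal

def se2_lifting_conv(image, kernel, n_theta=8):
    """Lifting layer of an SE(2) group CNN: correlate the image with rotated
    copies of one kernel, producing a feature map with an orientation axis."""
    out = []
    for i in range(n_theta):
        k = ndimage.rotate(kernel, 360.0 * i / n_theta, reshape=False, order=1)
        out.append(signal.correlate2d(image, k, mode="same"))
    return np.stack(out)                     # shape: (n_theta, H, W)

img = np.zeros((16, 16)); img[4:12, 7:9] = 1.0   # a vertical bar
kern = np.zeros((5, 5)); kern[:, 2] = 1.0        # a vertical line detector
print(se2_lifting_conv(img, kern).shape)         # (8, 16, 16)
```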

97 citations


Posted Content
TL;DR: In this article, mean embeddings of distributions of agents are used to represent the information content required for decentralized decision making in a swarm of homogeneous agents: the agents are treated as samples of a distribution, and the empirical mean embedding serves as input to a decentralized policy.
Abstract: Recently, deep reinforcement learning (RL) methods have been applied successfully to multi-agent scenarios. Typically, these methods rely on a concatenation of agent states to represent the information content required for decentralized decision making. However, concatenation scales poorly to swarm systems with a large number of homogeneous agents, as it does not exploit the fundamental properties inherent to these systems: (i) the agents in the swarm are interchangeable and (ii) the exact number of agents in the swarm is irrelevant. Therefore, we propose a new state representation for deep multi-agent RL based on mean embeddings of distributions. We treat the agents as samples of a distribution and use the empirical mean embedding as input for a decentralized policy. We define different feature spaces of the mean embedding using histograms, radial basis functions and a neural network learned end-to-end. We evaluate the representation on two well-known problems from the swarm literature (rendezvous and pursuit evasion), in a globally and locally observable setup. For the local setup we furthermore introduce simple communication protocols. Of all approaches, the mean embedding representation using neural network features enables the richest information exchange between neighboring agents, facilitating the development of more complex collective strategies.
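As a sketch of the simplest of the feature spaces mentioned above, here is a histogram-based mean embedding; the key property is that the policy input has the same size whether an agent observes 3 or 30 neighbors:

```python
import numpy as np

def histogram_mean_embedding(neighbor_states, bins, lo, hi):
    """Mean embedding of the empirical neighbor distribution via histogram
    features: permutation-invariant and independent of the number of agents."""
    neighbor_states = np.atleast_2d(neighbor_states)   # (n_agents, state_dim)
    feats = [np.histogram(neighbor_states[:, j], bins=bins,
                          range=(lo, hi), density=True)[0]
             for j in range(neighbor_states.shape[1])]
    return np.concatenate(feats)                       # fixed size for any swarm

# Observations of 3 vs. 30 neighbors map to identically sized policy inputs.
small = histogram_mean_embedding(np.random.rand(3, 2), bins=10, lo=0.0, hi=1.0)
large = histogram_mean_embedding(np.random.rand(30, 2), bins=10, lo=0.0, hi=1.0)
assert small.shape == large.shape
```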

95 citations


Proceedings ArticleDOI
01 Mar 2018
TL;DR: The authors showed that the arithmetic mean of two distinct word embedding sets yields a performant meta-embedding that is comparable to or better than more complex meta-embedding learning methods, despite the incomparability of the source vector spaces.
Abstract: Creating accurate meta-embeddings from pre-trained source embeddings has received attention lately. Methods based on global and locally linear transformation and concatenation have been shown to produce accurate meta-embeddings. In this paper, we show that the arithmetic mean of two distinct word embedding sets yields a performant meta-embedding that is comparable to or better than more complex meta-embedding learning methods. The result seems counter-intuitive, given that vector spaces in different source embeddings are not comparable and cannot simply be averaged. We give insight into why averaging can still produce accurate meta-embeddings despite the incomparability of the source vector spaces.
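A minimal sketch of the averaging approach; the L2-normalization step is an assumption on my part, a common way to bring two otherwise incomparable spaces onto a common scale before averaging:

```python
import numpy as np

def average_meta_embedding(emb_a, emb_b):
    """Arithmetic-mean meta-embedding over the shared vocabulary of two
    source embedding dictionaries (word -> vector, equal dimensionality)."""
    meta = {}
    for word in emb_a.keys() & emb_b.keys():
        va = emb_a[word] / np.linalg.norm(emb_a[word])   # put both spaces
        vb = emb_b[word] / np.linalg.norm(emb_b[word])   # on a common scale
        meta[word] = (va + vb) / 2.0
    return meta
```

Compared with concatenation, the averaged meta-embedding keeps the source dimensionality while, as the abstract argues, losing surprisingly little accuracy.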

69 citations


Journal ArticleDOI
01 Jan 2018
TL;DR: In this article, the authors examined the accuracy of the Pauli approximation for coherent errors on data qubits under the repetition code and found that coherent errors result in logical errors that are partially coherent and therefore non-Pauli.
Abstract: Analysis of quantum error correcting codes is typically done using a stochastic, Pauli channel error model for describing the noise on physical qubits. However, it was recently found that coherent errors (systematic rotations) on physical data qubits result in both physical and logical error rates that differ significantly from those predicted by a Pauli model. Here we examine the accuracy of the Pauli approximation for coherent errors on data qubits under the repetition code. We analytically evaluate the logical error as a function of concatenation level and code distance. We find that coherent errors result in logical errors that are partially coherent and therefore non-Pauli. However, the coherent part of the error is negligible after two or more concatenation levels or at fewer than $\epsilon^{-(d-1)}$ error correction cycles, where $\epsilon \ll 1$ is the rotation angle error per cycle for a single physical qubit and $d$ is the code distance. These results lend support to the validity of modeling coherent errors using a Pauli channel under some minimum requirements for code distance and/or concatenation.
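The Pauli approximation discussed here replaces the coherent channel by its Pauli twirl; a small NumPy check confirms that twirling an over-rotation by $\epsilon$ about X yields a bit-flip channel with probability $\sin^2(\epsilon/2)$:

```python
import numpy as np

I = np.eye(2); X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]]); Z = np.diag([1, -1])

def coherent_x_rotation(rho, eps):
    # Systematic over-rotation about X: the coherent error model.
    U = np.cos(eps / 2) * I - 1j * np.sin(eps / 2) * X
    return U @ rho @ U.conj().T

def pauli_twirl(channel, rho):
    # Averaging P† E(P rho P†) P over the Paulis turns any qubit channel
    # into a Pauli (stochastic) channel.
    return sum(P.conj().T @ channel(P @ rho @ P.conj().T) @ P
               for P in (I, X, Y, Z)) / 4

eps = 0.1
rho = np.array([[0.7, 0.3], [0.3, 0.3]], dtype=complex)   # a test state
twirled = pauli_twirl(lambda r: coherent_x_rotation(r, eps), rho)
pauli = np.cos(eps/2)**2 * rho + np.sin(eps/2)**2 * (X @ rho @ X)
assert np.allclose(twirled, pauli)   # coherent rotation -> bit flip, p = sin²(eps/2)
```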

48 citations


Journal ArticleDOI
TL;DR: In this paper, a hierarchical distribution matching (DM) and dematching (invDM) scheme for probabilistic shaping with soft-decision forward error correction (FEC) coding is proposed.
Abstract: The implementation difficulties of combining distribution matching (DM) and dematching (invDM) for probabilistic shaping (PS) with soft-decision forward error correction (FEC) coding can be relaxed by reverse concatenation, for which the FEC coding and decoding lies inside the shaping algorithms. PS can seemingly achieve performance close to the Shannon limit, although there are practical implementation challenges that need to be carefully addressed. We propose a hierarchical DM (HiDM) scheme, having fully parallelized input/output interfaces and a pipelined architecture that can efficiently perform the DM/invDM without the complex operations of previously proposed methods such as constant composition DM (CCDM). Furthermore, HiDM can operate at a significantly larger post-FEC bit error rate (BER) for the same post-invDM BER performance, which facilitates simulations. These benefits come at the cost of a slightly larger rate loss and required signal-to-noise ratio at a given post-FEC BER.
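A schematic of the reverse-concatenation ordering, with `dm`, `fec_encode`, `fec_decode` and `inv_dm` as hypothetical placeholders; the point is only the order of operations, with FEC inside the shaping layers:

```python
def transmit(info_bits, dm, fec_encode):
    """Reverse concatenation: shaping (DM) is applied *before* FEC, so the
    systematic FEC parities are the only unshaped bits on the channel."""
    shaped = dm(info_bits)                 # distribution matcher output
    return shaped + fec_encode(shaped)     # systematic codeword: shaped bits + parities

def receive(llrs, fec_decode, inv_dm):
    # FEC decoding runs first, so the dematcher only ever sees (mostly
    # corrected) shaped bits, keeping the post-invDM BER low.
    shaped_hat = fec_decode(llrs)
    return inv_dm(shaped_hat)
```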

48 citations


Proceedings ArticleDOI
06 Feb 2018
TL;DR: This work presents a highly efficient and modularized Mixed Link Network (MixNet) which is equipped with flexible inner link and outer link modules, and demonstrates that MixNets can achieve superior parameter efficiency over state-of-the-art architectures on many competitive datasets such as CIFAR-10/100, SVHN and ImageNet.
Abstract: Based on an analysis revealing the equivalence of modern networks, we find that both ResNet and DenseNet are essentially derived from the same "dense topology", and that they differ only in the form of connection: addition (dubbed "inner link") vs. concatenation (dubbed "outer link"). However, each form of connection has its own strengths and shortcomings. To combine their advantages and avoid certain limitations on representation learning, we present a highly efficient and modularized Mixed Link Network (MixNet) which is equipped with flexible inner link and outer link modules. Consequently, ResNet, DenseNet and Dual Path Network (DPN) can each be regarded as a special case of MixNet. Furthermore, we demonstrate that MixNets can achieve superior parameter efficiency over the state-of-the-art architectures on many competitive datasets like CIFAR-10/100, SVHN and ImageNet.
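A toy NumPy rendering of one mixed link block; `f_inner` and `f_outer` stand in for the block's learned transformations:

```python
import numpy as np

def mixed_link_block(x, f_inner, f_outer, c_inner):
    """One MixNet-style block: add f_inner's output to the first c_inner
    channels (ResNet-style inner link) and concatenate f_outer's output
    (DenseNet-style outer link)."""
    y = x.copy()
    y[..., :c_inner] += f_inner(x)                     # inner link: addition
    return np.concatenate([y, f_outer(x)], axis=-1)    # outer link: concatenation

# Toy usage with linear "layers". Addition only (no outer growth) is
# ResNet-like; concatenation only (c_inner = 0) is DenseNet-like.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16))
w1, w2 = rng.normal(size=(16, 8)), rng.normal(size=(16, 4))
out = mixed_link_block(x, lambda t: np.tanh(t @ w1), lambda t: np.tanh(t @ w2),
                       c_inner=8)
print(out.shape)  # (4, 20)
```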

37 citations


Proceedings ArticleDOI
27 May 2018
TL;DR: It is shown that the subtrajectory clustering problem is NP-Hard and the algorithm indeed handles the desiderata of being robust to variations, being efficient and accurate, and being data-driven.
Abstract: We propose a model for subtrajectory clustering ---the clustering of subsequences of trajectories; each cluster of subtrajectories is represented as a pathlet, a sequence of points that is not necessarily a subsequence of an input trajectory. Given a set of trajectories, our clustering model attempts to capture the shared portions between them by assuming each trajectory is a concatenation of a small set of pathlets, with possible gaps in between. We present a single objective function for finding the optimal collection of pathlets that best represents the trajectories taking into account noise and other artifacts of the data. We show that the subtrajectory clustering problem is NP-Hard and present fast approximation algorithms for subtrajectory clustering. We further improve the running time of our algorithm if the input trajectories are "well-behaved." Finally, we present experimental results on both real and synthetic data sets. We show via visualization and quantitative analysis that the algorithm indeed handles the desiderata of being robust to variations, being efficient and accurate, and being data-driven.

Journal ArticleDOI
01 Oct 2018
TL;DR: Optimizations of the Ordered Statistics Decoder are discussed and shown to bring near-ML performance with a notable complexity reduction, making the decoding complexity at very short lengths affordable.
Abstract: We compare the performance of a selection of short-length and very short-length linear binary error-correcting codes on the binary-input Gaussian noise channel, and on the fast and quasi-static flat Rayleigh fading channel. We use the probabilistic Ordered Statistics Decoder, which is universal to any code construction. As such, we compare codes and not decoders. The word error rate versus the signal-to-noise ratio is found for LDPC, Reed–Muller, polar, turbo, Golay, random, and BCH codes at lengths of 20, 32 and 256 bits. BCH and random codes outperform other codes in the absence of a cyclic redundancy check concatenation. Under joint decoding, the concatenation of a cyclic redundancy check makes all codes perform very close to optimal lower bounds. Optimizations of the Ordered Statistics Decoder are discussed and shown to bring near-ML performance with a notable complexity reduction, making the decoding complexity at very short lengths affordable.
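The CRC concatenation referred to here is simply an outer check appended to the information bits; a minimal sketch using the standard CRC-8 generator $x^8+x^2+x+1$ as an example (any generator works the same way):

```python
def crc_bits(bits, poly=(1, 0, 0, 0, 0, 0, 1, 1, 1)):
    """Remainder of the message polynomial times x^8, modulo the generator
    polynomial (long division over GF(2))."""
    reg = list(bits) + [0] * (len(poly) - 1)
    for i in range(len(bits)):
        if reg[i]:                      # reduce the leading term
            for j, p in enumerate(poly):
                reg[i + j] ^= p
    return reg[len(bits):]              # the CRC: last deg(poly) bits

def concat_crc(info_bits):
    # CRC-concatenated message, checked jointly with the inner decoder
    # in the joint decoding the abstract refers to.
    return list(info_bits) + crc_bits(info_bits)

msg = [1, 0, 1, 1, 0, 0, 1]
cw = concat_crc(msg)
assert not any(crc_bits(cw))            # a valid codeword has zero remainder
```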


Journal ArticleDOI
01 Jul 2018
TL;DR: This paper defines an anti-power of order k as a concatenation of k consecutive pairwise distinct blocks of the same length, and derives that anti-powers of every order start at every position of an aperiodic uniformly recurrent word.
Abstract: In combinatorics of words, a concatenation of k consecutive equal blocks is called a power of order k. In this paper we take a different point of view and define an anti-power of order k as a concatenation of k consecutive pairwise distinct blocks of the same length. As a main result, we show that every infinite word contains powers of any order or anti-powers of any order; that is, the existence of powers or anti-powers is an unavoidable regularity. Indeed, we prove a stronger result, which relates the density of anti-powers to the existence of a factor that occurs with arbitrary exponent. From these results, we derive that anti-powers of every order start at every position of an aperiodic uniformly recurrent word. We further show that any infinite word avoiding anti-powers of order 3 is ultimately periodic, and that there exist aperiodic words avoiding anti-powers of order 4. We also show that there exist aperiodic recurrent words avoiding anti-powers of order 6, and leave open the question of whether there exist aperiodic recurrent words avoiding anti-powers of order k for k = 4, 5.
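The definition is easy to state in code; a small checker for anti-powers of order k:

```python
def is_anti_power(w, k):
    """True iff word w is an anti-power of order k: a concatenation of k
    consecutive pairwise-distinct blocks of equal length."""
    if k <= 0 or len(w) % k:
        return False
    m = len(w) // k
    blocks = [w[i * m:(i + 1) * m] for i in range(k)]
    return len(set(blocks)) == k        # all k blocks pairwise distinct

print(is_anti_power("aabbab", 3))   # True: blocks "aa", "bb", "ab" are distinct
print(is_anti_power("ababab", 3))   # False: it is a power, not an anti-power
```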

Proceedings ArticleDOI
01 Jun 2018
TL;DR: Two alternative deep neural architectures to perform word-level metaphor detection on text are presented and compared: a bi-LSTM model and a new structure based on recursive feed-forward concatenation of the input.
Abstract: We present and compare two alternative deep neural architectures to perform word-level metaphor detection on text: a bi-LSTM model and a new structure based on recursive feed-forward concatenation of the input. We discuss different versions of such models and the effect that input manipulation (specifically, reducing the length of sentences and introducing concreteness scores for words) has on their performance.

Book ChapterDOI
19 Aug 2018
TL;DR: In this paper, the authors present a new approach to design concretely efficient MPC protocols with semi-honest security in the dishonest majority setting, motivated by the fact that the efficiency of most practical protocols does not depend on the number of honest parties.
Abstract: We present a new approach to designing concretely efficient MPC protocols with semi-honest security in the dishonest majority setting. Motivated by the fact that within the dishonest majority setting the efficiency of most practical protocols does not depend on the number of honest parties, we investigate how to construct protocols which improve in efficiency as the number of honest parties increases. Our central idea is to take a protocol which is secure for \(n-1\) corruptions and modify it to use short symmetric keys, with the aim of basing security on the concatenation of all honest parties’ keys. This results in a more efficient protocol tolerating fewer corruptions, whilst also introducing an LPN-style syndrome decoding assumption.

Proceedings ArticleDOI
17 Jun 2018
TL;DR: This paper presents an implementation of a low-complexity polar SC decoder for deletion channels, and proves polarization theorems for the polar bit-channels in the presence of deletions when $d = o(n)$, which implies that the coding scheme is capable of achieving the symmetric information rate for this concatenated scheme with diminishing error probabilities as $n$ becomes large.
Abstract: In this paper, we propose a polar coding scheme for binary deletion channels. We also present an implementation of a low-complexity polar SC decoder for deletion channels. The modified decoding algorithm requires only $O(d^{2}n\log n)$ computational complexity, where $d$ and $n$ respectively denote the number of deletions and the code length. This is a huge improvement over the naive implementation of the SC decoder for channels with deletions, with $O(n^{d+1}\log n)$ computational complexity, that was recently proposed by Thomas et al. in [21] and is based on running individual instances of the SC decoder for every deletion pattern while treating the deleted symbols as erasures. We also prove polarization theorems for the polar bit-channels in the presence of deletions when $d = o(n)$, which implies that our coding scheme is capable of achieving the symmetric information rate for this concatenated scheme with diminishing error probabilities as $n$ becomes large. The same framework, in both theory and implementation, is also applicable to channels formed as a concatenation between binary discrete memoryless channels and the d-deletion channel, which marks our coding scheme as the first family of practical codes capable of decoding noisy channels with deletions at the optimal code rate.

Proceedings ArticleDOI
01 Oct 2018
TL;DR: It is shown that NMT models taking advantage of context oracle signals can achieve considerable gains in BLEU, of up to 7.02 BLEU for coreference and 1.89 BLEU for coherence on subtitles translation.
Abstract: Cross-sentence context can provide valuable information in Machine Translation and is critical for translation of anaphoric pronouns and for providing consistent translations. In this paper, we devise simple oracle experiments targeting coreference and coherence. Oracles are an easy way to evaluate the effect of different discourse-level phenomena in NMT using BLEU and eliminate the necessity to manually define challenge sets for this purpose. We propose two context-aware NMT models and compare them against models working on a concatenation of consecutive sentences. Concatenation models perform better, but are computationally expensive. We show that NMT models taking advantage of context oracle signals can achieve considerable gains in BLEU, of up to 7.02 BLEU for coreference and 1.89 BLEU for coherence on subtitles translation. Access to strong signals allows us to make clear comparisons between context-aware models.
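The concatenation baseline is simple to sketch: previous source sentences are prepended to the current one behind a separator token (the `<SEP>` token below is illustrative, not taken from the paper):

```python
def concat_context(sentences, i, n_prev=1, sep="<SEP>"):
    """Concatenation approach to context-aware NMT: prepend the previous
    n_prev source sentences, delimited by a separator token, so a standard
    encoder sees cross-sentence context (at extra computational cost, since
    the encoded sequences grow correspondingly longer)."""
    ctx = sentences[max(0, i - n_prev):i]
    return f" {sep} ".join(ctx + [sentences[i]])

doc = ["She saw the cat.", "It was asleep."]
print(concat_context(doc, 1))   # "She saw the cat. <SEP> It was asleep."
```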

Journal ArticleDOI
TL;DR: This article corrects the original publication (DOI: 10.1038/nrrheum.2017.220), in which concatenation and non-concatenation were incorrectly referred to as catenation and non-catenation.
Abstract: Nature Reviews Rheumatology 14, 75–93 (2018) In the original version of this article, concatenation and non-concatenation were incorrectly referred to as catenation and non-catenation in the subheadings in Table 2 and in a subheading on page 87 in the main text. These errors have now been corrected in the PDF and HTML versions of the article.

Journal ArticleDOI
TL;DR: In this paper, the authors study relative compression in a dynamic setting where the compressed source string S is subject to edit operations and present new data structures that achieve optimal time for updates and queries while using space linear in the size of the optimal relative compression, for nearly all combinations of parameters.
Abstract: Given a static reference string R and a source string S, a relative compression of S with respect to R is an encoding of S as a sequence of references to substrings of R. Relative compression schemes are a classic model of compression and have recently proved very successful for compressing highly-repetitive massive data sets such as genomes and web-data. We initiate the study of relative compression in a dynamic setting where the compressed source string S is subject to edit operations. The goal is to maintain the compressed representation compactly, while supporting edits and allowing efficient random access to the (uncompressed) source string. We present new data structures that achieve optimal time for updates and queries while using space linear in the size of the optimal relative compression, for nearly all combinations of parameters. We also present solutions for restricted and extended sets of updates. To achieve these results, we revisit the dynamic partial sums problem and the substring concatenation problem. We present new optimal or near optimal bounds for these problems. Plugging in our new results we also immediately obtain new bounds for the string indexing for patterns with wildcards problem and the dynamic text and static pattern matching problem.
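As a sketch of the relative compression model itself (not of the paper's dynamic data structures), here is a greedy encoder that parses S into (start, length) references to substrings of R; it assumes every character of S occurs somewhere in R:

```python
def relative_compress(S, R):
    """Greedy relative compression: encode S as (start, length) references
    to substrings of R. Quadratic and naive; for exposition only."""
    out, i = [], 0
    while i < len(S):
        best = (R.index(S[i]), 1)        # shortest possible reference
        for j in range(len(R)):          # try every start position in R
            l = 0
            while j + l < len(R) and i + l < len(S) and R[j + l] == S[i + l]:
                l += 1
            if l > best[1]:
                best = (j, l)
        out.append(best)
        i += best[1]
    return out

R, S = "abracadabra", "cadabraabra"
refs = relative_compress(S, R)
assert "".join(R[p:p + l] for p, l in refs) == S   # decoding recovers S
```

An edit to S then only touches the references overlapping the edited position, which is the setting the paper's data structures make efficient.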

Journal ArticleDOI
TL;DR: New constructions of binary and ternary locally repairable codes (LRCs) using cyclic codes and their concatenation are proposed, and a similar method to the binary case is applied to construct ternary LRCs with good parameters.
Abstract: New constructions of binary and ternary locally repairable codes (LRCs) using cyclic codes and their concatenation are proposed. The proposed binary LRCs with $d=4$ and some $r$, and with $d\ge 5$ and some $n$, are shown to be optimal in terms of the upper bounds. In addition, a similar method to the binary case is applied to construct ternary LRCs with good parameters.
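The paper's constructions are cyclic-code based, but the locality property they optimize is easy to illustrate with a toy code of locality r = 2 (this toy is mine, not the paper's construction): every symbol is recoverable from at most r other symbols of its local group:

```python
import numpy as np

def encode_lrc(info, r=2):
    """Toy LRC with locality r: split the info bits into groups of r
    (len(info) must be divisible by r) and append one XOR parity per group."""
    info = np.asarray(info) % 2
    groups = info.reshape(-1, r)
    parities = groups.sum(axis=1) % 2
    return np.hstack([groups, parities[:, None]])   # each row is a local group

def repair(group, erased_idx):
    # Local repair: XOR of the surviving symbols of the same group.
    return int(np.delete(group, erased_idx).sum() % 2)

cw = encode_lrc([1, 0, 1, 1], r=2)
assert repair(cw[0], 1) == cw[0][1]   # a lost symbol is rebuilt from r others
```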

Journal Article
TL;DR: This work investigates the quantifier alternation hierarchy of first-order logic over finite words through the separation problem, and obtains as a corollary that one can decide whether a regular language is definable by a $\Sigma_4$ formula.
Abstract: We investigate a famous decision problem in automata theory: separation. Given a class of languages C, the separation problem for C takes as input two regular languages and asks whether there exists a third one which belongs to C, includes the first one and is disjoint from the second. Typically, obtaining an algorithm for separation yields a deep understanding of the investigated class C. This explains why a lot of effort has been devoted to finding algorithms for the most prominent classes. Here, we are interested in classes within concatenation hierarchies. Such hierarchies are built using a generic construction process: one starts from an initial class called the basis and builds new levels by applying generic operations. The most famous one, the dot-depth hierarchy of Brzozowski and Cohen, classifies the languages definable in first-order logic. Moreover, it was shown by Thomas that it corresponds to the quantifier alternation hierarchy of first-order logic: each level in the dot-depth corresponds to the languages that can be defined with a prescribed number of quantifier blocks. Finding separation algorithms for all levels in this hierarchy is among the most famous open problems in automata theory. Our main theorem is generic: we show that separation is decidable for the level 3/2 of any concatenation hierarchy whose basis is finite. Furthermore, in the special case of the dot-depth, we push this result to the level 5/2. In logical terms, this solves separation for $\Sigma_3$: first-order sentences having at most three quantifier blocks starting with an existential one.

Proceedings ArticleDOI
01 Apr 2018
TL;DR: This paper develops and compares four methodologies to integrate the traditional $i$-vector into end-to-end systems, including score fusion, embedding concatenation, transformed concatenation and joint learning, all of which achieve significant gains.
Abstract: Factor analysis based $i$-vector has been the state-of-the-art method for speaker verification. Recently, researchers have proposed to build DNN based end-to-end speaker verification systems, achieving performance comparable with $i$-vector. Since these two methods possess their own properties and differ from each other significantly, we explore a framework to integrate the two paradigms and exploit their complementarity. More specifically, in this paper we develop and compare four methodologies to integrate the traditional $i$-vector into end-to-end systems: score fusion, embedding concatenation, transformed concatenation, and joint learning. All these approaches achieve significant gains. Moreover, hard trial selection is performed on the end-to-end architecture, which further improves the performance. Experimental results on a text-independent short-duration dataset generated from SRE 2010 reveal that the newly proposed method reduces the EER by a relative 31.0% and 28.2% compared to the $i$-vector and end-to-end baselines, respectively.
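A sketch of the embedding-level fusion variants; the length-normalization and the downstream PLDA scoring mentioned in the comments are standard-practice assumptions rather than details taken from the abstract:

```python
import numpy as np

def length_norm(v):
    return v / np.linalg.norm(v)

def fuse_embeddings(i_vector, e2e_embedding, transform=None):
    """Embedding concatenation: length-normalize both representations and
    concatenate. 'Transformed concatenation' would first map each through a
    learned matrix, represented here by an optional placeholder pair."""
    a, b = length_norm(i_vector), length_norm(e2e_embedding)
    if transform is not None:
        a, b = transform[0] @ a, transform[1] @ b    # learned projections
    return np.concatenate([a, b])                    # scored downstream, e.g. by PLDA

fused = fuse_embeddings(np.random.randn(400), np.random.randn(512))
print(fused.shape)  # (912,)
```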

Proceedings ArticleDOI
01 Sep 2018
TL;DR: Reverse concatenation of forward error correction and distribution matching significantly improves the implementability of probabilistic constellation shaping, but the practical aspects and trade-offs should be carefully understood to take full advantage of the benefits.
Abstract: Reverse concatenation of forward error correction and distribution matching significantly improves the implementation capability of probabilistic constellation shaping. However, to take full advantage of the benefits, one should carefully understand the practical aspects and trade-offs.

Posted ContentDOI
07 Jan 2018-bioRxiv
TL;DR: An integrative model of evolution which combines both the FBD and MSC models is developed, which coherently models fossilization and gene evolution, and does not require an a priori substitution rate estimate to calibrate the molecular clock.
Abstract: Bayesian methods can be used to accurately estimate species tree topologies, times and other parameters, but only when the models of evolution which are available and utilized sufficiently account for the underlying evolutionary processes. Multispecies coalescent (MSC) models have been shown to accurately account for the evolution of genes within species in the absence of strong gene flow between lineages, and fossilized birth-death (FBD) models have been shown to estimate divergence times from fossil data in good agreement with expert opinion. Until now, dating analyses using the MSC have been based on a fixed clock or informally derived node priors instead of the FBD. On the other hand, dating analyses using an FBD process have concatenated all gene sequences and ignored coalescence processes. To address these mirror-image deficiencies in evolutionary models, we have developed an integrative model of evolution which combines both the FBD and MSC models. By applying concatenation and the MSC (without employing the FBD process) to an exemplar data set consisting of molecular sequence data and morphological characters from the dog and fox subfamily Caninae, we show that concatenation causes predictable biases in estimated branch lengths. We then applied concatenation using the FBD process and the combined FBD-MSC model to show that the same biases are still observed when the FBD process is employed. These biases can be avoided by using the FBD-MSC model, which coherently models fossilization and gene evolution, and does not require an a priori substitution rate estimate to calibrate the molecular clock. We have implemented the FBD-MSC in a new version of StarBEAST2, a package developed for the BEAST2 phylogenetic software.

Journal ArticleDOI
TL;DR: By focusing on the minimum distance of the overall concatenated code, this work proposes an algorithmic method for the design of good interleavers that is compared with classical approaches based on random searches to assess its advantages.
Abstract: The choice of the interleaver may significantly affect the performance of short codes when they are used in serial concatenation. By focusing on the minimum distance of the overall concatenated code, we propose an algorithmic method for the design of good interleavers. As a valuable example of application, we consider the case of polar codes concatenated with cyclic redundancy check codes. For these codes, the method we propose is compared with classical approaches based on random searches to assess its advantages, which are also confirmed through examples of practical coded transmissions over the binary erasure channel.

Book ChapterDOI
09 Apr 2018
TL;DR: Variants of the union and concatenation operations on formal languages are investigated, in which Boolean logic in the definitions is replaced with the operations in the two-element field GF(2) (conjunction and exclusive OR), and a new class of formal grammars based on GF(2)-operations is defined.
Abstract: Variants of the union and concatenation operations on formal languages are investigated, in which Boolean logic in the definitions (that is, conjunction and disjunction) is replaced with the operations in the two-element field GF(2) (conjunction and exclusive OR). Union is thus replaced with symmetric difference, whereas concatenation gives rise to a new GF(2)-concatenation operation, which is notable for being invertible. All operations preserve regularity, and their state complexity is determined. Next, a new class of formal grammars based on GF(2)-operations is defined, and it is shown to have the same computational complexity as ordinary grammars with union and concatenation.
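For finite languages the new operation is easy to compute: a word survives GF(2)-concatenation only if its number of factorizations is odd, and it is this cancellation that makes the operation invertible:

```python
def gf2_concat(L1, L2):
    """GF(2)-concatenation of two finite languages: w is in the result iff
    the number of factorizations w = u v with u in L1 and v in L2 is odd.
    (Ordinary concatenation asks only for at least one factorization.)"""
    counts = {}
    for u in L1:
        for v in L2:
            counts[u + v] = counts.get(u + v, 0) + 1
    return {w for w, c in counts.items() if c % 2 == 1}

L1, L2 = {"a", "ab"}, {"b", ""}
print(sorted(gf2_concat(L1, L2)))
# ['a', 'abb'] -- 'ab' has two factorizations (a|b and ab|eps), so it cancels,
# whereas ordinary concatenation would also contain 'ab'.
```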

Proceedings ArticleDOI
02 Sep 2018
TL;DR: This work proposes a method for augmenting synthesized articulatory videos using pixel intensities in the regions enclosed by articulatory boundaries obtained from air-tissue boundaries (ATBs), and synthesizes the required ATBs from the ATBs of a few manually annotated frames that were used in synthesizing the articulatory videos.
Abstract: For the benefit of spoken language training, concatenation based articulatory video synthesis has been proposed in the past to overcome the limitations of articulatory data recording. For this, real time magnetic resonance imaging (rt-MRI) video image-frames (IFs) containing articulatory movements have been used. These IFs require visual augmentation for better understanding. In this work, we propose an augmentation method using pixel intensities in the regions enclosed by the articulatory boundaries obtained from air-tissue boundaries (ATBs). Since the pixel intensities reflect the muscle movements of the articulators, the augmented IFs can provide realistic articulatory movements when colored accordingly. However, manual ATB annotation is time consuming; hence, we propose to synthesize ATBs using the ATBs from a few selected frames that were used in synthesizing the articulatory videos. We augment a set of synthesized articulatory videos for 50 words obtained from the MRI-TIMIT database. A subjective evaluation of the quality of the augmented videos with twenty-one subjects suggests that the videos are visually more appealing than the respective synthesized rt-MRI videos, with a rating of 3.75 out of 5, where a score of 5 (1) indicates that the augmented video quality is excellent (poor).