
Showing papers by "Helsinki University of Technology published in 2019"


Posted Content
TL;DR: Experimental results on the tasks of graph classification and molecular property prediction show that InfoGraph is superior to state-of-the-art baselines and InfoGraph* can achieve performance competitive with state-of-the-art semi-supervised models.
Abstract: This paper studies learning the representations of whole graphs in both unsupervised and semi-supervised scenarios. Graph-level representations are critical in a variety of real-world applications such as predicting the properties of molecules and community analysis in social networks. Traditional graph-kernel-based methods are simple yet effective for obtaining fixed-length representations of graphs, but they suffer from poor generalization due to hand-crafted designs. There are also some recent methods based on language models (e.g. graph2vec), but they tend to consider only certain substructures (e.g. subtrees) as graph representatives. Inspired by recent progress in unsupervised representation learning, in this paper we propose a novel method called InfoGraph for learning graph-level representations. We maximize the mutual information between the graph-level representation and the representations of substructures of different scales (e.g., nodes, edges, triangles). By doing so, the graph-level representations encode aspects of the data that are shared across different scales of substructures. Furthermore, we propose InfoGraph*, an extension of InfoGraph for semi-supervised scenarios. InfoGraph* maximizes the mutual information between unsupervised graph representations learned by InfoGraph and the representations learned by existing supervised methods. As a result, the supervised encoder learns from unlabeled data while preserving the latent semantic space favored by the current supervised task. Experimental results on the tasks of graph classification and molecular property prediction show that InfoGraph is superior to state-of-the-art baselines and InfoGraph* can achieve performance competitive with state-of-the-art semi-supervised models.
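The mutual-information objective can be sketched in a few lines. The following is an illustrative, simplified stand-in (a Jensen-Shannon-style estimator with a plain dot-product critic); the function name `jsd_mi_loss` and its inputs are assumptions for illustration, not InfoGraph's actual architecture:

```python
import math

def softplus(x):
    # numerically stable log(1 + e^x)
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def jsd_mi_loss(graph_emb, node_embs, neg_node_embs):
    """Jensen-Shannon-style mutual-information loss between a graph-level
    embedding and substructure (node) embeddings. Positive pairs come from
    the same graph, negatives from other graphs; minimizing this loss
    maximizes the MI lower bound."""
    pos = sum(softplus(-dot(graph_emb, h)) for h in node_embs) / len(node_embs)
    neg = sum(softplus(dot(graph_emb, h)) for h in neg_node_embs) / len(neg_node_embs)
    return pos + neg
```

Aligned positive pairs yield a lower loss than misaligned ones, which is the signal that drives the encoder.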

394 citations


Proceedings Article
24 May 2019
TL;DR: Manifold Mixup, as discussed by the authors, leverages semantic interpolations as additional training signal, obtaining neural networks with smoother decision boundaries at multiple levels of representation; as a result, neural networks trained with Manifold Mixup learn class representations with fewer directions of variance.
Abstract: Deep neural networks excel at learning the training data, but often provide incorrect and confident predictions when evaluated on slightly different test examples. This includes distribution shifts, outliers, and adversarial examples. To address these issues, we propose Manifold Mixup, a simple regularizer that encourages neural networks to predict less confidently on interpolations of hidden representations. Manifold Mixup leverages semantic interpolations as additional training signal, obtaining neural networks with smoother decision boundaries at multiple levels of representation. As a result, neural networks trained with Manifold Mixup learn class-representations with fewer directions of variance. We prove theory on why this flattening happens under ideal conditions, validate it on practical situations, and connect it to previous works on information theory and generalization. In spite of incurring no significant computation and being implemented in a few lines of code, Manifold Mixup improves strong baselines in supervised learning, robustness to single-step adversarial attacks, and test log-likelihood.
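The core operation is small enough to sketch. The function below is an illustrative pure-Python version (the name `manifold_mixup` and the Beta-distribution parameter are assumptions), mixing two hidden vectors and their one-hot labels with the same coefficient:

```python
import random

def manifold_mixup(h1, y1, h2, y2, alpha=2.0):
    """Interpolate two hidden representations and their one-hot labels with
    a shared lambda ~ Beta(alpha, alpha). In Manifold Mixup this is applied
    at a randomly chosen hidden layer rather than at the input."""
    lam = random.betavariate(alpha, alpha)
    h = [lam * a + (1 - lam) * b for a, b in zip(h1, h2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return h, y, lam
```

Because the same lambda mixes both the representations and the labels, the mixed targets stay consistent with the mixed hidden states.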

388 citations


Proceedings Article
01 Jan 2019
TL;DR: This work enables practical deep learning while preserving the benefits of Bayesian principles, applying techniques such as batch normalisation, data augmentation, and distributed training to achieve similar performance in about the same number of epochs as the Adam optimiser.
Abstract: Bayesian methods promise to fix many shortcomings of deep learning, but they are impractical and rarely match the performance of standard methods, let alone improve them. In this paper, we demonstrate practical training of deep networks with natural-gradient variational inference. By applying techniques such as batch normalisation, data augmentation, and distributed training, we achieve similar performance in about the same number of epochs as the Adam optimiser, even on large datasets such as ImageNet. Importantly, the benefits of Bayesian principles are preserved: predictive probabilities are well-calibrated, uncertainties on out-of-distribution data are improved, and continual-learning performance is boosted. This work enables practical deep learning while preserving benefits of Bayesian principles. A PyTorch implementation is available as a plug-and-play optimiser.

167 citations


Posted Content
TL;DR: DAWN (Dynamic Adversarial Watermarking of Neural Networks), the first approach to use watermarking to deter model extraction theft, is introduced and is shown to be resilient against two state-of-the-art model extraction attacks.
Abstract: Training machine learning (ML) models is expensive in terms of computational power, amounts of labeled data and human expertise. Thus, ML models constitute intellectual property (IP) and business value for their owners. Embedding digital watermarks during model training allows a model owner to later identify their models in case of theft or misuse. However, model functionality can also be stolen via model extraction, where an adversary trains a surrogate model using results returned from a prediction API of the original model. Recent work has shown that model extraction is a realistic threat. Existing watermarking schemes are ineffective against IP theft via model extraction since it is the adversary who trains the surrogate model. In this paper, we introduce DAWN (Dynamic Adversarial Watermarking of Neural Networks), the first approach to use watermarking to deter model extraction IP theft. Unlike prior watermarking schemes, DAWN does not impose changes to the training process but operates at the prediction API of the protected model, dynamically changing the responses for a small subset of queries. This allows model owners to demonstrate ownership with high confidence (e.g., $1 - 2^{-64}$) while incurring negligible loss of prediction accuracy (0.03-0.5%).
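The selection mechanism can be made concrete with a small sketch. This is not DAWN's exact construction; the key, rate, and helper names below are hypothetical, and the point is only that a keyed hash makes the watermarked subset secret, deterministic, and tiny:

```python
import hmac
import hashlib

SECRET_KEY = b"model-owner-secret"   # hypothetical watermarking key
WATERMARK_RATE = 256                 # roughly 1 in 256 queries is watermarked

def is_watermark_query(x_bytes):
    """Deterministically select a small, secret subset of queries. The same
    input always gets the same decision, so a surrogate model trained on
    API responses memorises the altered labels (the watermark)."""
    tag = hmac.new(SECRET_KEY, x_bytes, hashlib.sha256).digest()
    return tag[0] % WATERMARK_RATE == 0

def answer(x_bytes, true_label, num_classes=10):
    """Prediction-API wrapper: return a deterministically altered label for
    watermark queries and the honest prediction otherwise."""
    if is_watermark_query(x_bytes):
        tag = hmac.new(SECRET_KEY, x_bytes + b"|label", hashlib.sha256).digest()
        wrong = tag[0] % num_classes
        return wrong if wrong != true_label else (wrong + 1) % num_classes
    return true_label
```

Because the decision is a pure function of the query and the secret key, the model owner can later replay the watermark queries against a suspect model to check whether it reproduces the altered labels.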

93 citations


Posted Content
TL;DR: GraphMix is presented, a regularization method for Graph Neural Network based semi-supervised object classification, whereby it is proposed to train a fully-connected network jointly with the graph neural network via parameter sharing and interpolation-based regularization.
Abstract: We present GraphMix, a regularization method for Graph Neural Network based semi-supervised object classification, whereby we propose to train a fully-connected network jointly with the graph neural network via parameter sharing and interpolation-based regularization. Further, we provide a theoretical analysis of how GraphMix improves the generalization bounds of the underlying graph neural network, without making any assumptions about the "aggregation" layer or the depth of the graph neural networks. We experimentally validate this analysis by applying GraphMix to various architectures such as Graph Convolutional Networks, Graph Attention Networks and Graph-U-Net. Despite its simplicity, we demonstrate that GraphMix can consistently improve or closely match state-of-the-art performance using even simpler architectures such as Graph Convolutional Networks, across three established graph benchmarks: Cora, Citeseer and Pubmed citation network datasets, as well as three newly proposed datasets: Cora-Full, Co-author-CS and Co-author-Physics.

80 citations


Posted Content
TL;DR: In this article, the authors present an evaluation metric that can separately and reliably measure both of these aspects in image generation tasks by forming explicit, non-parametric representations of the manifolds of real and generated data.
Abstract: The ability to automatically estimate the quality and coverage of the samples produced by a generative model is a vital requirement for driving algorithm research. We present an evaluation metric that can separately and reliably measure both of these aspects in image generation tasks by forming explicit, non-parametric representations of the manifolds of real and generated data. We demonstrate the effectiveness of our metric in StyleGAN and BigGAN by providing several illustrative examples where existing metrics yield uninformative or contradictory results. Furthermore, we analyze multiple design variants of StyleGAN to better understand the relationships between the model architecture, training methods, and the properties of the resulting sample distribution. In the process, we identify new variants that improve the state-of-the-art. We also perform the first principled analysis of truncation methods and identify an improved method. Finally, we extend our metric to estimate the perceptual quality of individual samples, and use this to study latent space interpolations.
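The non-parametric manifold estimate behind such a metric can be sketched in a few lines: approximate the real-data manifold by a union of balls whose radii reach each sample's k-th nearest neighbour, then count how many generated samples fall inside. This pure-Python version (the function names are ours) is a simplified illustration of that idea, not the paper's implementation:

```python
def euclid(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def knn_radius(points, i, k):
    """Distance from points[i] to its k-th nearest other point."""
    d = sorted(euclid(points[i], p) for j, p in enumerate(points) if j != i)
    return d[k - 1]

def precision(real, fake, k=3):
    """Fraction of generated samples inside the kNN-ball approximation of
    the real-data manifold (sample quality)."""
    radii = [knn_radius(real, i, k) for i in range(len(real))]
    hits = sum(
        any(euclid(f, r) <= rad for r, rad in zip(real, radii))
        for f in fake
    )
    return hits / len(fake)

def recall(real, fake, k=3):
    """Symmetric measure: fraction of real samples covered by the
    generated-data manifold (sample coverage)."""
    return precision(fake, real, k)
```

Swapping the roles of the real and generated sets is what lets the two aspects — quality and coverage — be measured separately.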

76 citations


Journal ArticleDOI
TL;DR: In this paper, a catalog of coronal pressure waves modeled in 3D to study the potential role of these waves in accelerating solar energetic particles (SEPs) measured in situ is presented.
Abstract: We develop and exploit a new catalog of coronal pressure waves modeled in 3D to study the potential role of these waves in accelerating solar energetic particles (SEPs) measured in situ. Our sample comprises modeled shocks and SEP events detected during solar cycle 24 observed over a broad range of longitudes. From the 3D reconstruction of shock waves using coronagraphic observations we derived the 3D velocity along the entire front as a function of time. Combining new reconstruction techniques with global models of the solar corona, we derive the 3D distribution of basic shock parameters such as Mach numbers, compression ratios, and shock geometry. We then model in a time-dependent manner how the shock wave connects magnetically with spacecraft making in situ measurements of SEPs. This allows us to compare modeled shock parameters deduced at the magnetically well-connected regions, with different key parameters of SEPs such as their maximum intensity. This approach accounts for projection effects associated with remote-sensing observations and constitutes the most extensive study to date of shock waves in the corona and their relation to SEPs. We find a high correlation between the maximum flux of SEPs and the strength of coronal shock waves quantified, for instance, by the Mach number. We discuss the implications of that work for understanding particle acceleration in the corona.

71 citations


Posted Content
TL;DR: In this article, the authors provide a signal processing perspective on mmWave JRC systems with an emphasis on waveform design and performance criteria that optimally trade off between communications and radar functionalities.
Abstract: Synergistic design of communications and radar systems with common spectral and hardware resources is heralding a new era of efficiently utilizing a limited radio-frequency spectrum. Such a joint radar-communications (JRC) model has the advantages of low cost, compact size, low power consumption, spectrum sharing, improved performance, and safety due to enhanced information sharing. Today, millimeter-wave (mmWave) communications have emerged as the preferred technology for short-distance wireless links because they provide transmission bandwidth that is several gigahertz wide. This band is also promising for short-range radar applications, which benefit from the high range resolution arising from large transmit signal bandwidths. Signal processing techniques are critical in the implementation of mmWave JRC systems. Major challenges are joint waveform design and performance criteria that optimally trade off between communications and radar functionalities. Novel multiple-input multiple-output (MIMO) signal processing techniques are required because mmWave JRC systems employ large antenna arrays. There are opportunities to exploit recent advances in cognition, compressed sensing, and machine learning to reduce the required resources and dynamically allocate them with low overheads. This article provides a signal processing perspective on mmWave JRC systems with an emphasis on waveform design.

69 citations


Proceedings ArticleDOI
01 Jan 2019
TL;DR: In this paper, a coarse-to-fine CNN-based framework was proposed for dense pixel correspondence estimation between two images; the model is trained on synthetic transformations and generalizes well to unseen, realistic data.
Abstract: This paper addresses the challenge of dense pixel correspondence estimation between two images. This problem is closely related to the optical flow estimation task, where ConvNets (CNNs) have recently achieved significant progress. While optical flow methods produce very accurate results for small pixel translations and limited appearance variation scenarios, they hardly deal with the strong geometric transformations that we consider in this work. In this paper, we propose a coarse-to-fine CNN-based framework that can leverage the advantages of optical flow approaches and extend them to the case of large transformations, providing dense and subpixel-accurate estimates. It is trained on synthetic transformations and generalizes well to unseen, realistic data. Further, we apply our method to the problem of relative camera pose estimation and demonstrate that the model outperforms existing dense approaches.

67 citations


Posted Content
TL;DR: This work proposes Interpolated Adversarial Training, which employs recently proposed interpolation based training methods in the framework of adversarial training, which retains adversarial robustness while achieving a standard test error of only 6.45%.
Abstract: Adversarial robustness has become a central goal in deep learning, both in theory and in practice. However, successful methods to improve adversarial robustness (such as adversarial training) greatly hurt generalization performance on the unperturbed data. This could have a major impact on how adversarial robustness affects real-world systems (i.e. many may opt to forego robustness if it can improve accuracy on the unperturbed data). We propose Interpolated Adversarial Training, which employs recently proposed interpolation-based training methods in the framework of adversarial training. On CIFAR-10, adversarial training increases the standard test error (when there is no adversary) from 4.43% to 12.32%, whereas with our Interpolated Adversarial Training we retain the adversarial robustness while achieving a standard test error of only 6.45%. With our technique, the relative increase in the standard error for the robust model is reduced from 178.1% to just 45.5%. Moreover, we provide a mathematical analysis of Interpolated Adversarial Training to confirm its effectiveness and demonstrate its advantages in terms of robustness and generalization.
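The relative-increase figures follow directly from the quoted error rates. A quick check (using the rounded percentages given in the abstract, which reproduce 178.1% exactly and give 45.6% versus the quoted 45.5%, presumably due to rounding of the underlying error rates):

```python
def relative_increase(err, baseline):
    """Percentage increase in standard test error over the non-robust baseline."""
    return 100.0 * (err - baseline) / baseline

baseline = 4.43      # standard training, CIFAR-10
adv = 12.32          # plain adversarial training
interp_adv = 6.45    # Interpolated Adversarial Training

print(round(relative_increase(adv, baseline), 1))         # 178.1
print(round(relative_increase(interp_adv, baseline), 1))  # 45.6
```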

61 citations


Posted Content
TL;DR: This work proposes to address the problem of sim-to-real domain transfer by using meta learning to train a policy that can adapt to a variety of dynamic conditions, and using a task-specific trajectory generation model to provide an action space that facilitates quick exploration.
Abstract: Modern reinforcement learning methods suffer from low sample efficiency and unsafe exploration, making it infeasible to train robotic policies entirely on real hardware. In this work, we propose to address the problem of sim-to-real domain transfer by using meta learning to train a policy that can adapt to a variety of dynamic conditions, and using a task-specific trajectory generation model to provide an action space that facilitates quick exploration. We evaluate the method by performing domain adaptation in simulation and analyzing the structure of the latent space during adaptation. We then deploy this policy on a KUKA LBR 4+ robot and evaluate its performance on a task of hitting a hockey puck to a target. Our method shows more consistent and stable domain adaptation than the baseline, resulting in better overall performance.

Proceedings ArticleDOI
TL;DR: This work considers the problem of defining a robust measure of dimension for 0/1 datasets, and shows that the basic idea of fractal dimension can be adapted for binary data.
Abstract: Many 0/1 datasets have a very large number of variables; on the other hand, they are sparse and the dependency structure of the variables is simpler than the number of variables would suggest. Defining the effective dimensionality of such a dataset is a nontrivial problem. We consider the problem of defining a robust measure of dimension for 0/1 datasets, and show that the basic idea of fractal dimension can be adapted for binary data. However, as such the fractal dimension is difficult to interpret. Hence we introduce the concept of normalized fractal dimension. For a dataset $D$, its normalized fractal dimension is the number of columns in a dataset $D'$ with independent columns and having the same (unnormalized) fractal dimension as $D$. The normalized fractal dimension measures the degree of dependency structure of the data. We study the properties of the normalized fractal dimension and discuss its computation. We give empirical results on the normalized fractal dimension, comparing it against baseline measures such as PCA. We also study the relationship of the dimension of the whole dataset and the dimensions of subgroups formed by clustering. The results indicate interesting differences between and within datasets.

Proceedings Article
01 Jan 2019
TL;DR: New approaches to combining information encoded within the learned representations of auto-encoders such that a resynthesised output is trained to fool an adversarial discriminator for real versus synthesised data are explored.
Abstract: In this paper, we explore new approaches to combining information encoded within the learned representations of auto-encoders. We explore models that are capable of combining the attributes of multiple inputs such that a resynthesised output is trained to fool an adversarial discriminator for real versus synthesised data. Furthermore, we explore the use of such an architecture in the context of semi-supervised learning, where we learn a mixing function whose objective is to produce interpolations of hidden states, or masked combinations of latent representations that are consistent with a conditioned class label. We show quantitative and qualitative evidence that such a formulation is an interesting avenue of research.

Posted Content
TL;DR: The proposed Interpretable and Controllable face reenactment network (ICface) is compared to the state-of-the-art neural network based face animation techniques in multiple tasks and the results indicate that ICface produces better visual quality, while being more versatile than most of the comparison methods.
Abstract: This paper presents a generic face animator that is able to control the pose and expressions of a given face image. The animation is driven by human-interpretable control signals consisting of head pose angles and Action Unit (AU) values. The control information can be obtained from multiple sources including external driving videos and manual controls. Due to the interpretable nature of the driving signal, one can easily mix the information between multiple sources (e.g. pose from one image and expression from another) and apply selective post-production editing. The proposed face animator is implemented as a two-stage neural network model that is learned in a self-supervised manner using a large video collection. The proposed Interpretable and Controllable face reenactment network (ICface) is compared to state-of-the-art neural network-based face animation techniques in multiple tasks. The results indicate that ICface produces better visual quality while being more versatile than most of the comparison methods. The introduced model could provide a lightweight and easy-to-use tool for a multitude of advanced image and video editing tasks.

Proceedings ArticleDOI
TL;DR: This paper presents a simple greedy approach that builds a family of itemsets directly from data; the approach allows for complex interactions between the attributes, not just co-occurrences of 1s.
Abstract: The problem of selecting small groups of itemsets that represent the data well has recently gained a lot of attention. We approach the problem by searching for the itemsets that compress the data efficiently. As a compression technique we use decision trees combined with a refined version of MDL. More formally, assuming that the items are ordered, we create a decision tree for each item that may only depend on the previous items. Our approach allows us to find complex interactions between the attributes, not just co-occurrences of 1s. Further, we present a link between the itemsets and the decision trees and use this link to export the itemsets from the decision trees. In this paper we present two algorithms. The first one is a simple greedy approach that builds a family of itemsets directly from data. The second one, given a collection of candidate itemsets, selects a small subset of these itemsets. Our experiments show that these approaches result in compact and high quality descriptions of the data.

Posted Content
TL;DR: This work presents Ordinary Differential Equation Variational Auto-Encoder, a latent second order ODE model for high-dimensional sequential data that can simultaneously learn the embedding of high dimensional trajectories and infer arbitrarily complex continuous-time latent dynamics.
Abstract: We present Ordinary Differential Equation Variational Auto-Encoder (ODE$^2$VAE), a latent second order ODE model for high-dimensional sequential data. Leveraging the advances in deep generative models, ODE$^2$VAE can simultaneously learn the embedding of high dimensional trajectories and infer arbitrarily complex continuous-time latent dynamics. Our model explicitly decomposes the latent space into momentum and position components and solves a second order ODE system, which is in contrast to recurrent neural network (RNN) based time series models and recently proposed black-box ODE techniques. In order to account for uncertainty, we propose probabilistic latent ODE dynamics parameterized by deep Bayesian neural networks. We demonstrate our approach on motion capture, image rotation and bouncing balls datasets. We achieve state-of-the-art performance in long term motion prediction and imputation tasks.
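The decomposition into position and momentum is the standard trick of rewriting a second-order ODE as a first-order system. A minimal scalar sketch (with a plain Euler step in place of ODE$^2$VAE's neural dynamics; the function name and signature are ours):

```python
def euler_second_order(pos, mom, accel, dt, steps):
    """Integrate s'' = accel(s, s') by splitting the state into position
    and momentum components, mirroring ODE2VAE's latent-space
    decomposition (Euler scheme for illustration only)."""
    traj = [pos]
    for _ in range(steps):
        a = accel(pos, mom)
        pos = pos + mom * dt   # position advances along the momentum
        mom = mom + a * dt     # momentum advances along the acceleration
        traj.append(pos)
    return traj
```

With zero acceleration the position advances linearly; with `accel = lambda s, v: -s` the trajectory approximates a harmonic oscillator.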

Posted Content
TL;DR: In this paper, a numerical accountant for evaluating the privacy loss of algorithms with continuous one-dimensional output is proposed, which can be applied to the subsampled multidimensional Gaussian mechanism that underlies the popular DP stochastic gradient descent.
Abstract: Differentially private (DP) machine learning has recently become popular. The privacy loss of DP algorithms is commonly reported using $(\varepsilon,\delta)$-DP. In this paper, we propose a numerical accountant for evaluating the privacy loss for algorithms with continuous one dimensional output. This accountant can be applied to the subsampled multidimensional Gaussian mechanism which underlies the popular DP stochastic gradient descent. The proposed method is based on a numerical approximation of an integral formula which gives the exact $(\varepsilon,\delta)$-values. The approximation is carried out by discretising the integral and by evaluating discrete convolutions using the fast Fourier transform algorithm. We give both theoretical error bounds and numerical error estimates for the approximation. Experimental comparisons with state-of-the-art techniques demonstrate significant improvements in bound tightness and/or computation time. Python code for the method can be found in Github (this https URL).
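The composition step the accountant accelerates is an ordinary convolution of discretised distributions. A naive O(nm) version (the paper replaces this with the FFT) makes the operation concrete:

```python
def convolve(p, q):
    """Discrete convolution of two probability mass functions on a shared,
    evenly spaced grid. Composing k DP mechanisms corresponds to convolving
    their discretised privacy-loss distributions k times."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r
```

Since each convolution of length-n grids costs O(n^2) this way, evaluating the repeated composition with the fast Fourier transform is what makes the accountant practical.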

Posted Content
TL;DR: In this paper, a deep neural network with a modular architecture, consisting of separate perception, policy, and trajectory parts, is employed to train robot manipulation policies fully on synthetic data or in simulation.
Abstract: Training end-to-end deep robot policies requires a lot of domain-, task-, and hardware-specific data, which is often costly to provide. In this work, we propose to tackle this issue by employing a deep neural network with a modular architecture, consisting of separate perception, policy, and trajectory parts. Each part of the system is trained fully on synthetic data or in simulation. The data is exchanged between parts of the system as low-dimensional latent representations of affordances and trajectories. The performance is then evaluated in a zero-shot transfer scenario using Franka Panda robot arm. Results demonstrate that a low-dimensional representation of scene affordances extracted from an RGB image is sufficient to successfully train manipulator policies. We also introduce a method for affordance dataset generation, which is easily generalizable to new tasks, objects and environments, and requires no manual pixel labeling.

Posted Content
TL;DR: It follows that there is no deterministic algorithm for maximal matchings or maximal independent sets that runs in o(Δ + log n / log log n) rounds; this is an improvement over prior lower bounds also as a function of n.
Abstract: There are distributed graph algorithms for finding maximal matchings and maximal independent sets in $O(\Delta + \log^* n)$ communication rounds; here $n$ is the number of nodes and $\Delta$ is the maximum degree. The lower bound by Linial (1987, 1992) shows that the dependency on $n$ is optimal: these problems cannot be solved in $o(\log^* n)$ rounds even if $\Delta = 2$. However, the dependency on $\Delta$ is a long-standing open question, and there is currently an exponential gap between the upper and lower bounds. We prove that the upper bounds are tight. We show that maximal matchings and maximal independent sets cannot be found in $o(\Delta + \log \log n / \log \log \log n)$ rounds with any randomized algorithm in the LOCAL model of distributed computing. As a corollary, it follows that there is no deterministic algorithm for maximal matchings or maximal independent sets that runs in $o(\Delta + \log n / \log \log n)$ rounds; this is an improvement over prior lower bounds also as a function of $n$.

Journal ArticleDOI
TL;DR: In this paper, surface shear waves at 22 MHz in a 0.5-micron-thick polymer film on a SiO2/Si substrate were investigated at low temperatures using suspended and non-suspended graphene as detectors.
Abstract: We have investigated surface shear waves at 22 MHz in a 0.5-micron-thick polymer film on SiO2/Si substrate at low temperatures using suspended and non-suspended graphene as detectors. By tracking ultrasound modes detected by oscillations of a trilayer graphene membrane both in vacuum and in helium superfluid, we assign the resonances to surface shear modes, generalized Love waves, in the resist/silicon-substrate system loaded with gold. The propagation velocity of these shear modes displays a logarithmic temperature dependence below 1 K, which is characteristic for modification of the elastic properties of a disordered solid owing to a large density of two level state (TLS) systems. For the dissipation of the shear mode, we find a striking logarithmic temperature dependence, which indicates a basic relation between the speed of the surface wave propagation and the mode dissipation.

Proceedings ArticleDOI
TL;DR: In this paper, for shape completion, a CNN is trained to take a partial view of the object as input and to output the completed shape as a voxel grid; dropout layers are enabled not only during training but also at run time, so that Monte Carlo sampling generates a set of shape samples representing the shape uncertainty.
Abstract: We present a method for planning robust grasps over uncertain shape-completed objects. For shape completion, a deep neural network is trained to take a partial view of the object as input and to output the completed shape as a voxel grid. The key part of the network is the dropout layers, which are enabled not only during training but also at run time to generate a set of shape samples representing the shape uncertainty through Monte Carlo sampling. Given the set of shape-completed objects, we generate grasp candidates on the mean object shape but evaluate them based on their joint performance in terms of analytical grasp metrics on all the shape candidates. We experimentally validate and benchmark our method against another state-of-the-art method with a Barrett hand on 90,000 grasps in simulation and 200 grasps on a real Franka Emika Panda. All experimental results show statistically significant improvements both in terms of grasp quality metrics and grasp success rate, demonstrating that planning shape-uncertainty-aware grasps brings significant advantages over planning solely on a single shape estimate, especially when dealing with complex or unknown objects.
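The Monte Carlo dropout step can be sketched generically; `forward` below is a hypothetical stand-in for the shape-completion network, and the mask-based dropout is a simplification of dropout layers left active at run time:

```python
import random

def mc_dropout_samples(forward, x, n_samples=30, p_drop=0.5):
    """Monte Carlo dropout: keep dropout active at inference and collect
    several stochastic completions; their per-voxel mean and variance
    summarise the shape uncertainty."""
    samples = []
    for _ in range(n_samples):
        mask = [1.0 if random.random() > p_drop else 0.0 for _ in range(len(x))]
        samples.append(forward(x, mask))
    mean = [sum(col) / n_samples for col in zip(*samples)]
    var = [sum((v - m) ** 2 for v in col) / n_samples
           for col, m in zip(zip(*samples), mean)]
    return mean, var
```

Grasp candidates can then be scored jointly over the sampled shapes rather than over a single point estimate.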

Posted Content
TL;DR: This work evaluates the current state-of-the-art model extraction attack (Knockoff nets) against complex models, and introduces a defense based on distinguishing queries used for Knockoff nets from benign queries.
Abstract: Recently, machine learning (ML) has introduced advanced solutions to many domains. Since ML models provide business advantage to model owners, protecting intellectual property of ML models has emerged as an important consideration. Confidentiality of ML models can be protected by exposing them to clients only via prediction APIs. However, model extraction attacks can steal the functionality of ML models using the information leaked to clients through the results returned via the API. In this work, we question whether model extraction is a serious threat to complex, real-life ML models. We evaluate the current state-of-the-art model extraction attack (Knockoff nets) against complex models. We reproduce and confirm the results in the original paper. But we also show that the performance of this attack can be limited by several factors, including ML model architecture and the granularity of API response. Furthermore, we introduce a defense based on distinguishing queries used for Knockoff nets from benign queries. Despite the limitations of the Knockoff nets, we show that a more realistic adversary can effectively steal complex ML models and evade known defenses.

Journal ArticleDOI
TL;DR: The speedup of the adiabatic population transfer in a three-level superconducting transmon circuit is demonstrated by suppressing the spurious nonadiabatic excitations with an additional two-photon microwave pulse.
Abstract: Adiabatic manipulation of the quantum state is an essential tool in modern quantum information processing. Here we demonstrate the speed-up of the adiabatic population transfer in a three-level superconducting transmon circuit by suppressing the spurious non-adiabatic excitations with an additional two-photon microwave pulse. We apply this superadiabatic method to the stimulated Raman adiabatic passage, realizing fast and robust population transfer from the ground state to the second excited state of the quantum circuit.

Posted Content
TL;DR: A pose-kernel structure is proposed that encourages similar poses to have resembling latent spaces; a lightweight estimation scheme circumvents standard pitfalls in scaling Gaussian process inference and can run in real time on smart devices.
Abstract: We propose a novel idea for depth estimation from multi-view image-pose pairs, where the model has capability to leverage information from previous latent-space encodings of the scene. This model uses pairs of images and poses, which are passed through an encoder--decoder model for disparity estimation. The novelty lies in soft-constraining the bottleneck layer by a nonparametric Gaussian process prior. We propose a pose-kernel structure that encourages similar poses to have resembling latent spaces. The flexibility of the Gaussian process (GP) prior provides adapting memory for fusing information from previous views. We train the encoder--decoder and the GP hyperparameters jointly end-to-end. In addition to a batch method, we derive a lightweight estimation scheme that circumvents standard pitfalls in scaling Gaussian process inference, and demonstrate how our scheme can run in real-time on smart devices.
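One way to realise such a pose kernel, shown purely as an illustrative guess at the structure (the factorised squared-exponential form, the pose representation, and all parameter names are assumptions, not the paper's kernel):

```python
import math

def pose_kernel(p1, p2, ell_t=1.0, ell_r=0.5, sigma2=1.0):
    """Hypothetical pose kernel: a squared-exponential term over translation
    distance times a similar term over orientation difference, so nearby
    camera poses get strongly correlated latent codes."""
    t1, r1 = p1  # (xyz translation, rotation angle in radians)
    t2, r2 = p2
    dt2 = sum((a - b) ** 2 for a, b in zip(t1, t2))
    dr2 = (r1 - r2) ** 2
    return sigma2 * math.exp(-0.5 * dt2 / ell_t ** 2) * math.exp(-0.5 * dr2 / ell_r ** 2)
```

A real implementation would use a proper distance on rotations (e.g. a quaternion geodesic distance) rather than a raw angle difference.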

Posted Content
TL;DR: In this article, the authors consider the problem of discovering polarized communities in signed networks and develop two intuitive spectral algorithms: one deterministic, and one randomized with quality guarantee, tight up to constant factors.
Abstract: Signed networks contain edge annotations to indicate whether each interaction is friendly (positive edge) or antagonistic (negative edge). The model is simple but powerful and it can capture novel and interesting structural properties of real-world phenomena. The analysis of signed networks has many applications from modeling discussions in social media, to mining user reviews, and to recommending products in e-commerce sites. In this paper we consider the problem of discovering polarized communities in signed networks. In particular, we search for two communities (subsets of the network vertices) where within communities there are mostly positive edges while across communities there are mostly negative edges. We formulate this novel problem as a "discrete eigenvector" problem, which we show to be NP-hard. We then develop two intuitive spectral algorithms: one deterministic, and one randomized with quality guarantee $\sqrt{n}$ (where $n$ is the number of vertices in the graph), tight up to constant factors. We validate our algorithms against non-trivial baselines on real-world signed networks. Our experiments confirm that our algorithms produce higher quality solutions, are much faster and can scale to much larger networks than the baselines, and are able to detect ground-truth polarized communities.
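The sign-rounding idea behind such spectral algorithms can be sketched as follows (an illustration of the general approach, not the paper's exact deterministic or randomized algorithm): take a leading eigenvector of the signed adjacency matrix and assign vertices to the two communities by the sign of their entries.

```python
import numpy as np

# Signed adjacency matrix: {0,1} and {2,3} are internally positive,
# all cross edges are negative (a perfectly polarized toy network).
A = np.array([
    [ 0,  1, -1, -1],
    [ 1,  0, -1, -1],
    [-1, -1,  0,  1],
    [-1, -1,  1,  0],
], dtype=float)

eigvals, eigvecs = np.linalg.eigh(A)
v = eigvecs[:, -1]                 # eigenvector of the largest eigenvalue

# Round by sign; entries near zero would be left in neither community.
community_1 = set(np.flatnonzero(v > 0))
community_2 = set(np.flatnonzero(v < 0))
print(community_1, community_2)
```

On this toy instance the leading eigenvector is proportional to (1, 1, -1, -1), so sign rounding recovers the two polarized communities exactly; real instances need the relaxations and quality guarantees developed in the paper.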

Journal ArticleDOI
TL;DR: Relocatable modular buildings could solve the challenges posed by quickly changing demographics in different types of regions and deliver both usability and circularity.
Abstract: Global megatrends such as urbanization and ageing of the population result in fast-paced demographic changes, which pose different types of challenges for different regions. While many rural municipalities bear the burden of under-used buildings, cities are in a hurry to develop new ones to meet new space demands. The purpose of this paper is to assess the potential of relocatable modular buildings to address these challenges, following the principles of circular economy, while at the same time offering usability. This multiple case study explores existing relocatable modular health-care buildings in Finland. The case buildings host hospital support functions, imaging services, a health-care centre and a care home. The primary data comprise 21 semi-structured interviews and observation during factory and site visits. Based on the findings, relocatable modular buildings have many benefits and provide a viable option for cities and municipalities struggling to meet their fluctuating space demands. Some challenges were also identified, mainly derived from the dimensional restrictions of the modules. This research contributes to the emerging body of knowledge on circular economy in the built environment. More specifically, the research provides a very concrete example of circularity and details a framework for usable and relocatable modular buildings. In conclusion, relocatable modular buildings could solve the challenges posed by quickly changing demographics in different types of regions and deliver both usability and circularity.

Journal ArticleDOI
TL;DR: In this paper, a non-iterative two-phase subspace-based DOA estimation method is proposed: the first phase estimates the noise subspace via eigendecomposition (ED) of a properly designed matrix without requiring the noise covariance matrix, and the second phase uses these results to estimate the noise covariance matrix and re-estimate the noise subspace via generalized ED.
Abstract: The uniform white noise assumption is one of the basic assumptions in most of the existing direction-of-arrival (DOA) estimation methods. In many applications, however, the non-uniform white noise model is more adequate. Then the noise variances at different sensors also have to be estimated as nuisance parameters while estimating DOAs. In this letter, different from the existing iterative methods that address the problem of non-uniform noise, a non-iterative two-phase subspace-based DOA estimation method is proposed. The first phase of the method is based on estimating the noise subspace via eigendecomposition (ED) of some properly designed matrix and it avoids estimating the noise covariance matrix. In the second phase, the results achieved in the first phase are used to estimate the noise covariance matrix, followed by estimating the noise subspace via generalized ED. Since the proposed method estimates DOAs in a non-iterative manner, it is computationally more efficient and has no convergence issues compared to the existing methods. Simulation results demonstrate better performance of the proposed method compared to other existing state-of-the-art methods.
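For context, here is a compact sketch of a classical subspace estimator (MUSIC) under the standard uniform-white-noise assumption, the baseline noise model this letter relaxes; the array geometry, source, and SNR below are illustrative choices, and this is not the proposed two-phase method.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 8, 400                      # sensors (half-wavelength ULA), snapshots
true_doa_deg = 20.0

def steering(theta_deg, m=M):
    """ULA steering vector for element spacing d = lambda/2."""
    theta = np.deg2rad(theta_deg)
    return np.exp(1j * np.pi * np.arange(m) * np.sin(theta))

# Simulate one narrowband source in uniform white noise
s = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)
noise = 0.1 * (rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N)))
X = np.outer(steering(true_doa_deg), s) + noise

R = X @ X.conj().T / N                       # sample covariance
eigvals, eigvecs = np.linalg.eigh(R)         # ascending eigenvalues
En = eigvecs[:, :-1]                         # noise subspace (one source)

# MUSIC pseudospectrum: peaks where the steering vector is near-orthogonal
# to the noise subspace
grid = np.arange(-90.0, 90.0, 0.1)
spectrum = [1.0 / np.linalg.norm(En.conj().T @ steering(g)) ** 2 for g in grid]
est = grid[int(np.argmax(spectrum))]
print(f"estimated DOA: {est:.1f} deg")
```

Under non-uniform noise the smallest eigenvalues of R are no longer equal, so the noise subspace read off from a plain ED is biased; that is the failure mode the two-phase method is designed to avoid.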

Posted Content
TL;DR: This work presents the authenticated call stack (ACS), an approach that protects return addresses using chained message authentication codes (MACs); its prototype, PACStack, achieves security comparable to hardware-assisted shadow stacks without requiring dedicated hardware.
Abstract: A popular run-time attack technique is to compromise the control-flow integrity of a program by modifying function return addresses on the stack. So far, shadow stacks have proven to be essential for comprehensively preventing return address manipulation. Shadow stacks record return addresses in integrity-protected memory secured with hardware-assistance or software access control. Software shadow stacks incur high overheads or trade off security for efficiency. Hardware-assisted shadow stacks are efficient and secure, but require the deployment of special-purpose hardware. We present authenticated call stack (ACS), an approach that uses chained message authentication codes (MACs). Our prototype, PACStack, uses the ARM general purpose hardware mechanism for pointer authentication (PA) to implement ACS. Via a rigorous security analysis, we show that PACStack achieves security comparable to hardware-assisted shadow stacks without requiring dedicated hardware. We demonstrate that PACStack's performance overhead is small (~3%).
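The chained-MAC idea can be sketched with a toy model (illustrative only: the real PACStack uses ARM pointer authentication and keeps the chain value in a reserved register, not Python HMAC): each pushed return address is bound to the MAC of the previous stack state, so an attacker who swaps a single saved address cannot produce a consistent chain.

```python
import hmac
import hashlib

KEY = b"per-process secret key"      # hypothetical key, unreadable to attacker

def mac(prev_token: bytes, ret_addr: int) -> bytes:
    return hmac.new(KEY, prev_token + ret_addr.to_bytes(8, "little"),
                    hashlib.sha256).digest()

class AuthenticatedCallStack:
    def __init__(self):
        self.stack = []              # (ret_addr, prev_token): writable memory
        self.token = b"\x00" * 32    # chain head: models a protected register

    def call(self, ret_addr: int):
        self.stack.append((ret_addr, self.token))
        self.token = mac(self.token, ret_addr)

    def ret(self) -> int:
        ret_addr, prev_token = self.stack.pop()
        # Recompute the chain link; any modification of ret_addr or
        # prev_token in stack memory makes this check fail.
        if not hmac.compare_digest(mac(prev_token, ret_addr), self.token):
            raise RuntimeError("return-address tampering detected")
        self.token = prev_token
        return ret_addr

demo = AuthenticatedCallStack()
demo.call(0x401000)
demo.call(0x402000)
print(hex(demo.ret()))               # prints 0x402000: legitimate return
```

The security argument in the paper is precisely that the attacker can overwrite the stack entries but not the current chain token, so forging a return requires forging a MAC.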

Journal ArticleDOI
TL;DR: In this paper, the second-forbidden, non-unique, 2⁺→0⁺ ground-state transition in the β decay of ²⁰F was detected, and the β-decay branching ratio inferred from the measurement is bβ = [0.41±0.08(stat)±0.07(sys)]×10⁻⁵, corresponding to logft = 10.89(11).
Abstract: We report the first detection of the second-forbidden, nonunique, 2⁺→0⁺, ground-state transition in the β decay of ²⁰F. A low-energy, mass-separated ²⁰F⁺ beam produced at the IGISOL facility in Jyväskylä, Finland, was implanted in a thin carbon foil and the β spectrum measured using a magnetic transporter and a plastic-scintillator detector. The β-decay branching ratio inferred from the measurement is bβ = [0.41±0.08(stat)±0.07(sys)]×10⁻⁵, corresponding to logft = 10.89(11), making this one of the strongest second-forbidden, nonunique β transitions ever measured. The experimental result is supported by shell-model calculations and has significant implications for the final evolution of stars that develop degenerate oxygen-neon cores. Using the new experimental data, we argue that the astrophysical electron-capture rate on ²⁰Ne is now known to within better than 25% at the relevant temperatures and densities.
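For readers unfamiliar with the logft notation: it combines the dimensionless phase-space (Fermi) integral f with the partial half-life of the branch, so the quoted branching ratio and logft are related by the standard definition (with T₁/₂ the total half-life of ²⁰F):

```latex
% Comparative half-life of a beta branch: f is the Fermi phase-space
% integral, t the partial half-life of the branch.
\log ft = \log_{10}\!\bigl(f \cdot t\bigr), \qquad t = \frac{T_{1/2}}{b_\beta}
```

The tiny branching ratio drives the large partial half-life, which is why logft ≈ 10.9 marks this as an exceptionally strong second-forbidden, nonunique transition.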

Posted Content
TL;DR: This work improves and generalizes two constructions for CDCs, the improved linkage construction and the parallel linkage construction, into the generalized linkage construction and the multiblock generalized linkage construction, which yield many improved lower bounds for the cardinalities of CDCs.
Abstract: A constant-dimension code (CDC) is a set of subspaces of constant dimension in a common vector space with upper bounded pairwise intersection. We improve and generalize two constructions for CDCs, the improved linkage construction and the parallel linkage construction, to the generalized linkage construction, which in turn yields many improved lower bounds for the cardinalities of CDCs, a quantity not known in general.
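To make the setting concrete, the subspace metric underlying CDCs can be computed over GF(2) with elementary rank calculations: d(U, V) = dim(U+V) − dim(U∩V) = 2·dim(U+V) − dim U − dim V. This sketch illustrates only the metric, not the linkage constructions.

```python
import numpy as np

def rank_gf2(M):
    """Row rank of a 0/1 matrix over GF(2), by Gaussian elimination."""
    M = (np.array(M, dtype=np.uint8) % 2).copy()
    n_rows, n_cols = M.shape
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, n_rows) if M[r, col]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]   # move pivot row up
        for r in range(n_rows):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]               # eliminate column entries
        rank += 1
    return rank

def subspace_distance(U, V):
    """U, V: generator matrices whose rows span the subspaces."""
    dim_sum = rank_gf2(np.vstack([U, V]))     # dim(U + V)
    return 2 * dim_sum - rank_gf2(U) - rank_gf2(V)

# Two 2-dimensional subspaces of GF(2)^4 meeting only in {0}:
U = [[1, 0, 0, 0], [0, 1, 0, 0]]
V = [[0, 0, 1, 0], [0, 0, 0, 1]]
print(subspace_distance(U, V))   # prints 4, the maximum for two planes
```

A CDC with minimum distance d is then simply a set of constant-dimension subspaces whose pairwise distances, measured as above, are all at least d; bounding pairwise intersection bounds this distance from below.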