Ufuk Topcu

Researcher at University of Texas at Austin

Publications - 504

Citations - 11791

Ufuk Topcu is an academic researcher from University of Texas at Austin. The author has contributed to research in topics: Markov decision process & Computer science. The author has an h-index of 44 and has co-authored 437 publications receiving 9636 citations. Previous affiliations of Ufuk Topcu include Google & University of Illinois at Urbana–Champaign.

Papers

Posted Content

Generalization Bounds for Sparse Random Feature Expansions

Abolfazl Hashemi, +5 more

- 04 Mar 2021 -

arXiv: Machine Learning

TL;DR: In this article, the authors leverage ideas from compressive sensing to generate random feature expansions with theoretical guarantees even in the data-scarce setting, and provide generalization bounds for functions in a certain class (that is dense in a reproducing kernel Hilbert space).

Proceedings ArticleDOI

Online Learning with Implicit Exploration in Episodic Markov Decision Processes

Mahsa Ghasemi, +3 more

TL;DR: This work proposes a policy search algorithm that employs online mirror descent with an optimistically biased estimator of the loss function, and proves that the algorithm achieves, both in expectation and with high probability, a sublinear regret of $\tilde{\mathcal{O}}(\sqrt{L T \vert\mathcal{S}\vert \vert\mathcal{A}\vert})$.

Posted Content

Distributed Beamforming for Agents with Localization Errors.

Erfaun Noorani, +5 more

- 27 Mar 2020 -

arXiv: Signal Processing

TL;DR: This work considers a scenario in which a group of agents aims to collectively transmit a message signal to a client through beamforming, and develops two greedy algorithms and a difference-of-submodular algorithm that returns a locally optimal solution to a certain relaxation of the subset selection problem regardless of the maximum localization error.

Proceedings ArticleDOI

Risk-averse control of Markov decision processes with ω-regular objectives

Ruediger Ehlers, +2 more

TL;DR: This work introduces a new optimization criterion for MDP policies that captures the task of working towards the satisfaction of an infinite-time-horizon ω-regular specification, and gives an algorithm to compute policies that are optimal under this criterion. The criterion captures the ideas of optimism and risk-averseness in MDP control.

Journal ArticleDOI

Additive Logistic Mechanism for Privacy-Preserving Self-Supervised Learning

Yunhao Yang, +2 more

- 25 May 2022 -

arXiv.org

TL;DR: This work designs a post-training privacy-protection algorithm that adds noise to the neural network’s weights, together with a novel differential privacy mechanism that samples noise from the logistic distribution.
