Approximated and User Steerable tSNE for Progressive Visual Analytics
Citations
Guidelines for the use of flow cytometry and cell sorting in immunological studies (second edition)
The art of using t-SNE for single-cell transcriptomics
Towards better analysis of machine learning models: A visual analytics perspective
Automated optimized parameters for T-distributed stochastic neighbor embedding improve visualization and analysis of large datasets
The Role of Uncertainty, Awareness, and Trust in Visual Analytics
References
Visualizing Data using t-SNE
Human-level control through deep reinforcement learning
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
Learning Multiple Layers of Features from Tiny Images
Graph drawing by force-directed placement
Related Papers (5)
A global geometric framework for nonlinear dimensionality reduction
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
Frequently Asked Questions (14)
Q2. What have the authors stated for future works in "Approximated and user steerable tSNE for progressive visual analytics"?
In the future, the authors want to explore the application of A-tSNE in other research scenarios.
Q3. Why does tSNE build on the iterative gradient descent technique?
The minimization in tSNE builds on the iterative gradient descent technique [4]; each iteration therefore yields a complete intermediate embedding that can be visualized directly, and the user can interact with these intermediate results.
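This property can be illustrated with a minimal sketch (not the authors' implementation): a toy tSNE gradient descent loop that yields the full 2-D embedding after every step, so a renderer can draw each intermediate result as it appears.

```python
# Minimal sketch of per-iteration tSNE gradient descent (illustrative only).
# P is a precomputed, symmetric high-dimensional affinity matrix summing to 1.
import numpy as np

def tsne_iterations(P, n_points, n_iters=100, lr=100.0, seed=0):
    """Yield the 2-D embedding after every gradient descent step."""
    rng = np.random.default_rng(seed)
    Y = rng.normal(scale=1e-4, size=(n_points, 2))
    for _ in range(n_iters):
        # Student-t low-dimensional affinities Q.
        d2 = np.square(Y[:, None, :] - Y[None, :, :]).sum(-1)
        num = 1.0 / (1.0 + d2)
        np.fill_diagonal(num, 0.0)
        Q = num / num.sum()
        # Gradient of the KL divergence between P and Q.
        PQ = (P - Q) * num
        grad = 4.0 * (PQ.sum(1)[:, None] * Y - PQ @ Y)
        Y = Y - lr * grad
        yield Y.copy()  # a complete intermediate embedding, ready to draw
```

Each value produced by the generator is a valid snapshot of the evolving embedding, which is what makes a per-iteration visualization loop possible.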
Q4. How does the algorithm compute the approximated neighborhoods?
In this work, the authors use a space-partitioning technique, a Forest of Randomized KD-Trees [38], to compute the approximated neighborhoods.
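A toy sketch of the idea (pure NumPy, not the FLANN implementation the paper builds on [38]): each tree splits the data along a random direction, a query descends to one leaf per tree, and the union of the reached leaves is scanned exactly for the nearest neighbors.

```python
# Toy forest of randomized projection trees for approximated kNN.
import numpy as np

def build_tree(X, idx, rng, leaf_size=16):
    if len(idx) <= leaf_size:
        return ('leaf', idx)
    d = rng.normal(size=X.shape[1])          # random split direction
    proj = X[idx] @ d
    thr = np.median(proj)
    left, right = idx[proj <= thr], idx[proj > thr]
    if len(left) == 0 or len(right) == 0:    # degenerate split -> stop
        return ('leaf', idx)
    return ('node', d, thr, build_tree(X, left, rng, leaf_size),
            build_tree(X, right, rng, leaf_size))

def query_tree(tree, x):
    while tree[0] == 'node':
        _, d, thr, left, right = tree
        tree = left if x @ d <= thr else right
    return tree[1]

def approx_knn(X, x, forest, k):
    # Union of candidate leaves from every tree, then an exact scan.
    cand = np.unique(np.concatenate([query_tree(t, x) for t in forest]))
    dist = np.linalg.norm(X[cand] - x, axis=1)
    return cand[np.argsort(dist)[:k]]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
forest = [build_tree(X, np.arange(len(X)), rng) for _ in range(8)]
neigh = approx_knn(X, X[0], forest, k=5)
```

More trees raise the chance that the true neighbors land in at least one visited leaf, which is the precision/speed trade-off the paper exploits.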
Q5. How is the geometry shader used to generate a quad for each point?
A geometry shader generates a quad for each point, colored using the precomputed texture; the KDE is then obtained by drawing into a Frame Buffer Object with additive blending [30].
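A CPU analogue of this GPU technique may make it concrete (a sketch, not the paper's shader code): instead of a geometry shader emitting a textured quad per point into a Frame Buffer Object, a precomputed Gaussian kernel patch is "splatted" into a pixel buffer, with additive blending reduced to plain `+=` accumulation.

```python
# CPU sketch of KDE by additive splatting of a precomputed kernel texture.
import numpy as np

def kde_splat(points, size=128, radius=8, sigma=3.0):
    # Precomputed kernel patch (the quad's texture in the GPU version).
    ax = np.arange(-radius, radius + 1)
    gx, gy = np.meshgrid(ax, ax)
    kernel = np.exp(-(gx**2 + gy**2) / (2 * sigma**2))

    field = np.zeros((size, size))
    for x, y in points:                      # one "quad" per point
        cx, cy = int(round(x)), int(round(y))
        x0, x1 = max(cx - radius, 0), min(cx + radius + 1, size)
        y0, y1 = max(cy - radius, 0), min(cy + radius + 1, size)
        kx0, ky0 = x0 - (cx - radius), y0 - (cy - radius)
        field[y0:y1, x0:x1] += kernel[ky0:ky0 + (y1 - y0),
                                      kx0:kx0 + (x1 - x0)]  # additive blend
    return field
```

On the GPU the same accumulation happens in parallel for all points, which is why the density field can be recomputed at interactive rates.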
Q6. Why does the user have to wait for the first intermediate result to be generated?
Even with a per-iteration visualization of the intermediate results [10], [11], [12], [13], the initialization time forces the user to wait minutes, or even hours, before the first intermediate result can be generated on a state-of-the-art desktop computer.
Q7. How does the user make sure the clusters are not an artifact?
To make sure that the clusters are not an artifact introduced by the approximated similarities, the user refines the selected data points while the embedding evolves.
Q8. What are the requirements for the module that computes the approximated similarities?
The authors impose the following requirements on the modules that compute the approximated similarities (grey and red modules in Fig. 1): 1) the performance gain due to the approximation must be high enough to enable interaction.
Q9. What is the importance of allowing an interactive feedback loop?
In such a setting, it is crucial to allow an interactive feedback loop between modeling the data (i.e., finding the right number of dimensions for the PCA before embedding) and visualizing the data.
Q10. What are the three strategies used to select the data points to be refined?
The authors propose three strategies for selecting the data points to be refined: user selection, breadth-first search, and density-based refinement.
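The breadth-first-search strategy can be sketched as follows (a hedged illustration; `knn` and the function name are hypothetical, not the authors' code): starting from the points the user selected, walk the approximated kNN graph level by level and queue each newly reached point for refinement.

```python
# Sketch of BFS-based selection of points to refine over the kNN graph.
from collections import deque

def bfs_refinement_order(seed_points, knn, max_points=None):
    """knn: dict mapping a point index to its (approximated) neighbor indices."""
    order, seen = [], set(seed_points)
    queue = deque(seed_points)
    while queue and (max_points is None or len(order) < max_points):
        p = queue.popleft()
        order.append(p)                  # refine this point's neighborhood
        for q in knn[p]:
            if q not in seen:
                seen.add(q)
                queue.append(q)
    return order
```

The `max_points` cap keeps each refinement batch small, so the gradient descent is never stalled for long.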
Q11. What is the strategy to refine the embedding?
A naive strategy to refine the embedding is to progressively update the neighborhoods of all the points in X while the gradient descent optimization is computed.
Q12. What is the significance of the algorithm when dealing with real-time data?
Liu et al. [47] demonstrate that, when dealing with real-time data, the response time of the algorithm is of great importance to the user.
Q13. How long does it take to compute the high-dimensional similarities?
With such a parameterization, A-tSNE computes the high-dimensional similarities in ≈ 51 seconds, while BH-SNE requires 3 hours and 50 minutes.
Q14. How can the BH-SNE algorithm be used to preserve the structure of the data?
Reasonable results can be achieved even with low precision because each data point is usually connected to a large number of springs; therefore, the overall structure can be preserved.
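A toy illustration of this spring argument (not from the paper): a point attached by identical springs to many anchors settles at the anchors' mean, and removing a small fraction of the springs (a low-precision neighborhood) moves that equilibrium only slightly compared to the spread of the anchors, so the overall structure survives.

```python
# Toy demonstration: dropping some springs barely shifts the equilibrium.
import numpy as np

rng = np.random.default_rng(0)
anchors = rng.normal(size=(100, 2))          # many neighbors / springs
exact = anchors.mean(axis=0)                 # equilibrium with all springs

keep = rng.random(100) > 0.2                 # drop ~20% of the springs
approx = anchors[keep].mean(axis=0)          # equilibrium with the rest

shift = np.linalg.norm(exact - approx)       # displacement of the point
spread = anchors.std()                       # scale of the local structure
```

The displacement `shift` stays well below the anchor spread, mirroring why approximated similarities still yield a faithful embedding.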