Institution

Helsinki Institute for Information Technology

Facility · Espoo, Finland
About: Helsinki Institute for Information Technology is a facility based in Espoo, Finland. It is known for research contributions on the topics Population and Bayesian network. The organization has 630 authors who have published 1962 publications receiving 63426 citations.


Papers
Proceedings Article
01 Aug 2007
TL;DR: A method that learns a class-discriminative subspace or discriminative components of data, useful for visualization, dimensionality reduction, feature extraction, and for learning a regularized distance metric is introduced.
Abstract: We introduce a method that learns a class-discriminative subspace or discriminative components of data. Such a subspace is useful for visualization, dimensionality reduction, feature extraction, and for learning a regularized distance metric. We learn the subspace by optimizing a probabilistic semiparametric model, a mixture of Gaussians, of classes in the subspace. The semiparametric modeling leads to fast computation (O(N) for N samples) in each iteration of optimization, in contrast to recent nonparametric methods that take O(N^2) time, but with equal accuracy. Moreover, we learn the subspace in a semi-supervised manner from three kinds of data: labeled and unlabeled samples, and unlabeled samples with pairwise constraints, with a unified objective.

18 citations

Journal Article
TL;DR: A machine learning approach to optimize the meta-visualization, based on an information retrieval perspective, is introduced, which optimizes locations of visualizations on a display, so that visualizations giving similar information about data are close to each other.
Abstract: Visualization is crucial in the first steps of data analysis. In visual data exploration with scatter plots, no single plot is sufficient to analyze complicated high-dimensional data sets. Given numerous visualizations created with different features or methods, meta-visualization is needed to analyze the visualizations together. We solve how to arrange numerous visualizations onto a meta-visualization display, so that their similarities and differences can be analyzed. Visualization has recently been formalized as an information retrieval task; we extend this approach, and formalize meta-visualization as an information retrieval task whose performance can be rigorously quantified and optimized. We introduce a machine learning approach to optimize the meta-visualization, based on an information retrieval perspective: two visualizations are similar if the analyst would retrieve similar neighborhoods between data samples from either visualization. Based on the approach, we introduce a nonlinear embedding method for meta-visualization: it optimizes locations of visualizations on a display, so that visualizations giving similar information about data are close to each other. In experiments we show such meta-visualization outperforms alternatives, and yields insight into data in several case studies.

18 citations

Journal Article
TL;DR: In this article, a machine learning approach that prioritizes patient-customized drug combinations with a desired synergy-efficacy-toxicity balance by combining single-cell RNA sequencing with ex vivo single-agent testing in scarce patient-derived primary cells is proposed.
Abstract: The extensive drug resistance requires rational approaches to design personalized combinatorial treatments that exploit patient-specific therapeutic vulnerabilities to selectively target disease-driving cell subpopulations. To solve the combinatorial explosion challenge, we implemented an effective machine learning approach that prioritizes patient-customized drug combinations with a desired synergy-efficacy-toxicity balance by combining single-cell RNA sequencing with ex vivo single-agent testing in scarce patient-derived primary cells. When applied to two diagnostic and two refractory acute myeloid leukemia (AML) patient cases, each with a different genetic background, we accurately predicted patient-specific combinations that not only resulted in synergistic cancer cell co-inhibition but also were capable of targeting specific AML cell subpopulations that emerge in differing stages of disease pathogenesis or treatment regimens. Our functional precision oncology approach provides an unbiased means for systematic identification of personalized combinatorial regimens that selectively co-inhibit leukemic cells while avoiding inhibition of nonmalignant cells, thereby increasing their likelihood for clinical translation.

18 citations

Journal Article
TL;DR: This study proposes to use the probit classifier with a proper prior structure and multiple kernel learning with a proper kernel construction procedure to perform group-wise feature selection (i.e., eliminating a group of features together if they are not helpful) and shows the validity and effectiveness of the proposed binary classification algorithm variants.
Abstract: Many financial organizations such as banks and retailers use computational credit risk analysis (CRA) tools heavily due to recent financial crises and stricter regulations. This strategy enables them to manage their financial and operational risks within the pool of financial institutes. Machine learning algorithms, especially binary classifiers, are very popular for that purpose. In real-life applications such as CRA, feature selection algorithms are used to decrease data acquisition cost and to increase interpretability of the decision process. Using feature selection methods directly on CRA data sets may not help due to categorical variables such as marital status. Such features are usually converted into binary features using 1-of-k encoding, and eliminating a subset of features from a group does not help in terms of data collection cost or interpretability. In this study, we propose to use the probit classifier with a proper prior structure and multiple kernel learning with a proper kernel construction procedure to perform group-wise feature selection (i.e., eliminating a group of features together if they are not helpful). Experiments on two standard CRA data sets show the validity and effectiveness of the proposed binary classification algorithm variants.

18 citations

Journal Article
TL;DR: A methodology for assessing large-scale public transport network overhauls that builds on previous developments in service-equity assessment methods, is based on open timetable data, and reveals the disaggregate effects of the network overhaul from a three-level spatial perspective.

18 citations


Authors

Showing all 632 results

Name | H-index | Papers | Citations
Dimitri P. Bertsekas | 94 | 332 | 85939
Olli Kallioniemi | 90 | 353 | 42021
Heikki Mannila | 72 | 295 | 26500
Jukka Corander | 66 | 411 | 17220
Jaakko Kangasjärvi | 62 | 146 | 17096
Aapo Hyvärinen | 61 | 301 | 44146
Samuel Kaski | 58 | 522 | 14180
Nadarajah Asokan | 58 | 327 | 11947
Aristides Gionis | 58 | 292 | 19300
Hannu Toivonen | 56 | 192 | 19316
Nicola Zamboni | 53 | 128 | 11397
Jorma Rissanen | 52 | 151 | 22720
Tero Aittokallio | 52 | 271 | 8689
Juha Veijola | 52 | 261 | 19588
Juho Hamari | 51 | 176 | 16631
Network Information
Related Institutions (5)
Google
39.8K papers, 2.1M citations

93% related

Microsoft
86.9K papers, 4.1M citations

93% related

Carnegie Mellon University
104.3K papers, 5.9M citations

91% related

Facebook
10.9K papers, 570.1K citations

91% related

Performance
Metrics
No. of papers from the Institution in previous years
Year | Papers
2023 | 1
2022 | 4
2021 | 85
2020 | 97
2019 | 140
2018 | 127