Showing papers by "Kai Li" published in 2015


Journal ArticleDOI
TL;DR: This work presents SEEK (search-based exploration of expression compendia), a query-based search engine for very large transcriptomic data collections, including thousands of human data sets from many different microarray and high-throughput sequencing platforms.
Abstract: The search engine SEEK allows multigene query across a large number of human expression data sets from array and sequencing platforms.

132 citations


Proceedings ArticleDOI
16 Feb 2015
TL;DR: This paper shows that two families of advanced caching algorithms, Segmented-LRU and Greedy-Dual-Size-Frequency, can be easily implemented with RIPQ and shows that these algorithms running on RIPQ increase hit ratios up to ∼20% over the current FIFO system, incur low overhead, and achieve high throughput.
Abstract: Facebook uses flash devices extensively in its photo-caching stack. The key design challenge for an efficient photo cache on flash at Facebook is its workload: many small random writes are generated by inserting cache-missed content, or updating cache-hit content for advanced caching algorithms. The Flash Translation Layer on flash devices performs poorly with such a workload, lowering throughput and decreasing device lifespan. Existing coping strategies under-utilize the space on flash devices, sacrificing cache capacity, or are limited to simple caching algorithms like FIFO, sacrificing hit ratios. We overcome these limitations with the novel Restricted Insertion Priority Queue (RIPQ) framework that supports advanced caching algorithms with large cache sizes, high throughput, and long device lifespan. RIPQ aggregates small random writes, co-locates similarly prioritized content, and lazily moves updated content to further reduce device overhead. We show that two families of advanced caching algorithms, Segmented-LRU and Greedy-Dual-Size-Frequency, can be easily implemented with RIPQ. Our evaluation on Facebook's photo trace shows that these algorithms running on RIPQ increase hit ratios up to ∼20% over the current FIFO system, incur low overhead, and achieve high throughput.

97 citations
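
For readers unfamiliar with Segmented-LRU, one of the two algorithm families the RIPQ abstract above names, the following is a minimal in-memory sketch in Python. It is a textbook two-segment variant and not the paper's flash-based implementation; the class name `SegmentedLRU`, its method names, and the segment sizes are assumptions made for illustration.

```python
from collections import OrderedDict


class SegmentedLRU:
    """Textbook two-segment LRU held in memory (not RIPQ's flash layout)."""

    def __init__(self, probation_size, protected_size):
        self.probation_size = probation_size
        self.protected_size = protected_size
        # OrderedDict keeps insertion order: first item = least recently used.
        self.probation = OrderedDict()
        self.protected = OrderedDict()

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key in self.protected:
            self.protected.move_to_end(key)  # refresh recency
            return self.protected[key]
        if key in self.probation:
            value = self.probation.pop(key)
            self._promote(key, value)  # a hit in probation earns promotion
            return value
        return None

    def put(self, key, value):
        """Insert or update an item; new items start in the probation segment."""
        if key in self.protected:
            self.protected[key] = value
            self.protected.move_to_end(key)
            return
        self.probation[key] = value
        self.probation.move_to_end(key)
        self._trim_probation()

    def _promote(self, key, value):
        self.protected[key] = value
        if len(self.protected) > self.protected_size:
            # Demote the coldest protected item instead of evicting it outright.
            old_key, old_value = self.protected.popitem(last=False)
            self.probation[old_key] = old_value
            self._trim_probation()

    def _trim_probation(self):
        while len(self.probation) > self.probation_size:
            self.probation.popitem(last=False)  # evict least recently used


cache = SegmentedLRU(probation_size=2, protected_size=1)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # hit in probation: "a" moves to the protected segment
cache.put("c", 3)
cache.put("d", 4)      # probation overflows, so "b" is evicted
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```

The point of the two segments is that an item seen only once cannot push out items that have already been hit again, which is why such policies tend to beat FIFO on hit ratio.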


Patent
27 May 2015
TL;DR: This patent discloses stream locality delta compression, in which a previous stream indicated locale of data segments is selected and a first data segment is determined to be similar to a data segment in that locale.
Abstract: Stream locality delta compression is disclosed. A previous stream indicated locale of data segments is selected. A first data segment is then determined to be similar to a data segment in the stream indicated locale.

62 citations


Journal ArticleDOI
TL;DR: Full correlation matrix analysis demonstrates how advances in computer science can alleviate computational bottlenecks in neuroscience by accelerating a naive, serial approach and revealing a region of medial prefrontal cortex whose selectivity derived from differential patterns of functional connectivity across categories.

39 citations


Proceedings ArticleDOI
15 Nov 2015
TL;DR: This paper describes a closed-loop analysis system with FCMA on a cluster of nodes with Intel® Xeon Phi™ coprocessors and shows that the optimized single-node code runs 5x-16x faster than the baseline implementation using the well-known Intel® MKL and LibSVM libraries.
Abstract: Full correlation matrix analysis (FCMA) is an unbiased approach for exhaustively studying interactions among brain regions in functional magnetic resonance imaging (fMRI) data from human participants. In order to answer neuroscientific questions efficiently, we are developing a closed-loop analysis system with FCMA on a cluster of nodes with Intel® Xeon Phi™ coprocessors. Here we propose several ideas for data-driven algorithmic modification to improve the performance on the coprocessor. Our experiments with real datasets show that the optimized single-node code runs 5x-16x faster than the baseline implementation using the well-known Intel® MKL and LibSVM libraries, and that the cluster implementation achieves near linear speedup on 5760 cores.

18 citations
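
The core computation behind the FCMA abstract above, correlating every brain region with every other, can be illustrated with a small pure-Python sketch. It covers only the correlation step, not the paper's classification stage or its coprocessor optimizations; the function names and the list-of-lists input layout are assumptions chosen for this example.

```python
import math


def zscore(series):
    """Scale a time series to zero mean and unit population variance."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        raise ValueError("constant time series has no defined correlation")
    return [(x - mean) / std for x in series]


def full_correlation_matrix(voxels):
    """Pearson correlation between every pair of voxel time series.

    `voxels` is a list of equal-length time series, one per voxel
    (a hypothetical input layout chosen for this sketch).
    """
    z = [zscore(v) for v in voxels]
    n_time = len(z[0])
    # After z-scoring, each correlation is a dot product divided by the
    # number of time points, so the full matrix is Z times Z-transpose.
    return [
        [sum(a * b for a, b in zip(zi, zj)) / n_time for zj in z]
        for zi in z
    ]


voxels = [
    [1.0, 2.0, 3.0, 4.0],  # voxel 0
    [2.0, 4.0, 6.0, 8.0],  # voxel 1: rises with voxel 0
    [4.0, 3.0, 2.0, 1.0],  # voxel 2: falls as voxel 0 rises
]
corr = full_correlation_matrix(voxels)
print(round(corr[0][1], 6), round(corr[0][2], 6))  # 1.0 -1.0
```

Because the result has one entry per pair of voxels, the cost grows quadratically with the voxel count, and the work reduces to a large matrix multiplication. That is presumably why the abstract's baseline relies on an optimized library such as Intel MKL.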


01 Jan 2015
Abstract: Full correlation matrix analysis (FCMA) is an unbiased approach for exhaustively studying interactions among brain regions in functional magnetic resonance imaging (fMRI) data from human participants. In order to answer neuroscientific questions efficiently, we are developing a closed-loop analysis system with FCMA on a cluster of nodes with Intel® Xeon Phi™ coprocessors. Here we propose several ideas for data-driven algorithmic modification to improve the performance on the coprocessor. Our experiments with real datasets show that the optimized single-node code runs 5x-16x faster than the baseline implementation using the well-known Intel® MKL and LibSVM libraries, and that the cluster implementation achieves near linear speedup on 5760 cores.

2 citations