
Showing papers on "FLOPS published in 2017"


Journal ArticleDOI
TL;DR: In this paper, two 3-fold flops are exhibited, both of which have precisely one flopping curve and are neither algebraically nor analytically isomorphic, yet their curve-counting Gopakumar-Vafa invariants are the same.
Abstract: Two 3-fold flops are exhibited, both of which have precisely one flopping curve. One of the two flops is new, and is distinct from all known algebraic D4-flops. It is shown that the two flops are neither algebraically nor analytically isomorphic, yet their curve-counting Gopakumar-Vafa invariants are the same. We further show that the contraction algebras associated to both are not isomorphic, so the flops are distinguished at this level. This shows that the contraction algebra is a finer invariant than various curve-counting theories, and it also provides more evidence for the proposed analytic classification of 3-fold flops via contraction algebras.

10 citations


Proceedings ArticleDOI
01 Oct 2017
TL;DR: A novel sparse matrix storage format, combining block-based CSR (compressed sparse row) and COO (coordinate) formats and called BCSR&BCOO, is proposed, together with a thread-scalable computing kernel for sparse-dense matrix multiplication, called BSpMM.
Abstract: Deep Neural Networks (DNNs) are currently widely used in various applications, such as speech recognition and computer vision. The computation kernel of DNN-based applications is large sparse-dense matrix multiplication. Because the performance of existing methods and software libraries for sparse matrix multiplication falls short of expectations, real-time recognition has not yet been achieved. We therefore propose a novel sparse matrix storage format combining block-based CSR (compressed sparse row) and COO (coordinate) formats, called BCSR&BCOO, and a thread-scalable computing kernel for sparse-dense matrix multiplication, called BSpMM. We evaluate the performance of our proposed data structure and computing kernel in a real application: DNN-based online speech recognition. The experimental results demonstrate up to 4x speedup over Intel MKL on a typical CPU-based multicore system. A significant improvement in FLOPS is observed as well.
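The BCSR&BCOO layout and the BSpMM kernel from the paper are not publicly specified, but the underlying operation, multiplying a sparse matrix stored in plain CSR form by a dense matrix, can be sketched as follows. This is a minimal single-threaded illustration; the block layout and thread scalability that give the paper its speedup are omitted.

```python
import numpy as np

def csr_spmm(indptr, indices, data, B):
    """Multiply a CSR sparse matrix A (m x k) by a dense matrix B (k x n).

    CSR stores only the nonzeros: for row i, the entries data[indptr[i]:indptr[i+1]]
    sit in columns indices[indptr[i]:indptr[i+1]].
    """
    m = len(indptr) - 1
    C = np.zeros((m, B.shape[1]))
    for i in range(m):
        # Accumulate each nonzero A[i, j] times row j of the dense matrix.
        for p in range(indptr[i], indptr[i + 1]):
            C[i, :] += data[p] * B[indices[p], :]
    return C

# Example sparse matrix A = [[1, 0, 2],
#                            [0, 0, 3]] in CSR form:
indptr = np.array([0, 2, 3])
indices = np.array([0, 2, 2])
data = np.array([1.0, 2.0, 3.0])
B = np.ones((3, 2))
print(csr_spmm(indptr, indices, data, B))  # [[3. 3.] [3. 3.]]
```

Blocked variants such as the paper's BCSR group nonzeros into dense sub-blocks so the inner loop can use contiguous, cache-friendly dense arithmetic across threads.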

1 citation