Daxin Jiang

Researcher at Microsoft

Publications - 193
Citations - 8482

Daxin Jiang is an academic researcher at Microsoft who has contributed to research in topics including computer science and question answering. The author has an h-index of 32 and has co-authored 159 publications receiving 5330 citations. Previous affiliations of Daxin Jiang include Peking University and the University of Electronic Science and Technology of China.

Papers
Journal Article

Cluster analysis for gene expression data: a survey

TL;DR: This survey divides cluster analysis for gene expression data into three categories, presents the specific challenges pertinent to each category, introduces several representative approaches, and suggests promising trends in the field.
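As a concrete illustration of one representative family of approaches the survey covers, here is a minimal sketch that clusters a synthetic gene-expression matrix (genes x conditions) with k-means via scikit-learn; the data, cluster count, and preprocessing are illustrative assumptions, not details taken from the paper.

```python
# Minimal clustering sketch: k-means on a synthetic gene-expression matrix.
# Rows are genes, columns are experimental conditions; all values are synthetic.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
expression = rng.normal(size=(300, 12))              # 300 genes x 12 conditions (synthetic)
scaled = StandardScaler().fit_transform(expression)  # normalize each condition

kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(scaled)
for cluster_id in range(5):
    members = np.where(kmeans.labels_ == cluster_id)[0]
    print(f"cluster {cluster_id}: {len(members)} genes")
```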
Posted Content

CodeBERT: A Pre-Trained Model for Programming and Natural Languages

TL;DR: This work develops CodeBERT with a Transformer-based neural architecture and trains it with a hybrid objective function that incorporates the pre-training task of replaced token detection, i.e., detecting plausible alternative tokens sampled from generators.
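For a quick hands-on feel, the sketch below embeds a natural-language/code pair with the publicly released microsoft/codebert-base checkpoint through Hugging Face transformers; taking the first-position hidden state as the pair representation is a common convention here, not something prescribed by this summary.

```python
# Hedged sketch: embed an NL/code pair with the released CodeBERT checkpoint.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")

nl = "return the maximum of two numbers"
code = "def max_of_two(a, b): return a if a > b else b"

# NL and code are encoded as one bimodal sequence separated by special tokens.
inputs = tokenizer(nl, code, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)
pair_embedding = outputs.last_hidden_state[:, 0]  # first-position vector as pair representation
print(pair_embedding.shape)  # torch.Size([1, 768])
```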
Journal Article

Unicoder-VL: A Universal Encoder for Vision and Language by Cross-Modal Pre-Training.

TL;DR: After pre-training on large-scale image-caption pairs, Unicoder-VL is transferred to caption-based image-text retrieval and visual commonsense reasoning with just one additional output layer, demonstrating the power of cross-modal pre-training.
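The "one additional output layer" idea can be sketched generically: a single linear head scores whether a jointly encoded image-caption pair matches. The encoder output below is a random stand-in, not the actual Unicoder-VL weights or API.

```python
# Generic sketch of a retrieval head on top of a cross-modal encoder's pooled output.
import torch
import torch.nn as nn

class RetrievalHead(nn.Module):
    def __init__(self, hidden_size: int = 768):
        super().__init__()
        self.score = nn.Linear(hidden_size, 1)  # the single added output layer

    def forward(self, joint_embedding: torch.Tensor) -> torch.Tensor:
        # joint_embedding: (batch, hidden) pooled output of the cross-modal encoder
        return self.score(joint_embedding).squeeze(-1)

head = RetrievalHead()
fake_joint = torch.randn(4, 768)   # stand-in for real encoder output
print(head(fake_joint).shape)      # torch.Size([4]) image-caption matching scores
```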
Proceedings Article

Context-aware query suggestion by mining click-through and session data

TL;DR: This paper proposes a novel two-step context-aware query suggestion approach that mines click-through and session data, and outperforms two baseline methods in both coverage and quality of suggestions.
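A much-simplified sketch of the idea follows: it conditions suggestions only on the immediately preceding query in a session, whereas the paper clusters queries into concepts via the click-through bipartite graph and mines a concept-sequence suffix tree; the session log here is hypothetical.

```python
# Toy context-aware suggestion: rank follow-up queries by how often they
# followed the current query in past sessions (a simplification of the paper's method).
from collections import Counter, defaultdict

sessions = [  # hypothetical session logs
    ["python tutorial", "python list comprehension", "python lambda"],
    ["python tutorial", "python list comprehension"],
    ["java tutorial", "java streams"],
]

transitions = defaultdict(Counter)
for session in sessions:
    for prev_query, next_query in zip(session, session[1:]):
        transitions[prev_query][next_query] += 1

def suggest(context_query, k=3):
    """Return up to k candidate suggestions, most frequent first."""
    return [q for q, _ in transitions[context_query].most_common(k)]

print(suggest("python tutorial"))  # ['python list comprehension']
```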
Posted Content

GraphCodeBERT: Pre-training Code Representations with Data Flow

TL;DR: Results show that code structure and the newly introduced pre-training tasks improve GraphCodeBERT, which achieves state-of-the-art performance on the four downstream tasks; the model is also shown to prefer structure-level attention over token-level attention in the task of code search.
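The sketch below encodes a code snippet with the public microsoft/graphcodebert-base checkpoint in a code-search style; the paper's data-flow (variable dependency) graph input is omitted, so treat this as a token-only approximation rather than the full method.

```python
# Hedged sketch: token-only encoding with the released GraphCodeBERT checkpoint.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/graphcodebert-base")
model = AutoModel.from_pretrained("microsoft/graphcodebert-base")

code = "def add(a, b):\n    return a + b"
inputs = tokenizer(code, return_tensors="pt", truncation=True)
with torch.no_grad():
    vec = model(**inputs).last_hidden_state[:, 0]  # first-position vector for retrieval
print(vec.shape)  # torch.Size([1, 768])
```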