David Pinto

Researcher at Benemérita Universidad Autónoma de Puebla

Publications -  159
Citations -  1659

David Pinto is an academic researcher from Benemérita Universidad Autónoma de Puebla. The author has contributed to research in topics: Cluster analysis & SemEval. The author has an h-index of 19 and has co-authored 158 publications receiving 1488 citations. Previous affiliations of David Pinto include Universidad Popular Autónoma del Estado de Puebla & Polytechnic University of Valencia.

Papers
Journal Article

Soft Similarity and Soft Cosine Measure: Similarity of Features in Vector Space Model

TL;DR: The proposed soft similarity measure generalizes the well-known cosine similarity measure of the VSM by taking the similarity of features into account, yielding what the authors call the “soft cosine measure”; various formulas for exact or approximate calculation of the soft cosine measure are proposed.
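
As an illustration of the idea summarized above, the following is a minimal sketch of the exact soft cosine computation, assuming the commonly cited form in which a feature-similarity matrix `s` weights every pair of features. The function names and the toy similarity matrix are hypothetical and are not taken from the paper.

```python
import math


def soft_dot(a, b, s):
    """Sum of s[i][j] * a[i] * b[j] over all feature pairs (i, j)."""
    return sum(
        s[i][j] * a[i] * b[j]
        for i in range(len(a))
        for j in range(len(b))
    )


def soft_cosine(a, b, s):
    """Soft cosine of vectors a and b under feature-similarity matrix s.

    When s is the identity matrix this reduces to the ordinary cosine.
    Returns 0.0 if either vector has zero soft norm.
    """
    denom = math.sqrt(soft_dot(a, a, s)) * math.sqrt(soft_dot(b, b, s))
    return soft_dot(a, b, s) / denom if denom else 0.0


# Two documents that share no feature: ordinary cosine is 0.
doc_a = [1.0, 0.0]
doc_b = [0.0, 1.0]

identity = [[1.0, 0.0], [0.0, 1.0]]
related = [[1.0, 0.5], [0.5, 1.0]]  # assumed similarity of 0.5 between the two features

print(soft_cosine(doc_a, doc_b, identity))  # 0.0
print(soft_cosine(doc_a, doc_b, related))   # 0.5
```

The double loop is quadratic in the number of features, which is presumably why approximate formulas are also of interest.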
Proceedings Article

On cross-lingual plagiarism analysis using a statistical model

TL;DR: The process for automatic cross-lingual plagiarism analysis based on a statistical bilingual dictionary has shown good results, and the authors consider that it could also be useful for the cross-lingual near-duplicate detection task.
Proceedings Article

A statistical approach to crosslingual natural language tasks

TL;DR: In this paper, a statistical IBM 1 word alignment model (M1) is proposed to align the words of a sentence in a source language with the words of a sentence in a target language.
Book Chapter

Clustering Narrow-Domain Short Texts by Using the Kullback-Leibler Distance

TL;DR: This paper addresses the problem of clustering short texts using a new measure of distance between documents based on the symmetric Kullback-Leibler distance, and indicates that this measure can be used for the addressed problem, obtaining results comparable to those obtained with the Jaccard similarity measure.
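
To make the distance mentioned above concrete, here is a minimal sketch of one standard symmetric form of the Kullback-Leibler distance, D(P||Q) + D(Q||P), applied to term distributions of two short texts. The additive smoothing constant `epsilon`, the function names, and the toy documents are assumptions for illustration; the paper's own smoothing scheme and symmetric variant may differ.

```python
import math
from collections import Counter


def term_distribution(tokens, vocabulary, epsilon=1e-3):
    """Smoothed term probabilities over a shared vocabulary.

    Additive smoothing (an assumption here) keeps every probability
    positive so that the logarithms below are always defined.
    """
    counts = Counter(tokens)
    total = len(tokens) + epsilon * len(vocabulary)
    return {t: (counts[t] + epsilon) / total for t in vocabulary}


def symmetric_kl(p, q):
    """D(P||Q) + D(Q||P), written as sum of (p - q) * log(p / q)."""
    return sum((p[t] - q[t]) * math.log(p[t] / q[t]) for t in p)


doc_a = "clustering short texts narrow domain".split()
doc_b = "clustering abstracts narrow domain corpus".split()
vocabulary = sorted(set(doc_a) | set(doc_b))

p = term_distribution(doc_a, vocabulary)
q = term_distribution(doc_b, vocabulary)

print(symmetric_kl(p, q))  # larger values mean more dissimilar texts
print(symmetric_kl(p, p))  # 0.0 for identical distributions
```

Each term of the sum is non-negative, so the distance is zero only when the two distributions coincide, and swapping the arguments leaves the value unchanged.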
Journal Article

A statistical approach to crosslingual natural language tasks

TL;DR: This work proposes to use a direct probabilistic crosslingual NLP system which integrates both steps, translation and the specific NLP task, into a single one, and uses the statistical IBM 1 word alignment model (M1).