Benjamin Wellner

Researcher at Mitre Corporation

Publications - 6
Citations - 228

Benjamin Wellner is an academic researcher from Mitre Corporation. The author has contributed to research in topics: Set (abstract data type) & Information extraction. The author has an h-index of 4, co-authored 5 publications receiving 206 citations. Previous affiliations of Benjamin Wellner include Brandeis University.

Papers
Journal ArticleDOI

The MITRE Identification Scrubber Toolkit: Design, training, and assessment

TL;DR: The open source MITRE Identification Scrubber Toolkit (MIST) provides an environment to support rapid tailoring of automated de-identification to different document types, using automatically learned classifiers to de-identify and protect sensitive information.
Journal ArticleDOI

Effects of personal identifier resynthesis on clinical text de-identification.

TL;DR: The de-identification tool achieves high accuracy when training and test sets are homogeneous (i.e., both real or both resynthesized records), but the resynthesis component regularizes the data to make them less "realistic," resulting in loss of performance, particularly when training on resynthesized data and testing on real data.
Journal ArticleDOI

Bootstrapping a de-identification system for narrative patient records: Cost-performance tradeoffs

TL;DR: The human annotation effort needed to produce a system that de-identifies at high accuracy using the MIST framework is quantified, suggesting that the wider variety of protected health information and its contexts in social work notes is more difficult to model.
Proceedings ArticleDOI

Evaluating the automatic mapping of human gene and protein mentions to unique identifiers.

TL;DR: A challenge task for the second BioCreAtIvE (Critical Assessment of Information Extraction in Biology) that requires participating systems to provide lists of the EntrezGene (formerly LocusLink) identifiers for all human genes and proteins mentioned in a MEDLINE abstract is developed.
Journal ArticleDOI

The “Coherent Data Set”: Combining Patient Data and Imaging in a Comprehensive, Synthetic Health Record

TL;DR: The Coherent Data Set is a novel synthetic data set that leverages structured data from Synthea™ to create a longitudinal, “coherent” patient-level electronic health record (EHR).