Showing papers by "John Platt" published in 2005

Proceedings Article•

Multiple Instance Boosting for Object Detection

Cha Zhang¹, John Platt¹, Paul A. Viola¹•Institutions (1)

05 Dec 2005

TL;DR: MILBoost trains the Viola-Jones detector cascade with cost functions from the Multiple Instance Learning literature combined with the AnyBoost framework, and its higher detection rate shows the advantage of simultaneously learning the locations and scales of the objects in the training set along with the parameters of the classifier.

Abstract: A good image object detection algorithm is accurate, fast, and does not require exact locations of objects in a training set. We can create such an object detector by taking the architecture of the Viola-Jones detector cascade and training it with a new variant of boosting that we call MILBoost. MILBoost uses cost functions from the Multiple Instance Learning literature combined with the AnyBoost framework. We adapt the feature selection criterion of MILBoost to optimize the performance of the Viola-Jones cascade. Experiments show that the detection rate is up to 1.6 times better using MILBoost. This increased detection rate shows the advantage of simultaneously learning the locations and scales of the objects in the training set along with the parameters of the classifier.
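
As a rough illustration of how a Multiple Instance Learning cost can supply AnyBoost-style example weights, the sketch below assumes the noisy-OR bag model, which is one cost function from the MIL literature; the abstract does not say which cost is used. Each training image is treated as a bag of candidate windows, a bag is positive if at least one window is, and each window's weight is the derivative of the bag log-likelihood with respect to that window's score. All function and variable names here are hypothetical.

```python
import math

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

def bag_probability(scores):
    """Noisy-OR (assumed cost): a bag is positive if any instance is positive."""
    prob_all_negative = 1.0
    for y in scores:
        prob_all_negative *= 1.0 - sigmoid(y)
    return 1.0 - prob_all_negative

def log_likelihood(bags, labels):
    """Sum over bags of log p_i (label 1) or log(1 - p_i) (label 0)."""
    total = 0.0
    for scores, t in zip(bags, labels):
        p = bag_probability(scores)
        total += math.log(p) if t == 1 else math.log(1.0 - p)
    return total

def instance_weights(bags, labels):
    """Derivative of the log-likelihood w.r.t. each instance score:
    w_ij = (t_i - p_i) / p_i * p_ij under the noisy-OR model."""
    weights = []
    for scores, t in zip(bags, labels):
        p = bag_probability(scores)
        weights.append([(t - p) / p * sigmoid(y) for y in scores])
    return weights
```

In an AnyBoost-style loop, these weights would be recomputed after each weak classifier is added; windows in positive bags get positive weights and windows in negative bags get negative ones.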

808 citations

Proceedings Article•DOI•

Hidden conditional random fields for phone classification.

Asela Gunawardana¹, Milind Mahajan¹, Alex Acero¹, John Platt¹•Institutions (1)

Microsoft¹

04 Sep 2005

TL;DR: This paper presents results on the TIMIT phone classification task and shows that HCRFs outperform comparable ML and CML/MMI trained HMMs and can handle complex features without any change in training procedure.

Abstract: In this paper, we show the novel application of hidden conditional random fields (HCRFs) – conditional random fields with hidden state sequences – for modeling speech. Hidden state sequences are critical for modeling the non-stationarity of speech signals. We show that HCRFs can easily be trained using the simple direct optimization technique of stochastic gradient descent. We present the results on the TIMIT phone classification task and show that HCRFs outperform comparable ML and CML/MMI trained HMMs. In fact, HCRF results on this task are the best single classifier results known to us. We note that the HCRF framework is easily extensible to recognition since it is a state and label sequence modeling technique. We also note that HCRFs have the ability to handle complex features without any change in training procedure.

352 citations

Patent•

Auto playlist generator

John Platt¹, Christopher J. C. Burges¹, Alice Zheng¹, Christopher B. Weare¹, Steven E. Swenson¹•Institutions (1)

Microsoft¹

11 Mar 2005

TL;DR: A system and method for generating a list are described; the system includes a seed item input subsystem, an item identifying subsystem, a descriptive metadata similarity determining subsystem and a list generating subsystem that builds a list based, at least in part, on similarity processing performed on seed item descriptive metadata and user item descriptive metadata.

Abstract: A system and method for generating a list is provided. The system includes a seed item input subsystem, an item identifying subsystem, a descriptive metadata similarity determining subsystem and a list generating subsystem that builds a list based, at least in part, on similarity processing performed on seed item descriptive metadata and user item descriptive metadata and user selected thresholds applied to such similarity processing. The method includes inexact matching between identifying metadata associated with new user items and identifying metadata stored in a reference metadata database. The method further includes subjecting candidate user items to similarity processing, where the degree to which the candidate user items are similar to the seed item is determined, and placing user items in a list of items based on user selected preferences for (dis)similarity between items in the list and the seed item.

280 citations

Proceedings Article•

FastMap, MetricMap, and Landmark MDS are all Nystrom Algorithms

John Platt¹•Institutions (1)

Microsoft¹

01 Jan 2005

TL;DR: Empirical experiments on the Reuters and Corel Image Features data sets show that LMDS is more accurate than FastMap and MetricMap with roughly the same computation and can become even more accurate if allowed to be slower.

Abstract: This paper unifies the mathematical foundation of three multidimensional scaling algorithms: FastMap, MetricMap, and Landmark MDS (LMDS). All three algorithms are based on the Nystrom approximation of the eigenvectors and eigenvalues of a matrix. LMDS applies the basic Nystrom approximation, while FastMap and MetricMap use generalizations of Nystrom, including deflation and using more points to establish an embedding. Empirical experiments on the Reuters and Corel Image Features data sets show that the basic Nystrom approximation outperforms these generalizations: LMDS is more accurate than FastMap and MetricMap with roughly the same computation and can become even more accurate if allowed to be slower.
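
To make the Landmark MDS idea concrete, here is a minimal NumPy sketch of the commonly described recipe: run classical MDS on the landmark-to-landmark squared distances, then place every other point with a Nystrom-style extension of the landmark eigenvectors. It is not code from the paper; the function name `landmark_mds` and its arguments are hypothetical, and it assumes the top `k` eigenvalues are strictly positive.

```python
import numpy as np

def landmark_mds(sq_dist_landmarks, sq_dist_to_landmarks, k):
    """Embed landmarks and remaining points in k dimensions.

    sq_dist_landmarks:    (n, n) squared distances among the n landmarks
    sq_dist_to_landmarks: (m, n) squared distances from m other points to the landmarks
    """
    n = sq_dist_landmarks.shape[0]
    # Classical MDS on the landmarks: double-center the squared distances.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ sq_dist_landmarks @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:k]
    lam, vecs = eigvals[top], eigvecs[:, top]
    landmark_coords = vecs * np.sqrt(lam)

    # Nystrom-style extension: project each point's centered squared
    # distances onto the landmark eigenvectors, scaled by 1/sqrt(eigenvalue).
    mean_sq = sq_dist_landmarks.mean(axis=1)
    projector = vecs / np.sqrt(lam)  # shape (n, k)
    other_coords = -0.5 * (sq_dist_to_landmarks - mean_sq) @ projector
    return landmark_coords, other_coords
```

Only the n-by-n landmark matrix is eigendecomposed, so the cost of embedding the remaining m points grows linearly in m; using more landmarks trades speed for accuracy.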

192 citations

Patent•

Automatic organization of documents through email clustering

Arungunram C. Surendran¹, Erin L. Renshaw¹, John Platt¹•Institutions (1)

Microsoft¹

29 Dec 2005

TL;DR: A system that facilitates organization of emails comprises a clustering component that clusters a plurality of emails and creates topics for emails by assigning key phrases extracted from emails within one or more clusters.

Abstract: A system that facilitates organization of emails comprises a clustering component that clusters a plurality of emails and creates topics for emails by assigning key phrases extracted from emails within one or more clusters. An organization component then utilizes the key phrases to organize documents. Furthermore, the organization component can comprise a probability component that determines a probability that a document belongs to a certain topic.

135 citations

Patent•

Client-based generation of music playlists via clustering of music similarity vectors

Erin L. Renshaw¹, John Platt¹•Institutions (1)

Microsoft¹

27 Jan 2005

TL;DR: The Music Mapper automatically constructs a set of coordinate vectors for use in inferring similarity between various pieces of music: given a music similarity graph expressed as links between various artists, albums, songs, etc., it applies a recursive embedding process to embed each of the graph's music entries into a multi-dimensional space.

Abstract: A “Music Mapper” automatically constructs a set of coordinate vectors for use in inferring similarity between various pieces of music. In particular, given a music similarity graph expressed as links between various artists, albums, songs, etc., the Music Mapper applies a recursive embedding process to embed each of the graph's music entries into a multi-dimensional space. This recursive embedding process also embeds new music items added to the music similarity graph without reembedding existing entries so long as a convergent embedding solution is achieved. Given this embedding, coordinate vectors are then computed for each of the embedded musical items. The similarity between any two musical items is then determined as a function of the distance between the two corresponding vectors. In various embodiments, this similarity is then used in constructing music playlists given one or more random or user selected seed songs or in a statistical music clustering process.

103 citations

Patent•

System and method for speeding up database lookups for multiple synchronized data streams

Chris J.C. Burges¹, John Platt¹•Institutions (1)

Microsoft¹

15 Sep 2005

TL;DR: A searchable dynamic trace cache limits the database queries needed to identify client-generated traces of media objects, such as songs, commercials, jingles, and station identifiers, embedded in streaming media.

Abstract: A "Media Identifier" operates on concurrent media streams to provide large numbers of clients with real-time server-side identification of media objects embedded in streaming media, such as radio, television, or Internet broadcasts. Such media objects may include songs, commercials, jingles, station identifiers, etc. Identification of the media objects is provided to clients by comparing client-generated traces computed from media stream samples to a large database of stored, pre-computed traces (i.e., "fingerprints") of known identification. Further, given a finite number of media streams and a much larger number of clients, many of the traces sent to the server are likely to be almost identical. Therefore, a searchable dynamic trace cache is used to limit the database queries necessary to identify particular traces. This trace cache caches only one copy of recent traces along with the database search results, either positive or negative. Cache entries are then removed as they age.
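
The caching behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `TraceCache`, the Euclidean distance test for "almost identical" traces, the linear scan standing in for a searchable structure, and the caller-supplied clock are all hypothetical choices, not details from the patent.

```python
class TraceCache:
    """Keeps one copy of each recent trace together with its lookup result."""

    def __init__(self, lookup_in_database, max_age, tolerance):
        self.lookup_in_database = lookup_in_database  # slow path: full database search
        self.max_age = max_age      # how long an entry may stay cached
        self.tolerance = tolerance  # distance below which two traces count as the same
        self.entries = []           # list of (trace, result, time_added)

    @staticmethod
    def _distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    def identify(self, trace, now):
        # Remove entries that have aged out.
        self.entries = [e for e in self.entries if now - e[2] <= self.max_age]
        # Reuse a cached result, positive or negative, for a near-identical trace.
        for cached_trace, result, _ in self.entries:
            if self._distance(trace, cached_trace) <= self.tolerance:
                return result
        # Cache miss: query the database and remember the answer (None = no match).
        result = self.lookup_in_database(trace)
        self.entries.append((tuple(trace), result, now))
        return result
```

Caching negative results matters here because a stream that matches nothing in the database would otherwise trigger a full search for every client listening to it.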

75 citations

Patent•

Systems and methods for generating audio thumbnails

Christopher J. C. Burges¹, John Platt¹, Daniel Plastina, Erin L. Renshaw, Henrique S. Malvar•Institutions (1)

Microsoft¹

10 Feb 2005

TL;DR: A system and methodology facilitate automatic generation of mnemonic audio portions or segments, referred to as audio thumbnails, which can then be employed to facilitate browsing or searching audio files in order to mitigate listening to longer portions or segments of such files.

Abstract: The present invention relates to a system and methodology to facilitate automatic generation of mnemonic audio portions or segments referred to as audio thumbnails. A system is provided for summarizing audio information. The system includes an analysis component to determine common features in an audio file and a mnemonic detector to extract fingerprint portions of the audio file based in part on the common features in order to generate a thumbnail of the audio file. The generated thumbnails can then be employed to facilitate browsing or searching audio files in order to mitigate listening to longer portions or segments of such files.

74 citations

Proceedings Article•DOI•

Using audio fingerprinting for duplicate detection and thumbnail generation

Christopher J. C. Burges¹, Daniel Plastina¹, John Platt¹, Erin L. Renshaw¹, Henrique S. Malvar¹•Institutions (1)

Microsoft¹

18 Mar 2005

TL;DR: Two new applications of audio fingerprinting are presented: duplicate detection, whose goal is to identify duplicate audio clips in a set, even if they differ in compression quality or duration, and thumbnail generation, which aims to provide a representative short clip of a music track.

Abstract: Audio fingerprinting is a powerful tool for identifying file-based or streaming audio, using a database of fingerprints. The paper presents two new applications of audio fingerprinting: duplicate detection, whose goal is to identify duplicate audio clips in a set, even if they differ in compression quality or duration, and thumbnail generation, which aims to provide a representative short clip of a music track. Neither application requires an external database of fingerprints. Thanks to the robustness of the fingerprinting engine, both applications perform well; the duplicate detector has a false positive rate that is conservatively bounded above by 1% on a very large data set, and the thumbnail generator significantly outperforms using a fixed window.

62 citations

Patent•

Leveraging unlabeled data with a probabilistic graphical model

Christopher J. C. Burges¹, John Platt¹•Institutions (1)

Microsoft¹

30 Jun 2005

TL;DR: A general probabilistic formulation referred to as "Conditional Harmonic Mixing" is provided, in which links between classification nodes are directed, a conditional probability matrix is associated with each link, and the numbers of classes can vary from node to node.

Abstract: A general probabilistic formulation referred to as ‘Conditional Harmonic Mixing’ is provided, in which links between classification nodes are directed, a conditional probability matrix is associated with each link, and where the numbers of classes can vary from node to node. A posterior class probability at each node is updated by minimizing a divergence between its distribution and that predicted by its neighbors. For arbitrary graphs, as long as each unlabeled point is reachable from at least one training point, a solution generally always exists, is unique, and can be found by solving a sparse linear system iteratively. In one aspect, an automated data classification system is provided. The system includes a data set having at least one labeled category node in the data set. A semi-supervised learning component employs directed arcs to determine the label of at least one other unlabeled category node in the data set.
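
A small Python sketch can illustrate the update described above. It assumes the divergence at each node is a sum of KL divergences from each neighbor's prediction to the node's own distribution, in which case the minimizer is simply the average of the neighbors' predictions; the abstract does not name the divergence, so this is an assumption. The function name, argument layout, and fixed number of sweeps are also hypothetical.

```python
def conditional_harmonic_mixing(num_classes, labeled, links, sweeps=100):
    """Iteratively update class posteriors on a directed graph.

    num_classes: dict node -> number of classes at that node
    labeled:     dict node -> fixed class distribution (list of floats)
    links:       list of (src, dst, matrix), with
                 matrix[a][b] = P(dst has class a | src has class b)
    """
    # Labeled nodes stay fixed; unlabeled nodes start uniform.
    posterior = {
        node: list(labeled[node]) if node in labeled else [1.0 / c] * c
        for node, c in num_classes.items()
    }
    incoming = {node: [] for node in num_classes}
    for src, dst, matrix in links:
        incoming[dst].append((src, matrix))

    for _ in range(sweeps):
        for node, c in num_classes.items():
            if node in labeled or not incoming[node]:
                continue
            mixed = [0.0] * c
            for src, matrix in incoming[node]:
                for a in range(c):
                    # Prediction of this neighbor for class a at `node`.
                    mixed[a] += sum(
                        matrix[a][b] * posterior[src][b]
                        for b in range(num_classes[src])
                    )
            # Average of neighbor predictions (assumed divergence minimizer).
            posterior[node] = [v / len(incoming[node]) for v in mixed]
    return posterior
```

Because each matrix is rectangular (destination classes by source classes), nodes with different numbers of classes can be linked directly, and the repeated sweeps amount to an iterative solve of the underlying sparse linear system.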

51 citations

Patent•

Multi-channel echo cancellation with round robin regularization

Jack W. Stokes¹, John Platt¹•Institutions (1)

Microsoft¹

27 Jun 2005

TL;DR: A method and system of multi-channel echo cancellation using round robin regularization is described, which includes applying a plurality of adaptive filters, each having an inverse correlation matrix, to the multi-channel playback signal.

Abstract: A method and system of multi-channel echo cancellation using round robin regularization. The multi-channel round robin regularization echo cancellation method includes applying a plurality of adaptive filters, each having an inverse correlation matrix, to the multi-channel playback signal. Each of the plurality of adaptive filters is selected in a round robin sequence, so that every round each of the filters is selected. The inverse correlation matrix associated with each selected adaptive filter then is regularized as needed. The regularized adaptive filter then is used to remove the echo of the multi-channel playback signal from a captured signal. Regularization is implemented in a round robin manner to ensure that each subband is selected so that the adaptive filter for that subband can be examined. Other features of the multi-channel echo cancellation system and method include dynamic switching between monaural and multi-channel echo cancellation and mixed processing for lower and upper subbands.

Patent•

Client-based generation of music playlists from a server-provided subset of music similarity vectors

John Platt¹, Erin L. Renshaw¹•Institutions (1)

Microsoft¹

27 Jan 2005

TL;DR: The Music Mapper automatically constructs a set of coordinate vectors for use in inferring similarity between various pieces of music; this similarity is then used in constructing music playlists given one or more random or user selected seed songs or in a statistical music clustering process.

Abstract: A “Music Mapper” automatically constructs a set of coordinate vectors for use in inferring similarity between various pieces of music. In particular, given a music similarity graph expressed as links between various artists, albums, songs, etc., the Music Mapper applies a recursive embedding process to embed each of the graph's music entries into a multi-dimensional space. This recursive embedding process also embeds new music items added to the music similarity graph without reembedding existing entries so long as a convergent embedding solution is achieved. Given this embedding, coordinate vectors are then computed for each of the embedded musical items. The similarity between any two musical items is then determined as a function of the distance between the two corresponding vectors. In various embodiments, this similarity is then used in constructing music playlists given one or more random or user selected seed songs or in a statistical music clustering process.

Proceedings Article•

Automatic Discovery of Personal Topics to Organize Email.

Arun C. Surendran, John Platt, Erin L. Renshaw

01 Jul 2005

TL;DR: This paper presents a procedure to automatically discover a user's personal topics by clustering their emails, labels the topics with appropriate keywords, and demonstrates these keywords by creating an email/document browser which makes use of them as standing queries to create virtual folders that help organize, index and retrieve email efficiently.

Abstract: We present in this paper a procedure to automatically discover a user's personal topics by clustering their emails. Unlike previous work, we automatically label topics using appropriate keywords. We show that, in order to get appropriate keywords, we must apply strong filters that use domain knowledge about e-mail and the workplace of the user. We demonstrate these keywords by creating an email/document browser which makes use of these keywords as standing queries to create virtual folders that help organize, index and retrieve email efficiently. We present subjective user studies to show the usefulness of the strong filtering.

Patent•

System and process for regression-based residual acoustic echo suppression

Amit Singh Chhetri¹, Arungunram C. Surendran¹, Jack W. Stokes¹, John Platt¹•Institutions (1)

Microsoft¹

31 Mar 2005

TL;DR: A regression-based residual echo suppression (RES) system and process suppresses the portion of the microphone signal corresponding to a playback of a speaker audio signal that was not suppressed by an acoustic echo canceller (AEC).

Abstract: A regression-based residual echo suppression (RES) system and process for suppressing the portion of the microphone signal corresponding to a playback of a speaker audio signal that was not suppressed by an acoustic echo canceller (AEC). In general, a prescribed regression technique is used between a prescribed spectral attribute of multiple past and present, fixed-length, periods (e.g., frames) of the speaker signal and the same spectral attribute of a current period (e.g., frame) of the echo residual in the output of the AEC. This automatically takes into consideration the correlation between the time periods of the speaker signal. The parameters of the regression can be easily tracked using adaptive methods. Multiple applications of RES can be used to produce better results and this system and process can be applied to stereo-RES as well.

Journal Article•

Extensions of the informative vector machine

Neil D. Lawrence, John Platt, Michael I. Jordan

01 Jan 2005-Lecture Notes in Computer Science

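TL;DR: The informative vector machine (IVM) is extended in three ways: a novel noise model that allows it to be applied to a mixture of labeled and unlabeled data, use on a block-diagonal covariance matrix for learning to learn from related tasks, and incorporation of prior knowledge from known invariances.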

Abstract: The informative vector machine (IVM) is a practical method for Gaussian process regression and classification. The IVM produces a sparse approximation to a Gaussian process by combining assumed density filtering with a heuristic for choosing points based on minimizing posterior entropy. This paper extends IVM in several ways. First, we propose a novel noise model that allows the IVM to be applied to a mixture of labeled and unlabeled data. Second, we use IVM on a block-diagonal covariance matrix, for learning to learn from related tasks. Third, we modify the IVM to incorporate prior knowledge from known invariances. All of these extensions are tested on artificial and real data.

Journal Article•

Redundant bit vectors for quickly searching high-dimensional regions

Jonathan Goldstein, John Platt, Christopher J. C. Burges

01 Jan 2005-Lecture Notes in Computer Science

TL;DR: Redundant Bit Vectors (RBVs) are proposed: a novel method for quickly solving high-dimensional search problems such as audio fingerprinting by approximating the high-dimensional regions/distributions as tightened hyperrectangles, partitioning the query space to store each item redundantly in an index, and using bit vectors to store and search the index efficiently.

Abstract: Applications such as audio fingerprinting require search in high dimensions: find an item in a database that is similar to a query. An important property of this search task is that negative answers are very frequent: much of the time, a query does not correspond to any database item. We propose Redundant Bit Vectors (RBVs): a novel method for quickly solving this search problem. RBVs rely on three key ideas: 1) approximate the high-dimensional regions/distributions as tightened hyperrectangles, 2) partition the query space to store each item redundantly in an index and 3) use bit vectors to store and search the index efficiently. We show that our method is the preferred method for very large databases or when the queries are often not in the database. Our method is 109 times faster than linear scan, and 48 times faster than locality-sensitive hashing on a data set of 239369 audio fingerprints.
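
The three ideas listed above can be illustrated with a toy Python index. This is a sketch under assumptions, not the paper's implementation: the hyperrectangles are taken as given (the tightening step is not shown), the bin boundaries are hand-picked, Python integers stand in for bit vectors, and the class name `RedundantBitVectorIndex` is hypothetical.

```python
import bisect

class RedundantBitVectorIndex:
    def __init__(self, boxes, bin_edges):
        """
        boxes:     list of hyperrectangles, each a list of (low, high) per dimension
        bin_edges: per dimension, a sorted list of boundaries that partition the axis
        """
        self.boxes = boxes
        self.bin_edges = bin_edges
        self.vectors = []  # vectors[d][b] = bitmask of boxes overlapping bin b in dim d
        for d, edges in enumerate(bin_edges):
            per_bin = [0] * (len(edges) + 1)
            for item, box in enumerate(boxes):
                low, high = box[d]
                # Store the item redundantly in every bin its interval touches.
                for b in range(self._bin(d, low), self._bin(d, high) + 1):
                    per_bin[b] |= 1 << item
            self.vectors.append(per_bin)

    def _bin(self, d, value):
        return bisect.bisect_right(self.bin_edges[d], value)

    def query(self, point):
        """Return indices of all boxes that contain the query point."""
        candidates = (1 << len(self.boxes)) - 1
        for d, value in enumerate(point):
            candidates &= self.vectors[d][self._bin(d, value)]
            if candidates == 0:
                return []  # fast negative answer: no box can contain the point
        # Bins are coarse, so verify the few surviving candidates exactly.
        return [
            i for i, box in enumerate(self.boxes)
            if (candidates >> i) & 1
            and all(lo <= v <= hi for (lo, hi), v in zip(box, point))
        ]
```

The early exit when the running AND reaches zero is what makes frequent negative answers cheap: most queries that match nothing are rejected after a handful of bitwise operations.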

Patent•

Game-powered search engine

Luis von Ahn Arellano¹, Eric D. Brill¹, John Platt¹, Josh Benaloh¹•Institutions (1)

Microsoft¹

24 Jan 2005

TL;DR: A system and method facilitate an interactive game-powered search engine that serves the purposes of both users who may be looking for information and game participants who may desire to earn some reward or level of enjoyment by playing the game.

Abstract: The subject invention provides a unique system and method that facilitates an interactive game-powered search engine that serves the purposes of both users who may be looking for information as well as game participants who may desire to earn some reward or level of enjoyment by playing the game. More specifically, the system and method provides feedback to a user based on the user's input string or a string derived therefrom. The feedback can be a response or answer to the user's input in the form of text, an image, audio or sound, video, and/or a URL that is provided by one or more game participants when there is some degree of consistency or agreement between the responses or when individual players have demonstrated good reliability in their responses.

Patent•

Metadata generation for rich media

John Platt¹, M. Robinson¹•Institutions (1)

Microsoft¹

28 Nov 2005

TL;DR: Text that is relevant to rich media content is extracted from the document or workflow that includes it, filtered into keyphrases, and added to a metadata file associated with the content.

Abstract: Metadata is generated for rich media content from a document or workflow that is associated with the rich media content. When rich media content is included in a document or workflow, text is extracted from the document or workflow that is relevant to the rich media content. The text is filtered into keyphrases and added to a metadata file associated with the rich media content.

Patent•

Updating hidden conditional random field model parameters after processing individual training samples

Milind Mahajan¹, Alejandro Acero¹, Asela Gunawardana¹, John Platt¹•Institutions (1)

Microsoft¹

22 Sep 2005

TL;DR: A method and apparatus are provided for training parameters in a hidden conditional random field model for use in speech recognition and phonetic classification; the parameters are updated after processing of individual training samples.

Abstract: A method and apparatus are provided for training parameters in a hidden conditional random field model for use in speech recognition and phonetic classification. The hidden conditional random field model uses parameterized features that are determined from a segment of speech, and those values are used to identify a phonetic unit for the segment of speech. The parameters are updated after processing of individual training samples.

Patent•

Multi-input channel and multi-output channel echo cancellation

Jack W. Stokes¹, John Platt¹•Institutions (1)

Microsoft¹

10 Jun 2005

TL;DR: An echo cancellation technique is presented that can process multi-input microphone signals with only a small increase in the overall CPU consumption compared to implementing the algorithm for a single channel microphone signal.

Abstract: An echo cancellation technique that can process multi-input microphone signals with only a small increase in the overall CPU consumption compared to implementing the algorithm for a single channel microphone signal. Furthermore, the invention provides an architecture that provides for echo cancellation for multiple applications in parallel with only a small increase in CPU consumption compared to a single instance of echo cancellation with a single microphone input and multi-output channel playback.

Proceedings Article•DOI•

Learning spatially-variable filters for super-resolution of text

Adrian Corduneanu¹, John Platt•Institutions (1)

Massachusetts Institute of Technology¹

14 Nov 2005

TL;DR: A novel algorithm for super-resolution of text magnifies images in real-time by interpolation with a variable linear filter determined nonlinearly from the neighborhood to which it is applied.

Abstract: Images magnified by standard methods display a degradation of detail that is particularly noticeable in the blurry edges of text. Current super-resolution algorithms address the lack of sharpness by filling in the image with probable details. These algorithms break the outlines of text. Our novel algorithm for super-resolution of text magnifies images in real-time by interpolation with a variable linear filter. The coefficients of the filter are determined nonlinearly from the neighborhood to which it is applied. We train the mapping that defines the coefficients to specifically enhance edges of text, producing a conservative algorithm that infers the detail of magnified text. Possible applications include resizing web page layouts or other interfaces, and enhancing low resolution camera captures of text. In general, learning spatially-variable filters is applicable to other image filtering tasks.

Patent•

Alpha correction to compensate for lack of gamma correction

John Platt¹, Mikhail M. Lyapunov¹•Institutions (1)

Microsoft¹

09 Mar 2005

TL;DR: The blending coefficients (alpha values) of font glyphs undergo alpha correction to compensate for a lack of gamma correction in text rendering processes; the alpha correction can be performed by a GPU that is not configured to perform gamma correction.

Abstract: The blending coefficients (alpha values) of font glyphs undergo alpha correction to compensate for a lack of gamma correction in text rendering processes. The alpha correction includes selecting a set of correction coefficients that correspond to the predetermined gamma value of the display device and computing corrected alpha values from the known alpha values, the foreground colors, and set of correction coefficients. The corrected alpha values can then be used to blend the foreground and background colors of the corresponding display pixels without requiring gamma correction. Accordingly, the alpha correction can be performed by a GPU, which is not configured to perform gamma correction, thereby increasing the speed at which text rendering can occur.

Patent•

Image processing using saltating samples

Hugues Hoppe¹, John Platt¹, Sylvain Lefebvre¹•Institutions (1)

Microsoft¹

28 Jul 2005

TL;DR: A saltating sample image enhancement system and method provides an image processing operation in which a filter considers one or more exact source image pixels, one or more bilinearly interpolated source image samples, and (optionally) one or more linearly interpolated source image samples, where the bilinear and linear weights are coupled to the position of the target pixel relative to the source pixels.

Abstract: A saltating sample image enhancement system and method that provides an image processing operation in which a filter considers one or more exact source image pixels; one or more bilinearly interpolated source image samples, where the bilinear weights are coupled to the position of the target pixel relative to the source pixels; and (optionally) one or more linearly interpolated source image samples, where the linear weights are coupled to the position of the target pixel relative to the source pixels. The filter can construct a spatially continuous image statistic.
