Posted Content

Bond Default Prediction with Text Embeddings, Undersampling and Deep Learning.

TL;DR: This article used a combination of text embeddings from a pre-trained transformer network, a fully connected neural network, and synthetic oversampling to predict 9 out of 10 defaults for municipal bonds, at a cost of false positives on less than 0.1% of non-defaulting bonds.
Abstract: The special and important problems of default prediction for municipal bonds are addressed using a combination of text embeddings from a pre-trained transformer network, a fully connected neural network, and synthetic oversampling. The combination of these techniques provides significant improvement in performance over human estimates, linear models, and boosted ensemble models, on data with extreme imbalance. Less than 0.2% of municipal bonds default, but our technique predicts 9 out of 10 defaults at the time of issue, without using bond ratings, at a cost of false positives on less than 0.1% of non-defaulting bonds. The results hold the promise of reducing the cost of capital for local public goods, which are vital for society, and bring techniques previously used in personal credit and public equities (or national fixed income), as well as the current generation of embedding techniques, to sub-sovereign credit decisions.
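The abstract describes the pipeline only at a high level. The sketch below is a minimal, hypothetical reconstruction of that recipe (transformer text embeddings, synthetic oversampling of the rare default class, then a fully connected network), using a public SBERT model, imbalanced-learn's SMOTE, and scikit-learn's MLPClassifier; the model choice, placeholder texts, and class counts are assumptions for illustration, not the authors' data or code.

```python
# Sketch of the recipe: embeddings -> synthetic oversampling -> dense network.
# All text below is synthetic placeholder data, not real bond disclosures.
from sentence_transformers import SentenceTransformer
from imblearn.over_sampling import SMOTE
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical offering-document snippets; real inputs would be issuer disclosures.
texts = [f"General obligation bond for school district {i}, stable tax base." for i in range(200)]
labels = [0] * 200  # non-defaulting (majority class)
texts += [f"Revenue bond for speculative project {i}, single revenue source." for i in range(8)]
labels += [1] * 8   # defaulting (rare class)

# 1) Text embeddings from a pre-trained transformer.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
X = encoder.encode(texts)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, labels, test_size=0.3, stratify=labels, random_state=0)

# 2) Synthetic oversampling of the rare default class (training split only).
X_bal, y_bal = SMOTE(k_neighbors=2, random_state=0).fit_resample(X_tr, y_tr)

# 3) Fully connected neural network on the balanced embeddings.
clf = MLPClassifier(hidden_layer_sizes=(256, 64), max_iter=300, random_state=0)
clf.fit(X_bal, y_bal)
print(classification_report(y_te, clf.predict(X_te)))
```

Note that the oversampling is applied only to the training split, so the evaluation reflects the true class imbalance rather than the synthetic one.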
References
Journal ArticleDOI
TL;DR: In this article, a method of over-sampling the minority class by creating synthetic minority class examples is proposed; it is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
Abstract: An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples. It is also the case that the cost of misclassifying an abnormal (interesting) example as a normal example is often much higher than the cost of the reverse error. Under-sampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This paper shows that a combination of our method of over-sampling the minority (abnormal) class and under-sampling the majority (normal) class can achieve better classifier performance (in ROC space) than only under-sampling the majority class. This paper also shows that a combination of our method of over-sampling the minority class and under-sampling the majority class can achieve better classifier performance (in ROC space) than varying the loss ratios in Ripper or class priors in Naive Bayes. Our method of over-sampling the minority class involves creating synthetic minority class examples. Experiments are performed using C4.5, Ripper and a Naive Bayes classifier. The method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
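As a concrete illustration of the over-/under-sampling combination described above, the following sketch uses the imbalanced-learn implementation of SMOTE on a synthetic dataset; the dataset and sampling ratios are arbitrary choices for demonstration.

```python
# SMOTE interpolates between a minority example and one of its k nearest
# minority neighbours to create new synthetic minority points.
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler

X, y = make_classification(n_samples=5000, weights=[0.99, 0.01],
                           n_informative=5, random_state=0)
print("original:", Counter(y))

# Over-sample the minority class up to parity with the majority class.
X_sm, y_sm = SMOTE(random_state=0).fit_resample(X, y)
print("after SMOTE:", Counter(y_sm))

# Combination discussed in the paper: under-sample the majority class,
# then synthetically over-sample the minority class.
X_c, y_c = RandomUnderSampler(sampling_strategy=0.5, random_state=0).fit_resample(X, y)
X_c, y_c = SMOTE(random_state=0).fit_resample(X_c, y_c)
print("under-sample then SMOTE:", Counter(y_c))
```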

17,313 citations

Proceedings Article
04 Dec 2017
TL;DR: In this article, a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations), is presented, which assigns each feature an importance value for a particular prediction.
Abstract: Understanding why a model makes a certain prediction can be as crucial as the prediction's accuracy in many applications. However, the highest accuracy for large modern datasets is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, various methods have recently been proposed to help users interpret the predictions of complex models, but it is often unclear how these methods are related and when one method is preferable over another. To address this problem, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties. The new class unifies six existing methods, notable because several recent methods in the class lack the proposed desirable properties. Based on insights from this unification, we present new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
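A minimal sketch of how SHAP attributions are typically obtained in practice, using the shap package's TreeExplainer on a gradient-boosted model trained on synthetic data; the model and data here are placeholders, not the bond-default model.

```python
# Minimal sketch: SHAP attributions for a tree ensemble on synthetic data.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Each row gives per-feature contributions to that single prediction (in the
# model's raw log-odds space), relative to the explainer's expected value.
print("contributions for first example:", shap_values[0])
print("expected value:", explainer.expected_value)
```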

7,309 citations

Proceedings ArticleDOI
14 Aug 2019
TL;DR: Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity, is presented.
Abstract: BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019) have set a new state-of-the-art performance on sentence-pair regression tasks like semantic textual similarity (STS). However, they require that both sentences are fed into the network, which causes a massive computational overhead: finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT. The construction of BERT makes it unsuitable for semantic similarity search as well as for unsupervised tasks like clustering. In this publication, we present Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity. This reduces the effort for finding the most similar pair from 65 hours with BERT / RoBERTa to about 5 seconds with SBERT, while maintaining the accuracy from BERT. We evaluate SBERT and SRoBERTa on common STS tasks and transfer learning tasks, where they outperform other state-of-the-art sentence embedding methods.
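A short sketch of the SBERT usage pattern described above, assuming the sentence-transformers package and the public all-MiniLM-L6-v2 checkpoint: sentences are encoded independently and compared by cosine similarity.

```python
# Encode sentences once, then compare embeddings with cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # a small public SBERT-style model
sentences = [
    "The city issued general obligation bonds to fund a new school.",
    "Municipal debt was sold to finance school construction.",
    "The weather was unusually warm this spring.",
]
embeddings = model.encode(sentences, convert_to_tensor=True)

# Pairwise cosine similarities; semantically similar sentences score higher.
print(util.cos_sim(embeddings, embeddings))
```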

4,020 citations

Journal ArticleDOI
TL;DR: This article used a large sample of individual investor records over a nine-year period to analyze survival rates, the disposition effect, and trading performance at the individual level to determine whether and how investors learn from their trading experience.
Abstract: Using a large sample of individual investor records over a nine-year period, we analyze survival rates, the disposition effect, and trading performance at the individual level to determine whether and how investors learn from their trading experience. We find evidence of two types of learning: some investors become better at trading with experience, while others stop trading after realizing that their ability is poor. A substantial part of overall learning by trading is explained by the second type. By ignoring investor attrition, the existing literature significantly overestimates how quickly investors become better at trading. (JEL D10, G10)

320 citations

Journal ArticleDOI
TL;DR: An investigation of the effect of sampling methods on the performance of quantitative bankruptcy prediction models on real, highly imbalanced datasets suggests that the proper sampling method for developing prediction models depends mainly on the number of bankruptcies in the training sample set.
Abstract: Corporate bankruptcy prediction is very important for creditors and investors. Most of the literature improves the performance of prediction models by developing and optimizing the quantitative methods. This paper investigates the effect of sampling methods on the performance of quantitative bankruptcy prediction models on real, highly imbalanced datasets. Seven sampling methods and five quantitative models are tested on two real, highly imbalanced datasets. A comparison of model performance on a random paired sample set and a real imbalanced sample set is also conducted. The experimental results suggest that the proper sampling method for developing prediction models depends mainly on the number of bankruptcies in the training sample set.
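The kind of comparison this reference describes can be sketched as follows, using imbalanced-learn pipelines so that sampling is applied only inside each training fold; the dataset, classifier, and samplers are illustrative assumptions rather than the paper's setup.

```python
# Compare the same classifier under different sampling strategies on an
# imbalanced synthetic dataset, scored by cross-validated AUC.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler

X, y = make_classification(n_samples=5000, weights=[0.98, 0.02], random_state=0)

samplers = {
    "no sampling": None,
    "random under-sampling": RandomUnderSampler(random_state=0),
    "SMOTE over-sampling": SMOTE(random_state=0),
}
for name, sampler in samplers.items():
    steps = ([("sampler", sampler)] if sampler else []) + \
            [("clf", LogisticRegression(max_iter=1000))]
    pipe = Pipeline(steps)  # imblearn Pipeline resamples only the training folds
    auc = cross_val_score(pipe, X, y, scoring="roc_auc", cv=5).mean()
    print(f"{name:>22s}: AUC = {auc:.3f}")
```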

161 citations