Posted Content

Bond Default Prediction with Text Embeddings, Undersampling and Deep Learning.

TL;DR: This article used a combination of text embeddings from a pre-trained transformer network, a fully connected neural network, and synthetic oversampling to predict 9 out of 10 defaults for municipal bonds, at a cost of false positives on less than 0.1% of non-defaulting bonds.
Abstract: The special and important problems of default prediction for municipal bonds are addressed using a combination of text embeddings from a pre-trained transformer network, a fully connected neural network, and synthetic oversampling. The combination of these techniques provides significant improvement in performance over human estimates, linear models, and boosted ensemble models, on data with extreme imbalance. Less than 0.2% of municipal bonds default, but our technique predicts 9 out of 10 defaults at the time of issue, without using bond ratings, at a cost of false positives on less than 0.1% of non-defaulting bonds. The results hold the promise of reducing the cost of capital for local public goods, which are vital for society, and bring techniques previously used in personal credit and public equities (or national fixed income), as well as the current generation of embedding techniques, to sub-sovereign credit decisions.
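The abstract describes the pipeline only at a high level. The sketch below is a minimal, hypothetical reconstruction of that recipe (transformer text embeddings, synthetic oversampling of the rare default class, then a fully connected network), using a public SBERT model, imbalanced-learn's SMOTE, and scikit-learn's MLPClassifier; the model choice, placeholder texts, and class counts are assumptions for illustration, not the authors' data or code.

```python
# Sketch of the recipe: embeddings -> synthetic oversampling -> dense network.
# All text below is synthetic placeholder data, not real bond disclosures.
from sentence_transformers import SentenceTransformer
from imblearn.over_sampling import SMOTE
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical offering-document snippets; real inputs would be issuer disclosures.
texts = [f"General obligation bond for school district {i}, stable tax base." for i in range(200)]
labels = [0] * 200  # non-defaulting (majority class)
texts += [f"Revenue bond for speculative project {i}, single revenue source." for i in range(8)]
labels += [1] * 8   # defaulting (rare class)

# 1) Text embeddings from a pre-trained transformer.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
X = encoder.encode(texts)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, labels, test_size=0.3, stratify=labels, random_state=0)

# 2) Synthetic oversampling of the rare default class (training split only).
X_bal, y_bal = SMOTE(k_neighbors=2, random_state=0).fit_resample(X_tr, y_tr)

# 3) Fully connected neural network on the balanced embeddings.
clf = MLPClassifier(hidden_layer_sizes=(256, 64), max_iter=300, random_state=0)
clf.fit(X_bal, y_bal)
print(classification_report(y_te, clf.predict(X_te)))
```

Note that the oversampling is applied only to the training split, so the evaluation reflects the true class imbalance rather than the synthetic one.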
References
Journal ArticleDOI
TL;DR: In this article, a method of over-sampling the minority class by creating synthetic minority class examples is proposed; it is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
Abstract: An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples. It is also the case that the cost of misclassifying an abnormal (interesting) example as a normal example is often much higher than the cost of the reverse error. Under-sampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This paper shows that a combination of our method of over-sampling the minority (abnormal) class and under-sampling the majority (normal) class can achieve better classifier performance (in ROC space) than only under-sampling the majority class. This paper also shows that a combination of our method of over-sampling the minority class and under-sampling the majority class can achieve better classifier performance (in ROC space) than varying the loss ratios in Ripper or class priors in Naive Bayes. Our method of over-sampling the minority class involves creating synthetic minority class examples. Experiments are performed using C4.5, Ripper and a Naive Bayes classifier. The method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
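As a concrete illustration of the over-/under-sampling combination described above, the following sketch uses the imbalanced-learn implementation of SMOTE on a synthetic dataset; the dataset and sampling ratios are arbitrary choices for demonstration.

```python
# SMOTE interpolates between a minority example and one of its k nearest
# minority neighbours to create new synthetic minority points.
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler

X, y = make_classification(n_samples=5000, weights=[0.99, 0.01],
                           n_informative=5, random_state=0)
print("original:", Counter(y))

# Over-sample the minority class up to parity with the majority class.
X_sm, y_sm = SMOTE(random_state=0).fit_resample(X, y)
print("after SMOTE:", Counter(y_sm))

# Combination discussed in the paper: under-sample the majority class,
# then synthetically over-sample the minority class.
X_c, y_c = RandomUnderSampler(sampling_strategy=0.5, random_state=0).fit_resample(X, y)
X_c, y_c = SMOTE(random_state=0).fit_resample(X_c, y_c)
print("under-sample then SMOTE:", Counter(y_c))
```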

17,313 citations

Proceedings Article
04 Dec 2017
TL;DR: In this article, a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations), is presented, which assigns each feature an importance value for a particular prediction.
Abstract: Understanding why a model makes a certain prediction can be as crucial as the prediction's accuracy in many applications. However, the highest accuracy for large modern datasets is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, various methods have recently been proposed to help users interpret the predictions of complex models, but it is often unclear how these methods are related and when one method is preferable over another. To address this problem, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties. The new class unifies six existing methods, notable because several recent methods in the class lack the proposed desirable properties. Based on insights from this unification, we present new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
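A minimal sketch of how SHAP attributions are typically obtained in practice, using the shap package's TreeExplainer on a gradient-boosted model trained on synthetic data; the model and data here are placeholders, not the bond-default model.

```python
# Minimal sketch: SHAP attributions for a tree ensemble on synthetic data.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Each row gives per-feature contributions to that single prediction (in the
# model's raw log-odds space), relative to the explainer's expected value.
print("contributions for first example:", shap_values[0])
print("expected value:", explainer.expected_value)
```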

7,309 citations

Proceedings ArticleDOI
14 Aug 2019
TL;DR: Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity, is presented.
Abstract: BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019) have set a new state-of-the-art performance on sentence-pair regression tasks like semantic textual similarity (STS). However, they require that both sentences are fed into the network, which causes a massive computational overhead: finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT. The construction of BERT makes it unsuitable for semantic similarity search as well as for unsupervised tasks like clustering. In this publication, we present Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity. This reduces the effort for finding the most similar pair from 65 hours with BERT / RoBERTa to about 5 seconds with SBERT, while maintaining the accuracy from BERT. We evaluate SBERT and SRoBERTa on common STS tasks and transfer learning tasks, where they outperform other state-of-the-art sentence embedding methods.
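A short sketch of the SBERT usage pattern described above, assuming the sentence-transformers package and the public all-MiniLM-L6-v2 checkpoint: sentences are encoded independently and compared by cosine similarity.

```python
# Encode sentences once, then compare embeddings with cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # a small public SBERT-style model
sentences = [
    "The city issued general obligation bonds to fund a new school.",
    "Municipal debt was sold to finance school construction.",
    "The weather was unusually warm this spring.",
]
embeddings = model.encode(sentences, convert_to_tensor=True)

# Pairwise cosine similarities; semantically similar sentences score higher.
print(util.cos_sim(embeddings, embeddings))
```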

4,020 citations

Journal ArticleDOI
TL;DR: This article used a large sample of individual investor records over a nine-year period to analyze survival rates, the disposition effect, and trading performance at the individual level to determine whether and how investors learn from their trading experience.
Abstract: Using a large sample of individual investor records over a nine-year period, we analyze survival rates, the disposition effect, and trading performance at the individual level to determine whether and how investors learn from their trading experience. We find evidence of two types of learning: some investors become better at trading with experience, while others stop trading after realizing that their ability is poor. A substantial part of overall learning by trading is explained by the second type. By ignoring investor attrition, the existing literature significantly overestimates how quickly investors become better at trading. (JEL D10, G10)

320 citations

Journal ArticleDOI
TL;DR: An investigation of the effect of sampling methods on the performance of quantitative bankruptcy prediction models on real, highly imbalanced datasets suggests that the proper sampling method for developing prediction models depends mainly on the number of bankruptcies in the training sample set.
Abstract: Corporate bankruptcy prediction is very important for creditors and investors. Most of the literature improves the performance of prediction models by developing and optimizing the quantitative methods. This paper investigates the effect of sampling methods on the performance of quantitative bankruptcy prediction models on real, highly imbalanced datasets. Seven sampling methods and five quantitative models are tested on two real, highly imbalanced datasets. A comparison of model performance on a random paired sample set and a real imbalanced sample set is also conducted. The experimental results suggest that the proper sampling method for developing prediction models depends mainly on the number of bankruptcies in the training sample set.
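The kind of comparison this reference describes can be sketched as follows, using imbalanced-learn pipelines so that sampling is applied only inside each training fold; the dataset, classifier, and samplers are illustrative assumptions rather than the paper's setup.

```python
# Compare the same classifier under different sampling strategies on an
# imbalanced synthetic dataset, scored by cross-validated AUC.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler

X, y = make_classification(n_samples=5000, weights=[0.98, 0.02], random_state=0)

samplers = {
    "no sampling": None,
    "random under-sampling": RandomUnderSampler(random_state=0),
    "SMOTE over-sampling": SMOTE(random_state=0),
}
for name, sampler in samplers.items():
    steps = ([("sampler", sampler)] if sampler else []) + \
            [("clf", LogisticRegression(max_iter=1000))]
    pipe = Pipeline(steps)  # imblearn Pipeline resamples only the training folds
    auc = cross_val_score(pipe, X, y, scoring="roc_auc", cv=5).mean()
    print(f"{name:>22s}: AUC = {auc:.3f}")
```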

161 citations