Abstract
Purpose
Investors often rely on probabilistic models that were learned from small historical labeled datasets. The purpose of this article is to propose a new method for data‐efficient model learning.
Design/methodology/approach
The proposed method extends the standard minimum relative entropy (MRE) approach and has a clear financial interpretation. It belongs to the class of semi‐supervised algorithms, which can learn from data that are only partially labeled with values of the variable of interest.
Findings
This study tests the method on an artificial dataset and uses it to learn a model for the recovery of defaulted debt. In both cases, the resulting models outperform the standard MRE model when the number of labeled observations is small.
Originality/value
The method can be applied to financial problems where labeled data are sparse but unlabeled data are readily available.