A nonlinear label compression and transformation method for multi-label classification using autoencoders
Bailey, James (Hrsg). Advances in knowledge discovery and data mining : 20th Pacific-Asia Conference, PAKDD 2016, Auckland, New Zealand, April 19-22, 2016, Proceedings. Pt. I. Cham: Springer 2016 S. 328 - 340
Erscheinungsjahr: 2016
ISBN/ISSN: 978-3-319-31753-3 ; 3-319-31753-9
Publikationstyp: Buchbeitrag (Konferenzbeitrag)
Sprache: Englisch
Doi/URN: 10.1007/978-3-319-31753-3_27
Geprüft | Bibliothek |
Inhaltszusammenfassung
Multi-label classification targets the prediction of multiple interdependent and non-exclusive binary target variables. Transformation-based algorithms transform the data set such that regular single-label algorithms can be applied to the problem. A special type of transformation-based classifiers are label compression methods, which compress the labels and then mostly use single label classifiers to predict the compressed labels. So far, there are no compression-based algorithms that follow ...Multi-label classification targets the prediction of multiple interdependent and non-exclusive binary target variables. Transformation-based algorithms transform the data set such that regular single-label algorithms can be applied to the problem. A special type of transformation-based classifiers are label compression methods, which compress the labels and then mostly use single label classifiers to predict the compressed labels. So far, there are no compression-based algorithms that follow a problem transformation approach and address non-linear dependencies in the labels. In this paper, we propose a new algorithm, called Maniac (Multi-lAbel classificatioN usIng AutoenCoders), which extracts the non-linear dependencies by compressing the labels using autoencoders. We adapt the training process of autoencoders in a way to make them more suitable for a parameter optimization in the context of this algorithm. The method is evaluated on eight standard multi-label data sets. Experiments show that despite not producing a good ranking, Maniac generates a particularly good bipartition of the labels into positives and negatives. This is caused by rather strong predictions with either really high or low probability. Additionally, the algorithm seems to perform better given more labels and a higher label cardinality in the data set.» weiterlesen» einklappen
Autoren
Klassifikation
DFG Fachgebiet:
Informatik
DDC Sachgruppe:
Informatik