Scalable induction of probabilistic real-time automata using maximum frequent pattern based clustering
12th SIAM International Conference on Data Mining 2012: Anaheim, California, USA, 26-28 April 2012. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2012, pp. 272-283
Year of publication: 2012
ISBN/ISSN: 978-1-61197-232-0 ; 978-1-61197-282-5
Publication type: Book chapter (conference paper)
Language: English
Doi/URN: 10.1137/1.9781611972825.24
Abstract
The paper presents a scalable method for learning probabilistic real-time automata (PRTAs), a new type of model that captures the dynamics of multi-dimensional event logs. In multi-dimensional event logs, events are described by several features instead of only one symbol. Moreover, it is not clear up front which events occur in an event log. The learning method for finding a PRTA that models such an event log is based on state merging of a prefix tree acceptor, guided by a clustering that determines the states of the automaton. To make the overall approach scalable, an online clustering method based on maximum frequent patterns (MFPs) is used. The approach is evaluated on a synthetic, a biological and a medical data set. The results show that inducing automata with MFP-based clustering yields easy-to-understand and stable automata and, most importantly, makes the approach scalable to large data sets.
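As a rough illustration of the idea summarised in the abstract, the following Python sketch builds a prefix tree acceptor from a few event sequences and then merges its states according to a cluster label. It is not the paper's algorithm: the function names `build_pta`, `merge_by_cluster` and `cluster_of`, the toy patterns, and the rule "merge states whose last event falls into the same cluster" are all assumptions made for this example, and the timing information of a real-time automaton is left out entirely.

```python
from collections import defaultdict


def build_pta(sequences):
    """Build a prefix tree acceptor: one state per distinct prefix."""
    state_of_prefix = {(): 0}   # the empty prefix is the root state 0
    counts = defaultdict(int)   # (source, event, target) -> frequency
    for seq in sequences:
        prefix = ()
        for event in seq:
            nxt = prefix + (event,)
            if nxt not in state_of_prefix:
                state_of_prefix[nxt] = len(state_of_prefix)
            counts[(state_of_prefix[prefix], event, state_of_prefix[nxt])] += 1
            prefix = nxt
    return counts, state_of_prefix


def merge_by_cluster(counts, state_of_prefix, cluster_of):
    """Merge PTA states that share a cluster label (assumed merge rule)."""
    # Label each state with the cluster of the last event on its prefix.
    label = {
        state: ("START" if not prefix else cluster_of(prefix[-1]))
        for prefix, state in state_of_prefix.items()
    }
    merged = defaultdict(int)
    for (src, _event, dst), n in counts.items():
        merged[(label[src], label[dst])] += n
    # Normalise outgoing frequencies into transition probabilities.
    totals = defaultdict(int)
    for (src, _dst), n in merged.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in merged.items()}


# Toy stand-in for pattern-based clustering: an event belongs to the
# first (hypothetical) pattern that it contains as a subset.
PATTERNS = {"A": frozenset({"a"}), "B": frozenset({"b"})}


def cluster_of(event):
    for name, pattern in PATTERNS.items():
        if pattern <= event:
            return name
    return "OTHER"


# Each event is a set of features rather than a single symbol.
sequences = [
    [frozenset({"a", "x"}), frozenset({"b", "y"})],
    [frozenset({"a", "z"}), frozenset({"b", "y"})],
    [frozenset({"a", "x"}), frozenset({"a", "z"})],
]

counts, state_of_prefix = build_pta(sequences)
probs = merge_by_cluster(counts, state_of_prefix, cluster_of)
for (src, dst), p in sorted(probs.items()):
    print(f"{src} -> {dst}: {p:.2f}")
```

In this toy run the six prefix-tree states collapse into three automaton states (`START`, `A`, `B`), which shows in miniature why a clustering-guided merge can keep the resulting model small.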
Authors
Classification
DFG subject area: Computer science
DDC subject group: Computer science