Evaluation Methods and Replicability of Software Architecture Research Objects
2022 IEEE 19th International Conference on Software Architecture (ICSA). Los Alamitos, CA, USA: IEEE, 2022, pp. 157-168
Year of publication: 2022
Publication type: Book contribution (conference paper)
Language: English
DOI/URN: 10.1109/icsa53651.2022.00023
Abstract
Context: Software architecture (SA) as a research area has experienced an increase in empirical research, as identified by Galster and Weyns in 2016 [1]. Empirical research builds a sound foundation for the validity and comparability of research. A current overview of the evaluation and replicability of SA research objects could help the community discuss its empirical standards. However, no such overview exists.
Objective: We aim to assess the current state of practice of evaluating SA research objects and of providing replication artifacts in full technical conference papers from 2017 to 2021.
Method: We first create a categorization of papers regarding their evaluation and provision of replication artifacts. In a systematic literature review (SLR) of 153 papers, we then investigate how SA research objects are evaluated and how artifacts are made available.
Results: We found that technical experiments (28%) and case studies (29%) are the most frequently used evaluation methods across all research objects. Functional suitability (46% of evaluated properties) and performance (29%) are the most frequently evaluated properties. 17 papers (11%) provide replication packages, and 97 papers (63%) explicitly state threats to validity. 17% of papers reference guidelines for evaluations, and 14% reference guidelines for threats to validity.
Conclusions: Our results indicate that the generalizability and repeatability of evaluations could be improved to enhance the maturity of the field, although there are valid reasons for contributions not to publish their data. From our findings we derive a set of four proposals for improving the state of practice in evaluating software architecture research objects. Researchers can use our results to find recommendations on relevant properties to evaluate and evaluation methods to use, and to identify reusable evaluation artifacts for comparing their novel ideas with other research. Reviewers can use our results to compare the evaluation and replicability of submissions with the state of the practice.
Authors
Classification
DDC subject group:
Computer science