Data used in the study: Stack Exchange Data Dump 01.06.2020: https://archive.org/download/stackexchange
File | Description |
---|---|
parse_xml.py | Function file to parse the XML files from the Data Dump to .csv files |
related_tags.py | Function file to count the tags related to one given search term |
related_tag_count.py | Function file to merge the tag counts of two files generated with related_tags.py |
total_tag_count.py | Function file to generate a .csv file with the total count of tags in a dataset |
threshold_calculation.py | Function file to calculate the TRT and TST thresholds |
UserSelection.py | Class file with general functions for the user input |
CsvAction.py | Class file with general functions to edit .csv files |
posts.xml and tags.xml were parsed into .csv files with parse_xml.py.
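
The parsing step could look roughly like the following sketch. It is not the code of parse_xml.py; it assumes the usual data dump layout in which every record is a `<row .../>` element whose fields are XML attributes. The function name `xml_rows_to_csv` and the choice of columns are hypothetical.

```python
import csv
import xml.etree.ElementTree as ET


def xml_rows_to_csv(xml_path, csv_path, columns):
    """Copy the chosen attributes of every <row/> element into a CSV file.

    Returns the number of rows written. Missing attributes become empty cells.
    """
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=columns)
        writer.writeheader()
        # iterparse streams the file instead of loading it all at once,
        # which matters for the very large posts.xml.
        for _event, elem in ET.iterparse(xml_path, events=("end",)):
            if elem.tag == "row":
                writer.writerow({col: elem.get(col, "") for col in columns})
                count += 1
                elem.clear()  # drop the row's attributes once written
    return count
```

A call such as `xml_rows_to_csv("posts.xml", "posts.csv", ["Id", "PostTypeId", "Tags"])` would keep only the columns needed for tag counting; the column selection here is an assumption.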
Category | Keyword |
---|---|
Development | android-studio |
Development | android-sdk |
Development | android-app |
Development | android |
Languages | kotlin |
The initial list of keywords is the same for both the English and the Russian Stack Overflow community, because searches for the Cyrillic spellings of google and android (гоогле and андроид) return no results.
The list was generated by first searching for tags whose names contain android, google or kotlin (165 tags in total). The tags were then categorized manually.
Category | Amount of tags |
---|---|
Layout/XML | 25 tags |
Android c | 2 tags |
Android general | 85 tags |
Google mobile | 6 tags |
Google other | 44 tags |
kotlin | 3 tags |
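
The substring search that produced the candidate tags could be sketched as follows. This is only an illustration of the filtering idea: the function name `candidate_tags` is hypothetical, and the manual categorization that follows is not automated here.

```python
SEARCH_TERMS = ("android", "google", "kotlin")


def candidate_tags(tag_names, terms=SEARCH_TERMS):
    """Return the distinct tags whose name contains at least one search term."""
    return sorted(
        name
        for name in set(tag_names)
        if any(term in name.lower() for term in terms)
    )
```

Applied to the tag names from the parsed tags file, this yields the list of tags that was then sorted into the categories above by hand.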
- The category "Layout/XML" can be filtered out completely, because the languages researched in this study are Kotlin and Java.
- The category "Android c" can be filtered out completely, because the languages researched in this study are Kotlin and Java.
- The category "Google other" can be filtered out completely, because the scope of this study is mobile.
- The category "Google mobile" can be filtered out completely, because the posts for these tags seem to concern the services themselves in general rather than specific programming issues.
- The category "kotlin" can be limited to the tag "kotlin" alone, because posts tagged "kotlin-faq" are also tagged "kotlin", and posts tagged "kotlin-native" are tagged "c" (the languages researched in this study are Kotlin and Java).
- The category "Android general" can be limited to 4 tags that are essential for every Android project. Most tags that contain the substring android are specific to a topic or element (e.g. android-camera or android-toast).
Disclaimer: A hardware category, as suggested in the study, is not necessary, because posts with the tags "galaxy" and "nexus" are scarce, and the greater variety of devices makes it uncommon for Stack Overflow users to tag the hardware they use. A category for Android versions is not necessary either, because the only available tags are "android-5.0-lollipop", "pie" and "android-tv" (two of the three "androidtv" posts are also labeled with either "android-sdk" or "android").
related_tags.py was executed for each keyword in the initial keywords table. The resulting count files were then merged pairwise with related_tag_count.py.
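
The counting and merging steps can be illustrated with the sketch below. It is not the code of related_tags.py or related_tag_count.py; it assumes the tags of a post are stored in the data dump form `<tag-a><tag-b>`, and all function names are hypothetical.

```python
import re
from collections import Counter

# Assumed tag field format: "<android><kotlin><listview>"
TAG_PATTERN = re.compile(r"<([^<>]+)>")


def split_tags(tag_field):
    """Split a raw tag field into a list of tag names."""
    return TAG_PATTERN.findall(tag_field or "")


def related_tag_counts(tag_fields, keyword):
    """Count the tags that appear together with `keyword` on the same post."""
    counts = Counter()
    for field in tag_fields:
        tags = split_tags(field)
        if keyword in tags:
            counts.update(tag for tag in tags if tag != keyword)
    return counts


def merge_counts(first, second):
    """Add up two tag count tables, as done pairwise for the keyword files."""
    return first + second
```

Note that in this sketch a post carrying two of the keywords contributes to both count tables, so its other tags are counted twice after merging.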
To get the total count of each tag, total_tag_count.py was executed on the previously parsed posts.csv file.
For the thresholds, threshold_calculation.py was executed on the previously generated mobile tag count and total tag count files. TRT values range up to 2 because tags that were used together with several of the initial keywords are counted once per keyword. Because tags with a TRT value above 1 come from low-volume posts, they are filtered out here, and only tags with 0.4 < TRT < 1 are used.
Tag | TRT |
---|---|
listview | ~ 0.8845 |
sqlite | ~ 0.4391 |
gradle | ~ 0.918 |
firebase | ~ 0.7346 |
google-play | ~ 0.8925 |
webview | ~ 0.8519 |
As in Rosen and Shihab's study (2016), the TST threshold is set to 1%.
Tag | TST |
--- | --- |
listview | 0.02956274802490261 |
sqlite | 0.025084647030982638 |
gradle | 0.01671096224560381 |
firebase | 0.016019223067681214 |
google-play | 0.012087231951068554 |
webview | 0.01004842174245458 |
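
The two filters can be sketched as follows. This is not the code of threshold_calculation.py. It assumes, following the cited study, that TRT is the mobile count of a tag divided by its total count, and that TST is the mobile count of a tag divided by the count of the most popular mobile tag. The parameter `reference_count` stands for that most popular count and is a hypothetical name, as are both functions.

```python
def tag_thresholds(mobile_counts, total_counts, reference_count):
    """Return {tag: (trt, tst)} for every tag that has a total count.

    trt = mobile count / total count of the tag (assumed definition)
    tst = mobile count / reference_count        (assumed definition)
    """
    result = {}
    for tag, mobile in mobile_counts.items():
        total = total_counts.get(tag, 0)
        if total == 0:
            continue  # no total count available, ratio undefined
        result[tag] = (mobile / total, mobile / reference_count)
    return result


def select_tags(thresholds, trt_min=0.4, trt_max=1.0, tst_min=0.01):
    """Keep tags with trt_min < TRT < trt_max and TST >= tst_min."""
    return sorted(
        tag
        for tag, (trt, tst) in thresholds.items()
        if trt_min < trt < trt_max and tst >= tst_min
    )
```

With the default bounds, a tag is kept only if its TRT lies strictly between 0.4 and 1 and its TST is at least 1%, which matches the selection described above. Whether the TST bound is inclusive is an assumption.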