Reviews and Responses for Understanding Citizen Science: Insights from the 2024/2025 OpenSky Network User Survey

Janina Inauen; Karsten Donnay; Vincent Lenders; Martin Strohmeier
This web version is automatically generated from the LaTeX source and may not include all elements. For complete details, please refer to the PDF version.

Original paper

The DOI for the original paper is https://doi.org/10.59490/joas.2026.8474

Review - round 1

Reviewer 1

This paper presents a valuable empirical contribution to understanding participation patterns in stationary citizen sensing networks, specifically the OpenSky Network. The large-scale survey with hundreds of complete responses provides meaningful insights into demographics, motivations, and barriers to participation. The study is well-conceived and methodologically sound, addressing an important gap in citizen science literature by focusing on networks requiring hardware investment and technical maintenance. The paper is well-structured and follows excellent logic throughout, making it suitable for publication with “medium” and minor revisions.

Recommendation: Accept Submission

Major Comments

  1. The paper introduces technical terms without proper definition. GPS should be expanded as Global Positioning System on first use. More critically, ADS-B (Automatic Dependent Surveillance-Broadcast) and Mode S (Mode Select) are specialized aviation surveillance technologies that require explicit definition when first mentioned. While these may be familiar to aviation researchers, the interdisciplinary audience interested in citizen sensing research needs clear explanations. The authors should add definitions for all technical acronyms at first appearance.

  2. The methodology section introduces multiple statistical metrics with threshold values but provides limited context for readers unfamiliar with these measures. Several clarifications would strengthen this section. What decision criteria guided the selection of each test? The authors should expand this section with several sentences explaining the rationale for metric selection and guidelines for when different tests are applied. A brief concrete example illustrating one metric in practice would enhance accessibility.

  3. An important methodological consideration raised in the annotations concerns potential interactions among variables. Given that factors such as education, income, and employment status are likely correlated, and motivations may interact in complex ways, have the authors considered multi-factor analyses to disentangle partial contributions? Approaches such as Shapley value analysis could help assess marginal contributions when predictors are correlated. If such analyses were not conducted, this should be acknowledged as a limitation. If they were conducted, mentioning the results would strengthen the paper.

  4. The most significant weakness is the lack of concrete, actionable recommendations. The paper identifies what problems exist but provides insufficient guidance on how to address them. The paper should expand Section 7 or add a recommendations subsection addressing specific interventions. For financial barriers, what hardware subsidy models might work? Should the network consider graduated pricing by region or partnerships with educational institutions? For geographic disparities in the Global South and Latin America, what partnerships might prove effective? For sustaining motivation, how might enhanced dashboards show participants which research uses their data? The finding that participants value “contributing to independent, open-access data initiatives” suggests specific messaging strategies for recruitment. Most critically, the authors must explicitly address “What is the next step?”

Minor Comments

  • Figures are difficult to read, likely due to small font sizes. I would increase the font size.

  • The interpretation of H3 requires clarification. The paper states H3 is not supported because “routine” showed only moderate importance with poor construct validity. However, the text notes that “relative thresholds suggest routine still ranks among key motivators.” Does moderate importance represent partial support for H3, or does “strongest motivators” in the hypothesis imply a higher threshold? The discussion should clarify what constitutes support versus rejection more explicitly, acknowledging both routine’s conceptual plausibility and the empirical measurement challenges encountered.

  • The finding that OpenSky participants score lower on “face” and “power resources” motivations than other network operators is intriguing. Could OpenSky’s open-science mission attract participants less motivated by recognition or material benefits? This difference has implications for recruitment messaging and deserves expanded discussion.

  • Line 38: “This kind of synergy” needs clearer antecedent reference, presumably to the hobbyist community overlap.

  • Line 157: correct “asspects” to “aspects.”

  • Table 1 should include median age alongside mean, as age distributions can be skewed.

  • Section 5.3’s regional analysis would benefit from formal hypothesis testing (chi-square tests) confirming that regional differences in deterrents are statistically significant rather than chance variation.

Reviewer 2

Thank you for this valuable contribution investigating who participates in the OpenSky Network and why. The study’s objective and methodology are presented clearly. The paper reports results from a large member survey (late 2024–early 2025; >500 responses) to characterize participant demographics, motivations, and barriers in this stationary citizen sensing context (OSN). It combines descriptive analyses and confirmatory factor analysis to identify key motivators (e.g., “helping research”/openness) and the most salient deterrent (financial cost), including regional patterns. Nevertheless, I recommend addressing the following points.

Recommendation: Revisions Required

Major Comments

  1. H2 (line 110): The rationale for the hypothesis is unclear. You state (line 109) that OpenSky data serve research purposes, yet H2 claims there is “less emphasis on helping with research.” The fact that OpenSky does not pursue a single predefined research objective does not, by itself, imply that “helping research” should be a less important driver. Please revise the hypothesis framing and the associated argumentation or explain more explicitly how the premises support H2 and how this connects to the subsequent interpretation.

  2. Structure and reporting in Chapter 4: Please revise the structure and presentation logic of this chapter.

    1. Present key descriptive numbers (and, where relevant, percentages) before drawing general conclusions. For example, line 213 states that most respondents were male, but the corresponding value appears only later (line 227). A similar issue occurs around line 225, where the text moves to hypothesis-oriented interpretation before providing the underlying results.

    2. Ensure that all tables and figures are explicitly referenced in the text and briefly introduced (what is shown and why it matters).

    3. Improve sequencing: introduce and describe a table/figure in the text before it appears to support reader interpretation. For example, Table 1 appears at the start of the chapter but is first referenced only later (line 232).

    4. Consider moving hypothesis-level conclusions from the results section to the discussion section. The results chapter should primarily report findings.

Minor Comments

  • Abstract: Include at least 1–2 concrete headline findings rather than stating “who, why, and what” in general terms.

  • Line 27: Specify what “growth” refers to (e.g., members, sensors, coverage, publications).

  • Lines 30–32: Consider moving this sentence to the preceding paragraph where OSN is introduced (e.g., near line 27, after listing the relevant sources).

  • Line 73: You refer to “three values” but list four (“help with research”, “social expansion”, “routine”, “teaching”). Please reconcile this.

  • Line 83: Please clarify what is meant by “skills” in this context.

  • Line 87: Chapter 2 discusses the theory. Please adjust the heading accordingly.

  • Line 102: You refer to “above-median-income” without stating the threshold. Consider clarifying this in the paragraph preceding H1.

  • Punctuation/style (lines 33, 109, 159, 214, 305): Semicolons appear unnecessary in several places. Where appropriate, split into separate sentences to improve readability. This is often an indicator for AI.

  • Punctuation/style (lines 38, 53, 70, 119, 187, 231, 246, 247, 271, 276, 281): Please review dash usage and replace with commas/colons where more suitable. This is often an indicator for AI.

  • Line 175: Please refer explicitly to Table 3 to support comprehension and mapping.

  • Line 180: Briefly explain what is meant by “attention check.”

  • Line 189: Chapter 4 reads like the methods chapter overall. Please rename Chapter 4.4 and streamline headings (e.g., “Survey design”, “Measures”, “Data analysis”) to avoid redundancy and improve navigability.

  • Line 205: The first sentence is not part of the results but should be mentioned in the methods chapter.

  • Line 205: The age results appear to belong to the demographics subsection. Consider aligning placement accordingly (rather than preceding Chapter 5.1).

  • Line 228: Consider moving “Respondents came from …” to the paragraph beginning at line 213, as it is conceptually linked to the area/continent.

  • Line 231 (Table 1 placement): This content appears to be intended as the textual introduction to Table 1. Consider moving it to directly accompany Table 1.

  • Line 231 (age statistics): If age was collected in ordinal categories, mean and standard deviation depend on assumed category midpoints. Please state clearly that these are approximations and interpret them accordingly.

  • Line 236: I would expect that the participants who failed the attention check are excluded from the whole analysis and would therefore mention the number of failed attention checks at the beginning of Chapter 5.

  • Line 237: The abbreviation is introduced earlier (line 192). Consider using only the abbreviation here.

  • Figure 3: The image appears distorted. Please correct this and consider adding a zoomed-in view of Europe for readability.

  • Lines 245–247: The sentence is difficult to follow. Consider using quotation marks and/or formatting to clearly separate the hypotheses from explanatory text.

  • Figure 4: Adjust colour choices for accessibility/readability (yellow is difficult to read, and the palette overlaps with Figure 3 while encoding different meaning). Also, reference the figure in the text and define the absolute vs. relative threshold succinctly, including how this affects interpretation.

  • Figure 5: Add data labels to improve readability. Consider adding an additional “total” bar per category (without continent breakdown) to support quick comparison.

  • Line 280: The country-level differences could be discussed in more depth.

  • Line 301: Add the missing unit after “50’’ (years).

  • Line 311: The conclusion refers to hypotheses only here. Avoid specifically mentioning H4 if hypotheses are otherwise not treated consistently in the conclusion.

  • Line 335: Typo: “Forth author” should be “Fourth author.”

  • Figures (general): Please remove underlines in axis labels and legends.

  • General note for future work: The manuscript appears to assume that respondents operate sensors in their country of residence and uses this as a proxy for sensor location. In future surveys, consider directly asking where the sensor is operated to avoid misclassification.

Response - round 1

Thank you for reviewing our work. Apart from addressing your feedback and making some smaller adjustments, we slightly expanded the analysis and discussion of the deterrents.

Response to reviewer 1

Thank you very much for taking the time to write this valuable feedback. See our responses below.

Major Comments

The paper introduces technical terms without proper definition. GPS should be expanded as Global Positioning System on first use. More critically, ADS-B (Automatic Dependent Surveillance-Broadcast) and Mode S (Mode Select) are specialized aviation surveillance technologies that require explicit definition when first mentioned. While these may be familiar to aviation researchers, the interdisciplinary audience interested in citizen sensing research needs clear explanations. The authors should add definitions for all technical acronyms at first appearance.

Spelled out GPS.

Added explanations for ADS-B and Mode S in the footnotes:

ADS-B: A surveillance technology in which an aircraft automatically broadcasts its identity, position (derived from onboard navigation systems such as GPS/GNSS), and other flight data to ground stations and other equipped aircraft without requiring radar interrogation [Federal Aviation Administration 2025].

Mode S: A secondary surveillance radar system that enables selective interrogation of aircraft using a unique address, improving identification and altitude reporting, and providing a data link foundation used by extended services such as ADS-B [Federal Aviation Administration 2024].

The methodology section introduces multiple statistical metrics with threshold values but provides limited context for readers unfamiliar with these measures. Several clarifications would strengthen this section. What decision criteria guided the selection of each test? The authors should expand this section with several sentences explaining the rationale for metric selection and guidelines for when different tests are applied. A brief concrete example illustrating one metric in practice would enhance accessibility.

Expanded explanations on the CFA part in the methodology section:

"Quality of the motivational constructs was assessed with Confirmatory Factor Analysis (CFA), a common method for assessing multi-item constructs [Hair et al. 2020]. The CFA included testing model fit, composite reliability, convergent validity, and discriminant validity [Lin 2025]. Respective thresholds were chosen based on the literature. Composite reliability, informing about the internal consistency of a latent construct [Cheung et al. 2024], was measured with McDonald’s $\omega$. Values of $0.7$ or higher indicate good reliability [Hair et al. 2014]. Convergent validity, describing how well a category is represented by its respective items, was assessed through factor loadings ($\ge 0.5$ acceptable, $\ge 0.7$ ideal) and Average Variance Extracted (AVE, $\ge 0.5$ acceptable) [Hair et al. 2014]. Factor loadings indicate the degree to which a category explains the variance of each of its corresponding items while AVE reflects how much of the variance in a construct’s items is accounted for by the construct itself, relative to the total variance present in those items [Cheung et al. 2024]. Discriminant validity, showing how sharply the categories can be distinguished from each other, was evaluated using a correlation threshold of $0.85$ [Cheung et al. 2024]."

We did not add specific examples, but tried to make clear what each measure does by explaining more.
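A minimal sketch of how two of the metrics above behave, assuming standardized loadings from a single-factor model with uncorrelated errors. The loadings are hypothetical and are not taken from the survey data, and the function names are illustrative only.

```python
def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings of one construct."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability for a single-factor model with standardized
    loadings and uncorrelated errors (error variance = 1 - loading^2)."""
    total = sum(loadings) ** 2
    error = sum(1 - l ** 2 for l in loadings)
    return total / (total + error)


# Hypothetical standardized loadings for a three-item construct.
loadings = [0.8, 0.7, 0.6]
ave = average_variance_extracted(loadings)
omega = composite_reliability(loadings)

print(f"AVE   = {ave:.3f}  (acceptable if >= 0.5)")
print(f"omega = {omega:.3f}  (good if >= 0.7)")
print("all loadings acceptable:", all(l >= 0.5 for l in loadings))
```

With these hypothetical values, reliability clears the 0.7 threshold (about 0.745) while AVE falls just short of 0.5 (about 0.497), which shows that the two criteria can disagree for the same construct.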

An important methodological consideration raised in the annotations concerns potential interactions among variables. Given that factors such as education, income, and employment status are likely correlated, and motivations may interact in complex ways, have the authors considered multi-factor analyses to disentangle partial contributions? Approaches such as Shapley value analysis could help assess marginal contributions when predictors are correlated. If such analyses were not conducted, this should be acknowledged as a limitation. If they were conducted, mentioning the results would strengthen the paper.

If we understand correctly, you are asking how different demographic backgrounds shape motivations. If so: no, we did not address this topic specifically in this paper, but it is certainly an interesting avenue for further work. The data is available to anyone interested in pursuing it. We added the following part to the limitations:

"While this study focused on describing motivational patterns at an aggregate level, it did not analyse how these motivations might vary across demographic subgroups or interact with structural factors such as education, income, or employment status. Given that such variables are often correlated, future research could employ multi‑factor analytical techniques—such as Shapley value approaches or other variance‑decomposition methods—to disentangle the partial contribution of each predictor. This would allow for a more nuanced understanding of whether demographic heterogeneity meaningfully shapes the motivational configuration of SCS contributors."

The most significant weakness is the lack of concrete, actionable recommendations. The paper identifies what problems exist but provides insufficient guidance on how to address them. The paper should expand Section 7 or add a recommendations subsection addressing specific interventions. For financial barriers, what hardware subsidy models might work? Should the network consider graduated pricing by region or partnerships with educational institutions? For geographic disparities in the Global South and Latin America, what partnerships might prove effective? For sustaining motivation, how might enhanced dashboards show participants which research uses their data? The finding that participants value "contributing to independent, open-access data initiatives" suggests specific messaging strategies for recruitment. Most critically, the authors must explicitly address "What is the next step?"

Chapter 7 was renamed to "Conclusion and Recommendations", and several concrete ideas as well as an explicit "next step" were added.

Minor Comments

Figures are difficult to read, likely due to small font sizes. I would increase the font size.

Done.

The interpretation of H3 requires clarification. The paper states H3 is not supported because "routine" showed only moderate importance with poor construct validity. However, the text notes that "relative thresholds suggest routine still ranks among key motivators." Does moderate importance represent partial support for H3, or does "strongest motivators" in the hypothesis imply a higher threshold? The discussion should clarify what constitutes support versus rejection more explicitly, acknowledging both routine’s conceptual plausibility and the empirical measurement challenges encountered.

Good point. To clarify on this, we added several sentences to the paper:

First, in Chapter 4.4: "For comparison, relative thresholds, based on the minimum and maximum values of the mean responses, are also included."

Second, in Chapter 5.2: "However, under relative thresholds, “routine” ($p = 0.99$ for $H_0$: true mean $\ge 3.08$) appears as one of several key motivators, though it ranks last among these relatively influential factors. Thus, H2: “Helping with research is not among the main factors motivating individuals to operate a sensor for OSN”, is not supported by the data at hand, quite the opposite. Using absolute thresholds, H3: “Routine—being already engaged in a similar activity—is among the strongest motivators for operating a sensor for OSN” receives the same judgement. Yet, when using more relaxed standards for determining what is strong H3 finds limited support."

Third, in Chapter 5.4: "The motivational results did not support H2 and only lent very limited support to H3. “Helping research” was confirmed as a strong motivator, while “routine”, though conceptually plausible and a somewhat strong motivator under relaxed thresholds, lacked empirical validity."

The finding that OpenSky participants score lower on "face" and "power resources" motivations than other network operators is intriguing. Could OpenSky’s open-science mission attract participants less motivated by recognition or material benefits? This difference has implications for recruitment messaging and deserves expanded discussion.

The discussion was moved from Chapter 5.2 to 5.4 and slightly expanded on: "Assuming all relevant variables were observed, the motivational differences between OSN participants and non‑participants suggest that structural or organizational characteristics specific to OSN attract individuals with a distinct motivational profile. In particular, OSN contributors appear to place less emphasis on social recognition and personal benefit than participants in other flight‑tracking communities." Additionally, one final sentence was added in the conclusion: "This group should also consider whether trade offs between different strategies could emerge in cases where initiatives designed to attract certain regional or motivational profiles could inadvertently reduce engagement among others."

Line 38: "This kind of synergy" needs clearer antecedent reference, presumably to the hobbyist community overlap.

Adjusted to: "Third, the network is embedded in a pre-existing community of aviation enthusiasts—such as plane spotters [Lichter-Marck 2016; NYCAviation 2025]. While not entirely unique in this area, the presence of such hobbyist communities creates a supportive environment which encourages participation. Furthermore, for SCS, the potential for synergy between similar networks is quite significant due to interoperable receivers that enable data sharing across multiple platforms."

Line 157: correct "asspects" to "aspects."

Done.

Table 1 should include median age alongside mean, as age distributions can be skewed.

The median category is now highlighted in bold in Table 1. For a discussion of the mean and the influence of the category midpoints, see the new version of Chapter 5.4.

Section 5.3’s regional analysis would benefit from formal hypothesis testing (chi-square tests) confirming that regional differences in deterrents are statistically significant rather than chance variation.

While regional patterns in deterrents are descriptively informative, formally testing these differences is beyond the scope of the present study. The central hypothesis (H4) concerns overall deterrents to sensor adoption, not regional comparisons, and is fully supported by the aggregated results without the need for inferential testing by region. Moreover, several regions contain relatively small subsamples, and many deterrent items have low counts in specific regions. Under these conditions, $\chi^2$ tests would be underpowered, sensitive to sparse cells, and would require multiple comparisons corrections, which would substantially increase the risk of either false positives or uninterpretable results. To address this concern nonetheless, we reviewed the wording and made explicit, where needed, that these differences are not statistically tested.
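The sparse-cell concern can be illustrated with a short sketch. It computes the expected counts that a $\chi^2$ test of independence would rely on and reports how many fall below 5, a common rule of thumb for when the approximation becomes unreliable. The contingency table is hypothetical (rows are regions, columns are whether a deterrent was selected) and does not reproduce the survey data.

```python
def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical counts: one large region and two small ones.
table = [
    [120, 180],  # large region: selected / not selected
    [6, 9],      # small region
    [1, 5],      # very small region
]

expected = expected_counts(table)
cells = [value for row in expected for value in row]
sparse_cells = sum(value < 5 for value in cells)
share_sparse = sparse_cells / len(cells)

print(f"{sparse_cells} of {len(cells)} expected counts are below 5 "
      f"({share_sparse:.0%})")
```

In this made-up table, both expected counts of the smallest region fall below 5, so a third of all cells are sparse even though the overall sample is large.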

Response to reviewer 2

Thank you very much for taking the time to give such detailed and helpful feedback. See our responses below.

Major Comments

H2 (line 110): The rationale for the hypothesis is unclear. You state (line 109) that OpenSky data serve research purposes, yet H2 claims there is “less emphasis on helping with research.” The fact that OpenSky does not pursue a single predefined research objective does not, by itself, imply that “helping research” should be a less important driver. Please revise the hypothesis framing and the associated argumentation or explain more explicitly how the premises support H2 and how this connects to the subsequent interpretation.

Looking at it in light of other citizen science projects, it does indeed make sense. I adjusted the sentence so it is a bit clearer:

"Unlike most citizen science projects, OSN does not pursue a single defined research objective which participants anticipate contributing to when they join the network. Rather, data is provided for diverse purposes, including but not exclusive to research. However, prior studies suggest that clear project goals are powerful motivators. Due to the lack of this clear research goal, OSN members are expected to place less emphasis on “helping with research” as a central reason for participation."

Structure and reporting in Chapter 4: Please revise the structure and presentation logic of this chapter.

  1. Present key descriptive numbers (and, where relevant, percentages) before drawing general conclusions. For example, line 213 states that most respondents were male, but the corresponding value appears only later (line 227). A similar issue occurs around line 225, where the text moves to hypothesis-oriented interpretation before providing the underlying results.

  2. Ensure that all tables and figures are explicitly referenced in the text and briefly introduced (what is shown and why it matters).

  3. Improve sequencing: introduce and describe a table/figure in the text before it appears to support reader interpretation. For example, Table 1 appears at the start of the chapter but is first referenced only later (line 232).

  4. Consider moving hypothesis-level conclusions from the results section to the discussion section. The results chapter should primarily report findings.

Assuming that we’re talking about chapter 5, not 4, the following changes were made:

  1. Deleted the word "male" from line 227. As for line 225, I like starting the section with a statement of facts and would like to keep it. To improve it nonetheless, I adjusted the order of the topics discussed after the statement and added the information on income. I also changed the word “largely” to “mostly” so that readers are not confused when they see the age variable, and so that this does not have to be discussed in the results section already. We also put the hypothesis discussion in the middle and the active/passive member comparison at the end, because the comparison is the “extra” bit of information.

  2. Added the missing reference and changed Fig. 2 to the graph displaying the gender of participants, as this graph is more relevant because gender is mentioned in H1.

  3. Done.

  4. We think the hypotheses should at least be mentioned right where the related numbers are, because otherwise we risk repetition or it is annoying for the reader if they have to go back and forth to double check if what we’re saying in the interpretation is supported by the numbers. Generally speaking, we think it is fine to state whether a hypothesis finds support or not in this section and thus mostly kept it the way it was. However, we did move the following sentence from the deterrence results to the deterrence discussion: "Interestingly, obstacles identified as relevant in the literature, like insufficient skills and time constraints, are not central for OSN."

Minor Comments

Abstract: Include at least 1–2 concrete headline findings rather than stating “who, why, and what” in general terms.

Added to abstract: "The collected data show that participants are predominantly well-educated, above-median-income males from Western countries, with an average age of 50. Their primary motivation is contributing to research, despite limited knowledge of the specific research projects. Cost is the main barrier to participation, particularly in underrepresented regions, whereas disinterest and environmental concerns deter adoption in other areas."

Line 27: Specify what “growth” refers to (e.g., members, sensors, coverage, publications).

Added the word "membership", so it now reads "membership growth".

Lines 30–32: Consider moving this sentence to the preceding paragraph where OSN is introduced (e.g., near line 27, after listing the relevant sources).

Moved the sentence as proposed.

Line 73: You refer to “three values” but list four (“help with research”, “social expansion”, “routine”, “teaching”). Please reconcile this.

This was a typo: the authors do indeed add four categories, three of which align neatly with the umbrella categories, while teaching is a standalone category. The “three” came from an oversight when shortening the original thesis to this paper. We adjusted it accordingly.

Line 83: Please clarify what is meant by “skills” in this context.

Added "(technical)":

...the perceived lack of time, physical ability, or (technical) skills as primary barriers.

Line 87: Chapter 2 discusses the theory. Please adjust the heading accordingly.

As Chapter 2 is the literature review (which necessarily requires covering some theory), and Chapter 3 uses the theories found in the literature (Ch 2) and applies it to the case in order to derive hypotheses, we believe our approach of naming them is more accurate.

Line 102: You refer to “above-median-income” without stating the threshold. Consider clarifying this in the paragraph preceding H1

There is no threshold, for a reason explained in the methods section (see quote below). The survey is global, so “above median” depends on the participants’ self-assignment: they select the income bracket they “feel” they are part of, not a certain amount of money they actually have. On the chosen scale from 1 to 5, everything above 3 is "above median" income.

“Because OSN is international, income was measured subjectively by asking respondents to place themselves within national income quintiles, following the World Values Survey approach.”

Punctuation/style (lines 33, 109, 159, 214, 305): Semicolons appear unnecessary in several places. Where appropriate, split into separate sentences to improve readability. This is often an indicator for AI.

All but one were replaced with “but” or “.”

Punctuation/style (lines 38, 53, 70, 119, 187, 231, 246, 247, 271, 276, 281): Please review dash usage and replace with commas/colons where more suitable. This is often an indicator for AI.

Adjusted where found appropriate. Kept it for the hypothesis because we like it as a style element and actually do use it when writing without AI.

Line 175: Please refer explicitly to Table 3 to support comprehension and mapping.

Added clear reference to table in the text.

Line 180: Briefly explain what is meant by “attention check.”

Added: "... and added an attention check statement (“If you are actively reading this, please select five.”)."

Line 189: Chapter 4 reads like the methods chapter overall. Please rename Chapter 4.4 and streamline headings (e.g., “Survey design”, “Measures”, “Data analysis”) to avoid redundancy and improve navigability.

Changed the title of 4.4 from “Methods” to “Statistical Analysis”.

Line 205: The first sentence is not part of the results but should be mentioned in the methods chapter.

Sentence moved to second paragraph of Methods section. "The survey was open for ten weeks, starting on December 22, 2024."

Line 205: The age results appear to belong to the demographics subsection. Consider aligning placement accordingly (rather than preceding Chapter 5.1).

Done.

Line 228: Consider moving “Respondents came from …” to the paragraph beginning at line 213, as it is conceptually linked to the area/continent.

In line 228 we discuss the locality with reference to H1 while the upper part in 213 is a general discussion of the survey. We think it is necessary to mention the locality of responses in both parts, and thus, left it like that.

Line 231 (Table 1 placement): This content appears to be intended as the textual introduction to Table 1. Consider moving it to directly accompany Table 1.

Done.

Line 231 (age statistics): If age was collected in ordinal categories, mean and standard deviation depend on assumed category midpoints. Please state clearly that these are approximations and interpret them accordingly.

Correct, we used category midpoints and then 13.5 and 79.5 for the fringe categories <18 and >75 to calculate average age. Added the word "approximate" every time average age is discussed. Added a sentence to chapter 4.4:

"As age was measured categorically, category midpoints were used to obtain an approximate average value." and added the following sentences to the discussion part regarding the age variable: "Since category midpoints were used for the mean calculation this result is an approximation. Further, as there was no one younger than 18, the results are sensitive to the >75 midpoint. The chosen midpoint of 79.5 imposes an artificial maximum on this open‑ended category and compresses its variability which leads to a downward bias as well as an underestimated standard deviation. However, because only 2.1% respondents fell into the >75 group, any downward bias in the calculated mean is expected to be small. If anything, OSN participants are slightly older than calculated, further supporting the finding."

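To illustrate the midpoint approximation described above, the following is a minimal sketch rather than the analysis code used for the paper. Only the fringe midpoints 13.5 and 79.5 come from our response; the inner age brackets, all respondent counts, and the function name `approx_mean_sd` are hypothetical and chosen purely for illustration.

```python
from math import sqrt

# Hypothetical bracket midpoints. Only 13.5 (<18) and 79.5 (>75) are taken
# from the response above; the inner brackets are illustrative assumptions.
MIDPOINTS = {
    "<18": 13.5,
    "18-29": 23.5,
    "30-39": 34.5,
    "40-49": 44.5,
    "50-59": 54.5,
    "60-75": 67.5,
    ">75": 79.5,
}

# Hypothetical respondent counts per bracket (not the survey data).
COUNTS = {"<18": 0, "18-29": 10, "30-39": 20, "40-49": 30,
          "50-59": 25, "60-75": 13, ">75": 2}


def approx_mean_sd(midpoints, counts):
    """Approximate mean and sample SD by treating every respondent
    as if their age equalled the midpoint of their bracket."""
    n = sum(counts.values())
    mean = sum(midpoints[k] * c for k, c in counts.items()) / n
    var = sum(c * (midpoints[k] - mean) ** 2 for k, c in counts.items()) / (n - 1)
    return mean, sqrt(var)


mean, sd = approx_mean_sd(MIDPOINTS, COUNTS)

# Sensitivity check: move the open-ended ">75" midpoint from 79.5 to 85.0.
mean_alt, sd_alt = approx_mean_sd({**MIDPOINTS, ">75": 85.0}, COUNTS)

print(f"midpoint 79.5: mean={mean:.2f}, sd={sd:.2f}")
print(f"midpoint 85.0: mean={mean_alt:.2f}, sd={sd_alt:.2f}")
```

With these made-up counts, where 2 of 100 respondents fall into the >75 bracket, raising that midpoint by 5.5 years shifts the approximate mean by roughly 0.11 years and slightly increases the standard deviation. This is the kind of small sensitivity we describe for the open-ended category.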
Line 236: I would expect, that the participants who failed the attention check are excluded from the whole analysis and would therefore mention the number of failed attention checks at the beginning of Chapter 5.

No, they were only excluded from the analysis of the motivators. We assumed that the demographic questions, which are asked first, are relatively fast and easy to answer and that respondents would still be paying attention at that point. The motivation question, however, took quite long to answer but was relatively easy to just click through; hence the attention check for this question only.
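The selective exclusion can be sketched as follows. This is an illustration under assumptions, not our actual processing script: the record layout, the field names (`attention_check`, `motivation`), and the sample values are all hypothetical. The only detail taken from the survey is that the attention check asked respondents to select five.

```python
# Hypothetical survey records; field names and values are illustrative only.
responses = [
    {"id": 1, "continent": "Europe", "attention_check": 5, "motivation": [4, 5, 3]},
    {"id": 2, "continent": "Asia", "attention_check": 2, "motivation": [3, 3, 3]},
    {"id": 3, "continent": "Europe", "attention_check": 5, "motivation": [5, 4, 4]},
    {"id": 4, "continent": "North America", "attention_check": 5, "motivation": [2, 5, 1]},
]

# The attention check statement asked respondents to select five.
EXPECTED_ATTENTION_ANSWER = 5

# Demographic analyses keep every respondent.
demographic_sample = list(responses)

# Motivator analyses drop respondents who failed the attention check.
motivation_sample = [
    r for r in responses if r["attention_check"] == EXPECTED_ATTENTION_ANSWER
]

print(len(demographic_sample), len(motivation_sample))
```

Keeping two samples instead of dropping failed respondents entirely preserves the demographic answers, which were given before the long motivation block.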

Line 237: The abbreviation is introduced earlier (line 192). Consider using only the abbreviation here.

Done.

Figure 3: The image appears distorted. Please correct this and consider adding a zoomed-in view of Europe for readability.

Adjusted the distortion, but refrained from adding a Europe-only image for space reasons. We don’t want to include two maps because the community is already quite Eurocentric; we would rather focus on the global reach (or lack thereof) of OSN. The resolution of the vector graphic should allow a good zoom experience on a screen.

Lines 245–247: The sentence is difficult to follow. Consider using quotation marks and/or formatting to clearly separate the hypotheses from explanatory text.

Added quotation marks.

Figure 4: Adjust colour choices for accessibility/readability (yellow is difficult to read, and the palette overlaps with Figure 3 while encoding different meaning). Also, reference the figure in the text and define the absolute vs. relative threshold succinctly, including how this affects interpretation.

Changed the lightest colour from yellow to green throughout. The palette was intentionally kept the same for all plots; the legend is always available. Referenced the relative threshold, also in the methods, and discussed it more in the text. See also the response to feedback from Reviewer 2 regarding this point.

Figure 5: Add data labels to improve readability. Consider adding an additional “total” bar per category (without continent breakdown) to support quick comparison.

Added total bar and data labels to improve readability.

Line 280: The country-level differences could be discussed in more depth.

We assume region-level differences are meant here. The discussion of the deterrent factors, including the regional differences, was generally expanded. In the discussion section, for example, we added:

"Although the results are descriptive, regional differences indicate that certain outreach or engagement strategies may be more effective in some areas than others. These variations also suggest the possibility of trade‑offs that should be considered when evaluating new approaches to user recruitment. Where feasible, targeted and region‑sensitive strategies are likely to yield the greatest impact. Clearly, there are some obstacles, such as the lack of a suitable spot or the legality of sensor set-ups, which are impossible for OSN to alleviate. "

Line 301: Add the missing unit after “50” (years).

Done.

Line 311: The conclusion refers to hypotheses only here. Avoid specifically mentioning H4 if hypotheses are otherwise not treated consistently in the conclusion.

Removed "...confirms H4 and..."

Line 335: Typo: “Forth author” should be “Fourth author.”

Done.

Figures (general): Please remove underlines in axis labels and legends.

Done.

General note for future work: The manuscript appears to assume that respondents operate sensors in their country of residence and uses this as a proxy for sensor location. In future surveys, consider directly asking where the sensor is operated to avoid misclassification.

This is an excellent point we really should have thought of before conducting the survey but somehow didn’t.

Cheung, G.W., Cooper-Thomas, H.D., Lau, R.S., and Wang, L.C. 2024. Reporting reliability, convergent and discriminant validity with structural equation modeling: A review and best-practice recommendations. Asia Pacific Journal of Management 41, 2, 745–783.
Federal Aviation Administration. 2024. Radar surveillance terminology. https://www.faa.gov/air_traffic/technology/radardivestiture/terminology.
Federal Aviation Administration. 2025. Automatic dependent surveillance – broadcast (ADS-B). https://www.faa.gov/about/office_org/headquarters_offices/avs/offices/afx/afs/afs400/afs410/ads-b.
Hair, J.F., Black, W.C., Babin, B.J., and Anderson, R.E. 2014. Multivariate data analysis. Pearson.
Hair, J.F., Howard, M.C., and Nitzl, C. 2020. Assessing measurement model quality in PLS-SEM using confirmatory composite analysis. Journal of Business Research 109, 101–110.
Lichter-Marck, R. 2016. Eyes aloft: The sublime obsession of plane spotting. The Virginia Quarterly Review 92, 4, 52–63.
Lin, J. 2025. Confirmatory factor analysis (CFA) in R with lavaan. https://stats.oarc.ucla.edu/r/seminars/rcfa/.
NYCAviation. 2025. Homepage. https://www.nycaviation.com/.