Stakeholders' Perceptions of a Universal Sustainability Assessment in Higher Education: A Review of Empirical Evidence

The progress of sustainability within higher education has steadily increased in focus over the last decade and has increasingly become a topic of academic research. As institutions investigate, implement and market sustainability efforts, a myriad of sustainability assessment methodologies has become available. This assortment of standards does not help students and faculty assess the level of sustainability uniformly between institutions. A universal framework was proposed to facilitate stakeholders' review and comparison of sustainability assessments in higher education. This research reviews the creation of the framework and the results from testing it in an online environment. The lack of data collected during the testing phase provides some anecdotal evidence regarding what stakeholders consider important in terms of sustainability within higher education and may also indicate that there is no need for a universal sustainability assessment in higher education to be used directly by stakeholders.


Introduction
There is a growing public expectation that universities should start focusing on delivering sustainability. Students not only place high value on many aspects of sustainability, but also express that sustainability concerns are a significant factor in university choices (Bone & Agombar 2011). Maragakis & Dobbelsteen (2013) conducted a survey to understand what stakeholders looked for in sustainable institutions. 95% of the respondents to the empirical study agreed that there was a need for a uniform sustainability rating system for higher education institutions, while 92% agreed that employability after graduation should be a measure of an institution's sustainability.
With regards to a uniform rating system, numerous publications (Ryan et al., 2010; Glasser, 2009; Patrick et al., 2008; Perna et al., 2006) have investigated and analyzed the various assessment systems available to universities. However, none have gone so far as to suggest which assessment system would be best suited for standardized use. While stakeholders would prefer one system, it is seen as a controversial step as the choice will have far-reaching implications in theory and practice (Shriberg, 2002). Maragakis & Dobbelsteen (2015) conducted a literature review of sustainability assessments to create a theoretical framework for a universal system. Utilizing previous assessments from Orr (Penn State Green Destiny Council, 2000), Shriberg (2002) and Saadatian et al. (2011), they identified eleven criteria, which were proposed as a framework for reviewing assessments. However, this framework did not include any direct reference to the employability criteria.
Employability is a convoluted term. A literature review by Maragakis et al. (2016a) recommended that three parameters should be used to assess one's employability due to their importance to future job-seeking graduates, namely starting salaries (based on studies from Rajecki & Borden, 2011), employment (based on studies from Bell & Blanchflower, 2011 and Ashford et al., 2012) and over education (based on studies from Carroll & Tani, 2013 and Linsley, 2005).
These three parameters were further explored by Maragakis et al. (2016b) to gain insight on the perceptions held by higher education stakeholders. The data collected indicated that there was a strong preference for students to be employable after graduation, although students were not particularly concerned with starting salary or under-employment.
This research looked to validate stakeholder needs for a uniform system by providing a framework for reviewing assessments. Utilizing the theoretical framework proposed by Maragakis & Dobbelsteen (2015) and including the three parameters for employability, an online tool was created for stakeholders to rate assessment systems, with the hope of validating the framework and also providing insight into a potential assessment system appropriate for universal use.

Background
This paper focused on validating stakeholders' needs for a uniform sustainability assessment in higher education by testing a theoretical framework that was supported by academic research and stakeholder input.
In 2013, Maragakis & Dobbelsteen's empirical evidence indicated that there was a need for a uniform assessment system for sustainability in higher education that did not yet exist.
90% of stakeholders responded that the sustainability of a higher education institution was important in their selection, a conclusion also reached by Bone & Agombar (2011). The survey identified that stakeholders were using a variety of methods to assess an institution's level of sustainability. It is interesting to note in Figure 1 that many respondents reported either evaluating an institution's sustainability solely on their own or using a mix of the various resources available to them, implying that they were engaged and knowledgeable in the topic of sustainability within higher education institutions (Maragakis & Dobbelsteen, 2013). Of the participants familiar with one or more of the systems, AASHE's STARS was the best known, with 88% of participants saying they were familiar with the system, although only 60% agreed that it was the best method for assessing an institution's sustainability.
Of the students pursuing higher education, 71% said they were doing it for personal accomplishment and future employability, 22% said they were studying exclusively for future employability, while only 7% responded to studying either exclusively for personal accomplishment or for some other reason. This result shows the importance of economic factors surrounding the attainment of a degree. In fact, in another question 80% of stakeholders agreed that an institution's ability to make you more competitive in the job market is more important than sustainability. Of the remaining 20%, it was repeatedly mentioned that the two factors are intertwined and thus inseparable.
The same study also identified the need for economic factors to be used as a measure of sustainability. 92% of participants identified that employability after graduation should be included in the measurement of an institution's sustainability. Maragakis & Dobbelsteen (2015) proposed a framework for comparing sustainability assessments utilizing parameters and criteria set forth by other researchers in the field of sustainability in higher education. The eleven criteria set forth are found below in Table 1.

Table 1

Orr (Penn State Green Destiny Council, 2000):
What quantity of material goods does the college/university consume on a per capita basis?
What are the university/college management policies for materials, waste, recycling, purchasing, landscaping, energy use and building?
Does the curriculum engender ecological literacy?
Do university/college finances help build sustainable regional economies?
What do graduates do in the world?

Ideal cross-institutional sustainability assessments (Shriberg, 2002):
Identify important issues
Are calculable and comparable
Move beyond eco-efficiency
Measure processes and motivations
Stress comprehensibility

Identifying Strengths and Weakness of Sustainable Higher Educational Assessment Approaches (Saadatian et al., 2011):
Popularity

Utilizing this framework, the research highlighted that the popular assessments available did not track "what graduates are doing in the world," a criterion set forth by Orr (Penn State Green Destiny Council, 2000). Additionally, it was identified that neither the proposed framework nor any of the popular assessments included employability.
In order to explore the parameters surrounding employability, Maragakis et al. (2016a) studied the economic returns of higher education within the framework of sustainability assessment. The premise was that a degree should not be marketed as sustainable unless it addresses the economic return of the future graduate. The research recommended that three parameters should be used to assess one's employability:
1. Starting salary, as it was highly correlated to mid-career salary levels (Rajecki & Borden, 2011),
2. Under employment, as it has become a growing concern after the financial crisis of 2008 (Ashford et al., 2012), and
3. Over education, as this is also a growing phenomenon (Carroll & Tani, 2013).
The result of this research provided three necessary parameters to address the term employability. These parameters, combined with the original framework proposed by Maragakis & Dobbelsteen (2015), were hypothesized to provide a more holistic review of sustainability assessment systems within higher education.

Research Question
The primary question of this research is: Can a holistic framework be created that will aid stakeholders in reviewing a university's level of sustainability?
The secondary research question is whether STARS is still the assessment preferred by stakeholders.

Website
The domain www.sustainingeducation.com was purchased and a website was developed using WordPress. The website was developed to collect data for this research while also offering users relevant reference material. Four webpages were created:
3. An assessment webpage which allowed users to rate popular sustainability assessments based on fourteen different criteria.
4. A resources page which gave links to supporting material and other useful resources.
Upon completion of the website, two weeks of testing were conducted in order to debug the site and respond to problems. Small changes were made to improve the user interface across various platforms (desktop, tablet, and mobile). Development and testing took three months in total.

Assessment Outline
A rating system was created using a custom widget which was programmed to allow users to rate systems based on the criteria from the proposed framework. As can be seen in Figure 2, the user interface was simple and allowed users to hover over the 1-5 star rating scale and select their preference. The rating level was set to continuously update and reflect the average user rating, with statistics given directly to the user so they could judge the relative popularity.
Users were given the option to rate STARS, GreenMetric, Princeton Review and Greenopia.
A slight modification was made from the original eleven criteria set forth by Maragakis & Dobbelsteen (2015). The criterion "popularity" was removed since the ratings statistics would imply the relative popularity level. Additionally, an overall rating was allowed for each assessment so that users could give overall feedback.
The fourteen criteria used for the assessment webpage were:
10. Does the assessment measure processes and motivations?
11. Does the assessment stress comprehensibility?
12. What is the full time employment rate of graduates with that specific degree within 12 months of graduation?
13. What is the average yearly compensation of graduates with that specific degree within 12 months of graduation?
14. What percent of graduates are employed within their desired field 12 months after graduation?
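The rating mechanism described in this section can be illustrated with a minimal sketch. The implementation of the original widget is not documented here, so the class name RatingBoard, its method names and the criterion label "overall" are assumptions made purely for illustration. The sketch only shows one way a 1-5 star average could be kept continuously up to date and reported back to the user together with the number of ratings received.

```python
from collections import defaultdict


class RatingBoard:
    """Hypothetical store of 1-5 star ratings per (assessment, criterion)."""

    def __init__(self):
        self._count = defaultdict(int)   # number of ratings received per key
        self._mean = defaultdict(float)  # running average per key

    def rate(self, assessment, criterion, stars):
        """Record one rating and return the updated statistics."""
        if not isinstance(stars, int) or not 1 <= stars <= 5:
            raise ValueError("stars must be an integer from 1 to 5")
        key = (assessment, criterion)
        self._count[key] += 1
        # Incremental mean: new_mean = old_mean + (x - old_mean) / n
        self._mean[key] += (stars - self._mean[key]) / self._count[key]
        return self.summary(assessment, criterion)

    def summary(self, assessment, criterion):
        """Average rating and rating count shown back to the user."""
        key = (assessment, criterion)
        return {"average": self._mean.get(key, 0.0),
                "ratings": self._count.get(key, 0)}


if __name__ == "__main__":
    board = RatingBoard()
    board.rate("STARS", "overall", 5)
    print(board.rate("STARS", "overall", 3))
```

Keeping the count next to the average mirrors the choice described above of showing statistics directly to the user, since the number of ratings stands in for the removed "popularity" criterion.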

Data Collection
A period of two months, from November 15, 2015 through January 15, 2016, was allowed for data collection. A digital campaign was initiated on December 20, 2015. The campaign consisted of posts on social websites such as Facebook and LinkedIn and an email sent to 110 people.

Website Results
After the two-month period, a total of 654 unique visitors had visited the website and generated 663 page views. The calculator page was by far the most popular, generating 430 views, with the assessment page generating 120 views, the home page 99 views and the resources page 14 views.

Assessment Results
The website views highlight that the bulk of the interest was in the calculator page. The calculator page gathered 65% of the total views, receiving almost 4 visits for every 1 visit to the assessment page or home page. This indicates that visitors were primarily interested in the calculator page and either went directly to it from links in the original solicitation for visits or were forwarded the specific page by other visitors.
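The share and ratio quoted above follow from the page-view counts reported in the website results. The short sketch below reproduces that arithmetic; the function names view_share and view_ratio are illustrative assumptions and are not part of the original study.

```python
# Page views reported for the two-month data collection period.
PAGE_VIEWS = {"calculator": 430, "assessment": 120, "home": 99, "resources": 14}


def view_share(page, views=PAGE_VIEWS):
    """Fraction of all page views that went to `page`."""
    return views[page] / sum(views.values())


def view_ratio(page_a, page_b, views=PAGE_VIEWS):
    """Views of `page_a` for every single view of `page_b`."""
    return views[page_a] / views[page_b]


if __name__ == "__main__":
    print(f"total views: {sum(PAGE_VIEWS.values())}")
    print(f"calculator share: {view_share('calculator'):.1%}")
    print(f"calculator vs assessment: {view_ratio('calculator', 'assessment'):.2f}")
    print(f"calculator vs home: {view_ratio('calculator', 'home'):.2f}")
```

The calculator page's share rounds to 65%, and its ratio to the assessment page (about 3.6) and to the home page (about 4.3) brackets the "almost 4 to 1" figure used in the text.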
The data also indicates that there was a significant difference in the response rate of the calculator versus the assessment page. While the calculator collected responses from 95% of its visitors, the assessment page collected complete responses from only 4% of its page visits.
Of the responses received, STARS received the best overall rating and the largest number of complete responses, as shown in Table 3.

Conclusion: the Need for a Universal Sustainability Assessment System
The primary purpose of this research was to validate the need for a universal assessment system in higher education. The research utilized empirical data to create a framework that provided stakeholders the ability to directly rate prominent sustainability assessment systems. Considering that a majority of respondents in Maragakis & Dobbelsteen (2013) had indicated that they conduct their own evaluation, this research was set up as a practical way of applying theoretical research and gaining real data.
Had the test generated ample results, use of the tool would have validated both the universal framework and whether STARS was indeed the assessment of choice amongst stakeholders. The results would have allowed for analysis and conclusions regarding the framework and the assessments. In practice, very little data was collected, which has restricted the primary purpose of this research to anecdotal conclusions rather than measurable results.
However, the lack of data collection has provided some unexpected interpretations and conclusions.
On the same website, during the same trial period, an economic calculator received significantly more visitors than the assessment page, at almost a 4 to 1 ratio. This may be an indication that the economic returns of higher education were more pertinent to a website visitor than the actual assessment system. This supports previous research indicating that economic returns are of paramount importance to stakeholders.
Respondents in this study were not only less interested in the assessment page, but were also highly unlikely to complete the rating form. The economic calculator gathered 408 responses compared to the 430 visitors, converting 95% of site visits to usable data. The assessment page collected a total of five complete ratings compared to the 120 visitors, converting just 4% of visits to usable data. There are many reasons that visitors may not have provided data. One interpretation is that, considering both the relatively low visitor rates and the low conversion rate of visitors to usable data, it can be inferred that the framework is not appealing to stakeholders. This lack of interest could originate from a variety of factors, including the complexity of the framework, the multitude of supporting literature that each assessment system has, or that the average user may not have the time or interest to provide meaningful feedback. No specific driver could be conclusively argued; however, the results do raise some questions regarding stakeholder perceptions.
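The two conversion rates above can be checked with a few lines of arithmetic. In the sketch below, the helper name conversion_rate is an assumption introduced for illustration, and page views are treated as visits, as the text does.

```python
def conversion_rate(complete_responses, page_visits):
    """Share of page visits that produced a complete, usable response."""
    if page_visits <= 0:
        raise ValueError("page_visits must be positive")
    return complete_responses / page_visits


# Counts reported in this study.
calculator_rate = conversion_rate(408, 430)
assessment_rate = conversion_rate(5, 120)

if __name__ == "__main__":
    print(f"calculator: {calculator_rate:.1%}")
    print(f"assessment: {assessment_rate:.1%}")
```

Rounded to whole percentages, these give the 95% and 4% reported above. With only five complete ratings, the assessment figure rests on a very small count, which is consistent with treating it as anecdotal rather than measurable evidence.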
In previous research, stakeholders claimed to spend time assessing institutions on their own, implying that they had working knowledge of an institution's initiatives and assessment systems. Considering the lack of results generated by this study, this interpretation of stakeholders' perceptions may merit further exploration. Specifically, more research should be done on what stakeholders actually need in order to understand an institution's sustainability. For example, it may be an unrealistic expectation that stakeholders understand the full scope and depth of knowledge supporting each sustainability assessment. Each assessment system has a group of knowledgeable professionals that create, support and justify its methodology, and it may be unrealistic to assume that the average stakeholder can review and interpret each assessment system.
The relatively low amount of data collected may also be explained by the psychological phenomenon of behavioral discounting. This occurs when individuals tend to engage in behaviors that have more immediate, short-term rewards, and "discount", or engage less in, behaviors that have distal, long-term rewards (Frederick et al., 2002). In the case of this study, while stakeholders report that the sustainability of an institution is an important metric, it may be viewed as a distal reward for future generations. Therefore, the ability to access a more immediate and personally salient reward, the economic calculator, may have created a situation in which the assessment of a sustainability framework, which would impact future generations, was "discounted".
The poor data collection does not provide conclusive results on the usability of the framework or recommendations on a preferred assessment system suitable for universal use. It can be argued that it offers empirical data supporting the view that there cannot be a universal assessment system. The debate thus far on the controversies of creating a universal system has been based on literature, opinion and little testing. This research provides a data point, albeit empirical, that a universal framework was not utilized. While the reason for the lack of utilization is not clearly identifiable, the lack of responses does provide a small piece of data that questions the need for a universal system.
The inability to collect data for this research while gathering significant data for economic returns points to stakeholder apathy towards driving the discussion surrounding sustainability assessments. Previous research also seems to indicate that sustainability may be a "want" more than a "need." One of the conclusions from Maragakis & Dobbelsteen (2013) highlighted that while 90% of students said that sustainability was an important part of their decision making, only 59% said that they would not attend an institution if it was unsustainable, which also supports that sustainability is desirable but not mandatory.
Regardless of how the results are interpreted, they do seem to support the conclusion by Selby et al. (2009) that rigorous institutional engagement with marketing of sustainability credentials provides a beneficial feedback loop that deepens and embeds the commitment and adherence by administrators, academics and students. The user ultimately discounted the framework at a grassroots level which leaves the ultimate responsibility on the creators of the various assessments as well as the institutions themselves to implement, improve and uphold sustainability initiatives and marketing material. A next step would be for institutions and assessment providers to work together and guide the average user to a simple, transparent and meaningful way of understanding what each sustainability assessment provides.

Discussion on Limitations and Uncertainties
Due to the methodology of the research, there is the potential for bias in the results. Promoting the survey through digital media may introduce bias based on the researcher's contacts and groups. Although the survey was promoted on various sites, there may have been a tendency to receive more responses from technical rather than social science stakeholders.
There was a limitation of the data collected for the framework due to the time limitations of this research. Because the research was conducted concurrently with the economic calculator, the original purpose of the research may have been impacted by behavioral discounting. Factors pressing to students, such as debt, are more salient due to their direct personal impact; the sustainability framework was therefore discounted in the presence of the economic calculator. This limited the collection of data regarding the framework and did not allow for the comprehensive testing required to achieve a more concrete result.
The results may also be biased based on the interpretations of the empirical data. There is not a clear understanding of why data was not collected, and any explanation is thus subject to the researcher's perspective.
There are also limitations on the usefulness of the rating system website itself. The site was not created by a professional website developer, which may have limited its usefulness on various devices, such as smartphones and tablets. While extensive tests were conducted to improve the user interface, the fact that so few reviews were collected may indicate that the tool itself was not aligned with the technological expectations of users.
Finally, there are other assessments that could have been utilized in this study. The selection of the assessments in this particular study is a reflection of the empirical data collected and is notably more reflective of North American preferences. While all the assessments in this study have a global reach, they may not necessarily reflect the prevalent assessment systems found within each country/continent.

Recommendations
The results indicate that stakeholders may not be interested in comparing assessment systems in depth, particularly in the presence of more personally salient tools, such as those assessing student debt. Further research should be conducted beyond empirical studies to see if there is a reason to create a universally acceptable sustainability assessment system, or if the current systems should be left as is to evolve organically into something that will be both utilized by institutions and supported by stakeholders.