Accurate fuel consumption estimation is crucial for efficient aviation operations and fuel management. However, limited access to detailed traffic and aircraft performance data leads researchers to rely on open data sources for estimation, which introduces uncertainties through the assumptions this requires. In this paper, we propose an uncertainty assessment of fuel-flow calculation using open data. By analyzing real-time flight data obtained from the OpenSky Network, comprising different aircraft types, we examine several commonly used hypotheses impacting fuel consumption. Variables such as flight altitude, airspeed, weight, and engine type all contribute to variations in fuel consumption. Our goal is to perform a variance-based global sensitivity analysis to show the degree of impact of these variables on the final fuel consumption. This study highlights the significance of open data for refining fuel-flow estimation methodologies and provides a valuable resource for researchers seeking to improve fuel consumption calculations and develop more accurate models.
Aviation industry sustainability has become a real concern in recent years due to increasing environmental awareness and stringent regulations. Fuel consumption, a key contributor to carbon emissions and operational costs in aviation, has attracted significant attention from researchers, practitioners, and policymakers alike. Accurate prediction and assessment of fuel consumption are critical not only for optimizing operational efficiency and cost-effectiveness but also for minimizing the environmental footprint of aviation activities.
Many methods for estimating fuel consumption on different scales [Seymour et al. 2020] often have to rely on assumptions and mean values – e.g., the weight of an average person in a flight – that may lack the necessary precision to account for the intricate interactions of diverse variables affecting aircraft performance. In response to these limitations, the integration of performance models based on real data has gained momentum [Sun et al. 2020] to take into account real-life scenarios.
One notable approach in this pursuit involves the use of performance models to generate simulated data that captures the complex dynamics [Sun et al. 2019] of aircraft behavior in various operational scenarios. These models are designed to encapsulate the relationships between parameters such as aircraft type, flight profile, and engine performance, thus providing useful insights into aircraft performance. By simulating a wide array of flight conditions, these performance models facilitate the generation of comprehensive datasets that span the potential operational envelope of different aircraft.
However, the inherent complexity of aviation systems introduces uncertainties that can propagate through the performance models, subsequently impacting the accuracy of fuel consumption predictions. In addition, private information not included in open databases, such as freight, can also introduce variability in end results. To address these challenges, this paper applies Sobol’ indices to systematically explore the influence of these flight variables on the estimation of fuel consumption. Sobol’ indices offer a robust framework for quantifying the relative importance of individual parameters and their interactions in contributing to the output uncertainty.
By conducting a Sobol’ indices experiment on generated flight profiles, we aim to provide a deeper understanding of how different factors impact fuel consumption uncertainty. This analysis not only sheds light on the key drivers of uncertainty but also informs of the necessary improvements to the performance models for better predictive capabilities. Furthermore, the insights gained from this analysis have the potential to guide decision-making processes related to aircraft operations, maintenance strategies, and fleet management.
The remainder of this paper is organized as follows: In Section 2, we review the existing literature on fuel consumption estimation in aviation, highlighting the limitations of conventional methods and the motivations for adopting data-driven approaches. Section 3 elaborates on the methodology employed, detailing the generation of simulated data using performance models and outlining the principles of Sobol’ indices for sensitivity analysis. Section 4 presents the results of the sensitivity analysis experiment, discussing the implications of the findings in terms of fuel consumption uncertainty. Finally, in Section 6, we conclude the paper by summarizing the contributions, discussing the broader implications of the research, and suggesting potential avenues for future work.
Early efforts to estimate fuel consumption in aviation primarily relied on mathematical equations based on aircraft type, distance traveled, and cruise altitude [Akcelik and Besley 2003; Collins 1982]. These methods, while practical to use, often oversimplified the underlying physics and neglected critical factors such as weather conditions, engine efficiency, and flight profiles. Consequently, their accuracy was limited, leading to significant discrepancies between predicted and actual fuel consumption.
Recognizing the shortcomings of traditional approaches, researchers have increasingly turned to computational models to simulate and predict fuel consumption more accurately. For instance, using neural networks, Trani et al. [Trani et al. 2004] created a model to evaluate the fuel consumption of a single F-100 aircraft. Similarly, Huang and Cheng [Huang and Cheng 2022] introduced an approach for the accurate estimation of aircraft fuel consumption based on Classification and Regression Trees and Neural Networks, using flight data from onboard flight data recorders (FDR) and automatic dependent surveillance–broadcast (ADS-B). Their findings demonstrate that the Classification and Regression Trees model is resilient to data with errors and missing values, and that ADS-B data is a cost-effective and convenient alternative to FDR data for estimating fuel consumption. Similar methods can also be found in other works [Dalmau et al. 2020].
Another way to evaluate the fuel consumption of flights is by leveraging performance models like the Base of Aircraft Data (BADA) [Nuic et al. 2010]. BADA is a widely recognized aircraft performance modeling system, developed and maintained by Eurocontrol. BADA plays a pivotal role in comprehensively assessing and predicting various aircraft performance characteristics, which is crucial in diverse applications across the aviation industry, including flight planning, airspace management, and fuel consumption. However, due to the use of commercially sensitive airline and constructor data, BADA cannot be open-sourced despite its valuable contributions to the aviation community. This restricted access to its data and algorithms poses a barrier to open and collaborative research efforts within the field of aircraft performance modeling.
The advent of aviation open-data repositories like the OpenSky Network [Schäfer et al. 2014], which are used in many recent works in aviation [Huang and Cheng 2022; Sun et al. 2022; Gloudemans 2016], mitigates these access-rights problems and has revolutionized the research landscape by providing access to a wealth of real-world flight data and operational records. Leveraging these datasets, researchers have been able to calibrate and validate their performance models, enhancing the reliability and generalization of their predictions. Additionally, these datasets enable the generation of synthetic data to cover a broader range of scenarios, augmenting the robustness of the models. This is exemplified by initiatives like OpenAP [Sun et al. 2020].
By leveraging the availability of flight scheduling data through the OAG dataset and the strength of the BADA model, Seymour et al. [Seymour et al. 2020] created a framework called FEAT (Fuel Estimation in Air Transportation). It is a two-component approach consisting of a high-fidelity flight profile simulator using BADA and a reduced order fuel consumption approximation based on the origin-destination airport pair and the aircraft type. The reduced model allows for timely and computationally efficient estimation of fuel consumption for globally scheduled aircraft movements for an entire year. They managed to approximate the total fuel consumption within a \(5\%\) margin against the fuel burn report. The main caveat with this approach is that different variables, such as the loading factors or the wind components, are fixed while having a substantial effect on the take-off weight and, thus, the consumed fuel.
As explained in the report by Sun et al. [Sun et al. 2022], the propensity for low error rates in these methodologies can be explained by the fact that they examine the global scheduled movements spanning an entire year. This phenomenon arises from the aggregation of numerous errors within the dataset, each tending to offset one another. For example, the influence of jet streams is evident in transatlantic flights, which, on average, exhibit different durations when traveling from west to east as opposed to the reverse direction. Consequently, when evaluating an individual flight, neglecting the wind component can result in a substantial discrepancy with reality, either underestimating or overestimating its consumption depending on the specific departure and arrival points. However, when computations are conducted based on airport pairs, irrespective of which airport is the origin and which is the destination, these individual variations tend to average out, mitigating the overall impact of such errors.
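The averaging effect described above can be illustrated with a small numerical sketch. It is not taken from the cited works: the still-air fuel figure, the size of the wind effect, and the function name `simulate` are all hypothetical values chosen only to show how a large per-flight error can coexist with a small aggregate error.

```python
import random


def simulate(n_flights=10_000, seed=0):
    """Compare per-flight and aggregate errors of a wind-free fuel estimate."""
    rng = random.Random(seed)
    no_wind_fuel = 30_000.0  # kg, hypothetical still-air estimate per flight
    wind_effect = 0.08       # hypothetical +/-8 % fuel change caused by wind
    per_flight_errors = []
    total_true = 0.0
    for k in range(n_flights):
        # Alternate directions: tailwind flights burn less, headwind flights more.
        sign = -1.0 if k % 2 == 0 else 1.0
        noise = rng.gauss(0.0, 0.01)  # small flight-to-flight scatter
        true_fuel = no_wind_fuel * (1.0 + sign * wind_effect + noise)
        per_flight_errors.append(abs(no_wind_fuel - true_fuel) / true_fuel)
        total_true += true_fuel
    mean_flight_error = sum(per_flight_errors) / n_flights
    aggregate_error = abs(n_flights * no_wind_fuel - total_true) / total_true
    return mean_flight_error, aggregate_error


if __name__ == "__main__":
    flight_err, total_err = simulate()
    print(f"mean per-flight error: {flight_err:.2%}")
    print(f"aggregate error:       {total_err:.2%}")
```

Under these assumptions, the relative error on a single flight stays close to the assumed wind effect, while the error on the summed fuel is far smaller because opposite-direction errors cancel.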
However, wind is not the only source of error when trying to assess the fuel consumption of aircraft based on open data. Identifying which input most affects the output can be essential to minimize such errors, and quantifying the uncertainties within these variables can help to better understand the results.
Assessing and quantifying the uncertainty associated with aircraft fuel consumption is critical and involves many different aspects of the calculation. Uncertainties about data sources, models, environmental factors, operation variability, or even the characteristics of the fuel used can all contribute to the variability of the consumption of one aircraft. Once all these different variations are determined, the uncertainty needs to be propagated through the chosen model using one of a wide range of techniques, such as Monte Carlo simulations. In aviation, some research exists in the domain of uncertainty assessment and sensitivity analysis, especially on fuel consumption. For instance, Vazquez et al. proposed two different studies [Vazquez and Rivas 2013; Vazquez et al. 2017] on flight uncertainties, one based on the initial mass of the aircraft, the other considering the wind uncertainties during cruising. They showed that the trajectories were not much impacted by the initial take-off mass and that the winds can heavily modify the trajectories. Similarly, Casado et al. [Casado et al. 2013] evaluated the uncertainties of the output of the BADA model based on the degradation of the aircraft over time. In this study, they consider that all aircraft can be contained in a plane formed between the trajectories computed using the nominal performance model and a degraded one, which simplifies the calculation of uncertainties due to the use of models like BADA. They showed, however, that these errors account for less than \(1\%\) of the estimated mass of the aircraft at the end of its journey. Other papers also assessed en-route uncertainties [Mondoloni 2006; Lee et al. 2009], but these uncertainty evaluations mostly account for the trajectory uncertainties of an aircraft rather than its fuel consumption.
All in all, while the literature [Kim et al. 2007] demonstrates promising advances in fuel consumption estimation using performance models and open data, a comprehensive exploration of the uncertainties associated with such predictions remains a notable research gap. This paper aims to bridge this gap by applying Sobol’ indices to generated data from performance models, thereby offering insights into the relative importance of the input parameters in influencing fuel consumption uncertainty. By identifying the main influencing factors within aviation, this research contributes to refining predictive models, optimizing operations, and supporting aviation sustainability assessment.
In this section, we introduce the method used to evaluate the fuel consumption of different types of aircraft and how we assess the origin of the different uncertainty sources linked to its calculation. We present how we generated our dataset, with which we performed a Monte Carlo simulation. Then, a sensitivity analysis is performed to evaluate the uncertainties linked to the different input variables.
Flight data forms the bedrock of empirical analysis when it comes to fuel estimation and exploration of its uncertainties. Though several databases offer access to flight data, it remains challenging to acquire seamless and complete flown data, especially for flights over non-covered areas such as the Atlantic Ocean. OpenSky Network [Schäfer et al. 2014], an ongoing project that provides a high-fidelity and open ADS-B sensor network, is a notable source of flight data. However, its direct application in our present study is hindered by, for instance, the unavailability of data over the Atlantic. This is primarily attributed to the nature of ADS-B technology, where an aircraft’s signal is detected by ground stations and then transmitted to the OpenSky Network. Over vast oceanic distances like the Atlantic, the signal often does not reach any ground station and, hence, fails to be captured in the database. While satellite coverage for ADS-B data exists (on websites like FlightRadar24), it is not openly accessible at the time of this study. Therefore, in order to have complete trajectories even in areas not covered by the OpenSky Network, we decided to create synthetic flight trajectories using OpenAP [Sun et al. 2020], a robust and free-to-use performance model for analyzing air transport impacts that can simulate flight trajectories accurately.
OpenAP offers a set of aircraft statistical models covering different aircraft aspects such as aerodynamics, engine performance, and mission performance, which are all essential for our analysis. This makes OpenAP a more viable option to build a comprehensive dataset of flights. Indeed, when employing Monte Carlo simulation later on for our research on fuel estimation, OpenAP will allow us to introduce randomness in our simulation model to simulate a variety of outcomes thanks to its kinematic model [Sun et al. 2019].
This work could have been carried out with other performance models, such as BADA, which, unlike OpenAP, is subject to a license. The appendix nevertheless presents fuel estimates obtained using BADA to calculate the fuel flow, in order to validate the results found with OpenAP.
In this research, OpenAP usage is two-fold, as described in Figure 1. Firstly, it is employed as a database of flight envelopes thanks to the kinematic model. For each representative real-world flight chosen in the OpenSky Network (e.g., an A320 from Toulouse to Paris or a B744 from London to New York), different variables impacting the fuel consumption, such as the cruise altitude or the speed during climbing are chosen to be used for trajectory generation. The main advantage of OpenAP is that each variable comes as a probability density function (PDF), which greatly helps for the uncertainty analysis.
Secondly, OpenAP is used for both the first and the final flight generation. These steps are alike and differ only in the estimate of the take-off mass. During the initial generation, an estimated weight is calculated based on the flight length and the model of the aircraft. This first draft is then iteratively refined to get a better fuel estimate, which is then used to calculate a more accurate payload. This new payload is then taken as input for the second generation, on which the calculation of the final fuel flow is based.
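The iterative refinement of the take-off mass can be pictured as a fixed-point iteration: the fuel needed depends on the mass, and the mass depends on the fuel loaded. The sketch below shows only this idea and is not the procedure implemented in this study. The function `trip_fuel` is a toy Breguet-style stand-in for a performance model (it is not the OpenAP fuel model), and the lumped constant, the empty weight, and the payload are hypothetical.

```python
import math


def trip_fuel(takeoff_mass_kg, range_km):
    """Toy Breguet-style trip fuel; a stand-in, NOT the OpenAP fuel model."""
    breguet_range_km = 25_000.0  # hypothetical lumped constant V * (L/D) / SFC
    return takeoff_mass_kg * (1.0 - math.exp(-range_km / breguet_range_km))


def refine_takeoff_mass(oew_kg, payload_kg, range_km, tol_kg=1.0, max_iter=50):
    """Iterate until the take-off mass and the fuel it implies are consistent."""
    mass = oew_kg + payload_kg  # first draft: no fuel accounted for yet
    fuel = 0.0
    for _ in range(max_iter):
        fuel = trip_fuel(mass, range_km)
        new_mass = oew_kg + payload_kg + fuel
        if abs(new_mass - mass) < tol_kg:
            return new_mass, fuel
        mass = new_mass
    return mass, fuel


if __name__ == "__main__":
    # Hypothetical narrow-body figures on a 2300 km mission.
    mass, fuel = refine_takeoff_mass(42_600.0, 15_000.0, 2_300.0)
    print(f"take-off mass: {mass:.0f} kg, trip fuel: {fuel:.0f} kg")
```

With this toy model the iteration contracts quickly, since each pass changes the mass by only a small fraction of the previous correction.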
Once the pipeline of input variables from the kinematic model is set, different flight scenarios are modeled with a Monte Carlo simulation. This technique, through repeated random sampling, enables us to predict not only the outcome of a range of possibilities but also to quantify the uncertainties associated with each decision input. With the results of the Monte Carlo simulations, we can finally compute uncertainties to rank the various inputs based on their impact on fuel consumption.
For this study, we used the Sobol’ indices [Sobol 2001] to analyze the influence of the input variables and to assess their uncertainty. The Sobol’ indices are a set of mathematical techniques used to understand the impact of input variables on the output of a model. They are particularly useful in sensitivity analysis, where the aim is to determine which inputs are most influential in determining the output.
In the context of our research, the Sobol’ indices were employed to quantify the contribution of different factors to the uncertainty in the calculation of aircraft fuel consumption. These factors include, but are not limited to, the weight of the aircraft, the cruise altitude at which it flies, and the equipped engine. Each of these factors has an associated uncertainty, and the Sobol’ indices allow us to determine which of these uncertainties have the largest effect on the overall uncertainty in the fuel consumption calculation. Given a model whose output can be described as a function \(Y = f(X_1, X_2, \dots, X_n)\), the sensitivity measure \(S_i\) (First-Order Sobol’ Index) of an input factor \(X_i\) can be described as follows [Saltelli et al. 2010]:
\[{S_i = \frac{\text{Var}_{X_i}[E_{X_{\sim i}}(Y\, |\, X_i)]}{\text{Var}(Y)}}\]
Where:
\(X_i\): \(i\)-th input factor.
\(X_{\sim i}\): set of all input factors except \(X_i\).
\(E_{X_{\sim i}}(Y\, |\, X_i)\): Mean of \(f(X)\) taken over all possible values of \(X_{\sim i}\) while keeping \(X_i\) fixed.
\(S_i\) is a normalised index (between \(0\) and \(1\)), as \(\text{Var}_{X_i}[E_{X_{\sim i}}(Y\, |\, X_i)]\) varies between zero and \(\text{Var}(Y)\). The First-Order Sobol’ Index evaluates the influence of the single input \(X_i\) on the output of the model. This index, however, does not provide information on how this input interacts with the other inputs \(X_{\sim i}\). To capture such interactions, one can consider higher-order indices. However, a full decomposition involves \(2^n - 1\) indices, which becomes computationally heavy to evaluate as the number of inputs rises. Instead, the Total Sobol’ Index \(S_{T_i}\) is measured. \(S_{T_i}\) evaluates the contribution of \(X_i\) to the output variance, including all variance caused by its interactions, of any order, with any other input variables.
\[{S_{T_i} = \frac{E_{X_{\sim i}}[\text{Var}_{X_i}(Y\, |\, X_{\sim i})]}{\text{Var}(Y)}}\]
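Two textbook cases, given here only as an illustration and not drawn from the experiments, show how the two indices differ. For an additive model \(Y = a_1 X_1 + a_2 X_2\) with independent inputs of variances \(\sigma_1^2\) and \(\sigma_2^2\), the conditional mean \(E_{X_{\sim i}}(Y\, |\, X_i)\) equals \(a_i X_i\) plus a constant, so that

\[{S_i = S_{T_i} = \frac{a_i^2 \sigma_i^2}{a_1^2 \sigma_1^2 + a_2^2 \sigma_2^2}, \qquad S_1 + S_2 = 1}\]

In contrast, for a purely interactive model \(Y = X_1 X_2\) with independent, zero-mean inputs, \(E_{X_{\sim i}}(Y\, |\, X_i) = 0\) and \(\text{Var}(Y) = \sigma_1^2 \sigma_2^2\), which gives

\[{S_1 = S_2 = 0, \qquad S_{T_1} = S_{T_2} = 1}\]

In the first case the first-order indices already explain all of the variance. In the second case only the total indices reveal that both inputs matter.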
The results of the Sobol’ indices analysis can provide valuable insights into the sources of uncertainty in the calculation of aircraft fuel consumption. They can help guide future research efforts by identifying the areas where reducing uncertainty could have the greatest impact on the accuracy of the fuel consumption calculations. Furthermore, they can inform decision-making by aircraft operators and regulators by highlighting the factors that contribute most to fuel consumption uncertainty.
To evaluate \(S_i\) and \(S_{T_i}\), different formulas can be found in the literature [Jansen 1999; Martinez 2011]. In their research, Saltelli et al. [Saltelli et al. 2010] presented ways to save computation time when calculating \(S_{T_i}\), e.g., by using low-discrepancy sequences instead of sequences of pseudo-random numbers (namely Quasi-Monte-Carlo [Caflisch 1998]). This is how the indices were calculated in our experiments.
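A minimal, self-contained sketch of such an estimation follows. It is not the code used in our experiments. It assumes the first-order estimator presented in [Saltelli et al. 2010] and the total-order estimator of [Jansen 1999], and, to stay within the Python standard library, it replaces Sobol’ sequences with a plain Halton sequence as the low-discrepancy generator. The toy model \(Y = X_1 + 2X_2\) with uniform inputs on \([0, 1]\) is hypothetical; its analytical indices are \(0.2\), \(0.8\), and \(0\) for an unused third input.

```python
PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in a given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton(n_points, dim, skip=20):
    """First n_points of a Halton sequence in [0, 1)^dim, after a short skip."""
    return [[radical_inverse(k, PRIMES[j]) for j in range(dim)]
            for k in range(skip + 1, skip + n_points + 1)]


def sobol_indices(model, dim, n_points=4096):
    """Estimate first-order and total-order indices for uniform inputs."""
    points = halton(n_points, 2 * dim)
    A = [p[:dim] for p in points]  # first sample matrix
    B = [p[dim:] for p in points]  # second, independent sample matrix
    fA = [model(x) for x in A]
    fB = [model(x) for x in B]
    all_y = fA + fB
    mean = sum(all_y) / len(all_y)
    var = sum((y - mean) ** 2 for y in all_y) / (len(all_y) - 1)
    first, total = [], []
    for i in range(dim):
        # A_B^i: matrix A with its i-th column taken from B.
        fABi = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        v_i = sum(yb * (yab - ya) for ya, yb, yab in zip(fA, fB, fABi)) / n_points
        e_i = sum((ya - yab) ** 2 for ya, yab in zip(fA, fABi)) / (2 * n_points)
        first.append(v_i / var)
        total.append(e_i / var)
    return first, total


def toy_model(x):
    """Hypothetical additive model; the third input has no effect."""
    return x[0] + 2.0 * x[1] + 0.0 * x[2]


if __name__ == "__main__":
    s1, st = sobol_indices(toy_model, dim=3)
    print("first-order:", [round(v, 3) for v in s1])
    print("total-order:", [round(v, 3) for v in st])
```

The cost is \(N(n + 2)\) model evaluations for \(n\) inputs, which is why the per-evaluation cost of a trajectory generation matters so much in practice.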
Based on the method previously presented, four different scenarios were identified for this research paper. Each of these scenarios is detailed in Table 1. While four scenarios may seem few, we believe that together they represent the main types of haul observed in reality through the OpenSky Network. For this experiment, the default type of engine set by OpenAP was chosen; however, the type of engine could also have been considered as an additional variable for a sensitivity analysis.
For each of these scenarios, the same variable set was used for the sensitivity analysis. These different variables are detailed in the following subsection.
Airport Pair | Avg. Cruise Distance | Aircraft type | Default Engine | Haul type |
---|---|---|---|---|
Toulouse - Amsterdam | 580 km | A320 | CFM56-5B4 | Small aircraft, short route |
Madrid - Moscow | 2300 km | A320 | CFM56-5B4 | Small aircraft, longer route |
Madrid - Moscow | 2300 km | A321 | CFM56-5B1 | Different aircraft, longer route |
London - Washington DC | 5600 km | B777 | GE90-115B | Bigger aircraft, intercontinental route |
The number of variables impacting the fuel consumption of an aircraft is overwhelming. It ranges from the type of aircraft, its engine, its age and maintenance, the duration of taxiing, the weather en route, etc. In addition, the multiple interdependencies of these variables should be considered, adding another layer of complexity to the sensitivity analysis. For this experiment, the choice of the different variables used was mainly motivated by what OpenAP could effectively use for its fuel flow calculation. Some variables, like weather-related conditions, have not yet been taken into consideration by the model. The variables used are the following:
Average Weight per Person (AWP): According to the ICAO statistics from 2009, the mass of each passenger plus associated luggage in a flight is taken to be 100 kg on average. However, depending on the size of the aircraft and its seat capacity, variation in this number can considerably alter the Take-Off Weight (TOW). For this reason, this variable is modeled by a truncated Normal distribution centered on 100 kg with a standard deviation of 0.2 and a minimum set to 80 kg. It is to be noted that the different scenarios in this work only considered passenger-only flights. Thus, to calculate the payload for a given flight, an AWP is drawn from the distribution and is then multiplied by the number of seats available in the aircraft.
Aircraft Load Factor (LF): With the seating capacity for each type estimated from the PlaneSpotter database, the number of onboard passengers can be deduced from an average load factor found in IATA reports. For 2019, the average load factor was 81.9%. However, similar to the AWP, the LF can heavily influence the TOW. For this reason, we also used a truncated Normal distribution centered on 81.9% with a standard deviation of 0.2 and a maximum set to 1 to evaluate the impact of the LF on fuel consumption. In this study, we did not consider the utilization of the extra space in the belly of larger jet aircraft to transport freight on passenger flights. According to Graver et al. [Graver et al. 2019], a global average passenger-to-freight factor of 85.1% could be considered when calculating the total payload of an aircraft.
Range Deviation: While the range of a mission is known in advance and often rounded to the great circle distance between the origin and destination airports, deviations can occur due to different factors, including meteorological issues, collision avoidance maneuvers, etc. While these deviations are usually minimal, we wanted to evaluate their impact. To do so, we used a truncated Normal distribution centered on the great circle distance (the one displayed in Table 1) for each scenario, limiting the deviation to a maximum of 5% of the total distance. The truncation ensures that no range goes under the great circle distance, as it is the actual minimum distance between the two airports.
Cruise Altitude: The evaluation of this variable can be of importance in the context of the environmental impact of aviation. Indeed, one of the main ways to prevent condensation trails, a major contributor to global warming during cruising, is to change flight level to avoid high humidity zones. Having the cruise altitude as a part of the sensitivity analysis will help in the trade-off evaluation between fuel consumption and condensation trail formation. The distribution for the cruise altitude, like other variables, is deduced from the OpenAP parameters calculated from real-life data. For instance, for the A320, the altitude follows a normal distribution with a mean of 10.92 and a standard deviation of 0.56, which is multiplied by 1000 to obtain the altitude in meters.
Descent Thrust: Most of the performance models default to an idle descent, which often leads to underestimating the true fuel flow during descent. To counter this effect, we introduce a descent thrust factor between 0 and 1, 0 being idle flight and 1 being max thrust.
For the sake of completeness, we also added five variables to our sensitivity analysis. These are the CAS and the Mach number during both climb and descent, and the Mach number during cruise. Each of these variables depends on the aircraft and follows a probability density function defined in the OpenAP kinematic model.
It is to be noted that for this experiment, the different variables are considered independent, as adding dependencies between variables adds another layer of complexity to the sensitivity analysis using Sobol’ indices. A future refined study could consider the addition of other variables such as the age of the aircraft [Dray 2013], freight, etc.
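The independent sampling of the inputs listed above can be sketched as follows. This is a hedged illustration, not the sampling code of this study. Several choices are assumptions made only for the example: the standard deviation of 0.2 given for the AWP is read as a fraction of the mean, the spread of the range deviation is hypothetical, the descent thrust factor is drawn uniformly, the lower bound of the load factor is set to 0, and the payload is computed as AWP times load factor times seat count. The function names and the seat count are also hypothetical.

```python
import random


def truncated_normal(rng, mean, std, low=float("-inf"), high=float("inf")):
    """Rejection sampling: redraw until the value falls inside [low, high]."""
    while True:
        x = rng.gauss(mean, std)
        if low <= x <= high:
            return x


def sample_inputs(rng, great_circle_km, seats):
    """Draw one independent set of inputs for a Monte Carlo iteration."""
    # Assumption: the 0.2 standard deviation is relative to the 100 kg mean.
    awp_kg = truncated_normal(rng, 100.0, 0.2 * 100.0, low=80.0)
    # Load factor centered on 81.9 %, capped at 1 (lower bound 0 is assumed).
    load_factor = truncated_normal(rng, 0.819, 0.2, low=0.0, high=1.0)
    # Range: never below the great circle distance, at most 5 % above it.
    # The 2 % spread is a hypothetical value.
    range_km = truncated_normal(
        rng, great_circle_km, 0.02 * great_circle_km,
        low=great_circle_km, high=1.05 * great_circle_km,
    )
    # A320 cruise altitude: Normal(10.92, 0.56), multiplied by 1000 for meters.
    cruise_alt_m = 1000.0 * rng.gauss(10.92, 0.56)
    # Assumption: descent thrust factor uniform between idle (0) and max (1).
    descent_thrust = rng.uniform(0.0, 1.0)
    return {
        "awp_kg": awp_kg,
        "load_factor": load_factor,
        "range_km": range_km,
        "cruise_alt_m": cruise_alt_m,
        "descent_thrust": descent_thrust,
        "payload_kg": awp_kg * load_factor * seats,  # assumed payload formula
    }


if __name__ == "__main__":
    rng = random.Random(42)
    print(sample_inputs(rng, great_circle_km=580.0, seats=180))
```

Rejection sampling is used here for simplicity; an inverse-CDF approach would be needed to combine these truncated distributions with a low-discrepancy sequence.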
Figure 2 presents the First-Order and Total-Order Sobol’ indices for the different scenarios chosen for this work. For each index, the 95% confidence interval of the estimate is also displayed [Saltelli et al. 2010]. As a reminder, the higher the index, the higher the impact of the feature on the final output of the model. A first reading of the different sensitivity analyses can be summarized as follows:
The main contributor to the variation of fuel consumption in these scenarios is the cruise altitude. This can be explained by the high uncertainty of this variable on generated trajectories, as many contributing factors in real life will modify the cruise altitude. It can be observed that in the sensitivity analysis of the longer-haul flight (Figure 2.d), the cruise altitude is twice as impactful as any other input variable. The impact of the cruise altitude uncertainty would actually almost disappear with the use of real-life trajectories, as the en-route altitude hardly changes once set. However, this can be an important deciding factor when altitude modifications are made to avoid, e.g., condensation trails.
Both the load factor and the AWP show a high impact on fuel consumption, as they are both determining factors of the take-off weight. On medium-haul flights (Figure 2.b and 2.c), the load factor appears to be the main input variable. These results confirm what Seymour et al. [Seymour et al. 2020] found in their own study. This means that the payload is a deciding factor in the fuel consumption of an aircraft, regardless of its model and its range. However, the payload uncertainties considered in this study are higher than those found in real life. Indeed, airlines have levers to avoid quasi-empty flights, such as carrying extra freight in the hold. This is further discussed in Section 5.
There is a discrepancy in the impact of the AWP and load factor between the longer-haul flight (Figure 2.d) and the other scenarios. This is explained firstly by the cruise altitude taking more and more importance the longer the flight is, and secondly by the extra consumption during take-off being averaged out for longer haul.
The different speed schedules do not seem to have a heavy impact on fuel consumption when compared to other variables. This can be explained by the low uncertainty linked to these variables, as they are very often tied to the type of aircraft and engine. For instance, for the A320, the CAS during climbing varied by only around 10 knots.
Two other results also emerge from these analyses: the low impact of both the uncertainty on the cruising range and of the descent thrust factor. Both of these factors directly impact the fuel consumed during a flight due to their very nature. However, it seems that the uncertainties revolving around the take-off weight and the cruise altitude are far more impactful.
As a secondary result, the sensitivity analysis allowed for some comparisons between the short-haul and the medium-haul missions using the same aircraft. Figure 3 presents the consumption in liters of kerosene per 100 Revenue Passenger Kilometres (RPK) depending on the load factor for the two A320 scenarios. We can observe that, regardless of the duration, both follow an inverse proportionality trend. This result can also be found in other scenarios. This shows that even though the load factor accounts for at most \(10\%\) of the fuel consumption of a given flight on average, when looking at the liters of kerosene per person, the load factor can matter up to three times more in a half-empty flight than in a full one. The figure also shows how the range of a flight impacts the consumption per passenger, as the consumption is higher for the \(580\) kilometer flights than for the \(2300\) kilometer ones. This is due to the heavy consumption during take-off being spread over more kilometers on longer routes.
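The inverse proportionality trend follows directly from how a per-RPK consumption is defined. As a sketch, with notation introduced here only for illustration (\(F\) the trip fuel in liters, \(N_{\text{seats}}\) the seat capacity, \(d\) the flown distance in kilometers, and \(LF\) the load factor), the consumption per 100 RPK can be written as

\[{c(LF) = \frac{100\, F(LF)}{LF \cdot N_{\text{seats}} \cdot d}}\]

If the trip fuel \(F(LF)\) varies only weakly with the load factor, the numerator is nearly constant and \(c(LF)\) behaves approximately as \(1/LF\), whatever the distance \(d\).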
Concerning the uncertainty analysis, results about the cruise altitude are relatively limited. Indeed, in most cases, the intent for the cruise altitude is known and rarely changes in any significant way during flight. As for the sensitivity analysis, the Sobol’ indices in Figure 2 show that the cruise altitude has a major impact on both short-haul and long-haul missions. This can play a role in the decision to modify the altitude of a given flight to avoid the generation of condensation trails. While the extra release of CO2 from fuel consumption might not have as high of an impact as the creation of condensation trails, we hope that this study will help in assessing the comparison.
For this work, we focused on variance-based sensitivity analysis, namely Sobol’s method, which allowed the full exploration of the input space through the Monte Carlo method. To use it, we iterated \(4000\) times for each pairing of a scenario with a chosen set of variables. As shown in Figure 1, each iteration includes two generations. Each generation consists of data points that are calculated sequentially, as the fuel consumed up to the previous point has to be known to calculate the new weight of the aircraft. For this reason, we only generated one data point every minute to avoid higher calculation times, especially on the longest route.
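The sequential dependency between data points can be sketched as a simple time-stepping loop. This is an illustration only: `toy_fuel_flow` is a hypothetical stand-in for the fuel-flow model (in the study, this role is played by the performance model), and the coefficient and masses are made up. The one-minute step matches the sampling rate mentioned above.

```python
def toy_fuel_flow(mass_kg):
    """Hypothetical fuel flow in kg/s, proportional to the current mass."""
    return 1.2e-5 * mass_kg


def integrate_fuel(takeoff_mass_kg, duration_min, fuel_flow, dt_s=60.0):
    """Step through the flight; each point needs the mass left by the previous one."""
    mass = takeoff_mass_kg
    burned = 0.0
    for _ in range(int(duration_min * 60 / dt_s)):
        step = fuel_flow(mass) * dt_s  # fuel burned during this time step, in kg
        mass -= step                   # lighter aircraft for the next point
        burned += step
    return burned, mass


if __name__ == "__main__":
    burned, landing_mass = integrate_fuel(65_000.0, 180, toy_fuel_flow)
    print(f"fuel burned: {burned:.0f} kg, landing mass: {landing_mass:.0f} kg")
```

Because each step depends on the previous one, the loop cannot be parallelized along the trajectory, so the cost of one iteration grows linearly with flight duration and with the sampling rate.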
The long calculation time for each loop calls into question the relevance of using the variance-based method for the global sensitivity analysis of fuel consumption. Indeed, as we could see in some instances, the confidence interval of some indices was slightly outside the [0,1] interval, which is not supposed to happen for Sobol’ indices due to their very nature. This happens if the estimator has not fully converged. For the same reason, some first-order index estimates can be greater than the corresponding total-order index estimates. A higher number of iterations could help convergence.
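The link between sample size and confidence interval width can be illustrated with a percentile bootstrap, used here as an assumed, commonly adopted way of obtaining such intervals rather than as the exact procedure of this study. The toy model \(Y = X_1 + 2X_2\) with uniform inputs, the sample sizes, and the function names are hypothetical; the first-order estimator is the one presented in [Saltelli et al. 2010], applied to \(X_1\).

```python
import random


def model(x):
    """Hypothetical additive model; the true first-order index of X1 is 0.2."""
    return x[0] + 2.0 * x[1]


def first_order_rows(n, rng):
    """Per-row triples (f(A), f(B), f(A_B^1)) needed by the estimator for X1."""
    rows = []
    for _ in range(n):
        a = [rng.random(), rng.random()]
        b = [rng.random(), rng.random()]
        rows.append((model(a), model(b), model([b[0], a[1]])))
    return rows


def estimate(rows):
    """First-order index of X1 from a list of (f(A), f(B), f(A_B^1)) triples."""
    ys = [r[0] for r in rows] + [r[1] for r in rows]
    mean = sum(ys) / len(ys)
    var = sum((y - mean) ** 2 for y in ys) / (len(ys) - 1)
    return sum(yb * (yab - ya) for ya, yb, yab in rows) / len(rows) / var


def bootstrap_ci(rows, rng, n_boot=200, level=0.95):
    """Percentile bootstrap: resample rows with replacement and re-estimate."""
    stats = sorted(
        estimate(rng.choices(rows, k=len(rows))) for _ in range(n_boot)
    )
    low = stats[int((1 - level) / 2 * n_boot)]
    high = stats[int((1 + level) / 2 * n_boot) - 1]
    return low, high


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 3200):
        low, high = bootstrap_ci(first_order_rows(n, rng), rng)
        print(f"N = {n:5d}: 95% CI = [{low:.3f}, {high:.3f}]")
```

With few samples, the interval is wide and may extend below zero even though the index itself cannot be negative; it narrows as the number of iterations grows.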
The other challenge for this sensitivity analysis is the dependencies between the different input variables. Indeed, calculating input uncertainty through independent marginal distribution functions might not be adequate when inputs are tightly dependent. Taking into account dependencies, e.g., between the cruise altitude and the cruise speed, could help to grasp the intricacies between the inputs and output of the model. This could be addressed by using other methods [Xu and Gertner 2008], but they are not as straightforward to implement. Some works can be found in the literature about working with Sobol’ indices when the input variables are dependent on each other [Kucherenko et al. 2012]. While this does not impair the results of this initial study, as there are no strong dependencies, a correct procedure to tackle the dependency would require sampling from the joint and conditional distribution functions of the inputs, which are not available. Using Shapley effects for the sensitivity analysis [Iooss and Prieur 2019] to mitigate the effect of the input dependency could be explored in future works.
Because a variance-based sensitivity analysis requires many trajectories, the direct use of real-world trajectories from the OpenSky Network would bring several limitations. As already explained, for longer routes, it is not straightforward to find a dataset of complete trajectories without using interpolation methods. The other main issue with real-world trajectories is the addition of uncontrollable variables that might impact the results of our analysis, such as wind effects or longer re-routing due to geopolitical issues. Lastly, using real-life trajectories means we cannot choose how input features are sampled from their distributions, which might complicate the use of a variance-based approach.
However, while the generation capabilities of OpenAP solve these problems, they add another layer of uncertainty in the generation itself, as some required parameters, such as the engine model, are rarely known. In these cases, we rely on the OpenAP defaults, which do not always reflect reality.
In this work, we presented a way to use open data and open models to evaluate the uncertainties surrounding the calculation of fuel consumption for a few representative scenarios. Using a Monte Carlo simulation and Sobol’ indices, we showed that performance models like OpenAP are more sensitive to some input variables than to others. Similar findings with the BADA performance model in the appendix corroborate the results of the initial experiments. The results show that, apart from the cruise altitude of the flight, the two input variables with the most impact on the fuel consumption of a given mission are the load factor and the average weight per person. This conclusion, obtained solely with open data, is similar to that of Seymour et al. [Seymour et al. 2020] and their take-off weight estimation.
A direct follow-up to this paper could be the calculation of the overall fuel consumption of air transportation, including this sensitivity analysis, to verify whether the different estimates found in the literature stay within the calculated uncertainties. The end result would be two-fold: first, to check whether the assumptions made by other authors are acceptable when considering uncertainties, and second, to verify whether uncertainties calculated on a single flight can be used for a wider study. One of the challenges of this future work would be to estimate how far individual variations average each other out, mitigating the overall impact of errors.
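The averaging effect mentioned above can be quantified under a simplifying assumption. The sketch below assumes identical flights whose fuel errors share a common pairwise correlation `rho` (a hypothetical parameter): the variance of the total is then \(N\sigma^2(1 + (N-1)\rho)\), so independent errors shrink as \(1/\sqrt{N}\) while fully shared errors, such as a common bias in the assumed passenger weight, do not shrink at all.

```python
import math

def relative_std_of_total(cv, n_flights, rho=0.0):
    """Relative standard deviation of the fuel summed over n identical flights.

    cv  : per-flight coefficient of variation (std / mean)
    rho : assumed pairwise correlation between per-flight errors (0 = independent)
    """
    return cv * math.sqrt((1.0 + (n_flights - 1) * rho) / n_flights)

# Illustrative values only: 10% per-flight uncertainty over 10,000 flights.
for rho in (0.0, 0.1, 1.0):
    total_cv = relative_std_of_total(0.10, 10_000, rho)
    print(f"rho={rho:.1f}  relative std of total = {total_cv:.4f}")
```

Even a small shared component dominates at fleet scale, which is why separating flight-specific variability from assumptions common to all flights would be a key step of such a study.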
Antoine Chevrot: Conceptualisation, Methodology, Software, Validation, Writing – Original draft
Luis Basora: Software, Writing – Review & Editing
All the experiments of this paper can be found on GitHub (see footnote 6). Note that all the evaluations in the main paper are shareable and reproducible, as they rely entirely on open-source tools like OpenAP. The complementary analysis in the appendix mostly uses BADA, a proprietary model of Eurocontrol, and is presented in this paper only to validate what has been found with OpenAP.
One of the main limitations of the main body of this work is the use of a single performance model, namely OpenAP. While a larger study using other aircraft types and different scenarios could reduce errors, it would not make model biases disappear completely. To check whether the findings of the uncertainty analysis in this paper stem from fuel consumption itself or from OpenAP, we present a limited experiment with BADA to see whether similar results can be obtained.
For this first experiment with BADA, we used the same probability density functions for the input variables as with OpenAP, while respecting any physical limitations set by BADA. The results of the sensitivity analysis with BADA can be found in Figure 4.
To avoid any direct comparison between the two performance models, we used a different aircraft for each of the scenarios described in Section 4. The main objective of this appendix is to show that Sobol’ indices of a similar order of magnitude are found regardless of the performance model used. As shown in the two analyses, the conclusions are quite similar to the ones in the main body of the paper. The inputs driving the take-off weight, namely the average person weight and the load factor, are the main contributors to the fuel consumption for small aircraft on short missions, while the cruise altitude becomes predominant on long-haul flights.