Snapshots of the temperature and soot distributions in a temporal jet flame DNS are shown in Fig. 1 and Fig. 2. Soot is strongly intermittent and forms long, filament-like structures. Using the optimal estimator concept, a data-driven model for the filtered soot intermittency can be constructed from a chosen set of input parameters. Because the performance of the data-driven model depends strongly on this choice, the same analysis can be used to identify the optimal input parameters. Fig. 3 assesses these models against the DNS data for different sets of input parameters Π_{i}. Note that the numerical techniques used in an optimal estimator analysis can themselves introduce additional error. Among the many techniques we tested, Artificial Neural Networks (ANN) and Multivariate Adaptive Regression Splines (MARS) yield the most accurate results and allow a reliable assessment of model performance. Further details on the DNS can be found in Ref. [1]; the optimal estimator analysis is described in Ref. [2].
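For reference, the optimal estimator concept can be stated compactly. The notation is an assumption of this sketch, since the text above introduces only Π_{i}: q denotes the quantity to be modeled (here the filtered soot intermittency) and ⟨·⟩ an average over the DNS samples.

$$
q_{\mathrm{opt}}(\Pi_i) = \langle q \mid \Pi_i \rangle, \qquad
\epsilon_{\mathrm{irr}}^2(\Pi_i) = \left\langle \left( q - \langle q \mid \Pi_i \rangle \right)^2 \right\rangle
$$

No model that takes only Π_{i} as input can have a mean squared error below ε_irr², so comparing ε_irr² across candidate sets ranks the input parameters independently of any particular functional form.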

# Data Analysis of Large-Scale Simulations

Our group has conducted several large-scale Direct Numerical Simulations (DNS) on some of the largest supercomputers worldwide. These DNS address a range of physical problems, including turbulence/flame interactions, emissions formation in highly turbulent flames, and flame instabilities. Because DNS resolves the flow in full detail, any quantity of interest is available and its dependence on other quantities can be studied rigorously. However, the tremendous amount of data generated by DNS makes it challenging to extract the relevant physics.

To analyze such large-scale datasets, our group employs systematic tools such as the optimal estimator concept. This enables the rigorous development of new models for turbulent reacting flows; for example, the optimal input parameters for modeling a quantity of interest can be identified. Furthermore, data-driven models can be derived from the data and used to guide the development of new, physically motivated models. Artificial Neural Networks (ANN) have been found to compute optimal estimators more accurately than other techniques such as histograms. The advantage of ANN is particularly significant when multi-parameter models are developed. Currently, our group is employing deep learning algorithms to enhance inference from the data.
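As an illustration of how an optimal estimator analysis ranks candidate input parameters, the sketch below estimates the irreducible error with the simple histogram technique mentioned above. It is a minimal example on synthetic data, not the group's implementation: the function `irreducible_error`, the variables `a` and `b`, and the test function for `q` are all hypothetical and chosen only for demonstration.

```python
import math
import random
from collections import defaultdict


def irreducible_error(inputs, target, n_bins=20):
    """Histogram estimate of the optimal-estimator (irreducible) error.

    inputs: list of tuples, one tuple of input parameters per sample
    target: list of floats, the quantity of interest per sample
    Returns mean((target - E[target | inputs])**2), where the conditional
    mean is approximated by the sample mean inside each histogram bin.
    """
    n_dims = len(inputs[0])
    lows = [min(x[d] for x in inputs) for d in range(n_dims)]
    highs = [max(x[d] for x in inputs) for d in range(n_dims)]

    def bin_index(x):
        idx = []
        for d in range(n_dims):
            # Fall back to unit width if all samples share one value.
            width = (highs[d] - lows[d]) / n_bins or 1.0
            idx.append(min(int((x[d] - lows[d]) / width), n_bins - 1))
        return tuple(idx)

    keys = [bin_index(x) for x in inputs]
    sums = defaultdict(float)
    counts = defaultdict(int)
    for k, q in zip(keys, target):
        sums[k] += q
        counts[k] += 1

    # Squared deviation of each sample from its bin-wise conditional mean.
    sq_err = sum((q - sums[k] / counts[k]) ** 2 for k, q in zip(keys, target))
    return sq_err / len(target)


# Synthetic data (hypothetical): q depends on both a and b, plus noise.
rng = random.Random(0)
n_samples = 20000
a = [rng.random() for _ in range(n_samples)]
b = [rng.random() for _ in range(n_samples)]
q = [math.sin(2.0 * math.pi * ai) + bi ** 2 + rng.gauss(0.0, 0.05)
     for ai, bi in zip(a, b)]

# Two candidate input parameter sets: (a) alone versus (a, b) jointly.
err_single = irreducible_error([(ai,) for ai in a], q)
err_joint = irreducible_error(list(zip(a, b)), q)

print("irreducible error with (a):   ", err_single)
print("irreducible error with (a, b):", err_joint)
```

The set with the lower irreducible error is the better choice of input parameters. The number of histogram bins grows exponentially with the number of inputs, so each bin holds fewer samples and the estimate degrades; this is one way the estimation technique itself can introduce error in multi-parameter analyses.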

# Analysis of a Large-Scale Sooting Jet Flame DNS

[1] A. Attili, F. Bisetti, M. E. Mueller and H. Pitsch. Formation, growth, and transport of soot in a three-dimensional turbulent non-premixed jet flame. Combustion and Flame, vol. 161, pages 1849-1865, 2014.

[2] L. Berger, K. Kleinheinz, A. Attili, F. Bisetti, H. Pitsch and M. E. Mueller. Numerically accurate computational techniques for optimal estimator analyses of multi-parameter models. Combustion Theory and Modelling, 2018.