Rationalizing the structure of the active sites in heterogeneous catalysts using X-ray absorption and infrared spectroscopies strengthened by machine learning algorithms

The active sites in heterogeneous catalysts consist of highly dispersed metal species, whose structure is difficult to probe because of their low concentration. Moreover, the structure of the active sites often evolves under reaction conditions. Therefore, a combination of operando techniques that are sensitive to the local geometric and electronic structure as well as to the binding ligands (X-ray absorption spectroscopy (XAS), infrared (IR) spectroscopy and Raman spectroscopy) is the method of choice in the catalysis community. Even with the plethora of operando techniques available nowadays, rationalizing the structure of active sites from spectroscopic data remains very challenging.

We aim to develop and verify a new methodology for uncovering the structure of active, highly dispersed cationic and metallic species using XAS and IR methods combined with machine learning algorithms. The approach will start with constructing an experimental spectral database of reference compounds with known structure, surface species with less well-known structure, and intermediate species identified under operando conditions. Calculations will be performed both for the reference compounds and for many more possible intermediate species. We will use the accurate finite difference method and density functional theory to calculate XANES spectra, vibrational frequencies and IR amplitudes. Based on this theoretical database, we will implement machine learning algorithms that should find hidden dependencies between the spectra and the structural parameters. The main challenge of such an approach is its application to real experimental data, where systematic differences between experimental and theoretical spectra might exist. These differences comprise frequency shifts, variations between transmission and reflectance modes, self-absorption effects and so on. To overcome these difficulties, we propose two approaches. In the first one, the neural network will be trained on a theoretical dataset that includes simulated experimental artefacts and different levels of theoretical approximation. The second approach will rely not on all spectral points but on descriptors. Descriptors of structure include the radial distribution function, average bond length, types of ligands, valence, and local symmetry. Descriptors of spectra are functions of the spectral points: position and area of the pre-edge, edge energy, white line intensity, positions of maxima and minima, average value of the derivative, and principal components.
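As an illustration of the descriptor-based approach, the following minimal Python sketch extracts a few of the spectral descriptors named above (edge energy, white line intensity, average derivative) and mimics one simple artefact, a rigid energy shift. Everything in it is an assumption made for illustration: the toy spectrum (an arctangent step plus a Gaussian white line), the function names, and the choice to define the edge energy as the maximum of the first derivative are not taken from the project itself.

```python
import numpy as np


def synthetic_xanes(energy, edge=8985.0, width=1.5,
                    wl_height=0.6, wl_pos=8995.0, wl_width=3.0):
    """Toy normalized spectrum (hypothetical): arctangent step + Gaussian white line."""
    step = 0.5 + np.arctan((energy - edge) / width) / np.pi
    white_line = wl_height * np.exp(-0.5 * ((energy - wl_pos) / wl_width) ** 2)
    return step + white_line


def spectral_descriptors(energy, mu):
    """Return a few scalar descriptors of a spectrum mu(energy)."""
    deriv = np.gradient(mu, energy)      # numerical first derivative
    i_edge = int(np.argmax(deriv))       # assumption: edge = maximum slope
    i_wl = int(np.argmax(mu))            # white line = global maximum
    return {
        "edge_energy": float(energy[i_edge]),
        "white_line_energy": float(energy[i_wl]),
        "white_line_intensity": float(mu[i_wl]),
        "mean_derivative": float(deriv.mean()),
    }


def shift_energy(energy, mu, shift):
    """Simulate an energy misalignment: features move by `shift` eV on the same grid."""
    return np.interp(energy, energy + shift, mu)


energy = np.linspace(8960.0, 9040.0, 801)   # 0.1 eV grid, illustrative range
mu = synthetic_xanes(energy)

desc = spectral_descriptors(energy, mu)
shifted = spectral_descriptors(energy, shift_energy(energy, mu, 2.0))

print(desc)
print(shifted)
```

In this toy case the edge and white line positions follow the imposed 2 eV shift while the white line intensity stays essentially unchanged, which hints at why a descriptor set may be less sensitive to some systematic differences than the full set of spectral points.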

The practical implementation will be based on catalytically relevant systems to provide an exploratory overview of the possible benefits and limitations. We will use the K-edge XANES data of 3d elements (V, Fe, Cu) in different local environments and the IR spectra of CO probe molecules adsorbed on single sites or metal-alloy nanoparticles (Cu, Pd, Pt-Fe, Cu-Ga, Pd-Ga). These topics are relevant for the ongoing projects of the Swiss PA   and the Operando spectroscopy group at PSI, which relate to selective alcohol oxidation on vanadium-based catalysts, CO oxidation and CO2 hydrogenation on bimetallic catalysts, and NOx reduction on Cu- and Fe-loaded zeolites.

The output of the project will be a high-quality spectral database, available to a wide community, together with data analysis tools trained to predict the structure of a 3d metal site or nanoparticle from the submitted spectrum of an unknown compound.


Paul Scherrer Institute

Southern Federal University

ETH Zurich