AI-Powered Exploration of Advanced Polymer Electrodes for Sustainable Battery Technologies

AI-Driven Discovery of High-Performance Polymer Electrodes for Next-Generation Batteries

Abstract

The reliance on metals such as lithium, cobalt, and nickel in electric batteries raises significant environmental concerns due to their extensive use and the associated mining activities. A promising alternative lies in the use of redox-active organic materials, which can significantly reduce the carbon footprint of batteries. However, this approach is hampered by the limited availability of suitable redox-active organic materials and by their lower electronic conductivity, voltage, specific capacity, and long-term stability. To address these challenges, we have developed and implemented a machine learning (ML) driven battery informatics framework. This framework leverages a comprehensive battery dataset and advanced ML techniques to expedite and improve the identification, optimization, and design of redox-active organic materials. In this study, we present a data-fusion ML model with meta-learning capabilities that predicts battery properties such as voltage and specific capacity for various combinations of organic negative electrodes and charge carriers (positive electrode materials). The ML models not only support experimentation but also facilitate the inverse design of battery materials, enabling the identification of suitable candidates from three extensive material libraries to promote sustainable energy-storage technologies.

Keywords: data-fusion, multi-task machine learning, organic materials, batteries, energy-storage, meta-learning

1. Introduction

To reduce dependence on fossil fuels, electric batteries have emerged as a significant focus of modern scientific research. However, studies indicate that the soaring demand for electric vehicles (EVs) cannot be met solely through the use of transition metals due to the intensive mining required and the depletion of these resources. Redox-active organic materials can play a crucial role in addressing both the heavy reliance on fossil fuels and the increasing demand for metals like lithium, cobalt, and nickel in conventional batteries. These organic materials exhibit a wide variety of chemistries, structures, and applications for energy storage and mobility, providing versatility, high-rate performance, and high theoretical capacity.

Figure 1: Our battery-informatics workflow. (a) Data collection pipeline: we gather SMILES (Simplified Molecular Input Line Entry System) strings of organic negative electrodes corresponding to different charge carriers (positive electrode materials). This is followed by converting the SMILES strings into a numerical format. (b) Architecture of our multi-task machine learning (MT-ML) predictors. MT-ML models are trained to predict multiple properties concerning various battery components. The variation of charge carriers, properties, and organic material classes (polymers/molecules) is represented using a selector vector. The fingerprints are concatenated with the selector vector and serve as inputs to the MT-ML model. (c) Meta learners are trained on a holdout dataset, using outputs from multiple MT-ML models as inputs, with the output being property values (voltage and specific capacity). (d) The inverse design approach employs reference organic materials that demonstrate higher battery performance, stability, or biodegradability. We iteratively add redox-active moieties or replace elements and bonds at various positions within the organic materials to create a library of millions of candidates, screening for those with higher voltage and specific capacity using the proposed meta-learner model.

Currently, approximately two hundred redox-active organic materials are utilized as electrodes in batteries. However, they are limited by challenges such as dissolution in electrolytes, poor electrical conductivity, or low volumetric density. To tackle these issues, we require novel organic materials with improved electrochemical performance. The vast chemical space of redox-active organic materials makes conventional approaches, like high-throughput experimentation or combinatorial chemistry, time-consuming and costly for identifying suitable candidates. ML methods provide a robust platform for navigating this extensive chemical space and designing materials with desirable properties. They have proven successful in exploring material spaces in inorganic energy-storage applications (primarily batteries and supercapacitors). In these ML models, compositional, elemental, and structural parameters serve as inputs, and properties such as discharge capacity are the predicted outputs.

For a unique and machine-readable representation of chemical structures, SMILES strings are often employed to encode chemistries, including side chains, branches, rings, and chemical bonds. Tools like polyBERT and Morgan fingerprints convert SMILES strings into vectors that serve as numerical inputs for ML models. Previous investigations have utilized these fingerprinting approaches to predict diverse properties, such as the HOMO-LUMO gap, atomization energy, and redox potential. However, limited research has focused on predicting experimental properties such as voltage for organic batteries. Data scarcity presents a significant limitation for training ML models on experimental properties, but advanced learning approaches like multi-task, multi-fidelity, and transfer learning provide contemporary solutions to address these challenges.

In this contribution, we use an ML approach to screen and identify high-voltage and high-specific-capacity redox-active polymers for electrochemical applications. We represent the structures of organic materials (molecules and polymers) in our dataset using SMILES strings. A key component of our workflow is the implementation of polyBERT as a fingerprinting tool, which transforms SMILES strings into numerical representations suitable for ML models. We develop a data-fusion MT-ML model, followed by a meta-learner, to predict voltage and specific capacity across varying charge carriers. This multi-step data-fusion methodology demonstrates improved generalizability and better performance metrics, including higher coefficient of determination (R²) values and lower RMSE (Root Mean-Square Error). By employing an inverse design methodology, we create three large libraries of polymeric materials and propose new redox-active polymer candidates with maximum energy density, facilitating the efficient discovery and development of novel materials for battery applications.

2. Results and Discussion

Dataset

We compiled a dataset comprising 771 data points for training our multi-step data-fusion ML models. The dataset includes both redox-active and non-redox active organic materials as negative electrodes, along with their electrochemical properties (voltage in volts and specific capacity in milliamp hours per gram). For non-redox active materials, the dataset features 128 data points for both properties, totaling 256 data points. These data points are evenly divided for both properties concerning lithium and sodium charge carriers (positive electrode materials), with 85 points for each property and positive electrode for both polymers and molecules. Including non-redox active materials in the training dataset allows the models to differentiate between materials that can and cannot undergo redox reactions. The redox-active materials dataset is more extensive, containing 336 data points for specific capacity and 179 data points for voltage measurements, totaling 515 data points. Notably, 66% of the data points (338 out of 506) for specific capacity are theoretically calculated, enabling the ML models to learn inherent correlations between theoretical and measured discharge capacity. This theoretical capacity is encoded using the selector vector, allowing the model to effectively integrate and interpret the relationship between theoretical predictions and experimental measurements.

Table 1: Synopsis of the dataset for the battery property prediction models. This table outlines the number of data points for different charge carriers (positive electrode materials) and polymer or molecule negative electrodes for voltage (V) and specific capacity (Sc).

Model Performance

Using our dataset of 771 data points, we train multi-task and meta-learning models for predicting voltage and specific capacity. The comparative performance analysis of these models is illustrated in Figure 3. The parity plots for the MT-ML models, derived from results averaged across the test sets of all five folds, demonstrate that the models accurately predict multiple electrochemical properties. A single MT-ML model is trained on multiple properties simultaneously, which improves predictive efficiency and mitigates data scarcity by eliminating the need to train individual models for each property.

The MT-ML models achieve RMSE values of 0.43 V for voltage and 70.9 mAh/g for specific capacity. The voltage predictions for lithium charge carriers and polymeric negative electrodes are closer to the parity line compared to their molecular counterparts. Conversely, the prediction of specific capacity performs better for molecular negative electrodes and lithium charge carriers. Our multi-task models demonstrate robust performance across diverse electrode materials, successfully predicting properties for various charge carriers and organic material classes.

We subsequently implement and train meta-learning techniques to enhance the predictive capabilities of our MT-ML framework. The resulting parity plots showcase the exceptional generalization capacity of our meta-learner models, achieving high R² values of 0.99 and 0.95 for voltage and specific capacity, respectively. A comparative analysis between multi-task and meta-learner models reveals significant improvements in alignment with the parity line and enhanced predictions for charge carriers with limited data points. The strong improvement in predictive accuracy from multi-task to meta-learner models highlights the efficiency of our multi-step data-fusion approach in addressing the complexities of electrochemical property predictions.
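The two metrics quoted in this section follow standard definitions, which the short sketch here spells out. The measured and predicted values are hypothetical numbers chosen for illustration and are not taken from the dataset.

```python
import math


def rmse(y_true, y_pred):
    """Root mean-square error between measured and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical voltages in volts (illustrative only, not from the dataset).
measured = [2.0, 2.5, 3.0, 3.5]
predicted = [2.1, 2.4, 3.2, 3.4]

print(f"RMSE = {rmse(measured, predicted):.3f} V")
print(f"R2   = {r2_score(measured, predicted):.3f}")
```

RMSE carries the unit of the target property, whereas R² is dimensionless, which is why the two properties can be compared on R² but not on RMSE.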

Inverse Design

In our inverse design methodology, we establish a search space from 11 selected reference organic polymers. Two polymers are chosen for their high voltage, six for their structural stability, and two based on biodegradability. The polymer candidates are categorized into three distinct libraries based on their reference: high-voltage polymeric negative electrodes, stable plastics, and biodegradable polymers. These candidate libraries serve as a search space for potential replacements for existing organic electrodes exhibiting high voltage, specific capacity, or energy density. Through systematic structural modifications, such as strategic addition and substitution of redox-active moieties, we expand the 11 reference polymers into an extensive library of approximately 1.8 million candidates. The electrochemical properties of these candidates are predicted using our two-step data-fusion models.
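The combinatorial growth of such a library can be illustrated with a minimal sketch. The backbone template, the substituent list, and the function name are hypothetical placeholders chosen for illustration; they are not the reference polymers or the enumeration rules used in this work.

```python
from itertools import product

# Hypothetical polymer repeat unit with two substitution sites.
# "[*]" marks the end points of the repeat unit in polymer SMILES notation.
BACKBONE = "[*]CC({site1})C({site2})[*]"

# Hypothetical substituents (illustrative only).
MOIETIES = {
    "methyl": "C",
    "acetyl": "C(C)=O",
    "nitro": "[N+](=O)[O-]",
    "cyano": "C#N",
}


def enumerate_candidates(backbone, moieties):
    """Place every substituent at every site and return the unique SMILES."""
    candidates = set()
    for m1, m2 in product(moieties.values(), repeat=2):
        candidates.add(backbone.format(site1=m1, site2=m2))
    return sorted(candidates)


library = enumerate_candidates(BACKBONE, MOIETIES)
print(len(library))  # number of substituents ** number of sites
for smiles in library[:3]:
    print(smiles)
```

With s substituents and p sites, a single backbone yields up to s**p strings, so a modest number of reference polymers can grow into a very large library once more sites and element or bond replacements are allowed.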

Our screening process identifies several promising candidates with operating voltages between 4.1 and 4.5 volts, exceeding the maximum voltage of 4.07 volts present in our training dataset. The distribution patterns reveal varying characteristics across the three libraries. The high-voltage organic negative electrode library shows a notable concentration of candidates within the high-voltage regime, while the biodegradable polymer library exhibits an opposite trend. Batteries made from biodegradable redox-active polymers present a viable solution for disposable electronic devices, enhancing safety in post-usage disposal. The stable plastics library achieves the highest energy density, indicating that incorporating redox-active moieties into stable organic backbones improves multiple battery properties. Our model also identifies materials with insulating properties, characterized by near-zero values across both properties, showcasing its unbiased prediction capabilities. The comprehensive performance spectrum—from high-performing candidates to insulators—demonstrates the model’s robust predictive capacity across different material classes.
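A screening step of this kind can be sketched as a simple ranking. The sketch assumes that energy density is estimated at the active-material level as voltage times specific capacity (V x mAh/g = Wh/kg); the candidate names, predicted values, and the voltage threshold are hypothetical.

```python
# Hypothetical predictions: (name, voltage in V, specific capacity in mAh/g).
predictions = [
    ("cand_a", 4.2, 150.0),
    ("cand_b", 3.1, 320.0),
    ("cand_c", 0.02, 1.5),  # near-zero on both properties: insulator-like
    ("cand_d", 4.4, 90.0),
]


def energy_density(voltage_v, capacity_mah_g):
    """Gravimetric energy density in Wh/kg (V * mAh/g = mWh/g = Wh/kg)."""
    return voltage_v * capacity_mah_g


def screen(preds, min_voltage=0.0, top_k=3):
    """Keep candidates above a voltage threshold, ranked by energy density."""
    kept = [
        (name, energy_density(v, c))
        for name, v, c in preds
        if v >= min_voltage
    ]
    return sorted(kept, key=lambda item: item[1], reverse=True)[:top_k]


ranked = screen(predictions)
high_voltage = screen(predictions, min_voltage=4.1)
print([name for name, _ in ranked])
print([name for name, _ in high_voltage])
```

Ranking by energy density and filtering by voltage give different shortlists, which mirrors the different emphases of the three libraries.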

Each of the three libraries (stable plastics, biodegradable polymers, and high-voltage polymers) was screened for suitable candidates in an average of 90 seconds per library.

3. Conclusion

Our investigation illustrates the effectiveness of our data-fusion ML model for predicting electrochemical properties across multiple domains and fidelity levels. The MT-ML framework addresses the critical challenge of data scarcity while enhancing prediction accuracy. Our meta-learning approach further improves model performance and generalization capabilities across various property domains and material classifications. The framework demonstrates significant computational efficiency, enabling rapid prediction of electrochemical properties for large-scale material screening, which is crucial for discovering next-generation battery materials. By leveraging an inverse-design approach, our framework facilitates the quick generation of candidate material libraries, suggesting both promising novel candidates and effective replacements for existing materials. This significantly accelerates the screening process and reduces the resources required for experimental validation.

Our future research aims to enhance the accuracy and efficiency of organic battery informatics by integrating the influence of electrolytes and separators. Additionally, we are developing a novel fingerprinting approach for inorganic materials in batteries, which has the potential to significantly increase the number of available data points by utilizing extensive databases for inorganic batteries. The implications of our research extend beyond theoretical advancements, promoting the development of cost-effective and environmentally sustainable energy-storage solutions. By paving the way for discovering high-performing organic battery materials, this work represents a significant step toward identifying all-organic batteries using AI. These advancements could lead to batteries with improved capacity, longer lifespans, and reduced environmental impact, addressing critical challenges in energy storage.

4. Methods

Data Preparation

Our dataset comprises 771 data points curated from peer-reviewed research publications, primarily review articles. We collected and organized data for various negative electrodes and charge carriers (positive electrode materials), voltages, and specific capacities. The UMAP (Uniform Manifold Approximation and Projection) plot of our dataset visualizes the distribution of different organic material classes in a two-dimensional embedding space.

Prior to partitioning the dataset into training and testing subsets, we applied min-max scaling to normalize each target property (voltage and specific capacity) independently, transforming output values to a range based on their respective minimum and maximum values within the dataset.
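A minimal sketch of per-property min-max scaling is given here. It assumes the conventional target range of 0 to 1, which the text does not state explicitly; the property values and function names are hypothetical.

```python
def fit_min_max(values):
    """Return the (minimum, maximum) pair used to scale one property."""
    return min(values), max(values)


def scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return [(v - lo) / (hi - lo) for v in values]


def unscale(scaled, lo, hi):
    """Invert the scaling to recover values in physical units."""
    return [s * (hi - lo) + lo for s in scaled]


# Hypothetical targets (illustrative only): volts and mAh/g.
voltages = [0.5, 2.0, 3.5]
capacities = [50.0, 150.0, 450.0]

# Each property gets its own minimum and maximum.
v_lo, v_hi = fit_min_max(voltages)
c_lo, c_hi = fit_min_max(capacities)

scaled_voltages = scale(voltages, v_lo, v_hi)
scaled_capacities = scale(capacities, c_lo, c_hi)
print(scaled_voltages, scaled_capacities)
```

Scaling each property independently keeps the much larger specific-capacity values from dominating a loss that is shared across tasks, and `unscale` converts model outputs back to volts or mAh/g.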

Fingerprinting

Fingerprinting involves converting chemical information represented by SMILES into a machine-readable numerical format in the form of vectors. For this purpose, we use polyBERT, which generates 600-dimensional vectors. The fingerprints are then concatenated with selector vectors, which indicate distinct domains in the data. These selector vectors encode five key experimental and material parameters: active material content, C-rate, property type, organic material class (molecule, polymer, or ladder polymer), and charge carrier. The resulting input to our ML models is a 605-dimensional numerical representation, which allows multiple fidelity levels to be represented within a single computational framework and thus reduces generalization error.
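How a 600-dimensional fingerprint and a five-entry selector vector combine into a 605-dimensional input can be sketched as follows. The fingerprint function is a toy stand-in that hashes character trigrams and is not polyBERT; the integer codes for the categorical selector entries, the function names, and the example values are assumptions made for illustration.

```python
import hashlib

FP_DIM = 600  # fingerprint length reported for polyBERT in the text


def toy_fingerprint(smiles, dim=FP_DIM):
    """Toy stand-in for polyBERT: hash SMILES character trigrams into bins."""
    vec = [0.0] * dim
    for i in range(len(smiles) - 2):
        digest = hashlib.sha256(smiles[i:i + 3].encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    return vec


# Hypothetical integer codes for the categorical selector entries.
PROPERTY = {"voltage": 0, "specific_capacity": 1}
MATERIAL_CLASS = {"molecule": 0, "polymer": 1, "ladder_polymer": 2}
CHARGE_CARRIER = {"Li": 0, "Na": 1}


def selector_vector(active_content, c_rate, prop, material_class, carrier):
    """Five entries: active content, C-rate, property, class, carrier."""
    return [
        float(active_content),
        float(c_rate),
        float(PROPERTY[prop]),
        float(MATERIAL_CLASS[material_class]),
        float(CHARGE_CARRIER[carrier]),
    ]


def model_input(smiles, **selector):
    """Concatenate fingerprint and selector vector into one input vector."""
    return toy_fingerprint(smiles) + selector_vector(**selector)


x = model_input(
    "[*]CC(C#N)C(C)[*]",  # hypothetical polymer SMILES
    active_content=0.6,
    c_rate=0.1,
    prop="voltage",
    material_class="polymer",
    carrier="Li",
)
print(len(x))
```

Because the property type is part of the input, the same structure can be queried for voltage or for specific capacity simply by changing one selector entry.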

Model Architecture and Training Methodology

Our two-step data-fusion ML approach includes both MT-ML and meta-learning components. The dataset was randomly split, designating 80% for MT-ML model development and 20% for meta-learner training. The MT-ML training dataset was further subdivided into five folds for cross-validation, with five independent models trained, one for each fold. Neural network-based MT-ML optimization was performed using the TensorFlow framework, supported by Optuna for systematic hyperparameter tuning across neural network architecture and training parameters.
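The partitioning described here can be sketched with index lists alone. The function name, the random seed, and the strided fold assignment are illustrative assumptions; only the 80/20 split, the five folds, and the dataset size of 771 come from the text.

```python
import random


def split_dataset(n_points, holdout_fraction=0.2, n_folds=5, seed=0):
    """Shuffle indices, reserve a holdout set, and fold the remainder."""
    indices = list(range(n_points))
    random.Random(seed).shuffle(indices)
    n_holdout = round(n_points * holdout_fraction)
    holdout = indices[:n_holdout]        # later used to train the meta-learner
    development = indices[n_holdout:]    # used for MT-ML cross-validation
    folds = [development[i::n_folds] for i in range(n_folds)]
    return development, holdout, folds


development, holdout, folds = split_dataset(771)

for k, test_fold in enumerate(folds):
    # Model k trains on the other four folds and is tested on fold k.
    train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
    print(k, len(train_idx), len(test_fold))
```

Keeping the holdout indices out of every fold ensures that the data used later for the meta-learner is never seen during MT-ML training.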

The meta-learning step integrates insights from all cross-validated models into a deployment-ready ensemble framework. Predictions are generated for the held-out 20% dataset using the cross-validated models, followed by training a neural network that uses these predictions as inputs to learn final property values. The model’s generalization capability is tested against the original 80% training dataset.
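The stacking idea can be sketched in a few lines. Two simplifications are assumed here: the five cross-validated MT-ML models are replaced by hypothetical biased linear functions, and the neural-network meta-learner is swapped for a linear combiner fitted by gradient descent. All numbers are illustrative.

```python
import math

# Hypothetical stand-ins for the five cross-validated MT-ML models: each maps
# a scaled input to a slightly biased prediction of the scaled target.
BASE_MODELS = [
    lambda x, s=s, o=o: s * x + o
    for s, o in [(0.8, 0.1), (1.1, -0.05), (0.9, 0.05), (1.2, -0.1), (0.7, 0.2)]
]


def base_predictions(x):
    """Meta-features: one prediction per base model."""
    return [model(x) for model in BASE_MODELS]


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def combine(weights, bias, row):
    return sum(w * f for w, f in zip(weights, row)) + bias


def train_linear_meta_learner(features, targets, lr=0.05, steps=2000):
    """Fit weights and bias by full-batch gradient descent on the MSE."""
    n, k = len(features), len(features[0])
    weights, bias = [1.0 / k] * k, 0.0  # start from plain ensemble averaging
    for _ in range(steps):
        errors = [combine(weights, bias, r) - t for r, t in zip(features, targets)]
        grad_w = [2.0 / n * sum(e * r[j] for e, r in zip(errors, features))
                  for j in range(k)]
        grad_b = 2.0 / n * sum(errors)
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Hypothetical holdout set (scaled inputs and scaled targets).
holdout_x = [0.0, 0.25, 0.5, 0.75, 1.0]
holdout_y = [0.0, 0.25, 0.5, 0.75, 1.0]

features = [base_predictions(x) for x in holdout_x]
weights, bias = train_linear_meta_learner(features, holdout_y)

meta_pred = [combine(weights, bias, row) for row in features]
avg_pred = [sum(row) / len(row) for row in features]
print(rmse(holdout_y, avg_pred), rmse(holdout_y, meta_pred))
```

Starting from uniform weights means the meta-learner begins as a plain average of the base models and can only reduce the holdout error from there, which makes the benefit of learning the combination easy to see.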

Uncertainty quantification is implemented for both multi-task and meta-learning predictions using Monte Carlo dropout methodology, providing confidence intervals at the 95% level. This meta-learning approach ensures comprehensive model generalization across the entire dataset while maintaining prediction reliability.
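Monte Carlo dropout can be illustrated with a tiny hand-written network. The weights, dropout rate, and number of passes are hypothetical, and the interval is computed as the mean plus or minus 1.96 standard deviations, which is one common choice that assumes roughly Gaussian samples; the text does not specify how the 95% intervals were derived.

```python
import random
import statistics

# Hypothetical weights of a tiny one-hidden-layer regressor (4 hidden units).
W_HIDDEN = [0.9, -0.4, 0.6, 0.3]
B_HIDDEN = [0.0, 0.5, -0.1, 0.2]
W_OUT = [0.7, 0.2, 0.5, 0.4]


def forward(x, rng, drop_rate=0.2):
    """One stochastic forward pass with dropout kept active at inference."""
    out = 0.0
    for w, b, v in zip(W_HIDDEN, B_HIDDEN, W_OUT):
        h = max(0.0, w * x + b)                 # ReLU hidden unit
        if rng.random() >= drop_rate:           # keep with prob 1 - drop_rate
            out += v * h / (1.0 - drop_rate)    # inverted-dropout rescaling
    return out


def mc_dropout_predict(x, n_passes=200, seed=0):
    """Return the mean prediction and an approximate 95% interval."""
    rng = random.Random(seed)
    samples = [forward(x, rng) for _ in range(n_passes)]
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)
    return mean, (mean - 1.96 * std, mean + 1.96 * std)


mean, (lower, upper) = mc_dropout_predict(1.0)
print(f"prediction = {mean:.2f}, 95% interval = [{lower:.2f}, {upper:.2f}]")
```

In a TensorFlow model the same effect is usually obtained by calling the network with dropout layers left in training mode and repeating the prediction many times.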

5. CO2 Emission and Timing

Experiments were conducted using a computing cluster at the University of Bayreuth, with carbon efficiency of 0.344. A total of 10 computations were carried out on four A-100-40GB GPUs. Training of MT-ML and meta-learner models produced emissions around 1.4. In inference mode, computation per polymer takes approximately two seconds, combining fingerprinting, concatenation of the selector vector, and prediction from the trained model, resulting in emissions of around 57.

6. Data Availability

The data supporting the findings of this study are available from the authors upon reasonable request.

7. Code Availability

The code used for training the MT-ML and meta-learner models is available for academic use at https://github.com/kuennethgroup/organic_battery_predictor and Zenodo.

8. Acknowledgements

The authors sincerely thank the graduate school of the Bavarian Center for Battery Technology (BayBatt) for funding this ongoing research.

9. Author Contributions

S.V.S.G. collected data and prepared, trained, and evaluated the ML and meta-learning models. L.W. played a key role in data collection and curation. C.K. supervised and conceptualized this work.

10. Competing Interests

The authors declare no competing financial interests.

Original article by NenPower. If reposted, please credit the source: https://nenpower.com/blog/ai-powered-exploration-of-advanced-polymer-electrodes-for-sustainable-battery-technologies/
