Application of LLM to Search and Systematize the Properties of Thermoelectric Materials in Scientific Literature

Authors

  • M.M. Korop. 1. Institute of Thermoelectricity of the NAS and MES of Ukraine, 1 Nauky str., Chernivtsi, 58029, Ukraine. 2. Yuriy Fedkovych Chernivtsi National University, 2 Kotsiubynskyi str., Chernivtsi, 58012, Ukraine. ORCID: https://orcid.org/0009-0000-4921-3419
  • A.V. Prybyla. 1. Institute of Thermoelectricity of the NAS and MES of Ukraine, 1 Nauky str., Chernivtsi, 58029, Ukraine. 2. Yuriy Fedkovych Chernivtsi National University, 2 Kotsiubynskyi str., Chernivtsi, 58012, Ukraine. ORCID: https://orcid.org/0000-0003-4610-2857

DOI:

https://doi.org/10.63527/1607-8829-2025-1-16-25

Keywords:

thermoelectricity, materials science, machine learning, large language models, thermoelectric energy converters, computer simulation

Abstract

Thermoelectric materials are used in a variety of fields because they convert heat directly into electricity. Selecting the optimal thermoelectric material is a challenging task, constrained by empirical, time, and economic factors. Recent advances in artificial intelligence (AI), in particular large language models (LLMs), show significant potential for automatically extracting and organizing information on the properties of thermoelectric materials from the scientific literature. This review traces the evolution of machine learning-based methods, from early unsupervised NLP models such as Word2Vec to modern approaches based on GPT models. The reviewed results show that LLMs make it possible to identify promising new thermoelectric materials efficiently, automate the collection of experimental data, and build structured databases, which significantly accelerates the search for high-efficiency materials. The paper also outlines directions for further research, such as extending the methods to tabular and graphical data and optimizing the use of computational resources.

How to Cite

Korop, M., & Prybyla, A. (2025). Application of LLM to Search and Systematize the Properties of Thermoelectric Materials in Scientific Literature. Journal of Thermoelectricity, (1), 16–25. https://doi.org/10.63527/1607-8829-2025-1-16-25

Section

Materials research
