Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/11916
Full metadata record
DC Field | Value | Language
dc.contributor.author | Gupta, Sonakshi | en_US
dc.date.accessioned | 2023-06-20T15:37:06Z | -
dc.date.available | 2023-06-20T15:37:06Z | -
dc.date.issued | 2023 | -
dc.identifier.citation | Shetty, P., Rajan, A. C., Kuenneth, C., Gupta, S., Panchumarti, L. P., Holm, L., . . . Ramprasad, R. (2023). A general-purpose material property data extraction pipeline from large polymer corpora using natural language processing. Npj Computational Materials, 9(1) doi:10.1038/s41524-023-01003-w | en_US
dc.identifier.issn | 2057-3960 | -
dc.identifier.other | EID(2-s2.0-85153092307) | -
dc.identifier.uri | https://doi.org/10.1038/s41524-023-01003-w | -
dc.identifier.uri | https://dspace.iiti.ac.in/handle/123456789/11916 | -
dc.description.abstract | The ever-increasing number of materials science articles makes it hard to infer chemistry-structure-property relations from literature. We used natural language processing methods to automatically extract material property data from the abstracts of polymer literature. As a component of our pipeline, we trained MaterialsBERT, a language model, using 2.4 million materials science abstracts, which outperforms other baseline models in three out of five named entity recognition datasets. Using this pipeline, we obtained ~300,000 material property records from ~130,000 abstracts in 60 hours. The extracted data was analyzed for a diverse range of applications such as fuel cells, supercapacitors, and polymer solar cells to recover non-trivial insights. The data extracted through our pipeline is made available at polymerscholar.org which can be used to locate material property data recorded in abstracts. This work demonstrates the feasibility of an automatic pipeline that starts from published literature and ends with extracted material property information. © 2023, The Author(s). | en_US
dc.language.iso | en | en_US
dc.publisher | Nature Research | en_US
dc.source | npj Computational Materials | en_US
dc.title | A general-purpose material property data extraction pipeline from large polymer corpora using natural language processing | en_US
dc.type | Journal Article | en_US
dc.rights.license | All Open Access, Gold, Green | -
Appears in Collections: Department of Metallurgical Engineering and Materials Sciences

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
