Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/4773
Full metadata record
DC Field | Value | Language
dc.contributor.author | Chaudhari, Narendra S. | en_US
dc.date.accessioned | 2022-03-17T01:00:00Z | -
dc.date.accessioned | 2022-03-17T15:35:26Z | -
dc.date.available | 2022-03-17T01:00:00Z | -
dc.date.available | 2022-03-17T15:35:26Z | -
dc.date.issued | 2012 | -
dc.identifier.citation | Thakur, R., Jain, S., Chaudhari, N. S., & Singhai, R. (2012). Information extraction from semi-structured and un-structured documents using probabilistic context free grammar inference. Paper presented at the Proceedings - 2012 International Conference on Information Retrieval and Knowledge Management, CAMP'12, 273-276. doi:10.1109/InfRKM.2012.6204988 | en_US
dc.identifier.isbn | 9781467310901 | -
dc.identifier.other | EID(2-s2.0-84863088952) | -
dc.identifier.uri | https://doi.org/10.1109/InfRKM.2012.6204988 | -
dc.identifier.uri | https://dspace.iiti.ac.in/handle/123456789/4773 | -
dc.description.abstract | A large number of research papers are available in un-structured (text) format. Knowledge discovery in un-structured documents has been recognized as a promising task. These documents are typically formatted for human viewing, and the formatting varies widely from document to document. Frequent changes in their formatting cause difficulties in constructing a global schema. Thus, discovering interesting rules from them is a complex and tedious process. Recently, conditional random fields (CRFs) and hand-coded wrappers have been used to label the text (such as Title, Author Name(s), Affiliation, Email, Contact number, etc. in research papers). In this paper we propose a novel hybrid approach to infer grammar rules using alignment similarity and probabilistic context free grammar. It helps in extracting the desired information from the document. © 2012 IEEE. | en_US
dc.language.iso | en | en_US
dc.source | Proceedings - 2012 International Conference on Information Retrieval and Knowledge Management, CAMP'12 | en_US
dc.subject | Conditional random fields (CRFs) | en_US
dc.subject | Global schemas | en_US
dc.subject | Grammar inference | en_US
dc.subject | Grammar rules | en_US
dc.subject | Hybrid approach | en_US
dc.subject | Information Extraction | en_US
dc.subject | Interesting rules | en_US
dc.subject | Probabilistic context free grammars | en_US
dc.subject | Research papers | en_US
dc.subject | Semi-structured | en_US
dc.subject | Sequence mining | en_US
dc.subject | Alignment | en_US
dc.subject | Data mining | en_US
dc.subject | Information retrieval | en_US
dc.subject | Knowledge management | en_US
dc.subject | Learning systems | en_US
dc.subject | Context free grammars | en_US
dc.title | Information extraction from semi-structured and un-structured documents using probabilistic context free grammar inference | en_US
dc.type | Conference Paper | en_US
Appears in Collections: Department of Computer Science and Engineering

Files in This Item:
There are no files associated with this item.