Literature Review


1st paragraph: topic

Popularity-based, keyword, and semantic searches, while useful for everyday tasks, have significant limitations in scientific research, where researchers need highly specific details that are often mentioned only tangentially in documents. To address this challenge, Information Extraction (IE) offers two promising solutions: applying IE after the search to reduce manual skimming, or applying IE beforehand to generate a data structure that makes search more efficient. The first method is still constrained by traditional search techniques, while the second, by building a knowledge graph from the extracted data, allows researchers to bypass documents and target information directly.
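The second approach can be illustrated with a minimal sketch. It assumes that an IE step has already produced (subject, relation, object, source) tuples; the entity names, relation labels, and document IDs below are invented placeholders, and `build_graph` and `query` are hypothetical helpers, not part of any existing library or of this project's implementation.

```python
from collections import defaultdict

# Hypothetical IE output: (subject, relation, object, source document).
# All names and values are placeholders chosen for illustration only.
TRIPLES = [
    ("compound_A", "melting_point", "120 C", "doc_17"),
    ("compound_A", "synthesized_by", "method_X", "doc_03"),
    ("compound_B", "melting_point", "85 C", "doc_17"),
]


def build_graph(triples):
    """Index triples by (subject, relation) so a fact can be looked up directly."""
    graph = defaultdict(list)
    for subject, relation, obj, source in triples:
        graph[(subject, relation)].append((obj, source))
    return graph


def query(graph, subject, relation):
    """Return every (object, source) pair stored for the given subject and relation."""
    return graph.get((subject, relation), [])


graph = build_graph(TRIPLES)
# The detail is retrieved without ranking or skimming any document.
print(query(graph, "compound_A", "melting_point"))
```

Keeping the source document alongside each object is a design choice in this sketch: the researcher targets the fact directly but can still trace it back to the text it came from.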

2nd paragraph: gap

However, state-of-the-art IE models predominantly rely on supervised fine-tuning of large language models (LLMs), and their effectiveness depends on the quality and scope of their training data. Creating such datasets is resource-intensive, which leads to datasets that focus on extracting details relevant only to narrow topics, while discarding information that falls outside this remit. This approach introduces a significant limitation: each dataset represents a distinct extraction task with its own annotation scheme. The transition to applications carried these disadvantages with it, and as a result information is fragmented across separate databases, each structured by its own schema and ontology.

The relevant literature for text-to-graph translation is best divided according to the two elements of this project. Please navigate to the one that is more exciting to you:

  • Information Extraction Literature
  • Universal Knowledge Representation Literature