Towards precise and convenient semantic search on text and knowledge bases


Abstract: In this dissertation, we consider the problem of making semantic search on text and knowledge bases more precise and convenient. In a nutshell, semantic search is search with meaning. In this respect, text and knowledge bases have different advantages and disadvantages. Large amounts of text are easily available on the web, and they contain a wealth of information in natural language. However, text represents information in an unstructured form. It follows no pre-defined schema, and without further processing, a machine can understand its meaning only on a superficial level. Knowledge bases, on the other hand, contain structured information in the form of subject-predicate-object triples. The meaning of triples is well defined, and triples can be retrieved precisely via a query language. However, formulating queries in this language is inconvenient, and compared to text, only a small fraction of information is currently available in knowledge bases.
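To make the notion of precise triple retrieval concrete, the following is a minimal sketch of pattern matching over subject-predicate-object triples. It assumes a tiny in-memory store; the `match` helper, the predicate names, and the sample triples are hypothetical illustrations and are not taken from the dissertation.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy knowledge base (illustrative data only).
TRIPLES: List[Triple] = [
    ("Tim Cook", "ceo_of", "Apple"),
    ("Satya Nadella", "ceo_of", "Microsoft"),
    ("Apple", "headquartered_in", "United States"),
]


def match(
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
    triples: List[Triple] = TRIPLES,
) -> List[Triple]:
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# A structured query corresponding to "who is the ceo of apple?".
answers = [s for (s, _, _) in match(predicate="ceo_of", obj="Apple")]
```

The pattern returns exactly the matching triples, which is the precision the abstract attributes to knowledge bases. The inconvenience is also visible: the user has to know the predicate name `ceo_of` and the exact entity label.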

In this document, we summarize our contributions on making semantic search on text and knowledge bases more precise and convenient. For knowledge bases, we introduce an approach to answer natural language questions. A user can pose questions conveniently in natural language and ask, for example, "who is the ceo of apple?", instead of having to learn and use a specific query language. Our approach applies learning-to-rank strategies and improved the state of the art on two widely used benchmarks at the time of publication. For knowledge bases, we also describe a novel approach to compute relevance scores for triples from type-like relations such as profession and nationality. For example, on a large knowledge base, a query for "american actors" can return a list of more than 60 thousand actors in no particular order. Relevance scores make it possible to sort this list so that, e.g., frequent lead actors appear before those who only had single cameo roles. On a benchmark that we generated via crowdsourcing, we show that our rankings are closer to human judgments than those of approaches from the literature. Finally, for text, we introduce a novel natural language processing technique that identifies which words in a sentence "semantically belong together". For example, in the sentence "Bill Gates, founder of Microsoft, and Jeff Bezos, founder of Amazon, are among the wealthiest persons in the world", the words "Bill Gates", "founder", and "Amazon" do not belong together, but the words "Bill Gates", "founder", and "Microsoft" do. We show that when query keywords are required to belong together in order to match, search results become more precise.

Given the characteristics of text and knowledge bases outlined above, it is promising to consider a search that combines both. For example, for the query "CEOs of U.S. companies who advocate cryptocurrencies", a list of CEOs of U.S. companies can be retrieved from a knowledge base. Information about who advocates cryptocurrencies is rather specific and changes frequently; it is therefore better found in full text. As part of this thesis, we describe how a combined search could be achieved, and we present and evaluate a fully functional prototype. All of our approaches are accompanied by an extensive evaluation that shows their practicability and, where available, compares them to established approaches from the literature.
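The idea of combining both sources can be sketched as follows, assuming a toy knowledge base and a handful of sentences standing in for full text. All names, predicates, and documents are hypothetical placeholders, and the plain keyword co-occurrence test is a deliberately naive stand-in, not the method of the dissertation or its prototype.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical knowledge base (placeholder entities only).
TRIPLES: List[Triple] = [
    ("Person A", "ceo_of", "Company X"),
    ("Person B", "ceo_of", "Company Y"),
    ("Person C", "ceo_of", "Company Z"),
    ("Company X", "country", "U.S."),
    ("Company Y", "country", "U.S."),
    ("Company Z", "country", "Germany"),
]

# Hypothetical full-text collection (placeholder sentences only).
DOCUMENTS: List[str] = [
    "Person A said that cryptocurrencies will replace cash.",
    "Person B opened a new factory.",
    "Person C praised cryptocurrencies at a conference.",
]


def us_ceos(triples: List[Triple]) -> Set[str]:
    """Structured part: CEOs of U.S. companies, taken from the triples."""
    us_companies = {s for (s, p, o) in triples if p == "country" and o == "U.S."}
    return {s for (s, p, o) in triples if p == "ceo_of" and o in us_companies}


def combined_search(
    triples: List[Triple], documents: List[str], keyword: str
) -> List[str]:
    """Keep only those candidates that co-occur with the keyword in some text."""
    candidates = us_ceos(triples)
    hits = {
        name
        for name in candidates
        for doc in documents
        if name in doc and keyword.lower() in doc.lower()
    }
    return sorted(hits)


result = combined_search(TRIPLES, DOCUMENTS, "cryptocurrencies")
```

Here the knowledge base supplies the stable, well-defined part of the query (who is a CEO of a U.S. company), and the text supplies the specific, fast-changing part. Co-occurrence within a sentence is only a rough proxy for advocacy.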

Location: Deutsche Nationalbibliothek Frankfurt am Main

Extent: Online resource

Language: English

Notes: Universität Freiburg, Dissertation, 2017

Keyword: Questions and answers

Event: Publication

(where): Freiburg

(who): University

(when): 2018

Creator: Haussmann, Elmar

Contributor: Bast, Hannah
Albert-Ludwigs-Universität Freiburg. Fakultät für Angewandte Wissenschaften

DOI: 10.6094/UNIFR/16031

URN: urn:nbn:de:bsz:25-freidok-160318

Rights: No Open Access; access to the object is unrestricted.

Last update: 25.03.2025, 1:51 PM CET

Data provider

This object is provided by:
Deutsche Nationalbibliothek. If you have any questions about the object, please contact the data provider.


Associated

Time of origin

2018

Other Objects (12)

Search and Analytics Using Semantic Annotations

Two-dimensional moving image

13 Semantic Web Technologien - Semantic Search

LakeBase Semantic Service

Thesis

Building a semantic search engine with games and crowdsourcing

Thesis

Semi-automated ontology generation for biocuration and semantic search

Thesis

Building a semantic search engine with games and crowdsourcing

Two-dimensional moving image

Context-driven semantic multimedia search

Thesis

Semantic search for novel information

Collection of essays

Semantic search over the web

Process-oriented semantic web search

Thesis

Precise jet measurements and search for supersymmetric particles with the CMS experiment

Thesis

Semantic search and composition in unstructured peer-to-peer networks


