Semantic annotators (keywords, determination of facets, similar articles etc.)

Semantic annotators are programs that enrich texts with useful additional information. Such annotations can be keywords (drawn from a defined inventory) or headwords (chosen freely, without restriction). From these, facets of the archive can be derived: for example, which specific medications fall under the facet 'medication' and which texts mention them. The facets can then be used in a faceted search (such as SEMPRIA-Search) or in other systems. Finally, the annotators can determine which articles are most similar to a given article, so that information boxes, link boxes, topic pages and specials can be filled with relevant content.
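To make the relationship between keywords, facets and similar articles concrete, here is a minimal sketch in Python. It is only an illustration of the data involved, not how SEMPRIA works: it matches surface tokens against a tiny inventory, and the inventory, the sample articles and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical controlled inventory: keyword -> facet (illustrative only).
INVENTORY = {
    "aspirin": "medication",
    "ibuprofen": "medication",
    "headache": "symptom",
}

# Hypothetical mini archive: article id -> text.
ARTICLES = {
    "a1": "Aspirin relieves headache quickly",
    "a2": "Ibuprofen or aspirin for headache",
    "a3": "City council debates new parking rules",
}


def keywords(text):
    """Return the inventory keywords found in the text (naive token match)."""
    tokens = {t.strip(".,;:!?").lower() for t in text.split()}
    return tokens & set(INVENTORY)


def facet_index(articles):
    """Build facet -> keyword -> list of article ids."""
    index = defaultdict(lambda: defaultdict(list))
    for art_id, text in sorted(articles.items()):
        for kw in sorted(keywords(text)):
            index[INVENTORY[kw]][kw].append(art_id)
    return index


def similar(art_id, articles):
    """Rank the other articles by Jaccard overlap of their keyword sets."""
    base = keywords(articles[art_id])
    scored = []
    for other, text in articles.items():
        if other == art_id:
            continue
        kws = keywords(text)
        union = base | kws
        score = len(base & kws) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, other))
    # Highest score first; ties broken by article id.
    return [other for score, other in sorted(scored, key=lambda p: (-p[0], p[1]))]


if __name__ == "__main__":
    print(dict(facet_index(ARTICLES)["medication"]))
    print(similar("a1", ARTICLES))
```

A surface match like this cannot tell apart different meanings of the same word, so it should be read only as a picture of the outputs (keywords per text, texts per facet value, ranked similar articles), not of the analysis behind them.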

All of the annotators mentioned build on the results of our deep language understanding technology rather than on standard procedures. This is somewhat more complex, but the results are considerably more complete and accurate. In particular, ambiguities can often be resolved: a repaired Lincoln (the car), the city of Lincoln and a person named Lincoln are told apart. The different meanings lead to different keywords, facets and derived results. The regional and domain-specific characteristics of an archive are also taken into account extensively.

The semantic annotators are available as inexpensive modules for SEMPRIA-Search. They can also be used as SaaS (Software-as-a-Service) or as a stand-alone server solution. We will gladly advise you, simply and comprehensively.

Text technology functions

Finally, here is a list of the text technology functions we offer. Further functions that build on these or combine them skillfully are possible at any time.

  • automatic keywords and headwords, facets
  • readability assessment for texts
  • finding repetitions and duplicates (at a semantic level, not just at a character level)
  • finding contradictions
  • info boxes, link boxes, topic pages and specials
  • automatic summaries (summarization, abstracting)
  • semantic version analysis of texts
  • review of terminologies (organization-wide spellings and technical terms)
  • comparison of different catalogs or price lists
  • monitoring of markets and competitors
  • classification of support tickets and customer inquiries
  • matching of longer texts (e.g. tenders, competence profiles)