Synonyms

Contents

Introduction

Use synonyms to improve search results. This is especially helpful when working with domain-specific texts. For instance, when a user searches for fish, the search can actually look for both fish and salmon.

Concepts

Thesaurus

In order to manage synonyms, you may group them into one or more thesauri.

Each thesaurus may contain synonyms in one or more languages/locales.

Languages/Locales

Explorer uses the same locales that Java makes available.

In addition, we’ve added zxx for non-linguistic content (not applicable).

Stemming Language

Here’s a list of the languages Enonic supports for stemming:

And some documentation on how it works:

Explorer is able to match multiple locales down to a specific stemming language.

Use: From, To, Both

You must select how a synonym is used. It may be used in three ways:

  1. From (Only used to find which synonyms to apply to the main query. The term itself is not added to the main query.)

  2. To (Only added to the main query. Not used when finding synonyms.)

  3. Both (Used to find synonyms, and also added to the main query.)
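The three uses can be read as two yes/no questions per term: can it be matched against the search string, and can it be added to the main query? The following is a minimal sketch of that reading, not Explorer's implementation; the names Use, finds_synonyms and is_applied are hypothetical.

```python
from enum import Enum


class Use(Enum):
    """Hypothetical model of the three uses described above."""
    FROM = "from"
    TO = "to"
    BOTH = "both"


def finds_synonyms(use: Use) -> bool:
    """True if a term with this use is matched against the search string."""
    return use in (Use.FROM, Use.BOTH)


def is_applied(use: Use) -> bool:
    """True if a term with this use may be added to the main query."""
    return use in (Use.TO, Use.BOTH)
```

Under this reading, From and To are the two halves of Both.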

Enabled & Comments

Turning certain synonyms on or off may greatly affect the search results. You may supply a comment explaining why a synonym should be kept or disabled.

How it works

The combination of all these concepts makes it possible to create both simple and quite complex thesauri:

Single language/dialect

You may set up a thesaurus with just a single language. For instance English [en].

Let’s say you want to help with some common misspellings. You could make a synonym from: artic, to: arctic

Or, depending on your data and the focus of your search engine, you may want a "two-way" synonym between car and auto. To achieve that, set the use of both terms to both instead of using from and to.
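As an illustration of the two cases above, here is a small sketch of one-way and two-way expansion. It assumes exact string matching and no stemming; Term, SYNONYMS and expand are hypothetical names, not part of Explorer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    text: str
    use: str  # "from", "to" or "both"


# Each synonym is a group of terms; both groups are in English [en].
SYNONYMS = [
    [Term("artic", "from"), Term("arctic", "to")],  # one-way misspelling fix
    [Term("car", "both"), Term("auto", "both")],    # two-way synonym
]


def expand(search_string: str) -> list:
    """Return the terms that would be added to the main query."""
    added = []
    for group in SYNONYMS:
        # A group is triggered only by a "from" or "both" term.
        matched = any(
            t.text == search_string and t.use in ("from", "both") for t in group
        )
        if matched:
            # Only "to" and "both" terms are added, never the matched text.
            added += [
                t.text
                for t in group
                if t.text != search_string and t.use in ("to", "both")
            ]
    return added
```

In this sketch, expand("artic") returns ["arctic"] while expand("arctic") returns an empty list, and car and auto each return the other.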

Multi language & Codes

In your domain language, there may be special codes which don’t really belong to a specific language (and thus shouldn’t be stemmed).

For example, the medical world uses ATC codes. Let’s say you have a search page that can search and present both Norwegian and English results.

You have the ATC code A01AA02, the Norwegian word natriummonofluorofosfat and the English term sodium monofluorophosphate.

Management alternatives

Depending on how you want to manage it, you could either set up a single "complex" thesaurus or multiple "simple" ones.

Both alternatives will give the same results.

Alternative A: Multiple simple thesauri

  • One thesaurus with two locales zxx and no, where zxx from: A01AA02 and no to: natriummonofluorofosfat

  • A second thesaurus with two locales zxx and en, where zxx from: A01AA02 and en to: sodium monofluorophosphate

  • A third thesaurus with two locales no and en, where no both: natriummonofluorofosfat and en both: sodium monofluorophosphate

Alternative B: Single complex thesaurus

  • A single thesaurus with three locales zxx, no and en with zxx from: A01AA02, no both: natriummonofluorofosfat and en both: sodium monofluorophosphate
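One way to check that the two alternatives are equivalent is to list every (matched term, applied term) pair each of them produces. The sketch below assumes a from or both term can trigger any other to or both term within the same synonym; the tuple layout and the links function are hypothetical, not Explorer's data model.

```python
from itertools import permutations

A01 = "A01AA02"
NO = "natriummonofluorofosfat"
EN = "sodium monofluorophosphate"

# Each synonym is a list of (locale, use, text) triples.
alternative_a = [
    [("zxx", "from", A01), ("no", "to", NO)],
    [("zxx", "from", A01), ("en", "to", EN)],
    [("no", "both", NO), ("en", "both", EN)],
]
alternative_b = [
    [("zxx", "from", A01), ("no", "both", NO), ("en", "both", EN)],
]


def links(synonyms):
    """Collect (matched text, applied text) pairs across all synonyms."""
    result = set()
    for terms in synonyms:
        # Look at every ordered pair of distinct terms in one synonym.
        for (_, use_a, text_a), (_, use_b, text_b) in permutations(terms, 2):
            if use_a in ("from", "both") and use_b in ("to", "both"):
                result.add((text_a, text_b))
    return result


print(links(alternative_a) == links(alternative_b))  # prints True
```

Both alternatives yield the same four pairs: the code leads to each word, and the two words lead to each other.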

Interface languages parameter

When an interface is used to search, it has a languages parameter. If you leave it empty, all languages in the selected thesauri will be searched and applied (with stemming). Supplying a languages parameter limits which languages are used when searching for synonyms to apply.

Given the example above:

languages: ["en"]

  • will give 0 synonyms applied (no matter what the search string is).

languages: ["no"]

  • will give 0 synonyms applied (no matter what the search string is).

languages: ["zxx"]

  • will give 0 synonyms applied (no matter what the search string is).

languages: ["zxx", "en"]

  • will apply sodium monofluorophosphate if the search string is A01AA02

languages: ["zxx", "no"]

  • will apply natriummonofluorofosfat if the search string is A01AA02

languages: ["en", "no"]

  • will apply sodium monofluorophosphate if the search string is natriummonofluorofosfat

  • and natriummonofluorofosfat if the search string is sodium monofluorophosphate

  • if the search string is A01AA02, no synonyms will be applied.

languages: ["zxx", "en", "no"]

  • will apply sodium monofluorophosphate if the search string is natriummonofluorofosfat

  • and natriummonofluorofosfat if the search string is sodium monofluorophosphate

  • if the search string is A01AA02, both natriummonofluorofosfat and sodium monofluorophosphate will be applied.
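The cases listed above can be reproduced with a small model of the single complex thesaurus. This is a sketch under stated assumptions (exact matching, no stemming, and an empty languages value meaning all locales); THESAURUS and applied_synonyms are hypothetical names, not Explorer's API.

```python
# The single complex thesaurus from Alternative B, as (locale, use, text).
THESAURUS = [
    [
        ("zxx", "from", "A01AA02"),
        ("no", "both", "natriummonofluorofosfat"),
        ("en", "both", "sodium monofluorophosphate"),
    ],
]


def applied_synonyms(search_string, languages=()):
    """Return the texts applied for search_string, limited to languages.

    An empty languages value means every locale in the thesaurus is used.
    """
    applied = []
    for terms in THESAURUS:
        # Drop terms whose locale is not among the requested languages.
        usable = [t for t in terms if not languages or t[0] in languages]
        triggered = any(
            text == search_string and use in ("from", "both")
            for _, use, text in usable
        )
        if triggered:
            applied += [
                text
                for _, use, text in usable
                if text != search_string and use in ("to", "both")
            ]
    return applied
```

With a single language there is never a second term left to apply, which is why ["en"], ["no"] and ["zxx"] each give zero synonyms in this model.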

