Synonyms
Introduction
Use synonyms to improve search results. This is especially helpful when working with domain-specific texts. For instance, a user searches for fish and gets hits for both fish and salmon.
Concepts
Thesaurus
In order to manage synonyms, you may group them into one or more thesauri.
Each thesaurus may contain synonyms in one or more languages/locales.
Languages/Locales
Explorer uses the same locales that Java makes available.
In addition, we’ve added zxx (non-linguistic content / not applicable).
Stemming Language
Here’s a list of the languages Enonic supports for stemming:
And some documentation on how it works:
Explorer is able to match multiple locales down to a specific stemming language.
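As a rough illustration of matching several locales down to one stemming language, here is a minimal sketch. The set of stemming languages, the alias table and the function name stemming_language are all hypothetical and chosen for the example; they are not Explorer's actual implementation or Enonic's actual list.

```python
from typing import Optional

# Hypothetical subset of stemming languages; the real list is defined by Enonic.
STEMMING_LANGUAGES = {"en", "no"}

# Hypothetical aliases: Bokmål and Nynorsk folded into a single Norwegian stemmer.
ALIASES = {"nb": "no", "nn": "no"}


def stemming_language(locale: str) -> Optional[str]:
    """Reduce a locale such as 'en-GB' or 'nb_NO' to a stemming language, if any."""
    if locale == "zxx":
        return None  # non-linguistic content is not stemmed
    # Keep only the primary language subtag, accepting '-' or '_' separators.
    primary = locale.replace("_", "-").split("-")[0].lower()
    primary = ALIASES.get(primary, primary)
    return primary if primary in STEMMING_LANGUAGES else None


for locale in ["en", "en-GB", "en-US", "nb-NO", "zxx"]:
    print(locale, "->", stemming_language(locale))
```

In this sketch en, en-GB and en-US all share one stemming language, while zxx gets none.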
Use: From, To, Both
You must select how a synonym is used. It may be used in one of three ways:
- from (Only used to find which synonyms to apply to the main query. It is not itself applied to the main query.)
- to (Only applied to the main query. Not used when finding synonyms.)
- both (Used to find synonyms, and also added to the main query.)
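The three uses above boil down to two yes/no questions per term: is it compared with the query to find synonyms, and may it be added to the query? A minimal sketch, in which the Use enum and both function names are made up for illustration and are not part of Explorer's API:

```python
from enum import Enum


class Use(Enum):
    FROM = "from"
    TO = "to"
    BOTH = "both"


def used_for_lookup(use: Use) -> bool:
    """True if the term is compared with the query to find synonyms to apply."""
    return use in (Use.FROM, Use.BOTH)


def added_to_query(use: Use) -> bool:
    """True if the term may be added to the main query."""
    return use in (Use.TO, Use.BOTH)


# Print one row per use, showing its role in lookup and in the expanded query.
for use in Use:
    print(f"{use.value:<5} lookup={used_for_lookup(use)} added={added_to_query(use)}")
```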
Enabled & Comments
Turning certain synonyms on and off may greatly affect the search results. You may supply a comment explaining why a synonym should be kept or disabled.
How it works
The combination of all these concepts makes it possible to create both simple and quite complex thesauri:
Single language/dialect
You may set up a thesaurus with just a single language, for instance English [en].
Let’s say you want to help with some common misspellings. You could make a synonym from: artic, to: arctic.
Or, depending on your data and/or the focus of your search engine, you may want a "two-way" synonym between car and auto. To achieve that, set usage to both instead of using from and to.
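The two single-language examples can be sketched as follows. This is only an illustration: the expand function and the (use, term) pair representation are assumptions, and matching is exact and case-insensitive here, whereas Explorer applies stemming.

```python
def expand(query, synonym):
    """Return the terms one synonym adds to the query.

    `synonym` is a list of (use, term) pairs. Matching is exact and
    case-insensitive, which is a simplification for this sketch.
    """
    q = query.casefold()
    # Only "from" and "both" terms decide whether the synonym applies.
    matched = any(use in ("from", "both") and term.casefold() == q
                  for use, term in synonym)
    if not matched:
        return []
    # Only "to" and "both" terms are added; the term already in the query is skipped.
    return [term for use, term in synonym
            if use in ("to", "both") and term.casefold() != q]


misspelling = [("from", "artic"), ("to", "arctic")]
two_way = [("both", "car"), ("both", "auto")]

print(expand("artic", misspelling))   # ['arctic']
print(expand("arctic", misspelling))  # []
print(expand("car", two_way))         # ['auto']
print(expand("auto", two_way))        # ['car']
```

The misspelling only works in one direction, while the car/auto synonym works in both.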
Multi language & Codes
In your domain language, maybe there are special codes which don’t really belong to a specific language (and thus shouldn’t be stemmed).
For example, the medical world uses ATC codes. Let’s say you have a search page that can search and present both Norwegian and English results.
You have the ATC code A01AA02, the Norwegian word natriummonofluorofosfat and the English term sodium monofluorophosphate.
Management alternatives
Depending on how you want to manage it, you could either set up a single "complex" thesaurus or multiple "simple" ones.
Both alternatives will give the same results…
Alternative A: Multiple simple thesauri
- One thesaurus with two locales zxx and no, where zxx from: A01AA02 and no to: natriummonofluorofosfat
- A second thesaurus with two locales zxx and en, where zxx from: A01AA02 and en to: sodium monofluorophosphate
- A third thesaurus with two locales no and en, where no both: natriummonofluorofosfat and en both: sodium monofluorophosphate
Alternative B: Single complex thesaurus
- A single thesaurus with three locales zxx, no and en with zxx from: A01AA02, no both: natriummonofluorofosfat and en both: sodium monofluorophosphate
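To check that the two alternatives behave the same, here is a minimal sketch that models each synonym as a list of (locale, use, term) entries and compares the results for every combination of locales. The applied function and the data layout are assumptions made for illustration. Thesauri are flattened into one list of synonyms, and matching is exact rather than stemmed.

```python
from itertools import combinations

NO, EN, CODE = "natriummonofluorofosfat", "sodium monofluorophosphate", "A01AA02"

# Alternative A: three thesauri with one synonym each, flattened into one list.
alternative_a = [
    [("zxx", "from", CODE), ("no", "to", NO)],
    [("zxx", "from", CODE), ("en", "to", EN)],
    [("no", "both", NO), ("en", "both", EN)],
]
# Alternative B: one thesaurus with a single three-locale synonym.
alternative_b = [
    [("zxx", "from", CODE), ("no", "both", NO), ("en", "both", EN)],
]


def applied(query, synonyms, languages):
    """Set of terms added to the query, using only entries in the given locales."""
    result = set()
    for synonym in synonyms:
        active = [entry for entry in synonym if entry[0] in languages]
        if any(use in ("from", "both") and term == query for _, use, term in active):
            result |= {term for _, use, term in active
                       if use in ("to", "both") and term != query}
    return result


locales = ["zxx", "no", "en"]
for size in range(1, len(locales) + 1):
    for languages in combinations(locales, size):
        for query in (CODE, NO, EN):
            assert (applied(query, alternative_a, languages)
                    == applied(query, alternative_b, languages))
print("Alternative A and B agree for every locale combination")
```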
Interface languages parameter
When an interface is used to search, it has a languages parameter. If you leave it empty, "all" languages in the selected thesauri will be searched and applied (with stemming). Supplying a languages parameter will limit which languages are used when searching for synonyms to apply.
Given the example above:
- languages: ["en"] will give 0 synonyms applied (no matter what the search string is).
- languages: ["no"] will give 0 synonyms applied (no matter what the search string is).
- languages: ["zxx"] will give 0 synonyms applied (no matter what the search string is).
- languages: ["zxx", "en"] will apply sodium monofluorophosphate if the search string is A01AA02.
- languages: ["zxx", "no"] will apply natriummonofluorofosfat if the search string is A01AA02.
- languages: ["en", "no"] will apply sodium monofluorophosphate if the search string is natriummonofluorofosfat, and natriummonofluorofosfat if the search string is sodium monofluorophosphate. If the search string is A01AA02, no synonyms will be applied.
- languages: ["zxx", "en", "no"] will apply sodium monofluorophosphate if the search string is natriummonofluorofosfat, and natriummonofluorofosfat if the search string is sodium monofluorophosphate. If the search string is A01AA02, both natriummonofluorofosfat and sodium monofluorophosphate will be applied.
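The cases above can be reproduced with a small simulation of the single complex thesaurus. This is a sketch under stated assumptions: the applied function and the (locale, use, term) layout are hypothetical, matching is exact instead of stemmed, and an empty languages value is treated as "all locales".

```python
NO, EN, CODE = "natriummonofluorofosfat", "sodium monofluorophosphate", "A01AA02"

# The single three-locale synonym from Alternative B.
synonym = [("zxx", "from", CODE), ("no", "both", NO), ("en", "both", EN)]


def applied(query, languages=None):
    """Terms added to the query; an empty `languages` means all locales."""
    active = [entry for entry in synonym if not languages or entry[0] in languages]
    # The synonym applies only if a "from" or "both" term in an active locale matches.
    if not any(use in ("from", "both") and term == query for _, use, term in active):
        return []
    # Add "to" and "both" terms from active locales, except the query itself.
    return sorted(term for _, use, term in active
                  if use in ("to", "both") and term != query)


cases = [["en"], ["no"], ["zxx"], ["zxx", "en"], ["zxx", "no"],
         ["en", "no"], ["zxx", "en", "no"]]
for languages in cases:
    for query in (CODE, NO, EN):
        print(languages, repr(query), "->", applied(query, languages))
```

With a single locale the only term that could be added is the one already in the search string, which is why those cases apply 0 synonyms.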