DSL
Query DSL
Introduction
Enonic XP provides a JSON-based Query DSL (Domain Specific Language) for defining queries. A query JSON object usually looks like a syntax tree consisting of two types of operators:
- Expression: Checks particular fields against various native and analysed values.
- Compound: Logically combines expressions and other compounds to fetch nodes based on complex conditions.
Expressions
Each node field can be indexed in several ways, depending on its type and user mappings. By default, a query searches the base primitive value index of a field (number, string, boolean). To force a search in a field's advanced analysed index (such as dates and geoPoints), the type of the expression must be provided.
An expression with a particular type can only search analysed fields that have a compatible value type and whose type mapping does not prohibit it.
Value types
As noted above, a field can be indexed in several ways. Which index is used when an expression is searched? By default, it depends on the JSON type of the specified value (string, number or boolean). For some indices, however, a developer can force the search by setting the expression's 'type' property.
time
A time expression converts the expression value to a LocalTime string and searches fields indexed as 'string'.
dateTime
A dateTime expression searches fields indexed as 'datetime' (Instant, LocalDate, LocalDateTime value types). If the time is not specified, it is set to zero. If the timezone is not specified, it is set to UTC.
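As a minimal sketch of how the 'type' property selects an index, the JavaScript object literals below mirror the DSL's JSON one to one. The field names myTime and myDate and both values are made up for illustration, and the date-only value assumes that such a string is accepted and completed by the defaulting rule described above.

```javascript
// Hypothetical field names and values; the DSL itself is plain JSON.
const timeTerm = {
  term: {
    field: 'myTime',    // assumed to hold a LocalTime value
    type: 'time',       // the value is treated as a LocalTime string
    value: '09:30:00'
  }
};

const dateTimeTerm = {
  term: {
    field: 'myDate',    // assumed to be indexed as 'datetime'
    type: 'dateTime',
    value: '2015-02-26' // no time or timezone given: zero time, UTC
  }
};

console.log(JSON.stringify(timeTerm));
console.log(JSON.stringify(dateTimeTerm));
```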
Expression types
term
Fetches nodes with an exact value in the given field.
Parameter | Type | Required | Description |
---|---|---|---|
field | string | true | property name to search. |
value | object | true | exact property value. |
type | string | false | value type. |
boost | number | false | multiplier for the relevance score. |
Request examples:
{
  "term": {
    "field": "myNumber",
    "value": 4.2
  }
}
{
  "term": {
    "field": "myBoolean",
    "value": true,
    "boost": 2
  }
}
in
Fetches nodes if the given field contains any of the listed values.
Parameter | Type | Required | Description |
---|---|---|---|
field | string | true | property name to search. |
values | object[] | true | array of possible property values. All values must be of the same type. |
type | string | false | value type. |
boost | number | false | multiplier for the relevance score. |
Request examples:
{
  "in": {
    "field": "myNumber",
    "values": [
      3.2,
      4.0,
      5
    ]
  }
}
You cannot mix values of different types in the same in expression.
{
  "in": {
    "field": "myDateTime",
    "type": "dateTime",
    "values": [
      "2015-02-26T12:00:00.030Z",
      "2015-02-26T12:00:00-02:23"
    ]
  }
}
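The same-type rule can be checked before a query is sent. The following is a sketch only: inExpression is a hypothetical helper, not an Enonic XP function, and it approximates "same type" with JavaScript's typeof, which is an assumption about how the rule is applied.

```javascript
// Hypothetical helper: builds an "in" expression and rejects mixed value types.
function inExpression(field, values, type) {
  if (!Array.isArray(values)) {
    throw new Error('values must be an array');
  }
  // Collect the distinct JavaScript types present in the list.
  const kinds = new Set(values.map((v) => typeof v));
  if (kinds.size > 1) {
    throw new Error('values mix types: ' + [...kinds].join(', '));
  }
  const expr = { field, values };
  if (type !== undefined) {
    expr.type = type; // e.g. 'dateTime' to search the datetime index
  }
  return { in: expr };
}

console.log(JSON.stringify(inExpression('myNumber', [3.2, 4.0, 5])));
```

Calling inExpression('myNumber', [1, 'a']) throws instead of producing an expression that mixes a number and a string.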
like
Returns nodes in which the field matches a wildcard pattern. A wildcard operator (*) is a placeholder that matches one or more characters.
Parameter | Type | Required | Description |
---|---|---|---|
field | string | true | property name to search. |
value | string | true | search string. |
type | string | false | value type. |
boost | number | false | multiplier for the relevance score. |
Request examples:
{
  "like": {
    "field": "myString",
    "value": "start*"
  }
}
{
  "like": {
    "field": "myString",
    "value": "*end"
  }
}
{
  "like": {
    "field": "myString",
    "value": "*middle*",
    "boost": 2.2
  }
}
range
Matches nodes whose fields have terms within a certain range. The index that is searched depends on the field and expression types.
Parameter | Type | Required | Description |
---|---|---|---|
field | string | true | property name to search. |
lt | object | false | less than. |
lte | object | false | less than or equals. |
gt | object | false | greater than. |
gte | object | false | greater than or equals. |
type | string | false | value type. |
boost | number | false | multiplier for the relevance score. |
- gt and gte cannot be used together.
- lt and lte cannot be used together.
- At least one range property must be specified.
- All specified properties must be of the same type.
Request examples:
{
  "range": {
    "field": "myNumber",
    "lte": 5
  }
}
{
  "range": {
    "field": "myString",
    "gte": "a",
    "lt": "d",
    "boost": 1.5
  }
}
{
  "range": {
    "field": "myDateTime",
    "type": "dateTime",
    "gt": "2017-09-11T09:00:00Z"
  }
}
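The four range rules lend themselves to a small pre-flight check. The sketch below assumes a hypothetical helper named rangeExpression (not part of Enonic XP) and again uses typeof as a stand-in for "same type".

```javascript
// Hypothetical helper: enforces the range rules, then builds the expression.
function rangeExpression(field, bounds, options = {}) {
  const { lt, lte, gt, gte } = bounds;
  if (gt !== undefined && gte !== undefined) {
    throw new Error('gt and gte cannot be used together');
  }
  if (lt !== undefined && lte !== undefined) {
    throw new Error('lt and lte cannot be used together');
  }
  // Keep only the bounds that were actually supplied.
  const given = Object.entries({ lt, lte, gt, gte }).filter(([, v]) => v !== undefined);
  if (given.length === 0) {
    throw new Error('at least one range property must be specified');
  }
  const kinds = new Set(given.map(([, v]) => typeof v));
  if (kinds.size > 1) {
    throw new Error('all range properties must be of the same type');
  }
  // options may carry the optional "type" and "boost" properties.
  return { range: { field, ...Object.fromEntries(given), ...options } };
}

console.log(JSON.stringify(rangeExpression('myString', { gte: 'a', lt: 'd' }, { boost: 1.5 })));
```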
pathMatch
The pathMatch expression matches paths in the same branch, scoring the paths closest to the given query path highest. A minimum number of path elements that must match can also be set.
Parameter | Type | Required | Description |
---|---|---|---|
field | string | true | property name to search. |
path | string | true | path value. |
minimumMatch | number | false | number of minimum matching elements. |
boost | number | false | multiplier for the relevance score. |
Request examples:
{
  "pathMatch": {
    "field": "_path",
    "path": "/mySite/folder1/folder2/images",
    "minimumMatch": 2
  }
}
matchAll
Matches all nodes, giving them all a _score of '1.0'.
Parameter | Type | Required | Description |
---|---|---|---|
boost | number | false | multiplier for the relevance score. |
{
  "matchAll": { }
}
{
  "matchAll": {
    "boost": 2
  }
}
exists
Returns nodes that contain a value for a field.
Parameter | Type | Required | Description |
---|---|---|---|
field | string | true | name of the field to check for existence. |
{
  "exists": {
    "field": "displayName"
  }
}
String expressions
Parameter | Type | Required | Description |
---|---|---|---|
fields | string[] | true | List of propertyPaths to match. |
query | string | true | A query string to match field value(s). Supports the operators listed below. |
operator | string | false | The default operator used if no explicit operator is specified in the query. |
- + signifies AND operation.
- | signifies OR operation.
- - negates a single token.
- * at the end of a term signifies a prefix query.
- ( and ) signify precedence.
- " and " wrap a number of tokens to signify a phrase for searching.
- ~N after a word signifies edit distance (fuzziness), where N is the Levenshtein distance.
- ~N after a phrase signifies slop amount (how far apart the terms in the phrase are allowed to be).
To use one of these characters literally, escape it with a preceding backslash (\).
When providing more than one field to the query, you can boost individual fields, increasing or decreasing the hit score on a per-field basis, by appending a weight factor to the field name: ^N
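When a query string comes from user input, the operator characters usually need to be escaped so that they are matched literally. The sketch below assumes a hypothetical escapeQuery helper; the character set is taken from the operator list above, and escaping the backslash itself is an added assumption.

```javascript
// Operator characters from the list above, plus the backslash itself (assumption).
const OPERATOR_CHARS = /[+|\-*()"~\\]/g;

// Hypothetical helper: prefixes every operator character with a backslash.
function escapeQuery(text) {
  return text.replace(OPERATOR_CHARS, '\\$&');
}

console.log(escapeQuery('c++ (draft)')); // c\+\+ \(draft\)
```

The escaped string can then be used as the query value of a string expression without the + or parentheses being interpreted as operators.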
fulltext
The fulltext expression searches for words in a field and calculates relevance scores based on a set of rules (e.g. number of occurrences, field length, etc.).
Only analyzed properties are considered when applying the fulltext expression.
Request examples:
{
  "fulltext": {
    "fields": [
      "_name^3",
      "my.inner.analyzed.property",
      "custom.*"
    ],
    "query": "apple pork fish",
    "operator": "OR"
  }
}
{
  "fulltext": {
    "fields": "displayName",
    "query": "~apple pork+pie*"
  }
}
ngram
An edge n-gram is a sequence of n letters from the start of a term. During ngram indexing, the term "foxy" is also indexed as "f", "fo", and "fox".
With the ngram search expression, matches can be found even if the search string contains only parts of a term. This is useful, for instance, when creating autocomplete functionality. The maximum limit of the ngram tokenizer is 25 characters, meaning that search strings longer than 25 characters will not match. For this reason, ngram queries are often combined with the fulltext expression or other query expressions, to match both fragments of words and full phrases.
Only properties analyzed as text are considered when applying the ngram expression. This includes, by default, all text-based fields in the content domain.
Request examples:
{
  "ngram": {
    "fields": [
      "displayName",
      "_name^3"
    ],
    "query": "fish boat"
  }
}
{
  "ngram": {
    "fields": "custom.",
    "query": "lev alg",
    "operator": "AND"
  }
}
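To make the "foxy" example and the 25-character limit concrete, the sketch below lists the leading fragments of a term. It is illustrative only: edgeNgrams is a hypothetical function that mimics the idea of edge n-grams, not Enonic XP's actual tokenizer, and it includes the full term alongside the shorter fragments.

```javascript
// Illustrative only: returns the prefixes of a term, capped at maxLength characters.
function edgeNgrams(term, maxLength = 25) {
  const grams = [];
  const limit = Math.min(term.length, maxLength);
  for (let n = 1; n <= limit; n++) {
    grams.push(term.slice(0, n));
  }
  return grams;
}

console.log(edgeNgrams('foxy')); // [ 'f', 'fo', 'fox', 'foxy' ]
```

Under this model, a search string longer than 25 characters has no indexed fragment of the same length, which is why it cannot match.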
stemmed
The stemmed expression is similar to fulltext, except that it searches language-optimized tokens instead of the source text. For example, the source text "The monkey loved bananas" is transformed into the tokens the, monkey, love, banana, which are then used for the search.
Stemming is language-dependent, so the language must be set either on the content or directly in the node indices via indexConfig.
- fields: Comma-separated list of propertyPaths to include in the search. Only the _allText field is currently indexed for stemming by default.
- language: Content language that was used for stemming. See the list of supported languages.
Request examples:
{
  "stemmed": {
    "fields": "_allText",
    "query": "fish boat",
    "language": "en"
  }
}
Compounds
Compound expressions wrap other compounds or expressions, either to combine their results and scores or to change their behaviour.
boolean
The boolean compound combines expressions using logical operations. Each of its clauses can contain a single expression or an array of expressions.
{
  "boolean": {
    "should": {
      ...
    },
    "must": {
      ...
    },
    "mustNot": [
      {
        "boolean": {
          "should": {
            ...
          }
        }
      }
      ...
    ]
  }
}
must
All expressions must evaluate to true to include a node in the result.
{
  "boolean": {
    "must": {
      "term": {
        "field": "myNumber",
        "value": 2.4
      }
    }
  }
}
{
  "boolean": {
    "must": [
      {
        "term": {
          "field": "myNumber",
          "value": 2.4
        }
      },
      {
        "ngram": {
          "fields": "displayName",
          "query": "fisk"
        }
      }
    ]
  }
}
should
One or more expressions must evaluate to true to include a node in the result.
{
  "boolean": {
    "should": [
      {
        "pathMatch": {
          "field": "_path",
          "path": "/fisk/a/b",
          "minimumMatch": 2
        }
      },
      {
        "like": {
          "field": "displayName",
          "value": "fol*der"
        }
      }
    ]
  }
}
mustNot
All expressions in the mustNot must evaluate to false for nodes to match.
{
  "boolean": {
    "must": {
      "pathMatch": {
        "field": "_path",
        "path": "/fisk/a/b",
        "minimumMatch": 2
      }
    },
    "mustNot": {
      "like": {
        "field": "displayName",
        "value": "fol*der"
      }
    }
  }
}
filter
All expressions must evaluate to true to include a node in the result (similar to must), but they will not affect the score for matching nodes.
{
  "boolean": {
    "filter": [
      {
        "range": {
          "field": "date_field",
          "type": "dateTime",
          "gt": "2017-09-11T09:00:00Z"
        }
      },
      {
        "like": {
          "field": "displayName",
          "value": "my*"
        }
      }
    ]
  }
}
Query filtering is the preferred way to filter nodes; however, it can be used together with filters.
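The clauses can be mixed within one compound. As a sketch with placeholder field names and values, the object below uses must for the part of the query that should influence relevance and filter for conditions that should only narrow the result.

```javascript
// Placeholder field names and values; the structure follows the clauses described above.
const query = {
  boolean: {
    // Scored: how well the text matches contributes to _score.
    must: {
      fulltext: {
        fields: ['displayName'],
        query: 'fish boat',
        operator: 'AND'
      }
    },
    // Not scored: nodes must satisfy both conditions, but _score is unaffected.
    filter: [
      { range: { field: 'date_field', type: 'dateTime', gt: '2017-09-11T09:00:00Z' } },
      { exists: { field: 'displayName' } }
    ]
  }
};

console.log(JSON.stringify(query, null, 2));
```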
Boosting
Any query operator result (expression or compound) can be boosted to change the relevance score of the nodes.
{
  "boolean": {
    "should": [
      {
        "term": {
          "field": "myString",
          "value": "value 1"
        }
      },
      {
        "term": {
          "field": "myString",
          "value": "value 2",
          "boost": 2.0 // (1)
        }
      },
      {
        "term": {
          "field": "myString",
          "value": "value 3",
          "boost": 0.5 // (2)
        }
      }
    ]
  }
}
(1) Positive boost (factor above 1), increasing the score.
(2) Negative boost (factor below 1), decreasing the score.
To boost a group of expressions, wrap them in an inner boolean query:
{
  "boolean": {
    "should": [
      {
        "boolean": {
          "should": [
            {
              "term": {
                "field": "field",
                "value": "a"
              }
            },
            {
              "term": {
                "field": "field",
                "value": "b"
              }
            }
          ],
          "boost": 2.2
        }
      },
      {
        "term": {
          "field": "field",
          "value": "c"
        }
      }
    ]
  }
}
Avoid combining group boosting and single term boosting in the same expression.
Sort DSL
The Sort DSL places result nodes in a specific order based on their property values. Sorting is defined per property, with the special field name _score used to sort by relevance. Relevance is calculated by scoring each individual item based on how well it matches your query.
{
  "sort": {
    "field": "_score"
  }
}
Order defaults to DESC when sorting by _score, and ASC when sorting by anything else.
To change the sort direction, use the direction property:
{
  "sort": {
    "field": "myField",
    "direction": "ASC"
  }
}
To sort by several fields, list them in order of priority:
{
  "sort": [
    {
      "field": "myFirstField",
      "direction": "DESC"
    },
    {
      "field": "mySecondField" // Defaults to ASC
    }
  ]
}
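The default-direction rule can be made explicit in code. The sketch below uses a hypothetical helper, withDefaultDirections, which is not part of Enonic XP; it only fills in the direction that the rule above implies for each sort entry.

```javascript
// Hypothetical helper: applies the documented defaults (DESC for _score, ASC otherwise).
function withDefaultDirections(sort) {
  // Accept either a single sort object or an array of them.
  const entries = Array.isArray(sort) ? sort : [sort];
  return entries.map((entry) => ({
    ...entry,
    direction: entry.direction || (entry.field === '_score' ? 'DESC' : 'ASC')
  }));
}

console.log(withDefaultDirections([
  { field: '_score' },
  { field: 'myFirstField', direction: 'DESC' },
  { field: 'mySecondField' }
]));
```

Spelling the direction out this way can make a multi-field sort easier to read, since the effective order of every field is visible.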
geoDistance
The geoDistance sort orders the results by their distance from a given geoPoint.
Parameter | Type | Description |
---|---|---|
field | string | geoPoint type property. |
direction | string | 'ASC' or 'DESC'. |
location | object | A geoPoint from which the distance factor should be calculated. |
unit | string | The string representation of the distance unit to use. Defaults to "m" or "meters". |
{
  "sort": [
    {
      "field": "myGeoPoint",
      "direction": "ASC",
      "location": {
        "lat": "90.0",
        "lon": "0.0"
      },
      "unit": "km"
    }
  ]
}