NoQL reference

Introduction

The Node Query Language, or NoQL for short, is inspired by traditional SQL. As with other NoSQL solutions, it has special capabilities and limitations.

Selectors, joins and update statements are essentially not supported. However, NoQL adds cool features like relevance sorting and aggregations.

Selectors are currently not supported, and the only result of a query will only be identifiers for the matching nodes. Developers must then get the desired nodes (with their data) through a separate request.

A NoQL statement is essentially composed from two parts: query, and sorting.

Query

Queries are just one of the ways to access data in the storage. Developers may also access nodes by ID, path or by fetching child items. A query normally targets a single repository, but may also query multiple repositories at once.

There are two types of query’s format available in XP: String format and DSL (Domain Specific Language).

Query DSL

Query DSL is an AST (Abstract Syntax Tree) expressed in JSON to define query expressions. See documentation for further explanation and examples.

String Query

String queries are built from traditional expressions. For instance, the following query would return all nodes in the repo, where the property weight is greater than 10.

weight > 10

Expressions may be combined by using traditional logical operators such as AND, and OR. For instance, we could limit the result further:

weight > 10 AND fulltext('article', 'should have these words', 'AND')

In this case we are adding a so-called dynamic expression to the query. The fulltext() expression performs a free text search on the property article for the specified search string.

For both the integer comparison and fulltext expression to work, the weight, and article properties need to be indexed properly.

If the query is empty, all nodes will be considered matches and returned.

The results matching a given query will be assigned a score. The scoring is for instance used to achieve relevance sorting.

Syntax

The query syntax is based on SQL and looks very similar:

queryExpr

Grammar

queryExpr = [ constraintExpr ];

Empty query: If no queryExpression is given, all documents will be in the result.

constraintExpr

Grammar

constraintExpr = compareExpr
               | logicalExpr
               | functionExpr
               | notExpr ;

compareExpr

Grammar

compareExpr   = fieldExpr operator valueExpr ;
fieldExpr     = propertyPath ;
operator      = '=', '!=', '>', '>', '<', '<=', 'LIKE', 'NOT LIKE', 'IN', 'NOT IN' ;
valueExpr     = string | number | valueFunc ;
valueFunc     = geoPoint | instant | time | date | dateTime | localDateTime ;
geoPoint      =  '"' lat ',' lon '"' ;
instant       = 'instant(' string ')' ;
date          = 'date(' string ')' ;
time          = 'time(' string ')' ;
dateTime      = 'dateTime(' string ')' ;
localDateTime = 'localDateTime(' string ')' ;

Examples

user.myCategory = "articles"
user.myCategory IN ("articles", "documents")
user.myCategory != "articles"
user.myCategory LIKE "*tic*"
myPriority < 10
myPriority <= 10
myPriority > 10
myPriority < 100
myPriority != 10
myInstant = instant('2014-02-26T14:52:30.00Z')
myInstant <= instant('2014-02-26T14:52:30.00Z')
myInstant <= dateTime('2014-02-26T14:52:30.00+02:00')
myTime = time('09:00')
myDate = date('2015-02-26')
myLocalDateTime = localDateTime('2015-02-26T15:00')
myLocation = '59.9127300,10.7460900'
myLocation IN ('59.9127300,10.7460900','59.2181000,10.9298000')

instant, dateTime, localDateTime, date functions will force search in datetime index, meanwhile for time function usual text index will be used. Using expressions without functions perform a search in text index for any time-related field. See more about indexing there.

logicalExpr

Grammar

logicalExpr = constraintExpr operator constraintExpr ;
operator    = 'AND' | 'OR' ;

Examples

myCategory = "articles" AND myPriority > 10
myCategory IN ("articles", "documents") OR myPriority <= 10

notExpr

Grammar

notExpr = 'NOT' constraintExpr ;

Examples

NOT myCategory = 'article'

functionExpr

Grammar

functionExpr = functionName '(' arguments ')' ;

Examples

Fulltext analyzed property contains 'fish' and 'spot'

fulltext('_allText', 'fish spot', 'AND')

Sub properties of 'data.' contains 'fish' and 'spot'

fulltext('data.*', 'fish spot', 'AND')

Property 'myCategory' is either 'article' or 'document' and title starts with 'fish'.

myCategory IN ('article', 'document') AND ngram('title', 'fish', 'AND')

propertyPath

Grammar

propertyPath = pathElement ( '.' pathElement )* ;
pathElement  = ( [ validJavaIdentifier - '.' ] )* ;

Examples

myProperty
data.myProperty
data.myCategory.myProperty

Wildcards in propertyPaths are only supported in fulltext and ngram functions at the moment. When using these functions, expressions like these are valid:

myProp*
*Property
data.*
*.myProperty
data.*.myProperty

Query functions

Here’s a description of all functions that can be used in a query.

fulltext()

The fulltext function is searching for words in a field, and calculates relevance scores based on a set of rules (e.g number of occurences, field-length, etc).

Only analyzed properties are considered when applying the fulltext function.

Function definition

fulltext(<fields>, <search-string>, <operator>)

fields: Comma-separated list of propertyPaths to include in the search.

Example fields

'displayName' // Search in single field
'displayName,data.description,my.title' // Search in multiple fields
'data.*' // Wildcard usage

You can boost - thus increasing or decreasing hit-score per field basis. By providing more than one field to the query by appending a weight-factor: ^N

Boosting example

fulltext('displayName^5,data.description', 'my search string', 'AND')

search-string

The search-string supports a set of operators

+ signifies AND operation.
| signifies OR operation.
- negates a single token.
* at the end of a term signifies a prefix query.
( and ) signify precedence.
“` and `” wraps a number of tokens to signify a phrase for searching
~N after a word signifies edit distance (fuzziness) with a number representing Levenshtein distance.
~N after a phrase signifies slop amount (how far apart terms in phrase are allowed)

operator

Allowed operators are:

OR Any of the words in the search-string matches.
AND All words in search-string matches.

Examples

Match if "myField" contains any of the given words.

fulltext("myField", "cheese fish cake onion", "OR")

Match if any field with path starting with "myData.myProperties" contains any of the given words.

fulltext("myData.myProperties.*", "cheese fish cake onion", "OR")

Match if "myField" contains any of the given words and "myCategory" = "soup".

myCategory = '"soup" AND fulltext("myField", "cheese fish cake onion", "OR")'

Match if "myField" contains all the given words.

fulltext("myField", "cheese fish cake onion", "AND")

Match if "myField" contains "Levenshtein" with a fuzziness distance of 2.

fulltext("myField", "Levenshtein~2", "AND")

Match if "myField" contains "fish" and not "boat".

fulltext("myField", "fish -boat", "AND")

Match if any field under data-set data contains "fish" and not "boat".

fulltext("data.*", "fish -boat", "AND")

Match exact phrase.

fulltext('data.*', '"gone fishing today"', "AND")

Match phrase where maximum distance between words are 2.

fulltext('data.*', '"gone fishing today"~2', "AND")

stemmed()

The stemmed function is similar to fulltext() except that it searches language optimized tokens instead of a source text. E.g. source text The monkey loved bananas will be transformed to the, monkey, love, banana tokens and they will be used for search.

Stemming is language-dependent, so language must be set either on the content or directly in the node indices via indexConfig.

Function definition

stemmed(<fields>, <search-string>, <operator>, <language>)

fields: Comma-separated list of propertyPaths to include in the search.

Only _alltext field is currently indexed for stemming

search-string

The search-string supports a set of operators

+ signifies AND operation.
| signifies OR operation.
- negates a single token.
* at the end of a term signifies a prefix query.
( and ) signify precedence.
“` and `” wraps a number of tokens to signify a phrase for searching
~N after a word signifies edit distance (fuzziness) with a number representing Levenshtein distance.
~N after a phrase signifies slop amount (how far apart terms in phrase are allowed)

operator

Allowed operators are:

OR Any of the words in the search-string matches.
AND All words in search-string matches.

language

Content language that was used for stemming. List of supported languages

Examples

Match if any field contains any of the given words or their derivatives in english ("fishing", "cakes"…)

stemmed("_alltext", "fish cake", "OR", "en")

nGram()

An n-gram is a sequence of n letters from a term. During ngram indexing, the term "foxy" is also indexed as: "f", "fo", and "fox".

When using the nGram search function, we are able get matches, even if the search only contains parts of a term. This is for instance useful when createing autocomplete functionality. The max limit of the ngram tokenizer is 25 characters, meaning that search strings over 25 characters will not match. As such, ngram queries may successfully be combined with the fulltext search function or other query expressions, to both match fragments of words as well as full phrases.

Only properties analyzed as text are considered when applying the ngram-function. This includes, by default, all text-based fields in the content domain.

Function definition

ngram(<field>, <search-string>, <operator>)

fields: Comma-separated list of propertyPaths to include in the search.

operator

Allowed operators are:

OR Any of the words in the search-string matches (default)
AND All words in search-string matches.

Examples

Property "myProp" contains any word beginning with "lev", e.g "Levenshteins Algorithm".

ngram("myProp", "lev")

Property "myProp" contains words beginning with "lev" and "alg", e.g "Levenshteins Algorithm".

ngram("myProp", "lev alg", "AND")

Properties within "myProp" contains words beginning with "fish" or "boat", e.g "fishpond" or "boatman".

ngram("myProp.*", "fish boat", "OR")

range()

The range functions test each value in the given property for a given range.

Function definition

range(<field>, <from>, <to>, [<includeFrom>], [<includeTo>])

The from and to values must be of the same value type.

includeFrom and includeTo are optional with default value 'false', meaning that the actual values for the from and to are not included as matches.

Unbounded ranges can be queried by providing an empty string as argument.

Examples

Property version in the range, including '6.3.0'

range('version', '6.3.0', '6.4.0', 'true', 'false')

Property publishFrom is between two points in time

range('publishFrom', instant('2015-08-01T09:00:00Z'), instant('2015-08-01T11:00:00Z') )

Property myValue between 2.0 and 3.0, including 2.0

range('myValue', 2.0, 3.0, 'true', 'false' )

Property publishFrom newer that the given date.

range('publishFrom', instant('2015-08-01T09:00:00Z'), '')

Property publishTo older that the given date.

range('publishTo', '', instant('2015-08-01T09:00:00Z'))

pathMatch()

The path-match matches a path in a same branch, scoring the paths closest to the given query path first. Also, a number of minimum matching elements that must match could be set.

Function definition

pathMatch(<field>, <path>, [<minimum_elements_must_match>])

If not given, the default minimum-must-match value will be 1.

Example

Given the following nodes

/content/mySite
/content/mySite/fish
/content/mySite/fish/onion
/content/mySite/cheese
/content/mySite/cheese/jam
/content/myOtherSite

Property _path matching minimum 2 path elements

pathMatch('_path', '/content/mySite/fish/onion/mayonnaise', 2)

This will return (orded by _score):

/content/mySite/fish/onion
/content/mySite/fish
/content/mySite/cheese/jam
/content/mySite/cheese
/content/mySite

Sort

Similar to traditional SQL databases, XP lets you sort the result by property in ascending or descending order. A basic sort statement is simply defined by property and sorting direction i.e.:

myproperty DESC

Additionally, similar to Google, text-based query results may be sorted by ranking. Ranking is done through an internal algorithm that scores each individual item based on how it matches with your search. To sort by ranking, use the following statement:

_score DESC

A sorting statement can consist of one or more comma separate expressions:

If sort specified for a query, results will contain a meta system property _sort. Its exposes field values used for sorting.

To use DSL sort check documentation.

Syntax

sortExpr

Grammar

sortExpr = (  | dynamicSortExpr ) ( ',' ( fieldSortExpr | dynamicSortExpr ) )* ;

fieldSortExpr

Grammar

fieldSortExpr = propertyPath [ direction ] ;
direction     = 'ASC' | 'DESC' ;

Examples

_name ASC
_timestamp DESC
title DESC
data.myProperty

dynamicSortExpr

Grammar

dynamicSortExpr = functionExpr [ direction ] ;
direction        = 'ASC' | 'DESC' ;

Example sorting by distance from a geoPoint

geoDistance('locationProperty', '59.9127300,10.746090')

Sort functions

Here’s a description of all functions that can be used in order-by clause.

geoDistance()

The geoDistance-function enables you to order the results according to distance to a given geo-point.

Documents with no geo-point property with the given path will be ordered last if matching the query.

geoDistance syntax

geoDistance(<field>, <location>, [<unit>])

field

Any geoPoint property

location

The location is a geoPoint from which the distance factor should be calculated, formatted as "latitude,longitude".

unit

The string representation of distance unit to use. Defaults to "m" or "meters", the list of available values below:

"m" or "meters"
"in" or "inch"
"yd" or "yards"
"ft" or "feet"
"km" or "kilometers"
"NM" or "nmi" or "nauticalmiles"
"mm" or "millimeters"
"cm" or "centimeters"
"mi" or "miles"

geoDistance example

geoDistance("shopLocation", "59.9127300,10.7460900", "km")

Relevance sorting

All properties with values automatically get a sort index entry. This means that any property can be used for sorting, either (default) DESC (descending) or ASC (ascending).

_score: The dynamic system property _score is the default sort property

The score value of a node is dynamically calculated based on a number of factors: For instance, number of matching clauses in boolean expressions, how often the term appears in the documents when searching for text etc.

For more details, check out the Elasticsearch documentation.

Sort DSL

Sort DSL is an AST (Abstract Syntax Tree) expressed in JSON to define sort expressions. See documentation for further explanation and examples.

More examples

Querying paths

All nodes have three system-properties concerning the node placement in a branch, all of type String:

_name: The node name without path.
_parentPath: The parent node path.
_path: The full path of the node.

See the pathMatch() query function for advanced path matching

When working with the content API, all items are placed a special root node: /content. While this mostly is explicit when working in the content-domain, this has to be dealt with when using paths in query-expressions and functions since you are actually querying nodes.

Examples

Finds node with path /content/mySite/myCategory/myContent.

_path = '/content/mySite/myCategory/myContent'

Find all nodes with name myContent in a folder named myCategory

_name = 'myContent' AND _parentPath LIKE '*myCategory'

Find all nodes under the path /content/mySite/myCategory including children of children.

_path LIKE '/content/mySite/myCategory/*'

Find only first level children under the path /content/mySite/myCategory.

_parentPath = '/content/mySite/myCategory'

Querying missing values

From time to time, it is also interesting to search for nodes with or without a value:

Sample query where property myValue "exists"

myValue LIKE "*"

Sample query where "publish.first" does NOT exists

publish.first NOT LIKE "*"

Using exists and notExists filters is considered the most efficient way searching for nodes with missing values

Querying references

An aggregation of all outbound references in a node are stored in the system property _references. This property can then effectively be used also to find incoming references to a node.

Find all nodes referring to the node with id = 'abc':

_references = 'abc'

Date and time

Querying against date and time-fields may require some knowledge on how data is stored and indexed.

LocalDate

LocalDate represents a date without time-zone in the ISO-8601 calendar, e.g 2015-03-19. LocalDate-properties are stored as a ISO LocalDate-formatted string in the index, thus all searches are done against string-values.

LocalDate string-format:

yyyy-MM-dd

Given a node with a property named 'myLocalDate' of type localDate and value 2015-03-19, all of the following queries will match:

myLocalDate = '2015-03-19'
myLocalDate > '2015-03-18'
myLocalDate <= '2015-03-19'

LocalTime

LocalTime represents a time without time-zone in the ISO-8601 calendar, e.g 11:39:49. LocalTime-properties are stored as a ISO LocalTime-formatted string in the index, thus all searches are done against string-values.

LocalTime string-format:

HH:mm[:ss[.SSS]]

LocalTime string value examples:

09:30
10:00
10:00:30
10:00:30.142

Since the queries are matching string-values, the input time in query must either adhere the same string-format restrictions, or be wrapped in a function time which accepts a time-formatted string as input.

Given a node with a property named 'myLocalTime' of type localTime and value = 09:36:00, all the following queries will match:

myLocalTime > '09:00'
myLocalTime = '09:36'
myLocalTime = '09:36:00'
myLocalTime LIKE '09:*'
myLocalTime < '09:36:01'
myLocalTime < '09:36:00.1'

This must be wrapped in time-function since its not padded with a leading 0:

myLocalTime > time('9:00')

If optional fractions of seconds are given, the string format will also contain this even if 0, and expression will not match unless wrapped in time-function::

myLocalTime = time('09:36:00.0')

Even if the string-matching will do the job 99% of the time, the safest bet is to always go with the time-function when applicable.

LocalDateTime

LocalDateTime represents a date-time without time-zone in the ISO-8601 calendar, e.g 2015-03-19T11:39:49. LocalDateTime-properties are stored as a ISO LocalDateTime-formatted string in the index, thus all searches are done against string-values.

LocalDateTime string-format:

yyyy-MM-ddTHH:mm[:ss[.SSS]]

Since the queries are matching string-values, the input dateTime in query must either adhere the same string-format restrictions, or be wrapped in a function dateTime which accepts a dateTime-formatted string as input.

Given a node with a property named 'myLocalDateTime' of type localDateTime and value 2015-03-19T10:30:00, all of the following queries will match: myLocalDateTime = '2015-03-19T10:30:00' myLocalDateTime = dateTime('2015-03-19T10:30') myLocalDateTime < dateTime('2015-03-19T10:30:00.001')

DateTime / Instant

DateTime represents a date-time with time-zone in the ISO-8601 calendar, e.g 2015-03-19T11:39:49+02:00. Its possible to query properties of with value-type DateTime both as an ISO instant and as ISO dateTime, using the provided built-in functions instant and dateTime.

Instant string-format (instant always given in UTC-time):

yyyy-MM-ddTHH:mm[:ss[.SSS]Z

Instant string value examples:

2015-03-19T16:30:20Z
2015-03-19T16:30:20.123Z

DateTime string-format (Z for UTC, else offset in hours and minutes):

yyyy-MM-ddTHH:mm[:ss[.SSS](Z|+hh:mm|-hh:mm)

DateTime string value examples:

2015-03-19T16:30:20Z
2015-03-19T16:30:20+01:00
2015-03-19T16:30:20-01:30
2015-03-19T16:30:20.123-01:30

Given a node with a property named 'myDateTime' of type dateTime and value 2015-03-19T10:25:00+02:00, all of the following queries will match:

myDateTime = instant('2015-03-19T08:25:00Z')
myDateTime = dateTime('2015-03-19T08:25:00Z')
myDateTime = dateTime('2015-03-19T10:25:00+02:00')
myDateTime = dateTime('2015-03-19T11:25:00+03:00')

NoQL reference

Contents

Introduction

Query

Query DSL

String Query

Syntax

queryExpr

constraintExpr

compareExpr

logicalExpr

notExpr

functionExpr

propertyPath

Query functions

fulltext()

stemmed()

nGram()

range()

pathMatch()

Sort

Syntax

sortExpr

fieldSortExpr

dynamicSortExpr

Sort functions

geoDistance()

Relevance sorting

Sort DSL

More examples

Querying paths

Querying missing values

Querying references

Date and time

LocalDate

LocalTime

LocalDateTime

DateTime / Instant

Contents

Contents