Collector task

Contents

In this chapter you will learn about tasks, the heart of a collector

Introduction

Collectors essentially build on a standard feature of the Enonic framework - tasks. Tasks provide out-of-the-box support for running background jobs with ease. Also, tasks can be scheduled via the Enonic framework.

Task

Tasks are essentially JS controllers, following a specific pattern.

Your app contains an example task within in the following file structure:

Application code:
src/
 main/
  resources/
   tasks/
    sample-collector/ (1)
     sample-collector.ts (2)
     sample-collector.xml (3)
1 The task folder
2 The task controller: Same name as folder
3 The task schema. Same name as folder

For more detais on tasks, visit the task documentation.

Collectors.json

The task must be explicitly declared in the collectors.json file.

The task name is the name of the collector, it must be unique within the app. Changing the name of the task will break the link between Explorer and the collector.
/src/main/resources/collectors.json
[{
	"displayName": "Sample Collector (Edit in collectors.json)",
	"formLibraryName": "SampleCollector",
	"formAssetPath": "collector/form/SampleCollector.esm.js",
	"taskName": "sample-collector"
}]

As you can see from the above file, the task name is listed per collector definition.

Task schema

The task schema is defined in sample-collector.xml.

This file basically defines an interface for tasks. It should basically look the same for all collectors (remember, you may have multiple collectors within a single app).

The following fields are defined:

collectionId

Which collection is calling the collector

collectorId

Registered ID of this collector

configJson

Configuration based on the collector form

language

Default language, as specified by the collector

This is what the schema looks like:

/src/main/resources/tasks/collect/collect.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<task>
	<description>Sample Collector (edit in sample-collector.xml)</description>
	<form>
		<input name="collectionId" type="TextLine">
			<label>Collection ID</label>
			<occurrences minimum="0" maximum="1"/>
		</input>
		<input name="collectorId" type="TextLine">
			<label>Collector ID</label>
			<occurrences minimum="1" maximum="1"/>
		</input>
		<input name="configJson" type="TextLine">
			<label>Config JSON</label>
			<occurrences minimum="1" maximum="1"/>
		</input>
		<input name="language" type="TextLine">
			<label>Language</label>
			<occurrences minimum="0" maximum="1"/>
		</input>
	</form>
</task>

Task controller

The actual task controller supplied by the starter is called sample-collector.ts.

This task controller is written in TypeScript, and will be compiled to JavaScript by the build script.

We’ll not dive into the exact details of how the controller gets its data, but rather look at the two most common parts of any collector task:

Constructor

Every collector needs to implement the run() function, and accept the standard parameters - as defined above.

/src/main/resources/tasks/collect/collect.ts
import {Collector} from '/lib/explorer';

export function run({
    collectionId,
    collectorId,
    configJson,
    language
}) {
    const collector = new Collector<CollectorConfig>({
        collectionId, collectorId, configJson, language
    });
}

Persisting

Another common element for any collector is to store documents in Explorer, for indexing and search.

Remember, a document is essentially a JSON file.
Example code for persisting a document
collector.persistDocument(documentToPersist, {
        documentTypeName: 'starter_doctype'
    });
    collector.addSuccess({
        message: "Sweet, it worked"
    });
} catch (e) {
    log.error(`message:${e.message}`, e);
    collector.addError({message: `${e.message}`});
}
Feel free to study the full task controller code for more details.

For an extensive task example - check out the Webcrawler collector code. The Webcrawler is shipped with Explorer by default.

Check out the Collector API documentation for a complete list of available functions.


Contents

Contents