Setting up the eXist scheduler

Introduction

A couple of scheduled jobs are required for the servers to work properly, and there are some optional ones that you might consider adding to your installation, such as backup and consistency checks. We strongly recommend following the instructions here: install the REQUIRED scheduled jobs and check whether you need the RECOMMENDED ones.

General information

ART-DECOR's eXist database has a job scheduler based on Quartz, a full-featured, open source job scheduling system. If you want to read more, information on the general approach of eXist can be found here. We also compiled information about using cron-triggers in scheduled job definitions, which can be found here.

In ART-DECOR environments the scheduled jobs are defined and set up in the file conf.xml in the etc directory of the eXist database root directory. If you followed our instructions, this will be /usr/local/exist_atp/etc/conf.xml.

The major topics covered in this documentation are

  • Required preliminary configuration
  • Processing Queues
  • Refreshers
  • Notifications
  • Backup, Export and Consistency checks
  • Restart the database

Required preliminary configuration

This configuration is REQUIRED.

Open the conf.xml file described in the General information section above. Find the closing tag </scheduler> of the scheduler part. Paste the following comment text above that closing tag.

<!-- ====================================================== -->
<!-- ============ ART-DECOR Release 3 Jobs BEGIN ========== -->
<!-- 
     uses period in ms ...
       period of 20s: period="20000"
     ...or cron-trigger, e.g.
       cron-trigger every 4 hours: "0 0 0/4 * * ?"
       cron-trigger every minute: "0 0/1 * * * ?"
-->

<!-- ====================================================== -->
<!-- ============ ART-DECOR Release 3 Jobs END ============ -->
<!-- ====================================================== -->

This helps you to immediately locate the ART-DECOR configuration part of the otherwise very long conf.xml file for the database.

Now walk through the following sections, add the required parts and consider adding the recommended ones as well.

Processing Queues

This eXist scheduled job is REQUIRED to be configured.

WHAT IT DOES

This adds a scheduled job that runs every 20 seconds, checks for project related requests such as compilation and executes them.

Add the following lines to the ART-DECOR Release 3 Jobs part in the configuration file...

<!--
    Scan/process project related requests such as compilation every 20 seconds 
-->
<job type="user" name="scheduled-tasks" xquery="/db/apps/api/modules/library/scheduled-tasks.xql"
     period="20000" unschedule-on-exception="false"/>

...so that it looks like this

<!-- ====================================================== -->
<!-- ============ ART-DECOR Release 3 Jobs BEGIN ========== -->
<!-- 
     uses period in ms ...
       period of 20s: period="20000"
     ...or cron-trigger, e.g.
       cron-trigger every 4 hours: "0 0 0/4 * * ?"
       cron-trigger every minute: "0 0/1 * * * ?"
-->

<!--
    Scan/process project related requests such as compilation every 20 seconds 
-->
<job type="user" name="scheduled-tasks" xquery="/db/apps/api/modules/library/scheduled-tasks.xql"
     period="20000" unschedule-on-exception="false"/>

<!-- ====================================================== -->
<!-- ============ ART-DECOR Release 3 Jobs END ============ -->
<!-- ====================================================== -->
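After editing, it can be worth confirming that the file is still well-formed XML and that the job was picked up inside the scheduler element. The following is only a minimal sketch: the function name list_scheduler_jobs and the trimmed SAMPLE document (including its exist root element) are illustrative assumptions, and the sketch assumes the elements in conf.xml carry no XML namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for conf.xml (illustration only).
SAMPLE = """<exist>
  <scheduler>
    <!-- ============ ART-DECOR Release 3 Jobs BEGIN ========== -->
    <job type="user" name="scheduled-tasks"
         xquery="/db/apps/api/modules/library/scheduled-tasks.xql"
         period="20000" unschedule-on-exception="false"/>
    <!-- ============ ART-DECOR Release 3 Jobs END ============ -->
  </scheduler>
</exist>"""


def list_scheduler_jobs(conf_xml):
    """Return one dict per <job> found directly under a <scheduler> element.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed,
    which is the most common result of a copy-and-paste mistake.
    """
    root = ET.fromstring(conf_xml)
    jobs = []
    for scheduler in root.iter("scheduler"):
        for job in scheduler.findall("job"):
            entry = dict(job.attrib)
            # Collect nested <parameter name="..." value="..."/> elements.
            entry["parameters"] = {
                p.get("name"): p.get("value") for p in job.findall("parameter")
            }
            jobs.append(entry)
    return jobs


for job in list_scheduler_jobs(SAMPLE):
    schedule = job.get("cron-trigger") or job.get("period") + " ms"
    print(job["name"], "->", schedule)
```

To check a real installation, pass the contents of your own conf.xml to the function instead of SAMPLE.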

Refreshers

Cache refresh

This eXist scheduled job is strongly RECOMMENDED to be configured.

WHAT IT DOES

This adds a scheduled job that refreshes the ART-DECOR Cloud Cache every 6 hours. If a refresh fails, the job is unscheduled.

The scheduled job has two parameters that should be fixed for this refresh action.

  • topic shall be valued cache
  • format shall be valued decor

Other parameter combinations are described elsewhere.

To activate this scheduled job, add the following lines to the ART-DECOR Release 3 Jobs part in the configuration file. More information about using cron-triggers in scheduled job definitions can be found here.

<!--
    Cache refresh every 6 hours, unschedule if fails
    parameter topic : 'cache'
    parameter format : 'decor'
-->
<job type="user" name="scheduled-refreshs" xquery="/db/apps/api/modules/library/scheduled-refreshs.xql" 
     cron-trigger="0 0 0/6 * * ?" unschedule-on-exception="true">
     <parameter name="topic" value="cache"/>
     <parameter name="format" value="decor"/>
</job>
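To get a feeling for when a cron-trigger such as the one above fires, the hour field can be expanded by hand. The following sketch does that for the simple forms used in this documentation; the helper names expand_cron_field and firing_hours are hypothetical, and only the patterns "*", a single number and START/STEP are handled, not the full Quartz syntax.

```python
def expand_cron_field(field, upper):
    """Expand a simplified cron field into the list of matching values.

    Supports only '*', a single number 'N', and 'START/STEP'
    (values START, START+STEP, ... below `upper`).
    """
    if field == "*":
        return list(range(upper))
    if "/" in field:
        start, step = field.split("/")
        first = 0 if start == "*" else int(start)
        return list(range(first, upper, int(step)))
    return [int(field)]


def firing_hours(cron_trigger):
    """Return the hours of the day at which the trigger fires.

    Assumes the field order seconds, minutes, hours, day-of-month,
    month, day-of-week, as in the examples of this documentation.
    """
    fields = cron_trigger.split()
    return expand_cron_field(fields[2], 24)


print(firing_hours("0 0 0/6 * * ?"))  # [0, 6, 12, 18]
print(firing_hours("0 0 0/4 * * ?"))  # [0, 4, 8, 12, 16, 20]
```

With "0 0 0/6 * * ?" the cache refresh therefore runs at midnight, 6:00, 12:00 and 18:00.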

Notifications

Periodic notifications

This eXist scheduled job is strongly RECOMMENDED to be configured.

WHAT IT DOES

This adds a scheduled job that runs every 10 minutes (600 seconds), checks for periodic notification requests, e.g. notifications on changed issues, compiles them and sends them out.

The scheduled job has two parameters that should be fixed for this action.

  • sendmail shall be valued true or false, where false is only meant for testing; in a true production environment this shall always be set to true.
  • mysender shall be a formally valid email sender address, with XML-escaped characters such as &lt; for <; an example from our main server is ART-DECOR Notifier &lt;reply.not.possible@art-decor.email>, so that users receive their notifications with this sender address.

WARNING

Make sure that the mysender email address for the sender at the server is set up correctly, so that notification emails are not classified as spam elsewhere and fail to reach their recipients.

Typically you use a reply.not.possible address, as you don't expect anybody to send replies to the notifications.

To activate this scheduled job, add the following lines to the ART-DECOR Release 3 Jobs part in the configuration file...

<!--
    Scan/process for periodic notifications every 600 seconds (10 mins)
    e.g. notifications on changed issues and release
    parameter sendmail : true or false
    parameter mysender : a formal valid email sender address, XML escaped chars like &lt;
-->
<job type="user" name="periodic-notifier" xquery="/db/apps/api/modules/library/periodic-notifier.xql"
     period="600000" unschedule-on-exception="true">
     <parameter name="sendmail" value="true"/>
     <parameter name="mysender" value="ART-DECOR Notifier &lt;reply.not.possible@art-decor.email>"/>
</job>
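If you are unsure how to escape a sender address by hand, the Python standard library can produce the attribute value for you. This is a small sketch, not part of ART-DECOR; the helper name parameter_element is hypothetical. Note that quoteattr also escapes > as &gt;, which is equivalent to the unescaped > shown in the example above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def parameter_element(name, value):
    """Build a <parameter/> line with correctly escaped, quoted attributes."""
    return "<parameter name={} value={}/>".format(quoteattr(name), quoteattr(value))


sender = "ART-DECOR Notifier <reply.not.possible@art-decor.email>"
line = parameter_element("mysender", sender)
print(line)

# Parsing the line again yields the original, unescaped address.
print(ET.fromstring(line).get("value") == sender)
```

The printed line can be pasted into the job definition in place of the hand-written parameter element.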

Scheduled notifications

This eXist scheduled job is strongly RECOMMENDED to be configured.

WHAT IT DOES

This adds a scheduled job that runs every 12 minutes (720 seconds), checks for scheduled notification requests, e.g. (new) users to be notified about their username, password (reset) and projects, and sends them out.

The scheduled job has four parameters that should be fixed for this action.

  • sendmail shall be valued true or false, where false is only meant for testing; in a true production environment this shall always be set to true.
  • mysender shall be a formally valid email sender address, with XML-escaped characters such as &lt; for <; an example from our main server is ART-DECOR Notifier &lt;reply.not.possible@art-decor.email>, so that users receive their notifications with this sender address.
  • accounting shall be a formally valid email address, with XML-escaped characters such as &lt; for <, that may receive inquiries or reply messages from users who received a notification; an example from our main server is ART-DECOR Accounts &lt;accounts@art-decor.email>.
  • myserverurl shall be the formally valid URL that represents your server. This URL is included in the notification message, for example to inform users which server they received credentials for.

WARNING

Make sure that the mysender email address for the sender at the server is set up correctly, so that notification emails are not classified as spam elsewhere and fail to reach their recipients.

Typically you use a reply.not.possible address, as you don't expect anybody to send replies to the notifications.

Make sure that the myserverurl is a valid URL pointing to your server.

To activate this scheduled job, add the following lines to the ART-DECOR Release 3 Jobs part in the configuration file...

<!--
    Scan/process for scheduled notifications every 720 seconds (12 mins)
    e.g. (new) users to be notified about the username, password (reset) and his/her projects
    parameter sendmail : true or false
    parameter mysender : a formal valid email sender address
    parameter accounting : a valid email address of accountings where to write emails to
    parameter myserverurl : the server url for which this task runs, e.g. https://my-server.org
-->
<job type="user" name="scheduled-notifier" xquery="/db/apps/api/modules/library/scheduled-notifier.xql" 
     period="720000" unschedule-on-exception="true">
     <parameter name="sendmail" value="true"/>
     <parameter name="mysender" value="ART-DECOR Notifier &lt;reply.not.possible@art-decor.email>"/>
     <parameter name="accounting" value="ART-DECOR Accounts &lt;accounts@art-decor.email>"/>
     <parameter name="myserverurl" value="https://develop.art-decor.org"/>
</job>

Backup, Export and Consistency checks

There are a couple of options for backup, export and consistency checks.

Backup of the pure database files

This eXist scheduled job is RECOMMENDED to be configured.

WHAT IT DOES

This adds a scheduled job that runs a backup of the pure database files, zipped, every night at 1:00 am.

This will result in a file like 202211240100007.zip in the eXist database subdirectory /data/backup, containing the files and directories of the data folder, such as the *.dbx files.

The scheduled job has two parameters.

  • output-dir shall be valued backup; leave it as is in production environments as this is the expected location/path.
  • zip-files-max shall be an integer, e.g. 1 to hold a maximum of 1 zip file as backups or 5 for example to hold 5 backups. Older backups will be deleted from directory output-dir.

To activate this scheduled job, add the following lines to the ART-DECOR Release 3 Jobs part in the configuration file. More information about using cron-triggers in scheduled job definitions can be found here.

<!--    
    Run a backup of the pure database files, zipped, every night at 1:00 am
    This will result in a file like 202211240100007.zip containing *.dbx files etc in /data/backup
-->
<job type="system" name="databackup" class="org.exist.storage.DataBackup" 
     cron-trigger="0 0 1 * * ?">
     <parameter name="output-dir" value="backup"/>
     <parameter name="zip-files-max" value="10"/>
</job>

Full Export of the database

This eXist scheduled job is RECOMMENDED to be configured.

WHAT IT DOES

This adds a scheduled job that runs a consistency check and full export of the database every night at 2:00 am.

This will result in a file like full20221101-0201.zip containing the exported files and the check report file in /data/export.

The scheduled job has six parameters.

  • output shall be valued export; leave it as is in production environments as this is the expected location/path.
  • backup shall be valued yes as we want a consistency check and an export.
  • zip shall be valued yes to zip the resulting export.
  • incremental shall be valued no as we want a full export. We recommend doing full nightly exports only.
  • incremental-check shall be valued no. If you want repeated checks and full plus incremental backups, please see the following section.
  • max shall be an integer, e.g. 1 to hold a maximum of 1 zip file as exports or 5 for example to hold 5 exports. Older exports will be deleted from directory output.

WARNING

Due to limitations of the zip format, archives larger than 4 gigabytes may not be readable. Consider setting the zip option to no (see above), which will create a backup on the file system that has no such limitation.

To activate this scheduled job, add the following lines to the ART-DECOR Release 3 Jobs part in the configuration file. More information about using cron-triggers in scheduled job definitions can be found here.

<!--    
    Run a consistency check and export of the database every night at 2:00 am
    This will result in a file like full20221101-0200.zip containing the exported
    files and the check report file in /data/export
-->
<job type="system" name="check-backup" class="org.exist.storage.ConsistencyCheckTask" 
     cron-trigger="0 0 2 * * ?">
     <parameter name="output" value="export"/>
     <parameter name="backup" value="yes"/>
     <parameter name="zip" value="yes"/>
     <parameter name="incremental" value="no"/>
     <parameter name="incremental-check" value="no"/>
     <parameter name="max" value="4"/>
</job>

NOTE

You should use either a Full Export of the database or Full and Incremental Exports of the database as described here, not both.

Full and Incremental Exports of the database

This eXist scheduled job is OPTIONAL to be configured. If you have a heavily used server with lots of users, this option might be considered.

WHAT IT DOES

This adds a scheduled job that runs a consistency check and full export every night at 2:00 am, followed by incremental exports of the database every 2 hours.

This will result in a file like full20221101-0201.zip for the first full backup and files like inc20221101-1500.zip for the incremental ones, containing the exported files, plus the check report files in /data/export.
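When monitoring the export directory, it can help to tell full and incremental archives apart programmatically. The sketch below is only an illustration: the naming pattern (full or inc, followed by a date, a hyphen and a time) is inferred from the two example file names above and is an assumption, and the helper name parse_export_name is hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from the example names full20221101-0201.zip and
# inc20221101-1500.zip; an assumption, not a documented eXist guarantee.
EXPORT_NAME = re.compile(r"^(full|inc)(\d{8}-\d{4})\.zip$")


def parse_export_name(filename):
    """Return (kind, timestamp) for an export archive name, or None."""
    match = EXPORT_NAME.match(filename)
    if match is None:
        return None
    kind, stamp = match.groups()
    return kind, datetime.strptime(stamp, "%Y%m%d-%H%M")


names = ["full20221101-0201.zip", "inc20221101-1500.zip", "report.txt"]
for name in names:
    print(name, "->", parse_export_name(name))
```

Names that do not follow the assumed pattern, such as report files, yield None and can simply be skipped.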

The scheduled job has six parameters.

  • output shall be valued export; leave it as is in production environments as this is the expected location/path.
  • backup shall be valued yes as we want a consistency check and an export.
  • zip shall be valued yes to zip the resulting export.
  • incremental shall be valued yes as we want additional incremental exports. We recommend doing full nightly exports only. The first backup will always be a full backup. Subsequent backups will be incremental: only resources that were modified since the last backup will be saved.
  • incremental-check shall be valued no as incremental backups should not do consistency checks because this may take too long.
  • max: on incremental backup, create a full backup every max backup runs. For example, if you set the parameter to 2, a full backup will be performed after every two incremental backups. For our setting we recommend setting max to 12.

WARNING

Due to limitations of the zip format, archives larger than 4 gigabytes may not be readable. Consider setting the zip option to no (see above), which will create a backup on the file system that has no such limitation.

To activate this scheduled job, add the following lines to the ART-DECOR Release 3 Jobs part in the configuration file. More information about using cron-triggers in scheduled job definitions can be found here.

<!--    
    Run a consistency check and full export of the database every night at 2:00 am
    and incremental export every 2 hours.
    This will result in a file like full20221101-0200.zip containing the exported
    files and inc20221101-0400.zip for the incrementals
    plus all the check report files in /data/export
-->
<job type="system" name="check-backup" class="org.exist.storage.ConsistencyCheckTask" 
     cron-trigger="0 0 2/2 * * ?">
     <parameter name="output" value="export"/>
     <parameter name="backup" value="yes"/>
     <parameter name="zip" value="yes"/>
     <parameter name="incremental" value="yes"/>
     <parameter name="incremental-check" value="no"/>
     <parameter name="max" value="12"/>
</job>

NOTE

You should use either a Full Export of the database or Full and Incremental Exports of the database as described here, not both.

Consistency check only

This eXist scheduled job is RECOMMENDED to be configured.

WHAT IT DOES

This adds a scheduled job that runs a consistency check only, every 3 hours. Typically a consistency check takes only a few seconds, even for larger databases.

This will result in a check report file in /data/export; a backup is started only if inconsistencies are found.

The scheduled job has three parameters.

  • output shall be valued export; leave it as is in production environments as this is the expected location/path.
  • backup shall be valued no as we require an export only when the consistency check failed.
  • zip shall be valued yes to zip the resulting export.

To activate this scheduled job, add the following lines to the ART-DECOR Release 3 Jobs part in the configuration file. More information about using cron-triggers in scheduled job definitions can be found here.

<!--    
    Run a consistency check only every 3 hours
    This will result in a check report file in /data/export, a backup 
    is started only if inconsistencies are found
-->
<job type="system" name="check" class="org.exist.storage.ConsistencyCheckTask" 
     cron-trigger="0 0 0/3 * * ?">
     <parameter name="output" value="export"/>
     <parameter name="backup" value="no"/>
     <parameter name="zip" value="yes"/>
</job>

Restart the database

After walking through the sections above, stop the database service...

systemctl stop eXist-db.service

...and then immediately start the system again to reload your new database server configuration.

systemctl start eXist-db.service
...
systemctl status eXist-db.service

This concludes the database configuration.

Contributors: dr Kai U. Heitmann