Configuration

Configuration is done via a TOML file located at .erddaputil.toml (using the above Docker configuration). Use the .erddaputil.example.toml file as a starting point. Common settings are listed below; use the usual TOML table syntax for nested keys to specify them, e.g.:

[erddaputil]
secret_key = "foo"

[erddaputil.erddap]
big_parent_directory = "/erddap_data"

Environment variables can also be used by replacing all periods with underscores in the keys below. At the moment, this does not work for configuration options that are lists, such as erddaputil.webapp.peppers.
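As a minimal sketch of the key-to-variable mapping described above: the helper name env_var_name is hypothetical, and since the text does not say whether the variable name must be upper-cased, case is left unchanged here.

```python
def env_var_name(config_key: str) -> str:
    """Map a dotted configuration key to an environment variable name.

    Assumption: only the periods are replaced; letter case is untouched.
    """
    return config_key.replace(".", "_")


print(env_var_name("erddaputil.erddap.base_url"))  # erddaputil_erddap_base_url
```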

ERDDAPUtil Core

erddaputil.compile_on_boot
Type: bool
Required: False
Default: true

By default, ERDDAPUtil will attempt to compile the datasets.xml file on boot. Set this to false to prevent this behaviour.

erddaputil.create_default_user_on_boot
Type: bool
Required: False
Default: true

By default, ERDDAPUtil will attempt to create a user on boot using erddaputil.default_username and erddaputil.default_password, or reset the password of that user if it already exists. Set this to false to prevent this behaviour.

erddaputil.default_username
Type: str
Required: False
Default: admin

The username to create on boot, if erddaputil.create_default_user_on_boot is set.

erddaputil.default_password
Type: str
Required: False
Default: admin

The password to set for the user created on boot, if erddaputil.create_default_user_on_boot is set.

erddaputil.fix_erddap_bpd_permissions
Type: bool
Required: False
Default: true

ERDDAP’s big parent directory (BPD) requires that the Tomcat user be able to create and manage the directories within it. When a Docker volume is used, the permissions are typically not correct. ERDDAPUtil therefore attempts to correct this on boot by running os.chown() on the directory (Linux only) so that ERDDAP is usable. Set this to false to prevent this behaviour. The Tomcat user ID and group ID can be specified via erddaputil.tomcat.uid and erddaputil.tomcat.gid.
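The following is a rough illustration of what an ownership fix with os.chown() can look like, not ERDDAPUtil's actual code. The function name fix_permissions, the recursive walk, and the injectable chown argument (used so the sketch can be tried without root) are all assumptions.

```python
import os


def fix_permissions(big_parent_directory, uid=1000, gid=1000, chown=os.chown):
    """Give uid:gid ownership of a directory tree and return the paths touched.

    Assumption: the whole tree is walked; the real tool may do less.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(big_parent_directory):
        # os.walk yields the root first, then every sub-directory
        chown(dirpath, uid, gid)
        changed.append(dirpath)
        for name in filenames:
            path = os.path.join(dirpath, name)
            chown(path, uid, gid)
            changed.append(path)
    return changed
```

Calling fix_permissions("/erddap_data") with the default os.chown requires sufficient privileges and only works on platforms where os.chown is available.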

erddaputil.metrics_manager
Type: str
Required: False

Leave this blank to disable the metrics backend, or specify a class that provides the methods send_message(metric: _Metric), start(), terminate(), and join(). send_message() is called every time a metric needs to be updated, start() is called when the metrics are first loaded, and terminate() and join() are called in that order when shutting down. Since the metrics manager is typically a separate thread, terminate() should instruct the thread to exit gracefully, and start() and join() should be inherited from threading.Thread.

ERDDAPUtil provides erddaputil.main.metrics.LocalPrometheusSendThread which uses the HTTP API’s Prometheus metrics.
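For illustration, a custom metrics manager meeting the interface described above might look like the sketch below. The class name ListMetricsManager is hypothetical, the no-argument constructor is an assumption (the text does not say how the class is instantiated), and metrics are simply collected in a list rather than sent anywhere.

```python
import queue
import threading


class ListMetricsManager(threading.Thread):
    """Collects metrics in memory; start() and join() come from Thread."""

    def __init__(self):
        super().__init__(daemon=True)
        self._queue = queue.Queue()
        self._halt = threading.Event()
        self.received = []

    def send_message(self, metric):
        # Called whenever a metric needs to be updated
        self._queue.put(metric)

    def terminate(self):
        # Ask the thread to exit once the queue has been drained
        self._halt.set()

    def run(self):
        while not (self._halt.is_set() and self._queue.empty()):
            try:
                metric = self._queue.get(timeout=0.05)
            except queue.Empty:
                continue
            self.received.append(metric)
```

Draining the queue before exiting is a design choice of this sketch, so that metrics sent just before shutdown are not lost.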

erddaputil.secret_key
Type: str
Required: True

This should be a secret shared by all servers that use the same AMQP exchange or daemon. It is used to validate that the messages passed are not malicious and should have at least 192 bits of randomness.
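One way to produce a value with 192 bits of randomness is Python's secrets module; this is a suggestion rather than a documented ERDDAPUtil procedure, and the helper name generate_secret_key is hypothetical.

```python
import secrets


def generate_secret_key(bits=192):
    """Return a URL-safe random string built from at least `bits` random bits."""
    return secrets.token_urlsafe((bits + 7) // 8)


print(generate_secret_key())
```

The printed value can be pasted into secret_key in the TOML file on every server that shares the exchange or daemon.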

erddaputil.show_config
Type: bool
Required: False
Default: false

Set to true to dump the configuration to stdout on boot. This is useful for debugging.

erddaputil.use_ampq_exchange
Type: bool
Required: False
Default: false

Set this to true to use the AMQP features.

erddaputil.use_local_daemon
Type: bool
Required: False
Default: true

Set this to false if you only want to send messages to AMQP from the CLI or HTTP API.

ERDDAP Configuration

erddaputil.erddap.base_url
Type: str
Required: False

The base URL for ERDDAP (e.g. http://localhost:8080/erddap). Note that, unlike ERDDAP, this should include the /erddap path.

erddaputil.erddap.big_parent_directory
Type: path
Required: False

Set to the same value as ERDDAP’s bigParentDirectory configuration value.

erddaputil.erddap.datasets_d
Type: path
Required: False

Set to the directory containing XML files with dataset definitions in them. These should be identical to the ones created for ERDDAP, but each in their own XML file. Each XML file should contain exactly one <dataset> tag as the root-level element. While ERDDAP requires datasets use ISO-8859-1 encoding, these datasets can use any encoding as long as it is declared and compatible with ISO-8859-1 (illegal characters will be replaced).

erddaputil.erddap.datasets_xml_template
Type: path
Required: False

By default, ERDDAPUtil will use an empty <erddapDatasets> tag to generate datasets.xml. If you want to supply your own template, provide it here. ERDDAPUtil will only modify it by (a) adding all of the datasets found in datasets.d and (b) updating the block and allow lists. Your template file may use a different character encoding as long as it is ISO-8859-1 compatible.

erddaputil.erddap.ip_block_list
Type: path
Required: False

A path to a text file of IP addresses, ranges, or subnets to block requests from (one entry per line). Defaults to {BIG_PARENT_DIRECTORY}/.ip_block_list.txt

erddaputil.erddap.subscription_block_list
Type: path
Required: False

A path to a text file of emails to block subscriptions for (one email per line). Defaults to {BIG_PARENT_DIRECTORY}/.email_block_list.txt

erddaputil.erddap.unlimited_allow_list
Type: path
Required: False

A path to a text file of IP addresses, ranges, or subnets to allow unlimited access to (one entry per line). Defaults to {BIG_PARENT_DIRECTORY}/.unlimited_allow_list.txt

Tomcat Configuration

erddaputil.tomcat.log_directory
Type: str
Required: False

The directory where Tomcat’s access logs are written to. If omitted, Tomcat log parsing and management will be disabled.

erddaputil.tomcat.log_prefix
Type: str
Required: False
Default: access_log

The prefix for Tomcat’s access log files (should match the setting in AccessLogValve).

erddaputil.tomcat.log_suffix
Type: str
Required: False

The suffix for Tomcat’s access log files (should match the setting in AccessLogValve).

erddaputil.tomcat.log_pattern
Type: str
Required: False
Default: common

The pattern for Tomcat’s access log files (should match the setting in AccessLogValve).

erddaputil.tomcat.log_encoding
Type: str
Required: False
Default: utf-8

The encoding for Tomcat’s access log files (should match the setting in AccessLogValve).

erddaputil.tomcat.major_version
Type: int
Required: False
Default: 10

The major version of Tomcat in use. The exact value matters less than which side of 10 it falls on: set it to 10 or higher for Tomcat 10+ and to a value below 10 for older versions, as this affects how some log strings are parsed.

erddaputil.tomcat.gid
Type: int
Required: False
Default: 1000

The group ID that Tomcat runs as. This is used only if erddaputil.fix_erddap_bpd_permissions is set to true.

erddaputil.tomcat.uid
Type: int
Required: False
Default: 1000

The user ID that Tomcat runs as. This is used only if erddaputil.fix_erddap_bpd_permissions is set to true.

Dataset Management

erddaputil.dataset_manager.backups
Type: path
Required: False

If specified, whenever a new datasets.xml file is generated, the old one will be backed up into this folder. Backups are cleaned up according to the retention setting below.

erddaputil.dataset_manager.backup_retention_days
Type: int
Required: False
Default: 31

Backups of datasets.xml are deleted after the given number of days.

erddaputil.dataset_manager.max_delay_seconds
Type: float
Required: False
Default: 0

ERDDAPUtil delays briefly before reloading a dataset in case another similar request comes in (e.g. if your automation pipeline is pushing dozens of requests at once). This setting controls the longest time ERDDAPUtil will wait after the last reload request for a given dataset before executing it. Set to 0 to execute every reload request immediately.

erddaputil.dataset_manager.max_pending
Type: int
Required: False
Default: 0

ERDDAPUtil delays briefly before performing a reload of a dataset, in case another similar request comes in (e.g. if your automation pipeline is pushing dozens of requests at once). This setting allows you to control the maximum number of datasets pending reload; once the threshold is exceeded, the oldest request is executed immediately. Set to 0 to ignore the threshold.
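To make the interaction between max_delay_seconds and max_pending concrete, here is a simplified model of a delayed reload queue. It is only a sketch of the behaviour described above, not ERDDAPUtil's implementation: the class PendingReloads and its methods are hypothetical, and it assumes that a repeated request for the same dataset restarts that dataset's wait.

```python
from collections import OrderedDict


class PendingReloads:
    """Toy model of delayed dataset reloads, ordered by last request time."""

    def __init__(self, max_delay_seconds=0.0, max_pending=0):
        self.max_delay_seconds = max_delay_seconds
        self.max_pending = max_pending
        self._pending = OrderedDict()  # dataset_id -> time of last request

    def request(self, dataset_id, now):
        """Record a request; return dataset IDs that must be reloaded now."""
        if self.max_delay_seconds <= 0:
            return [dataset_id]  # 0 means execute immediately
        self._pending.pop(dataset_id, None)
        self._pending[dataset_id] = now
        flushed = []
        # max_pending == 0 disables the threshold
        while self.max_pending > 0 and len(self._pending) > self.max_pending:
            oldest, _ = self._pending.popitem(last=False)
            flushed.append(oldest)
        return flushed

    def due(self, now):
        """Return and remove datasets whose delay has elapsed."""
        ready = [d for d, t in self._pending.items()
                 if now - t >= self.max_delay_seconds]
        for dataset_id in ready:
            del self._pending[dataset_id]
        return ready
```

With max_delay_seconds=10 and max_pending=2, a third distinct dataset request pushes the oldest pending dataset out for immediate reload, while the other two wait until 10 seconds after their last request.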

erddaputil.dataset_manager.max_recompile_delay
Type: float
Required: False
Default: 0

Similar to how dataset reloads are delayed, recompilation can also be delayed for similar reasons. ERDDAPUtil will wait until this many seconds have elapsed since the last request for recompilation before actually performing the recompilation. Set to 0 to always recompile immediately when requested.

erddaputil.dataset_manager.skip_misconfigured_datasets
Type: bool
Required: False
Default: true

When recompiling datasets, users may instruct ERDDAPUtil to skip datasets that are not well-formed XML, to raise an error and fail when such a dataset is found, or to fall back on this setting. Set it to true (the default) to skip such datasets, which omits them from datasets.xml, or to false to raise an error, in which case ERDDAPUtil will not update datasets.xml until the file is fixed. Either way, failed datasets are logged by ERDDAPUtil so they can be remedied.
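The skip-or-fail choice can be pictured with the simplified compile step below, which merges one-dataset-per-file XML fragments from a datasets.d directory into an empty erddapDatasets element. This is an illustrative sketch using only the standard library; the function name compile_datasets and its return value are assumptions, and the real tool also handles templates and block/allow lists.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def compile_datasets(datasets_d, skip_misconfigured=True):
    """Merge <dataset> files into one document; return (xml_bytes, skipped)."""
    root = ET.Element("erddapDatasets")
    skipped = []
    for path in sorted(Path(datasets_d).glob("*.xml")):
        try:
            dataset = ET.parse(path).getroot()
            if dataset.tag != "dataset":
                raise ValueError(f"{path.name}: root element is not <dataset>")
        except (ET.ParseError, ValueError):
            if not skip_misconfigured:
                raise  # fail the whole compile, leaving datasets.xml untouched
            skipped.append(path.name)  # omit this file and carry on
            continue
        root.append(dataset)
    xml_bytes = ET.tostring(root, encoding="ISO-8859-1", xml_declaration=True)
    return xml_bytes, skipped
```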

Log Management

erddaputil.logman.enabled
Type: bool
Required: False
Default: true

Set to false to disable log management.

erddaputil.logman.file_prefixes
Type: list
Required: False
Default: logArchivedAt, logPreviousArchivedAt, emailLog

A list of files to remove by prefix. Includes all of ERDDAP’s log files by default.

erddaputil.logman.retention_days
Type: int
Required: False
Default: 31

Days to keep ERDDAP log files (i.e. files in {BIG_PARENT_DIRECTORY}/logs) before removing them.
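A retention sweep of this kind can be sketched as follows. The function remove_old_logs is hypothetical, and using the file modification time to decide age is an assumption; the text only says files are removed after the given number of days.

```python
import time
from pathlib import Path


def remove_old_logs(log_dir, file_prefixes, retention_days=31, now=None):
    """Delete files in log_dir that match a prefix and are older than the
    retention period; return the names removed."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400
    prefixes = tuple(file_prefixes)
    removed = []
    for path in Path(log_dir).iterdir():
        if (path.is_file() and path.name.startswith(prefixes)
                and path.stat().st_mtime < cutoff):
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```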

erddaputil.logman.sleep_time_seconds
Type: float
Required: False
Default: 3600

Number of seconds to wait between log clean-up jobs.

erddaputil.logman.include_tomcat
Type: bool
Default: false

Whether to clean up Tomcat log files.

erddaputil.logman.include_tomtail
Type: bool
Default: true

Whether to clean up tomtail output files.

erddaputil.logman.include_erddap
Type: bool
Default: true

Whether to clean up ERDDAP log files.

Tomcat Log Parsing

erddaputil.logman.enabled
Type: bool
Required: False
Default: true

Set to false to disable log parsing.

erddaputil.logman.sleep_time_seconds
Type: float
Required: False
Default: 3600

Number of seconds to wait between checking the log files.

erddaputil.logman.memory_file
Type: str
Required: False
Default: ./.tomtail.mem

File to save information about the Tomcat logs to.

erddaputil.logman.output_directory
Type: str
Required: False

Set to a directory to write output files to.

erddaputil.logman.output_file_pattern
Type: str
Required: False
Default: erddap_access_log_%Y%m%d.log

Used as a parameter to strftime to format the name of the output file. The output file is rotated whenever the formatted name changes.
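As a quick illustration with the default pattern, formatting two timestamps on either side of midnight yields two different file names, which is what triggers a rotation. The helper output_file_name is hypothetical.

```python
from datetime import datetime


def output_file_name(moment, pattern="erddap_access_log_%Y%m%d.log"):
    """Format the output file name for a given moment in time."""
    return moment.strftime(pattern)


print(output_file_name(datetime(2024, 1, 5, 23, 59)))  # erddap_access_log_20240105.log
print(output_file_name(datetime(2024, 1, 6, 0, 0)))    # erddap_access_log_20240106.log
```

A pattern such as erddap_access_log_%Y%m.log would therefore rotate monthly instead of daily.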

erddaputil.logman.output_pattern
Type: str
Required: False
Default: %dataset_id %request_type %s %b %T "%U%q"

Used to format the output string for the output files. All placeholders below will return “-” if not available in the logs.

Output Pattern Placeholders

Placeholder            Value
%a                     Remote IP (see Tomcat docs)
%U                     Request URI (see Tomcat docs)
%T                     Request processing time in seconds (see Tomcat docs)
%s                     Request status (see Tomcat docs)
%r                     Request first line (see Tomcat docs)
%q                     Query string (see Tomcat docs)
%m                     Request method (see Tomcat docs)
%h                     Host (see Tomcat docs)
%b                     Bytes (see Tomcat docs for %B)
%(request_type)s       Either web (default), data (for data downloads), or metadata (metadata downloads)
%(dataset_id)s         The ID of the dataset or - if not detected
%(dap_variables)s      A semicolon-delimited list of DAP variable names included in the request
%(dap_constraints)s    An ampersand-delimited list of DAP constraints included in the request
%(dap_grid_bounds)s    A semicolon-delimited list of bounds on a griddap request for ERDDAP

AMQP Integration

erddaputil.ampq.cluster_name
Type: str
Required: False

If you are using AMQP, this should be a unique value for each set of ERDDAP machines that should all respond to the same commands.

erddaputil.ampq.connection
Type: str
Required: False

Either a string for pika.connection.URLParameters (for pika integration) or the connection string (for Azure Service Bus).

erddaputil.ampq.create_queue
Type: bool
Required: False
Default: true

If set to false, prevents ERDDAPUtil from automatically trying to create and bind the queue or create the subscription/rules.

erddaputil.ampq.exchange_name
Type: str
Required: False
Default: erddap_cnc

The RabbitMQ exchange name or the Azure Service Bus topic name.

erddaputil.ampq.hostname
Type: str
Required: False

If you are using AMQP, this should be a unique value for each machine. Defaults to socket.gethostname().

erddaputil.ampq.implementation
Type: str
Required: False
Default: pika

Set to pika or azure_service_bus depending on which client library to use.

Web API

erddaputil.webapp.enable_management_api
Type: bool
Required: False
Default: true

Set to false to disable the management API.

erddaputil.webapp.enable_metrics_collector
Type: bool
Required: False
Default: true

Set to false to disable the metrics collector (this acts like ERDDAPUtil’s own pushgateway).

erddaputil.webapp.iterations_jitter
Type: int
Required: False
Default: 100000

ERDDAPUtil uses hashlib.pbkdf2_hmac() to hash and store user passwords, using a unique salt and number of iterations for each user. The number of iterations will be up to this value higher than erddaputil.webapp.min_iterations, chosen at random.

erddaputil.webapp.min_iterations
Type: int
Required: False
Default: 700000

ERDDAPUtil uses hashlib.pbkdf2_hmac() to hash and store user passwords, using a unique salt and number of iterations for each user. The number of iterations will be at least this many.

erddaputil.webapp.password_file
Type: path
Required: False

Set to the path of a file where passwords for the web API will be stored.

erddaputil.webapp.password_hash
Type: str
Required: False
Default: sha256

Set to the name of a hash function supported by hashlib.

erddaputil.webapp.peppers
Type: list
Required: True

Set to a list of random strings that are hard to guess. The first one will be used to create new passwords and they will all be tried when validating a user’s password.

erddaputil.webapp.salt_length
Type: int
Required: False
Default: 16

The length of the salt for new passwords (in bytes). Salts are generated using secrets.token_urlsafe().
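The password settings in this section fit together roughly as in the sketch below, which uses hashlib.pbkdf2_hmac() and secrets.token_urlsafe() as named above. The function names, the return format, and the way the pepper is combined with the password (simple prefixing) are assumptions made for illustration only.

```python
import hashlib
import hmac
import secrets


def hash_password(password, pepper, min_iterations=700000,
                  iterations_jitter=100000, salt_length=16, hash_name="sha256"):
    """Hash a password with a fresh salt and a randomised iteration count."""
    salt = secrets.token_urlsafe(salt_length)
    # Between min_iterations and min_iterations + iterations_jitter
    iterations = min_iterations + secrets.randbelow(iterations_jitter + 1)
    digest = hashlib.pbkdf2_hmac(hash_name, (pepper + password).encode("utf-8"),
                                 salt.encode("utf-8"), iterations)
    return salt, iterations, digest.hex()


def verify_password(password, salt, iterations, expected_hex, peppers,
                    hash_name="sha256"):
    """Try every configured pepper; the first one is used for new hashes."""
    for pepper in peppers:
        digest = hashlib.pbkdf2_hmac(hash_name, (pepper + password).encode("utf-8"),
                                     salt.encode("utf-8"), iterations)
        if hmac.compare_digest(digest.hex(), expected_hex):
            return True
    return False
```

Trying all peppers on verification is what allows a new pepper to be placed at the front of the list without invalidating passwords hashed with an older one.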

Metrics Manager - LocalPrometheus

erddaputil.localprom.host
Type: str
Required: False
Default: localhost

The host to push statistics to.

erddaputil.localprom.port
Type: int
Required: False
Default: 7193

The port to push statistics to.

erddaputil.localprom.username
Type: str
Required: False

The username to log in to the web API with.

erddaputil.localprom.password
Type: str
Required: False

The password to log in to the web API with.

erddaputil.localprom.batch_size
Type: int
Required: False
Default: 200

The maximum number of metric updates to send in one batch to the web API.

erddaputil.localprom.batch_wait_seconds
Type: float
Required: False
Default: 2

The maximum amount of time to delay sending metrics while waiting for a whole batch.

erddaputil.localprom.max_retries
Type: int
Required: False
Default: 3

The maximum number of times to retry sending a batch to the web API before discarding it. Set to -1 to retry forever or 0 to only try once. When the daemon is being shut down, this is overridden to not retry at all.
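The retry semantics (-1 retries forever, 0 tries once, n allows n retries after the first attempt) can be sketched like this. The function send_with_retries and its send and sleep arguments are hypothetical and stand in for whatever actually delivers a batch.

```python
import time


def send_with_retries(send, batch, max_retries=3, retry_delay_seconds=2.0,
                      sleep=time.sleep):
    """Call send(batch) until it succeeds or the retry budget is used up.

    Returns True on success, False if the batch was discarded.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            send(batch)
            return True
        except Exception:
            # max_retries < 0 means retry forever
            if max_retries >= 0 and attempts > max_retries:
                return False
            sleep(retry_delay_seconds)
```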

erddaputil.localprom.max_tasks
Type: int
Required: False
Default: 5

The maximum number of batches that will be handled at the same time. Metrics wait in a queue while not being handled.

erddaputil.localprom.retry_delay_seconds
Type: float
Required: False
Default: 2

The delay between retries to send metrics.

Status Scraper

erddaputil.status_scraper.enabled
Type: bool
Required: False
Default: true

Set to false to disable the scraping of status.html.

erddaputil.status_scraper.memory_path
Type: path
Required: False
Default: ./.status_scrape.mem

Set the path of a file where information about the last scrape of status.html is stored.

erddaputil.status_scraper.sleep_time_seconds
Type: float
Required: False
Default: 300

The time to wait between scrapes.

erddaputil.status_scraper.start_delay_seconds
Type: float
Required: False
Default: 180.0

The time to wait after startup before starting to scrape to give ERDDAP time to boot.

Daemon Service

erddaputil.daemon.host
Type: str
Required: False
Default: 127.0.0.1

The host the ERDDAPUtil HTTP, CLI, and AMQP APIs will send messages to.

erddaputil.daemon.port
Type: int
Required: False
Default: 9172

The port the ERDDAPUtil HTTP, CLI, and AMQP APIs will send messages to.

erddaputil.service.host
Type: str
Required: False
Default: 127.0.0.1

The IP address the ERDDAPUtil service will listen on.

erddaputil.service.port
Type: int
Required: False
Default: 9172

The port the ERDDAPUtil service will listen on.

erddaputil.service.backlog
Type: int
Required: False
Default: 20

The backlog of TCP connections that the daemon server will hold.

erddaputil.service.listen_block_seconds
Type: float
Required: False
Default: 0.25

The time to block while waiting for a new connection. Tidying jobs will be run approximately this often.
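The service settings map naturally onto a TCP listener with an accept timeout, as in the sketch below. It only illustrates how host, port, backlog, and listen_block_seconds relate to standard socket calls; the function names and the tidy callback are hypothetical, and the real daemon's loop may differ.

```python
import socket


def open_listener(host="127.0.0.1", port=9172, backlog=20,
                  listen_block_seconds=0.25):
    """Create a listening socket whose accept() blocks for a bounded time."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(backlog)
    server.settimeout(listen_block_seconds)
    return server


def poll_once(server, tidy):
    """Wait for one connection (or time out), then run the tidy-up callback."""
    try:
        conn, _addr = server.accept()
    except socket.timeout:
        conn = None  # nobody connected within listen_block_seconds
    tidy()
    return conn
```

Because tidy() runs after every accept attempt, a smaller listen_block_seconds makes housekeeping run more often at the cost of more wake-ups.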