Connectors¶

Lava provides a mechanism to assist with connections to external resources to minimise the need for individual jobs to manage connectivity details and credentials. This also simplifies the process of migrating jobs from one environment or lava realm to another.

Configuration information for connection handlers (aka connectors) is stored in the connections table. The required fields are dependent on the connector type.

Note

Loosely speaking, a connector is a handler for a specific type of resource and a connection is an instance of a connector for a specific instance of a resource.

It probably doesn't matter that much and, historically, the user guide has played fast and loose with this distinction.

The underlying implementation of connectors is specific to the type of target resource and the job type. Lava attempts to provide a connection handle to jobs in a form that is relatively native to the job type. For example, a database connection for an sql job is provided as a Python DBAPI 2.0 connection instance. For an exe or pkg job, it is provided as a command line wrapper that handles credential management and connectivity behind the scenes.

Connector credentials are typically stored in the AWS SSM Parameter Store to provide a level of isolation and security. SSM parameters for a given realm should be stored with parameter names starting with /lava/<REALM>/.

Connectors are implemented using a simple plugin architecture and new ones can be added relatively easily.

Database Connectors¶

Lava provides database connectors for a number of common database types, including MySQL, Postgres and Oracle.

If used with sql, sqlc, sqli, sqlv, db_from_s3 and redshift_unload jobs, lava manages the connection process in the background.

If used with exe, pkg and docker jobs, lava provides an environment variable pointing to a script that will connect to the database to run SQL. The executable in the job payload can run the script to access the database without worrying about managing database connectivity.

Python programs in job payloads can access the lava connector subsystem directly to obtain either a DBAPI 2.0 connection object or an SQLAlchemy engine object. Refer to Developing Lava Jobs for more information.

Database Authentication Using AWS SSM Parameter Store¶

The database connectors typically require a number of connection and authentication parameters to be specified, such as:

host name
port
user name
password.

These can be defined explicitly in the connection specification, except for the password. By default, the value of this field is interpreted as the name of an encrypted SSM parameter that contains the actual password.

The standard lava worker IAM policies will provide read access to SSM parameters with names of the form /lava/<REALM>/*. These must be encrypted with the realm KMS key lava-<REALM>-sys.
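The naming rule above can be checked before a connection specification is deployed. The following is a minimal sketch only; the helper name `is_realm_parameter` is hypothetical and is not part of lava.

```python
def is_realm_parameter(name: str, realm: str) -> bool:
    """Return True if an SSM parameter name sits under /lava/<REALM>/."""
    prefix = f"/lava/{realm}/"
    # Require something after the prefix so the bare prefix is rejected.
    return name.startswith(prefix) and len(name) > len(prefix)


print(is_realm_parameter("/lava/prod/db/password", "prod"))
print(is_realm_parameter("/lava/dev/db/password", "prod"))
```

A check like this only validates the name. It says nothing about whether the parameter exists or is encrypted with the realm KMS key.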

Database Authentication Using AWS Secrets Manager¶

The lava database connectors support the AWS Secrets Manager as an alternative source for some of the connection specification parameters where they are not provided directly in the specification.

If the connection specification contains a secret_id field, a field in the named secret will be used to populate a missing component in the connector specification.

Note that Secrets Manager and lava use slightly different naming conventions for fields. Lava will map Secrets Manager fields to lava fields automatically using the following translation:

Secrets Manager Field	Lava Field
dbClusterIdentifier	description
dbname	database
host	host
password	password
port	port
serviceName	service_name
sid	sid
username	user

The standard lava worker IAM policies will provide read access to secrets with names of the form /lava/<REALM>/*. These must be encrypted with the realm KMS key lava-<REALM>-sys.
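The field translation in the table above can be sketched as follows. This is an illustration of the described behaviour, not lava's implementation: the function name `fill_from_secret` is hypothetical, the secret is assumed to have already been fetched and decoded into a dictionary, and the sketch ignores the fact that a `password` given directly in a connection specification is normally the name of an SSM parameter rather than the password itself.

```python
# Mapping of Secrets Manager field names to lava field names (from the table).
SECRET_TO_LAVA = {
    "dbClusterIdentifier": "description",
    "dbname": "database",
    "host": "host",
    "password": "password",
    "port": "port",
    "serviceName": "service_name",
    "sid": "sid",
    "username": "user",
}


def fill_from_secret(conn_spec: dict, secret: dict) -> dict:
    """Fill components missing from conn_spec using fields from the secret."""
    merged = dict(conn_spec)
    for secret_field, lava_field in SECRET_TO_LAVA.items():
        # Values given directly in the specification take precedence.
        if secret_field in secret and lava_field not in merged:
            merged[lava_field] = secret[secret_field]
    return merged


spec = {"conn_id": "db1", "host": "db.internal", "secret_id": "/lava/prod/db1"}
secret = {"host": "other.example.com", "username": "app", "port": 5432,
          "dbname": "sales", "password": "example-only"}
resolved = fill_from_secret(spec, secret)
print(resolved)
```

In this example the `host` from the specification is kept, while `user`, `port`, `database` and `password` are populated from the secret.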

Database Authentication Using IAM Credential Generation¶

Some AWS database types provide an IAM based mechanism for obtaining temporary database credentials. Lava supports this mechanism for some connectors. The mechanism will be used where the connection specification (after inclusion of any AWS Secrets Manager components) does not contain a password.

Refer to individual connector details for more information.

Database Client Application Identification¶

Some database types support a mechanism for the client to identify itself when connecting, in addition to the user authentication. This information may then be available in things such as connection logs, activity logs etc. The mechanism used is database dependent and not all databases provide a mechanism.

Lava attempts to provide a uniform interface to the underlying database client identification mechanism where possible.

For most of the built in database related job types, lava will automatically provide a client identifier when connecting. By default, this is in the form lv-<REALM>-<JOB-ID>. (See the CONN_APP_NAME worker configuration parameter.)

Support in sqlc jobs is dependent on the capabilities of the database specific CLI tool used to support the connection. Likewise for executable job types (e.g. exe and pkg) using a CLI based connector. See also Connection Handling for Executable Jobs.

When using the lava API get_pysql_connection(), a new, optional application_name parameter is available. If a value is not provided, a value in the form described above is used if the lava job ID can be determined from the presence of a LAVA_JOB_ID environment variable. This should work whenever the API is being used within a lava job. See also Connection Handling for Python Based Jobs.

In short, in most normal usage patterns for databases for which lava supports client identification, it will, more or less, do the right thing without modifying jobs or additional configuration.
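The default identifier described above can be sketched as below. This is an assumption-laden illustration: the helper `default_app_name` is hypothetical, the realm is assumed to come from the `LAVA_REALM` environment variable, and the real format is governed by the CONN_APP_NAME worker configuration parameter.

```python
import os
from typing import Mapping, Optional


def default_app_name(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Build lv-<REALM>-<JOB-ID> from the environment, or None outside a job."""
    realm = env.get("LAVA_REALM")
    job_id = env.get("LAVA_JOB_ID")
    if not realm or not job_id:
        # Not running inside a lava job, so no default can be derived.
        return None
    return f"lv-{realm}-{job_id}"


print(default_app_name({"LAVA_REALM": "prod", "LAVA_JOB_ID": "daily-load"}))
```

Code running outside a lava job would get `None` here, which corresponds to the cases where an application name must be supplied explicitly.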

Lava's support for a client identification mechanism is summarised in the following table:

Job Type	MS SQL	MySQL	Postgres	Redshift
sql	Yes	Yes	Yes	Yes
sqli	Yes	Yes	Yes	Yes
sqlc			Yes	Yes
sqlv	Yes	Yes	Yes	Yes
db_from_s3	Yes	Yes	Yes	Yes
redshift_unload				Yes
lava-sql CLI	(1)	(1)	(1)	(1)
Lava API	(2)	(2)	(2)	(2)

Notes:

(1) The lava-sql utility will automatically populate a client connection identifier when used as part of a lava job payload. In other usages, the -a / --app-name argument will need to be specified.
(2) The get_pysql_connection() API will automatically populate a client connection identifier when used as part of a lava job payload. In other usages, the otherwise optional application_name parameter will need to be specified.

Note

This article by Andy Grunwald was very helpful when implementing database client identification in lava: "Your database connection deserves a name".

Client Application Identification for Postgres¶

Postgres flavoured databases use the application_name connection parameter to identify client connections. Postgres will truncate the supplied value to 63 characters.

The following sample query will display connected application names.

SELECT usename, application_name, client_addr, backend_type
FROM pg_stat_activity;

Client Application Identification for Redshift¶

Redshift, like Postgres, uses the application_name connection parameter to identify client connections. Redshift allows application names up to 250 characters.

The following sample query can display application names:

SELECT RTRIM(username)         AS user,
       sessionid,
       SUBSTRING(event, 1, 20) AS event,
       recordtime,
       RTRIM(authmethod)       AS auth,
       RTRIM(sslversion)       AS ssl,
       RTRIM(application_name) AS app_name
FROM stl_connection_log
ORDER BY recordtime DESC;
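Because the two databases have different length limits, a long job ID can produce an identifier that appears truncated in Postgres but intact in Redshift. The following sketch uses only the limits quoted above (63 and 250) and counts characters as this guide does; the helper names are hypothetical.

```python
# Length limits for application names, as quoted in this guide.
APP_NAME_LIMITS = {"postgres": 63, "redshift": 250}


def fits_app_name_limit(name: str, db_type: str) -> bool:
    """Return True if the name is within the limit for the database type."""
    return len(name) <= APP_NAME_LIMITS[db_type]


def postgres_visible_name(name: str) -> str:
    """Approximate the name Postgres will report after truncation."""
    return name[: APP_NAME_LIMITS["postgres"]]


long_name = "lv-prod-" + "x" * 100
print(fits_app_name_limit(long_name, "postgres"))
print(fits_app_name_limit(long_name, "redshift"))
print(len(postgres_visible_name(long_name)))
```

When searching connection logs for a specific job, matching on a prefix of the identifier is therefore safer than matching on the full string.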

Client Application Identification for MySQL¶

MySQL uses the program_name connection parameter to identify client connections.

The performance schema must be enabled to run queries that access the program_name parameter. For AWS Aurora instances, see Turning on the Performance Schema for Performance Insights on Aurora MySQL for information on enabling the performance schema.

The following sample query, when run as an admin user, shows currently active connections:

SELECT
 session_connect_attrs.ATTR_VALUE AS program_name,
 processlist.*
FROM information_schema.processlist
LEFT JOIN  performance_schema.session_connect_attrs ON (
 processlist.ID = session_connect_attrs.PROCESSLIST_ID
 AND session_connect_attrs.ATTR_NAME = "program_name"
)

The following query shows active connections for the current user:

SELECT
 session_account_connect_attrs.ATTR_VALUE AS program_name,
 processlist.*
FROM information_schema.processlist
LEFT JOIN  performance_schema.session_account_connect_attrs ON (
 processlist.ID = session_account_connect_attrs.PROCESSLIST_ID
 AND session_account_connect_attrs.ATTR_NAME = "program_name"
);

Client Application Identification for SQL Server (MS SQL)¶

SQL Server uses the program_name connection parameter to identify client connections.

The following sample query, when run as an admin user, shows currently active connections:

SELECT hostname, program_name, loginame, cmd
FROM sys.sysprocesses
WHERE loginame != 'rdsa';

Other Connectors¶

Lava also provides connectors for various other types of resource, including sFTP servers, SharePoint sites, SMB fileshares and the AWS CLI. These are typically used either by a job type that is specific to the target resource or in exe, pkg and docker jobs.

Connector type: aws¶

The aws connector manages access to AWS access keys. It supports static access keys as well as session credentials obtained by assuming an IAM role in either the current AWS account or another account.

Note

IAM assumed role session credentials are new in version 8.1 (Kīlauea).

When used with redshift_unload jobs, this connector provides the access keys that are used in the S3 AUTHORIZATION parameters in the UNLOAD command.

When used with db_from_s3 jobs, this connector provides the access keys that are used to provide the database the required access to S3 to load the data.

When used with exe and pkg jobs, it provides an environment variable pointing to a script that will run the AWS CLI with an appropriate AWS authentication profile.

Field	Type	Required	Description
access_keys	String	Note 1	The name of an encrypted SSM parameter containing the access keys. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...`. The value must be in the format `access_key_id,access_secret_key` and must be a secure string encrypted using the `lava-<REALM>-sys` KMS key.
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
duration	Duration	No	The duration of a session created when an IAM role is assumed. Defaults to the value of the AWS_CONN_DURATION configuration parameter. As AWS credentials are cached, it is critical that this is significantly longer than the cache duration as specified by the AWS_ACCESS_KEY_CACHE_TTL parameter.
enabled	Boolean	Yes	Whether or not the connection is enabled.
external_id	String	No	Name of an SSM parameter containing an external ID to use when assuming a role to obtain session credentials. While AWS does not consider this to be a sensitive security parameter, it is stored in the SSM parameter store for ease of management. It is still recommended to use a secure parameter. Can't hurt.
policy_arns	String \| List[String]	No	The ARNs of IAM managed policies to use as managed session policies. The policies must exist in the same account as the role. The session permissions are the intersection of these policies and the policies of the role being assumed. It is not possible to expand the underlying role permissions.
policy	Map[String,*]	No	An IAM policy to use as an inline session policy. The value must be a fully-formed AWS IAM policy. The session permissions are the intersection of the specified policy and the policies of the role being assumed. It is not possible to expand the underlying role permissions.
region	String	No	The AWS region name. If not specified, the current region is assumed.
role_arn	String	Note 1	The ARN of an IAM role to assume to obtain session credentials.
tags	Map[String,String]	No	A map of session tags to pass. See Tagging Amazon Web Services STS Sessions.
type	String	Yes	`aws`.

Notes:

1. One of access_keys or role_arn must be specified.
2. If a role_arn is specified, the trust policy on the role must allow it to be assumed by the lava worker. If session tags are specified using the tags field, the trust policy must also permit this.
3. When assuming a role, the lava worker will set the role session name. By default, this is in the form lv-<REALM>-<JOB-ID>, cleansed as necessary to satisfy the requirements for session names. (See the CONN_APP_NAME worker configuration parameter.)
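To make the role-related fields more concrete, the sketch below shows how they might be translated into keyword arguments for the STS `AssumeRole` call in boto3. It is not lava's implementation. The function names are hypothetical, the cleansing rule is an assumption based on the STS constraint that session names are 2 to 64 characters drawn from `[\w+=,.@-]`, and the `duration` and `external_id` fields are omitted because they need further resolution.

```python
import json
import re
from typing import Optional, Union

# Characters outside the set permitted in an STS role session name.
_INVALID_SESSION_CHARS = re.compile(r"[^\w+=,.@-]", re.ASCII)


def clean_session_name(name: str) -> str:
    """Replace disallowed characters and enforce the 2..64 length range."""
    cleaned = _INVALID_SESSION_CHARS.sub("-", name)[:64]
    return cleaned.ljust(2, "-")


def assume_role_kwargs(
    role_arn: str,
    session_name: str,
    policy_arns: Union[str, list, None] = None,
    policy: Optional[dict] = None,
    tags: Optional[dict] = None,
) -> dict:
    """Build keyword arguments suitable for sts_client.assume_role(**kwargs)."""
    kwargs = {
        "RoleArn": role_arn,
        "RoleSessionName": clean_session_name(session_name),
    }
    if policy_arns:
        # The connector field accepts a single string or a list of strings.
        if isinstance(policy_arns, str):
            policy_arns = [policy_arns]
        kwargs["PolicyArns"] = [{"arn": arn} for arn in policy_arns]
    if policy:
        # STS expects the inline session policy as a JSON string.
        kwargs["Policy"] = json.dumps(policy)
    if tags:
        kwargs["Tags"] = [{"Key": k, "Value": v} for k, v in tags.items()]
    return kwargs


example = assume_role_kwargs(
    "arn:aws:iam::123456789012:role/example",
    "lv-prod-my job/1",
    policy_arns="arn:aws:iam::123456789012:policy/example",
    tags={"team": "data"},
)
print(example)
```

The resulting dictionary could be passed to `boto3.client("sts").assume_role(**example)`. The session policies can only narrow the permissions of the role, as described in the table above.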

Using the AWS Connector in Shell Scripts¶

The aws connector creates a small shell script that is a wrapper around the AWS CLI that handles the access keys. The shell script is a drop in replacement for the AWS CLI when used in lava jobs.

Consider the following exe job:

{
    "description": "Show usage of aws CLI connector in a shell script",
    "enabled": true,
    "job_id": "aws-cli-example",
    "parameters": {
        "connections": {
            "aws1": "aws-conn-id-1",
            "aws2": "aws-conn-id-2"
        }
    },
    "payload": "example/aws-cli-conn.sh",
    "type": "exe",
    "worker": "core"
}

The connections element in the parameters will result in lava preparing connector shell scripts whose names are placed in the environment variables LAVA_CONN_AWS1 and LAVA_CONN_AWS2 respectively.

The payload (example/aws-cli-conn.sh in this example) can then use these scripts just like the AWS CLI. For example:

#!/bin/bash

$LAVA_CONN_AWS1 sts get-caller-identity

$LAVA_CONN_AWS2 s3 ls

Using the AWS Connector in Python¶

Note

New in version 8.1 (Kīlauea).

Python jobs can call the lava connector subsystem directly via the lava API.

Consider the following exe job:

{
    "description": "Show usage of aws CLI connector in a Python program",
    "enabled": true,
    "job_id": "aws-python-example",
    "parameters": {
        "connections": {
            "aws3": "aws-conn-id-3"
        }
    },
    "payload": "example/aws-cli-conn.py",
    "type": "exe",
    "worker": "core"
}

Once again, lava will create a shell script accessed via the LAVA_CONN_AWS3 environment variable. It will also populate the LAVA_CONNID_AWS3 environment variable with the connection ID. This can be used with the lava connector API to obtain a boto3 Session object, thus:

import os
from lava.connection import get_aws_session

realm = os.environ['LAVA_REALM']

# Note we want the connection ID, not the CLI script here.
conn_id = os.environ['LAVA_CONNID_AWS3']

# Use the lava API to the connection subsystem to obtain a boto3 Session.
aws_session = get_aws_session(conn_id, realm)

sts = aws_session.client('sts')
print(sts.get_caller_identity())
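The two environment variables serve different purposes, which is easy to mix up. The sketch below shows the naming pattern seen in the examples in this section. It assumes the connection label is simply upper-cased; the helper names are hypothetical.

```python
import os
from typing import Mapping, Tuple


def conn_script_var(label: str) -> str:
    """Name of the variable holding the path of the connector script."""
    return f"LAVA_CONN_{label.upper()}"


def conn_id_var(label: str) -> str:
    """Name of the variable holding the lava connection ID."""
    return f"LAVA_CONNID_{label.upper()}"


def lookup_connection(label: str, env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (script path, connection ID) for a label from the job spec."""
    return env[conn_script_var(label)], env[conn_id_var(label)]


# Simulated environment for a job with a connection labelled "aws3".
fake_env = {
    "LAVA_CONN_AWS3": "/path/to/connector-script",
    "LAVA_CONNID_AWS3": "aws-conn-id-3",
}
print(lookup_connection("aws3", fake_env))
```

The script path is what a shell payload executes, while the connection ID is what the Python connector API expects.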

Note

A Python script can use the CLI script as well (e.g. via the subprocess module) but why would you want to?

Connector type: docker¶

The docker connector manages access to a docker daemon and docker registry for use with docker jobs.

Lava supports the following registry options:

AWS ECR
Private docker registries
The standard docker public registry.

Field	Type	Required	Description
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
email	String	No	Email address for registry login.
enabled	Boolean	Yes	Whether or not the connection is enabled.
password	String	No	Name of the SSM parameter containing the password for authenticating to the registry. Required for private docker repositories. Ignored for ECR registries. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key.
registry	String	No	Either the URL for a standard registry or `ecr[:account-id]`. In the latter case, lava will connect to the AWS ECR registry in the specified AWS account or the current account if no `account-id` is specified. If no registry is specified, the default public docker registry is used.
server	String	No	URL for the docker server. If not specified, then the normal docker environment variables are used. Generally, this means using the local docker daemon accessed via the UNIX socket.
timeout	Number	No	Timeout on docker API calls in seconds.
tls	Boolean	No	Use TLS when connecting to the docker server. Default True.
type	String	Yes	`docker`.
user	String	No	User name for authenticating to the registry. Required for private docker repositories. Ignored for ECR registries.
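The three forms of the registry field described in the table can be distinguished as in the following sketch. The `Registry` type and `parse_registry` function are hypothetical and only illustrate the documented syntax.

```python
from typing import NamedTuple, Optional


class Registry(NamedTuple):
    kind: str                  # "ecr", "url" or "default"
    account_id: Optional[str]  # Only meaningful for ECR
    url: Optional[str]         # Only meaningful for a standard registry


def parse_registry(value: Optional[str]) -> Registry:
    """Interpret the registry field of a docker connection specification."""
    if not value:
        # No registry specified: the default public docker registry.
        return Registry("default", None, None)
    if value == "ecr":
        # ECR in the current AWS account.
        return Registry("ecr", None, None)
    if value.startswith("ecr:"):
        # ECR in a specific AWS account.
        return Registry("ecr", value[4:] or None, None)
    return Registry("url", None, value)


print(parse_registry("ecr:123456789012"))
print(parse_registry("ghcr.io"))
print(parse_registry(None))
```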

Accessing External Registries¶

Lava prefers to obtain its docker images from the local AWS ECR. It's safer, simpler and more robust than relying on external registries to provide safe, secure code at run-time, particularly for a production environment.

Tip

If you need to use an external image, copy it to the local AWS ECR and use it from there. The lava job framework will place the built payloads for docker jobs in ECR. A trivial Dockerfile can copy an external image as part of the build process.

If you must do this damn fool thing, lava permits it. There are some considerations:

Private registries (i.e. requiring authentication to access) will require a connection specification as described above, including the registry identifier and credentials. The registry will also be part of the image name as usual.
Public registries, such as Docker Hub and public repositories on GitHub Container Registry (GHCR), can be addressed by a common connection specification containing neither registry, nor credentials. The registry will be part of the image name as usual (except for Docker Hub which is the default registry).
Proxies can be a problem. Lava will not help you here. The docker daemon proxy configuration will need to be handled at the platform level, however that is done.

Examples¶

This is the standard connection specification for the local AWS ECR.

{
    "type": "docker",
    "conn_id": "docker/ecr",
    "description": "Docker ECR connection",
    "enabled": true,
    "registry": "ecr"
}

This connection specification should handle most public registries.

{
    "type": "docker",
    "conn_id": "docker/public",
    "description": "Docker basic connection (covers public repos)",
    "enabled": true
}

This connection specification is for a private registry on the GitHub Container Registry:

{
    "type": "docker",
    "conn_id": "docker/ghcr/xyzzy",
    "description": "GitHub Container Registry for user xyzzy",
    "enabled": true,
    "registry": "ghcr.io",
    "user": "not-used-by-ghcr",
    "password": "/lava/my-realm/ghcr/xyzzy/access-token"
}

Connector type: email¶

The email connector provides a generic interface for an email sending subsystem. It is implemented by one or more actual email handlers. The email subsystem type is selected by the subtype field in the connection specification. Each subtype may have extra field requirements of its own.

Currently supported email handler subtypes are:

ses: AWS Simple Email Service (SES)
smtp: SMTP, including optional TLS support.

Field	Type	Required	Description
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
enabled	Boolean	Yes	Whether or not the connection is enabled.
from	String	No	The email address that is sending the email. While this is not a mandatory field in the connector, there must be a value available at the time an email is sent, either from the job itself, the connection specification or an email handler specific mechanism. It is strongly recommended to include a default value in the connection specification.
reply_to	String or List[String]	No	The default reply-to email address(es) for messages.
subtype	String	No	Specifies the underlying email handler. If not specified, `ses` is assumed, in which case the field requirements for this subtype must be met.
type	String	Yes	`email`.

Subtype: ses¶

The ses subtype uses AWS Simple Email Service to send email.

The following fields are specific to the ses subtype.

Field	Type	Required	Description
configuration_set	String	No	Use the specified SES Configuration Set when sending an email. If not specified, the value specified by the SES_CONFIGURATION_SET realm configuration parameter is used.
from	String	No	The email address that is sending the email. This email address must be either individually verified with Amazon SES, or from a domain that has been verified with Amazon SES. If not specified, the value specified by the SES_FROM realm configuration parameter is used. A value must be specified by one of these mechanisms.
region	String	No	The AWS region name for the SES service. If not specified, the value specified by the SES_REGION realm configuration parameter is used, which itself defaults to `us-east-1`.
subtype	String	No	Either `ses` or missing.
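The sender address can come from several places. The sketch below shows one plausible reading of the precedence for the ses subtype: a value supplied by the job first, then the connection specification, then the SES_FROM realm configuration parameter. The function name and argument shapes are hypothetical.

```python
from typing import Mapping, Optional


def resolve_sender(
    job_value: Optional[str],
    conn_spec: Mapping[str, str],
    realm_config: Mapping[str, str],
) -> str:
    """Return the first sender address available, in precedence order."""
    candidates = (job_value, conn_spec.get("from"), realm_config.get("SES_FROM"))
    for candidate in candidates:
        if candidate:
            return candidate
    # A value must be specified by at least one of these mechanisms.
    raise ValueError("No sender address available")


print(resolve_sender(None, {"from": "conn@example.com"}, {"SES_FROM": "realm@example.com"}))
print(resolve_sender(None, {}, {"SES_FROM": "realm@example.com"}))
```

Setting a default `from` in the connection specification, as recommended above, means the error case is only reached when the specification and realm configuration are both incomplete.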

Subtype: smtp¶

The smtp subtype uses standard SMTP to send email. SMTP over TLS is also supported.

The following fields are specific to the smtp subtype.

Field	Type	Required	Description
host	String	Yes	The SMTP server host DNS name or IP address.
password	String	Sometimes	The name of an encrypted SSM parameter containing the SMTP server password. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key. This field is required if the `user` field is specified.
port	Number	No	The SMTP port number. If not specified, the default is 25 without TLS and 465 with TLS. Note that Gmail requires TLS on port 587.
subtype	String	Yes	`smtp`
tls	Boolean	No	If `true`, use SMTP over TLS. Default is `false`.
user	String	No	SMTP server user name. If specified, the `password` field must also be specified. If not specified, the connection will be unauthenticated.
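The defaults and field dependencies for the smtp subtype can be summarised in code. This is a minimal sketch based only on the table above; the `smtp_settings` helper is hypothetical and does not open a connection.

```python
from typing import Any, Mapping


def smtp_settings(spec: Mapping[str, Any]) -> dict:
    """Validate smtp subtype fields and apply the documented defaults."""
    if not spec.get("host"):
        raise ValueError("host is required for the smtp subtype")
    tls = bool(spec.get("tls", False))
    # Default port is 25 without TLS and 465 with TLS.
    port = spec.get("port") or (465 if tls else 25)
    user = spec.get("user")
    if user and not spec.get("password"):
        raise ValueError("password is required when user is specified")
    return {
        "host": spec["host"],
        "port": port,
        "tls": tls,
        # Without a user, the connection is unauthenticated.
        "authenticated": bool(user),
    }


print(smtp_settings({"host": "smtp.example.com"}))
print(smtp_settings({"host": "smtp.example.com", "tls": True}))
```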

Using the Email Connector¶

The email connector provides two distinct interfaces:

A native Python interface
A command line interface.

Python Interface for Email Connectors¶

Python scripts can directly access the underlying Python interface of an email connector. In this case, the connector returns a lava.lib.email.Emailer object as described in the lava API documentation.

As an example, consider an exe job specification that looks something like this:

{
    "job_id": "...",
    "parameters": {
        "connections": {
            "email": "email-connection-id"
        }
    },
    "payload": "my-payload.py ..."
}

A Python program can use the email connector like this:

import os
from lava.connection import get_email_connection

# If running as a lava exe/pkg/docker, get some info provided by lava in the
# environment. Assume our connector is labeled `email` in the job spec.
realm = os.environ['LAVA_REALM']
conn_id = os.environ['LAVA_CONNID_EMAIL']

# We can use the email connection as a context manager
with get_email_connection(conn_id, realm) as emailer:
    emailer.send(
        subject='Oh no',
        message='Your oscillation overthruster has malfunctioned',
        to='Buckaroo.Banzai@dimension8.com',
        cc=[
            'Professor.Hikita@dimension8.com',
            'Sidney.Zweibel@dimension8.com'
        ]
    )

Executable Interface for Email Connectors¶

When used with exe, pkg and docker job types (e.g. shell scripts), the connection is implemented by the lava-email command.

When used as a connection script within a lava job, the -r REALM and -c CONN_ID arguments don't need to be provided by the job as these are provided by lava in the connection script.

Also, values for the --from and --reply-to options will be provided by lava if it has values available from the connection specification or other configuration data. These values can be overridden by providing the appropriate options to the connection script.

lava-email Usage

usage: lava-email [-h] [--profile PROFILE] [-v] -c CONN_ID [-r REALM]
                  [--bcc EMAIL] [--cc EMAIL] [--from EMAIL] [--reply-to EMAIL]
                  [--to EMAIL] -s SUBJECT [--html FILENAME] [--text FILENAME]
                  [--no-colour] [-l LEVEL] [--log LOG] [--tag TAG]
                  [FILENAME]

Send email using lava email connections.

optional arguments:
  -h, --help            show this help message and exit
  --profile PROFILE     As for AWS CLI.
  -v, --version         show program's version number and exit

lava arguments:
  -c CONN_ID, --conn-id CONN_ID
                        Lava connection ID. Required.
  -r REALM, --realm REALM
                        Lava realm name. If not specified, the environment
                        variable LAVA_REALM must be set.

email arguments:
  --bcc EMAIL           Recipients to place on the Bcc: line of the message.
                        Can be used multiple times.
  --cc EMAIL            Recipients to place on the Cc: line of the message.
                        Can be used multiple times.
  --from EMAIL          Message sender. If not specified, a value must be
                        available in either the connection specification or
                        the realm specification.
  --reply-to EMAIL      Reply-to address of the message.Can be used multiple
                        times.
  --to EMAIL            Recipients to place on the To: line of the message.Can
                        be used multiple times.
  -s SUBJECT, --subject SUBJECT
                        Message subject. Required.

message source arguments:
  At most one of the following arguments is permitted.

  --html FILENAME       This is a legacy argument for backward compatibility.
  --text FILENAME       This is a legacy argument for backward compatibility.
  FILENAME              Name of file containing the message body. If not
                        specified or "-", the body will be read from stdin. An
                        attempt is made to determine if the message is HTML and
                        send it accordingly. Only the first 2MB is read.

logging arguments:
  --no-colour, --no-color
                        Don't use colour in information messages.
  -l LEVEL, --level LEVEL
                        Print messages of a given severity level or above. The
                        standard logging level names are available but debug,
                        info, warning and error are most useful. The default
                        is info.
  --log LOG             Log to the specified target. This can be either a file
                        name or a syslog facility with an @ prefix (e.g.
                        @local0).
  --tag TAG             Tag log entries with the specified value. The default
                        is lava-email.

As an example, consider an exe job specification that looks something like this:

{
    "job_id": "...",
    "parameters": {
        "connections": {
            "email": "email-connection-id"
        }
    },
    "payload": "my-payload.sh ..."
}

Note the email connection. This will provide the job with an environment variable LAVA_CONN_EMAIL which points to the executable handling the connection.

If the job payload is a shell script, the connector would be invoked thus:

# Send an email with a text message body.
$LAVA_CONN_EMAIL --to Buckaroo.Banzai@dimension8.com --subject "Oh no" <<!
    Dear Buckaroo,

    Your oscillation overthruster has malfunctioned.

    -- John Bigbooté
!

# But wait -- we can do HTML as well
$LAVA_CONN_EMAIL --to Buckaroo.Banzai@dimension8.com --subject "Oh no" <<!
<HTML>
    <BODY>
        <P>Dear Buckaroo,</P>
        <P>Your oscillation overthruster has malfunctioned</P>
        <P>-- John Bigbooté</P>
    </BODY>
</HTML>
!

Connector type: generic¶

The generic connector provides a general purpose mechanism to group a set of associated attributes together and have them made available to lava jobs at run-time. Lava doesn't actually connect to any external resources other than to obtain attribute values.

Field	Type	Required	Description
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
enabled	Boolean	Yes	Whether or not the connection is enabled.
attributes	Map[String,*]	Yes	A map comprising the attributes for the connector. The keys are the attribute names and the values are either simple scalars or another map specifying how to obtain the value. See below for more information.
type	String	Yes	`generic`.

Specifying Generic Connector Attribute Values¶

The attributes field of the generic connector specifies the names of the connector attributes and how the attribute values are obtained. The following variants are supported.

Simple Scalar Attributes¶

Simple scalar attributes are specified thus:

{
  "attributes": {
    "name": "value"
  }
}

In addition to string values, integer and float values are also supported.

Local Parameters¶

This is an alternative syntax to the simple scalar attribute syntax described above.

{
  "attributes": {
    "name": {
      "type": "local",
      "value": "value"
    }
  }
}

SSM Parameters¶

Values from SSM parameters are specified thus:

{
  "attributes": {
    "name": {
      "type": "ssm",
      "parameter": "SSM parameter name"
    }
  }
}

Lava will obtain the value from the SSM parameter store, decrypting as required.

Example Generic Connector Specification¶

{
  "conn_id": "widget-conn-id",
  "description": "Sample generic connector",
  "enabled": true,
  "type": "generic",
  "attributes": {
    "a": "a string",
    "b": {
      "type": "local",
      "value": 30
    },
    "c": {
      "type": "ssm",
      "parameter": "/lava/<REALM>/my_var"
    }
  }
}
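The three attribute variants can be resolved with logic along the following lines. This is a sketch of the described behaviour rather than lava's code. The `resolve_attributes` function is hypothetical, and SSM access is passed in as a callable so that the example runs without AWS; in practice that callable might wrap the boto3 SSM `get_parameter` call with `WithDecryption=True`.

```python
from typing import Any, Callable, Mapping


def resolve_attributes(
    attributes: Mapping[str, Any],
    get_ssm_parameter: Callable[[str], str],
) -> dict:
    """Resolve generic connector attributes to their final values."""
    resolved = {}
    for name, spec in attributes.items():
        if not isinstance(spec, Mapping):
            # Simple scalar: string, integer or float used as-is.
            resolved[name] = spec
        elif spec.get("type") == "local":
            resolved[name] = spec["value"]
        elif spec.get("type") == "ssm":
            resolved[name] = get_ssm_parameter(spec["parameter"])
        else:
            raise ValueError(f"Unsupported attribute type for {name!r}")
    return resolved


# Stand-in for the SSM parameter store, for illustration only.
fake_store = {"/lava/dev/my_var": "value from SSM"}
attrs = {
    "a": "a string",
    "b": {"type": "local", "value": 30},
    "c": {"type": "ssm", "parameter": "/lava/dev/my_var"},
}
print(resolve_attributes(attrs, fake_store.__getitem__))
```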

Using the Generic Connector¶

The generic connector provides two distinct interfaces:

A native Python interface
A command line interface.

Python Interface for Generic Connectors¶

Python scripts can directly access the underlying Python interface of a generic connector. In this case, the connector returns a dictionary of resolved attribute values.

As an example, consider an exe job specification that looks something like this:

{
    "job_id": "...",
    "parameters": {
        "connections": {
            "widget": "widget-connection-id"
        }
    },
    "payload": "my-payload.py ..."
}

A Python program can use the generic connector like this:

import os
from lava.connection import get_generic_connection

# If running as a lava exe/pkg/docker, get some info provided by lava in the
# environment. Assume our connector is labeled `widget` in the job spec.
realm = os.environ['LAVA_REALM']
conn_id = os.environ['LAVA_CONNID_WIDGET']

attributes = get_generic_connection(conn_id, realm)

The attributes dictionary would then look like:

{
    'a': 'a string',
    'b': 30,
    'c': 'Value of SSM parameter /lava/<REALM>/my_var'
}

Executable Interface for Generic Connectors¶

When used with exe, pkg and docker job types (e.g. shell scripts), the connection is implemented by a simple script that can be used to obtain the value of individual attributes.

As an example, consider an exe job specification that looks something like this:

{
  "job_id": "...",
  "parameters": {
    "connections": {
      "widget": "widget-connection-id"
    }
  },
  "payload": "my-payload.sh ..."
}

Note the widget connection. This will provide the job with an environment variable LAVA_CONN_WIDGET which points to the executable handling the connection.

If the job payload is a shell script, the connector would be invoked thus:

# Get the values of the attributes
ATTR_A=$($LAVA_CONN_WIDGET a)
ATTR_B=$($LAVA_CONN_WIDGET b)
ATTR_C=$($LAVA_CONN_WIDGET c)

Connector type: git¶

The git connector manages access to Git repositories by providing support for managing SSH private keys.

When used with exe and pkg jobs, it provides an environment variable pointing to a script that will run the Git CLI with SSH keys managed in the background.

Note that only SSH access to repositories is supported. HTTPS is not supported.

Field	Type	Required	Description
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
enabled	Boolean	Yes	Whether or not the connection is enabled.
ssh_key	String	Yes	The name of an encrypted SSM parameter containing the SSH private key. There must not be any passphrase on the key. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key. Refer to the ssh connector for more information on how to prepare and store the key.
ssh_options	List[String]	No	A list of SSH options as per ssh_config(5). e.g. `StrictHostKeyChecking=no`
type	String	Yes	`git`.
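
As a rough illustration, a Python payload might drive the git connector as sketched below. This assumes the connection is exposed through an environment variable named `LAVA_CONN_<LABEL>`, as for the other executable connectors, and that arguments are passed through to the Git CLI. The helper names `git_argv` and `clone` and the repository URL are hypothetical.

```python
import os
import subprocess


def git_argv(label: str, *git_args: str) -> list[str]:
    """Build the argv for running Git through the connector script."""
    connector = os.environ[f"LAVA_CONN_{label.upper()}"]
    return [connector, *git_args]


def clone(label: str, repo_url: str, dest: str) -> None:
    """Clone a repository over SSH using the connector-managed key."""
    # The connector only supports SSH access, so reject HTTP(S) URLs early.
    if repo_url.startswith(("http://", "https://")):
        raise ValueError("the git connector only supports SSH repository URLs")
    subprocess.run(git_argv(label, "clone", repo_url, dest), check=True)
```

For a connection labelled `repo` in the job specification, usage would be along the lines of `clone("repo", "git@example.com:acme/widgets.git", "widgets")`.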

Connector type: mariadb-rds¶

This is currently a synonym for mysql.

It has been defined in the event of future feature differences between conventional MySQL and AWS RDS MariaDB.

Connector type: mariadb¶

This is a synonym for mysql.

Connector type: mssql¶

The mssql connector handles connections to Microsoft SQL Server databases.

Field	Type	Required	Description
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
database	String	Yes*	The name of the database within the database server.
driver	String	No	The ODBC driver specification. This must correspond to the name of a section in `/etc/odbcinst.ini`. The default is `FreeTDS`.
enabled	Boolean	Yes	Whether or not the connection is enabled.
host	String	Yes*	The database host DNS name or IP address.
password	String	Yes*	The name of an encrypted SSM parameter containing the password. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key.
port	Number	Yes*	The database port number.
preserve_case	Boolean	No	If `true`, don't fold database object names to lower case when quoting them for use in db_from_s3 jobs. The default is `false` (i.e. case folding is enabled).
secret_id	String	No	Obtain missing fields from AWS Secrets Manager. More information.
subtype	String	No	Specifies the underlying DBAPI 2.0 driver. The default and only allowed value is `pyodbc`.
timeout	Integer	No	Connection timeout in seconds. If not specified, no timeout is applied.
type	String	Yes	`mssql`.
user	String	Yes*	Database user name.

Info

Fields with a Required column marked with * can have a value provided directly in the connection specification or indirectly via AWS Secrets Manager using the secret_id field. See Database Authentication Using AWS Secrets Manager for more information.

SSL connections are not currently supported.

When used with exe and pkg job types, the connection is implemented by the lava-sql CLI.

Note

There are some MSSQL CLI tools that come with the TDS or unixODBC packages. None of them are wonderful so for now lava-sql will have to do. Also not wonderful but what do you expect for free?

Implementation Notes¶

The current implementation requires the following components be installed and configured on the lava worker:

Configuring unixODBC with Free TDS

Connector type: mysql-aurora¶

The mysql-aurora connector handles connections to AWS RDS Aurora MySQL database clusters. This is almost a synonym for mysql. Key differences are:

The db_from_s3 job can take advantage of an AWS facility to load data directly from S3.
Database authentication using IAM credential generation is supported.

Field	Type	Required	Description
ca_cert	String	No	The name of a file containing the CA certificate for the database server. Ignored unless `ssl` is `true`.
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
database	String	Yes*	The name of the database (schema) within the database server.
enabled	Boolean	Yes	Whether or not the connection is enabled.
host	String	Yes*	The database host DNS name or IP address.
password	String	No*	The name of an encrypted SSM parameter containing the password. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key. If not specified, the worker will attempt to generate temporary IAM user credentials.
port	Number	Yes*	The database port number.
preserve_case	Boolean	No	If `true`, don't fold database object names to lower case when quoting them for use in db_from_s3 jobs. The default is `false` (i.e. case folding is enabled).
secret_id	String	No	Obtain missing fields from AWS Secrets Manager. More information.
ssl	Boolean	No	Set to `true` to enable SSL. Default is `false`.
type	String	Yes	`mysql-aurora`.
user	String	Yes*	Database user name.

Info

Fields with a Required column marked with * can have a value provided directly in the connection specification or indirectly via AWS Secrets Manager using the secret_id field. See Database Authentication Using AWS Secrets Manager for more information.

When used with exe and pkg job types, the connection is implemented by the mysql CLI. Apart from the connection parameters, it is invoked with the following options:

mysql --batch --connect-timeout=10

Creating Temporary IAM User Credentials for AWS RDS Aurora MySQL¶

If the password field is not present in the connection specification, lava will attempt to generate temporary IAM credentials using the generate-db-auth-token mechanism.

The specified user must already exist in the database. Enable IAM authentication for a user thus:

CREATE USER a_user IDENTIFIED WITH AWSAuthenticationPlugin AS 'RDS';

The IAM policy attached to the worker will need to contain an element something like this:

"Statement": [
    {
        "Sid": "GetRdsCreds",
        "Effect": "Allow",
        "Action": "rds-db:connect",
        "Resource": [
            "arn:aws:rds-db:ap-southeast-2:123456789012:dbuser:db-JMH2...6KW6Q/a_user"
        ]
    }
]

The DB instance ID for use in the IAM policy can be obtained thus:

aws rds describe-db-instances --db-instance-identifier 'DB_ID' \
     --query 'DBInstances[0].DbiResourceId' --output text
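
Once the resource ID is known, the `Resource` entry for the policy can be assembled mechanically. The sketch below follows the ARN layout shown in the policy example above. The function name `rds_db_user_arn` and the sample values are illustrative only.

```python
def rds_db_user_arn(region: str, account_id: str, resource_id: str, db_user: str) -> str:
    """Build the rds-db ARN that authorises IAM login as db_user."""
    return f"arn:aws:rds-db:{region}:{account_id}:dbuser:{resource_id}/{db_user}"


def rds_connect_statement(region: str, account_id: str, resource_id: str, db_user: str) -> dict:
    """Build one IAM policy statement allowing rds-db:connect for db_user."""
    return {
        "Sid": "GetRdsCreds",
        "Effect": "Allow",
        "Action": "rds-db:connect",
        "Resource": [rds_db_user_arn(region, account_id, resource_id, db_user)],
    }
```

The resulting dictionary can be serialised with `json.dumps` and placed in the `Statement` list of the worker's policy.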

Info

SSL is mandatory when using temporary IAM user credentials.

Connector type: mysql-rds¶

This is currently a synonym for mysql-aurora.

It has been defined in the event of future feature differences between conventional AWS RDS Aurora MySQL and AWS RDS MySQL.

Connector type: mysql¶

The mysql connector handles connections to MySQL compatible databases.

Field	Type	Required	Description
ca_cert	String	No	The name of a file containing the CA certificate for the database server. Ignored unless `ssl` is `true`.
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
database	String	Yes*	The name of the database (schema) within the database server.
enabled	Boolean	Yes	Whether or not the connection is enabled.
host	String	Yes*	The database host DNS name or IP address.
password	String	Yes*	The name of an encrypted SSM parameter containing the password. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key.
port	Number	Yes*	The database port number.
preserve_case	Boolean	No	If `true`, don't fold database object names to lower case when quoting them for use in db_from_s3 jobs. The default is `false` (i.e. case folding is enabled).
secret_id	String	No	Obtain missing fields from AWS Secrets Manager. More information.
ssl	Boolean	No	Set to `true` to enable SSL. Default is `false`.
type	String	Yes	`mysql`.
user	String	Yes*	Database user name.

Info

Fields with a Required column marked with * can have a value provided directly in the connection specification or indirectly via AWS Secrets Manager using the secret_id field. See Database Authentication Using AWS Secrets Manager for more information.

When used with exe and pkg job types, the connection is implemented by the mysql CLI, either the MySQL Community version, or the MariaDB version, depending on the variant installed on the worker. These have some minor CLI parameter differences which lava manages for the connection parameters. Apart from the connection parameters, it is invoked with the following options:

mysql --batch --connect-timeout=10

Connector type: oracle-rds¶

This is currently a synonym for oracle.

It has been defined in the event of future feature differences between conventional Oracle and AWS RDS Oracle.

Connector type: oracle¶

The oracle connector handles connections to Oracle databases.

Field	Type	Required	Description
conn_id	String	Yes	Connection identifier.
database	String	No*	A deprecated synonym for `sid`.
description	String	No	Description.
edition	String	No	Oracle version for compatibility in the form `x.y[.z]`.
enabled	Boolean	Yes	Whether or not the connection is enabled.
host	String	Yes*	The database host DNS name or IP address.
password	String	Yes*	The name of an encrypted SSM parameter containing the password. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key.
port	Number	Yes*	The database port number.
secret_id	String	No	Obtain missing fields from AWS Secrets Manager. More information.
service_name	String	No*	The Oracle database service name. Generally exactly one of `service_name` or `sid` must be specified.
sid	String	No*	The Oracle System Identifier of the database. Generally exactly one of `service_name` or `sid` must be specified.
type	String	Yes	`oracle`.
user	String	Yes*	Database user name.

Info

Fields with a Required column marked with * can have a value provided directly in the connection specification or indirectly via AWS Secrets Manager using the secret_id field. See Database Authentication Using AWS Secrets Manager for more information.

When used with exe and pkg job types, the connection is implemented by the SQL*Plus CLI, sqlplus. Apart from the connection parameters, it is invoked with the following options:

sqlplus -NOLOGINTIME -L -S -C <version>

The SQL*Plus CLI is a particularly contrary beast. It is important to explicitly exit the CLI using an EXIT command at the end of any session or else it will drop into interactive mode and sit there waiting for further commands until the job reaches its timeout and is killed by lava. A safer approach is to send commands to the connector via stdin, thus:

# Assume our conn_id is ora

$LAVA_CONN_ORA <<!
SELECT whatever FROM whichever;
!
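
The same stdin approach can be used from a Python payload. The following is a sketch only, assuming the connector is exposed as `LAVA_CONN_<LABEL>` and handles plain SQL statements (not PL/SQL blocks, which are terminated differently). `sqlplus_script` and `run_oracle` are hypothetical helper names. The explicit `EXIT` is belt and braces on top of closing stdin.

```python
import os
import subprocess


def sqlplus_script(sql: str) -> str:
    """Return the SQL with a terminating semi-colon and a final EXIT."""
    body = sql.strip()
    if not body.endswith(";"):
        body += ";"
    return f"{body}\nEXIT\n"


def run_oracle(label: str, sql: str, timeout: float = 300) -> str:
    """Send SQL to the Oracle connector on stdin and return its output."""
    connector = os.environ[f"LAVA_CONN_{label.upper()}"]
    result = subprocess.run(
        [connector],
        input=sqlplus_script(sql),
        check=True,
        capture_output=True,
        text=True,
        timeout=timeout,      # fail locally rather than hang until the job is killed
    )
    return result.stdout
```

For the `ora` connection above, `run_oracle("ora", "SELECT whatever FROM whichever")` sends the same statement as the shell example.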

When used with sql jobs, do not terminate the SQL with a semi-colon or a syntax error results.

When used with sqlc jobs, SQL commands must be terminated with a semi-colon or either a syntax error or no output will result.

Security Warnings¶

Oracle CLI clients, including sqlplus, do not provide any means to automate login to the database without specifying the password on the command line. This means the password is exposed in a process listing. Do not use the oracle command line connector on any worker that has multi-user access.

The oracle connector does not currently support SSL/TLS.

Connector type: postgres-aurora¶

This connector supports AWS RDS Aurora PostgreSQL clusters. This is almost a synonym for postgres. Key differences are:

The db_from_s3 job can take advantage of an AWS facility to load data directly from S3.
Database authentication using IAM credential generation is supported.

Field	Type	Required	Description
conn_id	String	Yes	Connection identifier.
database	String	Yes*	The name of the database within the database server.
description	String	No	Description.
enabled	Boolean	Yes	Whether or not the connection is enabled.
host	String	Yes*	The database host DNS name or IP address.
password	String	No*	The name of an encrypted SSM parameter containing the password. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key. If not specified, the worker will attempt to generate temporary IAM user credentials.
port	Number	Yes*	The database port number.
preserve_case	Boolean	No	If `true`, don't fold database object names to lower case when quoting them for use in db_from_s3 jobs. The default is `false` (i.e. case folding is enabled).
secret_id	String	No	Obtain missing fields from AWS Secrets Manager. More information.
ssl	Boolean	No	Set to `true` to enable SSL. Default is `false`.
subtype	String	No	Specifies the underlying DBAPI 2.0 driver. The default is `pg8000` which should be used wherever possible. The `pygresql` driver is also available.
type	String	Yes	`postgres-aurora`.
user	String	Yes*	Database user name.

Info

Fields with a Required column marked with * can have a value provided directly in the connection specification or indirectly via AWS Secrets Manager using the secret_id field. See Database Authentication Using AWS Secrets Manager for more information.

When used with exe and pkg job types, the connection is implemented by the psql CLI. Apart from the connection parameters, it is invoked with the following options:

psql --no-psqlrc --quiet --set ON_ERROR_STOP=on --pset footer=off

Creating Temporary IAM User Credentials for AWS RDS Aurora PostgreSQL¶

If the password field is not present in the connection specification, lava will attempt to generate temporary IAM credentials using the generate-db-auth-token mechanism.

The specified user must already exist in the database. Enable IAM authentication for a user thus:

CREATE USER a_user; 
GRANT rds_iam TO a_user;

Info

SSL is mandatory when using temporary IAM user credentials.

Psql CLI Password Limitations¶

The psql CLI will not accept passwords in a PGPASS file (or entered interactively) that are longer than a certain (undocumented) length. IAM based authentication for RDS involves temporary passwords that are much longer than this limit.

To work around this limitation, lava has to put long passwords into an environment variable. While this is not ideal from a security perspective, at least the passwords are short lived.

Connector type: postgres-rds¶

This is currently a synonym for postgres-aurora.

It has been defined in the event of future feature differences between conventional AWS RDS Aurora PostgreSQL and AWS RDS PostgreSQL.

Connector type: postgres¶

The postgres connector handles connections to Postgres compatible databases.

Field	Type	Required	Description
conn_id	String	Yes	Connection identifier.
database	String	Yes*	The name of the database within the database server.
description	String	No	Description.
enabled	Boolean	Yes	Whether or not the connection is enabled.
host	String	Yes*	The database host DNS name or IP address.
password	String	Yes*	The name of an encrypted SSM parameter containing the password. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key.
port	Number	Yes*	The database port number.
preserve_case	Boolean	No	If `true`, don't fold database object names to lower case when quoting them for use in db_from_s3 jobs. The default is `false` (i.e. case folding is enabled).
secret_id	String	No	Obtain missing fields from AWS Secrets Manager. More information.
ssl	Boolean	No	Set to `true` to enable SSL. Default is `false`.
subtype	String	No	Specifies the underlying DBAPI 2.0 driver. The default is `pg8000` which should be used wherever possible. The `pygresql` driver is also available.
type	String	Yes	`psql` or `postgres`.
user	String	Yes*	Database user name.

Info

Fields with a Required column marked with * can have a value provided directly in the connection specification or indirectly via AWS Secrets Manager using the secret_id field. See Database Authentication Using AWS Secrets Manager for more information.

When used with exe and pkg job types, the connection is implemented by the psql CLI. Apart from the connection parameters, it is invoked with the following options:

psql --no-psqlrc --quiet --set ON_ERROR_STOP=on --pset footer=off

Connector type: psql¶

This is a synonym for postgres.

Connector type: redshift-serverless¶

This is the connector for Redshift Serverless clusters.

Field	Type	Required	Description
conn_id	String	Yes	Connection identifier.
database	String	Yes*	The name of the database within the Redshift Serverless namespace.
description	String	No	Description.
enabled	Boolean	Yes	Whether or not the connection is enabled.
external_id	String	No	Name of an SSM parameter containing an external ID to use when assuming the IAM role specified by `role_arn` when generating temporary IAM user credentials. While AWS does not consider this to be a sensitive security parameter, it is stored in the SSM parameter store for ease of management. It is still recommended to use a secure parameter. Can't hurt.
host	String	Yes*	The Redshift serverless workgroup endpoint address.
password_duration	String	No	The password duration when generating temporary IAM user credentials in the form `nnX` where `nn` is a number and `X` is `s` (seconds), `m` (minutes) or `h` (hours). If not specified, the default worker configuration is used. Limits imposed by the Redshift Serverless GetCredentials API apply.
password	String	No*	The name of an encrypted SSM parameter containing the password. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key. If not specified, the worker will attempt to generate temporary IAM user credentials.
port	Number	Yes*	The Redshift serverless workgroup port number.
preserve_case	Boolean	No	If `true`, don't fold database object names to lower case when quoting them for use in db_from_s3 jobs. The default is `false` (i.e. case folding is enabled).
role_arn	String	No	The ARN of an IAM role that will be assumed when generating temporary IAM user credentials.
secret_id	String	No	Obtain missing fields from AWS Secrets Manager. More information.
ssl	Boolean	No	Set to `true` to enable SSL. Default is `false`.
subtype	String	No	Specifies the underlying DBAPI 2.0 driver. See Redshift Serverless Connector Subtypes below.
type	String	Yes	`redshift-serverless`.
user	String	Yes*	Database user name.
workgroup	String	No	The name of the workgroup associated with the database. This is used when generating temporary IAM user credentials. If required and not specified, the first component of the `host` field is used.

Info

Fields with a Required column marked with * can have a value provided directly in the connection specification or indirectly via AWS Secrets Manager using the secret_id field. See Database Authentication Using AWS Secrets Manager for more information.
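
The fallback described for the `workgroup` field (use the first component of `host`) can be pictured as follows. This is an illustrative sketch rather than lava's own code. The function name and the sample endpoint are made up.

```python
def default_workgroup(host: str) -> str:
    """Return the first dot-separated component of the endpoint host name."""
    return host.split(".", 1)[0]


# A made-up endpoint of the usual Redshift Serverless shape.
endpoint = "analytics.123456789123.ap-southeast-2.redshift-serverless.amazonaws.com"
workgroup = default_workgroup(endpoint)
```

Here `workgroup` is `analytics`. If the connection's `host` is an alias or an IP address, this rule cannot recover the workgroup name, so the `workgroup` field should be set explicitly.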

Redshift Serverless Connector Subtypes¶

The subtype field of the connection specification allows selection of different database drivers.

Subtype	Description
pg8000	Pg8000 is the default if no subtype is specified.
redshift	This is the AWS Redshift connector.

Creating Temporary IAM User Credentials for Redshift Serverless¶

Note

The AWS documentation on this leaves a lot to be desired.

If a password is not obtained from the password field or secrets manager, lava will attempt to use the Redshift Serverless GetCredentials API to generate temporary IAM-based database user credentials.

Unlike the Redshift provisioned GetClusterCredentials API, the Redshift Serverless GetCredentials API does not allow the target database user name to be specified. The username is derived automatically from the IAM principal as follows:

For IAM users, the database username is IAM:<IAM-USER-NAME>.
For IAM roles, the database username is IAMR:<IAM-ROLE-NAME>.

If the user does not already exist in the database, it will be automatically created and given access to the public schema. This is daft but that's how it is. The user can be created manually or given additional database permissions via the normal GRANT mechanism, as required.

This can be very limiting in terms of fine grained access control from lava to Redshift. To provide some flexibility, the Redshift Serverless connector can assume a different IAM role prior to generating database access credentials by specifying the role_arn (and optional external_id) elements in the connection specification. The assumed role is then the one that will determine the database user name.

For example, assume the lava worker normally operates under the IAM role lava-dev-worker-core. If no role_arn is specified, the database user will be IAMR:lava-dev-worker-core.

If role_arn is arn:aws:iam::123456789123:role/rs01, the database user will be IAMR:rs01.
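
The naming rule can be expressed as a small function, which is handy when pre-creating users or writing GRANT statements. This is a sketch of the rule as described above, not a call into AWS. It assumes any IAM path in the ARN is dropped, and `serverless_db_user` is a hypothetical name.

```python
def serverless_db_user(principal_arn: str) -> str:
    """Derive the Redshift Serverless database user from an IAM ARN."""
    parts = principal_arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "iam":
        raise ValueError(f"not an IAM ARN: {principal_arn}")
    kind, _, name = parts[5].partition("/")
    name = name.rsplit("/", 1)[-1]   # assumption: an IAM path is not part of the name
    if kind == "role":
        return f"IAMR:{name}"
    if kind == "user":
        return f"IAM:{name}"
    raise ValueError(f"unsupported IAM principal type: {kind}")
```

Applied to `arn:aws:iam::123456789123:role/rs01`, this gives `IAMR:rs01`, matching the example above.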

The IAM policy attached to the lava-dev-worker-core role will need to contain something like this:

"Statement": [
    {
        "Sid": "AssumeRoleForRedshiftServerlessAccess",
        "Effect": "Allow",
        "Action": "sts:AssumeRole",
        "Resource": [
            "arn:aws:iam::123456789123:role/rs01"
        ]
    }
]

The IAM policy attached to the rs01 role will need to contain something like this:

"Statement": [
    {
        "Sid": "GetRedshiftServerlessCreds",
        "Effect": "Allow",
        "Action": "redshift-serverless:GetCredentials",
        "Resource": [
            "arn:aws:redshift-serverless:ap-southeast-2:123456789123:workgroup/3741886a-223d-446f-a77c-a5d0e7b5ad32"
        ]
    }
]

The trust policy for the rs01 role will need to contain the elements necessary to allow it to be assumed by lava-dev-worker-core.

Note

Lava currently does not cache temporary credentials. Watch out for throttling on the GetCredentials API.

Connector type: redshift¶

This is the connector for Redshift provisioned clusters. It can also be used for Redshift Serverless clusters except when IAM generated user credentials are used. In that case, the redshift-serverless connector must be used.

This connector is similar to postgres. Note that some operations are specific to Redshift and are not supported on conventional Postgres databases (e.g. the COPY and UNLOAD commands).

Field	Type	Required	Description
cluster_id	String	No	The Redshift cluster identifier. If required and not specified, the first component of the `host` name is used.
conn_id	String	Yes	Connection identifier.
database	String	Yes*	The name of the database within the database server.
description	String	No	Description.
enabled	Boolean	Yes	Whether or not the connection is enabled.
host	String	Yes*	The database host DNS name or IP address.
password_duration	String	No	The password duration when generating temporary IAM user credentials in the form `nnX` where `nn` is a number and `X` is `s` (seconds), `m` (minutes) or `h` (hours). If not specified, the default worker configuration is used. Limits imposed by the GetClusterCredentials API apply.
password	String	No*	The name of an encrypted SSM parameter containing the password. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key. If not specified, the worker will attempt to generate temporary IAM user credentials.
port	Number	Yes*	The database port number.
preserve_case	Boolean	No	If `true`, don't fold database object names to lower case when quoting them for use in db_from_s3 jobs. The default is `false` (i.e. case folding is enabled).
secret_id	String	No	Obtain missing fields from AWS Secrets Manager. More information.
ssl	Boolean	No	Set to `true` to enable SSL. Default is `false`.
subtype	String	No	Specifies the underlying DBAPI 2.0 driver. See Redshift Connector Subtypes below.
type	String	Yes	`redshift`.
user	String	Yes*	Database user name.

Info

Fields with a Required column marked with * can have a value provided directly in the connection specification or indirectly via AWS Secrets Manager using the secret_id field. See Database Authentication Using AWS Secrets Manager for more information.
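
The `nnX` format used by `password_duration` is straightforward to validate before it goes into a connection specification. The sketch below is a strict reading of the format described in the table. Whether lava itself tolerates upper-case units or embedded whitespace is not covered here, and `duration_seconds` is a hypothetical name.

```python
import re

_DURATION = re.compile(r"(\d+)([smh])")
_SECONDS_PER_UNIT = {"s": 1, "m": 60, "h": 3600}


def duration_seconds(value: str) -> int:
    """Convert a duration such as '15m' into a number of seconds."""
    match = _DURATION.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"bad duration {value!r}: expected nnX with X in s, m or h")
    count, unit = match.groups()
    return int(count) * _SECONDS_PER_UNIT[unit]
```

For example, `duration_seconds("15m")` gives 900. The limits of the GetClusterCredentials API still apply to the resulting value.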

Redshift Connector Subtypes¶

The subtype field of the connection specification allows selection of different database drivers.

Subtype	Description
pg8000	Pg8000 is the default if no subtype is specified.
redshift	This is the AWS Redshift connector.

Info

As of version 8.1 (Kīlauea), the Redshift connector no longer supports PyGreSQL. This is not a lava change. PyGreSQL just doesn't work with Redshift any more.

Creating Temporary IAM User Credentials for Redshift¶

If the password field is not present in the connection specification, lava will attempt to use the Redshift GetClusterCredentials API to generate temporary IAM-based database user credentials.

The specified user must already exist in the database as lava (deliberately) does not support AutoCreate of users.

Lava will specify the target cluster ID, database and target user in the credentials request. This means that the IAM policy attached to the worker will need to contain an element something like this:

"Statement": [
    {
        "Sid": "GetRedshiftCreds",
        "Effect": "Allow",
        "Action": "redshift:GetClusterCredentials",
        "Resource": [
            "arn:aws:redshift:ap-southeast-2:123456789012:dbuser:cluster_id/target_user",
            "arn:aws:redshift:ap-southeast-2:123456789012:dbname:cluster_id/mydb"
        ]
    }
]
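
When several connections need this access, the statement can be generated rather than hand-edited. The sketch below reproduces the two resource ARNs from the policy above. The function name and its parameters are illustrative only.

```python
def get_cluster_credentials_statement(
    region: str, account_id: str, cluster_id: str, db_user: str, db_name: str
) -> dict:
    """Build an IAM statement allowing GetClusterCredentials for one user and database."""
    prefix = f"arn:aws:redshift:{region}:{account_id}"
    return {
        "Sid": "GetRedshiftCreds",
        "Effect": "Allow",
        "Action": "redshift:GetClusterCredentials",
        "Resource": [
            f"{prefix}:dbuser:{cluster_id}/{db_user}",   # who lava may log in as
            f"{prefix}:dbname:{cluster_id}/{db_name}",   # which database it may target
        ],
    }
```

The `cluster_id`, `db_user` and `db_name` arguments correspond to the `cluster_id`, `user` and `database` fields of the connection specification.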

Info

Lava currently does not cache temporary credentials. Watch out for throttling on the GetClusterCredentials API.

Connector type: ses¶

Warning

This is a legacy implementation. It is now deprecated and will be removed in a future release. Use the email connector instead.

The ses connector provides access to the AWS Simple Email Service (SES).

It can be used only with exe and pkg jobs. It provides an environment variable pointing to a script that will run the AWS CLI with appropriate parameters to access the SES service.

Field	Type	Required	Description
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
enabled	Boolean	Yes	Whether or not the connection is enabled.
from	String	No	The email address that is sending the email. This email address must be either individually verified with Amazon SES, or from a domain that has been verified with Amazon SES. If not specified, the value specified by the SES_FROM realm configuration parameter is used. A value must be specified by one of these mechanisms.
region	String	No	The AWS region name for the SES service. If not specified, the value specified by the SES_REGION realm configuration parameter is used, which itself defaults to `us-east-1`.
reply_to	String or List[String]	No	The reply-to email address(es) for messages.
return_path	String	No	The email address that bounces and complaints will be forwarded to when feedback forwarding is enabled.

In an exe or pkg job, the job specification will look something like this:

{
    "job_id": "...",
    "parameters": {
        "connections": {
            "email": "email-connection-id"
        }
    },
    "payload": "my-payload.sh ..."
}

Note the email connection. This will provide the job with an environment variable LAVA_CONN_EMAIL which points to the executable handling the connection.

If the job payload is a shell script, the connector would be invoked thus:

# Send an email with a text message body.
$LAVA_CONN_EMAIL --to fred@somewhere.com --subject "Hello Fred" --text msg.txt

# But wait -- we can do HTML as well
$LAVA_CONN_EMAIL --to fred@somewhere.com --subject "Hello Fred" --html msg.html

# Or read from stdin. The connector will look for <HTML> at start of message
# to determine if message is text or HTML.
$LAVA_CONN_EMAIL --to fred@somewhere.com --subject "Hello Fred" < msg.xxx

The connector script accepts the following arguments:

--to email ...
--cc email ...
--bcc email ...
    One or more recipient email addresses.

--subject text
    Message subject.

--text filename
    File containing the text body of the message. Optional.

--html filename
    File containing the HTML body of the message. Optional.

If neither --text nor --html options are specified, the message body is read from stdin. If the content begins with <HTML> (case insensitive), the connector will send it as HTML otherwise as text.
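
The stdin detection rule can be mimicked when preparing message bodies, for instance to check how a generated file will be treated. This sketch takes the description literally: the content must start with `<HTML>` in any letter case, with no allowance for leading whitespace. Whether the connector is more forgiving is not known here, and `is_html` is a hypothetical name.

```python
def is_html(body: str) -> bool:
    """Return True if the message body starts with <HTML> (any letter case)."""
    marker = "<html>"
    return body[: len(marker)].lower() == marker
```

A body that fails this check would be expected to go out as plain text unless `--html` is given explicitly.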

Connector type: sharepoint¶

The sharepoint connector manages connections to SharePoint sites.

It is possible for Microsoft to have made this process more complex and unwieldy, but it is not obvious how.

Field	Type	Required	Description
client_id	String	Yes	The Application ID that the SharePoint registration portal assigned your app. This resembles a UUID.
client_secret	String	Yes	Name of the SSM parameter containing the client secret. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key.
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
enabled	Boolean	Yes	Whether or not the connection is enabled.
https_proxy	String	No	HTTPS proxy to use for accessing the SharePoint API endpoints. If not specified, the `HTTPS_PROXY` environment variable is used, if set.
org_base_url	String	Yes	The hostname component of the organisation's SharePoint base URL. e.g. `acme.sharepoint.com`.
password	String	Yes	Name of the SSM parameter containing the password for authenticating to SharePoint. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key.
site_name	String	Yes	The SharePoint site name.
tenant	String	Yes	The Azure AD registered domain ID. This resembles a UUID.
type	String	Yes	`sharepoint`.
user	String	No	User name for authenticating to SharePoint.

The connector supports the sharepoint_get_doc, sharepoint_get_list, sharepoint_get_multi_doc, sharepoint_put_doc and sharepoint_put_list job types.

Using SharePoint Connectors¶

The sharepoint connector provides two distinct interfaces:

A native Python interface
A command line interface.

Python Interface for SharePoint Connectors¶

The sharepoint connector can be used with Python based exe and pkg jobs that invoke the lava connection manager directly. In this case, the connector returns a lava.lib.sharepoint.Sharepoint object as described in the lava API documentation. In summary, this class has the following methods:

delete_all_list_items(list_id, list_name)

get_doc(lib_name, path, out_file)

get_list(list_name, out_file, system_columns=None, data_columns=None,
    header=True, **csv_writer_args)

put_doc(lib_name, path, src_file, title=None)

put_list(list_name, src_file, mode='append', error_missing=False,
    data_columns=None, **csv_reader_args)

get_multi_doc(lib_name, path, out_path, glob=None)

close()

Note that this is the low level connector. It does not handle moving files in or out of S3 or Jinja rendering of parameters. It is up to the caller to do that as required.

If the SharePoint connector key in the job's connections map is spoint, typical usage would be something like:

import os
from lava.connection import get_sharepoint_connection

# Get a lava.lib.sharepoint.Sharepoint instance
sp_conn = get_sharepoint_connection(
    conn_id=os.environ['LAVA_CONNID_SPOINT'],
    realm=os.environ['LAVA_REALM']
)

# Get a list from SharePoint and store it locally.
row_count = sp_conn.get_list('postcodes', 'postcodes.csv', delimiter=',')

# Close the connection
sp_conn.close()
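
If a transfer raises an exception, the explicit `close()` call above is skipped. One way to guard against that is `contextlib.closing` from the Python standard library, sketched below. The `download_list` helper and its `connect` argument (any zero-argument callable returning a connection) are hypothetical. It assumes only the `get_list` and `close` methods listed above.

```python
from contextlib import closing


def download_list(connect, list_name, out_file, **csv_writer_args):
    """Fetch a SharePoint list to a local file, always closing the connection."""
    with closing(connect()) as sp_conn:
        # close() runs on exit from the block, even if get_list raises.
        return sp_conn.get_list(list_name, out_file, **csv_writer_args)
```

In a job, `connect` could be a lambda wrapping the `get_sharepoint_connection(...)` call shown above.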

Executable Interface for SharePoint Connectors¶

When used with exe, pkg and docker job types (e.g. shell scripts), the connection is implemented by the lava-sharepoint command.

This is a somewhat higher level interface to the connector in that it can also handle moving data in and out of S3. Jinja rendering is handled as per the sharepoint_get_list, sharepoint_put_list, sharepoint_get_doc, sharepoint_put_doc and sharepoint_get_multi_doc job types.

If the SharePoint connector key in the job's connections map is spoint, usage is:

usage: $LAVA_CONN_SPOINT [-J] [-l LEVEL] {put-doc,put-list,get-doc,get-list,get-multi-doc} ...

sub-commands:
  {put-doc,put-list,get-doc,get-list,get-multi-doc}
    put-doc             Copy a file into a SharePoint document library.
    put-list            Copy a file into a SharePoint list.
    get-doc             Copy a file from a SharePoint document library.
    get-list            Copy a SharePoint list to a file
    get-multi-doc       Copy multiple files from a SharePoint document library path.

optional arguments:
  -J, --no-jinja        Disable Jinja rendering of the transfer parameters.

logging arguments:
  -l LEVEL, --level LEVEL
                        Print messages of a given severity level or above. The
                        standard logging level names are available but debug,
                        info, warning and error are most useful. The Default
                        is info.

Usage for the get-doc sub-command:

usage: $LAVA_CONN_SPOINT get-doc [options] SharePoint-path file

positional arguments:
  SharePoint-path       Source location. Must be in the form library:path.
                        This will be jinja rendered.
  file                  Target file. Values starting with s3:// will be copied
                        to S3. This will be jinja rendered.

optional arguments:
  -k KMS_KEY_ID, --kms-key-id KMS_KEY_ID
                        AWS KMS key to use for uploading data to S3.

Usage for the get-list sub-command:

usage: $LAVA_CONN_SPOINT get-list [options] SharePoint-list file

positional arguments:
  SharePoint-list       Source SharePoint list name. This will be jinja
                        rendered.
  file                  Target file. Values starting with s3:// will be copied
                        to S3. This will be jinja rendered.

optional arguments:
  -k KMS_KEY_ID, --kms-key-id KMS_KEY_ID
                        AWS KMS key to use for uploading data to S3.
  -H, --no-header       Don't include a header row. A header is included by
                        default.
  --delimiter DELIMITER
                        Output field delimiter.
  --double-quote        As for csv.writer.
  --escape-char ESCAPECHAR
                        As for csv.writer.
  --quote-char QUOTECHAR
                        As for csv.writer.
  --quoting QUOTING     As for csv.writer QUOTE_ parameters (without the
                        QUOTE_ prefix).

Usage for the put-doc sub-command:

usage: $LAVA_CONN_SPOINT put-doc [options] file SharePoint-path

positional arguments:
  file                  Source file. Values starting with s3:// will be copied
                        from S3. This will be jinja rendered.
  SharePoint-path       Target location. Must be in the form library:path.
                        This will be jinja rendered.

optional arguments:
  -t TITLE, --title TITLE
                        Document title. This will be jinja rendered.

Usage for the get-multi-doc sub-command:

usage: $LAVA_CONN_SPOINT get-multi-doc [options] SharePoint-path outpath [glob]

positional arguments:
  SharePoint-path       Source location. Must be in the form library:path.
                        This will be jinja rendered.
  outpath               Target path. Values starting with s3:// will be copied
                        to S3 using given bucket and key as key prefix. This
                        will be jinja rendered.
  glob                  Filter files in sharepoint path on this given glob.
                        This will be jinja rendered.

optional arguments:
  -k KMS_KEY_ID, --kms-key-id KMS_KEY_ID
                        AWS KMS key to use for uploading data to S3.

The following examples show how to use the connector in an exe job using bash:

#!/bin/bash

# Copy a list from S3 to SharePoint, replacing existing contents.
$LAVA_CONN_SPOINT put-list --replace s3://my-bucket/data.csv My-List

# Get list back from SharePoint and place in S3. Include a header
$LAVA_CONN_SPOINT get-list -k alias/data --delimiter "," \
    My-List s3://my-bucket/data.csv

# Copy a document from S3 to SharePoint.
$LAVA_CONN_SPOINT put-doc s3://my-bucket/lava.docx "Lava Docs:/Lava/User Guide.docx"

# Get a document from SharePoint and place in S3.
$LAVA_CONN_SPOINT get-doc "Lava Docs:/Lava/User Guide.docx" s3://my-bucket/lava.docx 

# Get all docx files from SharePoint path and place in S3 base-prefix.
# The glob is quoted so that the shell passes it through unexpanded.
$LAVA_CONN_SPOINT get-multi-doc "Lava Docs:/Lava/" s3://my-bucket/base-prefix "*.docx"

# Get all files from SharePoint path and place in S3 base-prefix.
$LAVA_CONN_SPOINT get-multi-doc "Lava Docs:/Lava/" s3://my-bucket/base-prefix
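A Python payload can drive the same command line interface with `subprocess`. The sketch below assembles the argument list for the get-list sub-command using only the options shown in the usage text above. The function name `get_list_command` is hypothetical, and the sketch assumes that LAVA_CONN_SPOINT holds the path of a single executable script.

```python
import os
import subprocess


def get_list_command(script, list_name, target, kms_key_id=None, header=True, delimiter=None):
    """Build the argv for the get-list sub-command (hypothetical helper)."""
    cmd = [script, 'get-list']
    if kms_key_id:
        cmd += ['--kms-key-id', kms_key_id]
    if not header:
        cmd.append('--no-header')
    if delimiter is not None:
        cmd += ['--delimiter', delimiter]
    # Positional arguments: source list, then target file.
    cmd += [list_name, target]
    return cmd


# The fallback path is a placeholder for running outside a lava job.
script = os.environ.get('LAVA_CONN_SPOINT', '/path/to/connection-script')
cmd = get_list_command(script, 'My-List', 's3://my-bucket/data.csv',
                       kms_key_id='alias/data', delimiter=',')

# Inside a lava job the command would then be run with:
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with list names that contain spaces and with glob patterns.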

Connector type: slack¶

The slack connector uses Slack webhooks to send messages to Slack channels. The target Slack workspace and channel are specified in Slack itself when the webhook is created.

Field	Type	Required	Description
colour	String	No	Default colour for the sidebar for Slack messages sent using `attachment` style. This can be any hex colour code or one of the Slack special values `good`, `warning` or `danger`. If not specified, a default value is used.
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
enabled	Boolean	Yes	Whether or not the connection is enabled.
from	String	No	An arbitrary source identifier for display in Slack messages. If not specified, a default value is constructed when required.
preamble	String	No	Default preamble at the start of Slack messages. Useful values include things such as `<!here>` and `<!channel>` which will cause Slack to insert `@here` and `@channel` alert tags respectively. If not specified, no preamble is used.
style	String	No	Display style for Slack messages. Options are `block` (default), `attachment` and `plain`. The first two use the corresponding block or attachment message construction mechanism provided by Slack to make messages more presentable.
type	String	Yes	`slack`.
webhook_url	String	Yes	The webhook URL provided by Slack for sending messages.
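As an illustration of the fields above, the following sketch builds an example connection specification as a Python dictionary and checks it against the required fields and the permitted style values. The checker `check_slack_spec` is hypothetical and is not how lava validates connections; the webhook URL is a placeholder.

```python
REQUIRED_FIELDS = {'conn_id', 'enabled', 'type', 'webhook_url'}
VALID_STYLES = {'block', 'attachment', 'plain'}


def check_slack_spec(spec: dict) -> list[str]:
    """Return a list of problems found in a slack connection spec (hypothetical helper)."""
    problems = [f'missing field: {name}' for name in sorted(REQUIRED_FIELDS - set(spec))]
    if spec.get('type', 'slack') != 'slack':
        problems.append('type must be "slack"')
    # style is optional; block is the documented default.
    if spec.get('style', 'block') not in VALID_STYLES:
        problems.append(f'unknown style: {spec["style"]}')
    return problems


example_spec = {
    'conn_id': 'slack-connection-id',
    'type': 'slack',
    'enabled': True,
    'webhook_url': 'https://hooks.slack.com/services/...',  # Placeholder only.
    'style': 'attachment',
    'colour': 'warning',
    'preamble': '<!here>',
}

problems = check_slack_spec(example_spec)
```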

Using the Slack Connector¶

The slack connector provides two distinct interfaces:

A native Python interface.
A command line interface.

Python Interface for Slack Connectors¶

Python scripts can directly access the underlying Python interface of a slack connector. In this case, the connector returns a lava.lib.slack.Slack object as described in the lava API documentation.

As an example, consider an exe job specification that looks something like this:

{
    "job_id": "...",
    "parameters": {
        "connections": {
            "slack": "slack-connection-id"
        }
    },
    "payload": "my-payload.py ..."
}

A Python program can use the slack connector like this:

import os
from lava.connection import get_slack_connection

# If running as a lava exe/pkg/docker job, get some info provided by lava in
# the environment. Assume our connector is labeled `slack` in the job spec.
realm = os.environ['LAVA_REALM']
conn_id = os.environ['LAVA_CONNID_SLACK']

# Get a slack connection 
slacker = get_slack_connection(conn_id, realm)

# Send a formatted message
slacker.send(
    subject='Oh no',
    message='Your oscillation overthruster has malfunctioned',
    style='attachment',  # Overrides value in connection spec.
    colour='#ff0000'  # Nice bright red. Overrides value in connection spec.
)

Executable Interface for Slack Connectors¶

When used with exe, pkg and docker job types (e.g. shell scripts), the connection is implemented by the lava-slack command.

When used as a connection script within a lava job, the -r REALM and -c CONN_ID arguments don't need to be provided by the job as these are provided by lava in the connection script.

Also, values for the --bar-colour, --from, --preamble and --style options will be supplied from the connection specification where possible. These values can be overridden by providing the appropriate options to the connection script.

usage: lava-slack [-h] [--profile PROFILE] [-v] -c CONN_ID [-r REALM]
                  [--bar-colour COLOUR] [--from NAME] [--preamble PREAMBLE]
                  [-s SUBJECT] [--style {block,plain,attachment}]
                  [--no-colour] [-l LEVEL] [--log LOG] [--tag TAG]
                  [FILENAME]

Send Slack messages using lava slack connections.

optional arguments:
  -h, --help            show this help message and exit
  --profile PROFILE     As for AWS CLI.
  -v, --version         show program's version number and exit

lava arguments:
  -c CONN_ID, --conn-id CONN_ID
                        Lava connection ID. Required.
  -r REALM, --realm REALM
                        Lava realm name. If not specified, the environment
                        variable LAVA_REALM must be set.

slack arguments:
  --bar-colour COLOUR   Colour for the sidebar for messages sent using
                        attachment style. This can be any hex colour code or
                        one of the Slack special values good, warning or
                        danger.
  --from NAME           Message sender. If not specified, the value specified
                        in the connection specification, if any, will be used.
  --preamble PREAMBLE   An optional preamble at the start of the message.
                        Useful values include things such as <!here> and
                        <!channel> which will cause Slack to insert @here and
                        @channel alert tags respectively.
  -s SUBJECT, --subject SUBJECT
                        Message subject.
  --style {block,plain,attachment}
                        Slack message style. Must be one of attachment, block,
                        plain. If not specified, any value specified in the
                        connection specification will be used or block as a
                        last resort.

message source arguments:
  FILENAME              Name of file containing the message body. If not
                        specified or "-", the body will be read from stdin.
                        Only the first 3000 bytes are read.

logging arguments:
  --no-colour, --no-color
                        Don't use colour in information messages.
  -l LEVEL, --level LEVEL
                        Print messages of a given severity level or above. The
                        standard logging level names are available but debug,
                        info, warning and error are most useful. The default
                        is info.
  --log LOG             Log to the specified target. This can be either a file
                        name or a syslog facility with an @ prefix (e.g.
                        @local0).
  --tag TAG             Tag log entries with the specified value. The default
                        is lava-slack.
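Since only the first 3000 bytes of the message body are read, a caller producing long messages may prefer to trim the text itself so that the cut does not fall in the middle of a multi-byte character. The sketch below shows one way to do that, assuming the body is encoded as UTF-8. The helper name `fit_body` is hypothetical.

```python
MAX_BODY_BYTES = 3000  # Limit stated in the lava-slack help text.


def fit_body(text: str, limit: int = MAX_BODY_BYTES) -> str:
    """Trim text so its UTF-8 encoding fits within limit bytes (hypothetical helper)."""
    raw = text.encode('utf-8')
    if len(raw) <= limit:
        return text
    # Cut at the byte limit, then drop any partial trailing character.
    return raw[:limit].decode('utf-8', errors='ignore')


body = fit_body('a' + '€' * 1000)  # 3001 bytes before trimming
```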

As an example, consider an exe job specification that looks something like this:

{
    "job_id": "...",
    "parameters": {
        "connections": {
            "slack": "slack-connection-id"
        }
    },
    "payload": "my-payload.sh ..."
}

Note the slack connection. This will provide the job with an environment variable LAVA_CONN_SLACK which points to the executable handling the connection.

If the job payload is a shell script, the connector would be invoked thus:

# Send a Slack message
$LAVA_CONN_SLACK --subject "Oh no" <<!
    Dear Buckaroo,

    Your oscillation overthruster has malfunctioned.

    -- John Bigbooté
!

Connector type: smb¶

The smb connector manages connections to SMB file shares.

Info

The smb connector underwent a significant upgrade in v8.0 (Incahuasi) to support the smbprotocol SMB implementation as well as the existing pysmb. The former has a number of advantages (e.g. DFS support). An effort has been made to retain backward compatibility for lava jobs, even though the two implementations have significantly different interfaces. Be warned, though, that some more esoteric usage patterns could experience backward compatibility issues.

Field	Type	Required	Description
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
domain	String	No	The network domain. Defaults to an empty string.
enabled	Boolean	Yes	Whether or not the connection is enabled.
encrypt	Boolean	No	Whether to encrypt the connection between Lava and the SMB server. Only available with the `smbprotocol` connection subtype. Default `false`.
host	String	Yes	DNS name or IP address of the SMB host.
is_direct_tcp	Boolean	No	If `false`, use NetBIOS over TCP/IP. If `true` use SMB over TCP/IP. Default `false`.
my_name	String	No	Local NetBIOS machine name that will identify the origin of connections. If not specified, defaults to the first 15 characters of `lava-<REALM>`.
password	String	Yes	Name of the SSM parameter containing the password for authenticating to the SMB server. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key.
port	Integer	No	Connection port number. If not specified, 139 is used if `is_direct_tcp` is `false` and 445 otherwise.
remote_name	String	Yes	NetBIOS machine name of the remote server.
subtype	String	No	Which connection type to use, `smbprotocol` or the default `pysmb`. To use encryption or DFS for the connection use the `smbprotocol` subtype.
type	String	Yes	`smb`.
use_ntlm_v2	Boolean	No	Indicates whether pysmb should use the NTLMv2 (`true`) or NTLMv1 (`false`) algorithm for authentication. Default is `true`.
user	String	Yes	User name for authenticating to the SMB server.
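The defaults described in the table can be summarised in code. The sketch below fills in the optional fields of a connection specification according to the rules above. It is illustrative only and is not lava's implementation; the function name `smb_defaults` and the host name are hypothetical.

```python
def smb_defaults(spec: dict, realm: str) -> dict:
    """Return a copy of spec with the documented defaults applied (hypothetical helper)."""
    resolved = dict(spec)
    resolved.setdefault('domain', '')
    resolved.setdefault('is_direct_tcp', False)
    # 139 for NetBIOS over TCP/IP, 445 for SMB over TCP/IP.
    resolved.setdefault('port', 445 if resolved['is_direct_tcp'] else 139)
    # NetBIOS names are limited to 15 characters.
    resolved.setdefault('my_name', f'lava-{realm}'[:15])
    resolved.setdefault('subtype', 'pysmb')
    resolved.setdefault('use_ntlm_v2', True)
    resolved.setdefault('encrypt', False)
    return resolved


netbios = smb_defaults({'host': 'fs01.example.com'}, realm='production-east')
direct = smb_defaults({'host': 'fs01.example.com', 'is_direct_tcp': True}, realm='dev')
```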

The connector supports the smb_get and smb_put job types.

Use with Python-based Executable Jobs¶

The connector can also be used with Python based exe and pkg jobs that invoke the lava connection manager directly. In this case, the connector returns a lava.lib.smb.LavaSMBConnection which provides a basic, common interface to the different subtypes.

The lava.lib.smb.LavaSMBConnection interface class provides enough functionality for most common use-cases (list path, put file, get file etc.).

The concrete implementation is handled by one of two subclasses (depending on the subtype given in the connection spec):

a lava.lib.smb.PySMBConnection which implements LavaSMBConnection using the Python package pysmb. This is the default if no connection subtype is given.
a lava.lib.smb.SMBProtocolConnection which implements LavaSMBConnection using the Python package smbprotocol.

Note that this is the low level connector. It does not handle moving files in or out of S3 or Jinja rendering of parameters. It is up to the caller to do that as required.

If the SMB connector key in the job's connections map is fserver, typical usage would be something like:

import os
from lava.connection import get_smb_connection

# Get a lava.lib.smb.LavaSMBConnection instance
smb_conn = get_smb_connection(
    conn_id=os.environ['LAVA_CONNID_FSERVER'],
    realm=os.environ['LAVA_REALM']
)

# Get a file from share 'Public' and store locally
with open('local.txt', 'wb') as fp:
    attributes, size = smb_conn.retrieve_file('Public', 'some_file.txt', fp)

smb_conn.close()

Use with Other Executable Jobs¶

When used with other exe and pkg job types (e.g. shell scripts), the connection is implemented by the lava-smb command.

This is a somewhat higher level interface to the connector in that it can also handle moving data in and out of S3. Jinja rendering is handled as per the smb_get and smb_put job types.

If the SMB connector key in the job's connections map is fserver, usage is:

usage: $LAVA_CONN_FSERVER [-J] [-l LEVEL] {put,get} ...

sub-commands:
  {put,get}
    put                 Copy a file to an SMB file share.
    get                 Copy a file from an SMB file share.

optional arguments:
  -J, --no-jinja        Disable Jinja rendering of the transfer parameters.

logging arguments:
  -l LEVEL, --level LEVEL
                        Print messages of a given severity level or above. The
                        standard logging level names are available but debug,
                        info, warning and error are most useful. The Default
                        is info.

Usage for the get sub-command:

usage: $LAVA_CONN_FSERVER get [options] SMB-path file

positional arguments:
  SMB-path              Source location. Must be in the form share-name:path.
                        This will be jinja rendered.
  file                  Target file. Values starting with s3:// will be copied
                        to S3. This will be jinja rendered.

optional arguments:
  -k KMS_KEY_ID, --kms-key-id KMS_KEY_ID
                        AWS KMS key to use for uploading data to S3.

Usage for the put sub-command:

usage: $LAVA_CONN_FSERVER put [options] file SMB-path

positional arguments:
  file         Source file. Values starting with s3:// will be copied from S3.
               This will be jinja rendered.
  SMB-path     Target location. Must be in the form share-name:path. This will
               be jinja rendered.

optional arguments:
  -m, --mkdir  Create the target directory if it doesn't exist

For example, the following code in an exe job would transfer files between S3 and the Public share on an SMB server:

#!/bin/bash

# Copy file from S3 to SMB
$LAVA_CONN_FSERVER put --mkdir \
    s3://my-bucket/data.csv Public:/a/path/data.csv

# Copy file from SMB to S3
$LAVA_CONN_FSERVER get --kms-key-id alias/data \
    Public:/a/path/data.csv s3://my-bucket/data.csv
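A Python payload that uses the low level connector needs the share name and the path as separate values. The sketch below splits the share-name:path form used by the command line interface into its two parts. It assumes the split is on the first colon, which the text above does not confirm; the helper name `split_share_path` is hypothetical.

```python
def split_share_path(spec: str) -> tuple[str, str]:
    """Split 'share-name:path' into (share, path). Hypothetical helper, not part of lava."""
    # Assumption: the share name ends at the first colon.
    share, sep, path = spec.partition(':')
    if not sep or not share or not path:
        raise ValueError(f'Expected share-name:path, got {spec!r}')
    return share, path


share, path = split_share_path('Public:/a/path/data.csv')
```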

Connector type: sqlite3¶

The sqlite3 connector handles connections to SQLite3 file based databases.

Its use in general lava jobs is pretty marginal at best. It is mostly present to facilitate testing of lava itself.

Field	Type	Required	Description
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
enabled	Boolean	Yes	Whether or not the connection is enabled.
host	String	Yes	The name of the file containing the SQLite3 database. If it starts with `s3://`, the file will be copied from S3 when the connection is created and returned to S3 when the connection is closed if it has been modified.
port	Number	Yes*	A value is required but is ignored.
preserve_case	Boolean	No	If `true`, don't fold database object names to lower case when quoting them for use in db_from_s3 jobs. The default is `false` (i.e. case folding is enabled).
type	String	Yes	`sqlite3`.
user	String	Yes*	A value is required but is ignored.

Info

Fields with a Required column marked with * must be present but the value is ignored. This is an unfortunate interface idiosyncrasy resulting from the need to maintain some internal compatibility with the other database connectors.

When used with exe and pkg job types, the connection is implemented by the sqlite3 CLI. It is invoked with the following options:

sqlite3 -bail -batch DATABASE-FILE

Connector type: ssh, scp, sftp¶

This group of connectors provides support for the SSH family of clients.

When used with exe and pkg jobs, each connector provides an environment variable pointing to a script that will run the corresponding CLI with SSH keys managed in the background.

Field	Type	Required	Description
conn_id	String	Yes	Connection identifier.
description	String	No	Description.
enabled	Boolean	Yes	Whether or not the connection is enabled.
ssh_key	String	Yes	The name of an encrypted SSM parameter containing the SSH private key. There must not be any passphrase on the key. For a given `<REALM>`, the SSM parameter name must be of the form `/lava/<REALM>/...` and the value must be a secure string encrypted using the `lava-<REALM>-sys` KMS key.
ssh_options	List[String]	No	A list of SSH options as per ssh_config(5). e.g. `StrictHostKeyChecking=no`
type	String	Yes	`ssh`, `sftp` or `scp`.
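Each entry in ssh_options corresponds to an `-o` argument of the ssh client. The sketch below shows how such a list maps onto an ssh command line. It is only an illustration of the mapping; lava's wrapper script may construct the command differently, and the function name `ssh_argv`, the key file path and the host are hypothetical.

```python
def ssh_argv(host, key_file, ssh_options=()):
    """Build an ssh command line from a list of ssh_config(5) options (hypothetical helper)."""
    argv = ['ssh', '-i', key_file]
    for option in ssh_options:
        # Each option is passed as a separate -o argument.
        argv += ['-o', option]
    argv.append(host)
    return argv


argv = ssh_argv('ssh01.example.com', '/tmp/key', ['StrictHostKeyChecking=no', 'ConnectTimeout=10'])
```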

The process for saving an SSH private key in the SSM parameter store using the AWS CLI looks like this:

# Create a new SSH key with an empty passphrase (lava requires a key with no passphrase)
ssh-keygen -f mykey -N ""

# Upload the private key to the SSM parameter store. Here realm name is "dev"
aws ssm put-parameter --name "/lava/dev/ssh01/ssh-key"  \
    --description "SSH key for ssh01" \
    --type SecureString \
    --value "$(cat mykey)" \
    --key-id alias/lava-dev-sys
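The same upload can be scripted in Python. The sketch below builds the keyword arguments for the SSM PutParameter call, following the naming rules for the parameter and the KMS key described above. The helper name `ssh_key_parameter` is hypothetical. The call to boto3 is shown only as a comment because it needs boto3 to be installed and AWS credentials to be configured.

```python
from pathlib import Path


def ssh_key_parameter(realm, suffix, key_file, description=''):
    """Build keyword arguments for an SSM PutParameter call (hypothetical helper)."""
    return {
        'Name': f'/lava/{realm}/{suffix}',
        'Description': description,
        'Type': 'SecureString',
        'Value': Path(key_file).read_text(),
        'KeyId': f'alias/lava-{realm}-sys',
    }


# With boto3 installed and AWS credentials configured:
#   import boto3
#   args = ssh_key_parameter('dev', 'ssh01/ssh-key', 'mykey', 'SSH key for ssh01')
#   boto3.client('ssm').put_parameter(**args)
```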