A small Symphony bot that attempts to unfurl URLs posted to any chat or room the bot is invited to.
"Unfurling" involves reading a variety of metadata from the given URL (title, server-preferred URL, description, preview image, etc.), formatting those elements into a human-readable message, and posting it back to the same chat.
The bot is running in the production Symphony network, hosted in the Foundation's production pod,
and is enabled for cross-pod communication (so users in other pods can connect to the bot and use it). As a result,
there is no installation process, beyond requesting a connection to the bot in the Symphony directory - the bot is
running as a user called Unfurl Bot, in the Foundation pod. Note that the bot will take up to 30 minutes to accept
new connection requests.
And of course you can always download the bot in source form and build and run it that way, if you'd prefer. The remainder of this document provides instructions on how to do that.
unfurl bot is configured via a single, optional EDN file, specified with the "-c" command line option. Use the "-h" option to get help on all of the command line options the bot supports.
The configuration file is traditionally called config.edn (but may be called anything you like) and may be stored anywhere
that can be read by the bot's JVM process via standard POSIX file I/O. It's loaded using the aero library - see the aero
documentation for details on the various advanced options aero supports.
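As a rough illustration of the kind of option aero supports, the fragment below uses aero's #env, #or and #long tag literals to read a setting from an environment variable, fall back to a default, and coerce the result to a number. The variable name UNFURL_TIMEOUT_MS is hypothetical and is not taken from the bot's default config.edn.

```edn
{
  ;; UNFURL_TIMEOUT_MS is a hypothetical variable name, used here for illustration only.
  ;; #env reads the environment variable, #or falls back to 2000 if it is unset,
  ;; and #long converts the resulting value to a long.
  :unfurl-timeout-ms #long #or [#env UNFURL_TIMEOUT_MS 2000]
}
```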
The bot ships with a default config.edn file that will be read if a config file is not specified on the command line.
This file delegates basically all configuration to environment variables, allowing the administrator to deploy and run
the bot as a standalone uberjar, and configure it exclusively from the runtime environment.
Please refer to the default config.edn file for details on using these environment variables. Their use is not described here.
The bot's configuration includes sensitive information (certificate locations and passwords), so please be extra careful to secure this configuration, however you choose to manage it (in a file, environment variables, etc.).
The configuration file is structured as follows:
{
  :symphony-coords {
    :pod-id "<id of pod to connect to - will autopopulate whichever of the 4 URLs aren't provided. (optional - see below)>"
    :session-auth-url "<the URL of the session authentication endpoint. (optional - see below)>"
    :key-auth-url "<The URL of the key authentication endpoint. (optional - see below)>"
    :agent-api-url "<The URL of the agent API. (optional - see below)>"
    :pod-api-url "<The URL of the Pod API. (optional - see below)>"
    :trust-store ["<path to Java truststore>" "<password of truststore>"]
    :user-cert ["<path to bot user's certificate>" "<password of bot user's certificate>"]
    :user-email "<bot user's email address>"
  }
  :jolokia-config {
    "host" "<jolokia-server-host>"
    "port" "<jolokia-server-port-as-a-string>"
  }
  :blacklist ["<hostname>" "<hostname>" "xxx" "microsoft.com" ...] ; Optional
  :blacklist-files ["/path/or/url/of/text/file.txt" "/path/or/url/of/some/other/file.txt"] ; Optional
  :unfurl-timeout-ms <timeout-in-ms> ; Optional - defaults to 2000 (2 seconds)
  :http-proxy ["<proxy-host>" <proxy-port>] ; Optional - only needed if you use an HTTP proxy
  :accept-connections-interval <minutes> ; Optional - defaults to 30 minutes
  :admin-emails ["user1@domain.tld" "user2@domain.tld"] ; Optional
}
:symphony-coords - the coordinates of the various endpoints, certificates, knickknacks and geegaws that the bot needs in order
to connect to a Symphony pod. This map is passed directly to the clj-symphony library's connect function, and has the same
semantics as what's described there.
:jolokia-config - the configuration of the Jolokia library, used to support server-side ops monitoring of the bot.
This map is passed directly to Jolokia's JolokiaServerConfig constructor.
See the default Jolokia property file for a full list of the supported configuration options and their default values,
and note that all keys and values in this map MUST be strings (this is a Jolokia requirement).
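To make the strings-only requirement concrete, here is a small fragment with illustrative values. Both the keys and the values are quoted, including the port number, which would otherwise be read as an EDN integer. The host and port shown are examples only, not the bot's defaults.

```edn
:jolokia-config {
  "host" "localhost"  ; string key, string value
  "port" "8778"       ; the port is quoted too - an unquoted 8778 would be a number, not a string
}
```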
The :blacklist and :blacklist-files settings define the blacklist (aka blocklist) the bot should refer to, in order to determine whether a given URL should be ignored. It can be provided inline in the configuration file (:blacklist), as a list of text files (:blacklist-files), or both. Each file may be at any location that can be read by clojure.core/slurp - this includes both local files and remote URLs.
Regardless of how the blacklist is provided (inline, files, or both), all entries are merged and de-duped, resulting in a single blacklist used by the bot at runtime.
Entries themselves may be a hostname, domain name, or TLD, and must not begin with a full stop (.) character. Some examples:
Blacklist Entry | Description |
---|---|
localhost | Blacklists localhost. |
xxx | Blacklists everything in the ".xxx" TLD. |
microsoft.com | Blacklists every site with a ".microsoft.com" URL. |
drive.google.com | Blacklists Google Drive. |
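One way to read the table above is that an entry matches a host when the host is the entry itself or ends with a full stop followed by the entry. The sketch below implements that reading. It is an assumption about the matching rule, not the bot's actual code, and the names blacklisted? and example-blacklist are hypothetical.

```clojure
(require '[clojure.string :as str])

(defn blacklisted?
  "Returns true if host equals a blacklist entry, or ends with \".\" + entry.
  This is an assumed matching rule for illustration, not the bot's implementation."
  [blacklist host]
  (let [host (str/lower-case host)]
    (boolean
      (some (fn [entry]
              (let [entry (str/lower-case entry)]
                (or (= host entry)
                    (str/ends-with? host (str "." entry)))))
            blacklist))))

;; The entries from the table above, as a set (which also de-dupes them)
(def example-blacklist #{"localhost" "xxx" "microsoft.com" "drive.google.com"})

(blacklisted? example-blacklist "www.microsoft.com") ; true  (subdomain of microsoft.com)
(blacklisted? example-blacklist "example.xxx")       ; true  (in the xxx TLD)
(blacklisted? example-blacklist "docs.google.com")   ; false (only drive.google.com is listed)
```

Requiring the full stop before the entry is what stops an unrelated host such as notmicrosoft.com from matching the microsoft.com entry.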
If you're looking for a curated public blacklist, Université Toulouse 1 Capitole provides a comprehensive one
that's compatible with this feature (configure unfurl bot to use whichever of the various domains files suit your needs,
via the :blacklist-files setting). Note that configuring this entire blacklist results in the bot using approximately 1GB of
memory - make sure your server and JVM are sized appropriately.
:unfurl-timeout-ms - the timeout, in milliseconds, for each unfurling operation. If not specified, defaults to 2000 (2 seconds).
:http-proxy - the coordinates (host and port) of an HTTP proxy to be used when accessing URLs that are to be unfurled.
Note that use of an HTTP proxy to make calls to the Symphony APIs is not yet supported by clj-symphony.
:accept-connections-interval - the interval (in minutes) that the bot will use to check for and accept incoming cross-pod connection requests. If not specified, defaults to 30 minutes.
:admin-emails - a list of administrator email addresses. These users will be able to interact with the bot via ChatOps (1:1 chats
with the bot in Symphony). Administrators should say help to the bot to get a list of the available admin commands.
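For reference, the optional scalar settings might be filled in as follows. All values are illustrative placeholders (the proxy host, port and email address are made up), and the other keys, such as :symphony-coords, are omitted for brevity.

```edn
{
  :unfurl-timeout-ms           5000                        ; 5 seconds instead of the 2 second default
  :http-proxy                  ["proxy.example.com" 8080]  ; host is a string, port is a number
  :accept-connections-interval 10                          ; check for connection requests every 10 minutes
  :admin-emails                ["admin@example.com"]       ; users allowed to issue admin commands
}
```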
unfurl bot uses the logback library for logging, and ships with a reasonable default logback.xml file.
Please review the logback documentation if you wish to override this default logging configuration.
For now, you can run unfurl bot either directly or as a Docker image.
$ lein git-info-edn
$ lein run -- -c <path to EDN configuration file>
or
$ lein do git-info-edn, uberjar
...
$ java -jar ./target/bot-unfurl-standalone.jar -c <path to EDN configuration file>
To build the container:
$ docker build -t bot-unfurl .
To run the container:
$ # Interactively:
$ docker run -v /path/to/config/directory:/etc/opt/bot-unfurl:ro bot-unfurl
$ # In the background:
$ docker run -d -v /path/to/config/directory:/etc/opt/bot-unfurl:ro bot-unfurl
Where /path/to/config/directory should be replaced with the fully qualified path of the configuration directory
on the Docker host. This configuration directory must contain the bot user's certificate, the truststore, and a config.edn
file (in the format described above) that points to the certificates using /etc/opt/bot-unfurl as the base path
(that's where the configuration folder is mounted within the container). It may optionally contain other files as well.
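As a sketch of what "using /etc/opt/bot-unfurl as the base path" means in practice, the relevant part of config.edn might look like the fragment below. The file names truststore.jks and bot-user.p12 are hypothetical. Substitute the names of the files actually present in your configuration directory.

```edn
:symphony-coords {
  ;; Hypothetical file names - these are paths as seen from INSIDE the container,
  ;; where the host's configuration directory is mounted at /etc/opt/bot-unfurl.
  :trust-store ["/etc/opt/bot-unfurl/truststore.jks" "<password of truststore>"]
  :user-cert   ["/etc/opt/bot-unfurl/bot-user.p12" "<password of bot user's certificate>"]
  :user-email  "<bot user's email address>"
}
```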
You can also use Docker Compose, by running:
$ docker-compose up -d
This assumes that the etc directory contains the certificate, truststore, and config.edn file, as described above.
This project has two permanent branches called master and dev. master is a GitHub protected branch and cannot be pushed to
directly - all pushes (from project team members) and pull requests (from the wider community) must be made against the dev
branch. The project team will periodically merge outstanding changes from dev to master.
All commits to the dev branch automatically trigger redeployment of the instance of the bot that's configured to run against the
Symphony pod in the Foundation's Open Developer Platform (ODP).
All commits to the master branch automatically trigger redeployment of the instance of the bot that's configured to run
against the Foundation's production pod.
Copyright 2016 Fintech Open Source Foundation
Distributed under the Apache License, Version 2.0.
SPDX-License-Identifier: Apache-2.0
To see the full list of licenses of all third party libraries used by this project, please run:
$ lein licenses :csv | cut -d , -f3 | sort | uniq
To see the dependencies and licenses in detail, run:
$ lein licenses