MonkeyCI needs to store various kinds of information:
Most of this information is fairly small and structured, except for logs,
caches and artifacts, which can be large blobs. The structured information
needs to be searchable to some extent, and of course it must be durable. I
would like to keep an open view on which technology is most suited for this,
so I don't want to blindly fall back to a relational database. Currently I'm
thinking that keeping the information in edn (or json) files in object
storage could be useful. This can then be augmented with some kind of
indexing system, to allow for searching. Indices themselves could also be
stored in edn, and could be loaded into Redis or Elasticsearch. As long as there is no income, I will
focus on the cheapest solution that gets the job done, without having to
re-invent the wheel. OCI also offers an autonomous JSON database,
which could also serve our needs.
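To make the indexing idea a bit more concrete, here is a minimal sketch in Python of an in-memory index over stored records. It only illustrates the concept: the record shape, the field names (`status`, `branch`) and the function names are assumptions made for this example, not part of MonkeyCI.

```python
from collections import defaultdict


def build_index(records, fields):
    """Map (field, value) pairs to the storage paths of matching records.

    `records` is a dict of {storage_path: record_dict}. Both the shape and
    the field names are hypothetical, chosen only for this illustration.
    """
    index = defaultdict(set)
    for path, record in records.items():
        for field in fields:
            if field in record:
                index[(field, record[field])].add(path)
    return index


def search(index, field, value):
    """Return the sorted storage paths whose record has field == value."""
    return sorted(index.get((field, value), ()))


records = {
    "acme/web/frontend/build-1": {"status": "success", "branch": "main"},
    "acme/web/frontend/build-2": {"status": "failure", "branch": "main"},
    "acme/web/backend/build-1": {"status": "success", "branch": "dev"},
}
index = build_index(records, ["status", "branch"])
print(search(index, "status", "success"))
```

An index like this is cheap to rebuild from the stored files, so it could be treated as disposable and loaded into whatever search backend turns out to be most practical.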
The build process itself only needs to store information; there is no need
to read any, apart from caching. Initially, we will store everything in
object storage, as edn files. The advantage over json is that edn can
be appended to: you can have multiple objects in one file. This could be useful
for adding log statements, or updating build progress. The information is
stored in a single bucket, organized like <customer>/<project>/<repository>/<build>.
The build id is generated by MonkeyCI, and could be as simple as a UUID.
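The layout and the append-only idea could be sketched as follows, in Python for illustration. This assumes a local directory standing in for the bucket. The file name `progress.edn`, the record fields and the tiny `to_edn` helper (which only handles flat maps with string and integer values) are hypothetical.

```python
import tempfile
import uuid
from pathlib import Path


def build_path(customer, project, repository, build_id=None):
    """Return (<customer>/<project>/<repository>/<build>, build_id)."""
    build_id = build_id or str(uuid.uuid4())
    return "/".join([customer, project, repository, build_id]), build_id


def to_edn(record):
    """Render a flat dict as an edn map; supports only str and int values."""
    parts = []
    for key, value in record.items():
        if isinstance(value, bool) or not isinstance(value, (str, int)):
            raise TypeError(f"unsupported value for {key!r}: {value!r}")
        if isinstance(value, str):
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            rendered = f'"{escaped}"'
        else:
            rendered = str(value)
        parts.append(f":{key} {rendered}")
    return "{" + " ".join(parts) + "}"


def append_record(root, path, filename, record):
    """Append one edn map to a file without rewriting earlier entries."""
    target = Path(root) / path / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("a", encoding="utf-8") as f:
        f.write(to_edn(record) + "\n")
    return target


root = tempfile.mkdtemp()  # local stand-in for the bucket
path, build_id = build_path("acme", "web", "frontend")
append_record(root, path, "progress.edn", {"step": 0, "status": "running"})
log = append_record(root, path, "progress.edn", {"step": 0, "status": "success"})
print(log.read_text())
```

Each call adds one more top-level map to the file, so a reader can consume the entries one by one as a stream instead of parsing a single enclosing document.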
Each build "folder" contains the following information:
Depending on the configuration, this could also just be stored locally, which is what we will do initially, or in development mode.
Artifacts are just blobs that will be put into storage after each build step.
Since storage is not free, we will have to limit the amount of data,
or the period for which we store it. Artifacts are configured at step level,
and have a name and one or more paths that will be added to the artifact.
We will probably use tar and gzip to put all files in one package.
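As a rough illustration of the packaging step, the sketch below uses Python's standard `tarfile` module to bundle an artifact's paths into one gzipped tarball. The function name, the `.tgz` naming and the example paths are assumptions for this example only.

```python
import tarfile
import tempfile
from pathlib import Path


def package_artifact(name, paths, base_dir, out_dir):
    """Bundle `paths` (relative to base_dir) into <out_dir>/<name>.tgz."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archive = out_dir / f"{name}.tgz"
    with tarfile.open(archive, "w:gz") as tar:
        for rel in paths:
            # arcname keeps entries relative to the working directory,
            # so the archive does not embed absolute paths.
            tar.add(Path(base_dir) / rel, arcname=rel)
    return archive


# Hypothetical step output: one file and one directory.
work = Path(tempfile.mkdtemp())
(work / "target").mkdir()
(work / "target" / "app.jar").write_text("jar contents")
(work / "reports").mkdir()
(work / "reports" / "test.xml").write_text("<testsuite/>")

archive = package_artifact(
    "build-output", ["target/app.jar", "reports"], work, work / "artifacts"
)
with tarfile.open(archive, "r:gz") as tar:
    names = sorted(tar.getnames())
print(names)
```

Directories are added recursively, so an artifact path can point at either a single file or a whole folder.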
Caches are similar to artifacts, but they are not publicly available;
instead they are reused between builds. Similar to CircleCI or GitLab, we could
assign a key to each cache. This means that caches won't be stored along with
the build, but higher up, most likely at repository level. Each build step
can hold a cache configuration entry that has a key and a list of paths
that need to be cached/restored. Before the step is executed, the cache is
restored (if found), and after the step, it is updated. Depending on the
configuration, the update will happen only if the step was successful, or
regardless of status.
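The restore/update flow described above could look roughly like the following Python sketch. Plain dicts stand in for the workspace file system and the repository-level cache storage. The configuration keys (`key`, `paths`, `update`) and the values `"on-success"` and `"always"` are hypothetical names picked for this illustration.

```python
def run_step(step, workspace, cache_store, action):
    """Run `action(workspace)` with a cache restore before and update after.

    step: {"cache": {"key": str, "paths": [str],
                     "update": "on-success" | "always"}}
    Returns True when the step succeeded.
    """
    cache = step.get("cache")
    if cache and cache["key"] in cache_store:
        # Restore the cached paths before the step runs (only if found).
        workspace.update(cache_store[cache["key"]])
    try:
        action(workspace)
        success = True
    except Exception:
        success = False
    if cache and (success or cache.get("update") == "always"):
        # Save the current contents of the configured paths under the key.
        cache_store[cache["key"]] = {
            p: workspace[p] for p in cache["paths"] if p in workspace
        }
    return success


store = {}
step = {"cache": {"key": "deps-v1", "paths": [".m2"], "update": "on-success"}}


def download_deps(ws):
    # Pretend to download dependencies unless they were restored already.
    ws.setdefault(".m2", "downloaded")


first = {}
run_step(step, first, store, download_deps)

second = {}
run_step(step, second, store, lambda ws: None)
print(second)  # the second build sees the restored cache
```

Because the cache is looked up by key rather than by build, the second build picks up what the first one stored, which is why it makes sense to keep caches at repository level.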