You're trying to list all branches for 1000+ repositories by ID via the GitLab GraphQL API.
The API won't let you ask for more than 100 repositories at once, and the query complexity limit in this case brings that number down to 35. Each of these repositories might have more than a thousand branches, so you need paging there too. We also want to keep the number of requests as low as possible.
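To get a feel for the numbers, here is a small sketch of the ID chunking alone. The IDs are made up for illustration; only the limit of 35 comes from the text above. With 1000 repositories, every repository needs at least one request, so 29 queries is the floor before any branch paging happens.

```clojure
;; Hypothetical project IDs, for illustration only.
(def sample-ids
  (mapv #(str "gid://gitlab/Project/" %) (range 1 1001)))

;; 35 is the per-query limit imposed by query complexity.
(def id-batches (partition-all 35 sample-ids))

(count id-batches)        ; => 29
(count (last id-batches)) ; => 20
```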
Unfortunately, this cannot be done directly. A GraphQL query looks like this:
projects(first: 35,
         ids: ["gid://gitlab/Project/28743567", "gid://gitlab/Project/29397853", ...],
         membership: true) {
  pageInfo { hasNextPage, endCursor }
  nodes {
    id, name, visibility, httpUrlToRepo, fullPath
    namespace { id }
    repository { rootRef,
                 branchNames(limit: 100, offset: 0, searchPattern: "*" ) }
  }
}
We have two levels of paging here:
- first + ids combo, which can load up to 35 projects by ID. So if we have more than 35 IDs, we need to split them across multiple queries.
- offset + limit combo. We can only request a specific page, and it is the same page for all of the IDs listed.

To be efficient with our requests, we must therefore batch together IDs that have, e.g., a third page available.
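The batching idea can be sketched in a few lines of plain Clojure. This is only an illustration of the grouping rule, not the library's implementation: `batch-by-page` and the `:page` key are hypothetical names, and each input map stands for a project that still needs the given page.

```clojure
(defn batch-by-page
  "Groups pending states by the page they need next, then splits each
  group into batches of at most `max-ids`. Hypothetical helper."
  [max-ids states]
  (->> states
       (group-by :page)
       (sort-by key)
       (mapcat (fn [[page group]]
                 (for [chunk (partition-all max-ids group)]
                   {:page page :ids (mapv :id chunk)})))))

(batch-by-page 2 [{:id "a" :page 0} {:id "b" :page 2}
                  {:id "c" :page 0} {:id "d" :page 0}
                  {:id "e" :page 2}])
;; => ({:page 0, :ids ["a" "c"]} {:page 0, :ids ["d"]} {:page 2, :ids ["b" "e"]})
```

Each resulting map corresponds to one request: a single offset shared by all the IDs in the batch.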
We want to have a result that has information about the project + list of branches.
The response to that query looks like this:
:body {:data {:projects {:pageInfo {:hasNextPage false, :endCursor "eyJpZCI6IjI3NjY1ODQzIn0"},
                         :nodes [{:id "gid://gitlab/Project/29918528",
                                  :name "Tinygo",
                                  :visibility "private",
                                  :httpUrlToRepo "",
                                  :fullPath "rtg41000/group1/tinygo",
                                  :namespace {:id "gid://gitlab/Group/13505290"},
                                  :repository {:rootRef "release",
                                               :branchNames ["cgo-picolibc-stdio"
                                                             ...]}}]}}}
Here's some code:
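;; Assumed requires; the original snippet does not show them.
(require '[clj-http.client :as client]
         '[cheshire.core :as json])

;; Limits from the discussion above.
(def ids-per-page 35)
(def branches-per-page 100)

(defn branches
  "Fetches one page of branch names for the given project IDs."
  [auth-token ids page]
  (let [q (format "query {
  projects(first: %s,
           ids: %s,
           membership: true) {
    pageInfo { hasNextPage, endCursor }
    nodes {
      id, name, visibility, httpUrlToRepo, fullPath
      namespace { id }
      repository { rootRef, branchNames(limit: %s, offset: %s, searchPattern: \"*\" )}
    }
  }
}"
                  (count ids)
                  (json/generate-string ids)
                  branches-per-page
                  (* branches-per-page page))]
    ;; The request call was lost in the source; clj-http's `request` is assumed.
    (client/request
     {:url "" ; endpoint URL is blank in the source
      :content-type :json
      :as :json
      :oauth-token auth-token
      :request-method :post
      :form-params {:query q}})))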
(defn project-branches [auth-token paging-states]
  ;; All states in a batch share the same page, so read it from the first one.
  (let [page (:page-cursor (first paging-states) 0)
        b (branches auth-token (map :id paging-states) page)
        nodes (get-in b [:body :data :projects :nodes])
        results (mapv (fn [{:keys [id repository] :as node}]
                        {:id id
                         :entity-type :project-branches
                         :project (update node :repository dissoc :branchNames)
                         ;; A full page suggests there may be more branches.
                         :page-cursor (when (>= (count (:branchNames repository)) branches-per-page)
                                        (inc page))
                         :items (:branchNames repository)})
                      nodes)]
    (p/merge-results paging-states results)))
(def e2 (-> (p/engine)
            (p/with-concurrency 5)
            (p/with-batcher false ids-per-page :page-cursor)))

;; The head of this call is missing in the source; `p/paginate!` is an assumption.
(p/paginate! e2
             #(project-branches "521aaxxxxxx7befc7828" %)
             (map #(vector :project-branches %) ids))
project-branches returns the updated paging-states collection, with an extra :project key holding all the project-related information.
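The :page-cursor rule used in project-branches can be tried in isolation. The sketch below pulls it out into a standalone function; `next-page` is a hypothetical name, and the page size of 100 is the one used throughout this document.

```clojure
(def branches-per-page 100) ; page size used in this document

(defn next-page
  "Returns the next page number, or nil when the current page came back
  short, which means there is nothing more to fetch."
  [page items]
  (when (>= (count items) branches-per-page)
    (inc page)))

(next-page 0 (repeat 100 "branch")) ; => 1
(next-page 3 ["master"])            ; => nil
```

One consequence of this rule: a project whose branch count is an exact multiple of the page size costs one extra request, which comes back with an empty page.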
With a concurrency of 1, my sample of 117 projects with varying numbers of branches produces the following requests:
This is obviously the fewest requests needed to page through them all. If I raise the concurrency to 5, I get:
This is less ideal, but the engine keeps making requests every 100 ms whenever there is spare concurrency. The actual wall-clock time is half of what it is with a concurrency of 1.
{:id "gid://gitlab/Project/29917873",
 :entity-type :project-branches,
 :pages 1,
 :items ["master"],
 :page-cursor nil,
 :project {:id "gid://gitlab/Project/29917873",
           :name "Awesome Hacking",
           :visibility "private",
           :httpUrlToRepo "",
           :fullPath "rtg41000/group8/Awesome-Hacking",
           :namespace {:id "gid://gitlab/Group/13505299"},
           :repository {:rootRef "master"}}}

{:id "gid://gitlab/Project/29917932",
 :entity-type :project-branches,
 :pages 1,
 :items ["dependabot/npm_and_yarn/" ...],
 :page-cursor nil,
 :project {:id "gid://gitlab/Project/29917932",
           :name "Blueprint",
           :visibility "private",
           :httpUrlToRepo "",
           :fullPath "rtg41000/group8/blueprint",
           :namespace {:id "gid://gitlab/Group/13505299"},
           :repository {:rootRef "develop"}}}
We can then reconstruct the expected final data shape easily from such maps.
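As one possible way to do that, the sketch below folds the accumulated :items back into the project map, giving the same shape as a node in the GraphQL response. `->project` is a hypothetical helper, and it assumes that :items on a finished map holds the branches from all pages.

```clojure
(defn ->project
  "Merges the project info of a finished paging state with its branches."
  [{:keys [project items]}]
  (assoc-in project [:repository :branchNames] items))

(->project {:id "gid://gitlab/Project/29917873"
            :items ["master"]
            :page-cursor nil
            :project {:id "gid://gitlab/Project/29917873"
                      :name "Awesome Hacking"
                      :repository {:rootRef "master"}}})
;; => {:id "gid://gitlab/Project/29917873", :name "Awesome Hacking",
;;     :repository {:rootRef "master", :branchNames ["master"]}}
```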