Merged
62 changes: 62 additions & 0 deletions .github/scripts/batch_index.sh
@@ -0,0 +1,62 @@
set -e

# Similar to dry_run.sh, except actually builds exactly one batch group.
# This way CI can spread the full build across multiple jobs, keeping the
# total time reasonable.
batch_index=$1

# -f not -x since downloaded exe may not have executable permissions.
if [[ -f ./bin/clc-stackage ]]; then
echo "*** ./bin/clc-stackage exists, not re-installing ***"

# May need to add permissions, if this exe was downloaded
chmod a+x ./bin/clc-stackage
else
echo "*** Updating cabal ***"
cabal update

echo "*** Installing clc-stackage ***"
cabal install exe:clc-stackage --installdir=./bin --overwrite-policy=always
fi

if [[ -d output ]]; then
rm -r output
fi

echo "*** Building with --batch-index $batch_index ***"

set +e

./bin/clc-stackage \
--batch 200 \
--batch-index "$batch_index" \
--cabal-options="--semaphore" \
--cleanup off

ec=$?

if [[ $ec != 0 ]]; then
echo "*** clc-stackage failed ***"
else
echo "*** clc-stackage succeeded ***"
fi

# Print out the logs + the packages we built, in case it is useful e.g.
# what did CI actually do.
if [[ -f generated/generated.cabal ]]; then
echo "*** Printing generated cabal file ***"
cat generated/generated.cabal
else
echo "*** No generated/generated.cabal ***"
fi

if [[ -f generated/cabal.project.local ]]; then
echo "*** Printing generated cabal.project.local file ***"
cat generated/cabal.project.local
else
echo "*** No generated/cabal.project.local ***"
fi

.github/scripts/print_logs.sh

exit $ec
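
`batch_index.sh` passes `$1` straight through to `--batch-index`, so the script itself never checks for a missing or malformed argument. A minimal sketch of an up-front check follows. The helper names `is_valid_batch_index` and `require_batch_index` are hypothetical and not part of this PR, and the check assumes indexes are positive integers, as the CI matrix's `range(1; 19)` suggests.

```shell
# Hypothetical guard (not part of this PR): succeed only when the argument
# is a positive integer without leading zeros.
is_valid_batch_index() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;  # empty, or contains a non-digit character
    0*)          return 1 ;;  # zero, or a leading zero
    *)           return 0 ;;
  esac
}

# Hypothetical wrapper: print a usage message to stderr and fail when the
# index is missing or malformed.
require_batch_index() {
  if ! is_valid_batch_index "${1:-}"; then
    echo "usage: batch_index.sh <batch-index>" >&2
    return 1
  fi
}
```

Calling `require_batch_index "$1"` near the top of the script would, under `set -e`, stop the run before any install or build work begins.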
22 changes: 13 additions & 9 deletions .github/scripts/dry_run.sh
@@ -1,15 +1,19 @@
set -e

echo "*** Updating cabal ***"

cabal update
if [[ -f ./bin/clc-stackage ]]; then
echo "*** ./bin/clc-stackage exists, not re-installing ***"
chmod a+x ./bin/clc-stackage
else
echo "*** Updating cabal ***"
cabal update

echo "*** Installing clc-stackage ***"
# --overwrite-policy=always and deleting output/ are unnecessary for CI since
# this script will only be run one time, but it's helpful when we are
# testing the script locally.

# --overwrite-policy=always and deleting output/ are unnecessary for CI since
# this script will only be run one time, but it's helpful when we are
# testing the script locally.
cabal install exe:clc-stackage --installdir=./bin --overwrite-policy=always
echo "*** Installing clc-stackage ***"
cabal install exe:clc-stackage --installdir=./bin --overwrite-policy=always
fi

if [[ -d output ]]; then
rm -r output
@@ -18,7 +22,7 @@ fi
echo "*** Building all with --dry-run ***"

set +e
./bin/clc-stackage --batch 100 --cabal-options="--dry-run"
./bin/clc-stackage --batch 200 --cabal-options="--dry-run"

ec=$?

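
Both scripts switch off `set -e` before calling `clc-stackage` and save `$?` in `ec`; `batch_index.sh` then prints its diagnostics and exits with that code. The sketch below shows an alternative way to get the same effect without toggling `set -e`, using `|| ec=$?`. The function name `run_and_report` is hypothetical; this illustrates the pattern and is not code from this PR.

```shell
# Hypothetical helper: run a command, report success or failure, and return
# the command's own exit status. The `|| ec=$?` form keeps a failing command
# from aborting the script even when `set -e` is active.
run_and_report() {
  ec=0
  "$@" || ec=$?

  if [ "$ec" -ne 0 ]; then
    echo "*** command failed with exit code $ec ***"
  else
    echo "*** command succeeded ***"
  fi

  return "$ec"
}

run_and_report true
```

Any log printing can be placed between the status message and the final `return`, so it runs whether or not the command succeeded.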
81 changes: 77 additions & 4 deletions .github/workflows/ci.yaml
@@ -26,7 +26,7 @@ jobs:
- "windows-latest"
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v6
- uses: haskell-actions/setup@v2
with:
# Should be the current stackage nightly, though this will likely go
@@ -57,21 +57,94 @@ jobs:
if: ${{ failure() && steps.functional.conclusion == 'failure' }}
shell: bash
run: .github/scripts/print_logs.sh
nix:
dry-run:
strategy:
fail-fast: false
matrix:
os:
- "ubuntu-latest"
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v6

- name: Setup nix
uses: cachix/install-nix-action@v30
uses: cachix/install-nix-action@v31
with:
github_access_token: ${{ secrets.GITHUB_TOKEN }}
nix_path: nixpkgs=channel:nixos-unstable

- name: Dry run
run: nix develop .#ci -Lv -c bash -c '.github/scripts/dry_run.sh'

# Upload installed binary so that build-batch does not need to re-install
# it.
- name: Upload clc-stackage binary
uses: actions/upload-artifact@v7
with:
name: clc-stackage-binary
path: ./bin/clc-stackage
retention-days: 1

# Uses jq's 'range(m; n)' operator to create list of indexes from [m, n)
# for the build-batch job. Slightly nicer than manually listing all of them.
build-batch-indexes:
runs-on: "ubuntu-latest"
outputs:
indexes: ${{ steps.set-batch-indexes.outputs.indexes }}
steps:
- id: set-batch-indexes
run: echo "indexes=$(jq -cn '[range(1; 19)]')" >> $GITHUB_OUTPUT

# Ideally CI would run a job that actually builds all packages, but this
# can take a very long time, potentially longer than github's free CI limits
# (last time checked: 5.5 hrs).
#
# What we can do instead, is perform the usual batch process of dividing the
# package set into groups, then have a different job build each group.
# This does /not/ run up against github's free CI limits.
#
# To do this, we have the script batch_index.sh divide the package set into
# groups, per --batch. Then, using github's matrix strategy, have each
# job build only a specific group by passing its index as --batch-index.
#
# In other words, each job runs
#
# clc-stackage --batch N --batch-index k
#
# where k is matrix.index, hence each building a different group.
# The only other consideration we have, then, is to make sure we have enough
# indices to cover the whole package set.
#
# Currently, we choose --batch to be 200, and the total package set is
# around 3400, which is filtered to about 3100 packages to build. We thus
# need at least ceiling(3100 / 200) = 16 indexes to cover this.
#
# There is no harm in going overboard e.g. if we have an index that is out of
# range, that job will simply end with a warning message. We should
# therefore err on the side of adding too many indices, rather than too few.
build-batch:
needs: [build-batch-indexes, dry-run]
strategy:
fail-fast: false
matrix:
index: ${{ fromJSON(needs.build-batch-indexes.outputs.indexes) }}
name: Batch group ${{ matrix.index }}
runs-on: "ubuntu-latest"
steps:
- uses: actions/checkout@v6

- name: Setup nix
uses: cachix/install-nix-action@v31
with:
github_access_token: ${{ secrets.GITHUB_TOKEN }}
nix_path: nixpkgs=channel:nixos-unstable

# Download clc-stackage binary from dry-run job.
- name: Download binary
uses: actions/download-artifact@v7
with:
name: clc-stackage-binary
path: ./bin

- name: Build
run: nix develop .#ci -Lv -c bash -c '.github/scripts/batch_index.sh ${{ matrix.index }}'
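
The index-count reasoning in the `build-batch` comment can be checked with a little arithmetic. The sketch below, using the hypothetical helpers `num_batches` and `index_list`, computes the ceiling division and prints a compact JSON array of the same shape as the output of `jq -cn '[range(1; 19)]'`. The package count of 3100 is the approximate figure quoted in the comment, not a measured value.

```shell
# Hypothetical helper: smallest number of batch groups needed to cover
# `total` packages at `size` packages per group (ceiling division).
num_batches() (
  total=$1
  size=$2
  echo $(( (total + size - 1) / size ))
)

# Hypothetical helper: compact JSON array of the indexes 1..n.
index_list() (
  n=$1
  i=1
  out=""
  while [ "$i" -le "$n" ]; do
    out="${out}${out:+,}${i}"
    i=$(( i + 1 ))
  done
  echo "[${out}]"
)

needed=$(num_batches 3100 200)   # 16 with the figures quoted in the comment
index_list "$(( needed + 2 ))"   # two spare indexes, i.e. 1 through 18
```

Since an out-of-range index only ends with a warning, padding the computed count by a couple of indexes is cheap insurance against growth in the package set.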
34 changes: 16 additions & 18 deletions README.md
@@ -30,7 +30,15 @@ The procedure is as follows:
with-compiler: /home/ghc/_build/stage1/bin/ghc
```

6. Run `clc-stackage` and wait for a long time. See [below](#the-clc-stackage-exe) for more details.
6. Add your custom GHC to the PATH e.g.

```
export PATH=/home/ghc/_build/stage1/bin/:$PATH
```

Nix users can uncomment (and modify) this line in the `flake.nix`.

7. Run `clc-stackage` and wait for a long time. See [below](#the-clc-stackage-exe) for more details.

* On a recent Macbook Air it takes around 12 hours, YMMV.
* You can interrupt `cabal` at any time and rerun again later.
@@ -51,14 +59,14 @@ The procedure is as follows:
$ watch -n 10 "grep -Eo 'Completed|^ -' output/logs/current-build/stdout.log | sort -r | uniq -c | awk '{print \$1}'"
```

7. If any packages fail to compile:
8. If any packages fail to compile:

* copy them locally using `cabal unpack`,
* patch to conform with your proposal,
* link them from `packages` section of `cabal.project`,
* return to Step 7.

8. When everything finally builds, get back to CLC with a list of packages affected and patches required.
9. When everything finally builds, get back to CLC with a list of packages affected and patches required.
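
The PATH setup in Step 6 can be sanity-checked before starting a long build. A minimal sketch, assuming a POSIX shell: the helper `resolves_under` is hypothetical and simply compares the directory of the first matching executable on the PATH with the expected one. The directory used is the example path from the step above.

```shell
# Hypothetical check: succeed only if the first `tool` found on the PATH
# lives directly inside `expected_dir` (trailing slashes are ignored).
resolves_under() (
  tool=$1
  expected_dir=${2%/}
  found=$(command -v "$tool") || exit 1
  found_dir=${found%/*}
  [ "${found_dir%/}" = "$expected_dir" ]
)

if resolves_under ghc /home/ghc/_build/stage1/bin/; then
  echo "custom GHC is first on the PATH"
else
  echo "a different ghc (or none) is first on the PATH"
fi
```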

### Troubleshooting

@@ -68,7 +76,7 @@ Because we build with `nightly` and are at the mercy of cabal's constraint solve

- `p` requires a new system dependency (e.g. a C library).
- `p` is an executable.
- `p` depends on a package in [./excluded_pkgs.jsonc](excluded_pkgs.jsonc).
- `p` depends on an excluded package in [./package_index.jsonc](package_index.jsonc).

- A cabal flag is set in a way that breaks the build. For example, our snapshot requires that the `bson` library does *not* have its `_old-network` flag set, as this will cause a build error with our version of `network`. This flag is automatic, so we have to force it in `generated/cabal.project` with `constraints: bson -_old-network`.

@@ -87,17 +95,17 @@ We attempt to mitigate such issues by:

Nevertheless, it is still possible for issues to slip through. When a package `p` fails to build for some reason, we should first:

- Verify that `p` is not in `excluded_pkgs.jsonc`. If it is, nightly probably pulled in some new reverse-dependency `q` that should be added to `excluded_pkgs.jsonc`.
- Verify that `p` is not in `package_index.excluded`. If it is, nightly probably pulled in some new reverse-dependency `q` that should be added to `package_index.excluded`.

- Verify that `p` does not have cabal flags that can affect dependencies / API.

- Verify that `p`'s version matches what it is in the current snapshot (e.g. `https://www.stackage.org/nightly`). If it does not, either a package needs to be excluded or constraints need to be added.

In general, user mitigations for solver / build problems include:

- Adding `p` to `excluded_pkgs.jsonc`. Note that `p` will still be built if it is a (transitive) dependency of some other package in the snapshot, but will not have its exact bounds written to `cabal.project.local`.
- Adding `p` to `package_index.excluded`. Note that `p` will still be built if it is a (transitive) dependency of some other package in the snapshot, but will not have its exact bounds written to `cabal.project.local`.

- Manually downloading a snapshot (e.g. `https://www.stackage.org/nightly/cabal.config`), changing / removing the offending package(s), and supplying the file with the `--snapshot-path` param. Like `excluded_pkgs.jsonc`, take care that the problematic package is not a (transitive) dependency of something in the snapshot.
- Manually downloading a snapshot (e.g. `https://www.stackage.org/nightly/cabal.config`), changing / removing the offending package(s), and supplying the file with the `--snapshot-path` param. As with `package_index.jsonc`, take care that the problematic package is not a (transitive) dependency of something in the snapshot.

- Adding constraints to `generated/cabal.project` e.g. flags or version constraints like `constraints: filepath > 1.5`.

@@ -111,7 +119,7 @@ In general, user mitigations for solver / build problems include:
compiler = pkgs.haskell.packages.ghc<vers>;
```

can be a useful guide as to which GHC was last tested, as CI uses this ghc to build everything with `--dry-run`, which should report solver errors (e.g. bounds) at the very least.
can be a useful guide as to which GHC was last tested, as CI uses this ghc to build everything.

- If you encounter an error that you think indicates a problem with the configuration here (e.g. new package needs to be excluded, new constraint added), please open an issue. While that is being resolved, the mitigations from the [previous section](#troubleshooting) may be useful.

@@ -189,13 +197,3 @@ For Linux based systems, there's a provided `flake.nix` and `shell.nix` to get a
with an approximation of the required dependencies (cabal itself, C libs) to build `clc-stackage`.

Note that it is not actively maintained, so it may require some tweaking to get working, and conversely, it may have some redundant dependencies.

## Misc

* Your custom GHC will need to be on the PATH to build the `stack` library e.g.

```
export PATH=/home/ghc/_build/stage1/bin/:$PATH
```

Nix users can uncomment (and modify) this line in the `flake.nix`.
14 changes: 8 additions & 6 deletions dev.md
@@ -20,7 +20,7 @@ The `clc-stackage` library is namespaced by functionality:

### parser

`CLC.Stackage.Parser` contains the parsing functionality. In particular, `parser` is responsible for querying stackage's REST endpoint and retrieving the package set. That package set is then filtered according to [excluded_pkgs.json](excluded_pkgs.json). The primary function is:
`CLC.Stackage.Parser` contains the parsing functionality. In particular, `parser` is responsible for querying stackage's REST endpoint and retrieving the package set. That package set is then filtered according to [package_index.jsonc](package_index.jsonc). The primary function is:

```haskell
-- CLC.Stackage.Parser
@@ -77,21 +77,19 @@ The executable that actually runs. This is a very thin wrapper over `runner`, wh

`clc-stackage` is based on `nightly` -- which changes automatically -- meaning we do not necessarily have to do anything when a new (minor) snapshot is released. On the other hand, *major* snapshot updates will almost certainly bring in new packages that need to be excluded, so there are some general "update steps" we will want to take:

1. Modify [excluded_pkgs.json](excluded_pkgs.json) as needed. That is, updating the snapshot major version will probably bring in some new packages that we do not want. The update process is essentially trial-and-error i.e. run `clc-stackage` as normal, and later add any failing packages that should be excluded.
1. Modify [package_index.jsonc](package_index.jsonc) as needed. That is, updating the snapshot major version will probably bring in some new packages that we do not want. The update process is essentially trial-and-error i.e. run `clc-stackage` as normal, and later add any failing packages that should be excluded to `package_index.excluded`.

2. Update `ghc-version` in [.github/workflows/ci.yaml](.github/workflows/ci.yaml).

3. Update functional tests as needed i.e. exact package versions in `*golden` and `test/functional/snapshot.txt`.

4. Optional: Update nix:
3. Optional: Update nix:

- Inputs (`nix flake update`).
- GHC: Update the `compiler = pkgs.haskell.packages.ghc<vers>;` line.
- Add to the `flake.nix`'s `ldDeps` and `deps` as needed to have the `nix` CI job pass. System libs available on nix can be found here: https://search.nixos.org/packages?channel=unstable.

This job builds everything with `--dry-run`, so its success is a useful proxy for `clc-stackage`'s health. In other words, if the nix job fails, there is almost certainly a general issue (i.e. either a package should be excluded or new system dep is required), but if it succeeds, the package set is in pretty good shape (there may still be sporadic issues e.g. a package does not properly declare its system dependencies at config time).

5. Optional: Update `clc-stackage.cabal`'s dependencies (i.e. `cabal outdated`).
4. Optional: Update `clc-stackage.cabal`'s dependencies (i.e. `cabal outdated`).

### Verifying snapshot

@@ -114,3 +112,7 @@ $ NO_CLEANUP=1 cabal test functional
```

Note that this only saves files from the _last_ test, so if you want to examine test output for a particular test, you need to run only that test.

> [!TIP]
>
> CI has a job `build-batch` which actually builds the entire package set, hence it can be used in place of manual building / testing. Note it takes about an hour to run.