---
orphan: true
---

# Scripts for GitHub CI

A set of `gh_*.py` scripts work together to produce size comparisons for PRs.

## Reports on Pull Requests

The scripts' results are presented as comments on PRs.

**Note** that a comment may be updated by the scripts as CI run results become
available.

**Note** that the scripts will not create a comment for a commit if there is
already a newer commit in the PR.

A size report comment consists of a title followed by one to four tables. A
title looks like:

> PR #12345678: Size comparison from `base-SHA` to `pr-SHA`

The first table, if present, lists items with a large increase, according to a
configurable threshold.

The next table, if present, lists all items that have increased in size.

The next table, if present, lists all items that have decreased in size.

The final table, always present, lists all items.

## Usage in CI

The original intent was to have a tool that would run after a build in CI, add
its sizes to a central database, and immediately report on size changes from the
parent commit in the database. Unfortunately, GitHub provides no practical place
to store and share such a database between workflow actions. Instead, the
process is split; builds in CI record size information in the form of GitHub
[artifacts](https://docs.github.com/en/actions/advanced-guides/storing-workflow-data-as-artifacts),
and a later step reads these artifacts to generate reports.

### 1. Build workflows

#### gh_sizes_environment.py

The `gh_sizes_environment.py` script should be run once in each workflow that
records sizes, _after_ checkout and _before_ any use of `gh_sizes.py`. It takes
a single argument, a JSON dictionary of the `github` context.
Typically run as:

```
steps:
    - name: Checkout
      uses: actions/checkout@v3
      with:
          submodules: true

    - name: Set up environment for size reports
      if: ${{ !env.ACT }}
      env:
          GH_CONTEXT: ${{ toJson(github) }}
      run: scripts/tools/memory/gh_sizes_environment.py "${GH_CONTEXT}"
```

#### gh_sizes.py

The `gh_sizes.py` script runs on a built binary (executable or library) and
produces a JSON file containing size information.

Usage: `gh_sizes.py` _platform_ _config_ _target_ _binary_ [_output_]

Where _platform_ is the platform name, corresponding to a config file in
`scripts/tools/memory/platform/`.

Where _config_ is a configuration identification string. This has no fixed
meaning, but is intended to describe a build variation, e.g. a particular target
board or debug vs release.

Where _target_ is a readable name for the build artifact, identifying it in
reports.

Where _binary_ is the input build artifact.

Where _output_ is the name for the output JSON file, or a directory for it, in
which case the name will be
_platform_`-`_config_name_`-`_target_name_`-sizes.json`.

Example:

```
scripts/tools/memory/gh_sizes.py \
  linux arm64 thermostat-no-ble \
  out/linux-arm64-thermostat-no-ble/thermostat-app \
  /tmp/bloat_reports/
```

#### Upload artifacts

The JSON files generated by `gh_sizes.py` must be uploaded with an artifact name
of a very specific form in order to be processed correctly.

Example:

```
Size,Linux-Examples,${{ env.GH_EVENT_PR }},${{ env.GH_EVENT_HASH }},${{ env.GH_EVENT_PARENT }},${{ github.event_name }}
```

Other builds must replace `Linux-Examples` with a label unique to the workflow,
but otherwise use the form exactly.

### 2. Reporting workflow

Run a periodic workflow calling `gh_report.py` to generate PR comments.
This script has full `--help`, but normal use is probably best illustrated by an
example:

```
scripts/tools/memory/gh_report.py \
  --verbose \
  --report-increases 0.2 \
  --report-pr \
  --github-comment \
  --github-limit-artifact-pages 50 \
  --github-limit-artifacts 500 \
  --github-limit-comments 20 \
  --github-repository project-chip/connectedhomeip \
  --github-api-token "${{ secrets.GITHUB_TOKEN }}"
```

Notably, the `--report-increases` flag provides a _percent growth_ threshold for
calling out ‘large’ increases in GitHub comments.

When this script successfully posts a comment on a GitHub PR, it removes the
corresponding PR artifact(s) so that a future run will not process it again and
post the same comment. Only PR artifacts are removed, not push (trunk)
artifacts, since those may be used as a comparison base by many different PRs.

## Using a database

It can be useful to keep a permanent record of build sizes.

### Updating the database: `gh_db_load.py`

To update an SQLite file of trunk commit sizes, periodically run:

```
gh_db_load.py \
  --repo project-chip/connectedhomeip \
  --token ghp_ThIsIsNoTMyReAlGiThUbToKeNSoDoNoTtRy \
  --db /path/to/database
```

Those interested in only a single platform can add the `--github-label` option,
providing the same name as in the size artifact name after `Size,` (e.g.
`Linux-Examples` in the upload example above).

See `--help` for additional options.

_Note_: Transient 4xx and 5xx errors from GitHub's API are very common. Run
`gh_db_load.py` frequently enough to give it several attempts before the
relevant artifacts expire.

### Querying the database: `gh_db_query.py`

While the database can of course be used directly, the `gh_db_query.py` script
provides a handful of common queries.

Note that this script (like others that show tables) has an `--output-format`
option offering (among others) CSV, several JSON formats, and any text format
provided by [tabulate](https://pypi.org/project/tabulate/).

Two notable options:

- `--query-build-sizes PLATFORM,CONFIG,TARGET` lists sizes for all builds of
  the given kind, with a column for each section.
- `--query-section-changes PLATFORM,CONFIG,TARGET,SECTION` lists changes for
  the given section. The `--report-increases PERCENT` option limits this to
  changes over a given threshold (as is done for PR comments).

(To find out what PLATFORM, CONFIG, TARGET, and SECTION exist:
`--query-platforms`, then `--query-platform-targets=PLATFORM` and
`--query-platform-sections=PLATFORM`.)

See `--help` for additional options.
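
To make the percent-growth threshold behind `--report-increases` concrete, here is a minimal sketch of how such a filter could work. It is not the scripts' actual implementation. The helper names, the sample section sizes, and the choice of a strict "greater than" comparison relative to the base size are all assumptions made for illustration.

```python
def percent_growth(base_size: int, new_size: int) -> float:
    """Return the change from base_size to new_size as a percentage of base_size.

    Hypothetical helper; assumes growth is measured relative to the base size.
    """
    if base_size == 0:
        # No meaningful percentage exists for growth from zero bytes.
        return float("inf") if new_size > 0 else 0.0
    return 100.0 * (new_size - base_size) / base_size


def is_large_increase(base_size: int, new_size: int, threshold_percent: float) -> bool:
    """Assumed rule: an increase is 'large' if it strictly exceeds the threshold."""
    return percent_growth(base_size, new_size) > threshold_percent


# Made-up section sizes in bytes: (size at base commit, size at PR commit).
changes = {
    ".text": (100_000, 100_150),  # +0.15%
    ".data": (4_000, 4_012),      # +0.30%
    ".bss": (8_000, 7_900),       # a decrease
}

# With a threshold of 0.2 (percent), only .data is called out.
large = [
    name
    for name, (old, new) in changes.items()
    if is_large_increase(old, new, 0.2)
]
print(large)
```

Note that in this sketch a threshold of `0.2` means 0.2 percent, not 20 percent, matching the description of the flag as a percent-growth value.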