
Towards Metadata-only Dependencies in Hakyll

Note: This has since been implemented and merged! 🥳

The Hakyll static site generator does not presently support depending only on the metadata of an Identifier. This is problematic because it causes many unnecessary rebuilds when that Metadata is used (e.g., in a sidebar) on each generated page.

Current State

Currently, as far as I understand it, Hakyll’s dependency tracking works roughly as follows:

  1. Hakyll determines an initial Set of modified inputs using resourceModified.
  2. Based on this set, Hakyll determines out-of-date dependencies inside an RWS monad (Control.Monad.RWS). This happens in Hakyll.Core.Dependencies, specifically in outOfDate, which checks for:
    1. Entirely new inputs (e.g., new posts)
    2. Changed patterns (e.g., if a tag page depends on a new post)
    3. Identifiers that are now out-of-date because a dependency changed (see bruteForce)
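
To make the three checks above concrete, here is a minimal, self-contained sketch of the idea. It is not Hakyll's actual code: Ident, Facts, step, propagate and outOfDateSketch are hypothetical names, identifiers are plain strings, and "changed patterns" is approximated by comparing each identifier's dependency list with the one from the previous run.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical stand-in for Hakyll's Identifier.
type Ident = String

-- Simplified dependency facts: each identifier maps to what it depends on.
type Facts = Map.Map Ident [Ident]

-- One pass: additionally mark every identifier that has an out-of-date dependency.
step :: Facts -> Set.Set Ident -> Set.Set Ident
step facts ood =
  Set.union ood (Map.keysSet (Map.filter (any (`Set.member` ood)) facts))

-- Repeat the pass until nothing changes, in the spirit of bruteForce.
propagate :: Facts -> Set.Set Ident -> Set.Set Ident
propagate facts ood
  | next == ood = ood
  | otherwise   = propagate facts next
  where
    next = step facts ood

-- Seed the out-of-date set with modified inputs, new inputs, and identifiers
-- whose dependency list changed, then propagate along the dependencies.
outOfDateSketch :: [Ident] -> Set.Set Ident -> Facts -> Facts -> Set.Set Ident
outOfDateSketch universe modified oldFacts newFacts = propagate newFacts seed
  where
    newInputs   = Set.fromList [i | i <- universe, Map.notMember i oldFacts]
    changedDeps = Set.fromList
      [ i
      | (i, deps) <- Map.toList newFacts
      , Just oldDeps <- [Map.lookup i oldFacts]
      , oldDeps /= deps
      ]
    seed = Set.unions [modified, newInputs, changedDeps]
```

In this model, a sidebar that depends on a post goes out of date whenever the post's file is modified, and so does every page that depends on the sidebar. That cascade is what causes the unnecessary rebuilds.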

Required Changes

In order to enable dependencies on the Metadata of an Identifier, we would have to:

  1. In the Dependencies data type, somehow allow users to express Metadata-only dependencies. This probably requires an API-breaking change to how dependencies are added for Identifiers. For example:

    data Dependencies
      = DependsOn [(DependencyKind, Identifier)]
      | MustRebuild
      deriving (Show)
    
    data DependencyKind = KindContent | KindMetadata
  2. Enable DependencyM to access the Hakyll.Core.Provider.MetadataCache, thereby enabling it to determine whether the Metadata differs from the previous run. This is potentially challenging, as newProvider invalidates the metadata cache based on resourceModified.

  3. Use the newly added DependencyKind and the MetadataCache to check whether an Identifier depends only on the Metadata of another Identifier and whether that Metadata changed since the last run of bruteForce.
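
The check described in the last two steps could look roughly like the sketch below. It rests on several assumptions: Ident and Metadata are simplified stand-ins (plain strings and a string map), the two metadata maps play the role of the previous and current contents of the metadata cache, and dependencyChanged and needsRebuild are hypothetical helpers, not existing Hakyll functions.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical stand-ins for Hakyll's Identifier and Metadata types.
type Ident = String
type Metadata = Map.Map String String

data DependencyKind = KindContent | KindMetadata
  deriving (Eq, Show)

-- Decide whether a single dependency is out of date.
dependencyChanged
  :: Set.Set Ident            -- identifiers whose files were modified
  -> Map.Map Ident Metadata   -- metadata as seen in the previous run
  -> Map.Map Ident Metadata   -- metadata as seen in the current run
  -> (DependencyKind, Ident)
  -> Bool
-- A content dependency is out of date as soon as the file was modified.
dependencyChanged modified _ _ (KindContent, ident) =
  ident `Set.member` modified
-- A metadata dependency is out of date only if the metadata itself differs,
-- which also covers metadata that was added or removed between runs.
dependencyChanged _ old new (KindMetadata, ident) =
  Map.lookup ident old /= Map.lookup ident new

-- An identifier needs a rebuild if any of its dependencies changed.
needsRebuild
  :: Set.Set Ident
  -> Map.Map Ident Metadata
  -> Map.Map Ident Metadata
  -> [(DependencyKind, Ident)]
  -> Bool
needsRebuild modified old new = any (dependencyChanged modified old new)
```

With this split, editing the body of a post marks the post as modified, but a sidebar that depends on it via KindMetadata is rebuilt only if the post's metadata, such as its title, actually changed.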