Provena supports versioning for a subset of entity types in the Registry, including datasets. This feature lets users create and manage versions of registered items. Each version is a distinct entity in the Provena Registry, so users can manage and refer to versions of an item as first-class items. This allows users to refer accurately to the components used in a provenance trail, e.g. where version 1 of Dataset X uses a different configuration of a modelling software suite from version 2 of Dataset X.
Use cases include:
- Versioning an item to mark it at a point in its development or release lifecycle
- Creating a new version of an item because the existing version is deprecated or erroneous
- Maintaining two versions of an item that are both valid but have different lineages and uses
The screenshot below shows an example set of versions created for a dataset.
Figure: Example dataset version info
What can be versioned?
All registered items in the Entity category can be versioned. For more information about the categories of registered items, see the provenance overview.
This includes:
- Dataset
- Model
- Dataset Template
- Model Run Workflow Template
What cannot be versioned?
Agents and Activities cannot be versioned. This includes:
- Person
- Organisation
- Model Run
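The rule above (only items in the Entity category can be versioned) can be sketched as a simple lookup. This is an illustrative sketch only: `Category`, `SUBTYPE_CATEGORY` and `is_versionable` are hypothetical names, not part of the Provena API, and the mapping covers only the subtypes listed on this page.

```python
from enum import Enum


class Category(Enum):
    ENTITY = "Entity"
    AGENT = "Agent"
    ACTIVITY = "Activity"


# Hypothetical mapping of the subtypes named on this page to their categories.
# The keys are display labels, not identifiers taken from the Provena API.
SUBTYPE_CATEGORY = {
    "Dataset": Category.ENTITY,
    "Model": Category.ENTITY,
    "Dataset Template": Category.ENTITY,
    "Model Run Workflow Template": Category.ENTITY,
    "Person": Category.AGENT,
    "Organisation": Category.AGENT,
    "Model Run": Category.ACTIVITY,
}


def is_versionable(subtype: str) -> bool:
    """Return True only for subtypes that belong to the Entity category."""
    return SUBTYPE_CATEGORY[subtype] is Category.ENTITY
```

For example, `is_versionable("Dataset")` is true, while `is_versionable("Person")` and `is_versionable("Model Run")` are false.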
What happens when an item is versioned
- A new persistent identifier is minted for each version (i.e. a new Handle ID)
- A versioned item is timestamped
- A new activity of type Version is created in the Registry. If you explore the Provenance Graph, you will see that the Version activity links the previous item to the new item and associates it with the Agent that created the version.
- The new item is assigned a version number, determined by incrementing the previous item’s version number by 1
- Each versioned item links to the previous version, so users can navigate to other versions (both earlier and later)
- A versioned item is a separate entity from other versions of the same item, so users can view it independently of previous versions
- Provena lets users navigate to the current version and to previous versions separately
- When a new version of a Dataset is created and users want to use the Provena data storage feature, a new storage location is created. This storage location starts empty so that users can upload the data relevant to that version.
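The effects listed above can be pictured with a small in-memory model. This is a minimal sketch under stated assumptions, not Provena's implementation: `VersionedItem`, `VersionActivity`, `mint_handle` and `create_version` are hypothetical names, the handle format is invented for illustration, and storage locations are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import count
from typing import Optional, Tuple

_handle_counter = count(1)


def mint_handle() -> str:
    """Stand-in for minting a persistent identifier (real Handle IDs differ)."""
    return f"hypothetical-handle/{next(_handle_counter)}"


@dataclass(frozen=True)
class VersionedItem:
    handle_id: str
    display_name: str
    version: int
    created_at: datetime
    previous_handle_id: Optional[str] = None  # link back to the prior version


@dataclass(frozen=True)
class VersionActivity:
    from_item: str  # handle of the previous item
    to_item: str    # handle of the new item
    agent: str      # who created the version
    created_at: datetime


def create_version(
    previous: VersionedItem, agent_id: str
) -> Tuple[VersionedItem, VersionActivity]:
    """Create the next version of an item plus the activity that records it."""
    now = datetime.now(timezone.utc)
    new_item = VersionedItem(
        handle_id=mint_handle(),               # new persistent identifier
        display_name=previous.display_name,
        version=previous.version + 1,          # increment by 1
        created_at=now,                        # timestamp the new version
        previous_handle_id=previous.handle_id,
    )
    activity = VersionActivity(
        from_item=previous.handle_id,
        to_item=new_item.handle_id,
        agent=agent_id,
        created_at=now,
    )
    return new_item, activity
```

Calling `create_version` on a version 1 item returns a new item with version 2, its own identifier and a link back to version 1, together with an activity record that connects the two items and names the agent.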