In this section we give a translation from valid PROV instances to
structures, and show that a valid PROV instance has a model. We
call this property *weak completeness*.

First, without loss of generality, we assume that all times
specified in activity or event formulas in $I$ are ground values.
If not, set each variable in such a position to some dummy value.
This is justified by the following fact:

#### 6.2.1 Sets

The sets of structure $M(I)$ are:

$$\begin{array}{rcl}Entities& =& \{id\mid I\models typeOf(id,entity)\}\\ Plans& =& \{pl\mid \mathrm{\exists}id,ag,ac,attrs.\text{}wasAssociatedWith(id,ag,act,pl,attrs)\in I,pl\ne -\}\\ Collections& =& \{c\mid I\models typeOf(c,prov:Collection)\text{or}I\models typeOf(c,prov:EmptyCollection)\}\\ Activities& =& \{id\mid I\models typeOf(id,activity)\}\\ & \cup & \{{a}_{id},{a}_{id}^{\prime}\mid id\in Entities\}\\ & \cup & \{{a}_{id}\mid \mathrm{\exists}id,{e}_{2},{e}_{1}.\text{}wasDerivedFrom(id,{e}_{2},{e}_{1},-,-,-,attrs)\in I\}\\ Agents& =& \{id\mid I\models typeOf(id,agent)\}\\ \\ Usages& =& \{id\mid \mathrm{\exists}a,e,t,attrs.\text{}used(id,a,e,t,attrs)\in I\}\\ & \cup & \{{u}_{id}\mid \mathrm{\exists}id,{e}_{2},{e}_{1},attrs.\text{}wasDerivedFrom(id,{e}_{2},{e}_{1},-,-,-,attrs)\in I\}\\ Generations& =& \{id\mid \mathrm{\exists}e,a,t,attrs.\text{}wasGeneratedBy(id,e,a,t,attrs)\in I\}\\ & \cup & \{{g}_{id}\mid \mathrm{\exists}id,{e}_{2},{e}_{1},attrs.\text{}wasDerivedFrom(id,{e}_{2},{e}_{1},-,-,-,attrs)\in I\}\\ & \cup & \{{g}_{id}\mid id\in Entities\}\\ Invalidations& =& \{id\mid \mathrm{\exists}e,a,t,attrs.\text{}wasInvalidatedBy(id,e,a,t,attrs)\in I\}\\ & \cup & \{{i}_{id}\mid id\in Entities\}\\ Starts& =& \{id\mid \mathrm{\exists}a,e,{a}^{\prime},t,attrs.\text{}wasStartedBy(id,a,e,{a}^{\prime},t,attrs)\in I\}\\ Ends& =& \{id\mid \mathrm{\exists}a,e,{a}^{\prime},t,attrs.\text{}wasEndedBy(id,a,e,{a}^{\prime},t,attrs)\in I\}\\ Events& =& Usages\cup Generations\cup Invalidations\cup Starts\cup Ends\\ \\ Associations& =& \{id\mid \mathrm{\exists}ag,act,pl,attrs.\text{}wasAssociatedWith(id,ag,act,pl,attrs)\in I\}\\ Attributions& =& \{id\mid \mathrm{\exists}e,ag,attrs.\text{}wasAttributedTo(id,e,ag,attrs)\in I\}\\ Delegations& =& \{id\mid \mathrm{\exists}a{g}_{2},a{g}_{1},attrs.\text{}actedOnBehalfOf(id,a{g}_{2},a{g}_{1},act,attrs)\in I\}\\ Communications& =& \{id\mid \mathrm{\exists}{a}_{2},{a}_{1},attrs.\text{}wasInformedBy(id,{a}_{2},{a}_{1},attrs)\in I\}\\ Derivations& =& \{id\mid \mathrm{\exists}{e}_{2},{e}_{1},a,g,u,attrs.\text{}wasDerivedFrom(id,{e}_{2},{e}_{1},a,g,u,attrs)\in I\}\\ \\ Influences& =& Events\cup Associations\cup Attributions\cup Communications\cup Delegations\\ & \cup & \{id\mid \mathrm{\exists}{o}_{2},{o}_{1},attrs.\text{}wasInfluencedBy(id,{o}_{2},{o}_{1},attrs)\in I\}\\ \\ Objects& =& Entities\cup Activities\cup Agents\cup Influences\end{array}$$

In the definitions of $Entities$, $Collections$, $Activities$ and
$Agents$ we use the notation $I\models typeOf(id,t)$ to indicate
that $id$ must have type $t$ in $I$ according to the typing
constraints. For example, for entities, this means that the set
$Entities$ contains all identifiers $e,{e}^{\prime}$ appearing in the $entity(e,attrs)$,
$alternateOf(e,{e}^{\prime})$, or $specializationOf(e,{e}^{\prime})$ formulas, as well as all tose
appearing in the appropriate positions of other formulas, as
specified in the typing constraints.

In the definitions of $Activities$, $Generations$, $Invalidations$, and $Usages$ we
write ${a}_{id}$, ${g}_{id}$, ${i}_{id}$ and ${u}_{id}$ respectively to indicate
additional activities, generations and usages added for imprecise
derivations or entities.

In addition, to define the set of $Things$, we introduce an
equivalence relation on $Entities$ as follows:

$${e}_{1}\equiv {e}_{2}\phantom{\rule{thickmathspace}{0ex}}\u27fa\phantom{\rule{thickmathspace}{0ex}}alternateOf({e}_{1},{e}_{2})\in I$$

The fact that this is an equivalence relation follows from the
fact that $I$ is in normal form, since the constraints on
$alternateOf$ ensure that it is an equivalence relation. Recall
that given an equivalence relation $\equiv $ on some set $X$, the
*equivalence class* of $x\in X$ is the set $[x{]}_{\equiv}=\{y\in X\mid x\equiv y\}$. The *quotient* of $X$ by an equivalence
relation on $X$ is the set of equivalence classes, ${X}_{\equiv}=\{[x{]}_{\equiv}\mid x\in X\}$. Now we
define the set of $Things$ as the quotient of $\equiv $-equivalence classes of $Entities$.

$$Things=Entities{/}_{\equiv}=\{[e{]}_{\equiv}\mid e\in Entities\}$$

Observe that since $I$ is normalized and valid, entities and
activities are disjoint, the influences are disjoint from entities,
activities, and agents,
and the different subsets of events and influences are pairwise
disjoint, as required.

#### 6.2.2 Functions

First, we consider the functions associated with $Entities$.

$$\begin{array}{rcl}event{s}^{\prime}(e)& =& \{id\mid used(id,a,e,t,attrs)\in I\}\\ & \cup & \{id\mid wasGeneratedBy(id,e,a,t,attrs)\in I\}\\ & \cup & \{id\mid wasInvalidatedBy(id,e,a,t,attrs)\in I\}\\ & \cup & \{id\mid wasStartedBy(id,a,e,{a}^{\prime},t,attrs)\in I\}\\ & \cup & \{id\mid wasEndedBy(id,a,e,{a}^{\prime},t,attrs)\in I\}\\ & \cup & \{{g}_{e},{i}_{e}\}\\ events(e)& =& event{s}^{\prime}(e)\cup \bigcup _{specializationOf({e}^{\prime},e)\in I}event{s}^{\prime}({e}^{\prime})\\ valu{e}^{\prime}(e,a)& =& \{v\mid entity(e,attrs)\in I,(a=v)\in attrs\}\phantom{\rule{1em}{0ex}}(a\ne uniq)\\ valu{e}^{\prime}(e,uniq)& =& \{uni{q}_{e}\}\\ value(e,a)& =& valu{e}^{\prime}(e)\cup \bigcup _{specializationOf(e,{e}^{\prime})\in I}valu{e}^{\prime}({e}^{\prime})\\ thingOf(e)& =& [e{]}_{\equiv}\end{array}$$

Above, we introduce a fresh attribute name $uniq$, not already in
use in $I$, along with a fresh value $e$ and for each entity $e$ we
add a value $uni{q}_{e}$ to $values(e,uniq)$. This
construction ensures that if an entity is a specialization of another
in $I$ then the specialization relationship will hold in $M(I)$. We
also define the set of all events involved in $e$ as the set of events
immediately involved in $e$ or any specialization of $e$. Similarly,
the values of attributes of $e$ are those immediately declared for $e$
along with those of any ${e}^{\prime}$ that $e$ specializes. We also introduce dummy
generation and invalidation events for each entity $e$, along with
activities ${a}_{e},{a}_{e}^{\prime}$ to perform them.

Similarly, for $Things$, we
employ an auxiliary function $events:Things\to P(Events)$ that collects the set of all
events in which one of the entities constituting the thing participated.

$$\begin{array}{rcl}events(T)& =& \bigcup _{e\in T}events(e)\\ value(T,a,evt)& =& \bigcup _{e\in T,evt\in events(e)}value(e,a)\end{array}$$

The functions $events$, $startTime$ and $endTime$ mapping activities to
their start and end times are defined as follows:

$$\begin{array}{rcl}events(a)& =& \{id\mid used(id,a,e,t,attrs)\in I\}\\ & \cup & \{id\mid wasGeneratedBy(id,e,a,t,attrs)\in I\}\\ & \cup & \{id\mid wasInvalidatedBy(id,e,a,t,attrs)\in I\}\\ & \cup & \{id\mid wasStartedBy(id,a,e,{a}^{\prime},t,attrs)\in I\}\\ & \cup & \{id\mid wasEndedBy(id,a,e,{a}^{\prime},t,attrs)\in I\}\\ & \cup & \{{g}_{e},{i}_{e}\}\\ startTime(id)& =& {t}_{1}\text{where}activity(a,{t}_{1},{t}_{2},attrs)\in I\\ endTime(id)& =& {t}_{2}\text{where}activity(a,{t}_{1},{t}_{2},attrs)\in I\end{array}$$

The start and end times are arbitrary (say, some zero value) for activities with no
$activity$ formula declaring the times. The above definitions of $startTime$ and
$endTime$ ignore any start times asserted in $wasStartedBy$ or
$wasEndedBy$ formulas. If both $activity$ and
$wasStartedBy/wasEndedBy$ statements are present, then they must
match, but PROV-CONSTRAINTS does not require that the times of
multiple start or end events match for an activity with no
$activity$ statement.

For other $Objects$ besides $Entities$ and $Activities$, the
associated sets of $Events$ are defined to be empty. (An $Agent$ that
happens to be an $Entity$ or $Activity$ will have the set of events
defined above for the appropriate kind of object. Note that since
$Entities$ and $Activities$ are disjoint, this definition is unambiguous.)

The function $time$ mapping $Events$ to their $Times$ is defined
as follows:

$$\begin{array}{rcl}time(id)& =& t\text{where}used(id,a,e,t,attrs)\in I\\ time(id)& =& t\text{where}wasGeneratedBy(id,e,a,t,attrs)\in I\\ time(id)& =& t\text{where}wasInvalidatedBy(id,e,a,t,attrs)\in I\\ time(id)& =& t\text{where}wasStartedBy(id,a,e,{a}^{\prime},t,attrs)\in I\\ time(id)& =& t\text{where}wasEndedBy(id,a,e,{a}^{\prime},t,attrs)\in I\end{array}$$

This definition is deterministic because the sets of identifiers
of different $Events$ are disjoint, and the associated times are unique.

The functions giving the interpretations of the
different identified influences are as follows:

$$\begin{array}{rcl}used(id)& =& (a,e)\text{where}used(id,a,e,t,attrs)\in I\\ used({u}_{id})& =& ({a}_{id},{e}_{1})\text{where}wasDerivedFrom(id,{e}_{2},{e}_{1},-,-,-,attrs)\in I\\ generated(id)& =& (e,a)\text{where}wasGeneratedBy(id,e,a,t,attrs)\in I\\ generated({g}_{id})& =& ({e}_{2},{a}_{id})\text{where}wasDerivedFrom(id,{e}_{2},{e}_{1},-,-,-,attrs)\in I\\ generated({g}_{e})& =& (e,{a}_{e})\text{where}e\in Entities\\ invalidated(id)& =& (e,a)\text{where}wasInvalidatedBy(id,e,a,t,attrs)\in I\\ invalidated({i}_{e})& =& (e,{a}_{e}^{\prime})\text{where}e\in Entities\\ started(id)& =& (a,e,{a}^{\prime})\text{where}wasStartedBy(id,a,e,{a}^{\prime},t,attrs)\in I\\ ended(id)& =& (a,e,{a}^{\prime})\text{where}wasEndedBy(id,a,e,{a}^{\prime},t,attrs)\in I\\ \\ associatedWith(id)& =& (ag,act,pl)\text{where}wasAssociatedWith(id,ag,act,pl,attrs)\in I\\ attributedTo(id)& =& (e,ag)\text{where}wasAttributedTo(id,e,ag,attrs)\in I\\ actedFor(id)& =& (a{g}_{2},a{g}_{1},act)\text{where}actedOnBehalfOf(id,a{g}_{2},a{g}_{1},act,attrs)\in I\\ communicated(id)& =& ({a}_{2},{a}_{1})\text{where}wasInformedBy(id,{a}_{2},{a}_{1},attrs)\in I\\ derivationPath(id)& =& {e}_{2}\cdot g\cdot a\cdot u\cdot {e}_{1}\text{where}wasDerivedFrom(id,{e}_{2},{e}_{1},a,g,u,attrs)\in I\\ derivationPath(id)& =& {e}_{2}\cdot {g}_{id}\cdot {a}_{id}\cdot {u}_{id}\cdot {e}_{1}\text{where}wasDerivedFrom(id,{e}_{2},{e}_{1},-,-,-,attrs)\in I\end{array}$$

Note that since $I$ is normalized and valid, by the uniqueness
constraints these functions are all well-defined. In the case for
imprecise derivations, we generate additional activities, generations
and usages linking ${e}_{2}$ to ${e}_{1}$.

The definition of the $influenced$ function is more involved, and
is as follows:

$$\begin{array}{rcl}influenced(id)& =& used(id)\cup generated(id)\cup invalidated(id)\\ & \cup & \{(a,e)\mid (a,e,{a}^{\prime})\in started(id)\}\\ & \cup & \{(a,e)\mid (a,e,{a}^{\prime})\in ended(id)\}\\ & \cup & \{(ag,act)\mid (ag,act,pl)\in associatedWith(id)\}\\ & \cup & attributedTo(id)\\ & \cup & \{(a{g}_{2},a{g}_{1})\mid (a{g}_{2},a{g}_{1},act)\in actedFor(id)\}\\ & \cup & communicated(id)\\ & \cup & \{({e}_{2},{e}_{1})\mid {e}_{2}\cdot w\cdot {e}_{1}\in derivationPath(id)\}\\ & \cup & \{({o}_{2},{o}_{1})\mid wasInfluencedBy(id,{o}_{2},{o}_{1})\in I\}\end{array}$$

This definition ensures that by construction $influenced(id)$
contains all of the other associated relationships. For any specific
$id$, however, most of the above sets will be empty, and the final
line will often be redundant. It is not always redundant, because it
is possible to assert an unspecified influence in $I$.

It is straightforward to verify (by their definitions) that the
event sets associated with entities and activities satisfy the
side-conditions in Component 9.

Finally, the collection membership function $members$ is
defined as follows:

$$members(c)=\{e\mid hadMember(c,e)\in I$$