pytact.graph_api_capnp_cython
Graph_Node_Reader(_DynamicStructReader dyn)
Fits exactly in 128 bits. A node has a label that optionally contains some additional information, together with a list of outgoing edges (children).
Inlined for efficiency purposes Proof state Hash a unique id (evar) for the proof state that distinquishes proof states with identical contents but do not point to the same object nonetheless
The identity of a node uniquely determines it. That is, one can consider any to nodes with the same identity to be equal. The notion of equal we use is as follows:
- Graph perspective: Two nodes have the same identity if and only if they are bisimilar.
In this notion, bisimilarity does take into consideration the label of the nodes, modulo some
equivalence class that is not fully specified here. One aspect of the equivalence class is that
for definition nodes their associated global context (accessed through
Definition.previous
) is not taken into account. - Lambda calculus perspective: Two nodes have the same identity if their corresponding lambda terms are alpha-equivalent. Note that two definitions with the same body are not considered alpha-equivalent.
The identity of a node is used to perform partial graph-sharing. That is, two nodes with the same identity are merged when the graph is generated. There are three reasons why two nodes with the same semantic identity might have a different physical identity:
- Nodes are only merged when the first node exists in the same graph as the second node, or exists in a dependent graph. Hence, nodes originating from developments that do not depend on each other are never merged. Full graph-sharing would require global analysis on a dataset, which any consumer can optionally do as a post-processing step.
- Two definition nodes with the same name and body have the same identity. But if they occur in different global contexts, these nodes are physically different to ensure the integrity of their global contexts.
- For definitions with an opaque proof, the sub-graph that represents this opaque proof is ignored while calculating the identity. Morally we defend this decision with the concept that opaque proofs are invisible in Coq (proof irrelevance) and two constants with different proofs might as well be considered equal. Practically speaking, this is done so that one can compare the identity of a definition in a dataset quickly to a identity of a live definition during Coq interaction without having to produce the entire graph of opaque proofs (which tend to be large).
Beware that the identity is currently a 64 bit field. For datasets that have graphs of size in the order of billions of nodes there is a non-trivial chance of a collision. (We consider this acceptable for machine learning purposes.)
Graph_EdgeTarget_Reader(_DynamicStructReader dyn)
Fits exactly in 64 bits. Let's keep it that way. The 'pointy' end of an edge, together with the label of that edge.
Graph_Reader(_DynamicStructReader dyn)
A graph is the main data store (the 'heap') that contains the bulk of the data in the dataset and during communication with Coq. A graph is a collection of labeled nodes with directed, labeled edges between them. A graph may reference nodes from other graphs. For an edge 'A --> B' in the graph it always holds that 'A' is part of the current graph, but 'B' may (or may not) be part of another graph.
The main memory store of the graph. It acts as a heap similar to the main memory of a C/C++ program.
The heap is accessed by indexing the nodes
list using a NodeIndex
which returns a Node
.
Every node has a label and a list of children, which is indicated as a range within the edges
list using
childrenIndex
and childrenCount
. The targets of the edges can again be found in the nodes
list of the
current file or of a dependency.
Note that just like in C/C++ doing pointer arithmetic on the heap is undefined behavior, and you may
encounter arbitrary garbage if you do this. In particular, iterating over the heap is discouraged.
Instead, you should access the heap through various entry-points that are provided.
IntP_Reader(_DynamicStructReader dyn)
Definition_Reader(_DynamicStructReader dyn)
A definition of the CIC, which is either an constant, inductive, constructor, projection or section variable. Constants and section variables can have tactical proofs associated to them.
The fully-qualified name of the definition. The name should be unique in a particular global context, but is not unique among different branches of the global in a dataset.
The previous definition within the global context of the current file.
For the first definition this field is set to len(graph.nodes)
.
The contract on this field is that any definition nodes reachable from the forward closure of the definition
must also be reachable through the chain of previous fields. An exception to this rule are mutually
recursive definitions. Those nodes are placed into the global context in an arbitrary ordering.
This field provides information about the external global context. At any point in a source file other files 'X' can be loaded through 'Require X'. When this happens, the definitions of X that are reachable through its 'representative' field become available to all subsequent definitions.
These definitions are part of a mutually recursive cluster. They hold a reference to another definition
that acts as the 'representative' of the mutually recursive cluster. The representative is chosen such
that all definitions of the cluster are reachable through its previous
pointer. Additionally, all
definitions within the cluster have the same representative, and no definitions that are not part of the
cluster are interleaved within the chain of previous
nodes.
A constant defined by directly inputting a term In the future, we might augment such constants with tactical refinement proofs that build the term iteratively.
A constant that was either directly or indirectly generated by a tactical proof. Note that for non-original constants, the proof step sequence may not be very accurate.
FloatP_Reader(_DynamicStructReader dyn)
Graph_Node_Label_Reader(_DynamicStructReader dyn)
Inlined for efficiency purposes Proof state Hash a unique id (evar) for the proof state that distinquishes proof states with identical contents but do not point to the same object nonetheless
Definition_Status_Substituted_Reader(_DynamicStructReader dyn)
A definition that was obtained by performing some sort of module functor substitution. We reference the original definition.
Definition_Status_Reader(_DynamicStructReader dyn)
A definition can have different origins, which are encoded here
a definition that was originally defined in a section and has now had the section variables discharged into it. The node index references the original definition.
ProofStep_Reader(_DynamicStructReader dyn)
A proof step is the execution of a single tactic on one or more proof states, producing a list of outcomes.
The tactic that generated the proof step. Note that the arguments of the tactic can be found in the individual outcomes, because they may be different for each outcome.
Tactic_Reader(_DynamicStructReader dyn)
A concrete tactic with it's parameters determined. Somewhat strangely, this struct does not actually include
these parameters. They can instead be found in Outcome.tacticArguments
. The reason for this is that one
tactic can run on multiple proof states at the same time and for all of those proof states, the arguments
may be resolved differently.
A hash representing the identity of a tactic without it's arguments. Due to the complexity of the syntax trees of Coq's tactics, we do not currently encode the syntax tree. Instead, this hash is a representative of the syntax tree of the tactic with all of it's arguments removed.
The full text of the tactic including the full arguments. This does not currently correspond to (ident, arguments) because in this dataset arguments do not include full terms, but only references to definitions and local context elements. Tactics are postprocessed to remove explicit naming of new hypotheses as much as possible, and instead asking Coq to invent a name.
A textual representation of the base tactic without arguments. It tries to roughly correspond to ident
.
Note, however, that this is both an under-approximation and an over-approximation. The reason is that tactic
printing is not 100% isomorphic to Coq's internal AST of tactics. Sometimes, different tactics get mapped to
the same text. Conversely, the same tactic may be mapped to different texts when identifiers are printed
using different partially-qualified names.
A textual representation that tries to come as close as possible to (ident, arguments).
It comes with the same caveats as baseText
.
Indicates whether or not ident
+ arguments
is faithfully reversible into the original "strictified" tactic.
Note that this does not necessarily mean that it represents exactly the tactic that was inputted by the user.
All tactics are modified to be 'strict' (meaning that tactics that have delayed variables in them break).
This flag measures the faithfulness of the representation w.r.t. the strict version of the tactic, not the
original tactic inputted by the user.
Same as text
except that the tactic is not postprocessed to remove explicit naming of new hypotheses.
ProofStep_Tactic_Reader(_DynamicStructReader dyn)
The tactic that generated the proof step. Note that the arguments of the tactic can be found in the individual outcomes, because they may be different for each outcome.
Outcome_Reader(_DynamicStructReader dyn)
An outcome is the result of running a tactic on a proof state. A tactic may run on multiple proof states.
The proof term that witnesses the transition from the before state to the after states. It contains a hole
(an evar
node) for each of the after states. It may also refer to elements of the local context of the
before state.
Outcome_Term_Reader(_DynamicStructReader dyn)
The proof term that witnesses the transition from the before state to the after states. It contains a hole
(an evar
node) for each of the after states. It may also refer to elements of the local context of the
before state.
ProofState_Reader(_DynamicStructReader dyn)
A proof state represents a particular point in the tactical proof of a constant.
The entry-point of the proof state, all nodes that are 'part of' the proof state are reachable from here.
The local context of the proof state. These nodes label's are either contextAssum
or contextDef
. Note that
these nodes are also reachable from the root of the proof state.
The names of the local context nodes of the proof state, as they originally appeared in the proof. These names should be used for debugging and viewing purposes only, because hypothesis-generating tactics have been modified to use auto-generated names. Hence, tactics should not be concerned about the names of the context.
A unique identifier of the proof state. Any two proof states in a tactical proof that have an equal id
can morally be regarded to be 'the same' proof state.
IMPORTANT: Two proof states with the same id may still have different contents. This is because proof states
can contain existential variables (represented by the evar
node) that can be filled as a
side-effect by a tactic running on another proof state.
TODO: This is currently duplicated with the id attached to the ProofState
node that root
points to.
Eventually, this id should be resolved to a NodeIndex that represents the location of this proof state
in the final proof term that was generated.
Argument_Reader(_DynamicStructReader dyn)
A concrete argument of a tactic.
ProofState_Root_Reader(_DynamicStructReader dyn)
The entry-point of the proof state, all nodes that are 'part of' the proof state are reachable from here.
Node_Reader(_DynamicStructReader dyn)
A node in the CIC graph
Argument_Term_Reader(_DynamicStructReader dyn)
The root of a graph representing an argument that is a term in the calculus of constructions.
Graph_EdgeTarget_Target_Reader(_DynamicStructReader dyn)
AbstractTactic_Reader(_DynamicStructReader dyn)
DataVersion_Reader(_DynamicStructReader dyn)
Version info of a dataset.
Currently we only have a major version. Any change to the graph format, Cap'n Proto schema or communication protocol is considered a breaking change and will increment the major version number. Note that currently we don't distinquish backwards-compatible Cap'n Proto schema changes from non-backwards-compatible changes. The schema might change in a non-compatible way without notice. In the future, a minor version id might be introduced.
Dataset_Reader(_DynamicStructReader dyn)
Every file in the dataset contains a single message of type Dataset
. Every file corresponds to
a Coq source file, and contains a representation of all definitions that have existed at any point
throughout the compilation of the source file.
The graph contained in a file may reference nodes from the graph of other files. This field maps
a DepIndex
into the a file that contains that particular node.
The first file in this list is always the current file. It is guaranteed that no cycles exist in
the dependency relation between files induced by this field (except for the self-reference of the file).
The entry point of the global context of definitions that are available when this file is 'Required' by
another file. The full global context can be obtained by following the previous
node of definitions.
If the compilation unit does not contain any 'super'-global definitions this is set to len(graph.nodes)
All of the definitions present in the graph.
Note that some of these nodes may not be part of the 'super-global' context that is reachable using the
representative
field as an entry point. The reason is that the global context is a forest (list of tree's)
and the 'super-global' context is only the main spine of this forest.
Exception_Reader(_DynamicStructReader dyn)
A list of things that can go wrong.
ExecutionResult_Reader(_DynamicStructReader dyn)
The result of executing a tactic on a proof state.
ExecutionResult_NewState_Reader(_DynamicStructReader dyn)
The tactic ran successfully and produced a new proof state.
PredictionProtocol_Reader(_DynamicStructReader dyn)
This protocol works by exchanging raw messages over a socket. The protocol is fully linear.
Coq sends a Request
and waits for a corresponding Response
message. A message of a given
type should be responded with using the obviously corresponding response type.
PredictionProtocol_Request_Reader(_DynamicStructReader dyn)
Start a context for making tactical predictions for proof search. The context includes the tactics that are currently available, the definitions that are available.
TODO: Get rid of this in next version, no longer used. Coq uses this message to synchronize the state of the protocol when exceptions have occurred. The contract is that the given integer needs to be echo'd back verbatim.
PredictionProtocol_Request_Initialize_Reader(_DynamicStructReader dyn)
Start a context for making tactical predictions for proof search. The context includes the tactics that are currently available, the definitions that are available.
Initialize messages can depend on previously sent initialize messages. These messages form a stack. This field indicates the size of the stack that this message depends on (if a message has no dependencies, this field is zero). If the current stack is larger than the stack size of this message, then the items excess items on the stack should be discarded before adding the current message to the stack. It is guaranteed that the excess items will never be referenced again.
The dependency indices of nodes contained in the graphs of the stack are indexed as follows:
A dependency index i
found in the graph at stack height j
is resolved to the graph at stack height
j - i
. (This scheme keeps the guarantee that dependency index 0
always points to the current graph.)
PredictionProtocol_Request_Predict_Reader(_DynamicStructReader dyn)
Request a list of tactic predictions given the graph of a proof state.
PredictionProtocol_Prediction_Reader(_DynamicStructReader dyn)
PredictionProtocol_TextPrediction_Reader(_DynamicStructReader dyn)
PredictionProtocol_Response_Reader(_DynamicStructReader dyn)
See Request for documentation.
PredictionProtocol_Response_Alignment_Reader(_DynamicStructReader dyn)
ConflatableEdges_Reader(_DynamicStructReader dyn)
Enum where members are also (and must be) ints
Inherited Members
- enum.Enum
- name
- value
- builtins.int
- conjugate
- bit_length
- bit_count
- to_bytes
- from_bytes
- as_integer_ratio
- real
- imag
- numerator
- denominator
Enum where members are also (and must be) ints
Inherited Members
- enum.Enum
- name
- value
- builtins.int
- conjugate
- bit_length
- bit_count
- to_bytes
- from_bytes
- as_integer_ratio
- real
- imag
- numerator
- denominator
Enum where members are also (and must be) ints
Inherited Members
- enum.Enum
- name
- value
- builtins.int
- conjugate
- bit_length
- bit_count
- to_bytes
- from_bytes
- as_integer_ratio
- real
- imag
- numerator
- denominator
Enum where members are also (and must be) ints
Inherited Members
- enum.Enum
- name
- value
- builtins.int
- conjugate
- bit_length
- bit_count
- to_bytes
- from_bytes
- as_integer_ratio
- real
- imag
- numerator
- denominator
Enum where members are also (and must be) ints
Inherited Members
- enum.Enum
- name
- value
- builtins.int
- conjugate
- bit_length
- bit_count
- to_bytes
- from_bytes
- as_integer_ratio
- real
- imag
- numerator
- denominator
Enum where members are also (and must be) ints
Inherited Members
- enum.Enum
- name
- value
- builtins.int
- conjugate
- bit_length
- bit_count
- to_bytes
- from_bytes
- as_integer_ratio
- real
- imag
- numerator
- denominator
Enum where members are also (and must be) ints
Inherited Members
- enum.Enum
- name
- value
- builtins.int
- conjugate
- bit_length
- bit_count
- to_bytes
- from_bytes
- as_integer_ratio
- real
- imag
- numerator
- denominator
Enum where members are also (and must be) ints
Inherited Members
- enum.Enum
- name
- value
- builtins.int
- conjugate
- bit_length
- bit_count
- to_bytes
- from_bytes
- as_integer_ratio
- real
- imag
- numerator
- denominator
Enum where members are also (and must be) ints
Inherited Members
- enum.Enum
- name
- value
- builtins.int
- conjugate
- bit_length
- bit_count
- to_bytes
- from_bytes
- as_integer_ratio
- real
- imag
- numerator
- denominator
Enum where members are also (and must be) ints
Inherited Members
- enum.Enum
- name
- value
- builtins.int
- conjugate
- bit_length
- bit_count
- to_bytes
- from_bytes
- as_integer_ratio
- real
- imag
- numerator
- denominator