5.2.12. xgt.Job

class xgt.Job(conn: Connection, job_response: JobStatus | None = None, python_errors: list[str] | None = None)

Represents a user-scheduled Job.

An instance of this object is created by job-scheduling functions like xgt.Connection.run_job and xgt.Connection.schedule_job.

A Job is used as a proxy for a job in the server and allows the user to monitor its execution, possibly cancel it, and learn about its status during and after execution.

Parameters:
  • conn (Connection) – An open connection to an xGT server.

  • job_response (JobStatus | None) – A single element of the array returned by a job creation gRPC call. Each element of the array is constructed as a separate Job object.

  • python_errors (list[str] | None) – List of ingest errors.

Methods

get_data([offset, length, rows, columns, ...])

Returns results data for query jobs with a RETURN but no INTO or None for all other job types.

get_ingest_errors([offset, length])

Returns a list of strings giving error information from ingest.

Attributes

default_graph

Default graph used in the job.

default_namespace

Default namespace of the job.

description

A description supplied when the job was started.

end_time

Date and time when the job finished running.

error

User-friendly error message describing the reason a job failed.

error_type

The class in the XgtError hierarchy that corresponds to the original exception that caused the job to fail.

id

Identifier of the job.

num_rows

The number of rows in the query result or the number of correctly ingested/inserted rows for an input operation.

results_frame

Name of the results frame for a query job with an INTO clause or None for all other job types.

schema

The property names and types of the stored results.

start_time

Date and time when the job was scheduled.

status

Status of the job.

total_ingest_errors

The number of errors that were thrown during ingest.

total_visited_edges

The total number of edges traversed during the job.

trace

Very detailed error message for a failed job.

user

User who ran the job.

visited_edges

A dictionary mapping Cypher bound variable names to an integer giving the number of edges visited during the job for the Edge Frame referenced by the bound variable.

property default_graph: str | None

Default graph used in the job.

Added in version 2.5.0.

property default_namespace: str | None

Default namespace of the job.

Added in version 2.0.0.

property description: str | None

A description supplied when the job was started. Usually a query.

property end_time: datetime.datetime | None

Date and time when the job finished running.

property error: str | None

User-friendly error message describing the reason a job failed.

property error_type: XgtNotImplemented | XgtInternalError | XgtIOError | XgtServerMemoryError | XgtConnectionError | XgtSyntaxError | XgtTypeError | XgtValueError | XgtNameError | XgtArithmeticError | XgtTransactionError | XgtSecurityError | None

The class in the XgtError hierarchy that corresponds to the original exception that caused the job to fail.

get_data(offset: int = 0, length: int | None = None, rows: Iterable[int] | None = None, columns: Iterable[int | str] | None = None, format: str = 'python', expand: str = 'none') -> list[list[Any]] | pandas.DataFrame | pyarrow.Table | None

Returns results data for query jobs with a RETURN but no INTO or None for all other job types.

Parameters:
  • offset (int) – Position (index) of the first row to be retrieved. Cannot be given with rows.

  • length (int | None) – Maximum number of rows to be retrieved starting from the row indicated by offset. A value of ‘None’ means ‘all rows’ on and after the offset. Cannot be given with rows.

  • rows (Iterable[int] | None) –

    The rows to retrieve. A value of ‘None’ means all rows. Cannot be given with either offset or length.

    Added in version 1.16.0.

  • columns (Iterable[int | str] | None) –

    The columns to retrieve. Given as an iterable over mixed column positions and schema column names. A value of ‘None’ means all columns.

    Added in version 1.14.0.

  • format (str) –

    Selects the data format returned: a Python list of lists, a pandas DataFrame, or an Apache Arrow Table. Must be one of ‘python’, ‘pandas’, or ‘arrow’. Default = ‘python’.

    Added in version 1.14.0.

  • expand (str) –

    Controls what is returned for a RowID column type. Allowed values are:
    • ‘none’: Only RowID. Original behavior.

    • ‘light’: Expands RowIDs to Vertex, Edge, and TableRow types that include properties.

    • ‘full’: Expands RowIDs to Vertex, Edge, and TableRow types that include properties. Also includes frame metadata.

    Works only with the ‘python’ and ‘pandas’ formats.

    Experimental: The API of this parameter may change in future releases.

Returns:

If the job represents an OpenCypher query with a RETURN clause but no INTO clause, returns one of the following, depending on format: a list of lists, a pandas DataFrame, or an Apache Arrow Table. Otherwise, returns None.

Return type:

list[list[Any]] | pandas.DataFrame | pyarrow.Table | None

Raises:
  • XgtValueError – If a parameter is out of bounds or an invalid format is given.

  • OverflowError – If data is out of bounds when converting.
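The offset and length parameters make it possible to retrieve a large result in pages. The following is a minimal sketch rather than part of the xgt API: iter_pages is a hypothetical helper, and FakeJob is a stand-in that mimics only the documented offset/length behavior of get_data and the num_rows attribute, so the example runs without a server. With a real Job, the object returned by a call such as xgt.Connection.run_job would be passed instead.

```python
from typing import Any, Iterator


def iter_pages(job: Any, page_size: int = 1000) -> Iterator[list]:
    """Yield successive pages of rows from a job's stored query results.

    Hypothetical helper: relies only on the documented num_rows attribute
    and the offset/length parameters of get_data (format='python').
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while offset < job.num_rows:
        page = job.get_data(offset=offset, length=page_size)
        if not page:  # None (no stored results) or an empty page
            break
        yield page
        offset += len(page)


class FakeJob:
    """Stand-in for xgt.Job, used only to make this sketch runnable."""

    def __init__(self, rows):
        self._rows = rows
        self.num_rows = len(rows)

    def get_data(self, offset=0, length=None):
        end = None if length is None else offset + length
        return self._rows[offset:end]


job = FakeJob([[i, i * i] for i in range(5)])
pages = list(iter_pages(job, page_size=2))
print(pages)
```

Advancing the offset by the number of rows actually returned, rather than by page_size, keeps the loop correct when the last page is short.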

get_ingest_errors(offset: int = 0, length: int | None = None) -> list[str] | None

Returns a list of strings giving error information from ingest. The first thousand errors raised are retrievable this way.

Parameters:
  • offset (int) – Position (index) of the first row to be retrieved.

  • length (int | None) – Maximum number of rows of errors to be retrieved starting from the row indicated by offset. A value of ‘None’ means ‘all rows’ on and after the offset.

Returns:

A list of error strings, or None if this is not an ingest job or no errors were raised.

Return type:

list[str] | None
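Because get_ingest_errors can return None, callers need to handle that case before iterating. A minimal sketch of one way to do so follows; ingest_error_report is a hypothetical helper, FakeIngestJob is a stand-in that imitates only the documented return behavior, and the error strings are invented for illustration.

```python
from typing import Any


def ingest_error_report(job: Any, preview: int = 5) -> str:
    """Build a short text report of ingest errors for a job.

    Hypothetical helper: uses only total_ingest_errors and
    get_ingest_errors(offset, length) as documented above.
    """
    errors = job.get_ingest_errors(offset=0, length=preview)
    if errors is None:  # not an ingest job, or no errors were raised
        return "no ingest errors"
    lines = [f"{job.total_ingest_errors} ingest error(s); showing {len(errors)}:"]
    lines.extend(f"  {message}" for message in errors)
    return "\n".join(lines)


class FakeIngestJob:
    """Stand-in for xgt.Job, used only to make this sketch runnable."""

    def __init__(self, errors):
        self._errors = errors
        self.total_ingest_errors = len(errors)

    def get_ingest_errors(self, offset=0, length=None):
        if not self._errors:
            return None
        end = None if length is None else offset + length
        return self._errors[offset:end]


clean = FakeIngestJob([])
dirty = FakeIngestJob([f"row {i}: bad value" for i in range(8)])  # made-up messages
print(ingest_error_report(clean))
print(ingest_error_report(dirty, preview=3))
```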

property id: int

Identifier of the job.

A 64-bit integer value that uniquely identifies a job. It is automatically incremented for each scheduled job over the lifetime of the xGT server process.

property num_rows: int

The number of rows in the query result or the number of correctly ingested/inserted rows for an input operation.

Added in version 1.15.0.

property results_frame: str | None

Name of the results frame for a query job with an INTO clause or None for all other job types.

Added in version 2.0.1.

property schema: list[list[Any]] | None

The property names and types of the stored results. Only set when query results are stored in the job.

property start_time: datetime.datetime | None

Date and time when the job was scheduled.
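Since start_time and end_time are both datetime.datetime values or None, the interval between them can be computed once both are set. This is a minimal sketch: elapsed is a hypothetical helper, and SimpleNamespace objects with made-up timestamps stand in for Job instances.

```python
from datetime import datetime, timedelta
from types import SimpleNamespace
from typing import Optional


def elapsed(job) -> Optional[timedelta]:
    """Time from scheduling to completion, or None if either time is unset."""
    if job.start_time is None or job.end_time is None:
        return None
    return job.end_time - job.start_time


# SimpleNamespace stands in for an xgt.Job; the timestamps are made up.
finished = SimpleNamespace(
    start_time=datetime(2024, 1, 1, 12, 0, 0),
    end_time=datetime(2024, 1, 1, 12, 0, 42),
)
pending = SimpleNamespace(start_time=datetime(2024, 1, 1, 12, 0, 0), end_time=None)
print(elapsed(finished))  # 0:00:42
print(elapsed(pending))   # None
```

Because start_time is documented as the time the job was scheduled, this interval may include any time the job spent waiting before it started running.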

property status: str

Status of the job.

Job status values:
  • scheduled – The state after the job has been created, but before it has started running.

  • running – The job is being executed.

  • completed – The job finished successfully.

  • canceled – The job was canceled.

  • failed – The job failed. When the job fails the error and trace properties are populated.

  • rollback – The job had a transactional conflict with another job and was rolled back.

  • unknown_job_status – The job was not found in the job history.
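The status strings can be used to branch on the outcome of a job. The sketch below is illustrative only: describe is a hypothetical helper, treating completed, canceled, failed, and rollback as final states is an assumption of this example rather than something stated here, and SimpleNamespace objects stand in for Job instances.

```python
from types import SimpleNamespace

# Status strings taken from the list of job status values. Treating these
# four as final is an assumption of this sketch, not a documented guarantee.
FINAL_STATUSES = {"completed", "canceled", "failed", "rollback"}


def describe(job) -> str:
    """Return a one-line summary of a job using only documented attributes."""
    if job.status == "failed":
        # error is populated when the job fails.
        return f"job {job.id} failed: {job.error}"
    if job.status in FINAL_STATUSES:
        return f"job {job.id} finished with status {job.status}"
    return f"job {job.id} is not finished (status {job.status})"


# SimpleNamespace objects stand in for xgt.Job; the error text is made up.
running = SimpleNamespace(id=1, status="running", error=None)
failed = SimpleNamespace(id=2, status="failed", error="example error message")
done = SimpleNamespace(id=3, status="completed", error=None)
for job in (running, failed, done):
    print(describe(job))
```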

property total_ingest_errors: int

The number of errors that were thrown during ingest.

property total_visited_edges: int

The total number of edges traversed during the job. This is the sum of the counts for all edge labels returned in visited_edges.

For the example given in the visited_edges documentation, the value of total_visited_edges would be 16.

property trace: str | None

Very detailed error message for a failed job.

This error message contains the friendly error message and a stack trace for the code that participated in the error.

property user: str

User who ran the job.

property visited_edges: dict[str, int] | None

A dictionary mapping Cypher bound variable names to an integer giving the number of edges visited during the job for the Edge Frame referenced by the bound variable.

An edge is “visited” when the query considers the edge as a match to one of the query path edges. Multiple Cypher variables can refer to the same edge frame.

Consider the query path ()-[a:graph_edges1]->()-[b:graph_edges2]->()-[c:graph_edges1]->() with a visited_edges result of a -> 5, b -> 7, c -> 4. In performing the query, 5 edges were visited for the variable a, 7 for b, and 4 for c. Because a and c both refer to the frame graph_edges1, the total number of edges visited for that frame is 9, while the number of edges visited for the frame graph_edges2 is 7.
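The per-frame arithmetic in this example can be reproduced with a few lines of Python. The visited_edges dictionary below is written out by hand with the values from the example, and the variable_to_frame mapping is an assumption supplied by the reader from the query text; it is not something the Job object returns.

```python
from collections import defaultdict

# Values from the example above.
visited_edges = {"a": 5, "b": 7, "c": 4}

# Which edge frame each bound variable refers to. This mapping comes from
# reading the query text; it is not returned by the Job object.
variable_to_frame = {
    "a": "graph_edges1",
    "b": "graph_edges2",
    "c": "graph_edges1",
}

# Sum the per-variable counts for each edge frame.
per_frame = defaultdict(int)
for variable, count in visited_edges.items():
    per_frame[variable_to_frame[variable]] += count

total = sum(visited_edges.values())
print(dict(per_frame))  # {'graph_edges1': 9, 'graph_edges2': 7}
print(total)            # 16
```

The total of 16 matches the value given for total_visited_edges in its description.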