pygfa package

Submodules

pygfa.gfa module

GFA representation through a networkx MultiGraph.

The dovetail operations are available through the DovetailIterator class (pygfa.dovetail_operations.iterator), which considers only dovetail overlap edges.

TODO:
  • Rewrite pprint method.
class pygfa.gfa.Element[source]

Bases: object

Represent the types of graph element a GFA graph object can contain.

EDGE = 1
NODE = 0
SUBGRAPH = 2
class pygfa.gfa.GFA(base_graph=None)[source]

Bases: pygfa.dovetail_operations.iterator.DovetailIterator

GFA uses a networkx MultiGraph as the structure that contains the elements of the specification. GFA graphs directly accept only instances coming from the graph_elements package, but can contain any kind of data indirectly by accessing the _graph attribute.

add_edge(new_edge, safe=False)[source]

Add a graph_element Edge or a networkx edge to the GFA graph using the edge id as key.

If its id is * or None, the edge is given a virtual_id. In either case the original edge id is preserved as an edge attribute.

All edge attributes are stored as networkx edge attributes, and all the remaining optional fields are stored individually as edge data.
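
The id handling described above can be illustrated without pygfa. The snippet below is a minimal sketch and not the library's implementation: the TinyEdgeStore class, the "virtual_N" key format and the "eid" attribute name are assumptions made for this example.

```python
class TinyEdgeStore:
    """Hypothetical stand-in for the edge bookkeeping described above."""

    def __init__(self):
        self._edges = {}        # edge key -> attribute dict
        self._next_virtual = 0  # counter used to build virtual ids

    def add_edge(self, eid, from_node, to_node, **optional_fields):
        # Edges whose id is '*' or None get a generated key (assumed format).
        if eid is None or eid == "*":
            key = "virtual_{}".format(self._next_virtual)
            self._next_virtual += 1
        else:
            key = eid
        # The original id is kept as an attribute in either case.
        attrs = {"eid": eid, "from_node": from_node, "to_node": to_node}
        # Remaining optional fields are stored individually next to it.
        attrs.update(optional_fields)
        self._edges[key] = attrs
        return key


store = TinyEdgeStore()
named_key = store.add_edge("link_1", "1", "2", NM=3)
anonymous_key = store.add_edge("*", "2", "3")
```

Here the anonymous edge ends up under a generated key while its stored "eid" attribute still holds the original "*".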

add_graph_element(element)[source]

Add a graph element -Node, Edge or Subgraph- object to the graph.

add_node(new_node, safe=False)[source]

Add a graph_element Node to the GFA graph using the node id as key.

Its sequence and sequence length are stored as individual attributes on the graph, and all the remaining optional fields are stored individually as node data.

Parameters:
  • new_node – A graph_element.Node object or a string that can represent a node (such as the Segment line).
  • safe – If set, check whether the given identifier has already been added to the graph and, in that case, raise an exception.
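
To make the "string that can represent a node" case concrete, the sketch below splits a GFA1 Segment line into the pieces mentioned above (id, sequence, sequence length, remaining optional fields). It is only an illustration: parse_segment_line and the "nid", "sequence" and "slen" key names are assumptions for this example and are not taken from pygfa.

```python
def parse_segment_line(line):
    """Split a tab-separated GFA1 'S' line into a plain dict (illustrative only)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3 or fields[0] != "S":
        raise ValueError("not a GFA1 segment line")
    nid, sequence = fields[1], fields[2]
    node = {
        "nid": nid,
        "sequence": sequence,
        # '*' means the sequence is not stored, so no length can be derived here.
        "slen": None if sequence == "*" else len(sequence),
    }
    # Optional fields use the TAG:TYPE:VALUE layout; each is kept individually.
    for field in fields[3:]:
        tag, type_, value = field.split(":", 2)
        node[tag] = (type_, value)
    return node


parsed = parse_segment_line("S\t1\tACGT\tRC:i:10")
```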
add_subgraph(subgraph, safe=False)[source]

Add a Subgraph object to the graph.

The object is not altered in any way. A deepcopy of the object given is attached to the graph.
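
The effect of attaching a deepcopy can be shown with the standard library alone. The following is a small sketch under assumptions: TinySubgraphStore and the dict-based subgraph are hypothetical stand-ins, not pygfa classes.

```python
import copy


class TinySubgraphStore:
    """Hypothetical container that stores a deep copy of what it is given."""

    def __init__(self):
        self._subgraphs = {}

    def add_subgraph(self, sub_id, subgraph):
        # The caller's object is left untouched; the store keeps its own copy.
        self._subgraphs[sub_id] = copy.deepcopy(subgraph)


original = {"elements": {"1": "+", "2": "-"}}
store = TinySubgraphStore()
store.add_subgraph("path_a", original)

# Mutating the caller's object afterwards does not reach the stored copy.
original["elements"]["3"] = "+"
```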

as_graph_element(key)[source]

Given a key of an existing node, edge or subgraph, return its equivalent graph element object.

clear()[source]

Clear all GFA object elements.

Call networkx clear method, reset the virtual id counter and delete all the subgraphs.

dovetails_subgraph(nbunch=None, copy=True)[source]

Given a collection of nodes, return a subgraph with the given nodes and all the edges between each pair of nodes. Only dovetail overlaps are considered.

dump(gfa_version=1, out=None)[source]
edge(identifier=None)[source]

GFA edge accessor.

  • If identifier is None all the graph edges are returned.
  • If identifier is a tuple perform a search by nodes with
    the tuple values as nodes id.
  • If identifier is a single defined value then perform
    a search by edge key, where the edge key is the given value.
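
The three cases above amount to a dispatch on the type of identifier. A minimal sketch, assuming edges are held in a plain dict that maps each edge key to its pair of end nodes and that the graph is undirected (lookup_edge is a hypothetical helper, not the pygfa method):

```python
def lookup_edge(edges, identifier=None):
    """edges maps edge key -> (from_node, to_node)."""
    if identifier is None:
        # No identifier: return every edge.
        return dict(edges)
    if isinstance(identifier, tuple):
        # Tuple: search by end nodes, in either order.
        a, b = identifier
        return {key: ends for key, ends in edges.items()
                if ends in ((a, b), (b, a))}
    # Anything else: search by edge key.
    return edges.get(identifier)


edges = {"e1": ("1", "2"), "e2": ("2", "1"), "e3": ("2", "3")}
between_1_and_2 = lookup_edge(edges, ("1", "2"))
single = lookup_edge(edges, "e3")
```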
edges(**kwargs)[source]

Return all the edges in the graph.

edges_iter(nbunch=None, data=False, keys=False, default=None)[source]

Interface to the networkx edges iterator.

classmethod from_file(filepath)[source]

Parse the given file and return a GFA object.

from_string(string)[source]

Add a GFA string to the graph once it has been converted.

TODO: Maybe this could be used instead of checking for line type in the add_xxx methods...
get(key)[source]

Return the element pointed by the specified key.

get_subgraph(sub_key)[source]

Return a GFA subgraph from the parent graph.

Return a new GFA graph structure with the nodes, edges and subgraphs specified in the elements attributes of the subgraph object pointed by the id.

The returned GFA is independent from the original object.

Parameters:sub_key – The id of a subgraph present in the GFA graph.
Returns:None if the subgraph id doesn’t exist.
nbunch_iter(nbunch=None)[source]

Return an iterator of nodes contained in nbunch that are also in the graph.

Interface to the networkx method.

neighbors(nid)[source]

Return the ids of all the nodes connected to the given node.

Both the predecessors and the successors of the given source node are returned.

Parameters:nid – The id of the selected node.
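
As an illustration of collecting both predecessors and successors, the sketch below walks a plain dict of edges and gathers the node at the other end of every edge that touches nid. The neighbors helper and the dict layout are assumptions for this example, not pygfa code.

```python
def neighbors(edges, nid):
    """edges maps edge key -> (from_node, to_node); returns a set of node ids."""
    found = set()
    for from_node, to_node in edges.values():
        if from_node == nid:
            found.add(to_node)    # successor of nid
        if to_node == nid:
            found.add(from_node)  # predecessor of nid
    return found


edges = {"e1": ("1", "2"), "e2": ("3", "1"), "e3": ("2", "3")}
around_1 = neighbors(edges, "1")
```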
node(identifier=None)[source]

An interface to access the node method of the networkx graph.

If identifier is None all the graph nodes are returned.

nodes(data=False, with_sequence=False)[source]

Return a list of the nodes in the graph.

Parameters:with_sequence – If set, return only nodes with a sequence property.
nodes_iter(data=False, with_sequence=False)[source]

Return an iterator over nodes in the graph.

Parameters:with_sequence – If set, return only nodes with a sequence property.
pprint()[source]

A basic pretty print function for nodes and edges.

remove_edge(identifier)[source]

Remove a single edge identified by its id, or all the edges between two end nodes identified by a tuple.

  • If identifier is a two-element tuple, remove all
    the edges between the two nodes.
  • If identifier is a three-element tuple, remove the edge
    specified by the third element of the tuple, with end nodes given by the first two elements.
  • If identifier is not a tuple, treat it as
    an edge id.
Raises:InvalidEdgeError – If identifier is not in the cases described above.
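
The identifier cases above can be sketched as a small dispatch over a dict of edges. This is an illustration under assumptions, not the pygfa implementation: the InvalidEdgeError class defined here is a local stand-in, the graph is treated as undirected, and silently ignoring edges that do not exist is a choice made only for this example.

```python
class InvalidEdgeError(Exception):
    """Local stand-in for the exception named above."""


def remove_edge(edges, identifier):
    """edges maps edge key -> (from_node, to_node); returns the removed keys."""
    if isinstance(identifier, tuple):
        if len(identifier) == 2:
            # Two elements: every edge between the two nodes.
            a, b = identifier
            doomed = [key for key, ends in edges.items()
                      if ends in ((a, b), (b, a))]
        elif len(identifier) == 3:
            # Three elements: one keyed edge between the two nodes.
            a, b, key = identifier
            doomed = [key] if edges.get(key) in ((a, b), (b, a)) else []
        else:
            raise InvalidEdgeError("unsupported identifier: %r" % (identifier,))
    else:
        # Not a tuple: treat the value as an edge id.
        doomed = [identifier] if identifier in edges else []
    for key in doomed:
        del edges[key]
    return doomed


edges = {"e1": ("1", "2"), "e2": ("2", "1"), "e3": ("2", "3"), "e4": ("3", "4")}
removed_pair = remove_edge(edges, ("1", "2"))
removed_keyed = remove_edge(edges, ("2", "3", "e3"))
removed_by_id = remove_edge(edges, "e4")
```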
remove_edges(from_node, to_node)[source]

Remove all the direct edges between the two nodes given.

Call remove_edge repeatedly (removing an unspecified edge between from_node and to_node each time) n times, where n is the number of edges between the given nodes, so that all of them are removed.

remove_node(nid)[source]

Remove a node with nid as its node id.

Edges containing nid as end node will be automatically deleted.

Parameters:nid – The id belonging to the node to delete.
Raises:InvalidNodeError – If nid doesn’t point to any node.
remove_subgraph(subgraph_id)[source]

Remove the Subgraph object identified by the given id.

search(comparator, limit_type=None)[source]

Perform a query applying the comparator on each graph element.
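
A comparator-based query of this kind can be sketched as follows. The code is illustrative only and rests on assumptions: that the comparator receives each element's data as a dict, that matching keys are returned as a list, and that limit_type restricts the query to one element type. The constants mirror the Element values listed above; the nested-dict graph layout is hypothetical.

```python
NODE, EDGE, SUBGRAPH = 0, 1, 2  # same values as the Element constants above


def search(graph, comparator, limit_type=None):
    """graph maps element type -> {key: data dict}; returns matching keys."""
    kinds = (NODE, EDGE, SUBGRAPH) if limit_type is None else (limit_type,)
    hits = []
    for kind in kinds:
        for key, data in graph.get(kind, {}).items():
            if comparator(data):
                hits.append(key)
    return hits


graph = {
    NODE: {"1": {"slen": 4}, "2": {"slen": 10}},
    EDGE: {"e1": {"slen": 99}},
}


def is_long(data):
    # Example comparator: keep elements whose 'slen' exceeds 5.
    return data.get("slen", 0) > 5


everything = search(graph, is_long)
nodes_only = search(graph, is_long, limit_type=NODE)
```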

search_on_edges(comparator)[source]
search_on_nodes(comparator)[source]
search_on_subgraph(comparator)[source]
subgraph(nbunch, copy=True)[source]

Given a bunch of nodes return a graph with all the given nodes and the edges between them.

The returned object is not a GFA graph, but a MultiGraph. To create a new GFA graph, use the GFA initializer and give the subgraph to it.

Interface to the networkx subgraph method. Given a collection of nodes return a subgraph with the nodes given and all the edges between each pair of nodes.

Parameters:
  • nbunch – The nodes.
  • copy – If set to True return a copy of the subgraph.
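
The idea of a node-induced subgraph (keep the given nodes and only the edges whose two ends are both among them) can be sketched on plain dicts. The induced_subgraph helper below is a hypothetical illustration and does not call networkx or pygfa.

```python
def induced_subgraph(nodes, edges, nbunch):
    """nodes: set of ids; edges: key -> (from_node, to_node)."""
    # Nodes in nbunch that are not in the graph are ignored.
    keep = set(nbunch) & set(nodes)
    kept_edges = {key: (a, b) for key, (a, b) in edges.items()
                  if a in keep and b in keep}
    return keep, kept_edges


nodes = {"1", "2", "3"}
edges = {"e1": ("1", "2"), "e2": ("2", "1"), "e3": ("2", "3")}
sub_nodes, sub_edges = induced_subgraph(nodes, edges, ["1", "2", "7"])
```

Both parallel edges between "1" and "2" survive, while the edge reaching "3" is dropped.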
subgraphs(identifier=None)[source]

An interface to access the subgraphs inside the GFA object.

If identifier is None all the graph Subgraph objects are returned.

subgraphs_iter(data=False)[source]

Return an iterator over subgraphs elements in the GFA graph.

exception pygfa.gfa.GFAError[source]

Bases: Exception

exception pygfa.gfa.InvalidElementError[source]

Bases: Exception

exception pygfa.gfa.InvalidSearchParameters[source]

Bases: Exception

pygfa.operations module

pygfa.operations.nodes_connected_component(gfa_, nid)[source]

Return the connected component belonging to the given node.

Parameters:nid – The id of the node whose reachable nodes are to be found.
pygfa.operations.nodes_connected_components(gfa_)[source]

Return a generator of sets with nodes of each weakly connected component in the graph.
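
For readers unfamiliar with the idea, a generator of connected components can be written with a breadth-first search. This is a sketch in plain Python under assumptions (an undirected adjacency dict as input, and a hypothetical connected_components function); it is not the code used by pygfa.operations.

```python
from collections import deque


def connected_components(adjacency):
    """adjacency maps node id -> set of neighbouring node ids (undirected)."""
    seen = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        yield component


adjacency = {"1": {"2"}, "2": {"1"}, "3": set()}
components = list(connected_components(adjacency))
```

The component of a single node, as in nodes_connected_component, is then the one set that contains that node's id.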

Module contents