checkpoint 3
aMahanna committed Sep 3, 2024
1 parent d62fe61 commit f2caf27
Showing 13 changed files with 258 additions and 4 deletions.
File renamed without changes.
File renamed without changes.
99 changes: 99 additions & 0 deletions doc/algorithms/index.rst
@@ -0,0 +1,99 @@
.. _algorithms:

**********
Algorithms
**********

NetworkX-ArangoDB is first and foremost a **Storage Backend** for NetworkX: its main focus is persisting graphs to ArangoDB and reloading them from it.

However, you can still run algorithms on the graph.

There are 3 ways to run algorithms on the graph:

1. **NetworkX**: The traditional way of running algorithms on Graphs.
2. **NetworkX-cuGraph**: The GPU-accelerated way of running algorithms on Graphs.
3. **ArangoDB**: The database way of running algorithms on Graphs.

Currently, Options 1 & 2 are supported, whereas Option 3 is a work-in-progress.

Running algorithms with Option 2 requires ``nx-cugraph`` to be installed on a system with a compatible GPU:

.. code-block::

   pip install nx-cugraph-cu12 --extra-index-url https://pypi.nvidia.com

When running algorithms with Option 2, the graph is converted to an ``nx-cugraph`` graph, and the algorithm is run on the GPU.

This is only possible if ``nx-cugraph`` has implemented the algorithm you want to run.

- For a list of algorithms that are supported by ``nx-cugraph``, refer to the `nx-cugraph README <https://github.com/rapidsai/cugraph/tree/branch-24.10/python/nx-cugraph#algorithms>`_.
- For a list of algorithms that are supported by ``networkx``, refer to the `NetworkX Documentation <https://networkx.org/documentation/stable/reference/algorithms/index.html>`_.

``nx-arangodb`` automatically dispatches algorithm calls to either the CPU or the GPU, depending on whether ``nx-cugraph`` is installed. We rely on a Rust-based library called `phenolrs <https://github.com/arangoml/phenolrs>`_ to retrieve ArangoDB Graphs as fast as possible.
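
The installation check behind this kind of dispatch can be pictured as follows. This is a minimal sketch, not the library's actual logic: ``gpu_backend_available`` and ``pick_device`` are hypothetical helper names, and the sketch assumes the GPU package is importable as ``nx_cugraph``.

```python
import importlib.util


def gpu_backend_available(module_name: str = "nx_cugraph") -> bool:
    """Return True if the named top-level module can be located (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


def pick_device(use_gpu: bool = True, module_name: str = "nx_cugraph") -> str:
    """Choose "gpu" only when it is requested and the GPU backend is installed."""
    if use_gpu and gpu_backend_available(module_name):
        return "gpu"
    return "cpu"
```

With this shape, turning the flag off always yields ``"cpu"``, which mirrors the effect of the ``use_gpu`` setting described in this section.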

You can also force-run algorithms on CPU even if ``nx-cugraph`` is installed:

.. code-block:: python

   import os
   import networkx as nx
   import nx_arangodb as nxadb

   # os.environ ...

   G = nxadb.Graph(name="MyGraph")

   # Force the algorithms below to run on the CPU
   nx.config.backends.arangodb.use_gpu = False

   nx.pagerank(G)
   nx.betweenness_centrality(G)
   # ...

   # Re-enable GPU dispatching
   nx.config.backends.arangodb.use_gpu = True

.. image:: ../_static/dispatch.png
   :align: center
   :alt: nx-arangodb dispatching
   :height: 200px


**Tip**: If you're running multiple CPU algorithms, convert the graph to a NetworkX Graph with ``nxadb.convert.nxadb_to_nx`` before running them.
Otherwise, the entire graph is currently loaded into memory before *each* algorithm call, which can be slow for large graphs.

.. code-block:: python

   import networkx as nx
   import nx_arangodb as nxadb

   G_adb = nxadb.Graph(name="MyGraph")

   # Pull the graph into memory once, then reuse the NetworkX copy
   G_nx = nxadb.convert.nxadb_to_nx(G_adb)

   nx.pagerank(G_nx)
   nx.betweenness_centrality(G_nx)
   # ...

**Option 3**

This is an experimental module seeking to provide server-side algorithms for ``nx-arangodb`` Graphs.
The goal is to provide a set of algorithms that can be delegated to the server for processing,
rather than having to pull all the data to the client and process it there.

Currently, the module is in a very early stage and only provides a single algorithm: ``shortest_path``.
This is simply to demonstrate the potential of the module and to provide a starting point for further development.

.. code-block:: python

   import os
   import networkx as nx
   import nx_arangodb as nxadb

   # os.environ ...

   G = nxadb.Graph(name="MyGraph")

   nx.pagerank(G)  # Runs on the client
   nx.shortest_path(G, source="A", target="B")  # Runs on the DB server
   nx.shortest_path.orig_func(G, source="A", target="B")  # Runs on the client
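
The split between a dispatched call and ``orig_func`` can be illustrated with a small stand-alone wrapper. This is only a sketch of the idea under stated assumptions: ``Dispatchable``, ``register_backend``, and ``_server_shortest_path`` are hypothetical names, the graph is a plain adjacency dict, and the "server" implementation is a placeholder that records the call instead of querying a database. It is not how NetworkX or ``nx-arangodb`` implement dispatching.

```python
from collections import deque


class Dispatchable:
    """Wrap a client-side function and route calls to a backend if one is registered."""

    def __init__(self, func):
        self.orig_func = func      # the untouched client-side implementation
        self.backend_impl = None   # backend override, set by register_backend
        self.__name__ = func.__name__

    def register_backend(self, impl):
        self.backend_impl = impl
        return impl

    def __call__(self, *args, **kwargs):
        if self.backend_impl is not None:
            return self.backend_impl(*args, **kwargs)
        return self.orig_func(*args, **kwargs)


@Dispatchable
def shortest_path(graph, source, target):
    """Client-side breadth-first search over an adjacency dict."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    raise ValueError(f"no path between {source!r} and {target!r}")


server_calls = []  # records every call that was routed to the placeholder backend


@shortest_path.register_backend
def _server_shortest_path(graph, source, target):
    # Placeholder: a real backend would send a query to the database here.
    server_calls.append((source, target))
    return shortest_path.orig_func(graph, source, target)
```

Calling ``shortest_path(...)`` goes through the registered backend, while ``shortest_path.orig_func(...)`` bypasses it and runs the client-side search directly.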
2 changes: 1 addition & 1 deletion doc/classes/index.rst
@@ -27,7 +27,7 @@ graph you want to represent.
+----------------+------------+--------------------+------------------------+

.. toctree::
-   :maxdepth: 2
+   :maxdepth: 1

   graph
   digraph
18 changes: 18 additions & 0 deletions doc/dict/adj.rst
@@ -0,0 +1,18 @@
.. _adj:

=========
Adjacency
=========


.. currentmodule:: nx_arangodb.classes.dict.adj
.. autoclass:: AdjListOuterDict

.. currentmodule:: nx_arangodb.classes.dict.adj
.. autoclass:: AdjListInnerDict

.. currentmodule:: nx_arangodb.classes.dict.adj
.. autoclass:: EdgeKeyDict

.. currentmodule:: nx_arangodb.classes.dict.adj
.. autoclass:: EdgeAttrDict
12 changes: 12 additions & 0 deletions doc/dict/graph.rst
@@ -0,0 +1,12 @@
.. _graph:

=====
Graph
=====


.. currentmodule:: nx_arangodb.classes.dict.graph
.. autoclass:: GraphDict

.. currentmodule:: nx_arangodb.classes.dict.graph
.. autoclass:: GraphAttrDict
41 changes: 41 additions & 0 deletions doc/dict/index.rst
@@ -0,0 +1,41 @@
.. _dict:

************
Dictionaries
************

The ``dict`` module provides a set of ``UserDict``-based classes that extend the traditional dictionary functionality to maintain a remote connection to an ArangoDB Database.

NetworkX Graphs rely on dictionary-based structures to store their data, which are defined by their factory functions:

1. ``node_dict_factory``
2. ``node_attr_dict_factory``
3. ``adjlist_outer_dict_factory``
4. ``adjlist_inner_dict_factory``
5. ``edge_key_dict_factory`` (Only for MultiGraphs)
6. ``edge_attr_dict_factory``
7. ``graph_attr_dict_factory``

These factories are used to create the dictionaries that store the data of the nodes, edges, and the graph itself.

This module contains the following classes:

1. ``NodeDict``
2. ``NodeAttrDict``
3. ``AdjListOuterDict``
4. ``AdjListInnerDict``
5. ``EdgeKeyDict``
6. ``EdgeAttrDict``
7. ``GraphDict``
8. ``GraphAttrDict``

Each class extends the functionality of the corresponding dictionary factory by adding methods to interact with the data in ArangoDB. Think of it as a CRUD interface for ArangoDB. This is done by overriding the primary dunder methods of the ``UserDict`` class.

By using this strategy in addition to subclassing the ``nx.Graph`` class, we're able to preserve the original functionality of the NetworkX Graphs while adding ArangoDB support.
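
The override pattern described above can be sketched with the standard library alone. This is an illustrative example, not the module's implementation: ``FakeRemoteStore`` and ``RemoteBackedDict`` are hypothetical names, and the in-memory store stands in for an ArangoDB collection.

```python
from collections import UserDict


class FakeRemoteStore:
    """Stand-in for a database collection; records every write it receives."""

    def __init__(self):
        self.documents = {}
        self.log = []

    def upsert(self, key, value):
        self.documents[key] = value
        self.log.append(("upsert", key))

    def delete(self, key):
        self.documents.pop(key, None)
        self.log.append(("delete", key))


class RemoteBackedDict(UserDict):
    """Dictionary that mirrors writes and deletes to a remote store."""

    def __init__(self, store):
        super().__init__()
        self.store = store

    def __setitem__(self, key, value):
        super().__setitem__(key, value)  # update the local cache
        self.store.upsert(key, value)    # propagate the write to the store

    def __delitem__(self, key):
        super().__delitem__(key)         # raises KeyError if the key is missing
        self.store.delete(key)           # propagate the delete to the store

    def __getitem__(self, key):
        # On a local cache miss, try to fetch the value from the store first.
        if key not in self.data and key in self.store.documents:
            self.data[key] = self.store.documents[key]
        return super().__getitem__(key)
```

Because only the dunder methods change, code that treats the object as an ordinary dictionary keeps working while every write is also sent to the store.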

.. toctree::
   :maxdepth: 1

   adj
   node
   graph
12 changes: 12 additions & 0 deletions doc/dict/node.rst
@@ -0,0 +1,12 @@
.. _node:

====
Node
====


.. currentmodule:: nx_arangodb.classes.dict.node
.. autoclass:: NodeDict

.. currentmodule:: nx_arangodb.classes.dict.node
.. autoclass:: NodeAttrDict
7 changes: 5 additions & 2 deletions doc/index.rst
@@ -98,7 +98,10 @@ of how to use NetworkX, refer to the `NetworkX Documentation <https://networkx.o
Expect documentation to grow over time:

.. toctree::
-   :maxdepth: 1
+   :maxdepth: 2

   quickstart
-   classes/index
+   classes/index
+   dict/index
+   algorithms/index
+   views/index
14 changes: 14 additions & 0 deletions doc/views/coreviews.rst
@@ -0,0 +1,14 @@
.. _coreviews:

=========
Coreviews
=========


.. currentmodule:: nx_arangodb.classes.coreviews
.. autoclass:: ArangoAdjacencyView
   :members:

.. currentmodule:: nx_arangodb.classes.coreviews
.. autoclass:: ArangoAtlasView
   :members:
33 changes: 33 additions & 0 deletions doc/views/index.rst
@@ -0,0 +1,33 @@
.. _views:

**************
ArangoDB Views
**************

Having a database as a backend to NetworkX allows us to delegate
certain operations to the database.

This can be applied to the concept of NetworkX Views.

Below is a set of experimental overrides of the NetworkX Views that represent the
nodes and edges of the graph. Overriding these classes allows us to
implement custom logic for data filtering and updating in the database.

These classes are a work-in-progress. The main goal is to try
to delegate data processing to ArangoDB, whenever possible.

To use these experimental views, you must set **use_arango_views=True**
when creating a new graph object:

.. code-block:: python

   import nx_arangodb as nxadb

   G = nxadb.Graph(name="MyGraph", use_arango_views=True)

.. toctree::
   :maxdepth: 1

   coreviews
   reportviews
22 changes: 22 additions & 0 deletions doc/views/reportviews.rst
@@ -0,0 +1,22 @@
.. _reportviews:

===========
Reportviews
===========


.. currentmodule:: nx_arangodb.classes.reportviews
.. autoclass:: ArangoNodeView
   :members:

.. currentmodule:: nx_arangodb.classes.reportviews
.. autoclass:: ArangoNodeDataView
   :members:

.. currentmodule:: nx_arangodb.classes.reportviews
.. autoclass:: ArangoEdgeView
   :members:

.. currentmodule:: nx_arangodb.classes.reportviews
.. autoclass:: ArangoEdgeDataView
   :members:
2 changes: 1 addition & 1 deletion nx_arangodb/algorithms/README.md
@@ -2,7 +2,7 @@

This is an experimental module seeking to provide server-side algorithms for `nx-arangodb` Graphs. The goal is to provide a set of algorithms that can be delegated to the server for processing, rather than having to pull all the data to the client and process it there.

- Currently, the module is in a very early stage and only provides a single algorithm: `shortestPath`. This is simply to demonstrate the potential of the module and to provide a starting point for further development.
+ Currently, the module is in a very early stage and only provides a single algorithm: `shortest_path`. This is simply to demonstrate the potential of the module and to provide a starting point for further development.

```python
import os