Chapter 8. ConfiguredGraphFactory

The JanusGraph Server can be configured to use the ConfiguredGraphFactory. The ConfiguredGraphFactory is an access point to your graphs, similar to the JanusGraphFactory. These graph factories provide methods for dynamically managing the graphs hosted on the server.

8.1. Overview

JanusGraphFactory is a class that provides an access point to your graphs; you supply a Configuration object each time you access a graph.

ConfiguredGraphFactory provides an access point to your graphs for which you have previously created configurations using the ConfigurationManagementGraph. It also offers an access point to manage graph configurations.

ConfigurationManagementGraph allows you to manage graph configurations.

JanusGraphManager is an internal server component that tracks graph references, provided your graphs are configured to use it.

8.2. ConfiguredGraphFactory versus JanusGraphFactory

There is one important distinction between these two graph factories:

  1. The ConfiguredGraphFactory can only be used if you have configured your server to use the ConfigurationManagementGraph APIs at server start.

The benefits of using the ConfiguredGraphFactory are that:

  1. You only need to supply a String to access your graphs, whereas the JanusGraphFactory requires you to specify information about the backend you wish to use every time you open a graph.
  2. If your ConfigurationManagementGraph is configured with a distributed storage backend, then your graph configurations are available to all JanusGraph nodes in your cluster.

8.3. How Does the ConfiguredGraphFactory Work?

The ConfiguredGraphFactory provides an access point to graphs under two scenarios:

  1. You have already created a configuration for your specific graph object using the ConfigurationManagementGraph#createConfiguration. In this scenario, your graph is opened using the previously created configuration for this graph.
  2. You have already created a template configuration using the ConfigurationManagementGraph#createTemplateConfiguration. In this scenario, we create a configuration for the graph you are creating by copying over all attributes stored in your template configuration and appending the relevant graphName attribute, and we then open the graph according to that specific configuration.
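
The two scenarios above can be illustrated with a small in-memory model. This is only a sketch of the behavior described in this section, not JanusGraph code: the class name ConfiguredFactorySketch, the use of plain maps in place of the ConfigurationManagementGraph, and the exception types are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the two scenarios; this is not JanusGraph code.
class ConfiguredFactorySketch {
    // Per-graph configurations keyed by graph name (stands in for the ConfigurationManagementGraph).
    private final Map<String, Map<String, Object>> configs = new HashMap<>();
    // The single template configuration, or null if none has been created yet.
    private Map<String, Object> template;

    void createConfiguration(Map<String, Object> config) {
        Object name = config.get("graph.graphname");
        if (name == null) {
            throw new IllegalArgumentException("configuration must include graph.graphname");
        }
        configs.put(name.toString(), new HashMap<>(config));
    }

    void createTemplateConfiguration(Map<String, Object> config) {
        if (config.containsKey("graph.graphname")) {
            throw new IllegalArgumentException("template may not contain graph.graphname");
        }
        template = new HashMap<>(config);
    }

    // Scenario 1: a configuration for this graph already exists, so it is used as-is.
    Map<String, Object> open(String graphName) {
        Map<String, Object> config = configs.get(graphName);
        if (config == null) {
            throw new IllegalStateException("no configuration for graph " + graphName);
        }
        return config;
    }

    // Scenario 2: copy the template, append graph.graphname, store the result, then open it.
    Map<String, Object> create(String graphName) {
        if (configs.containsKey(graphName)) {
            throw new IllegalStateException("configuration for graph " + graphName + " already exists");
        }
        if (template == null) {
            throw new IllegalStateException("no template configuration has been created");
        }
        Map<String, Object> config = new HashMap<>(template);
        config.put("graph.graphname", graphName);
        configs.put(graphName, config);
        return open(graphName);
    }

    public static void main(String[] args) {
        ConfiguredFactorySketch factory = new ConfiguredFactorySketch();
        Map<String, Object> template = new HashMap<>();
        template.put("storage.backend", "cassandrathrift");
        template.put("storage.hostname", "127.0.0.1");
        factory.createTemplateConfiguration(template);

        Map<String, Object> graph2 = factory.create("graph2");
        System.out.println(graph2.get("graph.graphname"));                  // prints graph2
        System.out.println(factory.open("graph2").get("storage.backend")); // prints cassandrathrift
    }
}
```

In this model, calling create a second time for the same name fails because a configuration already exists, which mirrors the rule that a graph is created once and opened afterwards.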

8.4. Accessing the Graphs

You can either use ConfiguredGraphFactory.create("graphName") or ConfiguredGraphFactory.open("graphName"). Learn more about the difference between these two options by reading the section below about the ConfigurationManagementGraph.

8.5. Listing the Graphs

ConfiguredGraphFactory.getGraphNames() will return a set of graph names for which you have created configurations using the ConfigurationManagementGraph APIs.

JanusGraphFactory.getGraphNames(), on the other hand, returns the set of names of graphs that you have instantiated and whose references are stored inside the JanusGraphManager.

8.6. Dropping a Graph

ConfiguredGraphFactory.drop("graphName") will drop the graph database, deleting all data in storage and indexing backends. The graph can be open or closed (will be closed as part of the drop operation). Furthermore, this will also remove any existing graph configuration in the ConfigurationManagementGraph.

Important

This is an irreversible operation that will delete all graph and index data.

Important

To ensure all graph representations are consistent across all JanusGraph nodes in your cluster, remove the graph from the JanusGraphManager graph reference tracker on every node in your cluster by calling ConfiguredGraphFactory.close("graphName").

8.7. Configuring JanusGraph Server for ConfiguredGraphFactory

To be able to use the ConfiguredGraphFactory, you must configure your server to use the ConfigurationManagementGraph APIs. To do this, inject a graph variable named "ConfigurationManagementGraph" into the graphs map of your server's YAML file. For example:

graphManager: org.janusgraph.graphdb.management.JanusGraphManager
graphs: {
  ConfigurationManagementGraph: conf/JanusGraph-configurationmanagement.properties
}

In this example, our ConfigurationManagementGraph graph will be configured using the properties stored inside conf/JanusGraph-configurationmanagement.properties, which might, for example, look like:

gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cassandrathrift
graph.graphname=ConfigurationManagementGraph
storage.hostname=127.0.0.1

Assuming the GremlinServer started successfully and the ConfigurationManagementGraph was successfully instantiated, then all the APIs available on the ConfigurationManagementGraph Singleton will also act upon said graph. Furthermore, this is the graph that will be used to access the configurations used to create/open graphs using the ConfiguredGraphFactory.

Important

The pom.xml included in the JanusGraph distribution lists org.apache.tinkerpop:gremlin-server as an optional dependency, but the ConfiguredGraphFactory makes use of the JanusGraphManager, which requires it. If you run into NoClassDefFoundError errors, declare this dependency explicitly in your own build.

8.8. ConfigurationManagementGraph

The ConfigurationManagementGraph is a Singleton that allows you to create/update/remove configurations that you can use to access your graphs using the ConfiguredGraphFactory. See above on configuring your server to enable use of these APIs.

Important

The ConfiguredGraphFactory offers an access point to the graph configurations managed by the ConfigurationManagementGraph, so instead of acting upon the Singleton itself, you may use the corresponding ConfiguredGraphFactory static methods. For example, you may use ConfiguredGraphFactory.removeTemplateConfiguration() instead of ConfigurationManagementGraph.getInstance().removeTemplateConfiguration().

8.8.1. Graph Configurations

The ConfigurationManagementGraph singleton allows you to create configurations used to open specific graphs, referenced by the graph.graphname property. For example:

Map<String, Object> map = new HashMap<String, Object>();
map.put("storage.backend", "cassandrathrift");
map.put("storage.hostname", "127.0.0.1");
map.put("graph.graphname", "graph1");
ConfiguredGraphFactory.createConfiguration(new MapConfiguration(map));

Then you could access this graph on any JanusGraph node using:

ConfiguredGraphFactory.open("graph1");

8.8.2. Template Configuration

The ConfigurationManagementGraph also allows you to create one template configuration, which you can use to create many graphs using the same configuration template. For example:

Map<String, Object> map = new HashMap<String, Object>();
map.put("storage.backend", "cassandrathrift");
map.put("storage.hostname", "127.0.0.1");
ConfiguredGraphFactory.createTemplateConfiguration(new MapConfiguration(map));

After doing this, you can create graphs using the template configuration:

ConfiguredGraphFactory.create("graph2");

This method will first create a new configuration for "graph2" by copying over all the properties associated with the template configuration and storing them in a configuration for this specific graph. This means that this graph can be accessed in the future, on any JanusGraph node, by doing:

ConfiguredGraphFactory.open("graph2");

8.8.3. Updating Configurations

All interactions with both the JanusGraphFactory and the ConfiguredGraphFactory that involve configurations defining the property graph.graphname go through the JanusGraphManager, which keeps track of graph references created on the given JVM. Think of it as a graph cache. For this reason:

Important

Any updates to a configuration are not guaranteed to take effect until you close the graph in question on every JanusGraph node in your cluster.

You can do so by calling:

ConfiguredGraphFactory.close("graph2");

Because a graph created from the template configuration gets its own configuration, copied from the template at creation time:

Important

Any updates to the template configuration are not guaranteed to take effect on a graph already created from it until:

  1. The relevant configuration is removed: ConfiguredGraphFactory.removeConfiguration("graph2");
  2. The graph in question has been closed on every JanusGraph node: ConfiguredGraphFactory.close("graph2");
  3. The graph is recreated using the template configuration: ConfiguredGraphFactory.create("graph2");

8.8.4. Update Examples

1) We migrated our Cassandra data to a new server with a new IP address:

Map map = new HashMap();
map.put("storage.backend", "cassandrathrift");
map.put("storage.hostname", "127.0.0.1");
map.put("graph.graphname", "graph1");
ConfiguredGraphFactory.createConfiguration(new MapConfiguration(map));

def g1 = ConfiguredGraphFactory.open("graph1");

// Update configuration (only the changed property is supplied)
map = new HashMap();
map.put("storage.hostname", "10.0.0.1");
ConfiguredGraphFactory.updateConfiguration("graph1", new MapConfiguration(map));

// Close graph
ConfiguredGraphFactory.close("graph1");

// We are now guaranteed to use the updated configuration
g1 = ConfiguredGraphFactory.open("graph1");

2) We added an Elasticsearch node to our setup:

Map map = new HashMap();
map.put("storage.backend", "cassandrathrift");
map.put("storage.hostname", "127.0.0.1");
map.put("graph.graphname", "graph1");
ConfiguredGraphFactory.createConfiguration(new MapConfiguration(map));

def g1 = ConfiguredGraphFactory.open("graph1");

// Update configuration (add the index backend properties)
map = new HashMap();
map.put("index.search.backend", "elasticsearch");
map.put("index.search.hostname", "127.0.0.1");
map.put("index.search.elasticsearch.transport-scheme", "http");
ConfiguredGraphFactory.updateConfiguration("graph1", new MapConfiguration(map));

// Close graph
ConfiguredGraphFactory.close("graph1");

// We are now guaranteed to use the updated configuration
g1 = ConfiguredGraphFactory.open("graph1");

3) Update a graph configuration that was created using a template configuration that has been updated:

Map map = new HashMap();
map.put("storage.backend", "cassandrathrift");
map.put("storage.hostname", "127.0.0.1");
ConfiguredGraphFactory.createTemplateConfiguration(new MapConfiguration(map));

def g1 = ConfiguredGraphFactory.create("graph1");

// Update template configuration
map = new HashMap();
map.put("index.search.backend", "elasticsearch");
map.put("index.search.hostname", "127.0.0.1");
map.put("index.search.elasticsearch.transport-scheme", "http");
ConfiguredGraphFactory.updateTemplateConfiguration(new MapConfiguration(map));

// Remove Configuration
ConfiguredGraphFactory.removeConfiguration("graph1");

// Close graph on all JanusGraph nodes
ConfiguredGraphFactory.close("graph1");

// Recreate
g1 = ConfiguredGraphFactory.create("graph1");
// Now this graph's configuration is guaranteed to be updated

8.9. JanusGraphManager

The JanusGraphManager is a Singleton adhering to the TinkerPop GraphManager specification.

In particular, the JanusGraphManager provides:

  1. a coordinated mechanism by which to instantiate graph references on a given JanusGraph node
  2. a graph reference tracker (or cache)

Any graph you create using the graph.graphname property will go through the JanusGraphManager and thus be instantiated in a coordinated fashion. The graph reference will also be placed in the graph cache on the JVM in question.

Thus, any graph you open using the graph.graphname property that has already been instantiated on the JVM in question will be retrieved from the graph cache.

This is why updates to your configurations require a few steps to guarantee correctness.
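
The effect of the graph cache on configuration updates can be illustrated with a minimal model. This is a sketch under stated assumptions rather than the JanusGraphManager implementation: the class GraphCacheSketch is hypothetical, a map of properties stands in for an opened graph, and a shared map stands in for the stored configurations.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of one JanusGraph node with a per-JVM graph cache; not JanusGraph code.
class GraphCacheSketch {
    // Stored configurations shared by all nodes (stands in for the ConfigurationManagementGraph).
    private final Map<String, Map<String, String>> sharedConfigs;
    // Graph references already instantiated on this node, keyed by graph name.
    private final Map<String, Map<String, String>> cache = new HashMap<>();

    GraphCacheSketch(Map<String, Map<String, String>> sharedConfigs) {
        this.sharedConfigs = sharedConfigs;
    }

    // Returns the cached reference if one exists; otherwise instantiates from the stored configuration.
    Map<String, String> open(String graphName) {
        Map<String, String> cached = cache.get(graphName);
        if (cached != null) {
            return cached;
        }
        Map<String, String> config = sharedConfigs.get(graphName);
        if (config == null) {
            throw new IllegalStateException("no configuration for graph " + graphName);
        }
        // The "graph" is modeled as a snapshot of the configuration taken at open time.
        Map<String, String> graph = new HashMap<>(config);
        cache.put(graphName, graph);
        return graph;
    }

    // Removes the reference from this node's cache so the next open re-reads the configuration.
    void close(String graphName) {
        cache.remove(graphName);
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> shared = new HashMap<>();
        Map<String, String> config = new HashMap<>();
        config.put("storage.hostname", "127.0.0.1");
        shared.put("graph1", config);

        GraphCacheSketch node = new GraphCacheSketch(shared);
        node.open("graph1");

        // Update the stored configuration while the graph is still cached on this node.
        shared.get("graph1").put("storage.hostname", "10.0.0.1");
        System.out.println(node.open("graph1").get("storage.hostname")); // prints 127.0.0.1

        node.close("graph1");
        System.out.println(node.open("graph1").get("storage.hostname")); // prints 10.0.0.1
    }
}
```

Because each node holds its own cache in this model, the close step has to be repeated on every node before the update is visible everywhere.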

8.9.1. How To Use The JanusGraphManager

The graph.graphname property is a new configuration option that you can set in any configuration that defines how to access a graph. All configurations that include this property will result in the graph being instantiated through the JanusGraphManager (the process explained above).

For backwards compatibility, any graphs that do not supply this property but are listed at server start in the graphs object of your .yaml file will also be bound through the JanusGraphManager, under the key supplied for that graph. For example, if your .yaml graphs object looks like:

graphManager: org.janusgraph.graphdb.management.JanusGraphManager
graphs: {
  graph1: conf/graph1.properties,
  graph2: conf/graph2.properties
}

but conf/graph1.properties and conf/graph2.properties do not include the property graph.graphname, then these graphs will be stored in the JanusGraphManager and thus bound in your gremlin script executions as graph1 and graph2, respectively.

8.9.2. Important

For convenience, if the configuration used to open a graph specifies graph.graphname but does not specify the backend's storage directory, tablename, or keyspacename, then the relevant parameter will automatically be set to the value of graph.graphname. However, if you supply one of those parameters, that value will always take precedence. If you supply neither graph.graphname nor the parameter, the parameter falls back to the configuration option's default value.

One special case is the storage.root configuration option. This is a new configuration option used to specify the base directory for any backend requiring local storage directory access. If you supply this parameter, you must also supply the graph.graphname property, and the absolute storage directory will be equal to the value of the graph.graphname property appended to the value of the storage.root property.
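
The resolution rules above can be sketched for the storage directory case as follows. This is an illustrative model, not the JanusGraph implementation: the class StorageLocationSketch and its method are hypothetical, storage.directory is assumed to be the option that holds an explicit storage directory, and "/" is assumed as the path separator.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of how the effective storage directory could be derived; not JanusGraph code.
class StorageLocationSketch {
    // Returns the effective directory, or null when the option's default value would apply.
    static String resolveDirectory(Map<String, String> config) {
        String explicit = config.get("storage.directory"); // assumed explicit option name
        String root = config.get("storage.root");
        String graphName = config.get("graph.graphname");

        // storage.root is only meaningful together with graph.graphname.
        if (root != null && graphName == null) {
            throw new IllegalArgumentException("storage.root requires graph.graphname");
        }
        // An explicitly supplied value always takes precedence.
        if (explicit != null) {
            return explicit;
        }
        // storage.root: append the graph name to the base directory.
        if (root != null) {
            return root.endsWith("/") ? root + graphName : root + "/" + graphName;
        }
        // Otherwise fall back to the graph name itself (null means: use the option's default).
        return graphName;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("storage.backend", "berkeleyje");
        config.put("storage.root", "/data/graphs");
        config.put("graph.graphname", "graph1");
        System.out.println(resolveDirectory(config)); // prints /data/graphs/graph1
    }
}
```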

Below are some example use cases:

1) Create a template configuration for my Cassandra backend such that each graph created using this configuration gets a unique keyspace equivalent to the String <graphName> provided to the factory:

Map map = new HashMap();
map.put("storage.backend", "cassandrathrift");
map.put("storage.hostname", "127.0.0.1");
ConfiguredGraphFactory.createTemplateConfiguration(new MapConfiguration(map));

def g1 = ConfiguredGraphFactory.create("graph1"); //keyspace === graph1
def g2 = ConfiguredGraphFactory.create("graph2"); //keyspace === graph2
def g3 = ConfiguredGraphFactory.create("graph3"); //keyspace === graph3

2) Create a template configuration for my BerkeleyJE backend such that each graph created using this configuration gets a unique storage directory equivalent to the "<storage.root>/<graph.graphname>":

Map map = new HashMap();
map.put("storage.backend", "berkeleyje");
map.put("storage.root", "/data/graphs");
ConfiguredGraphFactory.createTemplateConfiguration(new MapConfiguration(map));

def g1 = ConfiguredGraphFactory.create("graph1"); //storage directory === /data/graphs/graph1
def g2 = ConfiguredGraphFactory.create("graph2"); //storage directory === /data/graphs/graph2
def g3 = ConfiguredGraphFactory.create("graph3"); //storage directory === /data/graphs/graph3

8.10. Examples

gremlin> :remote connect tinkerpop.server conf/remote.yaml
==>Configured localhost/127.0.0.1:8182

gremlin> :remote console
==>All scripts will now be sent to Gremlin Server - [localhost:8182]-[5206cdde-b231-41fa-9e6c-69feac0fe2b2] - type ':remote console' to return to local mode

gremlin> ConfiguredGraphFactory.open("graph");
Please create configuration for this graph using the ConfigurationManagementGraph API.

gremlin> ConfiguredGraphFactory.create("graph");
Please create a template Configuration using the ConfigurationManagementGraph API.

gremlin> Map map = new HashMap(); map.put("storage.backend", "cassandrathrift"); map.put("storage.hostname", "127.0.0.1"); map.put("GraphName", "graph1"); ConfiguredGraphFactory.createConfiguration(new MapConfiguration(map));
Please include in your configuration the property "graph.graphname".

gremlin> Map map = new HashMap(); map.put("storage.backend", "cassandrathrift"); map.put("storage.hostname", "127.0.0.1"); map.put("graph.graphname", "graph1"); ConfiguredGraphFactory.createConfiguration(new MapConfiguration(map));
==>null

gremlin> ConfiguredGraphFactory.open("graph1").vertices();

gremlin> Map map = new HashMap(); map.put("storage.backend", "cassandrathrift"); map.put("storage.hostname", "127.0.0.1"); map.put("graph.graphname", "graph1"); ConfiguredGraphFactory.createTemplateConfiguration(new MapConfiguration(map));
Your template configuration may not contain the property "graph.graphname".

gremlin> Map map = new HashMap(); map.put("storage.backend", "cassandrathrift"); map.put("storage.hostname", "127.0.0.1"); ConfiguredGraphFactory.createTemplateConfiguration(new MapConfiguration(map));
==>null

// Each graph is now acting in unique keyspaces equivalent to the graphnames.
gremlin> def g1 = ConfiguredGraphFactory.open("graph1"); def g2 = ConfiguredGraphFactory.create("graph2"); def g3 = ConfiguredGraphFactory.create("graph3"); g2.addVertex(); l = []; l << g1.vertices().size(); l << g2.vertices().size(); l << g3.vertices().size(); l;
==>0
==>1
==>0

// After a graph is created, you must access it using .open()
gremlin> def g2 = ConfiguredGraphFactory.create("graph2"); g2.vertices().size();
Configuration for graph "graph2" already exists.

gremlin> def g2 = ConfiguredGraphFactory.open("graph2"); g2.vertices().size();
==>1