A Query’s Journey: From Ad-Hoc to Standing

Like any epic story, every Quine developer’s journey follows a well-trodden path.

The Hero’s Journey by Esbjorn Jorsater, CC BY-SA 4.0, via Wikimedia Commons

Similarly, a query’s journey in Quine begins with a question to be answered and finishes with a fully formed standing query that answers that question. This post covers what happens in between, including tools and techniques that make the process as painless as possible.

  • The Call to Adventure → Business Question to Answer
  • Mentor → Recruiting SME(s) to understand the problem
  • Crossing the First Threshold → Utilizing the Exploration UI
  • Ally → Generating Ad-Hoc Queries
  • Tests and Enemies → Generate Quick Queries
  • The Ordeal → Iterating through Quick Queries with SME(s)
  • Reward → Expected Subgraph
  • The Road Back → Standing Queries based on Quick Queries

Query Development

Let’s walk the hero’s path, introducing the tools and techniques we used while developing the Financial Risk Calculation recipe.

Query development is an iterative process that is as much about asking questions as it is about finding answers. After all, business questions solved by standing queries don’t emerge full-grown from a developer’s forehead. The Hero’s Journey gives our story its metaphor; for a more pragmatic view, we can visualize the iterative query process as a workflow.

Query Development Workflow

Business Question to be Answered

As we embark on our journey, our first step is to refine a set of technically vague business requirements and turn them into questions that can be answered using data. For example, consider the core question behind the Financial Risk Calculation recipe.

“How do we identify if we are out-of-compliance with Basel III Liquidity Coverage Ratio (LCR) in real time?”

As we progress, an SME (Subject Matter Expert) joins forces with an analytics engineer to explain how LCR is calculated and what it means.

Now, let’s get a view of the data. This involves acquiring a sample set of records from the source data, saved in a file you will use to test your queries. The path of least resistance is to ingest the file into Quine and manifest a node in your graph for each record in the sample data.

With the data in hand, it’s time to build a data model to reference while answering questions. The model should be expressed as a graph (a tree). This is where the SMEs come into play again. It is critical that the data model provides a good representation of the data objects and relationships. The result is a model viewable in the Quine Exploration UI, allowing us to interact with the data as expressed in the nodes (and their properties) and edges via ad-hoc queries.

The Hierarchical Data Model

Tools & Techniques

Before we move on to the queries specific to the question you are looking to answer, here are some tools and techniques that we find helpful.

For interactive analysis, Quick Queries are your friend! 🥰

Some quick queries are universally useful during development. Here are a few of them, along with ways to use them in Quine’s Exploration UI.

Label each query’s output type (Node or Text) in its name.

This technique is helpful for recognizing the class of output to expect in the Exploration UI.

Group classes of queries using a self-documenting helper query

Note that the `querySuffix` returns static text.

Add an invisible first query

Those who have used Quine for some time will know that double-clicking a node in the Exploration UI triggers the first quick query. This raises one minor issue: if we separate queries with a helper query, as described in the second technique above, then double-clicking a node returns the separator’s fixed output description.

To overcome this, we add an invisible query. Since the Adjacent Nodes query is the first query in the default configuration, we will assign the same functionality to our hidden query.

Note that this query is the first in the list.

Quick Queries to navigate reify.time()

The following examples are universal for navigating time nodes as manifested by reify.time(). The first is tied to the seconds time node. It shows the entire time hierarchy up from a second node. You’ll want to remove any periods not manifested by reify.time() based on your configuration.

The second query is an example, specific to time nodes, of creating a query to traverse up your node tree. In this case, from seconds to minutes.

This query can be extended to any time node class. For example, one for minutes to hours would look like this:

Or the query can be generalized for any node type.

Create queries to validate the accuracy of any logic that will be a part of a standing query.

It’s not a given that your Cypher logic accurately maps the intended business logic onto the query.

In these examples, we created a table to display all of the transactions grouped by desk.

You’ll note that we are displaying something labeled `VALUE_LOG`. This is enabled by including a log of all the investment value changes in an array.

Created during ingest by:

Notice that there are a number of additional properties that are generated for external validation (e.g., run, lastAdjustedValue, etc.). We would typically remove these before pushing the recipe to production.
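The idea behind the value log can be modeled outside of Quine. The following Python sketch is not the recipe's ingest query. It is a hypothetical stand-in, with assumed names (`Investment`, `apply_value`, `value_log`, `replay`), that shows why keeping every value change in an array is useful for validation: the changes can be replayed and compared with the rolling property.

```python
from dataclasses import dataclass, field


@dataclass
class Investment:
    """Hypothetical stand-in for an investment node and its properties."""
    adjusted_value: float = 0.0
    value_log: list = field(default_factory=list)  # every change, in arrival order

    def apply_value(self, new_value: float) -> None:
        # Record the change before overwriting, so the history can be replayed.
        self.value_log.append(new_value - self.adjusted_value)
        self.adjusted_value = new_value


def replay(log: list) -> float:
    """Rebuild the current value from the logged changes alone."""
    return sum(log)


inv = Investment()
for value in (100.0, 250.0, 175.0):
    inv.apply_value(value)

# The replayed total should agree with the rolling property.
print(inv.value_log, replay(inv.value_log) == inv.adjusted_value)
```

If the replayed total and the rolling value ever disagree, the update logic is the first place to look. As with the extra validation properties, a log like this would typically be dropped before production.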

Node appearances help you visualize your data.

Utilize icons (or even just colors) to help you quickly visually differentiate nodes in the Exploration UI. The full set that’s available to use is displayed here.

With all of this out of the way, there is one more query that we’ve found useful. In this case, it’s a sample query.

This will display one of each node class by label.

An equivalent query can use node properties (specifically ones intended to identify the class of node) rather than labels. In this example, we have a node, foo, with a property, bar, that identifies its class.

With these techniques and helper queries, you can more easily iterate through quick query generation, proving out, step by step (query by query), the flow that will eventually represent the logic of the associated standing query. Note that one or more quick queries executed on specific nodes in a particular order may be used to show this logic. This is a chance to engage your SME again to ensure you’ve captured the logic you targeted.

For example, continuing with the Financial Risk recipe, we want to ensure that the total adjusted value is correct at both the desk and institution levels, since we need these totals to calculate the investment composition for alerting. To do this, we generated an aggregating query.

Selecting the “Institutions Investments Total Value” quick query …

Runs this query:

To produce the `totalAdjustedValue`:

Here the quick query shows that the rolling aggregation of the adjusted values is correct. We create similar quick queries for the Class 2, 2a, and 2b investments to ensure we’re ready to create the associated standing query, which calculates the composition of Class 2 and 2b investments relative to the total investments.
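One way to gain confidence in a rolling aggregation is to recompute it independently from the sample file and compare. The Python sketch below is an illustration under assumptions: the rows, desk names, and the helpers `roll_up` and `matches` are hypothetical, and `observed_total` stands in for the `totalAdjustedValue` that the quick query displays.

```python
from collections import defaultdict

# Hypothetical sample rows: (desk, investment class, adjusted value).
transactions = [
    ("desk-1", "1", 500.0),
    ("desk-1", "2a", 200.0),
    ("desk-2", "2b", 100.0),
    ("desk-2", "1", 200.0),
]


def roll_up(rows):
    """Return (total per desk, institution total) for the given rows."""
    per_desk = defaultdict(float)
    for desk, _cls, value in rows:
        per_desk[desk] += value
    return dict(per_desk), sum(per_desk.values())


def matches(expected: float, observed: float, tolerance: float = 1e-6) -> bool:
    """Compare an independently computed total with the one the graph reports."""
    return abs(expected - observed) <= tolerance


per_desk, institution_total = roll_up(transactions)
# Assumed value read off the quick query output in the Exploration UI.
observed_total = 1000.0
print(per_desk, institution_total, matches(institution_total, observed_total))
```

The same comparison can be repeated per desk, which helps localize a mismatch to one branch of the hierarchy.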

A Quick Query becomes a Standing Query

A standing query in Quine has two parts: a pattern match and a series of output actions that can update the graph, emit an output, or produce a new event stream.

The quick queries from above were developed while building STANDING-2 in the Financial Risk Calculation recipe.

The standing query serves three purposes:

  1. Aggregating the total adjusted values of investments to both the desk and institution levels;
  2. Aggregating the per-class adjusted values of investments to both the desk and institution levels; and
  3. Calculating the Class 2 and 2B composition of the total adjusted values at the institution level.
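The third purpose, the composition calculation, reduces to simple arithmetic once the per-class totals exist. Here is a minimal Python sketch of that arithmetic, not the recipe's Cypher. It assumes that Class 2 is the sum of the 2a and 2b totals, and the class keys and the `composition` helper are hypothetical.

```python
def composition(class_totals: dict) -> dict:
    """Share of Class 2 (assumed to be 2a + 2b) and of Class 2b in the total."""
    total = sum(class_totals.values())
    if total == 0:
        # No investments yet, so there is nothing to compose.
        return {"class2": 0.0, "class2b": 0.0}
    class2b = class_totals.get("2b", 0.0)
    class2 = class_totals.get("2a", 0.0) + class2b
    return {"class2": class2 / total, "class2b": class2b / total}


# Hypothetical institution-level adjusted totals per investment class.
institution = {"1": 600.0, "2a": 300.0, "2b": 100.0}
shares = composition(institution)
print(shares)
```

Guarding against a zero total matters in a streaming setting, because the first events may arrive before any value has been aggregated.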

The Cypher statements developed for the quick queries above are reusable in the standing query to update the graph as new events stream in.

We can compare the standing query output to the quick query we used to develop the standing query.

Victory is ours! Time to profit!

But, just one more thing …

Quick queries are a powerful tool for manually analyzing a graph once it is loaded with events. Still, we are building a streaming solution that needs to perform triage analysis independently.

A standing query can run business logic against the graph to produce alerts triggering a deeper analysis. Given this decision tree, the Financial Risk Calculation recipe produces an output stream of alerts for follow-up.
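To make the triage step concrete, here is a hedged Python sketch of threshold-based alerting. It is not the recipe's decision tree, which is not reproduced here. The thresholds are an assumption that mirrors the Basel III caps on the stock of high-quality liquid assets (Level 2 at most 40%, Level 2B at most 15%), and the `triage` function and alert strings are hypothetical.

```python
# Assumed thresholds, mirroring the Basel III caps on Level 2 and Level 2B assets.
CLASS2_CAP = 0.40
CLASS2B_CAP = 0.15


def triage(class2_share: float, class2b_share: float) -> list:
    """Return the alerts raised for one institution-level composition."""
    alerts = []
    if class2_share > CLASS2_CAP:
        alerts.append("class 2 composition above cap")
    if class2b_share > CLASS2B_CAP:
        alerts.append("class 2b composition above cap")
    return alerts


# A compliant composition raises nothing; the second one raises both alerts.
for shares in [(0.35, 0.10), (0.45, 0.20)]:
    print(shares, triage(*shares))
```

In a streaming setup, a check like this would run each time the composition changes, so that only the cases needing follow-up reach a person.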

Conclusion

Embark on your own journey to become a Quine query master!

Following the tools and techniques outlined in this article can enhance your data analysis skills and boost productivity. Whether you’re a data analyst, developer, or business professional, practicing these strategies will empower you to handle complex queries easily. Embrace the power of Quine’s Exploration UI and Quick Queries to iterate through your development process efficiently.

Your data-driven adventures await!

Next Steps

If you want to try Quine using your own data, here are some resources to help:

  1. Learn more about Quine by visiting the Quine open source project.
  2. Download Quine – JAR file | Docker Image | GitHub
  3. Check out the Financial Risk Calculation recipe to see how Cypher is used for real-time rollups.
  4. Check out demos and other videos on our YouTube channel.
