Quine Streaming Graph 1.3.0: Focus on Usability, Query Performance
Performant Pagination at Scale, Improved Querying and User Docs, Advanced Recipes
It is hard to believe we released Quine 1.2.0 only six weeks ago, especially when I look at the work that has gone into not just Quine but also the documentation, how-to blogs, and example recipes. Indeed, 1.3.0 cements a pattern that has emerged since we released Quine as an open source project in February: each release combines a few features needed to achieve performance at scale with loads of smaller usability improvements.
Additions to Quine include vastly improved pagination performance inside our Cypher compiler, overhauled API documentation, journals enabled by default when running recipes, improved Cypher query support, and a number of small but consequential changes to the system’s logging behavior.
In addition, we’ve migrated the documentation to its own site so that community contributions are easier and docs stay in sync with releases, added three new recipes, and made substantial updates to one of the favorites.
The common theme throughout: usability and performance.
Pagination in Quine Streaming Graph
As part of our work to make all aspects of the system perform predictably and well at throughput rates of hundreds of thousands or even millions of events per second, we have undertaken some plumbing upgrades.
To give you an idea of the engineering involved, check out this blog post about the three most common pagination approaches (or I can save you time and tell you it explains page-, point-, and keyset-based pagination). We combined aspects of all three in our approach.
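To make the difference between those approaches concrete, here is a minimal Python sketch contrasting page (offset) pagination with keyset pagination over a sorted list. It is an illustration only, not Quine's implementation: the `ROWS` data set and the function names `page_by_offset` and `page_by_keyset` are hypothetical.

```python
import bisect

# Hypothetical data set: rows sorted by a unique, increasing id.
ROWS = [(i, f"event-{i}") for i in range(1, 11)]
IDS = [row[0] for row in ROWS]


def page_by_offset(rows, skip, limit):
    """Offset (SKIP/LIMIT) pagination: step past `skip` rows, then take `limit`.

    In a real engine the skipped rows still have to be produced and
    discarded, so cost grows with the offset.
    """
    return rows[skip:skip + limit]


def page_by_keyset(rows, ids, after_id, limit):
    """Keyset pagination: resume strictly after the last id already seen.

    The binary search stands in for an index seek, so the cost does not
    depend on how deep into the result set the page is.
    """
    start = 0 if after_id is None else bisect.bisect_right(ids, after_id)
    return rows[start:start + limit]


# Both calls return the rows with ids 4, 5, and 6.
second_page_offset = page_by_offset(ROWS, skip=3, limit=3)
second_page_keyset = page_by_keyset(ROWS, IDS, after_id=3, limit=3)
```

The offset form is the easiest to express in a query (it maps directly onto SKIP/LIMIT), while the keyset form trades that convenience for predictable cost on deep pages.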
The other notable work focused on usability and community enablement.
Quine and API documentation plus Improved Usability
We switched to the Stoplight Elements framework to make API documentation easier to access and migrated from quine.io/docs to docs.quine.io. Not huge changes in themselves, but together they make the docs easier for the community to modify and help ensure they never lag behind releases.
Ingest Streams from Kafka and other Sources
We also completed five blog posts on ingesting streams, ranging from simple CSV files to internet feeds to Kafka integration.
- Real-time Graph Analytics for Kafka Streams with Quine
- Building a Quine Streaming Graph: Ingest Streams
- Ingesting data from the Internet into Quine Streaming Graph
- Ingesting From Multiple Data Sources into Quine Streaming Graph
- Ingest and Analyze Log Files Using Streaming Graph
Live event stream, log, and network observability recipes
And of course when we wrote an explainer on ingesting and processing log files, we couldn’t resist a recipe that uses Quine logs as the source. We all know that consuming, parsing, and visualizing Java log output is a huge challenge, one that lacks a widely available solution. We think Quine might be an answer. Use the Quine Log Recipe as a baseline, then modify the regular expression inside the ingest stream Cypher query to fit your logs.
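If you want to experiment with a regular expression before editing the recipe, a quick way is to prototype it outside Quine first. The Python sketch below is only an illustration: the log line layout (timestamp, level, thread, logger, message) is an assumed logback-style format, not the pattern used in the Quine Log Recipe, and the names `LOG_LINE` and `parse_line` are hypothetical.

```python
import re

# Assumed line layout (hypothetical, adjust to your own logs):
#   2022-07-14 10:15:30,123 INFO [main] com.example.App - Started
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) +"
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<logger>\S+) - "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the named fields of a log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


sample = "2022-07-14 10:15:30,123 INFO [main] com.example.App - Started"
fields = parse_line(sample)
```

Once the expression captures the fields you care about, you can carry the same pattern over to the ingest stream query, keeping in mind that regex dialects differ slightly between Python and the JVM (named groups, for example, are written differently).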
In addition to the Quine Java log ingest recipe, we’ve created a recipe showing how to ingest and build a streaming graph from a feed of IMDB movie data. (For anyone really interested in log processing, there’s also an Apache web logs analytics recipe.) Rounding out the trio of new recipes is a fun one: Ethan’s Pi Day recipe, which uses Quine to calculate Pi using Leibniz’s formula.
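For reference, Leibniz’s formula is the alternating series π/4 = 1 − 1/3 + 1/5 − 1/7 + …. The short Python sketch below sums it directly; it shows the arithmetic only and is not how the Pi Day recipe is written (the function name `leibniz_pi` is hypothetical).

```python
import math


def leibniz_pi(terms):
    """Approximate pi by summing the first `terms` terms of the Leibniz series."""
    total = 0.0
    for k in range(terms):
        # The k-th term is (-1)^k / (2k + 1): 1, -1/3, 1/5, -1/7, ...
        total += (-1) ** k / (2 * k + 1)
    return 4 * total


approximation = leibniz_pi(100_000)
error = abs(approximation - math.pi)
```

The series converges slowly: the error after N terms is roughly 1/N, so many terms are needed for each additional digit.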
On the topic of observability and root cause analysis, the CDN Cache Efficiency recipe got a major update:
- Moved shaping the graph from standing queries into the ingest stream.
- Updated code to reflect Cypher best practices.
- Added quick queries to perform efficiency calculations.
- Optimized the manifestation of nodes.
- Added client device nodes.
- Increased the data sample size.
Quine Synthetic Data Generator
With Quine v1.3.0 we also introduced a powerful series of built-in synthetic data Cypher functions. The synthetic data functions can be used within ingest streams to create booleans, bytes, floats, integers, strings, or nodes. This allows you to generate streaming synthetic data that can be used for testing or development purposes.
To see how to use the functions, search for `gen.` on the Cypher Functions page of docs.quine.io.
Next Up
Quine is open source if you want to explore standing queries for yourself using your own data. Download a precompiled version or build it yourself from the Quine GitHub codebase.
Have a question, suggestion, or improvement? I welcome your feedback! Please drop into Quine Slack and let me know. I’m always happy to discuss Quine or answer questions.
Release Notes:
Release Quine 1.3.0
Features:
- Added a pagination (SKIP/LIMIT) optimizer to the Cypher query engine for
historical queries with no unaliased values (#1822)
- Enabled journals by default when running a recipe (#1814)
- Added support for using the Stoplight Elements interactive documentation
behind an authentication proxy (#1781)
Bugfixes:
- Fixed an issue where waking up a node would not correctly re-register its
standing queries, potentially resulting in dropped results (#1830)
- Fixed an issue where Cypher subqueries could be executed with too many
variables in scope (#1821)
- Fixed an issue where some Cypher constructs (notably: variable-length
relationship patterns) could be executed with too many variables in scope (#1821)
- Fixed a documentation rendering issue for Standing Query Outputs (#1815)
- Renamed the metric "persistors.snapshot-sizes" to "persistor.snapshot-sizes"
for consistency (#1788)
- Fixed the behavior of DISTINCT during Cypher query execution, making
it work correctly with SKIP and/or LIMIT (#1777)
Misc:
- Simplified startup log messages (#1831)
- Updated some error messages to use the correct name for DistinctId
Standing Queries (#1796)
- Improved UX for API-issued historical queries near the present
time (#1786, #1789)
- Removed logback-config logging library: to configure logging, use standard
logback.xml (#1754)
- Added timestamps to node journal events in debug.node and node
debug APIs (#1741)
- Removed StandingQueryPattern.Graph API (#1795)
- Improved distribution of randomly-generated partitioned IDs (#1801)
- Documented metrics endpoint in openapi specification (#1792)
- Added peephole optimization for property value comparison (#1783)
- Refactored to simplify DomainGraphBranch representation (#1771)
Updates:
- rocksdbjni to 7.3.1 (#1825)
- msgpack-core to 0.9.2 (#1824)
- cats-core to 2.8.0 (#1826)
- metrics to 4.2.10 (#1823)
- scala-library to 2.12.16
- sbt-paradox to 0.10.2 (#1809)
- sbt-scalafix to 0.10.1 (#1808)
- scala-java-time to 2.4.0 (#1798)
==== Quine Enterprise Additions ====
Release Quine Enterprise 1.3.0
Misc:
- Removed hydrolix persistor (#1739)
Updates:
- proguard-base to 7.2.2
- scala-logging to 3.9.5 (#1776)
- classgraph to 4.8.147 (#1784)
==== Quine.io / docs.thatdot.com: Probably not in release notes ====
* 1471b8201 Fixed typo in Kinesis section (#1829)
* e216d1475 Resolve left nav issue on docs page (#1819)
* 60fc048b5 updated the social link to a community invite (#1816)
* 72fe04a2d Added 3d data tutorial (#1806)
* 01061a434 initial quine log recipe commit (#1802)
* 96dba566b Added the movieData recipe. (#1787)
* 6330c807f (query-manager-fiddling) 1.2-docs-bugFix (#1758)
* b98b791d0 Refactor site to use - instead of _ in urls (#1772)