Streaming Graph ETL

Real-time ETL on streaming data, especially when data is received from multiple sources, requires new tools to accommodate out-of-order data arrival and entity resolution while operating at the scale of today's cloud, CDN, and application event volumes.

The Problem

Most ETL tools use the batch processing paradigm to find high-value patterns in large volumes of data. Whether the specific business application is fraud detection, cyber security, network observability, e-commerce or ad targeting, batch processing translates into delay. Even if you are processing data in small batches, you are missing opportunities to react to events as they happen and shape outcomes in ways beneficial to your business.

A great example is insider trading. Detecting someone who is about to execute an insider trade costs far less than trying to unwind that trade later, after batch processing picks it up. Even if the batch process runs every five minutes, you will only find the trade sooner, not stop it. Ultimately, batch processing leaves you with the costly reversal of transactions, whereas stream processing lets you stop them in real time.

The Solution

Streaming ETL using Quine means not just knowing about events but acting on them as they occur. Use Quine's ingest queries to materialize event data as a graph, which can express and query complex relationships between seemingly unrelated data. Then use Quine's standing queries to monitor for key patterns (e.g. those indicating that a fraudulent transaction or cyber attack is underway) and take action when those patterns emerge.
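
To make the ingest-then-monitor flow concrete, here is a minimal Python sketch of the idea. It is a toy in-memory model, not Quine's API: the StreamingGraph class, the login event fields, and the credential_stuffing pattern are all hypothetical names chosen for illustration. Each event is materialized as nodes and an edge, and every registered pattern is re-checked around the node that was just touched, so the action fires at the moment the pattern completes.

```python
from collections import defaultdict


class StreamingGraph:
    """Toy in-memory graph used only for illustration; not Quine's API."""

    def __init__(self):
        self.props = defaultdict(dict)   # node id -> properties
        self.edges = defaultdict(set)    # (node id, edge label) -> neighbor ids
        self.standing = []               # registered (pattern, action) pairs
        self.reported = set()            # matches that already triggered an action

    def standing_query(self, pattern, action):
        """Register a pattern to watch and an action to run on each new match."""
        self.standing.append((pattern, action))

    def ingest(self, event):
        """Hypothetical ingest step: turn one login event into nodes and an edge."""
        user, ip = ("user", event["user"]), ("ip", event["ip"])
        self.props[user]["name"] = event["user"]
        self.props[ip]["addr"] = event["ip"]
        self.edges[(ip, "used_by")].add(user)
        if event.get("failed"):
            self.props[ip]["failures"] = self.props[ip].get("failures", 0) + 1
        # Re-check every standing query around the node this event touched.
        for pattern, action in self.standing:
            match = pattern(self, ip)
            if match is not None and match not in self.reported:
                self.reported.add(match)
                action(match)


def credential_stuffing(graph, ip):
    """Assumed pattern: one IP with 3+ failed logins spread over 2+ users."""
    users = graph.edges[(ip, "used_by")]
    if graph.props[ip].get("failures", 0) >= 3 and len(users) >= 2:
        return ip
    return None


alerts = []
g = StreamingGraph()
g.standing_query(credential_stuffing, alerts.append)
for e in [
    {"user": "alice", "ip": "10.0.0.7", "failed": True},
    {"user": "bob", "ip": "10.0.0.7", "failed": True},
    {"user": "carol", "ip": "10.0.0.9", "failed": False},
    {"user": "alice", "ip": "10.0.0.7", "failed": True},
]:
    g.ingest(e)

print(alerts)
```

In this sketch the alert is raised while the fourth event is being ingested, rather than when a later batch job scans the accumulated data.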

Quine’s graph ETL also makes it straightforward to process categorical data — everything from email addresses and model numbers to IP addresses and process IDs — that other systems ignore or try to encode. 
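
One way to picture why a graph suits categorical data: each distinct value can become a node in its own right, so events that share a value are connected without any numeric encoding. The Python sketch below illustrates that general idea under assumed, hypothetical event fields and function names; it does not describe how Quine stores data.

```python
from collections import defaultdict

# Each categorical value is its own "node", keyed by (field, value).
# The mapping records which events mention that value.
mentions = defaultdict(set)


def ingest(event_id, fields):
    """Link an event to a node for every categorical value it carries."""
    for field, value in fields.items():
        mentions[(field, value)].add(event_id)


def related(event_id):
    """Return the events that share at least one value node with event_id."""
    out = set()
    for events in mentions.values():
        if event_id in events:
            out |= events
    out.discard(event_id)
    return out


ingest("e1", {"email": "a@example.com", "ip": "203.0.113.5"})
ingest("e2", {"email": "b@example.com", "ip": "203.0.113.5"})
ingest("e3", {"email": "c@example.com", "ip": "198.51.100.2"})

print(related("e1"))
```

Here e1 and e2 are related only through the shared IP address node, a link that would be lost if the address were dropped or hashed into an opaque feature.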

Use Quine Enterprise to scale your graph ETL to millions of events per second.

Key Value Delivered

  • Use standing queries to detect patterns as they occur and take action
  • Join data from multiple sources at scale
  • Resolve entities across sources
  • Mitigate out-of-order data arrival
  • De-duplicate data
  • Generate new events from data as it streams, in real-time
  • Integrate with existing Apache Kafka, AWS Kinesis, data lake, and API event sources
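
Several of the items above (joining sources, resolving entities, tolerating out-of-order arrival, and de-duplicating) can be illustrated with one general technique: derive a node's ID deterministically from the entity's key, and make every update an idempotent merge. The Python sketch below is an assumption-laden illustration of that technique, not a description of Quine's implementation; node_id, materialize, and the CRM and payment event shapes are hypothetical.

```python
import hashlib


def node_id(kind, key):
    """Deterministic ID: the same (kind, key) always maps to the same node."""
    return hashlib.sha256(f"{kind}:{key}".encode()).hexdigest()[:16]


def materialize(events):
    """Fold a stream of events from any source into a dict of merged nodes."""
    nodes, seen = {}, set()
    for event in events:
        if event["event_id"] in seen:      # de-duplicate redelivered events
            continue
        seen.add(event["event_id"])
        # Normalize the key so both sources resolve to the same entity.
        nid = node_id("customer", event["email"].strip().lower())
        # The node is created on first reference, whichever event arrives first.
        nodes.setdefault(nid, {}).update(event["fields"])
    return nodes


crm = {"event_id": "crm-1", "email": "Ana@Example.com", "fields": {"name": "Ana"}}
pay = {"event_id": "pay-9", "email": "ana@example.com ", "fields": {"last_payment": 42}}

in_order = materialize([crm, pay])
shuffled = materialize([pay, crm, pay])    # out of order, with a duplicate

print(in_order == shuffled, len(in_order))
```

Because neither event has to wait for the other, the payment can arrive before the CRM record and the resulting node is the same. This holds in the sketch only when events set non-conflicting fields; conflicting writes would need an explicit rule such as comparing timestamps.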

Scale Real-Time ETL to Millions of Events/Sec

When you need to materialize graph data for use in real time, Quine streaming graph ETL is your best option. When you need added resilience and the ability to scale to millions of events per second, Quine Enterprise is the only option. Learn more about Quine Enterprise.

Next Steps