Application Architectural Overview
Now that IT knows the business goal, here are a few key points they need to focus on to implement this system:
- Latency is a big deal for this application. If the team doesn't have the aggregate contents of all the shopping baskets in a timely fashion, they can't decide in real time which items to offer the upsell on. Their biggest focus is therefore the streaming aspect of the application. The team has read a lot about the success other companies have had with Apache Kafka, so they want to give it a shot.
- Kafka will handle the data ingest and aggregation, but they still need a way to see the contents of all shoppers' baskets while they shop. The team knows that object detection has improved dramatically in recent years, so they want to leverage it to index each basket's contents. They'll also need a device attached to the shopping cart that can run the detection model and send its output to the Kafka system.
- They don't want to process this data in batch because latency kills the use case: the longer they wait, the more likely the shopper is already in line for checkout or out of the store.
- They're going to leverage Confluent's packaging of Apache Kafka as their streaming data collection and processing system.
The team realizes they need to measure activity in the retail store the same way they track shoppers in a web-based store. Ultimately, they will instrument and analyze the physical world much as they would instrument a web server and analyze its logs. They want not only to see what happened, but also to predict what to do next.
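To make the "physical world as web logs" idea concrete, here is a minimal sketch of what a basket event from a cart might look like. This is plain Python, not an actual Kafka client; the field names (`cart_id`, `item`, `action`) and the example values are hypothetical, standing in for whatever schema the team ultimately designs.

```python
import json
import time

def make_basket_event(cart_id, item, action, ts=None):
    """Build a hypothetical basket event, analogous to a web log entry.

    action is "add" or "remove"; the cart device would publish this JSON
    to a Kafka topic instead of writing it to a web server's access log.
    """
    return json.dumps({
        "cart_id": cart_id,
        "item": item,
        "action": action,
        "timestamp": ts if ts is not None else time.time(),
    })

# A single "shopper put cereal in cart 42" event, ready to publish.
event = make_basket_event("cart-042", "cereal", "add", ts=1700000000)
print(event)
```

Each event is small and self-describing, so downstream consumers can aggregate them without knowing anything about the cart hardware that produced them.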
Digging into the Apache Kafka Platform
The Big Cloud Dealz application development team has been to a few conferences and wants to build on a solid real-time platform. They've heard a lot about Apache Kafka, so they are going to base their application design on that platform, since they are largely focused on collecting large amounts of streaming data and running business rules against aggregates of that data. The Kafka Streams API seems like a great fit for this concept, as there are many examples of Kafka applications reacting to data in flight before it hits the data lake. A few other aspects of Kafka catch the team's attention:
- The type of application they have in mind is a poor fit for the batch world because of its latency requirements.
- Streaming applications differ from batch applications (MapReduce, Spark, etc.) in that they tend to implement core functions of the business rather than compute analytics after the fact.
- Real-time applications work best when event-driven, while database-backed applications are fundamentally table-driven.
- Having stream processing mechanics built into the ingest platform is handy and more efficient than trying to cobble this together from multiple systems.
- Given that the team has to build an embedded application for Shopping Cart 2.0, being able to build everything else on a single platform is a real benefit.
The application development team checked out Confluent's blog for more ideas and saw how machine learning has been applied in general, along with a specific example predicting flight arrivals. A quote from Confluent's site states:
"Consider a simple model of a retail store. The core streams in retail are sales of products, orders placed for new products, and shipments of products that arrive. The “inventory on hand” is a table computed off the sale and shipment streams which add and subtract from our stock of products on hand. Two key stream processing operations for a retail outlet are re-ordering products when the stock starts to run low, and adjusting prices as supply and demand change."
— Jay Kreps' blog article, "Introducing Kafka Streams: Stream Processing Made Simple"
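The quote's "inventory on hand" table can be sketched in a few lines: a table computed off the shipment and sale streams, where shipments add stock and sales subtract it. This is a simplified in-memory fold in plain Python, not Kafka Streams itself; the item names and quantities are made up for illustration.

```python
from collections import Counter

def inventory_on_hand(shipments, sales):
    """Compute the 'inventory on hand' table from two event streams.

    Shipments add to stock and sales subtract from it, per the retail
    model in the quote. A real Kafka Streams app would maintain this
    incrementally as a KTable rather than folding finished lists.
    """
    stock = Counter()
    for item, qty in shipments:
        stock[item] += qty
    for item, qty in sales:
        stock[item] -= qty
    return dict(stock)

shipments = [("cereal", 10), ("milk", 6)]
sales = [("cereal", 3), ("milk", 1), ("cereal", 1)]
print(inventory_on_hand(shipments, sales))  # {'cereal': 6, 'milk': 5}
```

The point of the sketch is the stream/table duality: the streams are the facts that happened, and the table is just a running aggregate computed from them.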
The team also likes the idea of Kafka (as opposed to trying to put something together themselves) because they don't have to maintain another custom, internally built system. Why? When building complex systems, there are only so many parts you want to "own": many of them become rabbit holes that distract from your end goal, and most folks don't have time to miss their project dates.
The General Application Architecture
Based on several whiteboarding sessions, the team comes up with a general design for what they want to do with the "Green-Light Special" application. In the diagram below, we see the architectural overview for this streaming Kafka-based application.
The team has broken the application into three major components in the architectural diagram above:
- A shopping cart (2.0) with an attached camera and Wi-Fi unit (likely an ARM-based embedded system), loaded with an object detection model to identify specific items seen by the camera.
- A Kafka cluster back in the data center to collect all of the incoming data from the shopping carts, organizing it into logical topics for processing.
- A group of streaming applications leveraging Kafka's Streaming API to give the retail store's team a real-time look at what items are in customers' baskets across the store.
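The third component above, the real-time view of items across all baskets, can be sketched as a fold over cart events, roughly what a Kafka Streams groupBy/aggregate would maintain in a state store. This is an illustrative in-memory version in plain Python, not the team's actual implementation; the event shape matches the hypothetical add/remove events a cart might emit.

```python
from collections import defaultdict

def aggregate_baskets(events):
    """Fold a stream of cart events into a live per-item count across
    all shoppers' baskets. A Kafka Streams app would keep this as a
    continuously updated KTable; here it is a plain dict for clarity.
    """
    counts = defaultdict(int)
    for e in events:
        delta = 1 if e["action"] == "add" else -1
        counts[e["item"]] += delta
    # Drop items no longer in any basket.
    return {item: n for item, n in counts.items() if n > 0}

events = [
    {"cart_id": "c1", "item": "cereal", "action": "add"},
    {"cart_id": "c2", "item": "cereal", "action": "add"},
    {"cart_id": "c1", "item": "milk", "action": "add"},
    {"cart_id": "c1", "item": "milk", "action": "remove"},
]
print(aggregate_baskets(events))  # {'cereal': 2}
```

Because the aggregate updates on every event, the retail team can trigger a "Green-Light Special" the moment an item's count crosses a threshold, rather than waiting for a batch job.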
While the Big Cloud Dealz team is a big fan of Apache Hadoop, notice that the architecture above does not include (at this point) any sort of "data lake" to land the data. Some types of data have value that is a direct function of how long it takes to make the data actionable. In the case of trying to analyze shoppers' baskets in real time, it doesn't make a lot of sense to push this data to HDFS first. They find it far more favorable (for latency purposes) to have the data driving business insights as it comes in, which keeps the data's value high. With the basic architecture laid out, we'll take a look in the upcoming blog posts at specific parts of the application the team needs to develop.
Summary and Next Steps
In this blog post, we saw a new business plan developed by Big Cloud Dealz to update their in-store retail experience.
In the next post in this series, we'll look at the object detection portion of the system, along with sending the detected objects to Kafka.
1. In popular culture, Kmart became known for its "Blue Light Specials." These occurred at surprise moments when a store worker would light up a mobile police light and offer a discount in a specific department of the store while announcing the discounted special over the store's public address system.