[Webinar] Bringing Flink to On-Prem and Private Clouds. Register Now
At Pinterest, we use Kafka Streams API to provide inflight spend data to thousands of ads servers in mere seconds. Our ads engineering team works hard to ensure we’re providing the best experience to our advertising partners. As part of this, it’s critical we build systems to prevent instances of ads overdelivery. In this post, we’ll explain overdelivery and share how we built a predictive system using Apache Kafka Streams API to reduce overdelivery in our ads system and provide near-instant information about inflight spend.
Overdelivery occurs when free ads are shown for out-of-budget advertisers. This reduces opportunities for advertisers with available budget to have their products and services discovered by potential customers.
Overdelivery is a difficult problem to solve for two reason:
Here’s an example of how this works. Imagine that advertiser X pays an internet-based company $0.10 per impression with $100 total daily budget. This yields a maximum of 1,000 daily impressions.
The company quickly implements a straightforward system for the advertiser:
When a new ad spot (i.e. an opportunity to display an ad) appears in the company’s website, the frontend sends an ad request to ads inventory. Ads inventory then decides whether to show ads for advertiser X based on their remaining budget. If budget is still available, the ads inventory will make an ad insertion (i.e. an ad entry that’s embedded in a user’s app) to the frontend. After the user views the ad, an impression event is sent to the spend system.
However, when the company checked their revenue that didn’t happen.
In one day, advertiser X’s ads were shown 1,100 times with $0.09 per impression. The extra 100 impressions were free, and could have been used by another advertiser. This example illustrates the common industry challenge of overdelivery.
So how does overdelivery occur? In this example, let’s say it turned out that the spend system was reacting too slowly. In fact, let’s say there was a delay of five minutes before it accounted for a user impression. Therefore, the internet company in this example did some optimizations to improve the system, resulting in an extra $9! This occurred because the company showed 90 impressions for other budget surplus advertisers, and overdelivery rate is only 10/1000 = 1 percent.
Later on, another advertiser, Y, contacted the same internet company and wanted to spend $100 per day to surface their ads at $2.0 per click (i.e. a user clicking through the ads link and reaching advertiser Y’s website), with 50 daily clicks max. The internet company added advertiser Y to the flow and added click event tracking for them into the system.
By the end of day, the internet company’s system was over-delivering, again.
Advertiser Y ended up with 10 free clicks! The internet company needed to identify why the system couldn’t foresee whether the inserted ad would be clicked, no matter how fast the system. Without the future spend information, they’d always overdeliver.
To finish this example, the company eventually found a brilliant solution: compute the inflight spend of each advertiser. Inflight spend is the cost of ads insertions that haven’t yet been charged. If actual_spend + inflight_spend > daily_budget, don’t show ad for this advertiser.
Every day we help Pinners discover personalized ideas across our app, from recommendations to search to Promoted Pins. We need a reliable and scalable way to serve ads to Pinners and ensure we respect the budgets of our advertising partners.
We began designing a spend prediction system with the following goals:
We evaluated a variety of streaming services, including Spark and Flink. These technologies meet our scale requirements, but in our case, Apache Kafka Streams API provides extra advantages:
At a high level, the below diagram illustrates our system with inflight spend:
*Price: the value of this ad.
*Impression_rate: historical conversion rate of one insertion to impression. Note that an insertion is not guaranteed to convert to an impression.
*Action_rate: for an advertiser paying by click, this is the probability that user will click on this ad insertion; for advertiser paying by impression, this is 1.
In practice, our spend predictions are extremely accurate. After applying the predictive budgeting system, we significantly reduced over delivery. Below is an example test of of actual vs. predicted spend.
Example: The horizontal axis is three 3-minute wide time interval; the vertical axis is actual spend by time interval. Blue line represents the inflight spend and green line represents actual spend.
Using Apache Kafka’s Stream API to build a predictive spend pipeline was a new initiative for our ads infrastructure and it turned out to be fast, stable, fault-tolerant and scalable. We plan to continue exploring Kafka 1.0 and KSQL brought by Confluent for future systems design!
If you have enjoyed this article, you might want to continue with the following resources to learn more about Apache Kafka’s Streams API:
Acknowledgements: Huge thanks to Tim Tang, Liquan Pei, Zack Drach and Jerry Liu from Pinterest for the overall design, data analysis, performance tuning and encoding logic, and Guozhang Wang from Confluent for Kafka Streams usage elucidation and troubleshooting.
Discover how predictive analytics, powered by generative AI and data streaming, transforms business decisions with real-time insights, accurate forecasts, and innovation.
Transform your ad campaigns with generative AI + Confluent. Optimize performance, automate tasks, and deliver personalized content—all in real time.