Google Adds Event-Driven Webhooks to the Gemini API, Eliminating the Need for Polling in Long-Running AI Jobs

Source: MarkTechPost

If you’ve ever built a production AI pipeline that runs long jobs — processing thousands of prompts overnight, kicking off a Deep Research agent, or generating a long video — you’ve almost certainly dealt with the polling problem. Your code sits in a loop, firing GET requests every few seconds asking, “Is the job done yet?” It’s wasteful, it adds latency, and at scale it becomes a reliability headache. Google just shipped the fix.

Google introduced event-driven Webhooks for the Gemini API — a push-based notification system that eliminates the need for inefficient polling. The feature is available now for all developers using the Gemini API and targets a core pain point in agentic and high-volume AI workflows.

Why Polling Breaks Down at Scale

To understand the problem, it helps to know what a Long-Running Operation (LRO) is: an asynchronous job that the API accepts right away but finishes later, so the result is not available in the initial response. Webhooks allow the Gemini API to push real-time notifications to your server when asynchronous or Long-Running Operations complete, replacing the need to poll the API for status updates and reducing latency and overhead.

Before webhooks, the only option was continuous polling — repeatedly calling GET /operations to check if a job had finished. As Gemini shifts toward agentic workflows and high-volume processing — like Deep Research, long video generation, or processing thousands of prompts via the Batch API — operations can take minutes or even hours. Polling for hours is expensive in both compute and API quota, and it introduces unnecessary delays between when a job completes and when your application learns about it.

The fix is conceptually simple: instead of your code repeatedly asking “are you done?”, the Gemini API calls your server the moment a task finishes by pushing an HTTP POST payload to your endpoint.
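To put rough numbers on the polling overhead described above, here is a small back-of-the-envelope sketch. The 5-second interval and the two-hour job are illustrative assumptions, not figures from Google, and `polling_cost` is a hypothetical helper.

```python
import math


def polling_cost(job_seconds: int, poll_interval_seconds: int) -> tuple[int, int]:
    """Return (status requests issued, seconds between completion and detection).

    Assumes polls fire at t = interval, 2 * interval, ... until the first poll
    at or after the moment the job completes.
    """
    polls = max(1, math.ceil(job_seconds / poll_interval_seconds))
    detection_delay = polls * poll_interval_seconds - job_seconds
    return polls, detection_delay


# A job that takes just over two hours, polled every 5 seconds (illustrative values).
polls, delay = polling_cost(job_seconds=7201, poll_interval_seconds=5)
print(polls, delay)  # 1441 4
```

With a webhook, the same job costs one inbound POST, and the detection delay no longer depends on a polling interval.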

Two Configuration Modes: Static and Dynamic

The Gemini API supports two ways to configure webhooks. Static webhooks are project-level endpoints configured with the WebhookService API and are suited for global integrations like notifying Slack or syncing a database — they are registered once per project and trigger for any matching event. Dynamic webhooks are request-level overrides that pass a webhook URL in the webhook_config payload of a specific job call, making them ideal for routing specific jobs to dedicated endpoints, for example in agent-orchestration queues.

You can think of static webhooks like a standing instruction to your mail carrier: “Always deliver packages to the front desk.” Dynamic webhooks are more like saying: “For this one shipment, send it to my home address.” An additional feature of dynamic webhooks is the user_metadata field, which lets you attach arbitrary key-value metadata to a job at dispatch time — for example, {"job_group": "nightly-eval", "priority": "high"}. This metadata travels with the job notification and is particularly useful when you need to fan out different job types to different downstream processors without building a separate tracking layer.
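As a rough illustration of metadata-based fan-out, the sketch below routes a notification on its job_group value. It assumes the notification body exposes the metadata under a top-level "user_metadata" key; that payload shape, the handler names, and the queue names are all hypothetical, so check the actual payload before relying on it.

```python
from typing import Callable

Handler = Callable[[dict], str]


def handle_nightly_eval(event: dict) -> str:
    # Hypothetical downstream processor for nightly evaluation jobs.
    return "eval-queue"


def handle_default(event: dict) -> str:
    # Fallback for jobs without a recognised job_group.
    return "default-queue"


# Map job_group values (set at dispatch time) to handlers.
ROUTES: dict[str, Handler] = {"nightly-eval": handle_nightly_eval}


def route(event: dict) -> str:
    # ASSUMPTION: metadata arrives under a top-level "user_metadata" key.
    metadata = event.get("user_metadata") or {}
    handler = ROUTES.get(metadata.get("job_group", ""), handle_default)
    return handler(event)


print(route({"user_metadata": {"job_group": "nightly-eval", "priority": "high"}}))
```

Because the routing key travels with the notification, the receiver does not need a separate lookup table keyed by job ID.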

Security Architecture: Standard Webhooks, HMAC, and JWKS

Security is where this implementation gets technically interesting. Google’s implementation strictly adheres to the Standard Webhooks specification. Every request carries webhook-signature, webhook-id, and webhook-timestamp headers, which let the receiver verify authenticity, deduplicate repeated deliveries, and reject replayed requests.

For static webhooks, the signing is done with HMAC (Hash-based Message Authentication Code) using a symmetric shared secret, which is provided once at creation time and must be stored securely in your environment variables — the API returns this signing secret only once and it cannot be retrieved again. If you lose it, you have to rotate it. The rotation endpoint supports a revocation_behavior parameter — specifically REVOKE_PREVIOUS_SECRETS_AFTER_H24, which keeps the old secret valid for a 24-hour grace period so you can safely transition production systems, or an immediate revocation option for incident response.
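A minimal verification sketch for the HMAC case, following the signing scheme in the Standard Webhooks specification (HMAC-SHA256 over "id.timestamp.body", base64-encoded, sent as "v1,<signature>"). Whether Gemini's secret uses the spec's "whsec_" prefix and base64 encoding is an assumption here, and `sign` and `verify` are hypothetical helper names.

```python
import base64
import hashlib
import hmac


def sign(secret: str, msg_id: str, timestamp: str, body: bytes) -> str:
    """Compute the base64 HMAC-SHA256 signature over '<id>.<timestamp>.<body>'."""
    # ASSUMPTION: the secret is base64 with an optional "whsec_" prefix, per the spec.
    key = base64.b64decode(secret.removeprefix("whsec_"))
    signed_content = f"{msg_id}.{timestamp}.".encode() + body
    digest = hmac.new(key, signed_content, hashlib.sha256).digest()
    return base64.b64encode(digest).decode()


def verify(secret: str, headers: dict, body: bytes) -> bool:
    """Return True if any 'v1' signature in the header matches the expected one."""
    expected = sign(secret, headers["webhook-id"], headers["webhook-timestamp"], body)
    # The header may hold several space-separated entries, e.g. during secret rotation.
    for entry in headers["webhook-signature"].split():
        version, _, candidate = entry.partition(",")
        # compare_digest avoids leaking match position through timing.
        if version == "v1" and hmac.compare_digest(candidate.encode(), expected.encode()):
            return True
    return False
```

Verify against the raw request bytes rather than a re-serialized JSON object, since any change in whitespace or key order alters the digest.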

For dynamic webhooks, Google uses asymmetric public-key JWKS (JSON Web Key Set) signatures instead of symmetric secrets. Dynamic webhook requests carry a JSON Web Token (JWT) signature, which your listener must extract and verify against Google’s public keys published at https://generativelanguage.googleapis.com/.well-known/jwks.json. Verification uses the RS256 algorithm.
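The sketch below covers only the key-lookup step of JWKS verification: reading the JWT header and picking the matching public key from an already-fetched key set. It assumes the token header carries a "kid" field, which is conventional for JWKS but not confirmed here for Gemini. The RS256 signature check itself should be delegated to a vetted JWT library and is deliberately left out; `select_jwk` is a hypothetical helper.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def select_jwk(token: str, jwks: dict) -> dict:
    """Return the key from `jwks` whose 'kid' matches the token header."""
    header = json.loads(b64url_decode(token.split(".")[0]))
    # Pin the algorithm so a token cannot downgrade the check (e.g. to "none").
    if header.get("alg") != "RS256":
        raise ValueError(f"unexpected algorithm: {header.get('alg')!r}")
    # ASSUMPTION: the header identifies its signing key with a "kid" field.
    for key in jwks.get("keys", []):
        if key.get("kid") == header.get("kid"):
            return key
    raise KeyError("no key in the JWKS matches the token's kid")
```

In practice the key set would be fetched from the JWKS URL and cached, with a refetch when an unknown kid appears.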

This means your server never blindly trusts incoming requests — every webhook hit can be cryptographically verified before you act on it. The webhook-timestamp header is particularly important: best practices call for always validating this timestamp and rejecting payloads older than five minutes to mitigate replay attacks.
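A minimal sketch of the five-minute freshness check, assuming webhook-timestamp holds integer Unix seconds as in the Standard Webhooks specification. Rejecting timestamps far in the future as well is a design choice of this sketch, not something the article requires, and `is_fresh` is a hypothetical helper.

```python
import time
from typing import Optional

MAX_AGE_SECONDS = 5 * 60  # the five-minute window mentioned above


def is_fresh(webhook_timestamp: str, now: Optional[float] = None) -> bool:
    """Return True if the timestamp is within the tolerance window of `now`."""
    try:
        sent_at = int(webhook_timestamp)
    except ValueError:
        return False  # malformed header: treat as stale
    current = time.time() if now is None else now
    # abs() also rejects timestamps too far in the future (sketch-specific choice).
    return abs(current - sent_at) <= MAX_AGE_SECONDS
```

Run this check only after the signature verifies, since an unsigned timestamp can be set to anything by an attacker.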

Thin Payloads and the Event Catalog

One architectural decision worth noting is the thin payload model. To avoid bandwidth congestion, Gemini webhooks deliver a snapshot containing status details and pointers to results, rather than the raw output file itself. The exact fields in that snapshot depend on the event type.

For batch jobs, a completed notification carries the job id and an output_file_uri pointing to your results — for example, a Cloud Storage path like gs://my-bucket/results.jsonl. For video generation, the video.generated event delivers a different set of fields: file_id and video_uri. Your server-side handler needs to branch on event type before reading the payload data fields.
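One way to structure that branching is sketched below. The field names id, output_file_uri, file_id, and video_uri come from the description above, but the envelope shape (a "type" string plus a "data" object) is an assumption, and `extract_result_pointer` is a hypothetical helper.

```python
def extract_result_pointer(event: dict) -> dict:
    """Normalise a thin payload into a small record describing where results live."""
    # ASSUMPTION: the payload has a top-level "type" and a "data" object.
    event_type = event.get("type")
    data = event.get("data", {})
    # Both batch event names appear in Google's documentation, so accept either.
    if event_type in ("batch.succeeded", "batch.completed"):
        return {"kind": "batch", "job_id": data["id"], "location": data["output_file_uri"]}
    if event_type == "video.generated":
        return {"kind": "video", "file_id": data["file_id"], "location": data["video_uri"]}
    raise ValueError(f"unhandled event type: {event_type!r}")


pointer = extract_result_pointer(
    {"type": "batch.succeeded", "data": {"id": "job-1", "output_file_uri": "gs://my-bucket/results.jsonl"}}
)
print(pointer["location"])  # gs://my-bucket/results.jsonl
```

The handler returns only a pointer; fetching the actual output is a separate step that can run outside the request path.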

The full event catalog covers three categories: batch jobs (batch.succeeded, batch.cancelled, batch.expired, batch.failed), Interactions API operations (interaction.requires_action, interaction.completed, interaction.failed, interaction.cancelled), and video generation (video.generated). For developers writing code: the official code samples in Google’s documentation subscribe to and handle batch.completed rather than batch.succeeded — both appear across the documentation, so match whichever your implementation uses.

The Interactions API, for readers unfamiliar with it, is Gemini’s API for async multi-turn agent conversations. The interaction.requires_action event is particularly useful — it fires when a function call is pending and your application needs to step in and take an action before the agent can continue.

Delivery Guarantees and Best Practices

Google guarantees “at-least-once” delivery with automatic retries for up to 24 hours using exponential backoff. The “at-least-once” guarantee means your endpoint could occasionally receive the same event more than once under high-congestion conditions. The webhook-id header stays the same across redeliveries, so use it to deduplicate. Your server should also respond with a 2xx status code as soon as the signature verifies and queue any heavier parsing internally, because a listener that holds the connection open too long triggers the retry cycle, which is the opposite of what you want.
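The acknowledge-fast, deduplicate-by-ID pattern can be sketched framework-free as follows. It assumes the signature has already been verified, and it uses an in-memory set and queue purely for illustration; a production service would more likely use a shared store with expiry. `WebhookReceiver` is a hypothetical class.

```python
import queue


class WebhookReceiver:
    """Acknowledge quickly, drop duplicates, and defer heavy work to a queue."""

    def __init__(self) -> None:
        self.seen_ids: set[str] = set()
        self.work: queue.Queue = queue.Queue()

    def receive(self, headers: dict, body: bytes) -> int:
        """Return the HTTP status to send. Call only after signature verification."""
        webhook_id = headers["webhook-id"]
        if webhook_id in self.seen_ids:
            return 200  # redelivery: acknowledge again, but do not enqueue twice
        self.seen_ids.add(webhook_id)
        self.work.put((webhook_id, body))  # a background worker parses this later
        return 200


receiver = WebhookReceiver()
receiver.receive({"webhook-id": "evt-1"}, b"{}")
receiver.receive({"webhook-id": "evt-1"}, b"{}")  # simulated retry of the same event
print(receiver.work.qsize())  # 1
```

Returning 200 for duplicates matters: answering a redelivery with an error would only prompt further retries.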

Key Takeaways

  • No more polling loops — The Gemini API now pushes a signed HTTP POST to your server the instant a long-running job (Batch API, Deep Research, video generation) completes, eliminating the need to repeatedly call GET /operations.
  • Two webhook modes for different architectures — Static webhooks handle project-level global integrations secured via HMAC; Dynamic webhooks bind to individual job requests via JWKS signatures and support user_metadata for custom routing logic in agent-orchestration pipelines.
  • Security is built in, not bolted on — Every notification is cryptographically signed per the Standard Webhooks spec using webhook-signature, webhook-id, and webhook-timestamp headers. Reject payloads older than 5 minutes to block replay attacks, and use webhook-id to deduplicate at-least-once deliveries.
  • Thin payloads, not raw results — Webhook notifications carry status pointers, not output data. Batch events return output_file_uri; video events return file_id and video_uri. Always respond 2xx immediately and process asynchronously — slow responses trigger exponential-backoff retries for up to 24 hours.

Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.