Transaction and Span Rate Limiting

Relay enforces quotas defined in Sentry and propagates them as rate limits to clients. In most cases, this is a simple one-to-one mapping of data category to envelope item.

The transaction and span rate limits are more complicated. Spans and transactions each have a "total" category and an additional "indexed" category. They are also closely related: a transaction is a container for spans, so a dropped transaction results in dropped spans.

This document describes how transaction and span quotas interact with each other within Relay.

The following rules hold true:

  • A quota for the transaction category is also enforced for the span category.
  • A quota for the span category is also enforced for the transaction category.
  • When a transaction is dropped, all the contained spans should be reported in outcomes.
  • If either transactions or spans are rate limited, clients should receive a limit for both categories.
  • Indexed quotas only affect the payload for their respective category.

These rules can be visualized using this table:

Rate Limit / Item   | Transaction Payload | Transaction Metric | Span Payload | Span Metric
Transaction         | ❌ | ❌ | ❌ | ❌
Transaction Indexed | ❌ | ✅ | ✅ | ✅
Span                | ❌ | ❌ | ❌ | ❌
Span Indexed        | ✅ | ✅ | ❌ | ✅

  • ❌: Item rejected.
  • ✅: Item accepted.
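
As a rough illustration, the rules above can be written as a small lookup in Python. This is a sketch and not Relay's implementation: the item names and the function `accepted_items` are hypothetical and only mirror the four item kinds named in the table header.

```python
# Hypothetical item names for illustration; not Relay's actual identifiers.
ITEMS = ("transaction_payload", "transaction_metric", "span_payload", "span_metric")


def accepted_items(limited_category: str) -> dict[str, bool]:
    """Map each item to True (accepted) or False (rejected) for an active limit."""
    if limited_category in ("transaction", "span"):
        # A total-category limit is enforced for both transactions and spans,
        # so payloads and metrics are rejected on both sides.
        return {item: False for item in ITEMS}
    if limited_category == "transaction_indexed":
        rejected = {"transaction_payload"}
    elif limited_category == "span_indexed":
        rejected = {"span_payload"}
    else:
        raise ValueError(f"unknown category: {limited_category}")
    # Indexed limits only affect the payload of their own category.
    return {item: item not in rejected for item in ITEMS}
```

For example, `accepted_items("span_indexed")` marks only `span_payload` as rejected, while `accepted_items("span")` rejects all four items.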

Outcomes must be generated in Relay for every span and transaction that is dropped. Usually, a dropped span or transaction results in outcomes for both its total and its indexed category; refer to The Indexed Outcome Category for details.

This is straightforward for all indexed rate limits: they only drop the payload, which results in a single negative outcome in the indexed category of the item. A transaction_indexed rate limit does not cause any spans to be dropped, and vice versa.

A span quota and the resulting span rate limit are also trivial for standalone spans received by Relay: the standalone span is dropped, and a single outcome is generated.

Transactions are containers for spans until the spans are extracted in Relay. This span extraction can happen at any Relay stage: customer-managed Relay, PoP-Relay, or Processing-Relay. Until spans are extracted from the transaction, a dropped transaction should count the contained spans and generate an outcome with a quantity of the contained span count + 1, where the extra 1 accounts for the segment span that would be generated from the transaction itself.

After spans have been extracted, the transaction is no longer a container of span items and only represents itself. A dropped transaction whose spans have already been extracted therefore only generates outcomes for the total transaction and indexed transaction categories.

Quota:

{
  "categories": ["span_indexed"],
  "limit": 0,
  // ...
}

Transaction:

{
  "type": "transaction",
  "spans": [
      { .. },
      { .. },
      { .. }
  ],
  // ...
}

An envelope containing a transaction with 3 child spans generates 4 outcomes for rate limited spans in the span_indexed category: 1 count for the segment span generated from the transaction and 3 counts for the contained spans. The transaction itself is still ingested.

Ingestion:

Transaction Payload | Transaction Metrics | Span Payload | Span Metrics
✅ | ✅ | ❌ | ✅

Negative Outcomes:

transaction | transaction_indexed | span | span_indexed
0 | 0 | 0 | 4

Rate Limits propagated to SDKs: None.

Quota:

{
  "categories": ["transaction"],
  "limit": 0,
  // ...
}

Transaction:

{
  "type": "transaction",
  "spans": [
      { .. },
      { .. },
      { .. }
  ],
  // ...
}

Ingestion:

Transaction Payload | Transaction Metrics | Span Payload | Span Metrics
❌ | ❌ | ❌ | ❌

Negative Outcomes:

transaction | transaction_indexed | span | span_indexed
1 | 1 | 4 | 4

Rate Limits propagated to SDKs: Transaction, Span.
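
The negative outcomes in both examples follow from the counting rules described earlier. The following Python sketch reproduces them under stated assumptions: `negative_outcomes` and its parameters are hypothetical names, and the function models a single transaction item only, not Relay's actual code path.

```python
def negative_outcomes(limited_category: str, span_count: int,
                      spans_extracted: bool = False) -> dict[str, int]:
    """Negative outcome quantities per category for one transaction item."""
    outcomes = {"transaction": 0, "transaction_indexed": 0, "span": 0, "span_indexed": 0}

    # Before extraction the transaction still contains its spans: count them,
    # plus one segment span for the transaction itself. After extraction the
    # transaction only represents itself.
    contained = 0 if spans_extracted else span_count + 1

    if limited_category in ("transaction", "span"):
        # Total limits drop the transaction and everything it contains.
        outcomes["transaction"] = 1
        outcomes["transaction_indexed"] = 1
        outcomes["span"] = contained
        outcomes["span_indexed"] = contained
    elif limited_category == "transaction_indexed":
        # Only the transaction payload is dropped; no spans are dropped.
        outcomes["transaction_indexed"] = 1
    elif limited_category == "span_indexed":
        # Only span payloads are dropped; the transaction is still ingested.
        outcomes["span_indexed"] = contained
    else:
        raise ValueError(f"unknown category: {limited_category}")
    return outcomes
```

For a transaction with 3 child spans, `negative_outcomes("span_indexed", 3)` yields quantities 0, 0, 0, 4 and `negative_outcomes("transaction", 3)` yields 1, 1, 4, 4, matching the two examples. With `spans_extracted=True`, a transaction limit only produces the two transaction outcomes.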

At first glance, it is not obvious why transaction and span quotas can be used interchangeably: spans can exist without transactions, and transactions can exist without (standalone) spans.

The reasoning lies in the Sentry product and how the information is used within Sentry. Because of Relay, Sentry can safely assume that spans exist if the user or SDK sends transactions, and more and more of the product is built on the basis of spans. From this point of view, transactions and spans convey the same information; it is just represented differently.

The logical conclusion and simplification is to treat span and transaction rate limits equally (important: the limits are treated the same, but the quotas are still tracked separately). Enforcing these limits on SDKs then does not require extra logic: it is all contained within Relay and works for any SDK version, present and future.
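
Seen from the client side, this equal treatment can be sketched as follows. The function `propagated_limits` is a hypothetical helper, not part of Relay or any SDK; it assumes, as stated elsewhere in this document, that indexed limits are not propagated to clients.

```python
def propagated_limits(enforced: set[str]) -> set[str]:
    """Categories for which clients receive a rate limit."""
    # If either total category is limited, clients receive a limit for both.
    if enforced & {"transaction", "span"}:
        return {"transaction", "span"}
    # Indexed limits alone result in no limit being sent to clients.
    return set()
```

With this sketch, `propagated_limits({"span_indexed"})` is empty, while `propagated_limits({"transaction"})` contains both `transaction` and `span`.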

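Span counts reported in outcomes may differ from the number of spans actually contained in dropped transactions. There can be multiple reasons for this across the entire pipeline; the following are just some of the ways Relay itself can cause differences: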

  • Parsing the span count from a transaction can be too expensive and may be omitted. This is an extremely important property of Relay: abuse cases must be handled as fast as possible and with as little cost ($ and resources) as possible. In some cases, it may not be feasible to parse the transaction JSON to extract a span count.
  • An envelope item may be malformed. Outcomes will still be generated for the inferred data category (span or transaction), but the span count cannot be recovered from an invalid transaction.

When choosing a data category for a quota, work through these questions from top to bottom:

  • Do you want to completely disable ingestion? Configure the limit for both the span and transaction data categories.
  • Is it billing related? For example, spike protection operates on the category of the billing unit. Use the respective data category for the limit.
  • Otherwise, use the total category (not indexed) that makes the most sense to you. When in doubt, ask the Relay team.
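
The checklist above can be read as a short decision procedure. The sketch below is only an illustration: `choose_quota_categories` and its parameters are hypothetical names, and the final choice of total category remains a judgment call.

```python
from typing import Optional

TOTAL_CATEGORIES = ("transaction", "span")


def choose_quota_categories(disable_ingestion: bool,
                            billing_category: Optional[str],
                            preferred_total: str) -> list[str]:
    """Pick the data categories for a quota, checking the questions in order."""
    if disable_ingestion:
        # Disabling ingestion entirely: limit both total categories.
        return ["transaction", "span"]
    if billing_category is not None:
        # Billing related (e.g. spike protection): use the billing unit's category.
        return [billing_category]
    if preferred_total not in TOTAL_CATEGORIES:
        raise ValueError("use a total category, not an indexed one")
    return [preferred_total]
```

For example, `choose_quota_categories(True, None, "span")` returns both total categories, regardless of the other arguments.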

Indexed quotas should not be used unless the intention is to protect infrastructure (abuse limits).

Indexed quotas are only useful to protect downstream infrastructure through abuse quotas. They are inherently more expensive to enforce, cannot be propagated to clients and are generally a sign of misconfiguration.

Dynamic, smart, or client-side sampling should prevent any indexed quota from being enforced.

Standalone spans are not treated differently from extracted spans. After metrics extraction, which may happen in customer Relays, there is no longer any distinction. Having no special treatment for standalone spans also means that no special logic is needed in the SDKs.
