---
title: "Grouping"
url: https://develop.sentry.dev/backend/application-domains/grouping/
---

# Grouping

Sentry has an extensive grouping system ([public documentation](https://docs.sentry.io/product/sentry-basics/grouping-and-fingerprints/)) which can be customized [on the client](https://docs.sentry.io/platforms/python/usage/sdk-fingerprinting/) and on the server via [fingerprint rules](https://docs.sentry.io/product/data-management-settings/event-grouping/fingerprint-rules/) and [stack trace rules](https://docs.sentry.io/product/data-management-settings/event-grouping/stack-trace-rules/).

This documentation attempts to explain how the system currently functions and what its present limitations are.

## Basics

At the most basic level, when an event comes into Relay, Relay associates the current version of the grouping configuration with the event. This means that from the very first moment an event enters the infrastructure, the decision about which version of the grouping algorithm to use has already been made.

When the event makes its way into the core event processing system ready to be saved and a fingerprint hasn't been set by the client yet, three systems start operating:

1. At first, stack traces are processed through a system called `normalize_stacktraces_for_grouping`, where the stack trace rules are applied in a pass over the "in app" flags of the frames. This, for instance, allows the active grouping configuration to mark frames as inside or outside the scope of the app. The original value of the `in_app` flag is preserved in the event as well, which allows the system to "revert" to the original values if the grouping rules have to run again (for instance during reprocessing).
2. Next, the server side fingerprinting rules are run. These have the ability to override the default fingerprint that would otherwise be generated by the grouping code. The output of the fingerprinting code can either be a list of strings that will be hashed into the fingerprint, or it can include the `{{ default }}` placeholder, in which case the original grouping code will still run and be folded into the fingerprint to further subdivide the group.
3. Finally, the actual grouping algorithm runs *if and only if* the fingerprint has not been set yet or it uses the special `{{ default }}` value.
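
The three steps above can be sketched as a small decision function. This is an illustrative sketch, not Sentry's actual code; the function names and the md5-based hashing are assumptions made for the example:

```python
import hashlib

DEFAULT_PLACEHOLDER = "{{ default }}"


def resolve_fingerprint(custom_fingerprint, compute_default):
    """Decide which fingerprint values to use: the built-in grouping
    algorithm (compute_default) only runs when no custom fingerprint was
    set, or when it contains the special '{{ default }}' placeholder."""
    if not custom_fingerprint:
        return compute_default()  # nothing set: the algorithm decides alone
    values = []
    for part in custom_fingerprint:
        if part == DEFAULT_PLACEHOLDER:
            # fold the algorithm's output into the custom fingerprint
            values.extend(compute_default())
        else:
            values.append(part)
    return values


def hash_fingerprint(values):
    # stable hash over the fingerprint values (illustrative only)
    return hashlib.md5("\x00".join(values).encode()).hexdigest()
```

With a fully custom fingerprint the grouping algorithm is skipped entirely; with `{{ default }}` its output is folded in to further subdivide the group.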

It's important to know that the grouping algorithm can produce more than one fingerprint hash. These hashes are collected and associated with issues via the `GroupHash` model. If any of these hashes exists for a group, the event is associated with that group, and any hash not yet associated with the group is added. In practice this means an event may produce both an "app" hash (using only in-app frames) and a "system" hash (using all frames), and either one matching an existing group is sufficient.
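
The multi-hash lookup can be illustrated with a simplified in-memory sketch. The dict-based table and the function name are hypothetical; the real logic lives in the `GroupHash` model:

```python
def find_or_create_group(event_hashes, grouphash_table, new_group_id):
    """If any of the event's hashes is already tied to a group, reuse
    that group and attach the remaining hashes to it; otherwise all
    hashes are attached to a freshly created group."""
    group_id = None
    for h in event_hashes:
        if h in grouphash_table:
            group_id = grouphash_table[h]
            break
    if group_id is None:
        group_id = new_group_id  # a fresh group would be created here
    # associate any not-yet-known hashes with the chosen group
    for h in event_hashes:
        grouphash_table.setdefault(h, group_id)
    return group_id
```

Because every hash of a matched event is attached to the group, future events matching on *any* of those hashes land in the same group.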

## Issue / Group Creation

Internally in Sentry issues are called "Groups" or "grouped messages" and represented by the [`Group`](https://github.com/getsentry/sentry/blob/master/src/sentry/models/group.py) model.

After the fingerprints have been calculated, Sentry decides whether an existing group should be reused or a new group created. This is done via the [`GroupHash`](https://github.com/getsentry/sentry/blob/master/src/sentry/models/grouphash.py) model. Most importantly, if a group must be created, it is created immediately. While the data model supports events without issues, the user interface does not really support this for a range of features that users expect. This means that at the moment 100% of events are associated with a group.

When the group has been created or found, the event is associated with that `group_id`. As the event flows further through the system and makes its way towards Snuba, it carries that group along, which also means that the group is persisted in Snuba together with the event.

Upon group creation, additional code runs such as the triggering of alerts, regression detection and more. It is thus relatively expensive to create a group due to the number of additional actions that can be triggered from it.

## AI Grouping

In addition to fingerprint-based grouping, Sentry uses AI to further improve issue grouping accuracy. This system identifies issues with similar stacktraces and error messages that might have different fingerprints due to minor code variations. AI grouping works alongside traditional fingerprinting, occurring *after* hash-based lookup and *before* new group creation:

1. The event's hashes are computed via the standard grouping algorithm.
2. Sentry checks whether any of those hashes already belong to an existing group via the hash-based lookup.
3. If no existing group is found, and the event is eligible, Sentry generates an embedding of the error's message and in-app stack frames, compares this embedding against existing error embeddings for that project, and merges the new error into an existing issue if a similar issue is found within the configured threshold.
4. If no match is found (or the event is ineligible), a new group is created as before.
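
Step 3 can be sketched with a plain cosine-similarity search. This is a toy illustration: the actual embedding model, similarity metric, and threshold Sentry uses are not specified here, and `find_similar_group` is a hypothetical name:

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def find_similar_group(new_embedding, existing_embeddings, threshold=0.95):
    """Compare a new event's embedding against the project's existing
    error embeddings and return the most similar group, if any clears
    the configured threshold."""
    best_group, best_score = None, threshold
    for group_id, embedding in existing_embeddings.items():
        score = cosine_similarity(new_embedding, embedding)
        if score >= best_score:
            best_group, best_score = group_id, score
    return best_group
```

If `find_similar_group` returns a group, the new error is merged into it; otherwise a new group is created as before.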

Eligibility is determined by [`should_call_seer_for_grouping`](https://github.com/getsentry/sentry/blob/master/src/sentry/grouping/ingest/seer.py) in `src/sentry/grouping/ingest/seer.py`. Beyond the fingerprint check above, it also considers:

* Whether the event has a usable stacktrace
* Whether the event's platform is supported
* Whether the project has AI-enhanced grouping enabled
* Rate limits (both global and per-project)
* A circuit breaker that trips if error rates are too high
* Stacktrace size limits
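
A simplified sketch of such an eligibility gate might look as follows. The real checks live in `should_call_seer_for_grouping`; the dict-based event and project shapes and the platform list below are illustrative assumptions:

```python
def should_call_seer(event, project, rate_limited, circuit_open, max_frames=50):
    """Every gate must pass before AI grouping runs; any single failing
    check short-circuits to the fingerprint-based path."""
    checks = [
        event.get("has_usable_stacktrace", False),
        # illustrative platform subset, not Sentry's actual list
        event.get("platform") in {"python", "javascript", "java"},
        project.get("ai_grouping_enabled", False),
        not rate_limited,   # covers both global and per-project limits
        not circuit_open,   # circuit breaker tripped on high error rates
        len(event.get("frames", [])) <= max_frames,
    ]
    return all(checks)
```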

Results — including the matched grouphash, match distance, and model version — are persisted in the [`GroupHashMetadata`](https://github.com/getsentry/sentry/blob/master/src/sentry/models/grouphashmetadata.py) model alongside the event's hash-based grouping metadata.

More details on the AI grouping process can be found on the public facing docs [here](https://docs.sentry.io/concepts/data-management/event-grouping/#ai-enhanced-grouping).

## Merges and Splits

The system does not cope particularly well with merges and splits because the events in Snuba are generally considered immutable. When a user triggers a merge, a task is issued that initiates it. A merge can also be partially undone, but a split fails to fully reconstruct the original state, as some information (such as the original >90 day counts) was lost in the process. Merges are also expensive, and frequent merges can cause a significant backlog on the Snuba task queue.

## Grouping Theories

Sentry applies some general principles to grouping in an attempt to group the right types of events together. There are however restrictions in the current system, where Sentry has to find "the one group", as an event can only ever belong to one. To understand how Sentry approaches groups, it's important to realize that finding the right group is a fundamentally subjective problem which depends on where the source of the error lies.

### Stack Traces

Sentry's primary way of grouping is the stack trace. Equivalent stack traces (with a matching error type) indicate the same error. There is however a fundamental variance to stack traces, so actually creating a fingerprint of a stack trace comes with some challenges.

A stack trace consists of multiple frames and each frame contributes to the fingerprint. Which parts of it do and which ones do not depends quite a bit on the platform and on the different rules we apply to it.

#### Example Platform Behavior

In **Python**, as an example, each frame contributes the `module` name, `function` name, and `context-line` (that is, the source code of the line the frame pointer pointed to, with leading and trailing whitespace removed). The motivation is the following: modules and functions are relatively coarse indicators, and a function can often fail from different branches, so also taking the source code into account is less likely to over-group. This however also means that a refactoring which changes the source code of a line without changing its functionality can cause a new group to be created unnecessarily. The other consequence is that we require source code to be available for grouping. This works well in Python, as the Python SDK generally submits the source code along with the event, but we cannot use the same rule in C++, for instance, where the availability of the source code is not guaranteed and one release can come with source while another might not.
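
The Python behavior described above can be sketched as follows. This is an illustrative toy assuming a dict-based frame representation; Sentry's real grouping components and hashing differ:

```python
import hashlib


def python_frame_components(frame):
    """A Python frame contributes its module name, function name, and
    the source line with leading/trailing whitespace removed."""
    return (
        frame.get("module"),
        frame.get("function"),
        (frame.get("context_line") or "").strip(),
    )


def fingerprint_stacktrace(frames):
    """Fold every frame's components into a single stable hash."""
    parts = []
    for frame in frames:
        parts.extend(str(component) for component in python_frame_components(frame))
    return hashlib.md5("\x00".join(parts).encode()).hexdigest()
```

Note how whitespace-only changes to a line leave the fingerprint intact, while any change to the source text of the line produces a new one.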

On the other hand, for **JavaScript** we need to apply different rules. First of all, we currently have challenges determining a reliable function name due to limitations of source maps. This can at times cause minified function names to show up, which are unstable between builds. Because of this we instead feed in the `module` name, the `filename` (that is, the base name of the path only, converted to lowercase) as well as the `context-line`. This again is risky, as the `context-line` might slightly change over time. An additional complication with JavaScript is that the browser-supplied stack trace is not guaranteed to be particularly stable. Different browsers behave very differently when browser-native functions are involved. For instance, the use of `Array.forEach` can produce vastly different stack traces between browsers: some will show `Array.forEach` as a stack frame, some will not. As such, the grouping algorithm tries its very best to consistently discard frames that other browsers cannot produce. There are however limitations, and the same error can produce different stack traces in different browsers.
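
The JavaScript variant swaps the unreliable function name for a normalized filename. A minimal sketch of that substitution, with the same caveats as above (dict-based frames, hypothetical helper names):

```python
import posixpath


def js_frame_components(frame):
    """A JavaScript frame contributes the module name, the lowercased
    base name of the file path (instead of the function name), and the
    stripped context line."""
    filename = posixpath.basename(frame.get("abs_path") or "").lower()
    return (
        frame.get("module"),
        filename,
        (frame.get("context_line") or "").strip(),
    )
```

Lowercasing the base name keeps the contribution stable across builds that only differ in path casing or directory layout.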

The **native platforms** are the trickiest to get right. Here we are working with the limitations of the underlying debug information when creating stack traces. The first level of complication is that a natively compiled language is likely to inline source code into the calling function. On some platforms this will fundamentally change the available information. As an example, a Microsoft compiler will provide the full demangled name for a non-inlined function, but only the local function name for an inlined one (that's a very simplified explanation; the actual difference is more nuanced). Different compilers will also mangle and format names very differently. This can, for instance, cause the very same error to fingerprint very differently when compiled and run on Linux vs. macOS. Additionally, source code is generally not available, so we do not rely on it. We thus largely only feed `function` names into the grouping algorithm, with a lot of cleanup applied (e.g. we remove generics, parameters and return values from the demangled function name before feeding it to the grouping code).
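
The kind of cleanup mentioned above can be sketched with two regular expressions. These patterns are illustrative only (Sentry's actual normalization is more thorough, and nested template arguments would need more work):

```python
import re


def clean_native_function(name):
    """Strip the parameter list and template arguments from a demangled
    function name so the same function fingerprints alike across
    compilers that format these details differently."""
    # drop a trailing parameter list, e.g. "ns::f(int, char)" -> "ns::f"
    name = re.sub(r"\(.*\)$", "", name)
    # drop simple template arguments, e.g. "vector<int>::at" -> "vector::at"
    name = re.sub(r"<[^<>]*>", "", name)
    return name.strip()
```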

#### Frame Hiding

Because we can only create a single group based on the fingerprint, we try quite hard to eliminate unnecessary noise between stacks. This is to a large degree achieved by removing entire frames from the stack for grouping. There are two ways frames are removed: they are either removed from grouping entirely, or they are marked as "out of app", which means they contain code unrelated to the application the developer created. This means, for instance, that if you are using the Django framework, we will mark frames from the framework as not application code, which causes them to be "ignored" for grouping.

The difference between non-application code and code entirely removed from grouping is that the former still creates a secondary hash which we also associate with groups. However, because that hash contains "more information" than the other, it's generally never used for grouping except as a form of implied merge. To understand this better, consider a stack trace with four frames "A1 B1 A2 A3". In the beginning they are all considered in-app, so they all feed into the fingerprint `[A1, B1, A2, A3]`. At a later point, either an SDK update or a change in the configuration on the server marks `B1` as not in-app. If the grouping algorithm were to fully ignore the `B1` frame, it would now create a new hash which is not found on the existing groups, and a new group would be created. However, because we still create the full hash anyway, a new incoming event would still find the already existing group. The hashes created are `[A1, B1, A2, A3]` as well as `[A1, A2, A3]`. Likewise, if you later also remove `A3` from in-app, it would create the hashes `[A1, B1, A2, A3]` as well as `[A1, A2]`, and the first hash would still match an already existing group.
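
The example above can be reproduced with a small sketch that derives both hash variants from a list of `(frame, in_app)` pairs (a hypothetical helper, not Sentry's implementation):

```python
def grouping_hash_variants(frames):
    """Produce the 'system' variant (all frames) and, when it differs,
    the 'app' variant (in-app frames only). 'frames' is a list of
    (name, in_app) tuples."""
    system = [name for name, _ in frames]
    app = [name for name, in_app in frames if in_app]
    variants = [system]
    if app and app != system:
        variants.append(app)
    return variants
```

Because the full "system" variant is always produced, marking a frame as not in-app never orphans events from their existing group: the full hash still matches.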

#### Stack Lengths

Sentry at the moment feeds the entire stack into the grouping algorithm. There is a way to limit the number of contributing frames by setting a maximum number of frames to consider. This has a hypothetical advantage when different paths lead to the same bug, but has the consequence that very large groups can be created.

### Fallback Grouping

When a stack trace is unavailable, the system needs to fall back to some sort of alternative grouping method. The fallback is what we call "message based grouping", and it's a pretty limited method. We take the first line of the message and apply some cleanup logic. For instance, if numbers are encountered they are replaced by a static placeholder; the same goes for known timestamps, UUIDs and similar things. However, in many cases the source of these strings is impossible to clean up, so fallback grouping is very likely to create a high number of independent groups.
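
This kind of normalization can be sketched with a few substitutions. The exact patterns and placeholders Sentry uses differ; these are illustrative:

```python
import re


def normalize_message(message):
    """Take the first line of a message and replace volatile tokens
    (UUIDs, timestamps, numbers) with static placeholders."""
    line = message.splitlines()[0] if message else ""
    # UUIDs first, so their digit groups aren't caught by the number rule
    line = re.sub(
        r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}",
        "<uuid>",
        line,
    )
    # ISO-8601-like timestamps
    line = re.sub(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}", "<timestamp>", line)
    # remaining runs of digits
    line = re.sub(r"\d+", "<num>", line)
    return line
```

Messages built from values that cannot be recognized this way (user names, arbitrary payloads) still fan out into many independent groups, which is the limitation described above.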

### Theory on Issue Sources

Sentry has a general tendency to group different paths towards an issue separately. Even if the grouping algorithm were perfect, it would consider different ways of ending up at a bug to be independent problems. Take a hypothetical function `get_current_user`. Let's imagine this function has a bug and starts failing with a `DataConsistencyError` exception. If this function is used in many different places in the application, each of these callers will now create a different group. We can thus think of grouping as a problem of locating the source of the error. At any point the question can be asked whether the source of the error is "how we call a function" (caller error) or "in the function" (callee error). This decision is impossible to make in a general sense, but over time it can become easier to make.

## Paths Forward

The grouping system is tightly coupled to the workflow that issues drive. It is the creator of the groups, and as the creator it drives a big part of the user experience. If it were to create a single issue per event, or a single issue for all events, nothing in Sentry would properly function. It is thus the first lever we have for balancing the quality of the workflow.

### Improving AI-Enhanced Grouping

The introduction of AI-enhanced grouping has already improved the caller-vs-callee problem described above, but there is ongoing work to improve model accuracy, expand platform support, and reduce latency. Key areas include better handling of hybrid fingerprints, improving confidence thresholds, and training on broader datasets.

### Groups of Groups

The consequences of creating too many groups today are alert spam and the inability to work with multiple issues at once. If Sentry no longer alerted on all new groups, and tools existed to work across multiple groups, more opportunities would arise. In particular, the grouping algorithm could continue to simply fingerprint the stack trace, while a secondary process periodically sweeps up related fingerprints into a larger group. If we take the `get_current_user` example, the creation of 50 independent groups is not much of an issue if no alerts are fired. If after 5 minutes the system detected that they are in fact all closely related (e.g. the bug is "in `get_current_user`"), it could leave the 50 generated groups alone but create a new group that links them, hide or de-emphasize the individual groups in the UI, and let the user work with the larger group instead.
