Group by

Roll up rows by one or more keys, producing aggregates (count, sum, average, min, max) per group.

intermediate

A group-by step rolls up rows by one or more key fields, producing one output row per group with computed aggregates.

Use after a spatial-join, after a filter, or directly on a source layer when you want a summary.

Inputs

Source layer.
Group keys: one or more fields whose distinct combinations define the groups.
Aggregates, each a (function, field, output name): - count(*): number of rows per group. - sum(field): numeric total. - avg(field): average. - min / max(field): extremes. - first / last(field, ordered by another field): first or last value when ordered. - collect_into_array(field): JSON array of values.
Geometry strategy (for spatial outputs): - Drop: output rows have no geometry (non-spatial output). - Bounding box: union bbox of each group's geometries. - Convex hull: convex hull of each group's geometries. - First: the first row's geometry (cheap, semi-arbitrary). - Dissolve: union of geometries (use Dissolve instead if that's the only thing you want; cheaper).

One row per distinct combination of group keys. Schema is the group keys + the aggregates. Geometry per the chosen strategy.

Hydrants per parcel (after a spatial-join). Group key: parcel_id. Aggregate: count(*) as hydrant_count. Geometry strategy: drop (non-spatial result) or join back to parcels for the spatial result.
Average inspection score per neighborhood. Group key: neighborhood. Aggregate: avg(score). Geometry: dissolve to neighborhood boundaries.
Number of incidents per category, monthly. Two group keys: category, month_start. Aggregate: count(*). Geometry: drop.

NULL is a group. Rows with NULL in a group key are grouped together under "NULL". Filter them out first if you don't want that.
Large groups are slow. The default geometry strategy (first) is cheap; dissolve on a 100k-feature group can be minutes. Pick by what the downstream consumer needs.
Cardinality drops sharply. The output row count is the number of distinct group-key combinations. A million-row layer grouped by month_start produces ~12 rows per year of data.