SQL

Lesson 05

Aggregation and grouping

Group by the same dimensions you want to report, and make aggregate meaning clear.

Good Code

review-metrics.sql
SELECT
  DATE_TRUNC('day', submitted_at) AS review_day,
  status,
  COUNT(*) AS review_count
FROM code_reviews
WHERE submitted_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY review_day, status
ORDER BY review_day DESC, status;

Bad Code

review-metrics.sql
SELECT
  submitted_at,
  status,
  COUNT(*) AS review_count
FROM code_reviews
WHERE submitted_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY status
ORDER BY submitted_at DESC;

Review Notes

What to review

Good Code

The good version reports counts by day and status, so both dimensions are selected, grouped, and ordered intentionally.

Bad Code

The bad version mixes a raw timestamp with an aggregate grouped only by status. Depending on the database, it either fails or returns misleading values.

Takeaways

  • Every non-aggregated selected value should be part of the grouping intent.