Validating SQL Query Changes: How to Compare Results Effectively

James Gnanasekaran
3 min readAug 30, 2024

--

“In the world of SQL, even the smallest change can have a significant impact. Ensuring that your modifications are accurate and don’t introduce new issues is essential for maintaining the integrity of your database.

A few years ago, I discovered Beyond Compare, a tool that quickly became indispensable for me. I’ve used it extensively to compare code, data tables, and entire folders. It’s especially useful for catching subtle issues, like non-printable characters, that a simple visual check might miss.

However, Beyond Compare is a licensed tool, and not every company may have access to it. So, what do you do if you need to ensure your changes are correct but don’t have a dedicated comparison tool?”

You can compare the records from the original and modified CTE queries directly in SQL itself. Here’s how you can do it:

1. Using EXCEPT or MINUS (depending on the database)

  • You can use the EXCEPT (or MINUS in some databases like Oracle) operator to compare the results of two queries. This operator returns all rows from the first query that are not in the second query.
-- Differences in records present in original but not in modified
WITH original_cte AS (
-- Original CTE query here
),
modified_cte AS (
-- Modified CTE query here
)
SELECT *
FROM original_cte
EXCEPT
SELECT *
FROM modified_cte;

2. Using FULL OUTER JOIN

  • You can join the original and modified queries on all columns and filter out rows where there is a difference.
WITH original_cte AS (
-- Original CTE query here
),
modified_cte AS (
-- Modified CTE query here
)
SELECT *
FROM original_cte o
FULL OUTER JOIN modified_cte m
ON o.col1 = m.col1
AND o.col2 = m.col2
AND ... -- Continue for all columns you want to compare
WHERE o.col1 IS NULL OR m.col1 IS NULL
OR o.col2 IS NULL OR m.col2 IS NULL
OR o.col1 <> m.col1
OR o.col2 <> m.col2
-- Continue for all columns you want to compare

3. Using UNION ALL with a GROUP BY

  • Combine both the original and modified CTEs using UNION ALL and group the results. Rows that appear twice (once in each query) will have a count of 2, while rows that appear only once will have a count of 1. Filtering for counts of 1 will give you the differences.
WITH original_cte AS (
-- Original CTE query here
),
modified_cte AS (
-- Modified CTE query here
),
combined AS (
SELECT *, 'original' AS source FROM original_cte
UNION ALL
SELECT *, 'modified' AS source FROM modified_cte
)
SELECT col1, col2, ... -- Select the relevant columns
FROM combined
GROUP BY col1, col2, ...
HAVING COUNT(*) = 1;

4. Using ROW_NUMBER for Row-Level Comparison

If you want to compare rows based on their position/order, you can use the ROW_NUMBER function to assign row numbers and then compare them.

WITH original_cte AS (
-- Original CTE query here
),
modified_cte AS (
-- Modified CTE query here
),
original_numbered AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rn FROM original_cte
),
modified_numbered AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rn FROM modified_cte
)
SELECT *
FROM original_numbered o
FULL OUTER JOIN modified_numbered m
ON o.rn = m.rn
WHERE o.col1 IS NULL OR m.col1 IS NULL
OR o.col1 <> m.col1
-- Continue for all columns you want to compare

Explanation:

  • EXCEPT/MINUS: Useful for finding rows that are in one result but not the other.
  • FULL OUTER JOIN: Compares row by row and identifies differences across all columns.
  • UNION ALL with GROUP BY: Identifies rows that do not match perfectly between the two queries.
  • ROW_NUMBER: Compares rows based on their order.

Choose the approach that best suits your needs.

--

--

James Gnanasekaran

Lifelong learner, Interested in solving business problems with technology, with a specific interest in CV and NLP (https://www.linkedin.com/in/jamespaultg/)