How to Group By Alias in an AWS Athena SQL Query
Let’s see how we can perform a GROUP BY
on an alias in an AWS Athena query.
Error: alias Column 'col' cannot be resolved
We might try to run a query that looks something like this:
SELECT
substr(old_col, 2) as new_col,
COUNT(*)
FROM table_name
GROUP BY new_col
But, we’ll run into the following error when trying to use this alias in the group by:
alias Column 'new_col' cannot be resolved
There are several ways we can go above circumventing an issue like this.
1. Using the WITH
clause
We can use the WITH
clause to define subqueries, each one defining a temporary table that we can use in the following SELECT
query.
WITH new_table_name
AS (SELECT substr(old_col, 2) as new_col FROM table_name)
SELECT new_col FROM new_table_name
GROUP BY new_col
2. Using a subquery
We can also explicitly define the subquery in the FROM
clause.
SELECT new_col, COUNT(*)
FROM (SELECT substr(old_col, 2) as new_col FROM table_name)
GROUP BY new_col
3. Using GROUP BY
numberings
We can also reference the alias based on its index in the SELECT
statement.
This index is 1
-indexed (not 0
-indexed), so in this case, our alias exists in the first index of the SELECT
statement. If we had switched the COUNT(*)
with our alias, we would have had to GROUP BY 2
.
SELECT
substr(old_col, 2) as new_col,
COUNT(*)
FROM table_name
GROUP BY 1
4. Using a LATERAL JOIN
Using a LATERAL JOIN
, our subqueries can reference columns provided by preceding FROM
items.
SELECT new_col, COUNT(*) FROM table_name
LATERAL JOIN (SELECT substr(old_col, 2) as new_col)
GROUP BY new_col
Grouping by the number may also yield some performance benefits.
5. Using copy-paste
Finally, another option is to explicitly re-use our expression in the GROUP BY
clause.
SELECT
substr(old_col, 2) as new_col,
COUNT(*)
FROM table_name
GROUP BY substr(old_col, 2)