How to Group By Alias in an AWS Athena SQL Query


Let’s see how we can perform a GROUP BY on an alias in an AWS Athena query.

Error: alias Column 'col' cannot be resolved

We might try to run a query that looks something like this:

SELECT 
  substr(old_col, 2) as new_col, 
  COUNT(*)
FROM table_name 
GROUP BY new_col

But, we’ll run into the following error when trying to use this alias in the group by:

alias Column 'new_col' cannot be resolved

There are several ways we can go above circumventing an issue like this.

1. Using the WITH clause

We can use the WITH clause to define subqueries, each one defining a temporary table that we can use in the following SELECT query.

WITH new_table_name 
AS (SELECT substr(old_col, 2) as new_col FROM table_name)
SELECT new_col FROM new_table_name
GROUP BY new_col

2. Using a subquery

We can also explicitly define the subquery in the FROM clause.

SELECT new_col, COUNT(*)
FROM (SELECT substr(old_col, 2) as new_col FROM table_name)
GROUP BY new_col

3. Using GROUP BY numberings

We can also reference the alias based on its index in the SELECT statement.

This index is 1-indexed (not 0-indexed), so in this case, our alias exists in the first index of the SELECT statement. If we had switched the COUNT(*) with our alias, we would have had to GROUP BY 2.

SELECT 
  substr(old_col, 2) as new_col, 
  COUNT(*)
FROM table_name
GROUP BY 1

4. Using a LATERAL JOIN

Using a LATERAL JOIN, our subqueries can reference columns provided by preceding FROM items.

SELECT new_col, COUNT(*) FROM table_name
LATERAL JOIN (SELECT substr(old_col, 2) as new_col)
GROUP BY new_col

Grouping by the number may also yield some performance benefits.

5. Using copy-paste

Finally, another option is to explicitly re-use our expression in the GROUP BY clause.

SELECT 
  substr(old_col, 2) as new_col, 
  COUNT(*)
FROM table_name
GROUP BY substr(old_col, 2)