How to Remove Duplicates from a List in Python
We want to remove duplicates from a list, or in other words, keep only unique values in our list.
duplicates = [0, 0, 0, 1, 1, 1, 2, 2, 2]
Using a for Loop
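The brute-force way to remove duplicates is a for loop that appends each value only if it is not already in the result. This is inefficient for long lists, because every membership check scans the result list from the start. It might look something like this.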
unique = []
for num in duplicates:
    if num not in unique:
        unique.append(num)
# [0, 1, 2]
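If the loop itself is worth keeping (for example, to add other logic per item), one possible tweak is to track already-seen values in a set, so each membership check no longer rescans a list. This is only a sketch: the name seen is an illustrative choice, and it assumes the items are hashable.

```python
duplicates = [0, 0, 0, 1, 1, 1, 2, 2, 2]

seen = set()   # values encountered so far (hypothetical helper name)
unique = []    # first occurrence of each value, in original order
for num in duplicates:
    if num not in seen:
        seen.add(num)
        unique.append(num)

print(unique)  # [0, 1, 2]
```

The result list still preserves the original order, since values are appended the first time they appear.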
Using set()
We can avoid this loop by simply converting the duplicates list into a set.
By definition, a set will only contain unique items.
We can then convert the set back into a list if needed.
unique = list(set(duplicates))
# [0, 1, 2]
The issue here is that a set is unordered, so the original list order is not guaranteed after the set-to-list conversion.
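To make the ordering caveat concrete, here is a small sketch with strings (the words list is made up for illustration). The set always holds the same values, but the order of the resulting list should not be relied on; for strings it can even change between interpreter runs because of hash randomization.

```python
words = ["pear", "apple", "pear", "fig", "apple"]

unique = list(set(words))

# The same three values are always present...
print(sorted(unique))  # ['apple', 'fig', 'pear']

# ...but the order of `unique` itself is arbitrary and may not match
# the order in which the values first appeared in `words`.
print(unique)
```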
So, how can we preserve order?
Using dict.fromkeys() to Preserve Order
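dict.fromkeys(list) will return a dictionary with the list items as keys and None as each value. Dictionaries do not allow duplicate keys, so the returned dictionary will remove the duplicates for us. Because dictionaries keep insertion order (guaranteed since Python 3.7), each item stays at the position of its first occurrence.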
dict.fromkeys(duplicates)
# {0: None, 1: None, 2: None}
We can then convert back to a list.
unique = list(dict.fromkeys(duplicates))
# [0, 1, 2]
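For reuse, the same idea could be wrapped in a small helper. The function name unique_in_order is hypothetical, and the sketch assumes Python 3.7+ and hashable items (dictionary keys must be hashable, so a list of lists would raise a TypeError).

```python
def unique_in_order(items):
    """Return items with duplicates removed, keeping first occurrences in order."""
    # Dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(items))


print(unique_in_order([3, 1, 3, 2, 1]))       # [3, 1, 2]
print(unique_in_order(["b", "a", "b", "c"]))  # ['b', 'a', 'c']
```

Unlike the set conversion, the input order here is not sorted, which makes it easy to see that order is preserved rather than coincidentally matching.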