How to Check If a Column Exists in a Spark Dataset in Java

Published Jul 12, 2022

How can we check if a column exists in a Spark Dataset in Java?

In the Java API, df.columns() returns a String[] containing the top-level column names, so any standard way of checking whether an array contains a value will work.

1. Using Arrays.asList() and contains()

Let’s convert the array into a list and use contains().

import java.util.Arrays;

String columnToCheck = "maybeColumn";
boolean exists = Arrays.asList(df.columns()).contains(columnToCheck);
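
If we need this check in several places, it may be worth wrapping it in a small helper. The following is a minimal sketch, not part of the Spark API: the class name ColumnCheck and the method hasColumn are hypothetical, and a plain String[] stands in for df.columns() so the example runs without a Spark session.

```java
import java.util.Arrays;

// Minimal sketch: ColumnCheck and hasColumn are hypothetical names, not part of Spark.
final class ColumnCheck {

    // Returns true when `name` is one of the given column names (exact, case-sensitive match).
    static boolean hasColumn(String[] columns, String name) {
        return Arrays.asList(columns).contains(name);
    }

    public static void main(String[] args) {
        // Stand-in for df.columns(); with a real Dataset<Row>, pass df.columns() instead.
        String[] columns = {"id", "name", "maybeColumn"};

        System.out.println(hasColumn(columns, "maybeColumn")); // true
        System.out.println(hasColumn(columns, "missing"));     // false
    }
}
```

With a real Dataset, the call would look like hasColumn(df.columns(), columnToCheck).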

2. Using Arrays.stream() and anyMatch()

We can also create a stream of the elements and run anyMatch() on that stream.

import java.util.Arrays;

String columnToCheck = "maybeColumn";
boolean exists = Arrays.stream(df.columns()).anyMatch(columnToCheck::equals);
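
One reason to reach for anyMatch() is that the predicate is easy to relax. Spark resolves column names case-insensitively by default (the spark.sql.caseSensitive setting is false unless changed), so an exact equals() check can report a column as missing when only the casing differs. The sketch below assumes a case-insensitive match is what we want; ColumnMatch and hasColumnIgnoreCase are hypothetical names, and a plain String[] again stands in for df.columns().

```java
import java.util.Arrays;

// Minimal sketch: ColumnMatch and hasColumnIgnoreCase are hypothetical names, not part of Spark.
final class ColumnMatch {

    // Returns true when any column name equals `name`, ignoring case.
    static boolean hasColumnIgnoreCase(String[] columns, String name) {
        return Arrays.stream(columns).anyMatch(name::equalsIgnoreCase);
    }

    public static void main(String[] args) {
        // Stand-in for df.columns(); note the capital "M".
        String[] columns = {"id", "MaybeColumn"};

        // Case-insensitive predicate finds the column.
        System.out.println(hasColumnIgnoreCase(columns, "maybecolumn")); // true

        // The exact-match approach from section 1 does not.
        System.out.println(Arrays.asList(columns).contains("maybecolumn")); // false
    }
}
```

If the session is configured with spark.sql.caseSensitive set to true, the exact equals() check is the better fit.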