How to Get the First Row Meeting a Condition in Pandas


How can we get the first row in a Pandas DataFrame that meets some condition or criteria?

Let’s say we have this DataFrame df.

     id  year  period  value
0  000e  1976     M01    7.3
1  000e  1976     M02    7.3
2  000e  1976     M03    7.3
3  000f  1976     M04    720
4  000f  1976     M05    710
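To follow along, the sample data can be rebuilt with a sketch like the one below. The dtypes are an assumption (id and period as strings, year as an integer, value as a float), since the printed table does not show them.

```python
import pandas as pd

# Rebuild the sample DataFrame; the dtypes here are assumed, not taken from the source data.
df = pd.DataFrame({
    "id": ["000e", "000e", "000e", "000f", "000f"],
    "year": [1976, 1976, 1976, 1976, 1976],
    "period": ["M01", "M02", "M03", "M04", "M05"],
    "value": [7.3, 7.3, 7.3, 720.0, 710.0],
})
```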

Suppose we want the index of the first row whose id ends with an f (so we want an index of 3).

Create the filtering logic

Let’s create our filtering logic to get all rows whose id ends with f.

df[df.id.str.endswith('f')]
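It can help to look at the two steps inside that expression separately: the condition produces a boolean mask, and indexing with the mask keeps only the rows marked True. The snippet below is a minimal sketch on a cut-down copy of the sample data; the names mask and matches are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["000e", "000e", "000e", "000f", "000f"],
    "value": [7.3, 7.3, 7.3, 720.0, 710.0],
})

# Step 1: one boolean per row, True where the id ends with "f".
mask = df["id"].str.endswith("f")

# Step 2: keep only the rows where the mask is True; original index labels are preserved.
matches = df[mask]
```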

Get the index

Using index

We can get the row index using .index[0].

index = df[df.id.str.endswith('f')].index[0]
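One caveat: if no row meets the condition, the filtered DataFrame is empty and .index[0] raises an IndexError. A small guard avoids that. The helper name first_matching_index below is hypothetical, and returning None for "no match" is just one possible convention.

```python
import pandas as pd

def first_matching_index(df, mask):
    """Return the index label of the first row where mask is True, or None if there is none."""
    matches = df[mask]
    if matches.empty:
        return None  # nothing matched, so there is no first row
    return matches.index[0]

df = pd.DataFrame({"id": ["000e", "000e", "000e", "000f", "000f"]})

found = first_matching_index(df, df["id"].str.endswith("f"))
missing = first_matching_index(df, df["id"].str.endswith("z"))
```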

Using iloc

We could also start from iloc[0], which selects the first matching row rather than its index.

row = df[df.id.str.endswith('f')].iloc[0]
id                000f
year              1976
period             M04
value              720
Name: 4, dtype: object

This gives us the first row that meets our condition as a Series, not its index. The row’s index label is stored in its name attribute.

index = df[df.id.str.endswith('f')].iloc[0].name

Get all rows until that index

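If we wanted to, we could get all rows before the index that we obtained earlier. Note that iloc slices by position, so this only works as expected because df has the default integer index, where labels and positions coincide.

df.iloc[:index, :]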
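When the DataFrame does not have the default 0, 1, 2, ... index, the label returned by .index[0] is not a valid position for iloc. One way to handle this, sketched below, is to convert the label to a position with Index.get_loc first. The index labels 10 to 40 are made up for illustration, and the sketch assumes the index is unique (get_loc returns a single integer only in that case).

```python
import pandas as pd

df = pd.DataFrame(
    {"id": ["000e", "000e", "000f", "000f"]},
    index=[10, 20, 30, 40],  # non-default labels, chosen only for illustration
)

mask = df["id"].str.endswith("f")
label = df[mask].index[0]            # index label of the first match
position = df.index.get_loc(label)   # integer position of that label (unique index assumed)

# Slice by position to get every row before the first match.
before = df.iloc[:position]
```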

Alternative approaches

If we’re working with a large DataFrame, it might be wasteful to apply a filter on the entire DataFrame just to extract the first row.

Ideally, we want to return the first row that meets the criteria and stop there, without scanning the rows that come after it.
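A middle ground is to skip building the filtered DataFrame and ask the boolean mask directly for its first True value with idxmax. This is only a sketch: the condition is still evaluated on every row, so it saves the copy of the matching rows rather than the scan itself. The mask.any() check matters because idxmax on an all-False mask would return the first label instead of signalling "no match".

```python
import pandas as pd

df = pd.DataFrame({"id": ["000e", "000e", "000e", "000f", "000f"]})

mask = df["id"].str.endswith("f")

# idxmax returns the label of the first maximum; for booleans that is the first True.
first_label = mask.idxmax() if mask.any() else None

# With a condition that matches nothing, the guard returns None.
no_mask = df["id"].str.endswith("z")
no_label = no_mask.idxmax() if no_mask.any() else None
```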

If we know that the row meeting the criteria will be one of the first ~10k rows, then a simple for loop might be more performant than the original solution.

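def get_first_row_with_condition(condition, df):
  # Check rows in order and stop at the first one that satisfies the condition.
  for i in range(len(df)):
    if condition(df.iloc[i]):
      return i  # position of the first matching row
  return None  # no row matched (or df is empty)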

Then, we can use this function like so:

index = get_first_row_with_condition(lambda row: row.id.endswith('f'), df)
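When the condition depends on a single column, a variant of the same early-exit idea is to iterate over that column's values instead of building a row with iloc on every step, which tends to be slow. The sketch below uses next with a generator so the scan stops at the first match. The helper name first_position_where is hypothetical, and it returns a position, not an index label.

```python
import pandas as pd

def first_position_where(values, predicate):
    """Return the position of the first value satisfying predicate, or None if none does."""
    # The generator is consumed lazily, so iteration stops at the first match.
    return next((i for i, v in enumerate(values) if predicate(v)), None)

df = pd.DataFrame({"id": ["000e", "000e", "000e", "000f", "000f"]})

position = first_position_where(df["id"], lambda s: s.endswith("f"))
not_found = first_position_where(df["id"], lambda s: s.endswith("z"))
```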