How to Get Timestamp from HBase Row or Column in Java
How can we retrieve the timestamp for a row or column in an HBase table?
I recently needed to obtain the time of insertion (or update) of a single row in HBase.
Suppose we’re running a scan on our table and handling each result as we iterate over the scanner.
Scan scan = new Scan();
try (ResultScanner scanner = hbaseTable.getScanner(scan)) {
    for (Result result = scanner.next(); result != null; result = scanner.next()) {
        // Do something with `result`
    }
} catch (IOException e) {
    e.printStackTrace();
}
Retrieve latest timestamp of row
If the timestamps for all cells in a row are the same, we can read the timestamp of the first cell returned by rawCells().
long ts = result.rawCells()[0].getTimestamp();
From the HBase API documentation, rawCells() returns the array of Cell objects that back this Result instance.
A cell in HBase is a single unit of storage, uniquely represented by row, column family, column qualifier, timestamp, and type.
For instance, a single column may be associated with multiple cells if it was inserted and then updated, with each write creating a cell that has its own timestamp (version).
If a column has multiple versions, its cells are returned from rawCells() with the newer timestamp first.
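When the cells in a row do not all share one timestamp, the first cell is not guaranteed to be the newest, so one option is to take the maximum over every cell. The sketch below is only an illustration: the class name RowTimestamps and the method latestTimestamp are hypothetical, and the timestamp extractor is passed in as a parameter so the block compiles without HBase on the classpath. It also treats a null array as "no cells", on the assumption that rawCells() can return null for an empty Result.

```java
import java.util.Arrays;
import java.util.OptionalLong;
import java.util.function.ToLongFunction;

public class RowTimestamps {

    // Returns the largest timestamp among the given cells, or empty if there are none.
    // The extractor is a parameter so this sketch does not depend on HBase types;
    // with HBase the extractor would be Cell::getTimestamp.
    public static <T> OptionalLong latestTimestamp(T[] cells, ToLongFunction<T> timestampOf) {
        if (cells == null) {
            return OptionalLong.empty();
        }
        return Arrays.stream(cells).mapToLong(timestampOf).max();
    }

    public static void main(String[] args) {
        // Stand-in for result.rawCells(): boxed timestamps instead of Cell objects.
        Long[] fakeCells = {1700000000000L, 1700000005000L, 1700000001000L};
        OptionalLong latest = latestTimestamp(fakeCells, Long::longValue);
        System.out.println(latest.getAsLong()); // prints 1700000005000
    }
}
```

Inside the scan loop, the equivalent call would be RowTimestamps.latestTimestamp(result.rawCells(), Cell::getTimestamp).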
Retrieve latest timestamp of column
If we need the latest timestamp for a specific column, we can use getColumnLatestCell().
byte[] CF = Bytes.toBytes("column_family");
byte[] CQ = Bytes.toBytes("column_qualifier");
long ts = result.getColumnLatestCell(CF, CQ).getTimestamp();
From the HBase API documentation, getColumnLatestCell() returns the Cell with the most recent timestamp for a given column family and column qualifier, or null if the result holds no value for that column.
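The long we get back is easier to read once converted to a date. By default HBase stamps a cell with the region server's clock in milliseconds since the Unix epoch, so a minimal sketch of the conversion could look like the following. The class and method names are hypothetical, and the conversion only makes sense under the assumption that the writer did not supply a custom timestamp with a different meaning.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

public class CellTimestampFormat {

    // Interprets a cell timestamp as milliseconds since the Unix epoch.
    // Assumption: the writer did not set a custom timestamp with other semantics.
    public static Instant toInstant(long cellTimestamp) {
        return Instant.ofEpochMilli(cellTimestamp);
    }

    // Formats the timestamp as an ISO-8601 string in UTC.
    public static String toIsoUtc(long cellTimestamp) {
        return DateTimeFormatter.ISO_INSTANT.format(toInstant(cellTimestamp));
    }

    public static void main(String[] args) {
        long ts = 0L; // stand-in for a value returned by getTimestamp()
        System.out.println(toIsoUtc(ts)); // prints 1970-01-01T00:00:00Z
    }
}
```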
Retrieve any timestamp of column
If we want access to all timestamps for a specific column, we can use getColumnCells().
byte[] CF = Bytes.toBytes("column_family");
byte[] CQ = Bytes.toBytes("column_qualifier");
List<Cell> cells = result.getColumnCells(CF, CQ);
int index = 1;
long ts = cells.get(index).getTimestamp();
From the HBase API documentation, getColumnCells() returns a List of Cell objects for the given column.
The most recent timestamp will be at index 0 of this list.
The second most recent timestamp will be at index 1, and so on.
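One caveat with indexing into this list: a Scan returns only the newest version of each column unless more are requested (in HBase 2.x, to my knowledge, via scan.readVersions(n) or scan.readAllVersions()), and the column family must be configured to keep more than one version. If only one version comes back, cells.get(1) throws an IndexOutOfBoundsException. The sketch below shows one way to guard the lookup. The names ColumnVersions and timestampAt are hypothetical, and the extractor is again a parameter so the block compiles without HBase.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;
import java.util.function.ToLongFunction;

public class ColumnVersions {

    // Returns the timestamp of the version at `index` (0 = newest), or empty when
    // the list is null or holds fewer than index + 1 versions.
    public static <T> OptionalLong timestampAt(List<T> cells, int index, ToLongFunction<T> timestampOf) {
        if (cells == null || index < 0 || index >= cells.size()) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(timestampOf.applyAsLong(cells.get(index)));
    }

    public static void main(String[] args) {
        // Stand-in for result.getColumnCells(CF, CQ): timestamps ordered newest first.
        List<Long> fakeCells = Arrays.asList(300L, 200L);
        System.out.println(timestampAt(fakeCells, 1, Long::longValue).getAsLong()); // prints 200
        System.out.println(timestampAt(fakeCells, 2, Long::longValue).isPresent()); // prints false
    }
}
```

With a real Result, the call would be ColumnVersions.timestampAt(result.getColumnCells(CF, CQ), index, Cell::getTimestamp).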