-
Notifications
You must be signed in to change notification settings - Fork 48
Expand file tree
/
Copy pathlocal-algebra.pymd
More file actions
72 lines (50 loc) · 4.05 KB
/
local-algebra.pymd
File metadata and controls
72 lines (50 loc) · 4.05 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
# Local Map Algebra
```python, setup, echo=False
from IPython.core.display import display
from pyrasterframes.utils import create_rf_spark_session
from pyrasterframes.rasterfunctions import rf_normalized_difference, rf_tile, rf_dimensions, rf_extent, rf_tile_mean
import pyrasterframes.rf_ipython
import pandas as pd
import os
spark = create_rf_spark_session()
```
[Local map algebra](https://gisgeography.com/map-algebra-global-zonal-focal-local/) raster operations are element-wise operations on a single _tile_, between a _tile_ and a scalar, between two _tiles_, or among many _tiles_. These operations are common in processing of earth observation imagery and other image data.
<img src="https://gisgeography.com/wp-content/uploads/2015/05/Local-Operation-Raster.png" alt="local op" style="width:200px;"/>
Credit: [GISGeography](https://gisgeography.com)
## Computing NDVI
Here is an example of computing the Normalized Differential Vegetation Index (NDVI). NDVI is a vegetation index which emphasizes differences in relative biomass and vegetation health. The term _index_ in Earth observation means a combination of many raster bands into a single band that highlights a phenomenon of interest. Various indices have proven useful visual tools and frequently appear as features in machine learning models using Earth observation data.
> NDVI is often used worldwide to monitor drought, monitor and predict agricultural production, assist in predicting hazardous fire zones, and map desert encroachment. The NDVI is preferred for global vegetation monitoring because it helps to compensate for changing illumination conditions, surface slope, aspect, and other extraneous factors (Lillesand. _Remote sensing and image interpretation_. 2004)
We will apply the @ref:[catalog pattern](raster-catalogs.md) for defining the data we wish to process. To compute NDVI we need to compute local algebra on the *red* and *near infrared* (nir) bands:
nir - red
NDVI = ---------
nir + red
This form of `(x - y) / (x + y)` is common in remote sensing and is called a normalized difference. It is used with other band pairs to highlight water, snow, and other phenomena.
```python, read_rasters
from pyspark.sql import Row
uri_pattern = 'https://rasterframes.s3.amazonaws.com/samples/luray_snp/B0{}.tif'
catalog_df = spark.createDataFrame([
Row(red=uri_pattern.format(4), nir=uri_pattern.format(8))
])
df = spark.read.raster(
catalog_df,
catalog_col_names=['red', 'nir']
)
df.printSchema()
```
Observe how the bands we need to use to compute the index are arranged as columns of our resulting DataFrame. The rows or observations are various spatial extents within the entire coverage area.
RasterFrames provides a wide variety of local map algebra functions. There are several different broad categories, based on how many _tiles_ the function takes as input:
* A function on a single _tile_ is a unary operation; example: @ref:[rf_log](reference.md#rf-log);
* A function on two _tiles_ is a binary operation; example: @ref:[rf_local_multiply](reference.md#rf-local-multiply);
* A function on a _tile_ and a scalar is a binary operation; example: @ref:[rf_local_less](reference.md#rf-local-less); or
* A function on many _tiles_ is a n-ary operation; example: @ref:[rf_agg_local_min](reference.md#rf-agg-local-min)
We can express the normalized difference with a combination of `rf_local_divide`, `rf_local_subtract`, and `rf_local_add`. Since the normalized difference is so common, there is a convenience method `rf_normalized_difference`, which we use in this example. We will append a new column to the DataFrame, which will apply the map alegbra function to each row.
```python, compute_ndvi
df = df.withColumn('ndvi', rf_normalized_difference(df.nir, df.red))
df.printSchema()
```
We can inspect a sample of the data. Yellow indicates very healthy vegetation, and purple represents bare soil or impervious surfaces.
```python, show_tile
t = df.select(rf_tile('ndvi').alias('ndvi')).first()['ndvi']
display(t)
```
We continue examining NDVI in the @ref:[time series](time-series.md) section.