Local Analysis & Writing raster data by Ravi Bhandari

Geoprocessing with Python: Local Analysis in Raster Data

Introduction to the Session

  • The session is the fifth in a course on geoprocessing using Python and machine learning fundamentals, focusing on raster data processing.
  • Previous lectures covered reading and visualizing raster data, including GeoTIFF and HDF images, as well as creating animations from 3D raster data.

Understanding Local Analysis

  • The discussion introduces local analysis within image processing, categorizing operations into four types: local, focal (neighborhood), zonal, and global analysis.

Definition of Local Analysis

  • Local analysis involves applying operations to each pixel individually; output values at specific locations depend solely on input values at those same locations.
  • For example, if analyzing pixel (10, 10), its output will be determined only by its corresponding input value without considering neighboring pixels.

Comparison with Other Analyses

  • Focal or neighborhood analysis considers surrounding pixels' values for determining an output. This can involve various neighborhood sizes (e.g., 3x3).
  • Zonal analysis operates over defined zones within the image where outputs are based on all pixels in that zone rather than individual ones.
  • Global analysis takes into account all pixels across the entire image for calculations like maximum rainfall or temperature.
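The difference between the four operation types can be sketched with NumPy. This is only an illustration: the 3x3 array and the zone labels below are made up and do not come from the lecture.

```python
import numpy as np

# Hypothetical 3x3 image and zone labels (illustrative values only).
image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]], dtype=np.float64)
zones = np.array([[0, 0, 1],
                  [0, 1, 1],
                  [2, 2, 2]])

# Local: each output pixel depends only on the input pixel at the same location.
local_out = image * 2.0

# Focal: the output for the centre pixel (1, 1) is the mean of its 3x3 neighbourhood.
focal_center = image[0:3, 0:3].mean()

# Zonal: one mean value per zone, computed from all pixels carrying that zone label.
zone_sums = np.bincount(zones.ravel(), weights=image.ravel())
zone_counts = np.bincount(zones.ravel())
zonal_means = zone_sums / zone_counts

# Global: a single value computed from every pixel in the image.
global_max = image.max()

print(local_out[1, 1], focal_center, zonal_means, global_max)
```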

Focus on Local Analysis Techniques

  • Today's focus is specifically on local analysis methods where computations are restricted to individual pixel coordinates without neighbor influence.

Example of Local Analysis: NDVI Calculation

  • A typical application of local analysis is calculating NDVI (Normalized Difference Vegetation Index), which assesses green vegetation presence using remote sensing measurements.
  • NDVI helps identify live green vegetation in satellite images by analyzing specific pixel values independently.

Understanding NDVI Calculation

What is NDVI?

  • NDVI (Normalized Difference Vegetation Index) is calculated for every pixel in an image to create an NDVI map. The formula involves reflectance values from the near-infrared (NIR) and red bands of the spectrum.

How is NDVI Calculated?

  • The formula for calculating NDVI is:

\[
\text{NDVI} = \frac{\text{NIR} - \text{Red}}{\text{NIR} + \text{Red}}
\]

This operation is applied to each pixel individually, producing one NDVI value for every pixel in the input image.

Interpretation of NDVI Values

  • The range of NDVI values spans from -1 to +1:
  • Healthy plants absorb most visible red light and reflect NIR light, leading to high NDVI values close to +1.
  • Areas with no vegetation yield low or negative values, indicating soil, dead plants, water, clouds, or snow. Values near zero typically indicate bare soil.

Practical Application of NDVI Calculation

  • To calculate NDVI from an image using Python libraries like NumPy:
  • Read the red and NIR band data either as separate images or from a multiband raster.
  • Perform element-wise operations on these arrays using NumPy functions. For example: NDVI = (NIR - Red) / (NIR + Red).

Error Handling in Image Processing

  • When reading images with libraries such as GDAL, it’s crucial to enable exception handling:
  • If errors occur while reading files without this feature enabled, users may not receive any error messages.
  • Proper exception handling allows users to be informed about issues like missing files during processing.

Understanding Unsigned Integers and NDVI Calculation

Unsigned Integers and Their Limitations

  • An unsigned integer cannot store negative numbers as it lacks a sign bit, meaning all values are represented as non-negative.
  • Operations on unsigned integers yield results that are also unsigned, so when the true result is negative (e.g., nir - red where red exceeds nir) the value wraps around to a large positive number.
  • To allow negative results, the speaker converts the unsigned 16-bit integer bands to floating-point numbers before calculating.

NDVI Calculation Process

  • The conversion of red and nir bands into floating-point allows for proper representation of NDVI values.
  • The NDVI formula is applied element-wise: (nir - red) / (nir + red), enabling local analysis across image pixels.
  • The resulting NDVI values range from -1 to 1, indicating different vegetation levels in the analyzed area.
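The wrap-around problem and the floating-point fix can be sketched as follows. The two-pixel arrays are made-up values, and the guard against division by zero is an addition that the lecture summary does not mention.

```python
import numpy as np

# Hypothetical 16-bit unsigned band values (two pixels each).
red = np.array([[1000, 3000]], dtype=np.uint16)
nir = np.array([[4000, 1000]], dtype=np.uint16)

# Unsigned subtraction wraps around: 1000 - 3000 becomes 63536, not -2000.
wrapped = nir - red

# Convert to floating point first so negative differences are representable.
red_f = red.astype(np.float32)
nir_f = nir.astype(np.float32)

numerator = nir_f - red_f
denominator = nir_f + red_f

# Element-wise NDVI; pixels where the denominator is zero are left as NaN.
ndvi = np.divide(
    numerator,
    denominator,
    out=np.full_like(numerator, np.nan),
    where=denominator != 0,
)

print(wrapped)
print(ndvi)
```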

Visualization Techniques for NDVI

  • The calculated NDVI image shows bright areas representing vegetation and darker areas indicating non-vegetation like soil or water.
  • A color map can enhance visualization; for instance, using a red-yellow-green scheme where higher NDVI values appear green.
  • Custom color maps can be created to represent specific ranges of NDVI values visually.
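A possible way to produce both kinds of display with Matplotlib is sketched below. The NDVI array is synthetic, and the class boundaries and colors of the custom map are illustrative assumptions rather than the values used in the lecture.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import BoundaryNorm, ListedColormap

# Synthetic NDVI values spanning the full -1..+1 range.
ndvi = np.linspace(-1.0, 1.0, 100, dtype=np.float32).reshape(10, 10)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))

# Built-in red-yellow-green map: high NDVI appears green.
im1 = ax1.imshow(ndvi, cmap="RdYlGn", vmin=-1, vmax=1)
ax1.set_title("NDVI (RdYlGn)")
fig.colorbar(im1, ax=ax1)

# Custom map: one color per NDVI range (boundaries are assumptions).
class_bounds = [-1.0, 0.0, 0.2, 0.5, 1.0]
class_colors = ["blue", "tan", "yellowgreen", "darkgreen"]
custom_cmap = ListedColormap(class_colors)
custom_norm = BoundaryNorm(class_bounds, custom_cmap.N)

im2 = ax2.imshow(ndvi, cmap=custom_cmap, norm=custom_norm)
ax2.set_title("NDVI (custom classes)")
fig.colorbar(im2, ax=ax2, ticks=class_bounds)
```

In an interactive session, plt.show() or fig.savefig(...) would display or store the figure.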

Introduction to NDWI

  • Similar to NDVI, the Normalized Difference Water Index (NDWI) helps delineate open water features in satellite imagery.
  • It leverages low reflectance characteristics of water bodies in visible to infrared wavelengths for effective extraction from images.

Calculating and Interpreting NDWI

  • The formula for calculating NDWI is (green - NIR)/(green + NIR), focusing on maximizing water feature visibility while minimizing other land types' reflectance.
  • In the resulting images, positive values indicate water features while zero or negative values suggest soil or vegetation presence.
  • Because NDWI is bounded between -1 and +1, larger positive values signify water pixels; values at or below zero may indicate soil or rock.

Color Mapping and Water Pixel Extraction

Understanding Color Maps

  • The discussion begins with the concept of color mapping, highlighting that water pixels appear brighter compared to vegetation in an image.
  • An alternative color map (GnBu) is introduced, which gives water a more natural blue appearance.

Extracting Water Pixels

  • To isolate water pixels, a binary image conversion is necessary where water pixels are assigned a value of one and all other pixels zero.
  • The process involves selecting a threshold manually; for this case, a threshold of 0.8 was determined to effectively distinguish water from non-water areas.

Threshold Selection Techniques

  • It’s noted that while manual selection is used here, there are various methods for determining thresholds, including machine learning approaches discussed in previous lectures.
  • The NumPy function np.where is explained as a tool for applying a condition to create a binary image based on the selected threshold.

Displaying Extracted Water Images

  • After applying the thresholding technique, only the bright-colored water pixels remain visible in the output image.
  • If pixel area data is known, it allows for calculating total surface area covered by water bodies within the analyzed image.
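The thresholding and area steps can be sketched with np.where. The NDWI values are made up, the 0.8 threshold is the one quoted above, and the 30 m pixel size is a hypothetical assumption.

```python
import numpy as np

# Hypothetical NDWI values for a 2x2 image.
ndwi = np.array([[0.90, 0.10],
                 [-0.30, 0.85]], dtype=np.float32)

threshold = 0.8  # manually chosen threshold quoted in the notes

# Binary image: 1 for water pixels, 0 for everything else.
water_mask = np.where(ndwi > threshold, 1, 0)

# Assumed ground size of one pixel (30 m x 30 m); replace with the real value.
pixel_area_m2 = 30.0 * 30.0

water_pixel_count = int(water_mask.sum())
water_area_m2 = water_pixel_count * pixel_area_m2

print(water_mask)
print(water_pixel_count, water_area_m2)
```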

Local Analysis Techniques

  • The focus remains on local analysis techniques applied at an individual pixel level before advancing to more complex analyses like focal or zonal analysis.

Writing Raster Data

Methods for Writing Raster Files

  • Two primary techniques for writing raster files using GDAL are introduced: CreateCopy and Create.
  • The CreateCopy method copies data from a source dataset while retaining its properties, such as projection and number of bands.

Creating New Rasters

  • In cases where source data is not available or needs transformation, the Create method can be used to define a new raster dataset with custom parameters.

Driver Handling in GDAL

  • Each raster format has an associated driver; when reading a file, GDAL selects the appropriate driver automatically.
  • When writing rasters, users must specify which driver to use according to the desired output format.

Understanding Raster Formats and Driver Initialization in GDAL

Overview of Raster Formats

  • The speaker discusses the importance of identifying the short name for raster formats when creating a driver. Each format has a unique short name that must be passed during driver creation.

Creating a Driver for GeoTIFF

  • To write a GeoTIFF image, the short name "GTiff" is used, which corresponds to the GeoTIFF format. A variable named file_format is created to store this short name.
  • The function gdal.GetDriverByName is used to obtain a driver for handling GeoTIFF data. The resulting driver's short name is confirmed as "GTiff".

Metadata and Driver Capabilities

  • Associated metadata provides insights into the capabilities of the driver, including supported operations like creating and copying datasets.
  • The metadata reveals that the driver supports both creation and copying functionalities.

Working with Satellite Images

  • The speaker references using an image file named band2.tiff, sourced from ISRO's satellite portal, for basic image processing tasks.
  • A destination filename raster_one.tiff is established for output after processing.

Image Processing Steps

  • Basic image processing involves reading from the source file and preparing to copy its contents while performing transformations on it.
  • After obtaining the driver with gdal.GetDriverByName, the function CreateCopy is called to duplicate data from source to destination.

Data Manipulation Techniques

  • Data can be fetched either from the source or destination dataset. An example includes inverting pixel values using NumPy's np.invert.
  • This inversion process transforms brighter areas of an image into darker ones, akin to developing negatives in traditional photography.

Writing Processed Data Back

  • After processing, writing back to the destination dataset involves accessing its raster band and calling WriteArray.
  • It is crucial to call FlushCache after writing; failing to do so may result in a blank image because the changes were never written to disk.

Creating Raster Images from CSV Data

Overview of the Create Method

  • FlushCache is called once writing is complete, after which the dataset object can be deleted.
  • When there is no existing file to copy from, the Create method must be used; the demonstration generates a single-band raster.
  • A demonstration involves using a CSV file containing daily grid rainfall data across India.

Understanding the Data Structure

  • The first record in the dataset corresponds to specific geographic coordinates (6.5° N, 66.5° E).
  • Each row represents increasing latitude while longitude remains constant; this structure continues throughout the dataset.
  • The image size is specified as 129 x 135 pixels, with each pixel representing rainfall over a defined grid size (10 km or 25 km).

Creating and Configuring the Raster Image

  • To create a raster image, initiate by obtaining a driver using gdal.GetDriverByName('GTiff').
  • The Create function requires multiple arguments: destination filename, dimensions (x and y), number of bands, and data type for pixel intensity.

Handling No Data Values

  • Missing data may be represented by -9999; these values indicate unusable pixels in scientific analysis.
  • Metadata can be attached to raster images to provide essential information about creation date and scale factors for conversion.

Attaching Metadata to Bands

  • Metadata should be formatted as key-value pairs or dictionaries when attaching it to raster images.
  • Important metadata includes creation date and scale factors necessary for converting pixel values into usable formats like reflectance.

Setting No Data Values

  • Use band.SetNoDataValue(-9999) to mark certain pixels as unusable; this helps software skip these values during analysis.
  • Additional metadata can also be applied at the dataset level using similar dictionary structures for comprehensive documentation.

Creating and Managing Geo-Referenced Rasters

Setting Up Metadata and Writing Data

  • The function ds.SetMetadata is called to pass a dictionary containing metadata for the raster image.
  • After setting up the metadata, the function band.WriteArray is invoked to write a NumPy array to disk.

Handling Image Projections

  • When the image is read back and plotted without accounting for the no-data value, the rainfall data is not represented properly.
  • To create a geo-referenced raster, the osr module is introduced for handling spatial references, while gdal manages raster data.

Creating Coordinate Reference Systems (CRS)

  • A CRS object is created using osr.SpatialReference; it can then be passed to functions that require projection information.
  • The built-in function SetWellKnownGeogCS sets the geographic coordinate system to WGS84; the CRS can then be exported to readable formats such as WKT.

Defining Transformation Parameters

  • A six-element geotransform variable defines the extent over India, including the longitude and latitude of the origin (pixel 0, 0), the resolution in the x and y directions, and two rotation terms.
  • The process involves creating a driver and calling the Create function with parameters such as destination file name and image size.

Setting Projection Information

  • The projection system of the destination dataset is set using dst_ds.SetProjection, which takes the CRS as a WKT string obtained from crs.ExportToWkt().
  • Transformation information is crucial for converting pixel numbers into real-world coordinates; this can include scaling or rotating images.

Finalizing Raster Creation

  • The transformation information is set via dst_ds.SetGeoTransform, ensuring accurate mapping of pixel locations to geographical coordinates.
  • Data from a text file (source.ext) is read using NumPy's np.loadtxt, specifying the delimiter and a float32 data type, before being written to the band with WriteArray.

How to Create and Manage Raster Images in GIS

Introduction to Raster Data Creation

  • The speaker discusses the process of creating a raster image from rainfall data, emphasizing that the file can be opened in any GIS software.
  • The specific directory structure is mentioned, indicating where the rainfall image file is located within the DLP data folder.

File Management and Verification

  • The speaker demonstrates deleting an existing file and re-running code to create a new rainfall image file, showcasing dynamic file management.
  • To confirm data availability, the speaker reads the newly created file using gdal.Open and checks for no-data values.

Data Visualization Techniques

  • After reading the data array while masking no-data values (-9999), a visual representation of rainfall across India is generated.
  • Two methods for creating raster datasets are discussed: Create and CreateCopy, highlighting their applications in managing raster images.
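Masking the no-data value before plotting or computing statistics can be sketched with a NumPy masked array. The small array stands in for the band that would be read from the rainfall file, and its values are made up.

```python
import numpy as np

NODATA = -9999.0

# Stand-in for the array returned by reading the rainfall band.
rain = np.array([[0.0, 12.0, NODATA],
                 [NODATA, 8.0, 4.0]], dtype=np.float32)

# Hide every cell equal to the no-data value.
rain_masked = np.ma.masked_equal(rain, NODATA)

# Statistics now ignore the masked cells.
valid_count = int(rain_masked.count())
mean_rain = float(rain_masked.mean())
naive_mean = float(rain.mean())  # distorted by the -9999 cells

print(valid_count, mean_rain, naive_mean)
```

A masked array can be passed directly to Matplotlib's imshow, which leaves the masked cells blank.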

Working with Satellite Imagery

  • The speaker explains how satellite images often come as separate bands, necessitating stacking them together for analysis or color composites.
  • A method for stacking individual band images into a multi-band raster is outlined, emphasizing how easily this can be done with GDAL.

Steps to Create Multi-Band Rasters

  • The process involves reading each band individually and copying the projection information from one of the source images, since each source file holds only a single band.
  • A driver is obtained and used to create the stacked image by specifying its dimensions, the number of bands, and the data type (16-bit unsigned integer).

Finalizing Image Creation

  • Each band's color interpretation is set before the arrays are written sequentially to the destination dataset; flushing the cache completes the write.
  • Upon verification, it’s confirmed that three bands exist in the final stacked image, demonstrating successful integration of multiple bands into one raster.

Summary of Key Operations on Raster Images

  • The session concludes with a review of four types of operations performed on raster images: local, focal, zonal, and global analyses.
  • Specific examples include calculating NDVI and NDWI while masking non-water pixels based on thresholds. Additionally, methods for creating raster images were reiterated.
Video description

IIRS-ISRO