Welcome to the world of remote sensing with Prithvi, a temporal Vision Transformer model developed through a collaboration between IBM and NASA. The model enables researchers and developers to leverage remote sensing data for applications ranging from land cover classification to flood mapping. This article will guide you through setup, inference, and fine-tuning of the Prithvi model.
Understanding the Model and Inputs
Prithvi allows you to process a time series of remote sensing images formatted as (B, C, T, H, W). In this structure:
- B stands for batch size.
- C denotes the number of channels (e.g., spectral bands).
- T is the temporal dimension, which is key for analyzing changes over time.
- H and W represent height and width, respectively.
Think of each input sequence as a flipbook, where each page captures a different moment in time, allowing the model to 'see' how the landscape changes.
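To make that layout concrete, here is a minimal PyTorch sketch (PyTorch itself and the specific sizes are assumptions for illustration, not requirements stated above) that builds a dummy batch in the (B, C, T, H, W) layout:

```python
import torch

# One sample (B=1) of six spectral bands (C=6) across three timesteps (T=3),
# each a 224 x 224 tile (H = W = 224). These sizes are illustrative only.
batch = torch.randn(1, 6, 3, 224, 224)

B, C, T, H, W = batch.shape
print(f"B={B}, C={C}, T={T}, H={H}, W={W}")  # B=1, C=6, T=3, H=224, W=224
```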
Pre-training
The model was pre-trained on the NASA HLS V2 L30 product using six spectral bands:
- Blue
- Green
- Red
- Narrow NIR
- SWIR 1
- SWIR 2
These visible, near-infrared, and shortwave-infrared bands capture the environmental and landscape features the model learns from during pre-training.
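If you need to pull these six bands out of a GeoTIFF yourself, a library such as rasterio is one option (an assumption here; the article does not prescribe a reader). The band positions below are hypothetical placeholders that depend on how your files are written:

```python
import numpy as np
import rasterio

# Hypothetical 0-based positions of Blue, Green, Red, Narrow NIR,
# SWIR 1 and SWIR 2 within the file; adjust to match your own data.
BAND_INDICES = [0, 1, 2, 3, 4, 5]

def load_hls_bands(path):
    """Read the six pre-training bands from one GeoTIFF as a (C, H, W) array."""
    with rasterio.open(path) as src:
        data = src.read()  # (bands, H, W), expected in reflectance units
    return np.stack([data[i] for i in BAND_INDICES]).astype(np.float32)
```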
Running the Inference
Performing inference with the Prithvi model is streamlined through the Prithvi_run_inference.py script, which reconstructs images from a time series of HLS images. Use the following command to run it:
python Prithvi_run_inference.py --data_files t1.tif t2.tif t3.tif --yaml_file_path /path/to/yaml/Prithvi_100.yaml --checkpoint /path/to/checkpoint/Prithvi_100.pth --output_dir /path/to/out/dir --input_indices <space-separated 0-based indices of channels to select from input> --mask_ratio 0.5 --img_size <length of one side of square input shape>
Make sure to provide the images in chronological order and in GeoTIFF format, with the specified channels in reflectance units, for the best results.
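For a sense of what the script works with, the sketch below stacks three chronologically ordered frames into the (B, C, T, H, W) layout described earlier. It reuses the hypothetical load_hls_bands helper from the previous sketch and is not part of the official Prithvi_run_inference.py interface:

```python
import numpy as np
import torch

# t1.tif, t2.tif and t3.tif must already be in chronological order.
files = ["t1.tif", "t2.tif", "t3.tif"]

# (T, C, H, W): one six-band frame per timestep, read with the
# hypothetical load_hls_bands helper sketched above.
frames = np.stack([load_hls_bands(f) for f in files])

# Rearrange to (B, C, T, H, W) with a batch size of 1.
batch = torch.from_numpy(frames).permute(1, 0, 2, 3).unsqueeze(0)
print(batch.shape)  # torch.Size([1, 6, 3, H, W])
```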
Fine-tuning Examples
To further enhance the model's capabilities, it can be fine-tuned for specific downstream tasks, such as the flood mapping use case mentioned above, using the Hugging Face platform. Additional code and resources for these fine-tuning examples can be found on GitHub.
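The actual fine-tuning code lives in those GitHub examples; purely as an illustration of the general pattern, the hypothetical PyTorch sketch below freezes a pre-trained encoder and attaches a small trainable head. The class and attribute names are placeholders, not the Prithvi API:

```python
import torch.nn as nn

class FineTuneModel(nn.Module):
    """Generic pattern: frozen pre-trained encoder plus a trainable task head."""

    def __init__(self, encoder: nn.Module, embed_dim: int, num_classes: int):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():   # freeze the pre-trained weights
            p.requires_grad = False
        self.head = nn.Linear(embed_dim, num_classes)  # trainable task head

    def forward(self, x):
        features = self.encoder(x)               # (B, tokens, embed_dim) assumed
        return self.head(features.mean(dim=1))   # pool tokens, then predict
```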
Troubleshooting and Feedback
If you encounter any issues while using the Prithvi model or have feedback to share, don’t hesitate to submit an issue on our repository on GitHub. Your insights are invaluable! Make sure you consider the following:
- Verify that your input data format matches the expected structure.
- Check for compatibility issues with your local environment, such as dependencies and library versions.
- Ensure that the images are in chronological order when running inference (a small validation sketch follows this list).
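Some of these checks can be automated. The sketch below (again using rasterio as an assumed dependency, with sorted filenames standing in for chronological order) flags common input problems before you run inference:

```python
import rasterio

def validate_inputs(files, expected_bands=6):
    """Lightweight sanity checks on a list of GeoTIFF paths."""
    if files != sorted(files):
        print("Warning: files are not in sorted order; confirm chronological ordering.")
    for path in files:
        with rasterio.open(path) as src:
            if src.count < expected_bands:
                print(f"{path}: {src.count} bands found, expected at least {expected_bands}.")
            if src.width != src.height:
                print(f"{path}: {src.width}x{src.height} tile is not square; check --img_size.")

validate_inputs(["t1.tif", "t2.tif", "t3.tif"])
```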
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.