Week 0: Getting Started
Setting up for geospatial AI in the UCSB AI Sandbox + Project Applications

Week 0 Overview

Welcome to GEOG 288KC: Geospatial Foundation Models and Applications! This week we define our course goals, walk through how the class and its code are organized, and get everyone set up with a consistent computational environment.

Key Activities

Complete local/server environment setup
Set up development environment (Python, PyTorch, Earth Engine access)
Submit project application describing experience and research interests

Basics of Foundation Models and LLM/GFM Comparisons

GFM Architecture Cheatsheet

This is a quick-reference for how our Geospatial Foundation Model (GFM) is organized, what each part does, and where to look during the course. We implement simple, readable components first, then show how to swap in optimized library counterparts as needed.

Roadmap at a glance

graph TD
  A[Week 1: Data Foundations] --> B[Week 2: Attention & Blocks]
  B --> C[Week 3: Full Encoder]
  C --> D[Week 4: MAE Pretraining]
  D --> E[Week 5: Training Loop]
  E --> F[Week 6: Eval & Viz]
  F --> G[Week 7+: Interop & Scale]

Minimal structure you’ll use

geogfm/
  core/
    config.py                # Minimal typed configs for model, data, training
  data/
    loaders.py               # build_dataloader(...)
    datasets/
      stac_dataset.py        # Simple STAC-backed dataset
    transforms/
      normalization.py       # Per-channel normalization
      patchify.py            # Extract fixed-size patches
  modules/
    attention/
      multihead_attention.py     # Standard MHA (from scratch)
    embeddings/
      patch_embedding.py         # Conv patch embedding
      positional_encoding.py     # Simple positional encoding
    blocks/
      transformer_block.py       # PreNorm block (MHA + MLP)
    heads/
      reconstruction_head.py     # Lightweight decoder/readout
    losses/
      mae_loss.py                # Masked reconstruction loss
  models/
    gfm_vit.py                   # GeoViT-style encoder
    gfm_mae.py                   # MAE wrapper (masking + encoder + head)
  training/
    optimizer.py                 # AdamW builder
    loop.py                      # fit/train_step/eval_step with basic checkpointing
  evaluation/
    visualization.py             # Visualize inputs vs reconstructions

# Outside the package (repo root)
configs/   # Small YAML/JSON run configs
tests/     # Unit tests
data/      # Datasets, splits, stats, build scripts
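As a rough illustration of what "minimal typed configs" in core/config.py could look like, here is a sketch built on standard-library dataclasses. The field names mirror the quick-start call later in this section; the defaults and the `num_patches` helper are assumptions made for this example, not the course's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ModelConfig:
    # Field names follow the quick-start snippet; defaults are illustrative.
    architecture: str = "gfm_vit"
    embed_dim: int = 768
    depth: int = 12
    image_size: int = 224


@dataclass
class DataConfig:
    dataset: str = "stac"
    patch_size: int = 16
    num_workers: int = 8


@dataclass
class TrainConfig:
    epochs: int = 1
    batch_size: int = 8
    # default_factory avoids sharing one mutable dict across instances.
    optimizer: dict = field(default_factory=lambda: {"name": "adamw", "lr": 2e-4})


def num_patches(model_cfg: ModelConfig, data_cfg: DataConfig) -> int:
    """Hypothetical helper: tokens per image for non-overlapping square patches."""
    if model_cfg.image_size % data_cfg.patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = model_cfg.image_size // data_cfg.patch_size
    return per_side * per_side


model_cfg = ModelConfig()
data_cfg = DataConfig()
print(num_patches(model_cfg, data_cfg))  # 14 patches per side -> 196 tokens
```

Keeping parameters in typed objects like these means a mismatch (for example, an image size that the patch size does not divide) can be caught before any model is built.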

What each part does (one-liners)

core/config.py: Typed configs for model/data/training; keeps parameters organized.
data/datasets/stac_dataset.py: Reads imagery + metadata (e.g., STAC), returns tensors.
data/transforms/normalization.py: Normalizes channels using precomputed stats.
data/transforms/patchify.py: Turns large images into uniform patches for ViT.
data/loaders.py: Builds PyTorch DataLoaders for train/val.
modules/embeddings/patch_embedding.py: Projects image patches into token vectors.
modules/embeddings/positional_encoding.py: Adds position info to tokens.
modules/attention/multihead_attention.py: Lets tokens attend to each other.
modules/blocks/transformer_block.py: Core transformer layer (attention + MLP).
modules/heads/reconstruction_head.py: Reconstructs pixels from encoded tokens.
modules/losses/mae_loss.py: Computes masked reconstruction loss for MAE.
models/gfm_vit.py: Assembles the encoder backbone from blocks.
models/gfm_mae.py: Wraps encoder with masking + reconstruction for pretraining.
training/optimizer.py: Creates AdamW with common defaults.
training/loop.py: Runs epochs, backprop, validation, and simple checkpoints.
evaluation/visualization.py: Plots sample inputs and reconstructions.
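To make the "masked reconstruction loss" one-liner concrete, the sketch below hides a random subset of patches and averages the squared error over the hidden patches only. It is written in NumPy to stay framework-neutral; the function names, the 0.75 mask ratio, and the tensor sizes are assumptions for illustration, not the contents of mae_loss.py.

```python
import numpy as np


def random_mask(num_tokens: int, mask_ratio: float, rng: np.random.Generator) -> np.ndarray:
    """Return a boolean vector where True marks a masked (hidden) patch."""
    num_masked = int(round(num_tokens * mask_ratio))
    mask = np.zeros(num_tokens, dtype=bool)
    mask[rng.permutation(num_tokens)[:num_masked]] = True
    return mask


def masked_mse(pred: np.ndarray, target: np.ndarray, mask: np.ndarray) -> float:
    """Mean squared error averaged over masked patches only.

    pred, target: (num_tokens, patch_dim); mask: (num_tokens,) boolean.
    """
    per_patch = ((pred - target) ** 2).mean(axis=1)  # one error value per patch
    return float(per_patch[mask].mean())


rng = np.random.default_rng(0)
num_tokens, patch_dim = 196, 16 * 16 * 3  # illustrative sizes
target = rng.normal(size=(num_tokens, patch_dim))
pred = target.copy()
mask = random_mask(num_tokens, mask_ratio=0.75, rng=rng)

# Corrupt only the visible patches: the masked loss ignores them.
pred[~mask] += 1.0
print(int(mask.sum()), masked_mse(pred, target, mask))
```

Errors on visible patches do not contribute, which is the point of the design: the model is scored only on what it had to infer.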

From-scratch vs library-backed

Use PyTorch for Dataset/DataLoader, AdamW, schedulers, AMP, and checkpointing.
Build core blocks from scratch first: PatchEmbedding, MHA, TransformerBlock, MAE loss/head.
Later, swap in optimized options: torch.nn.MultiheadAttention, timm ViT blocks, FlashAttention, TorchGeo datasets, torchmetrics.
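One way to build confidence before swapping a from-scratch block for a library version is to check that both give the same output on the same input. The sketch below does this for the attention core, using `torch.nn.functional.scaled_dot_product_attention` (available in PyTorch 2.0 and later) as the library counterpart. That function is a different entry point from the options listed above and is chosen here only because it needs no weight copying; the tensor sizes are arbitrary.

```python
import math

import torch
import torch.nn.functional as F


def attention_from_scratch(q, k, v):
    """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])
    return torch.softmax(scores, dim=-1) @ v


torch.manual_seed(0)
# Shapes are (batch, heads, tokens, head_dim); sizes are illustrative.
q, k, v = (torch.randn(2, 4, 16, 32) for _ in range(3))

ours = attention_from_scratch(q, k, v)
library = F.scaled_dot_product_attention(q, k, v)  # requires PyTorch >= 2.0
max_diff = (ours - library).abs().max().item()
print(f"max abs difference: {max_diff:.2e}")
```

A small difference is expected from floating-point ordering; a large one usually points to a scaling or transpose bug in the from-scratch version.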

Quick start (conceptual)

# Conceptual only: these modules are built over the coming weeks.
from geogfm.core.config import ModelConfig, DataConfig, TrainConfig
from geogfm.models.gfm_vit import GeoViTBackbone
from geogfm.models.gfm_mae import MaskedAutoencoder
from geogfm.data.loaders import build_dataloader
from geogfm.training.loop import fit

# Configs hold the tunable parameters for the model, the data, and the run.
model_cfg = ModelConfig(architecture="gfm_vit", embed_dim=768, depth=12, image_size=224)
data_cfg = DataConfig(dataset="stac", patch_size=16, num_workers=8)
train_cfg = TrainConfig(epochs=1, batch_size=8, optimizer={"name": "adamw", "lr": 2e-4})

# Build the encoder on its own, then wrap it for MAE pretraining.
encoder = GeoViTBackbone(model_cfg)
model = MaskedAutoencoder(model_cfg, encoder)

# Build the train/val loaders and run training with validation.
train_dl, val_dl = build_dataloader(data_cfg)
fit(model, (train_dl, val_dl), train_cfg)

What to notice:

The encoder and MAE wrapper are separate, so we can reuse the encoder for other tasks later.
Data transforms (normalize/patchify) are decoupled from the model and driven by config.
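The encoder/wrapper separation can be sketched with a toy model. Everything here is a hypothetical stand-in: `TinyEncoder`, `ReconstructionModel`, and `Classifier` are invented for this example and are much simpler than the course's `GeoViTBackbone` and `MaskedAutoencoder`. The point is only that both wrappers accept an already-built encoder rather than constructing their own.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for the course encoder: tokens in, tokens out."""

    def __init__(self, embed_dim: int = 32):
        super().__init__()
        self.embed_dim = embed_dim
        self.layer = nn.Sequential(nn.LayerNorm(embed_dim), nn.Linear(embed_dim, embed_dim))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return tokens + self.layer(tokens)  # residual keeps the token shape


class ReconstructionModel(nn.Module):
    """Pretraining wrapper: encoder plus a per-token pixel readout."""

    def __init__(self, encoder: TinyEncoder, patch_dim: int):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(encoder.embed_dim, patch_dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(tokens))


class Classifier(nn.Module):
    """Downstream wrapper: same encoder, frozen, with a new linear head."""

    def __init__(self, encoder: TinyEncoder, num_classes: int):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False  # freezes the shared encoder object
        self.head = nn.Linear(encoder.embed_dim, num_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(tokens).mean(dim=1))  # mean-pool tokens


encoder = TinyEncoder()
pretrain = ReconstructionModel(encoder, patch_dim=48)
classify = Classifier(encoder, num_classes=5)

tokens = torch.randn(2, 196, 32)  # (batch, tokens, embed_dim)
print(pretrain(tokens).shape, classify(tokens).shape)
```

Because both wrappers hold the same encoder object, freezing it in the classifier also freezes it for the pretraining wrapper. In practice you would finish pretraining (or copy the weights) before attaching a frozen downstream head.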

MVP vs later phases

MVP (Weeks 1–6): files shown above; single-node training; basic logging and visualization.
Later (Weeks 7–10): interop with existing models (e.g., Prithvi), task heads (classification/segmentation), inference tiling, Hub integration.

Comprehensive course roadmap (stages, files, libraries)

This table maps each week to the broader stages (see the course index), the key geogfm files you’ll touch, and the primary deep learning tools you’ll rely on.

Week	Stage	Focus	You will build (geogfm)	Library tools	Outcome
1	Stage 1: Build GFM Architecture	Data Foundations	`core/config.py`; `data/datasets/stac_dataset.py`; `data/transforms/{normalization.py, patchify.py}`; `data/loaders.py`	`torch.utils.data.Dataset`/`DataLoader`, `rasterio`, `numpy`	Config-driven dataloaders that yield normalized patches
2	Stage 1	Attention & Blocks	`modules/embeddings/{patch_embedding.py, positional_encoding.py}`; `modules/attention/multihead_attention.py`; `modules/blocks/transformer_block.py`	`torch.nn` (compare with `torch.nn.MultiheadAttention`)	Blocks run forward with stable shapes; unit tests green
3	Stage 1	Complete Architecture	`models/gfm_vit.py`; `modules/heads/reconstruction_head.py`	`torch.nn` (timm as reference)	Encoder assembled; end-to-end forward on dummy input
4	Stage 2: Train Foundation Model	MAE Pretraining	`models/gfm_mae.py`; `modules/losses/mae_loss.py`	`torch` masking utilities, `numpy`	Masking + reconstruction; loss decreases on toy batch
5	Stage 2	Training Optimization	`training/optimizer.py`; `training/loop.py`	`torch.optim.AdamW`; schedulers, AMP optional	Single-epoch run; basic checkpoint save/restore
6	Stage 2	Evaluation & Analysis	`evaluation/visualization.py`; (optional) `evaluation/metrics.py`	`matplotlib`; `torchmetrics` optional	Recon visuals; track validation loss/PSNR
7	Stage 2	Integration w/ Pretrained	(light) `core/registry.py`; `interoperability/huggingface.py` stubs	`huggingface_hub`, `transformers` optional	Show mapping to Prithvi structure; plan switch
8	Stage 3: Apply & Deploy	Task Fine-tuning	`tasks/{classification.py|segmentation.py}` (light heads)	`torch.nn.CrossEntropyLoss`; timm optional	Head swap on frozen encoder; small dataset demo
9	Stage 3	Deployment & Inference	`inference/{tiling.py, sliding_window.py}`	`numpy`, `rasterio` windows	Sliding-window inference on a small scene
10	Stage 3	Presentations & Synthesis	Project deliverables (no new `geogfm` code required)	—	Present MVP builds, analysis, transition plan

Next Week Preview

Week 1 will start building our model, beginning with the fundamental data loaders and transforms needed to use geospatial data in deep learning.
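As a small preview of that data work, here is a NumPy sketch of the two transforms named in the cheatsheet: per-channel normalization and splitting an image into fixed-size patches. The function signatures, the synthetic 4-band image, and the on-the-fly statistics are assumptions for illustration; the course versions read precomputed statistics and may differ in layout.

```python
import numpy as np


def normalize(image: np.ndarray, mean: np.ndarray, std: np.ndarray) -> np.ndarray:
    """Per-channel standardization for a (channels, height, width) array."""
    return (image - mean[:, None, None]) / std[:, None, None]


def patchify(image: np.ndarray, patch_size: int) -> np.ndarray:
    """Split (C, H, W) into non-overlapping patches of shape (N, C, p, p)."""
    c, h, w = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    x = image.reshape(c, rows, patch_size, cols, patch_size)
    x = x.transpose(1, 3, 0, 2, 4)  # (rows, cols, C, p, p)
    return x.reshape(rows * cols, c, patch_size, patch_size)


rng = np.random.default_rng(0)
image = rng.uniform(0, 10000, size=(4, 64, 64))  # synthetic 4-band scene
mean = image.mean(axis=(1, 2))  # stand-in for precomputed dataset stats
std = image.std(axis=(1, 2))

patches = patchify(normalize(image, mean, std), patch_size=16)
print(patches.shape)  # (16, 4, 16, 16)
```

Patches come out in row-major order, so patch 1 is the tile immediately to the right of patch 0.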
