LLaVA for precipitation nowcasting on SEVIR

All modifications are placed in the following two directories:

EarthLMM/llava/datasets/sevir for the SEVIR dataloader.
EarthLMM/scripts/sevir for all running scripts.

Installation

Follow the official guide to install LLaVA in dev mode. The packages needed for SEVIR are already included in pyproject.toml.

Clone this repository and navigate to the EarthLMM folder

git clone https://github.com/gaozhihan/EarthLMM.git
cd EarthLMM

Install the package

conda create -n earthlmm python=3.10 -y
conda activate earthlmm
pip install --upgrade pip  # enable PEP 660 support
pip install -e .

Install additional packages for training

pip install -e ".[train]"
pip install flash-attn --no-build-isolation

SEVIR Data Preparation

The official website of SEVIR.

Only the VIL data is needed for precipitation nowcasting. Download it according to the official guide:

aws s3 cp --no-sign-request s3://sevir/CATALOG.csv CATALOG.csv
aws s3 sync --no-sign-request s3://sevir/data/vil .

The download takes around 138 GB of storage.

Then, place the data in the following structure:

EarthLMM
└── playground
    └── data
        └── sevir
            ├── CATALOG.csv
            └── data
                └── vil
                    └── ...
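
Before running the conversion, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name check_sevir_layout is hypothetical, and it only checks for the two paths shown in the tree.

```python
from pathlib import Path


def check_sevir_layout(repo_root):
    """Return the expected SEVIR paths that are missing under repo_root."""
    sevir_dir = Path(repo_root) / "playground" / "data" / "sevir"
    expected = [
        sevir_dir / "CATALOG.csv",   # catalog fetched with `aws s3 cp`
        sevir_dir / "data" / "vil",  # VIL files fetched with `aws s3 sync`
    ]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    missing = check_sevir_layout(".")  # run from the EarthLMM root
    if missing:
        print("Missing:", *missing, sep="\n  ")
    else:
        print("SEVIR layout looks complete.")
```

An empty list means both CATALOG.csv and the data/vil directory were found; the sketch does not verify the contents of data/vil.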

Convert Raw SEVIR data to LLaVA format

For the multi-input, single-output task, run

cd ROOT_DIR/EarthLMM
python ./scripts/sevir/convert_sevir.py --save sevir_convert_save_dir

It will load the configuration in ./scripts/sevir/sevir_cfg.yaml. Modify the following arguments to configure the data conversion:

in_len: the number of frames in the input sequence.
out_len: the model is required to predict the out_len-th future frame.
seq_len: should be the sum of in_len and out_len.
stride: the stride between two adjacent sampled sequences. I use 8 for all my experiments.
frame_stride: the stride between two adjacent frames in the same sequence. E.g., 1 for 5-minute interval, 4 for 20-minute interval.
start_date: null for training and ID test. [2019, 6, 1] for OOD test.
end_date: [2019, 6, 1] for training and ID test. null for OOD test.
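
To make the interaction of these arguments concrete, here is a minimal sketch of how they could map to frame indices and to the date split. It is an illustration under stated assumptions, not the logic of convert_sevir.py: the function names are hypothetical, 49 frames per event is assumed, seq_len frames are assumed to be taken every frame_stride raw frames, and the date range is assumed to be half-open (start inclusive, end exclusive) so that the training/ID range and the OOD range do not overlap at [2019, 6, 1].

```python
from datetime import datetime


def sample_windows(num_frames, in_len, out_len, stride, frame_stride):
    """Yield (input_indices, target_index) pairs for one event.

    Assumed semantics (not taken from convert_sevir.py):
      * seq_len = in_len + out_len frames, one every `frame_stride` raw frames;
      * consecutive windows start `stride` raw frames apart;
      * the target is the out_len-th frame after the last input frame.
    """
    seq_len = in_len + out_len
    span = (seq_len - 1) * frame_stride + 1  # raw frames covered by one window
    for start in range(0, num_frames - span + 1, stride):
        frames = [start + i * frame_stride for i in range(seq_len)]
        yield frames[:in_len], frames[-1]


def in_date_range(t, start_date=None, end_date=None):
    """True if t lies in [start_date, end_date); None means unbounded."""
    if start_date is not None and t < datetime(*start_date):
        return False
    if end_date is not None and t >= datetime(*end_date):
        return False
    return True


# 20-minute frame interval (frame_stride=4), a new window every 8 raw frames.
windows = list(sample_windows(num_frames=49, in_len=4, out_len=2,
                              stride=8, frame_stride=4))
for inputs, target in windows:
    print(inputs, "->", target)

# Training / ID test split versus OOD test split.
event_time = datetime(2019, 7, 15)
print(in_date_range(event_time, None, [2019, 6, 1]))  # training / ID range
print(in_date_range(event_time, [2019, 6, 1], None))  # OOD range
```

Under these assumptions, a larger frame_stride lengthens the raw span of each window, so fewer windows fit into one event for the same stride.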

For the multi-input, multi-output task, run

cd ROOT_DIR/EarthLMM
python ./scripts/sevir/convert_sevir_multi_out.py --save sevir_convert_multi_out_save_dir

It will load the corresponding config file: ./scripts/sevir/sevir_multi_out_cfg.yaml.

LoRA Fine-Tuning on SEVIR

Run

cd ROOT_DIR/EarthLMM
sh ./scripts/sevir/finetune_sevir_lora.sh

Remember to configure the data path and the save path via:

--data_path ./playground/data/sevir_convert_save_dir/sevir_llava.json
--output_dir ./checkpoints/llava-v1.5-7b-sevir-lora

It will save all checkpoints in the ./checkpoints/llava-v1.5-7b-sevir-lora directory.

Evaluating LLaVA with LoRA on SEVIR

Run

cd ROOT_DIR/EarthLMM
sh ./scripts/sevir/sevir_vqa_lora.sh

to generate predictions in the corresponding data directories. Remember to configure the script via:

--model-path ./checkpoints/llava-v1.5-7b-sevir-lora: points to the saved LoRA weights.
--question-file ./playground/data/sevir_convert_save_dir/sevir_questions.jsonl: the questions produced by the data conversion.
--answers-file ./playground/data/sevir_convert_save_dir/lora_answer.jsonl: where the answers generated by LLaVA-LoRA are saved.

The multi-input, multi-output task uses the same script.
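
Before evaluating, it may be worth checking that every question received an answer, for example after an interrupted run. The sketch below is not part of the repository; the helper names are hypothetical, and the "question_id" field is an assumption based on the field used by LLaVA's VQA scripts, so the SEVIR files may need a different key.

```python
import json


def read_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def unanswered_ids(question_file, answers_file, key="question_id"):
    """Return the ids of questions with no matching entry in the answers file.

    `key` is an assumption: LLaVA's VQA scripts use "question_id", but the
    SEVIR question and answer files may use another field name.
    """
    answered = {a[key] for a in read_jsonl(answers_file)}
    return [q[key] for q in read_jsonl(question_file) if q[key] not in answered]
```

Calling unanswered_ids with the paths passed to --question-file and --answers-file returns an empty list when the answers file is complete.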

Once the answers are generated, the performance can be evaluated via

cd ROOT_DIR/EarthLMM
python scripts/sevir/eval_baseline.py --data sevir_convert_save_dir --answer lora_answer.jsonl

The performance of the multi-input, multi-output task can be evaluated via

cd ROOT_DIR/EarthLMM
python scripts/sevir/eval_multi_out.py --data sevir_convert_multi_out_save_dir --answer lora_answer.jsonl
