# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is a LaTeX research paper titled "An I/O-Efficient Approach for Concurrent Spatio-Temporal Range Retrievals over Large-Scale Remote Sensing Image Data" submitted to an IEEE journal. The paper proposes novel techniques for efficient retrieval of remote sensing imagery, including:

- **Index-as-an-Execution-Plan paradigm**: Integrates fine-grained partial retrieval directly into indexing structure
- **Dual-layer inverted index (G2I/I2G)**: Pre-materializes grid-to-pixel mappings to eliminate runtime geometric calculations
- **Hybrid concurrency-aware I/O coordination**: Combines Calvin-style deterministic ordering with optimistic execution
- **SA-GMAB (Surrogate-Assisted Genetic Multi-Armed Bandit)**: Auto-tuning mechanism for fluctuating workloads

## Build and Compilation

### Primary Build Commands

```bash
# Standard compilation (recommended for IEEE format)
pdflatex rs_retrieval.tex

# Alternative compilation (being tested)
xelatex rs_retrieval.tex

# Full build cycle (includes bibliography)
pdflatex rs_retrieval.tex
bibtex rs_retrieval
pdflatex rs_retrieval.tex
pdflatex rs_retrieval.tex
```

### Build System

- **Compiler**: pdfTeX (MiKTeX distribution on Windows)
- **Document Class**: IEEEtran (IEEE journal format)
- **Output**: 15-page PDF (~2.36MB)
- **No automation**: No Makefile or build scripts - use manual compilation

## Document Structure

The paper follows standard IEEE journal organization with these main sections:

1. **Introduction** - Motivation and problem statement
2. **Related Work** - Spatio-temporal retrieval, concurrency control, I/O tuning
3. **Problem Formulation** - Mathematical definitions and cost models
4. **I/O-aware Indexing Structure** (Section 3) - Core technical contribution
   - Grid-to-Image (G2I) index
   - Image-to-Grid (I2G) index
   - Pre-materialized execution plans
5. **Hybrid Concurrency-Aware I/O Coordination** (Section 4)
   - Deterministic vs optimistic execution modes
   - Adaptive mode switching
6. **I/O Stack Tuning** (Section 5) - SA-GMAB algorithm
7. **Performance Evaluation** (Section 6) - Experimental results on Martian datasets
8. **Conclusions** - Summary of contributions

### Key Files

- `rs_retrieval.tex` - Main LaTeX source (single-file document)
- `references.bib` - Bibliography database
- `fig/` - Figures directory (index.png, st-query.png, cc.png)
- `exp/` - Experimental results (PDF charts)

## LaTeX Package Dependencies

### Required Packages

```latex
\documentclass[lettersize,journal]{IEEEtran}
\usepackage{amsmath,amsfonts}                        % Mathematics
\usepackage{graphicx}                                % Figures
\usepackage[linesnumbered,lined,ruled]{algorithm2e}  % Algorithms
\usepackage{cite}                                    % Citations
\usepackage{array}                                   % Table formatting
\usepackage{makecell}                                % Table cells
\usepackage{subfigure}                               % Subfigures
```

### Chinese Language Support

- The project directory name includes Chinese characters (遥感影像部分检索)
- Document content is in English
- Uses ctex distribution (Chinese TeX) on the system

## Document Conventions

### Cross-References

All sections use `\label{}` and `\ref{}` for cross-referencing:

- Section labels: `sec:XX` format (e.g., `\label{sec:Index}`)
- Algorithm labels: `alg:XX` format
- Figure labels: `fig:XX` format
- Equation labels: `eq:XX` format

### Mathematical Notation

- Extensive use of mathematical formulations
- Cost models use notation: $C_{total}$, $T_{compute}$, etc.
- Algorithm pseudo-code uses algorithm2e package

### Citation Style

- IEEE citation style with numeric references
- Citations in format: `\cite{AuthorYearKEY}`
- Bibliography managed in `references.bib`

### Figure Organization

Figures are organized by topic:

- `fig/index.png` - Index schema design
- `fig/st-query.png` - Retrieval-time execution flow
- `fig/cc.png` - Concurrency coordination mechanism

## Common Editing Tasks

### Adding a New Section

1. Add `\section{Section Name}` with `\label{sec:NAME}`
2. Update the table of contents/organization paragraph in Introduction
3. Ensure cross-references use correct label format

### Modifying Algorithms

- Use `algorithm2e` environment
- Keep `linesnumbered,lined,ruled` options for consistency
- Label with `\label{alg:NAME}` for referencing

### Adding Figures

1. Place figure files in `fig/` directory
2. Use `\begin{figure}[t]` for top placement (IEEE convention)
3. Include `\caption{}` and `\label{fig:NAME}`
4. Refer using `\ref{fig:NAME}`

### Bibliography Updates

1. Add entries to `references.bib`
2. Use BibTeX key format: `AuthorYearKEY` (e.g., `Ma15RS_bigdata`)
3. Run `bibtex rs_retrieval` after modifying .bib file
4. Compile LaTeX twice to resolve references

## Important Notes

### Compilation Workflow

When making changes that affect:

- **Text only**: Single `pdflatex` run sufficient
- **Citations**: Run `pdflatex` → `bibtex` → `pdflatex` × 2
- **New sections/labels**: Run `pdflatex` twice to resolve cross-references
- **Figures**: Ensure all figure files exist before compilation

### Git Repository

- Main branch: `main`
- Recent activity: Testing XeLaTeX compilation
- Modified files tracked: .tex, .pdf, .aux, .log, .synctex.gz

### Document Formatting

- Strict IEEE journal format compliance
- Font: Times Roman family
- Two-column layout
- Letter size paper
- 15-page final document

### Known Issues

- Some font variants (bold/italic) unavailable in current TeX distribution
- Testing migration from pdflatex to xelatex (commit f7ffed8)

## Experimental Data Reference

The paper evaluates on Martian remote sensing datasets:

- **Total volume**: 51.9 TB across 669,641 images
- **Datasets**: MoRIC, CTX, THEMIS, HiRISE
- **Environment**: 9-node cluster with HBase and Lustre file system
- **Metrics**: Latency, I/O throughput, request collapse efficiency

Results show:

- Order-of-magnitude latency reduction with I/O-aware indexing
- 54x speedup under high contention with hybrid coordination
- 2x faster recovery from workload shifts with SA-GMAB
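
## Example: Conventions in Practice

The labeling, figure, and citation conventions described under Document Conventions and Common Editing Tasks can be combined into one short fragment. This is a sketch only, not text from the paper: the section title, the placeholder sentences, the caption, and the labels `sec:Example` and `fig:index` are hypothetical. The path `fig/index.png` and the key `Ma15RS_bigdata` come from the examples given earlier in this file; that the key is actually present in `references.bib` is an assumption.

```latex
% Sketch only: title, body text, caption, and labels are hypothetical placeholders.
\section{Example Section}
\label{sec:Example}   % section labels follow the sec:XX format

% Assumes the key Ma15RS_bigdata exists in references.bib.
Placeholder sentence citing prior work~\cite{Ma15RS_bigdata}.
Fig.~\ref{fig:index} illustrates the index schema.

\begin{figure}[t]     % [t] requests top placement
  \centering
  \includegraphics[width=\columnwidth]{fig/index.png}
  \caption{Index schema design (placeholder caption).}
  \label{fig:index}   % \label goes after \caption so \ref picks up the figure number
\end{figure}
```

After adding a fragment like this, two `pdflatex` runs resolve the new `\ref`. If the `\cite` key is new to the document, the full `pdflatex` → `bibtex` → `pdflatex` × 2 cycle applies.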