rs-retrieval/CLAUDE.md
2026-02-02 20:13:50 +08:00


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is a LaTeX research paper titled "An I/O-Efficient Approach for Concurrent Spatio-Temporal Range Retrievals over Large-Scale Remote Sensing Image Data" submitted to an IEEE journal. The paper proposes novel techniques for efficient retrieval of remote sensing imagery, including:

  • Index-as-an-Execution-Plan paradigm: Integrates fine-grained partial retrieval directly into the indexing structure
  • Dual-layer inverted index (G2I/I2G): Pre-materializes grid-to-pixel mappings to eliminate runtime geometric calculations
  • Hybrid concurrency-aware I/O coordination: Combines Calvin-style deterministic ordering with optimistic execution
  • SA-GMAB (Surrogate-Assisted Genetic Multi-Armed Bandit): Auto-tuning mechanism for fluctuating workloads

Build and Compilation

Primary Build Commands

# Standard compilation (recommended for IEEE format)
pdflatex rs_retrieval.tex

# Alternative compilation (being tested)
xelatex rs_retrieval.tex

# Full build cycle (includes bibliography)
pdflatex rs_retrieval.tex
bibtex rs_retrieval
pdflatex rs_retrieval.tex
pdflatex rs_retrieval.tex

Build System

  • Compiler: pdfTeX (MiKTeX distribution on Windows)
  • Document Class: IEEEtran (IEEE journal format)
  • Output: 15-page PDF (~2.36 MB)
  • No automation: There is no Makefile or build script; compile manually with the commands above

Document Structure

The paper follows standard IEEE journal organization with these main sections:

  1. Introduction - Motivation and problem statement
  2. Related Work - Spatio-temporal retrieval, concurrency control, I/O tuning
  3. Problem Formulation - Mathematical definitions and cost models
  4. I/O-aware Indexing Structure (Section 3) - Core technical contribution
    • Grid-to-Image (G2I) index
    • Image-to-Grid (I2G) index
    • Pre-materialized execution plans
  5. Hybrid Concurrency-Aware I/O Coordination (Section 4)
    • Deterministic vs optimistic execution modes
    • Adaptive mode switching
  6. I/O Stack Tuning (Section 5) - SA-GMAB algorithm
  7. Performance Evaluation (Section 6) - Experimental results on Martian datasets
  8. Conclusions - Summary of contributions

Key Files

  • rs_retrieval.tex - Main LaTeX source (single-file document)
  • references.bib - Bibliography database
  • fig/ - Figures directory (index.png, st-query.png, cc.png)
  • exp/ - Experimental results (PDF charts)

LaTeX Package Dependencies

Required Packages

\documentclass[lettersize,journal]{IEEEtran}
\usepackage{amsmath,amsfonts}      % Mathematics
\usepackage{graphicx}               % Figures
\usepackage[linesnumbered,lined,ruled]{algorithm2e}  % Algorithms
\usepackage{cite}                   % Citations
\usepackage{array}                  % Table formatting
\usepackage{makecell}               % Table cells
\usepackage{subfigure}              % Subfigures

Chinese Language Support

  • The project directory name includes Chinese characters (遥感影像部分检索)
  • Document content is in English
  • The system uses the ctex distribution (Chinese TeX)

Document Conventions

Cross-References

All sections use \label{} and \ref{} for cross-referencing:

  • Section labels: sec:XX format (e.g., \label{sec:Index})
  • Algorithm labels: alg:XX format
  • Figure labels: fig:XX format
  • Equation labels: eq:XX format
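
The label prefixes above might be used together as in the following minimal sketch. Only sec:Index comes from this file; the alg:, fig:, and eq: label names are hypothetical placeholders, and \eqref assumes amsmath (already loaded).

```latex
% Defining a label right after the sectioning command
\section{I/O-aware Indexing Structure}
\label{sec:Index}

% Referencing labels elsewhere in the text.
% alg:Example, fig:Example, and eq:Example are hypothetical names.
As described in Section~\ref{sec:Index}, the procedure in
Algorithm~\ref{alg:Example} produces the layout shown in
Fig.~\ref{fig:Example}, with the cost given by~\eqref{eq:Example}.
```

The non-breaking space (~) keeps the number on the same line as the word before it.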

Mathematical Notation

  • Extensive use of mathematical formulations
  • Cost models use notation: C_{total}, T_{compute}, etc.
  • Algorithm pseudo-code uses algorithm2e package
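
As a sketch of the notation style only, a labeled cost equation could look like the block below. The decomposition itself and the symbol T_{io} are assumptions made for illustration and are not taken from the paper; only C_{total} and T_{compute} appear in this file.

```latex
% Illustrative only: T_{io} and the sum are assumed, not the paper's model
\begin{equation}
  C_{total} = T_{io} + T_{compute}
  \label{eq:Example} % hypothetical label following the eq:XX format
\end{equation}
```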

Citation Style

  • IEEE citation style with numeric references
  • Citations in format: \cite{AuthorYearKEY}
  • Bibliography managed in references.bib
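
A minimal sketch of how a citation and the bibliography hookup typically look with IEEEtran and BibTeX. The key Ma15RS_bigdata is the example key mentioned in this file; the \bibliographystyle line is an assumption about rs_retrieval.tex and should be checked against the actual source.

```latex
% In the running text
... as discussed in~\cite{Ma15RS_bigdata}.

% Near the end of rs_retrieval.tex (assumed setup, verify in the source)
\bibliographystyle{IEEEtran}
\bibliography{references} % loads references.bib
```

With the cite package loaded, several keys in one \cite{} are sorted and compressed into ranges automatically.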

Figure Organization

Figures are organized by topic:

  • fig/index.png - Index schema design
  • fig/st-query.png - Retrieval-time execution flow
  • fig/cc.png - Concurrency coordination mechanism

Common Editing Tasks

Adding a New Section

  1. Add \section{Section Name} with \label{sec:NAME}
  2. Update the table of contents/organization paragraph in Introduction
  3. Ensure cross-references use correct label format
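
The steps above can be sketched as follows. The section title, label name, and wording are hypothetical placeholders.

```latex
% Step 1: new section with a label in the sec:XX format
\section{Discussion}
\label{sec:Discussion}
Body text of the new section.

% Step 2: matching sentence in the Introduction's organization paragraph
Section~\ref{sec:Discussion} discusses the limitations of the approach.
```

Run pdflatex twice afterwards so the new reference resolves.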

Modifying Algorithms

  • Use algorithm2e environment
  • Keep linesnumbered,lined,ruled options for consistency
  • Label with \label{alg:NAME} for referencing
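
A minimal algorithm2e skeleton consistent with the points above. The algorithm content, the lookup function, and the label name are invented for illustration and do not reproduce any algorithm from the paper. The linesnumbered,lined,ruled options are set once on \usepackage, so they are not repeated here.

```latex
\begin{algorithm}[t]
  \caption{Example Procedure} % hypothetical caption
  \label{alg:Example}         % place \label after \caption
  \KwIn{Query window $Q$}
  \KwOut{Result set $R$}
  $R \leftarrow \emptyset$\;
  \ForEach{grid cell $g$ overlapping $Q$}{
    $R \leftarrow R \cup \mathrm{lookup}(g)$\; % lookup is a placeholder
  }
  \Return{$R$}\;
\end{algorithm}
```

Each statement ends with \; so that algorithm2e numbers the line and breaks it correctly.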

Adding Figures

  1. Place figure files in fig/ directory
  2. Use \begin{figure}[t] for top placement (IEEE convention)
  3. Include \caption{} and \label{fig:NAME}
  4. Refer using \ref{fig:NAME}
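
The steps above can be sketched as a single float. The file name fig/example.png, the caption, and the label are hypothetical.

```latex
\begin{figure}[t]
  \centering
  % Hypothetical file; place it in fig/ before compiling
  \includegraphics[width=\columnwidth]{fig/example.png}
  \caption{Example caption.}
  \label{fig:Example} % place \label after \caption
\end{figure}

% In the text
Fig.~\ref{fig:Example} illustrates the example.
```

Setting width=\columnwidth fits the image to one column of the two-column layout.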

Bibliography Updates

  1. Add entries to references.bib
  2. Use BibTeX key format: AuthorYearKEY (e.g., Ma15RS_bigdata)
  3. Run bibtex rs_retrieval after modifying .bib file
  4. Compile LaTeX twice to resolve references

Important Notes

Compilation Workflow

When making changes that affect:

  • Text only: A single pdflatex run is sufficient
  • Citations: Run pdflatex, then bibtex, then pdflatex twice
  • New sections/labels: Run pdflatex twice to resolve cross-references
  • Figures: Ensure all figure files exist before compilation

Git Repository

  • Main branch: main
  • Recent activity: Testing XeLaTeX compilation
  • Modified files tracked: .tex, .pdf, .aux, .log, .synctex.gz

Document Formatting

  • Strict IEEE journal format compliance
  • Font: Times Roman family
  • Two-column layout
  • Letter size paper
  • 15-page final document

Known Issues

  • Some font variants (bold/italic) unavailable in current TeX distribution
  • Testing migration from pdflatex to xelatex (commit f7ffed8)

Experimental Data Reference

The paper evaluates on Martian remote sensing datasets:

  • Total volume: 51.9 TB across 669,641 images
  • Datasets: MoRIC, CTX, THEMIS, HiRISE
  • Environment: 9-node cluster with HBase and Lustre file system
  • Metrics: Latency, I/O throughput, request collapse efficiency

Results show:

  • Order-of-magnitude latency reduction with I/O-aware indexing
  • 54x speedup under high contention with hybrid coordination
  • 2x faster recovery from workload shifts with SA-GMAB