%
% paper title
% can use linebreaks \\ within to get better formatting as desired
\title{An I/O-efficient approach for concurrent spatio-temporal range retrievals over large-scale remote sensing image data}
%
%
% author names and IEEE memberships
\IEEEcompsoctitleabstractindextext{%
\begin{abstract}
%\boldmath
High-performance remote sensing analytics workflows require ingesting and retrieving massive image archives to support real-time spatio-temporal applications. While modern systems utilize window-based I/O reading to reduce data transfer, they face a dual bottleneck: the prohibitive overhead of runtime geospatial computations caused by the decoupling of logical indexing from physical storage, and severe storage-level I/O contention triggered by uncoordinated concurrent reads. To address these limitations, we present a comprehensive I/O-aware retrieval processing approach based on a novel ``Index-as-an-Execution-Plan'' paradigm. We introduce a dual-layer inverted structure that serves as a deterministic I/O planner, pre-materializing grid-to-pixel mappings to completely eliminate runtime geometric calculations. Furthermore, we design a hybrid concurrency-aware I/O coordination protocol that adaptively integrates Calvin-style deterministic ordering with optimistic execution, effectively converting I/O contention into request merging opportunities. To handle fluctuating workloads, we incorporate a Surrogate-Assisted Genetic Multi-Armed Bandit mechanism for automatic parameter tuning. Evaluated on a distributed cluster with Sentinel-2 datasets, our approach reduces end-to-end latency by an order of magnitude compared to standard window-based reading, achieves linear throughput scaling under high concurrency, and demonstrates superior convergence speed in automatic tuning.

\end{abstract}

% IEEEtran.cls defaults to using nonbold math in the Abstract.

% Note that keywords are not normally used for peer review papers.
\begin{keywords}
Remote sensing data management, Spatio-temporal range retrievals, I/O-aware indexing, Concurrency control, I/O tuning
\end{keywords}}


%%\hfill mds

%%\hfill January 11, 2007
\IEEEPARstart{A} massive amount of remote sensing (RS) data, characterized by high spatial, temporal, and spectral resolutions, is being generated at an unprecedented speed due to the rapid advancement of Earth observation missions \cite{Ma15RS_bigdata}. For instance, NASA's AVIRIS-NG acquires nearly 9 GB of data per hour, while the EO-1 Hyperion sensor generates over 1.6 TB daily \cite{Haut21DDL_RS}. Beyond the sheer volume of data, these datasets are increasingly subjected to intensive concurrent access from global research communities and real-time emergency response systems (e.g., multi-departmental coordination during natural disasters). Consequently, modern RS platforms are required to provide not only massive storage capacity but also high-throughput retrieval capabilities to satisfy the simultaneous demands of numerous spatio-temporal analysis tasks.

\par
Existing RS data management systems \cite{LEWIS17datacube, Yan21RS_manage1, liu24mstgi} typically decompose a spatio-temporal range retrieval into a decoupled two-phase execution model. The first phase is the metadata filtering phase, which utilizes spatio-temporal metadata (e.g., footprints, timestamps) to identify candidate image files that intersect the retrieval predicate. Recent advancements have transitioned from traditional tree-based indexes \cite{Strobl08PostGIS, Simoes16PostGIST} to scalable distributed schemes based on grid encodings and space-filling curves, such as GeoHash \cite{suwardi15geohash}, GeoSOT \cite{Yan21RS_manage1}, and GeoMesa \cite{hughes15geomesa}. By leveraging these high-dimensional indexing structures, the search complexity of the first phase has been effectively reduced to $O(\log N)$ or even $O(1)$, making metadata discovery extremely efficient even for billion-scale datasets.

\par
The second phase is the data extraction phase, where the system reads the actual pixel data from the identified raw image files stored in distributed file systems or object stores. A critical observation in modern high-performance RS analytics is that the primary system bottleneck has fundamentally shifted from the first phase to the second. While the metadata search completes in milliseconds, the end-to-end retrieval latency is now dominated by the massive I/O overhead required to fetch, decompress, and process large-scale raw images. Traditional systems attempted to reduce I/O overhead by pre-slicing tiles and building pyramids (e.g., approaches used in Google Earth Engine \cite{gorelick17GEE} that store metadata in HBase and serve pre-tiled image pyramids), but aggressive tiling increases management complexity and produces many small files. More recent Cloud-Optimized GeoTIFF (COG) formats and COG-aware frameworks \cite{LEWIS17datacube}, \cite{riotiler25riotiler} exploit internal overviews and window-based I/O to read only the portions of files that spatially intersect a retrieval.

While window-based I/O effectively reduces raw data transfer, it introduces a new ``computation wall'' due to the decoupling of logical indexing from physical storage. Current state-of-the-art systems operate on a ``Search-then-Compute-then-Read'' model: after identifying candidate files, they must perform fine-grained, per-image geospatial computations at runtime to map retrieval coordinates to precise file offsets and clip boundaries. This runtime geometric resolution ($C_{geo}$) becomes computationally prohibitive when processing a large volume of candidate images, often negating the benefits of I/O reduction. Moreover, under concurrent workloads, the lack of coordination among these independent read requests leads to severe I/O contention and storage thrashing, rendering traditional indexing-centric optimizations insufficient for real-time applications.
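The runtime geometric resolution ($C_{geo}$) is easy to make concrete: for a north-up raster, mapping a geographic bounding box to a pixel read window amounts to inverting an affine geotransform. The Python sketch below assumes a GDAL-style geotransform layout with no rotation; it illustrates only the per-image arithmetic that a ``Search-then-Compute-then-Read'' pipeline repeats for every candidate image, and is not this paper's implementation.

```python
def bbox_to_window(bbox, geotransform):
    """Map a geographic bbox to a pixel window (col_off, row_off, width, height).

    bbox: (min_x, min_y, max_x, max_y) in the image CRS.
    geotransform: GDAL-style (origin_x, pixel_w, 0, origin_y, 0, -pixel_h),
    assuming a north-up image with no rotation (illustrative only).
    """
    ox, pw, _, oy, _, nph = geotransform
    ph = -nph  # pixel height is stored negative for north-up rasters
    min_x, min_y, max_x, max_y = bbox
    col_off = int((min_x - ox) / pw)
    row_off = int((oy - max_y) / ph)
    width = int(round((max_x - min_x) / pw))
    height = int(round((max_y - min_y) / ph))
    return col_off, row_off, width, height

# A 10 m resolution tile whose top-left corner sits at (600000, 5300000):
window = bbox_to_window((600100, 5299800, 600300, 5299900),
                        (600000, 10, 0, 5300000, 0, -10))
```

Under concurrency, each retrieval repeats this computation and then issues its own uncoordinated read, which is the dual cost the remainder of the paper targets.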

To address the problems above, we propose a novel ``Index-as-an-Execution-Plan'' paradigm to strictly bound the retrieval latency. Unlike conventional approaches that treat indexing and I/O execution as separate stages, our approach integrates fine-grained partial retrieval directly into the indexing structure. By pre-materializing the mapping between logical spatial grids and physical pixel windows, our system enables deterministic I/O planning without runtime geometric computation. To further ensure scalability, we introduce a concurrency control protocol tailored for spatio-temporal range retrievals and an automatic I/O tuning mechanism. The principal contributions of this paper are summarized as follows:

\begin{enumerate}
\item We propose an I/O-aware ``Index-as-an-Execution-Plan'' schema. Instead of merely returning candidate image identifiers, our index directly translates high-level spatio-temporal predicates into concrete, byte-level windowed read plans. This design bridges the semantic gap between logical retrievals and physical storage, eliminating expensive runtime geospatial computations and ensuring that I/O cost is strictly proportional to the retrieval footprint.

\item We propose a hybrid concurrency-aware I/O coordination protocol. This protocol adapts transaction processing principles by integrating Calvin-style deterministic ordering \cite{Thomson12Calvin} with optimistic execution \cite{Lim17OCC}. It shifts the focus from protecting database rows to coordinating shared I/O flows. The protocol dynamically switches strategies based on spatial contention, effectively converting ``I/O contention'' into ``request merging opportunities''.

\item We propose an automatic I/O tuning method to improve the I/O performance of spatio-temporal range retrievals over remote sensing data. The method extends an existing AI-powered I/O tuning framework \cite{Rajesh24TunIO} based on a surrogate-assisted genetic multi-armed bandit algorithm \cite{Preil25GMAB}.
\end{enumerate}
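As an illustration of the first contribution, the ``Index-as-an-Execution-Plan'' idea can be pictured as an inverted map from grid cells to pre-materialized windowed read plans, so that a retrieval resolves to byte-level reads by lookup rather than by geometry. All names below (ReadPlan, plan_query, the file identifiers) are hypothetical, not the paper's actual schema.

```python
from collections import namedtuple

# Hypothetical sketch: the index maps a (grid cell, time bucket) key straight
# to pre-materialized windowed read plans, so execution is a dictionary lookup.
ReadPlan = namedtuple("ReadPlan", "file_id byte_offset byte_length pixel_window")

index = {
    ("cell_42", "2024-06"): [
        ReadPlan("S2A_tile_0042.tif", 4096, 65536, (0, 0, 256, 256)),
        ReadPlan("S2B_tile_0042.tif", 8192, 65536, (128, 0, 256, 256)),
    ],
}

def plan_query(cells, time_bucket):
    """Resolve a retrieval to concrete read plans without runtime geometry.

    Grid-to-pixel mappings were materialized at ingestion time, so the
    geometric cost C_geo drops out of the retrieval path entirely.
    """
    plans = []
    for cell in cells:
        plans.extend(index.get((cell, time_bucket), []))
    return plans

plans = plan_query(["cell_42"], "2024-06")
```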

\par
The remainder of this paper is organized as follows:
Section~\ref{sec:RW} presents the related work.
Section~\ref{sec:DF} formalizes the spatio-temporal range retrieval problem.
Section~\ref{sec:Index} proposes the indexing structure.
Section~\ref{sec:CC} proposes the hybrid concurrency control protocol.
Section~\ref{sec:Tuning} proposes the I/O stack tuning method.
Section~\ref{sec:EXP} presents the experiments and results.
Section~\ref{sec:Con} concludes this paper with a summary.

\section{Related Work}\label{sec:RW}
This section reviews the most salient studies on I/O-efficient spatio-temporal retrieval processing, concurrency control, and I/O performance tuning.

\subsection{I/O-Efficient Spatio-Temporal Retrieval Processing}
Efficient spatio-temporal query processing for remote sensing data has been extensively studied, with early efforts primarily focusing on metadata organization and index-level pruning in relational database systems. Traditional approaches typically extend tree-based spatial indexes, such as R-tree \cite{Strobl08PostGIS}, quadtree \cite{Tang12Quad-Tree}, and their spatio-temporal variants \cite{Simoes16PostGIST}, to organize image footprints together with temporal attributes, and are commonly implemented on relational backends (e.g., MySQL and PostgreSQL). These methods provide efficient range filtering for moderate-scale datasets, but their reliance on balanced tree structures often leads to high maintenance overhead and limited scalability as the volume of remote sensing metadata grows rapidly. With the continuous increase in data volume and ingestion rate, recent systems have gradually shifted toward grid-based spatio-temporal indexing schemes deployed on distributed NoSQL stores. By encoding spatial footprints into uniform spatial grids using GeoHash \cite{suwardi15geohash}, GeoSOT \cite{Yan21RS_manage1}, or space-filling curves \cite{hughes15geomesa}, \cite{liu24mstgi}, and combining them with temporal identifiers, these approaches enable lightweight index construction and better horizontal scalability on backends such as HBase and Elasticsearch. Such grid-based indexes can effectively reduce the candidate search space through coarse-grained pruning and are more suitable for large-scale, continuously growing remote sensing archives.
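The grid encodings surveyed above share one core mechanism: linearizing two-dimensional cells so that a one-dimensional key-value store can serve spatial range scans. A minimal bit-interleaving (Z-order) sketch is shown below; it is illustrative only and is not the exact encoding of GeoHash, GeoSOT, or GeoMesa.

```python
def z_order(col, row, bits=16):
    """Interleave the bits of (col, row) into a single Z-order key.

    Nearby cells share key prefixes, which is what lets grid-based indexes
    prune candidates with cheap range scans on a 1-D backend.
    """
    key = 0
    for i in range(bits):
        key |= ((col >> i) & 1) << (2 * i)
        key |= ((row >> i) & 1) << (2 * i + 1)
    return key

# Neighbouring cells map to nearby keys:
keys = [z_order(c, r) for c, r in [(0, 0), (1, 0), (0, 1), (1, 1)]]
```

Appending a temporal identifier to such a key yields the composite spatio-temporal row keys used by the systems cited above.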

\par
However, index pruning alone is insufficient to guarantee end-to-end retrieval efficiency for remote sensing workloads, where individual images are usually large and retrieval results require further pixel-level processing. To reduce the amount of raw I/O, the Google Earth Engine system \cite{gorelick17GEE} relies on tiling and multi-resolution pyramids that physically split images into small blocks, while more recent solutions leverage COG and window-based I/O to enable partial reads from monolithic image files. Frameworks such as OpenDataCube \cite{LEWIS17datacube} exploit these features to read only the image regions intersecting a retrieval window, thereby reducing unnecessary data transfer. Nevertheless, after candidate images are identified, most systems still perform fine-grained geospatial computations for each image, including coordinate transformations and precise pixel-window derivation, which may incur substantial overhead when many images are involved.

\subsection{Concurrency Control}
Concurrency control has long been studied to provide correctness and high throughput in multi-user database and storage systems, with two broad paradigms dominating the literature: deterministic scheduling \cite{Thomson12Calvin} and non-deterministic schemes \cite{Bernstein812PL}, \cite{KungR81OCC}. Hybrid approaches \cite{WangK16MVOCC}, \cite{Hong25HDCC} that adaptively combine these paradigms seek to exploit the low-conflict efficiency of deterministic execution while retaining the flexibility of optimistic techniques. More recent proposals such as OOCC target read-heavy, disaggregated settings by reducing validation and round-trips for read-only transactions, achieving low latency under OLTP-like workloads \cite{Wu25OOCC}. These CC families are primarily optimized for record- or key-level access patterns: their metrics and designs emphasize transaction latency, abort rates, and throughput under workloads with small, well-defined read/write sets.

\par
Overall, existing concurrency control mechanisms are largely designed around transaction-level correctness and throughput, assuming record- or key-based access patterns and treating storage I/O as a black box. Their optimization objectives rarely account for I/O amplification or fine-grained storage contention induced by concurrent range retrievals. Consequently, these approaches are ill-suited for data-intensive spatio-temporal workloads, where coordinating overlapping window reads and mitigating storage-level interference are critical to achieving scalable performance under multi-user access.
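The coordination gap identified above can be pictured with a toy byte-range coalescer: when a coordinator merges the overlapping (offset, length) requests of concurrent retrievals before hitting storage, contention becomes shared I/O. This is an illustrative sketch under a one-dimensional byte-range model, not the protocol proposed in this paper.

```python
def merge_ranges(ranges, gap=0):
    """Coalesce overlapping or adjacent (offset, length) read requests.

    Illustrative only: a coordinator can issue the merged reads once and
    fan the bytes back out to the waiting retrievals.
    """
    spans = sorted((off, off + ln) for off, ln in ranges)
    merged = [list(spans[0])]
    for start, end in spans[1:]:
        if start <= merged[-1][1] + gap:  # overlaps or touches the last span
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]

# Three concurrent retrievals touching overlapping windows of one file:
reqs = merge_ranges([(0, 4096), (2048, 4096), (16384, 1024)])
```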

\subsection{I/O Performance Tuning in Storage Systems}
I/O performance tuning has been extensively studied in the context of HPC and data-intensive storage systems, where complex multi-layer I/O stacks expose a large number of tunable parameters. These parameters span different layers, including application-level I/O libraries, middleware, and underlying storage systems, and their interactions often lead to highly non-linear performance behaviors. As a result, manual tuning is time-consuming and error-prone, motivating a wide range of auto-tuning approaches.
Several studies focus on improving the efficiency of the tuning pipeline itself.
\par
User-level I/O tuning has also been explored, most notably by H5Tuner \cite{Behzad13HDF5}, which employs genetic algorithms to optimize the configuration of the HDF5 I/O library. Although effective for single-layer tuning, H5Tuner does not consider cross-layer interactions and lacks mechanisms for reducing tuning cost, such as configuration prioritization or early stopping.

\par
More recently, TunIO \cite{Rajesh24TunIO} proposed an AI-powered I/O tuning framework that explicitly targets the growing configuration spaces of modern I/O stacks. TunIO integrates several advanced techniques, including I/O kernel extraction, smart selection of high-impact parameters, and reinforcement learning-driven early stopping, to balance tuning cost and performance gain across multiple layers. Despite its effectiveness, TunIO and related frameworks primarily focus on single-application or isolated workloads, assuming stable access patterns during tuning. Retrieval-level I/O behaviors, such as fine-grained window access induced by spatio-temporal range retrievals, as well as interference among concurrent users, are generally outside the scope of existing I/O tuning approaches.
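For intuition about bandit-based tuning, the sketch below runs a plain epsilon-greedy bandit over a handful of I/O configurations with a synthetic reward. It is a toy stand-in for the surrogate-assisted GMAB tuner of \cite{Preil25GMAB}, which is considerably richer; the configuration values and the reward function are invented for illustration.

```python
import random

def tune(configs, measure, rounds=200, eps=0.1, seed=0):
    """Select an I/O configuration with an epsilon-greedy bandit.

    measure(cfg) returns an observed reward (e.g., negative latency).
    Toy illustration only; not the surrogate-assisted GMAB algorithm.
    """
    rng = random.Random(seed)
    counts = {c: 0 for c in configs}
    means = {c: 0.0 for c in configs}
    for _ in range(rounds):
        if rng.random() < eps:
            cfg = rng.choice(configs)          # explore
        else:
            cfg = max(configs, key=lambda c: means[c])  # exploit
        r = measure(cfg)
        counts[cfg] += 1
        means[cfg] += (r - means[cfg]) / counts[cfg]  # running average
    return max(configs, key=lambda c: means[c])

# Synthetic reward: larger read buffers help, but only up to a point.
best = tune([4, 64, 1024], lambda c: -abs(c - 64))
```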

\section{Definitions}\label{sec:DF}
This section formalizes the spatio-temporal range retrieval problem and establishes the cost models for retrieval execution. We assume a distributed storage environment where large-scale remote sensing images are stored as objects or files.

\par
Definition~1 (Spatio-temporal Remote Sensing Image). A remote sensing image $R$ is defined as a tuple:
\begin{equation}
R = \langle id, \Omega, \mathcal{D}, t \rangle,
\end{equation}
where $id$ is the unique identifier; $\Omega = [0, W] \times [0, H]$ denotes the pixel coordinate space; $\mathcal{D}$ represents the raw pixel data; and $t$ is the temporal validity interval. The image is associated with a spatial footprint $MBR(R)$ in the global coordinate reference system.

\par
Definition~2 (Spatio-temporal Range Retrieval). Given a dataset $\mathbb{R}$, a retrieval $Q$ is defined by a spatio-temporal predicate $Q = \langle S, T \rangle$, where $S$ is the spatial bounding box and $T$ is the time interval. The retrieval result set $\mathcal{R}_Q$ is defined as:
\vspace{-0.05in}
\begin{equation}
\label{eqn:pre_st_query}
\mathcal{R}_Q = \left\{ R \in \mathbb{R} \mid MBR(R) \cap S \neq \emptyset \wedge t(R) \cap T \neq \emptyset \right\}.
\end{equation}
For each $R \in \mathcal{R}_Q$, the system must return the pixel matrix corresponding to the intersection region $MBR(R) \cap S$.
|
|
||||||
\par
Definition~3 (Retrieval Execution Cost Model). The execution latency of a retrieval $Q$, denoted as $Cost(Q)$, is composed of two phases: metadata filtering and data extraction.
\begin{equation}
\label{eqn:cost_total}
Cost\left( Q \right) =C_{meta}\left( Q \right) +\sum_{R\in \mathcal{R}_Q}{\left( C_{geo}\left( R,Q \right) +C_{io}\left( R,Q \right) \right)}.
\end{equation}
Here, $C_{meta}(Q)$ is the cost of identifying candidate images $\mathcal{R}_Q$ using indices. The data extraction cost for each image consists of two components: geospatial computation cost ($C_{geo}$) and I/O access cost ($C_{io}$). $C_{geo}$ is the CPU time required to calculate the pixel-to-geographic mapping, determine the exact read windows (offsets and lengths), and handle boundary clipping. In window-based partial reading schemes, this cost is non-negligible due to the complexity of coordinate transformations. $C_{io}$ is the latency to fetch the actual binary data from storage.

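As a concrete illustration, the cost decomposition above can be sketched as follows. This is a minimal model of the formula only; the component values and the function name `query_cost` are our own illustrative placeholders, not part of the system.

```python
def query_cost(c_meta, per_image_costs):
    """Total latency per the cost model: metadata filtering cost plus,
    for each candidate image, its geospatial computation and I/O costs.

    per_image_costs: iterable of (c_geo, c_io) pairs, one per image in R_Q.
    """
    return c_meta + sum(c_geo + c_io for c_geo, c_io in per_image_costs)

# Hypothetical example: 5 ms metadata filtering, three candidate images.
total = query_cost(5.0, [(2.0, 40.0), (1.5, 35.0), (2.5, 50.0)])
```

Under this model, shrinking either $C_{geo}$ or $C_{io}$ per image directly reduces the sum, which is the lever exploited later in the paper.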
\par
Definition~4 (Concurrent Spatio-temporal Retrievals). Let $\mathcal{Q} = \{Q_1, Q_2, \ldots, Q_N\}$ denote a set of spatio-temporal range retrievals issued concurrently by multiple users.
Each retrieval $Q_i$ independently specifies a spatio-temporal window $\langle S_i, T_i \rangle$ and may overlap with others in both spatial and temporal dimensions. Concurrent execution of $\mathcal{Q}$ may induce overlapping partial reads over the same images or image regions, leading to redundant I/O and storage-level contention if retrievals are processed independently.

\par
\textbf{Problem Statement (Latency-Optimized Concurrent Retrieval Processing).} Given a dataset $\mathbb{R}$ and a concurrent workload $\mathcal{Q}$, the objective is to minimize the total execution latency:
\vspace{-0.05in}
\begin{equation}
\label{eqn_pre_objective}
\min \sum_{i=1}^{N}{Cost\left( Q_i \right)}.
\end{equation}

\section{I/O-aware Indexing Structure}\label{sec:Index}
This section introduces the details of the indexing structure for spatio-temporal range retrieval over remote sensing image data.

\begin{figure*}[htb]
\centering
\caption{The dual-layer index structure: (a) the G2I table and (b) the I2G table}
\label{fig:index}
\end{figure*}
Based on the grid decomposition, we construct a grid-centric inverted index to associate spatial units with covering images. In our system, each grid cell is assigned a unique \emph{GridKey}, encoded as a 64-bit Z-order value to preserve spatial locality and enable efficient range scans in key-value stores such as HBase. The \emph{G2I table} stores one row per grid cell, where the row key is the GridKey and the value maintains the list of image identifiers (ImageKeys) whose spatial footprints intersect the corresponding cell, as illustrated in Fig.~\ref{fig:index}(a).

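A 64-bit Z-order GridKey of this kind can be sketched by bit interleaving. This is a generic Morton-code sketch under the assumption that the $x$ coordinate occupies the even bit positions and $y$ the odd ones; the system's exact bit layout is not specified here.

```python
def grid_key(gx: int, gy: int) -> int:
    """Interleave the bits of two 32-bit grid coordinates into a single
    64-bit Z-order (Morton) key, so spatially nearby cells obtain
    numerically nearby keys suitable for key-value range scans."""
    key = 0
    for i in range(32):
        key |= ((gx >> i) & 1) << (2 * i)       # x bits -> even positions
        key |= ((gy >> i) & 1) << (2 * i + 1)   # y bits -> odd positions
    return key
```

Because the key order follows the Z-curve, a contiguous key range in the store covers a spatially compact block of cells.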
\par
This grid-to-image mapping allows retrieval processing to begin with a lightweight enumeration of grid cells covered by a retrieval region, followed by direct lookups of candidate images via exact GridKey matches. By treating each grid cell as an independent spatial bucket, the G2I table provides efficient metadata-level pruning and avoids costly geometric intersection tests over large image footprints.

\par
However, the G2I table alone is insufficient for I/O-efficient retrieval execution. While it identifies which images are relevant to a given grid cell, it does not capture how the grid cell maps to pixel regions within each image. As a result, a grid-only representation cannot directly guide partial reads and would still require per-image geospatial computations at retrieval time. Therefore, the G2I table functions as a coarse spatial filter and must be complemented by an image-centric structure that materializes the correspondence between grid cells and pixel windows, enabling fine-grained, window-based I/O.

\par
\textbf{Image-to-Grid Mapping (I2G).}
To complement the grid-centric G2I table and enable fine-grained, I/O-efficient data access, we introduce an image-centric inverted structure, referred to as the Image-to-Grid mapping (I2G). In contrast to G2I, which organizes metadata by spatial grids, the I2G table stores all grid-level access information of a remote sensing image in a single row. Each image therefore occupies exactly one row in the table, significantly improving locality during retrieval execution.

\par
As illustrated in Fig.~\ref{fig:index}(b), the row key of the I2G table is the \emph{ImageKey}, i.e., the unique identifier of a remote sensing image. The row value is organized into three column families, each serving a distinct role in retrieval-time pruning and I/O coordination:

\par
\textit{Grid–Window Mapping.}
This column family records the list of grid cells intersected by the image together with the corresponding pixel windows, stored as pairs $(\textit{GridKey}, W_{ImageKey\_GridKey})$,
where \textit{GridKey} identifies a grid cell at the chosen global resolution, and $W_{ImageKey\_GridKey}$ denotes the minimal pixel bounding rectangle within the image that exactly covers that grid cell.

\par
These precomputed window offsets allow the retrieval executor to directly issue windowed reads on large raster files without loading entire images into memory or recomputing geographic-to-pixel transformations at retrieval time. As a result, grid cells become the smallest unit of coordinated I/O, enabling precise partial reads and effective elimination of redundant disk accesses.

\par
\textit{Temporal Metadata.}
To support spatio-temporal range retrievals, each image row includes a lightweight temporal column family that stores its acquisition time information, such as the sensing timestamp or time interval. This metadata enables efficient temporal filtering to be performed jointly with spatial grid matching, without consulting external catalogs or secondary indexes.

\par
\textit{Storage Pointer.}
This column family contains the information required to retrieve image data from the underlying storage system. It stores a stable file identifier, such as an object key in an object store (e.g., MinIO/S3) or an absolute path in a POSIX-compatible file system. By decoupling logical image identifiers from physical storage locations, this design supports flexible deployment across heterogeneous storage backends while allowing the retrieval engine to directly access image files once relevant pixel windows have been identified.

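Putting the three column families together, one I2G row can be modeled as follows. This is a purely illustrative in-memory sketch; the field names, key formats, and example values are our own assumptions, not the system's schema.

```python
# Hypothetical in-memory model of a single I2G row (one row per image).
i2g_row = {
    "row_key": "IMG_20230901_T31TCJ",            # ImageKey (illustrative)
    "grid_windows": {                            # Grid-Window Mapping CF
        1001: (0, 0, 256, 256),                  # GridKey -> (col, row, w, h)
        1002: (256, 0, 256, 256),
    },
    "temporal": {"t_start": 1693526400,          # Temporal Metadata CF
                 "t_end": 1693530000},
    "storage": {"object_key":                    # Storage Pointer CF
                "s3://bucket/IMG_20230901_T31TCJ.tif"},
}

def windows_for(row, grid_keys):
    """Selective lookup: return only the precomputed pixel windows of the
    grids touched by a retrieval, without any geometric computation."""
    return {g: row["grid_windows"][g]
            for g in grid_keys if g in row["grid_windows"]}
```

Because every lookup is a plain dictionary access, retrieval-time work reduces to key matching, mirroring the claim that $C_{geo}$ is paid once at ingestion.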
\par
The I2G table offers several advantages. First, all grid-level access information for the same image is colocated in a single row, avoiding repeated random lookups and improving cache locality during retrieval execution. Second, by materializing grid-to-window correspondences at ingestion time, the system completely avoids expensive per-retrieval geometric computations and directly translates spatial overlap into byte-range I/O requests. Third, the number of rows in the I2G table scales with the number of images rather than the number of grid cells, substantially reducing metadata volume and maintenance overhead.

\par
During data ingestion, the grid–window mappings are generated by projecting grid boundaries into the image coordinate system using the image’s georeferencing parameters. This process requires only lightweight affine or RPC transformations and does not involve storing explicit geometries or performing polygon clipping. As a result, the I2G structure enables efficient partial reads while keeping metadata compact and ingestion costs manageable.

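For a north-up image, the ingestion-time projection can be sketched with a plain affine geotransform. The sketch below assumes a top-left origin and square pixels; RPC models and rotated transforms, which the paper also mentions, are omitted.

```python
import math

def grid_to_window(grid_bounds, origin, pix_size, img_w, img_h):
    """Project a grid cell's geographic bounds into a pixel window
    (col, row, width, height) of an image, using a simple north-up
    affine geotransform with top-left origin (x0, y0)."""
    xmin, ymin, xmax, ymax = grid_bounds
    x0, y0 = origin
    col0 = max(0, math.floor((xmin - x0) / pix_size))
    col1 = min(img_w, math.ceil((xmax - x0) / pix_size))
    row0 = max(0, math.floor((y0 - ymax) / pix_size))   # rows grow downward
    row1 = min(img_h, math.ceil((y0 - ymin) / pix_size))
    if col0 >= col1 or row0 >= row1:
        return None                  # grid cell does not intersect the image
    return (col0, row0, col1 - col0, row1 - row0)
```

Running this once per intersecting grid cell at ingestion yields exactly the $W_{ImageKey\_GridKey}$ entries stored in the Grid–Window Mapping column family.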
\subsection{Retrieval-time Execution}

\begin{figure}
\centering
\includegraphics[width=2.2in]{fig/st-query.png}
\caption{Retrieval-time Execution}
\label{fig_ST_Query}
\end{figure}

The I/O-aware index enables efficient spatio-temporal range retrievals by directly translating retrieval predicates into windowed read plans, while avoiding both full-image loading and expensive geometric computations. Given a user-specified spatio-temporal retrieval
$q = \langle [x_{\min}, y_{\min}, x_{\max}, y_{\max}], [t_s, t_e] \rangle$,
the system resolves the retrieval through three consecutive stages: \emph{Grid Enumeration}, \emph{Candidate Image Retrieval with Temporal Pruning}, and \emph{Windowed Read Plan Generation}. As illustrated in Fig.~\ref{fig_ST_Query}, this execution pipeline bridges high-level retrieval predicates and low-level I/O operations in a fully deterministic manner.

\par
\textbf{Grid Enumeration.}
As shown in Step~1 and Step~2 of Fig.~\ref{fig_ST_Query}, the retrieval execution starts by rasterizing the spatial footprint of $q$ into the fixed global grid at zoom level 14. Instead of performing recursive space decomposition as in quadtrees or hierarchical spatial indexes, our system enumerates the minimal set of grid cells
$\{g_1, \ldots, g_k\}$
whose footprints intersect the retrieval bounding box.

\par
Each grid cell corresponds to a unique 64-bit \textit{GridKey}, which directly matches the primary key of the G2I table. This design has important implications: grid enumeration has constant depth and low computational cost, and the resulting GridKeys can be directly used as lookup keys without any geometric refinement. Consequently, spatial key generation is reduced to simple arithmetic operations on integer grid coordinates.

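The enumeration itself is just integer arithmetic on the bounding box. The sketch below assumes a uniform $2^z \times 2^z$ longitude/latitude grid; the paper's actual tiling scheme at zoom level 14 may differ, so treat the world extent and cell layout here as illustrative assumptions.

```python
def enumerate_grids(xmin, ymin, xmax, ymax, zoom=14,
                    world=(-180.0, -90.0, 180.0, 90.0)):
    """Enumerate the (gx, gy) cells of a uniform 2^zoom x 2^zoom global
    grid intersecting a bounding box, using only floor divisions."""
    n = 1 << zoom
    wx0, wy0, wx1, wy1 = world
    cw = (wx1 - wx0) / n                     # cell width in degrees
    ch = (wy1 - wy0) / n                     # cell height in degrees
    gx0 = max(0, int((xmin - wx0) // cw))
    gx1 = min(n - 1, int((xmax - wx0) // cw))
    gy0 = max(0, int((ymin - wy0) // ch))
    gy1 = min(n - 1, int((ymax - wy0) // ch))
    return [(gx, gy) for gx in range(gx0, gx1 + 1)
                     for gy in range(gy0, gy1 + 1)]
```

Each resulting pair can then be folded into a 64-bit Z-order GridKey for the subsequent G2I lookups.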
\par
\textbf{Candidate Image Retrieval with Temporal Pruning.}
Given the enumerated grid set $\{g_1, \ldots, g_k\}$, the retrieval processor performs a batched multi-get on the G2I table. Each G2I row corresponds to a single grid cell and stores the identifiers of all images whose spatial footprints intersect that cell. For each grid $g_i$, the lookup returns:
\[
G2I[g_i] = \{ imgKey_1, \ldots, imgKey_m \}.
\]
The spatial candidate set is then obtained as the union
$C_s = \bigcup_{i=1}^{k} G2I[g_i]$.
This step eliminates the need for per-image polygon intersection tests that are commonly required in spatial databases and data cube systems.

\par
To incorporate the temporal constraint $[t_s, t_e]$, each candidate image in $C_s$ is further filtered using the temporal column family of the Image-to-Grid (I2G) table. Images whose acquisition time does not intersect the retrieval interval are discarded early, yielding the final candidate set $C$. This lightweight temporal pruning is performed without accessing any image data and introduces negligible overhead.

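This second stage can be sketched as a union of per-grid lookups followed by an interval-overlap test. The dictionaries `g2i` and `i2g_time` below are hypothetical in-memory stand-ins for the two HBase tables, with invented keys and timestamps.

```python
def candidates(g2i, i2g_time, grid_keys, t_s, t_e):
    """Stage 2: batched G2I lookups (C_s = union of G2I[g_i]) followed by
    temporal pruning against each image's acquisition interval."""
    c_s = set()
    for g in grid_keys:
        c_s.update(g2i.get(g, ()))           # spatial candidate set C_s
    # Keep images whose time interval intersects [t_s, t_e].
    return {img for img in c_s
            if i2g_time[img][0] <= t_e and i2g_time[img][1] >= t_s}

# Illustrative tables: two grid cells, three images with time intervals.
g2i = {1: ["imgA", "imgB"], 2: ["imgB", "imgC"]}
i2g_time = {"imgA": (0, 10), "imgB": (20, 30), "imgC": (5, 8)}
```

Note that no pixel data is touched: both pruning steps operate purely on metadata rows.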
\par
\textbf{Windowed Read Plan Generation.}
As shown in Step~3 of Fig.~\ref{fig_ST_Query}, the final stage translates the candidate image set into a concrete I/O plan. For each image $I \in C$, the retrieval executor issues a selective range-get on the I2G table to retrieve only the grid–window mappings relevant to the retrieval grids:

\begin{equation}
\label{eqn_pre_spatial_query}
I2G\left[ I,\{g_1,...,g_k\} \right] =\left\{ W_{I\_g_i}\mid g_i\cap I\ne \emptyset \right\}.
\end{equation}

\par
Each $W_{I\_g_i}$ specifies the exact pixel window in the original raster file that corresponds to grid cell $g_i$. Since these window offsets are precomputed during ingestion, retrieval execution requires only key-based lookups and arithmetic filtering. No geographic coordinate transformation, polygon clipping, or raster–vector intersection is performed at retrieval time.

\par
The resulting collection of pixel windows constitutes a \emph{windowed read plan}, which can be directly translated into byte-range I/O requests against the storage backend. This approach avoids loading entire scenes and ensures that the total I/O volume is proportional to the retrieved spatial extent rather than the image size.

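Stage 3 can be sketched as pure dictionary lookups over the I2G rows. As before, `i2g_windows` is a hypothetical in-memory stand-in for the I2G table, and the window tuples are invented examples.

```python
def build_read_plan(i2g_windows, cand_images, grid_keys):
    """Stage 3: translate the candidate set into a windowed read plan of
    (image, pixel_window) pairs using only key-based I2G lookups; no
    geometry is computed at retrieval time."""
    plan = []
    for img in sorted(cand_images):          # deterministic plan order
        row = i2g_windows.get(img, {})
        for g in grid_keys:
            w = row.get(g)                   # precomputed window, or None
            if w is not None:
                plan.append((img, w))
    return plan

# Illustrative I2G rows: ImageKey -> {GridKey: (col, row, w, h)}.
i2g_windows = {
    "imgA": {1: (0, 0, 256, 256), 2: (256, 0, 256, 256)},
    "imgC": {2: (128, 128, 256, 256)},
}
```

Each emitted pair maps directly to one byte-range request, so the plan size bounds the total I/O volume.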
\subsection{Why I/O-aware}
The key reason our indexing design is I/O-aware lies in the fact that the index lookup results are not merely candidate identifiers, but constitute a concrete I/O access plan. Unlike traditional spatial indexes, where retrieval processing yields a set of objects that must still be fetched through opaque storage accesses, our Grid-to-Image and Image-to-Grid lookups deterministically produce the exact pixel windows to be read from disk. As a result, the logical retrieval plan and the physical I/O plan are tightly coupled: resolving a spatio-temporal predicate directly specifies which byte ranges should be accessed and which can be skipped.

\par
This tight coupling fundamentally changes the optimization objective. Instead of minimizing index traversal cost or result-set size, the system explicitly minimizes data movement by ensuring that disk I/O is proportional to the retrieval's spatio-temporal footprint. Consequently, the index serves as an execution-aware abstraction that bridges retrieval semantics and storage behavior, enabling predictable, bounded I/O under both single-retrieval and concurrent workloads.

\par
\textbf{Theoretical Cost Analysis.}
To rigorously quantify the performance advantage, we revisit the retrieval cost model defined in Eq. (\ref{eqn:cost_total}):
\begin{equation*}
Cost(Q) = C_{meta}(Q) + \sum_{R \in \mathcal{R}_Q} \left( C_{geo}(R, Q) + C_{io}(R, Q) \right).
\end{equation*}
In traditional full-image reading systems, although the geospatial computation cost is negligible ($C_{geo} = 0$) as no clipping is performed, the I/O cost $C_{io}$ is determined by the full file size. Consequently, the total latency is entirely dominated by massive I/O overhead, rendering $C_{meta}$ (typically milliseconds) irrelevant.

\par
Existing window-based I/O systems (e.g., ODC or COG-aware libraries) successfully reduce the I/O cost to the size of the requested window. However, this reduction comes at the expense of a significant surge in $C_{geo}$. For every candidate image, the system must perform on-the-fly coordinate transformations and polygon clipping to calculate read offsets. When a retrieval involves thousands of images, the accumulated CPU time ($\sum C_{geo}$) becomes a new bottleneck (e.g., hundreds of milliseconds to seconds), often negating the benefits of I/O reduction (detailed quantitative comparisons are provided in Sec.~\ref{sec:Index_exp_2}).

\par
In contrast, our I/O-aware indexing approach fundamentally alters this trade-off. By materializing the grid-to-pixel mapping in the I2G table, we effectively shift the computational burden from retrieval time to ingestion time. Although the two-phase lookup (G2I and I2G) introduces a slight overhead compared to simple tree traversals, $C_{meta}$ remains on the order of milliseconds, orders of magnitude smaller than disk I/O latency. Since the precise pixel windows are pre-calculated and stored, the runtime geospatial computation is effectively eliminated, i.e., $C_{geo} = 0$. The system retains the minimal I/O cost characteristic of window-based approaches, fetching only relevant byte ranges. Therefore, our design achieves the theoretical minimum for both computation and I/O components within the retrieval execution critical path.

\section{Hybrid Concurrency-Aware I/O Coordination}\label{sec:CC}
|
\section{Hybrid Concurrency-Aware I/O Coordination}\label{sec:CC}
|
||||||
In this section, we propose a hybrid coordination mechanism that adaptively employs either lock-free non-deterministic execution or deterministic coordinated scheduling based on the real-time contention level of spatio-temporal workloads.
|
In this section, we propose a hybrid coordination mechanism that adaptively employs either lock-free non-deterministic execution or deterministic coordinated scheduling based on the real-time contention level of spatio-temporal workloads.
|
||||||
@@ -557,55 +557,55 @@ In this section, we propose a hybrid coordination mechanism that adaptively empl
|
|||||||
\end{figure}
|
\end{figure}
|
||||||
\subsection{Retrieval Admission and I/O Plan Generation}
When a spatio-temporal range retrieval $Q$ arrives, the system first performs index-driven plan generation. The retrieval footprint is rasterized into the global grid to enumerate the intersecting grid cells. The G2I table is then consulted to retrieve the set of candidate images, followed by selective lookups in the I2G table to obtain the corresponding pixel windows.
\par
As a result, each retrieval is translated into an explicit \emph{I/O access plan} consisting of image–window pairs:
\vspace{-0.05in}
\begin{equation}
\label{eq:io_plan}
Plan\left( Q \right) =\left\{ \left( img_1,w_1 \right) ,\left( img_1,w_2 \right) ,\left( img_3,w_5 \right) ,\dots \right\},
\end{equation}
where each window $w$ denotes a concrete pixel range to be accessed via byte-range I/O. Upon admission, the system assigns each retrieval a unique \emph{RetrievalID} and records its arrival timestamp.
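The two-stage plan generation described above can be sketched as follows. This is a minimal illustration, not the system's actual API: the `g2i` and `i2g` dictionaries stand in for the paper's G2I and I2G tables, and all names are hypothetical.

```python
# Sketch of index-driven I/O plan generation: rasterized footprint cells
# are resolved through G2I (cell -> images) and I2G ((image, cell) ->
# pre-materialized pixel window), so no geometry runs at retrieval time.

def generate_io_plan(query_cells, g2i, i2g):
    """Translate a set of intersecting grid-cell ids into an explicit
    I/O plan of (image, pixel_window) pairs."""
    plan = set()
    for cell in query_cells:
        for img in g2i.get(cell, ()):       # candidate images for this cell
            window = i2g.get((img, cell))   # pre-computed pixel window
            if window is not None:
                plan.add((img, window))
    return sorted(plan)

# Toy index: cell 1 lies in img1; cell 2 lies in both img1 and img3.
g2i = {1: ["img1"], 2: ["img1", "img3"]}
i2g = {("img1", 1): (0, 0, 256, 256),
       ("img1", 2): (256, 0, 256, 256),
       ("img3", 2): (0, 0, 256, 256)}
plan = generate_io_plan({1, 2}, g2i, i2g)
```

Because the windows are materialized at ingestion, the whole translation is a pair of hash lookups per cell, matching the $C_{geo}=0$ claim above.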
\subsection{Contention Estimation and Path Selection}
To minimize the overhead of global ordering in low-contention scenarios, the system introduces a Contention-Aware Switch. Upon the arrival of a retrieval batch $\mathcal{Q} = \{Q_1, Q_2, ..., Q_n\}$, the system first estimates the Spatial Overlap Ratio ($\sigma$) among their generated I/O plans.
\par
Let $A(Plan(Q_i))$ be the aggregate spatial area of all pixel windows in the I/O plan of retrieval $Q_i$. The overlap ratio $\sigma$ for a batch is defined as:
\vspace{-0.05in}
\begin{equation}
\vspace{-0.05in}
\label{eqn_overlap_ratio}
\sigma = 1 - \frac{A(\bigcup_{i=1}^n Plan(Q_i))}{\sum_{i=1}^n A(Plan(Q_i))},
\end{equation}
where $\sigma \in [0, 1]$. A high $\sigma$ indicates that multiple retrievals are competing for the same image regions, leading to high I/O amplification if executed independently.
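The overlap estimate and the resulting path choice can be sketched as follows. As a simplifying assumption, each plan is represented by its set of equal-area grid cells, so $A(\cdot)$ reduces to a cell count; the function names and the default $\tau$ are illustrative, not the system's configuration.

```python
# Sketch of sigma = 1 - A(union of plans) / sum of A(each plan),
# with plans modeled as sets of equal-area grid cells.

def overlap_ratio(plans):
    union = set().union(*plans)
    total = sum(len(p) for p in plans)
    return 1.0 - len(union) / total

def choose_path(plans, tau=0.2):
    """Path A: optimistic execution; Path B: deterministic coordination."""
    return "A" if overlap_ratio(plans) < tau else "B"

# Two retrievals sharing cells 3 and 4: sigma = 1 - 6/8 = 0.25.
p1, p2 = {1, 2, 3, 4}, {3, 4, 5, 6}
sigma = overlap_ratio([p1, p2])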
\par
The system utilizes a rule-based assignment mechanism similar to HDCC \cite{Hong25HDCC} to select the execution path:
\begin{enumerate}
\item Path A (Non-deterministic/OCC-style): If $\sigma < \tau$ (where $\tau$ is a configurable threshold), retrievals proceed directly to execution to maximize concurrency.
\item Path B (Deterministic/Calvin-style): If $\sigma \ge \tau$, retrievals are routed to the Global I/O Plan Queue for coordinated merging.
\end{enumerate}
\subsection{Deterministic Coordinated and Non-deterministic Execution}
When $\sigma \ge \tau$, the system switches to a deterministic path to mitigate storage-level contention and I/O amplification, as shown in Fig.~\ref{fig:cc}. To coordinate concurrent access to shared storage resources, we introduce a \emph{Global I/O Plan Queue} that enforces a deterministic ordering over all admitted I/O plans. Each windowed access $(img, w)$ derived from incoming retrievals is inserted into this queue according to a predefined policy, such as FIFO based on arrival time or lexicographic ordering by $(timestamp, RetrievalID)$.
\par
This design is inspired by deterministic scheduling in systems such as Calvin, but differs fundamentally in its scope: the ordering is imposed on \emph{window-level I/O operations} rather than on transactions. As a result, accesses to the same image region across different retrievals follow a globally consistent order, preventing uncontrolled interleaving of reads and reducing contention at the storage layer. The deterministic ordering also provides a stable foundation for subsequent I/O coordination and sharing.
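The deterministic ordering can be sketched with a priority queue keyed by $(timestamp, RetrievalID)$; this is a simplified single-process illustration, and the class and field names are assumptions rather than the system's implementation.

```python
import heapq

# Minimal sketch of the Global I/O Plan Queue: windowed accesses drain in a
# deterministic (timestamp, RetrievalID) order regardless of the order in
# which concurrent threads happened to enqueue them.

class GlobalIOPlanQueue:
    def __init__(self):
        self._heap = []

    def admit(self, timestamp, retrieval_id, img, window):
        heapq.heappush(self._heap, (timestamp, retrieval_id, img, window))

    def drain(self):
        while self._heap:
            yield heapq.heappop(self._heap)

q = GlobalIOPlanQueue()
q.admit(5, "R2", "img1", (0, 0, 256, 256))    # enqueued first...
q.admit(3, "R1", "img1", (256, 0, 256, 256))  # ...but logically earlier
order = [rid for _, rid, _, _ in q.drain()]   # deterministic: R1 before R2
```

Because every replica of the queue drains accesses in the same total order, reads of the same image region never interleave in an uncontrolled way.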
\par
The core of our approach lies in coordinating concurrent windowed reads at the image level. Windows originating from different retrievals may overlap spatially, be adjacent, or even be identical. Executing these requests independently would lead to redundant reads and excessive I/O amplification.
\par
To address this, the system performs three coordination steps within each scheduling interval. Stage 1: Global De-duplication. The system first extracts all windowed access pairs $(img, w)$ from the admitted retrievals and inserts them into a global window set ($\mathcal{W}_{total}$). If multiple retrievals $Q_1, Q_2, \ldots, Q_n$ request the same pixel window $w$ from image $img$, the system retains only one unique entry in $\mathcal{W}_{total}$. This stage ensures that any specific byte range is identified as a single logical requirement, effectively preventing the redundant retrieval of overlapping spatial grids. Stage 2: Range Merging. After de-duplication, the system analyzes the physical disk offsets of all unique windows in $\mathcal{W}_{total}$. To improve access locality, windows that are physically contiguous or separated by a gap smaller than a threshold $\delta$ are merged into a single read. Stage 3: Dispatching. This stage maintains a mapping between the physical byte offsets in the buffer and the logical window requirements of each active retrieval. Each retrieval $Q_i$ receives only the exact pixel windows $w \in Plan(Q_i)$ it originally requested. This is achieved via zero-copy memory mapping where possible, or by slicing the shared system buffer into local thread-wise structures. This ensures that while the physical I/O is shared to reduce amplification, the logical execution of each retrieval remains independent and free from irrelevant data interference.
\par
For example, when $Q_1$ requests grids $\{1, 2\}$ and $Q_2$ requests grids $\{2, 3\}$, Stage 1 identifies the unique requirement set $\{1, 2, 3\}$. Stage 2 then merges these into a single contiguous I/O operation covering the entire range $[1, 3]$. In Stage 3, the dispatcher identifies memory offsets corresponding to grids $1$ and $2$ within the buffer and maps these slices to the private cache of $Q_1$. For $Q_2$, similarly, the dispatcher extracts and delivers slices for grids $2$ and $3$ to $Q_2$.
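The three stages can be illustrated on the 1-D grid example above. The `coordinate` helper, its gap parameter, and the returned structures are hypothetical simplifications of the system's window coordinator, shown only to make the de-duplicate/merge/dispatch flow concrete.

```python
# Sketch of the three coordination stages on 1-D grid ids.

def coordinate(plans, gap=1):
    # Stage 1: global de-duplication across all admitted retrievals.
    unique = sorted(set().union(*plans.values()))
    # Stage 2: merge cells that are contiguous or within `gap` into one read.
    reads, start, prev = [], unique[0], unique[0]
    for g in unique[1:]:
        if g - prev <= gap:
            prev = g            # extend the current merged read
        else:
            reads.append((start, prev))
            start = prev = g    # start a new read
    reads.append((start, prev))
    # Stage 3: dispatch each retrieval's requested cells back out of the
    # shared buffer (here just the requested ids, sorted).
    served = {rid: sorted(cells) for rid, cells in plans.items()}
    return reads, served

reads, served = coordinate({"Q1": {1, 2}, "Q2": {2, 3}})
# One merged read covering [1, 3]; Q1 receives cells 1-2, Q2 receives 2-3.
```

The merged read set is what actually reaches storage, while the `served` mapping preserves each retrieval's logical view.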
\par
Through these mechanisms, concurrent retrievals collaboratively share I/O, and the execution unit becomes a coordinated window read rather than an isolated request. Importantly, this coordination operates entirely at the I/O planning level and does not require any form of locking or transaction-level synchronization.
\par
When contention remains below the threshold ($\sigma < \tau$), the system prioritizes low latency over merging efficiency by adopting an optimistic dispatch mechanism, as shown in Fig.~\ref{fig:cc}. Instead of undergoing heavy-weight sorting, I/O plans are immediately offloaded to the execution engine. By utilizing thread-local sublists, each thread independently handles its byte-range requests.
Once a coordinated window read is scheduled, the system issues the corresponding byte-range I/O request immediately. Read execution is fully optimistic: there is no validation phase, no abort, and no rollback. This is enabled by the immutability of remote-sensing imagery and by the deterministic ordering of I/O plans, which together ensure consistent and repeatable read behavior.
\par
A retrieval is considered complete when all windows in its I/O plan have been served and the associated local processing (e.g., reprojection or mosaicking) has finished. By eliminating validation overhead and allowing read execution to proceed independently once scheduled, the system achieves low-latency retrieval completion while maintaining predictable I/O behavior under concurrency.
\par
Overall, this concurrency-aware I/O coordination mechanism reinterprets concurrency control as a problem of \emph{coordinating shared I/O flows}. By operating at the granularity of windowed reads and leveraging deterministic ordering and optimistic execution, it effectively reduces redundant I/O and improves scalability for multi-user spatio-temporal retrieval workloads.
\section{I/O Stack Tuning}\label{sec:Tuning}

We first formulate the I/O stack tuning problem, and then propose the surrogate-assisted GMAB algorithm to solve it.
\subsection{Formulation of Online I/O Tuning}
% TODO Subsection title: Tuning Model?
We study a concurrent spatio-temporal retrieval engine that processes many range retrievals at the same time. The system operates on large remote sensing images stored in shared storage. Unlike traditional HPC jobs or single-application I/O workloads, the system does not run one fixed job. Instead, it continuously receives a stream of user retrievals. Each retrieval is turned into many small I/O operations that often touch overlapping regions in large raster files.
\par
Let $\mathcal{Q} = \{Q_1, Q_2, \ldots, Q_N\}$ denote a stream of spatio-temporal range retrievals submitted by multiple users. Each retrieval $q$ is decomposed by the I/O-aware index into a set of grid-aligned spatial windows based on a predefined global grid system. These windows are further mapped to sub-regions of one or more large remote sensing images. In this way, every retrieval produces an I/O execution context $c= \langle W,M,S \rangle$, where $W$ describes the set of image windows to be accessed, including their sizes, spatial overlap, and distribution across images. $M$ captures window-level coordination opportunities, such as window merging, deduplication, or shared reads across concurrent retrievals. $S$ represents system-level execution decisions, including batching strategies, I/O scheduling order, and concurrency limits. Importantly, the I/O behavior of the system is not determined solely by static application code, but emerges dynamically from the interaction between retrieval workloads, execution plans, and system policies.
\par
The goal of I/O tuning in this system is to optimize the performance of retrieval-induced I/O execution under continuous, concurrent workloads. We focus on minimizing the observed I/O cost per retrieval, which may be measured by metrics such as average retrieval latency, effective I/O throughput, or amortized disk read time. Let $\theta \in \varTheta$ denote a tuning configuration, where each configuration specifies a combination of system-level I/O control parameters, including window batching size, merge thresholds, queue depth, concurrency limits, and selected storage-level parameters exposed to the engine. Unlike traditional I/O tuning frameworks, the decision variables $\theta$ are applied at the retrieval execution level, rather than at application startup or compilation time.
\par
For a given tuning configuration $\theta$ and execution context $c$, the observed I/O performance is inherently stochastic due to interference among concurrent retrievals, shared storage contention, and variability in window overlap and access locality. We model the observed performance outcome as a random variable:
\vspace{-0.05in}
\begin{equation}
\vspace{-0.05in}
\label{eqn_tuning_table}
Y\left( \theta ,c \right) =f\left( \theta ,c \right) +\epsilon ,
\end{equation}
where $f\left( \cdot \right) $ is an unknown performance function and $\epsilon$ captures stochastic noise. Moreover, as retrieval workloads evolve over time, the distribution of execution contexts $c$ may change, making the tuning problem non-stationary.
\par
Given a stream of retrievals $\mathcal{Q}$ and the resulting sequence of execution contexts $\left\{ c_t \right\} $, the problem is to design an online tuning strategy that adaptively selects tuning configurations $\theta _t$ for retrieval execution, so as to minimize the long-term expected I/O cost:
\vspace{-0.05in}
\begin{equation}
\vspace{-0.05in}
\BlankLine
\tcp{Online Tuning Loop}
\While{arrival of retrieval $q_t$ with execution context $c_t$}{
\tcp{Candidate Generation}
Apply genetic operators (selection, crossover, mutation) on current population to generate candidate set $\mathcal{C}_t \subset \Theta$\;
\tcp{Surrogate-based Pre-evaluation}
\ForEach{$\theta \in \mathcal{C}_t$}{
$\hat{r}_\theta \leftarrow \tilde{f}(\theta, c_t)$\;
}
\tcp{Candidate Filtering}
Select top-$K$ configurations $\mathcal{C}'_t \subset \mathcal{C}_t$ based on $\hat{r}_\theta$ or uncertainty\;
\tcp{Bandit-based Selection}
\ForEach{$\theta \in \mathcal{C}'_t$}{
$\text{Score}(\theta) = \hat{\mu}_\theta + \alpha \sqrt{\frac{\log(t+1)}{n_\theta + 1}}$\;
}
Select configuration: $\theta_t = \arg\max_{\theta \in \mathcal{C}'_t} \text{Score}(\theta)$\;
\tcp{Retrieval Execution \& Reward Observation}
Execute retrieval $q_t$ using I/O coordination policy $\theta_t$\;
Measure performance outcome and compute reward $r_t$\;
\tcp{State Update}
\end{algorithm}
\par
To address the online I/O tuning problem, we use a Surrogate-Assisted Genetic Multi-Armed Bandit (SA-GMAB) framework. It combines genetic search, bandit-style exploration, and a simple performance model. The goal is to handle workloads where behavior changes over time, where results are random, and where retrievals may affect each other. The main steps of this framework are shown in Algorithm~\ref{alg:sa-gmab}.
\par
We first initialize the memory table and the surrogate model, and then generate an initial population of configurations (lines 1--4). In our system, each arm is an I/O tuning configuration $\theta \in \varTheta$. A configuration is a group of I/O control parameters, such as merge thresholds, batch size, queue depth, and limits on parallel requests. The space of possible configurations is large and discrete, so it is infeasible to enumerate or test all of them; thus, we do not fix all arms in advance. Instead, new configurations are created dynamically by genetic operators during candidate generation (line 6). Each configuration acts as a policy that tells the system how to run I/O plans during a scheduling period.
\par
When a retrieval $q_t$ with context $c_t$ arrives, the framework enters the online tuning loop (line 5). For this retrieval, a set of candidate configurations is created through selection, crossover, and mutation (line 6). For every candidate configuration, the surrogate model predicts its reward under the current context (lines 7--9). These predicted rewards are then used to filter and keep only the top promising configurations, or those with high uncertainty (line 10).
\par
When a configuration $\theta$ is used to process a retrieval $q_t$ with context $c_t$, the system observes a random performance result $Y_t=Y\left( \theta ,c_t \right)$. We define the reward as a simple transformation of I/O cost so that a higher reward means better performance. A common form is the negative latency of the retrieval, or the negative I/O time per unit work. Because other retrievals run at the same time, the reward may change even for the same configuration. Thus, many samples are needed to estimate the expected reward.
\par
For the remaining candidates, the framework computes a bandit score from both the historical average reward and an exploration term (lines 11--13), and then selects the configuration with the highest score (line 14). In this way, the method prefers configurations that have performed well before, but it also tries configurations that have been used only a few times.
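The scoring and memory-update steps can be sketched as follows, assuming a simple dictionary-backed arm memory; the function names and bookkeeping are illustrative, while the score formula matches the one in Algorithm~\ref{alg:sa-gmab}.

```python
import math

# Sketch of the bandit scoring step and the per-arm memory update:
# Score(theta) = mu_hat + alpha * sqrt(log(t + 1) / (n + 1)).

def ucb_score(mu_hat, n, t, alpha=1.0):
    return mu_hat + alpha * math.sqrt(math.log(t + 1) / (n + 1))

def select(memory, candidates, t, alpha=1.0):
    """memory[theta] = (mean_reward, pull_count); pick the best-scored arm."""
    return max(candidates,
               key=lambda th: ucb_score(*memory.get(th, (0.0, 0)), t, alpha))

def update(memory, theta, reward):
    mu, n = memory.get(theta, (0.0, 0))
    memory[theta] = ((mu * n + reward) / (n + 1), n + 1)  # running average

# cfg_a has a higher mean but 50 pulls; cfg_b has only 2 pulls, so its
# exploration bonus dominates at t = 100 and it gets selected.
memory = {"cfg_a": (0.9, 50), "cfg_b": (0.8, 2)}
chosen = select(memory, ["cfg_a", "cfg_b"], t=100)
```

The running-average update keeps all historical observations, mirroring the memory-table behavior described below.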
\par
The selected configuration is then applied to execute the retrieval (line 15). After execution, the system observes the performance result and converts it into a reward value (line 16). For each configuration $\theta$, the system keeps a memory entry that records how many times it has been used and its average reward. These values are updated after each execution (lines 17--18). This keeps all historical observations instead of discarding older ones, so estimates become more accurate over time, and poor configurations are not repeatedly tried.
|
||||||
|
|
||||||
The selected configuration may also be added into the population, while poor ones may be removed (line 19). The surrogate model is retrained periodically using data stored in memory (lines 20-–22), so that its predictions follow the most recent workload. The tuning step counter is then increased (line 23), and the framework continues with the next query (line 24).
|
The selected configuration may also be added into the population, while poor ones may be removed (line 19). The surrogate model is retrained periodically using data stored in memory (lines 20-–22), so that its predictions follow the most recent workload. The tuning step counter is then increased (line 23), and the framework continues with the next retrieval (line 24).
|
||||||
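As a concrete illustration, the selection and update steps described above can be sketched as a UCB-style bandit loop. This is a minimal sketch under our own naming; the exploration constant, the reward definition (negative latency), and the class layout are illustrative assumptions, not the paper's implementation.

```python
import math

class BanditTuner:
    """Minimal sketch of the bandit-style configuration selection
    described above (names and constants are illustrative)."""

    def __init__(self, candidates, c=1.0):
        self.c = c          # exploration weight (assumed constant)
        self.t = 0          # tuning step counter
        # per-configuration memory: usage count and average reward
        self.stats = {th: {"n": 0, "mean": 0.0} for th in candidates}

    def select(self):
        self.t += 1
        def score(th):
            s = self.stats[th]
            if s["n"] == 0:          # force each candidate to be tried once
                return float("inf")
            # historical average reward plus an exploration term
            return s["mean"] + self.c * math.sqrt(math.log(self.t) / s["n"])
        return max(self.stats, key=score)    # configuration with highest score

    def update(self, th, latency_ms):
        r = -latency_ms                      # reward = negative latency
        s = self.stats[th]
        s["n"] += 1                          # keep all historical observations
        s["mean"] += (r - s["mean"]) / s["n"]
```

On repeated calls, configurations with low observed latency accumulate a high average reward and are selected more often, while rarely tried ones retain a large exploration bonus.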
\section{Performance Evaluation}\label{sec:EXP}
First, we introduce the experimental setup, covering the dataset characteristics, retrieval workload generation, and the distributed cluster environment. Then, we present the experimental results evaluating the proposed I/O-aware indexing structure, the hybrid concurrency-aware I/O coordination mechanism, and the online I/O tuning framework, respectively.

\subsection{Experimental Setup}
@@ -755,12 +755,12 @@ We employed a large-scale real-world remote sensing dataset derived from the Sen
\end{tabular}
\end{table}
\subsubsection{Retrieval Workload}
\par
To evaluate the system performance under diverse scenarios, we developed a synthetic workload generator that simulates concurrent spatio-temporal range retrievals. The retrieval parameters are configured as follows:
\begin{itemize}
\item \textbf{Spatial Extent:} The spatial range of retrievals follows a log-uniform distribution, ranging from small tile-level access ($0.001\%$ of the scene) to large-scale regional mosaics ($1\%$ to $100\%$ of the scene).
\item \textbf{Temporal Range:} Each retrieval specifies a time interval randomly chosen between 1 day and 1 month.
\item \textbf{Concurrency \& Contention:} The number of concurrent clients $N$ varies from 1 to 64. To test the coordination mechanism, we control the Spatial Overlap Ratio $\sigma \in [0, 0.9]$ to simulate workloads ranging from disjoint access to highly concentrated hotspots.
\end{itemize}
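The parameter ranges above can be sketched as a small generator. Function and field names are ours for illustration; the value ranges follow the bullet list ($0.001\%$ to $100\%$ spatial coverage, 1 day to 1 month, overlap ratio $\sigma$ realized here by pinning a fraction of retrievals to one hotspot region).

```python
import math
import random

def generate_retrieval(rng=None):
    """Sketch of one synthetic spatio-temporal range retrieval."""
    rng = rng or random.Random()
    # Spatial extent: log-uniform from 0.001% to 100% of the scene.
    footprint = 10 ** rng.uniform(math.log10(1e-5), math.log10(1.0))
    # Temporal range: between 1 day and 1 month (taken here as 30 days).
    days = rng.randint(1, 30)
    return {"footprint_ratio": footprint, "days": days}

def generate_workload(n_clients, sigma, rng=None):
    """n_clients concurrent retrievals; a fraction sigma of them targets
    a single hotspot region to control the spatial overlap ratio."""
    rng = rng or random.Random()
    hotspot = generate_retrieval(rng)
    return [hotspot if rng.random() < sigma else generate_retrieval(rng)
            for _ in range(n_clients)]
```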
@@ -803,7 +803,7 @@ All experiments are conducted on a cluster with 9 homogenous nodes (1 master nod
\subsection{Evaluating the data indexing structure}
In the following experiments, we measured the indexing on a single node of the cluster, because each node needs the index for spatial retrieval. We then investigated the retrieval performance of the indexing for remote sensing images.
\subsubsection{I/O Selectivity Analysis}\label{sec:Index_exp_1}
@@ -826,13 +826,13 @@ In the following experiments, we measured the indexing on a single node in the c
\end{figure}

\par
First, we evaluated the effectiveness of data reduction by measuring the I/O selectivity, defined as the ratio of the retrieved data volume to the total file size. Fig.~\ref{fig:index_exp1} compares our method against Baseline 1 (full-file retrieval) and Baseline 2 (exact window-based reading, e.g., OpenDataCube). As illustrated in Fig.~\ref{fig:index_exp1}(a), Baseline 1 exhibits a linear increase in I/O volume proportional to the file size, resulting in poor selectivity regardless of the retrieval footprint. In contrast, both Baseline 2 and Ours significantly reduce I/O traffic by enabling partial reads. It is worth noting that our method incurs slightly higher I/O volume (approximately $16\%-23\%$ of the file size for small retrievals) compared to the theoretically optimal Baseline 2 ($10\%-20\%$). This marginal data redundancy is attributed to the grid alignment effect: our index retrieves pixel blocks based on fixed grid boundaries, whereas Baseline 2 performs precise geospatial clipping. Fig.~\ref{fig:index_exp1}(b) further presents the distribution of unnecessary data fraction. While our method introduces a small amount of ``over-reading'' due to grid padding, it successfully avoids the massive data waste observed in Baseline 1. As we will demonstrate in the next section, this slight compromise in I/O precision is a strategic trade-off that eliminates expensive runtime computations.
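The grid alignment effect can be made concrete with a toy calculation: a grid-aligned read fetches every fixed-size block intersecting the retrieval window, while exact clipping fetches only the window itself. The block size, window, and 2-byte pixel depth are our illustrative assumptions, not measured values.

```python
def grid_aligned_bytes(win, block, px_bytes=2):
    """Bytes read when the window (x0, y0, w, h) is expanded to the
    fixed block-by-block grid cells it intersects."""
    x0, y0, w, h = win
    bx0, by0 = x0 // block, y0 // block
    bx1, by1 = (x0 + w - 1) // block, (y0 + h - 1) // block
    n_blocks = (bx1 - bx0 + 1) * (by1 - by0 + 1)
    return n_blocks * block * block * px_bytes

def exact_bytes(win, px_bytes=2):
    """Bytes read under precise geospatial clipping (the window only)."""
    _, _, w, h = win
    return w * h * px_bytes

win = (100, 100, 500, 500)                 # a small retrieval window
aligned = grid_aligned_bytes(win, block=256)
exact = exact_bytes(win)
overread = (aligned - exact) / aligned     # unnecessary-data fraction
```

The window touches a 3x3 patch of 256x256 blocks, so the aligned read is larger than the exact one by a bounded padding fraction, mirroring the modest over-reading discussed above.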
\subsubsection{End-to-End Retrieval Latency}\label{sec:Index_exp_2}
\begin{figure}[tb]
\centering
\subfigure[The retrieval latency]{
\begin{minipage}[b]{0.227\textwidth}
\includegraphics[width=0.98\textwidth]{exp/index_exp2_1.pdf}
\end{minipage}
@@ -844,14 +844,14 @@ First, we evaluated the effectiveness of data reduction by measuring the I/O sel
\end{minipage}
}
\label{fig:index_exp2_2}
\caption{End-to-End Retrieval Latency}
\label{fig:index_exp2}
\end{figure}
\par
We next measured the end-to-end retrieval latency to verify whether the I/O reduction translates into time efficiency. Fig.~\ref{fig:index_exp2}(a) reports the mean and 95th percentile (P95) latency across varying retrieval footprint ratios (log scale). The results reveal three distinct performance behaviors. Baseline 1 shows a high and flat latency curve ($\approx 4500$ ms), dominated by the cost of transferring entire images. Baseline 2, despite its optimal I/O selectivity, exhibits a significant latency floor ($\approx 380$ ms for small retrievals); this overhead stems from the on-the-fly geospatial computations required to calculate precise read windows. Ours achieves the lowest latency, ranging from 34 ms to 59 ms for typical tile-level retrievals ($10^{-4}$ coverage). Crucially, for small-to-medium retrievals, our method outperforms Baseline 2 by an order of magnitude. The gap between the two curves highlights the advantage of our deterministic indexing approach: by pre-materializing grid-to-window mappings, we eliminate runtime coordinate transformations. Although our I/O volume is slightly larger (as shown in Sec.~\ref{sec:Index_exp_1}), the time saved by avoiding computational overhead far outweighs the cost of transferring a few extra kilobytes of padding data.

To empirically validate the cost model proposed in Eq.~\ref{eqn:cost_total}, we further decomposed the retrieval latency into three components: metadata lookup ($C_{meta}$), geospatial computation ($C_{geo}$), and I/O access ($C_{io}$). Fig.~\ref{fig:index_exp2}(b) presents the time consumption breakdown for a representative medium-scale retrieval (involving approx. 50 image tiles). As expected, the latency of Baseline 1 is entirely dominated by $C_{io}$ ($>99\%$), rendering $C_{meta}$ and $C_{geo}$ negligible. The massive data transfer masks all other overheads. While $C_{io}$ of Baseline 2 is successfully reduced to the window size, a new bottleneck emerges in $C_{geo}$. The runtime coordinate transformations and polygon clipping consume nearly $70\%$ of the total execution time (approx. 350 ms). This observation confirms our theoretical analysis that window-based I/O shifts the bottleneck from storage to CPU. The proposed method exhibits a balanced profile. Although $C_{meta}$ increases slightly (approx. 60 ms) due to the two-phase index lookup (G2I + I2G), this cost is well-amortized. Crucially, $C_{geo}$ is effectively eliminated ($<1$ ms) thanks to the pre-computed grid-window mappings. Consequently, our approach achieves a total latency of approx. 150 ms, providing a $3\times$ speedup over Baseline 2 by removing the computational bottleneck without regressing on I/O performance.
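Plugging the quoted component times into the additive cost model reproduces the reported $3\times$ speedup. The $C_{geo}$ values and the totals come from the text; the split of Baseline 2's remaining time between $C_{meta}$ and $C_{io}$ is our own assumption for illustration.

```python
# Approximate per-component latencies (ms) for the medium-scale retrieval
# discussed above; Baseline 2's meta/io split is an illustrative guess.
baseline2 = {"meta": 20, "geo": 350, "io": 130}   # C_geo ~70% of ~500 ms total
ours      = {"meta": 60, "geo": 1,   "io": 90}    # total ~150 ms

# C_total = C_meta + C_geo + C_io, as in the cost model above
speedup = sum(baseline2.values()) / sum(ours.values())
```

Under these assumed splits the ratio comes out at roughly 3.3, consistent with the $3\times$ figure reported above.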
\subsubsection{Ablation Study}\label{sec:Index_exp_3}
\begin{figure}[tb]
@@ -1145,7 +1145,7 @@ Fig.~\ref{fig:tune_exp4_1} presents a latency trace during steady-state operatio
\section{Conclusions}\label{sec:Con}
Modern high-performance remote sensing data management systems face a critical bottleneck shift from metadata discovery to data extraction, driven by prohibitive runtime geospatial computations ($C_{geo}$) and severe I/O contention under concurrent access. This paper presents a comprehensive I/O-aware retrieval processing framework designed to strictly bound retrieval latency and maximize throughput for large-scale spatio-temporal analytics. By introducing the ``Index-as-an-Execution-Plan'' paradigm and a dual-layer inverted structure (G2I and I2G), we bridge the semantic gap between logical indexing and physical storage, effectively shifting the computational burden from retrieval time to ingestion time. To address the scalability challenges in multi-user environments, we developed a hybrid concurrency-aware I/O coordination protocol that adaptively switches between deterministic ordering and optimistic execution based on spatial contention. Furthermore, to handle the complexity of parameter configuration in fluctuating workloads, we integrated a Surrogate-Assisted Genetic Multi-Armed Bandit (SA-GMAB) mechanism for online automatic I/O tuning. Our empirical evaluation on large-scale Sentinel-2 datasets demonstrates that the proposed I/O-aware index reduces end-to-end latency by an order of magnitude compared to standard window-based reading approaches. The hybrid coordination mechanism effectively converts I/O contention into request merging opportunities, achieving linear throughput scaling significantly superior to traditional isolated execution. Additionally, the SA-GMAB tuning method exhibits faster convergence speed and greater robustness against stochastic noise compared to existing genetic baselines. These findings provide a scalable and predictable path for next-generation remote sensing platforms to support real-time, data-intensive concurrent workloads.
% if have a single appendix:
%\appendix[Proof of the Zonklar Equations]