HDP-Net: Hierarchical Dynamic Prototype Learning for Text-Based Remote Sensing Image Generation

GQ Zhou and WY Wang and ET Gao and X Zhou and JJ Chen and JS Xu and YF Wang and XT Wang and S Liu and X Zhou, IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 18, 29295-29314 (2025).

DOI: 10.1109/JSTARS.2025.3626406

Existing methods for text-based remote sensing image (RSI) generation still face challenges such as inefficient semantic alignment with multiscale spatial relationships. The issue involves aligning fine details with global structures, where models often misrepresent spatial relationships. This study proposes a hierarchical dynamic prototype learning network to address this issue. First, a bidirectional hierarchical prototype module is proposed to model complex features in RSIs, which integrates hierarchical prototype learning with a bidirectional Hopfield network to enhance feature representation. Second, a dynamic gated relative position self-attention mechanism is built to process the integrated feature maps, which combines dynamic weighting, gating factor, and relative position bias to capture spatial and semantic relationships. Third, a CLIP-based cross-modal alignment module is created to optimize text-image matching through contrastive learning, substantially improving visual image consistency. Finally, a text-generated RSI data enhancement algorithm is proposed for extension of the training set to improve model performance and alleviate insufficient text-image pairs. Experimental results using the RSICD (a benchmark for RSI captioning) show that the inception score increases from 5.99 to 7.82, and the Fréchet inception distance decreases from 102.44 to 83.57, which represent a 30.5% improvement and an 18.2% reduction with the proposed method relative to Txt2Img-MHN, respectively. Meanwhile, zero-shot classification overall accuracy increases by 9.57% compared to Txt2Img-MHN. These results show significant improvements in semantic alignment in complex scenes.
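As an illustration of the contrastive text-image matching mentioned in the abstract, the sketch below computes a generic CLIP-style symmetric contrastive loss over a batch of paired embeddings. It is not taken from the paper: the function name `symmetric_contrastive_loss`, the temperature value of 0.07, and the use of plain NumPy arrays in place of real CLIP encoder outputs are all assumptions made for illustration, and the paper's actual alignment module may differ.

```python
import numpy as np


def symmetric_contrastive_loss(text_emb, image_emb, temperature=0.07):
    """Generic CLIP-style loss (hypothetical helper, not the paper's code).

    text_emb, image_emb: arrays of shape (N, D); row i of each is a matched pair.
    temperature: assumed softmax temperature; 0.07 is an illustrative default.
    """
    # L2-normalize so that dot products are cosine similarities.
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix, shape (N, N); matched pairs lie on the diagonal.
    logits = (t @ v.T) / temperature

    def diagonal_cross_entropy(lg):
        # Numerically stable log-softmax over each row.
        lg = lg - lg.max(axis=1, keepdims=True)
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        # The correct "class" for row i is column i.
        return -np.mean(np.diag(log_probs))

    # Average the text-to-image and image-to-text directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(4, 8))
    print("matched pairs:   ", symmetric_contrastive_loss(emb, emb))
    print("mismatched pairs:", symmetric_contrastive_loss(emb, emb[::-1]))
```

Under these assumptions, a batch whose text and image embeddings are correctly paired yields a lower loss than the same batch with the pairing scrambled, which is the property a contrastive alignment objective relies on.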
