Please use this identifier to cite or link to this item:
https://dspace.iiti.ac.in/handle/123456789/16393
Title: | TexStFusion : a controllable diffusion model using textural, structural, and textual feature fusion |
Authors: | Hegde, Suhas G.; Tiwari, Aruna |
Keywords: | Controllable diffusion models;Image editing;Image generation;Text-to-Image diffusion models |
Issue Date: | 2025 |
Publisher: | Springer Science and Business Media Deutschland GmbH |
Citation: | Hegde, S., & Tiwari, A. (2025). TexStFusion : a controllable diffusion model using textural, structural, and textual feature fusion. Signal, Image and Video Processing. https://doi.org/10.1007/s11760-025-04367-2 |
Abstract: | Recent advances in Text-to-Image (T2I) diffusion models enable highly realistic image generation from text. However, long and intricate descriptions often struggle to provide precise controls. To address this, we propose TexStFusion (TEXtural, STructural, TEXtual feature FUSION), a method that adds conditional controls to pre-trained T2I models. Unlike existing approaches relying on visual cues, we introduce composite maps, which fuse texture and structure-text maps derived from TextureNet and StructureNet encoders. This integration occurs without fine-tuning the T2I model, preserving prior knowledge. Our method achieves 25% better FID, 33% better SSIM, and 5% better CLIP-T scores with a dataset of just 30k images, in the best case. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2025. |
URI: | https://dx.doi.org/10.1007/s11760-025-04367-2; https://dspace.iiti.ac.in:8080/jspui/handle/123456789/16393 |
ISSN: | 1863-1703 |
Type of Material: | Journal Article |
Appears in Collections: | Department of Computer Science and Engineering |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.