Dimension-Aware and Occlusion-Preserving Object Compositing for Catalog Image Generation
Generative object compositing methods have shown remarkable ability to seamlessly insert objects into scenes. However, when applied to real-world catalog image generation, these methods require tedious manual intervention: users must carefully adjust masks when product dimensions differ, and painstakingly restore occluded elements post-generation. We present CatalogStitch, a set of model-agnostic techniques that automate these corrections, enabling user-friendly content creation. Our dimension-aware mask computation algorithm automatically adapts the target region to accommodate products with different dimensions; users simply provide a product image and background, without manual mask adjustments. Our occlusion-aware hybrid restoration method guarantees pixel-perfect preservation of occluding elements, eliminating post-editing workflows. We additionally introduce CatalogStitch-Eval, a 58-example benchmark covering aspect-ratio mismatch and occlusion-heavy catalog scenarios, together with supplementary PDF and HTML viewers. We evaluate our techniques with three state-of-the-art compositing models (ObjectStitch, OmniPaint, and InsertAnything), demonstrating consistent improvements across diverse catalog scenarios. By reducing manual intervention and automating tedious corrections, our approach transforms generative compositing into a practical, human-friendly tool for production catalog workflows.
Products retain their native proportions with our adapted mask, whereas freeform and bounding-box masks cause stretching and distortion. Results are consistent across all three compositing models.
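The core idea behind dimension-aware mask computation can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: we assume an axis-aligned target box, and shrink it to the product's aspect ratio while keeping the bottom edge and horizontal center fixed (since catalog products typically rest on a surface). The function name `adapt_mask` is ours, for illustration only.

```python
def adapt_mask(box, product_wh):
    """Shrink an axis-aligned target box (x, y, w, h) so that its aspect
    ratio matches the product's, keeping the bottom edge fixed (products
    rest on surfaces) and the horizontal center unchanged.

    Illustrative sketch only; the published method may differ."""
    x, y, w, h = box
    pw, ph = product_wh
    ar = pw / ph  # product aspect ratio (width / height)
    if w / h > ar:
        # Box is too wide for the product: shrink its width.
        new_w, new_h = h * ar, h
    else:
        # Box is too tall for the product: shrink its height.
        new_w, new_h = w, w / ar
    cx = x + w / 2                 # preserve horizontal center
    new_x = cx - new_w / 2
    new_y = y + h - new_h          # preserve bottom edge
    return (new_x, new_y, new_w, new_h)
```

For example, a square product placed into a wide 100x50 box yields a centered 50x50 box resting on the same bottom edge, so generation never stretches the product to fill the original region.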
Foreground occluders are preserved pixel-for-pixel after compositing via exact pixel restoration. We compare results before and after our restoration step across mask types and compositing models.
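The restoration step itself reduces to copying occluder pixels from the original background back over the generated composite, which makes the bit-exactness guarantee trivial to see. A minimal NumPy sketch, assuming a binary occluder mask is available (the function name `restore_occluders` is illustrative, not from the paper):

```python
import numpy as np

def restore_occluders(original, composited, occluder_mask):
    """Copy occluder pixels from the original background image back over
    the generated composite, giving bit-exact preservation of occluders.

    original, composited: HxWxC arrays of the same shape and dtype.
    occluder_mask: binary HxW array (nonzero = foreground occluder).
    """
    m = occluder_mask.astype(bool)[..., None]  # broadcast over channels
    return np.where(m, original, composited)
```

Because the masked pixels are taken verbatim from the original image, the restored regions are identical to the input, which is what lifts the Occ. PSNR scores in the table below for masked pixels to exact equality.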
Evaluated on CatalogStitch-Eval across five metrics. Our techniques consistently improve every baseline model.
| Method | AR Error (%) ↓ | Occ. PSNR (dB) ↑ | FID ↓ | CLIP ↑ | DINO ↑ |
|---|---|---|---|---|---|
| OmniPaint | 31.07 | — | 137.15 | 83.03 | 68.50 |
| OmniPaint + Ours | 4.57 | — | 135.79 | 86.11 | 72.99 |
| ObjectStitch | 30.97 | 11.60 | 101.55 | 90.27 | 85.76 |
| ObjectStitch + Ours | 5.05 | 26.84 | 91.52 | 90.62 | 88.09 |
| InsertAnything | 29.98 | 13.33 | 105.99 | 90.23 | 82.63 |
| InsertAnything + Ours | 3.92 | 27.54 | 77.72 | 92.68 | 88.30 |
AR Error drops from ~30% to ~4–5% uniformly across all models. InsertAnything + Ours achieves best results on all five metrics.
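For concreteness, a plausible definition of the AR Error metric is the relative deviation of the rendered product's aspect ratio from the reference product's, in percent. This is our assumed formulation, not necessarily the paper's exact definition:

```python
def aspect_ratio_error(product_wh, rendered_wh):
    """Relative aspect-ratio error in percent (assumed definition):
    |AR_rendered - AR_product| / AR_product * 100."""
    pw, ph = product_wh
    rw, rh = rendered_wh
    ar_p, ar_r = pw / ph, rw / rh
    return abs(ar_r - ar_p) / ar_p * 100.0
```

Under this definition, a square product rendered at a 1.3:1 ratio scores 30%, matching the magnitude of the baseline errors reported above.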
CatalogStitch-Eval is a challenging evaluation benchmark designed specifically for catalog image compositing, covering both core failure modes.
Products with significantly different aspect ratios — tall lamps replacing wide tables, square bags replacing rectangular clutches, etc.
Products partially occluded by 1–2 foreground elements — plants, vases, side tables, lamps, and decorative objects.