Dimension-Aware and Occlusion-Preserving Object Compositing for Catalog Image Generation
Generative object compositing methods have shown remarkable ability to seamlessly insert objects into scenes. However, when applied to real-world catalog image generation, these methods require tedious manual intervention: users must carefully adjust masks when product dimensions differ, and painstakingly restore occluded elements post-generation. We present CatalogStitch, a set of model-agnostic techniques that automate these corrections, enabling user-friendly content creation. Our dimension-aware mask computation algorithm automatically adapts the target region to accommodate products with different dimensions; users simply provide a product image and background, without manual mask adjustments. Our occlusion-aware hybrid restoration method guarantees pixel-perfect preservation of occluding elements, eliminating post-editing workflows. We additionally introduce CatalogStitch-Eval, a 58-example benchmark covering aspect-ratio mismatch and occlusion-heavy catalog scenarios, together with supplementary PDF and HTML viewers. We evaluate our techniques with three state-of-the-art compositing models (ObjectStitch, OmniPaint, and InsertAnything), demonstrating consistent improvements across diverse catalog scenarios. By reducing manual intervention and automating tedious corrections, our approach transforms generative compositing into a practical, human-friendly tool for production catalog workflows.
Products retain their native proportions with our adapted mask, whereas freeform and bounding-box masks cause stretching and distortion. Results are consistent across all three compositing models.
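The core idea behind dimension-aware mask computation can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: we assume an axis-aligned target box, and shrink it to the product's aspect ratio while keeping the bottom edge and horizontal center fixed (since catalog products typically rest on a surface). The function name `adapt_mask` is ours, for illustration only.

```python
def adapt_mask(box, product_wh):
    """Shrink an axis-aligned target box (x, y, w, h) so that its aspect
    ratio matches the product's, keeping the bottom edge fixed (products
    rest on surfaces) and the horizontal center unchanged.

    Illustrative sketch only; the published method may differ."""
    x, y, w, h = box
    pw, ph = product_wh
    ar = pw / ph  # product aspect ratio (width / height)
    if w / h > ar:
        # Box is too wide for the product: shrink its width.
        new_w, new_h = h * ar, h
    else:
        # Box is too tall for the product: shrink its height.
        new_w, new_h = w, w / ar
    cx = x + w / 2                 # preserve horizontal center
    new_x = cx - new_w / 2
    new_y = y + h - new_h          # preserve bottom edge
    return (new_x, new_y, new_w, new_h)
```

For example, a square product placed into a wide 100x50 box yields a centered 50x50 box resting on the same bottom edge, so generation never stretches the product to fill the original region.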
Foreground occluders are preserved pixel-for-pixel after compositing via exact pixel restoration. We compare results before and after our restoration step across mask types and compositing models.
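The restoration step itself reduces to copying occluder pixels from the original background back over the generated composite, which makes the bit-exactness guarantee trivial to see. A minimal NumPy sketch, assuming a binary occluder mask is available (the function name `restore_occluders` is illustrative, not from the paper):

```python
import numpy as np

def restore_occluders(original, composited, occluder_mask):
    """Copy occluder pixels from the original background image back over
    the generated composite, giving bit-exact preservation of occluders.

    original, composited: HxWxC arrays of the same shape and dtype.
    occluder_mask: binary HxW array (nonzero = foreground occluder).
    """
    m = occluder_mask.astype(bool)[..., None]  # broadcast over channels
    return np.where(m, original, composited)
```

Because the masked pixels are taken verbatim from the original image, the restored regions are identical to the input, which is what lifts the Occ. PSNR scores in the table below for masked pixels to exact equality.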
Evaluated on CatalogStitch-Eval across five metrics. Our techniques consistently improve every baseline model.
| Method | AR Error (%) ↓ | Occ. PSNR (dB) ↑ | FID ↓ | CLIP ↑ | DINO ↑ |
|---|---|---|---|---|---|
| OmniPaint | 31.07 | — | 137.15 | 83.03 | 68.50 |
| OmniPaint + Ours | 4.57 | — | 135.79 | 86.11 | 72.99 |
| ObjectStitch | 30.97 | 11.60 | 101.55 | 90.27 | 85.76 |
| ObjectStitch + Ours | 5.05 | 26.84 | 91.52 | 90.62 | 88.09 |
| InsertAnything | 29.98 | 13.33 | 105.99 | 90.23 | 82.63 |
| InsertAnything + Ours | 3.92 | 27.54 | 77.72 | 92.68 | 88.30 |
AR Error drops from ~30% to ~4–5% uniformly across all models. InsertAnything + Ours achieves best results on all five metrics.
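For concreteness, a plausible definition of the AR Error metric is the relative deviation of the rendered product's aspect ratio from the reference product's, in percent. This is our assumed formulation, not necessarily the paper's exact definition:

```python
def aspect_ratio_error(product_wh, rendered_wh):
    """Relative aspect-ratio error in percent (assumed definition):
    |AR_rendered - AR_product| / AR_product * 100."""
    pw, ph = product_wh
    rw, rh = rendered_wh
    ar_p, ar_r = pw / ph, rw / rh
    return abs(ar_r - ar_p) / ar_p * 100.0
```

Under this definition, a square product rendered at a 1.3:1 ratio scores 30%, matching the magnitude of the baseline errors reported above.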
CatalogStitch-Eval is a challenging evaluation benchmark designed specifically for catalog image compositing, covering both core failure modes.
Products with significantly different aspect ratios — tall lamps replacing wide tables, square bags replacing rectangular clutches, etc.
Products partially occluded by 1–2 foreground elements — plants, vases, side tables, lamps, and decorative objects.