
about demo inference #39

Open
QiqLiang opened this issue Dec 5, 2024 · 9 comments

QiqLiang commented Dec 5, 2024

Hi, I tried style transfer with my own content image and style image using your pretrained models and styleshot_image_driven_demo.py, but the results are very bad: I can't see any clear object in the output image, it's just a mess. I can't even reproduce the correct result using the pictures from your README.md. Why does this happen? I left the prompt at its default; should I add a prompt, and how can I get correct results? Thanks for answering.
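
(A minimal pre-flight check like the following, with placeholder file names, can help rule out a corrupted or single-channel input as the cause of garbled outputs. It only uses PIL and is not part of the repo's demo code.)

```python
# Minimal sanity check (not part of the repo): load the inputs the way a
# PIL-based demo would, to rule out a corrupted or single-channel
# (e.g. raw infrared) file as the cause of garbled outputs.
# File names below are placeholders.
from PIL import Image

style = Image.open("infrared_style.png").convert("RGB")
content = Image.open("baby_sleeping.png").convert("RGB")

print("style:", style.size, style.mode)       # expect a reasonably sized RGB image
print("content:", content.size, content.mode)
```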

Jeoyal (Contributor) commented Dec 5, 2024

Hi @QiqLiang, thank you for your interest in our work. Could you please post your content image, style image, text prompt, and the corresponding results?

QiqLiang (Author) commented Dec 5, 2024

> Hi @QiqLiang, thank you for your interest in our work. Could you please post your content image, style image, text prompt, and the corresponding results?

Sorry about that; my computer can't upload images, but I can briefly describe them. My style image is an infrared photo taken by a camera, my content image is a baby sleeping on a bed, and I didn't set a text prompt. I expected an infrared-style photo of a baby sleeping on a bed, but what I actually get is an image with no recognizable object, just messy, irregular patterns.
I also tried your own content and style images (cropped from your README.md), and I didn't get the results shown in the README.md either.
Could you tell me the exact command you run and how to set the prompt properly? Is a proper prompt really that important?
Thanks!

QiqLiang (Author) commented Dec 5, 2024

I also tried styleshot_text_driven_demo.py: I used your hand-drawn sketch of a cat as the style image and set the prompt to “a dog”, but I can't see a dog in the resulting image either.

Jeoyal (Contributor) commented Dec 5, 2024

> I also tried styleshot_text_driven_demo.py: I used your hand-drawn sketch of a cat as the style image and set the prompt to “a dog”, but I can't see a dog in the resulting image either.

This is because we run inference on a batch of style images rather than a single style image; this randomness leads to a slightly different diffusion sampling process and result. However, for image-driven style transfer we use single-image inference, so you can try that to verify whether your local setup is correct.
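
(Side note, not from the repo: if sampling randomness is the concern, fixing the seed makes single-image runs reproducible. The generate() call below is hypothetical and stands in for whatever entry point styleshot_image_driven_demo.py actually uses, assuming it accepts a diffusers-style torch.Generator.)

```python
# Sketch only: pin the sampling randomness with a fixed seed so repeated
# single-image runs are directly comparable.
import torch

seed = 42
generator = torch.Generator(device="cuda").manual_seed(seed)  # use device="cpu" if no GPU

# Hypothetical call; substitute the demo's real pipeline call here:
# images = styleshot.generate(style_image=style, content_image=content,
#                             prompt="", generator=generator)
```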

Jeoyal (Contributor) commented Dec 5, 2024

The style images and content images are uploaded here.

QiqLiang (Author) commented Dec 5, 2024

> > I also tried styleshot_text_driven_demo.py: I used your hand-drawn sketch of a cat as the style image and set the prompt to “a dog”, but I can't see a dog in the resulting image either.

> This is because we run inference on a batch of style images rather than a single style image; this randomness leads to a slightly different diffusion sampling process and result. However, for image-driven style transfer we use single-image inference, so you can try that to verify whether your local setup is correct.

I'm still quite confused and have several questions:
1. styleshot_image_driven_demo.py only accepts a single style image as input; how can I use a batch of style images?
2. What prompt do you use for image-driven style transfer? Could you please give an example so I can check whether my problem is caused by the prompt?
3. Is there really that much difference between a batch of style images and a single style image? The result I get from a single style image is very bad; I can't see any similarity between the result and either the style image or the content image.

Jeoyal (Contributor) commented Dec 5, 2024

For questions 1 and 3: “batch” means that we run inference on all the style images in a batch simultaneously, rather than treating them as a single style. The key difference between inferring a whole batch at once and inferring a single image lies in the randomness, which introduces slight variations in the subsequent diffusion process and in the generated result.

For question 2: please refer to the download link we provided.
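
(An illustration of the batch/single distinction, not the repo's exact code: the paper's results come from passing a list of style images through one forward pass, while the demo passes a single image; each image is still its own style, and only the sampling randomness differs. The generate() calls are hypothetical placeholders for the demo's real entry point.)

```python
# Illustration only: "batch" vs. single-image inference as described above.
# Paths are placeholders; generate() stands in for the demo's actual call.
from PIL import Image

style_paths = ["style_01.png", "style_02.png", "style_03.png"]
batch = [Image.open(p).convert("RGB") for p in style_paths]   # batched inference
single = Image.open(style_paths[0]).convert("RGB")            # single-image inference

# results_batch = styleshot.generate(style_image=batch, prompt="a dog")
# result_single = styleshot.generate(style_image=single, prompt="a dog")
```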

QiqLiang (Author) commented Dec 6, 2024

> For questions 1 and 3: “batch” means that we run inference on all the style images in a batch simultaneously, rather than treating them as a single style. The key difference between inferring a whole batch at once and inferring a single image lies in the randomness, which introduces slight variations in the subsequent diffusion process and in the generated result.

> For question 2: please refer to the download link we provided.

Thanks a lot, it works. But I still have a question: are there any requirements for the style and content images? When I used my own style images, I got bad results.

Jeoyal (Contributor) commented Dec 9, 2024

For fine-grained content control, we provide a version trained on Lineart.
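
(Not from the repo: to try the Lineart-trained variant, the content image first needs a line-art map. Below is a sketch using the controlnet_aux preprocessor commonly paired with ControlNet lineart models; check the repo's own demo for the exact preprocessing it expects.)

```python
# Sketch: extract a line-art map from the content image for the
# fine-grained (Lineart-trained) variant. File paths are placeholders.
from PIL import Image
from controlnet_aux import LineartDetector

detector = LineartDetector.from_pretrained("lllyasviel/Annotators")
content = Image.open("baby_sleeping.png").convert("RGB")
lineart_map = detector(content)           # PIL image of the extracted line art
lineart_map.save("content_lineart.png")
```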
