The basic concept of a prompt
A prompt is the text or information provided by the user to guide an AI model in generating an image based on specific requirements. Essentially, a prompt is the language through which the user communicates their desired outcome to the AI.
In the context of AI image generation, "text to image" refers to the process where the entire generation is based solely on textual descriptions. On the other hand, "image to image" involves using an initial image to convey information, although prompts remain an important element in this process as well.
The scope of a prompt is broad, encompassing everything from the title of the image to its various characteristics and details.
The basic logic of a prompt
First, write prompts in English. Second, compose prompts from short phrases rather than complete sentences, and separate the phrases with commas, for example: white dress, long hair, forest.
Here are some example prompts to help you generate better images:
Character & Main Characteristics
- Clothing: white dress
- Hair style & color: blonde hair, long hair, short hair
- Face details: small eyes, big mouth, large nose
- Expression: smiling, crying
- Body language: stretching arms, standing
Environment Details
- Indoor/Outdoor
- Main environment: forest, city, street
- Details: tree, bush, white flower, day/night, morning, sunset, sunlight, blue sky
Composition
- Distance: close-up, distant
- Proportion: full body, upper body
Quality
- High quality: best quality, ultra-detailed, masterpiece, high-res, 8k
- Specified high detail: extremely detailed CG unity 8k wallpaper, unreal engine rendered
Painting Style
- Illustration style: painting, illustration, drawing
- Two-dimensional: anime, comic, game CG
- Realistic: photorealistic, realistic
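The categories above can be combined into a single comma-separated prompt. The following is a minimal sketch; build_prompt is a hypothetical helper written for illustration and is not part of any image-generation tool, and the order of the groups is an arbitrary choice.

```python
# Hypothetical helper (not part of any tool): joins groups of phrases
# into one comma-separated prompt, skipping blanks and duplicates.
def build_prompt(*groups):
    phrases = []
    for group in groups:
        for phrase in group:
            phrase = phrase.strip()
            if phrase and phrase not in phrases:
                phrases.append(phrase)
    return ", ".join(phrases)


# Phrases taken from the example categories above.
quality = ["best quality", "ultra-detailed", "masterpiece"]
style = ["illustration"]
character = ["white dress", "blonde hair", "long hair", "smiling", "standing"]
environment = ["outdoor", "forest", "white flower", "sunlight", "blue sky"]
composition = ["full body"]

prompt = build_prompt(quality, style, character, environment, composition)
print(prompt)
```

Keeping each category in its own list makes it easy to swap one group, such as the environment, without retyping the rest of the prompt.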
Prompt weight
A weight raises or lowers the priority of a phrase within the prompt. If you give the AI many phrases, it may overlook some of them because of the sheer number. Adding weight to the phrases you most want to see makes them more likely to appear in the image.
There are two ways to adjust the weight:
- Using parentheses:
  - Each pair of parentheses around a phrase multiplies its weight by 1.1. Nesting compounds the effect: double parentheses give 1.1 × 1.1 = 1.21 times, and triple parentheses give 1.1 × 1.1 × 1.1 = 1.331 times.
  - Example: (green flowers), ((green flowers)), (((green flowers)))
- Using numbers:
  - Adding a colon and a number inside the parentheses sets the weight to the specified factor directly.
  - Example: (green flowers:1), (white flowers:1.5), (purple flowers:1.5)
However, it's important to avoid setting the weight too high, as it might cause distortion in the image. A reasonable range is from 0.5 to 1.5.
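The arithmetic behind the two weighting methods can be sketched as follows. Both function names are hypothetical and exist only for illustration; the clamp limits are the 0.5 to 1.5 range suggested above, not a limit enforced by any tool.

```python
# Hypothetical helpers that illustrate the weighting arithmetic.
def paren_weight(levels, base=1.1):
    """Multiplier from wrapping a phrase in `levels` pairs of parentheses."""
    return base ** levels


def weighted(phrase, weight, low=0.5, high=1.5):
    """Format a phrase as (phrase:weight), clamped to the suggested range."""
    clamped = max(low, min(high, weight))
    return f"({phrase}:{clamped:g})"


for levels in (1, 2, 3):
    print(levels, round(paren_weight(levels), 3))

print(weighted("white flowers", 1.5))   # within range, kept as-is
print(weighted("purple flowers", 2.0))  # above range, clamped to 1.5
```

Writing the number explicitly is usually easier to read than counting nested parentheses, since (green flowers:1.3) states the factor directly.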
Negative Prompt
Negative prompts specify the elements you do not want to appear in your image. Conversely, positive prompts list the elements you want to include.
Here are some common negative prompts:
- Low quality: low quality, low resolution
- Single color: monochrome, grayscale
- Body & face: ugly, bad proportions, short
- Body details: missing hands, extra fingers
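Because positive and negative prompts pull in opposite directions, it can help to check that no phrase appears in both. The sketch below assumes both prompts are plain comma-separated strings; split_phrases and conflicting_phrases are hypothetical names used only for illustration.

```python
# Hypothetical helpers: find phrases listed in both prompts.
def split_phrases(prompt):
    """Split a comma-separated prompt into lowercase, trimmed phrases."""
    return [p.strip().lower() for p in prompt.split(",") if p.strip()]


def conflicting_phrases(positive, negative):
    """Return phrases from the positive prompt that also appear in the negative one."""
    negative_set = set(split_phrases(negative))
    return [p for p in split_phrases(positive) if p in negative_set]


positive = "best quality, white dress, long hair, forest"
negative = "low quality, low resolution, monochrome, grayscale, extra fingers"

print(conflicting_phrases(positive, negative))
```

An empty result means the two prompts do not contradict each other word for word; it does not detect phrases that conflict only in meaning.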
Parameter settings
The higher the number of sampling steps, the more detailed the generated image will be. Beyond 20 steps, however, the changes become minimal: images generated with 20 and 40 steps differ only slightly, while each additional step adds processing time. For this reason, 20 steps is the default, and the recommended range is 10 to 30 steps.
The sampler is the algorithm the AI uses to generate the image. WebUI offers more than 10 samplers, but only 4-5 are in common use, and each has distinct characteristics. For example, Euler and Euler a suit illustration styles, while DPM++ 2M and DPM++ 2M Karras generate faster. Samplers with "++" in their names are recommended for their stability. Some models are optimized for a specific sampler; in that case, the sampler recommended for the model is the best choice.
If the resolution is too low, the generated image will be blurry and lack detail. If the resolution is too high, the algorithm will run slower. Therefore, it is crucial to test various resolutions to find the optimal balance between quality and efficiency.
Face restoration (the "Restore faces" option) is worth enabling, as it can improve the faces of some characters. Enable tiling only when generating repeating patterns. A safe range for CFG scale is 7 to 12.
Batch generation produces multiple images in one run. Set the batch count to produce as many images as you need, but keep the batch size (the number of images per batch) at 1.
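The recommendations in this section can be collected into one settings object that flags values outside the suggested ranges. This is a minimal sketch: the class and field names are hypothetical and do not correspond to any tool's API, and the 512 x 512 resolution and the batch count of 4 are assumed placeholders rather than values from this guide.

```python
from dataclasses import dataclass


# Hypothetical container for the settings discussed above.
@dataclass
class GenerationSettings:
    steps: int = 20             # default suggested above
    cfg_scale: float = 7.0      # safe range suggested above: 7 to 12
    sampler: str = "Euler a"
    width: int = 512            # assumed placeholder resolution
    height: int = 512           # assumed placeholder resolution
    restore_faces: bool = True
    tiling: bool = False
    batch_count: int = 4        # assumed placeholder
    batch_size: int = 1         # images per batch

    def warnings(self):
        """List the settings that fall outside the ranges suggested above."""
        notes = []
        if not 10 <= self.steps <= 30:
            notes.append("steps outside the recommended 10-30 range")
        if not 7 <= self.cfg_scale <= 12:
            notes.append("CFG scale outside the safe 7-12 range")
        if self.tiling:
            notes.append("tiling enabled; use only for repeating patterns")
        if self.batch_size != 1:
            notes.append("batch size is not 1")
        return notes

    def total_images(self):
        """Total images produced by one run."""
        return self.batch_count * self.batch_size


settings = GenerationSettings(steps=40, cfg_scale=15)
for note in settings.warnings():
    print(note)
print(settings.total_images())
```

The checks only report deviations and do not block them, since the ranges above are recommendations rather than hard limits.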