Currently, the Server Config settings menu exists only in the Desktop version; it is not available in other versions.

Network

Host: The IP address to listen on

Function: Sets the IP address the server binds to. The default, 127.0.0.1, allows only local access; if you need LAN access, set it to 0.0.0.0

Although the Desktop version offers LAN listening settings, a desktop application is not well suited to running as a server. If you need to provide ComfyUI as a shared service within your LAN, please follow the manual deployment tutorial to deploy a dedicated ComfyUI service instead.

Port: The port to listen on

Function: The port number the server listens on. The Desktop version defaults to port 8000, while the Web version typically uses port 8188
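
ComfyUI's server is built on aiohttp, so these two settings correspond to the host and port the underlying web application binds to. A minimal sketch of that binding behavior (illustrative only, not ComfyUI's actual startup code):

```python
from aiohttp import web

async def health(request):
    return web.json_response({"status": "ok"})

app = web.Application()
app.router.add_get("/", health)

# host="127.0.0.1" accepts connections from this machine only;
# host="0.0.0.0" also accepts connections from the LAN.
web.run_app(app, host="127.0.0.1", port=8188)
```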

TLS Key File: Path to TLS key file for HTTPS

Function: The private key file path required for HTTPS encryption, used to establish secure connections

TLS Certificate File: Path to TLS certificate file for HTTPS

Function: The certificate file path required for HTTPS encryption, used in conjunction with the private key
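
Together, these two files enable HTTPS. As a rough illustration using Python's standard ssl module (the paths server.crt and server.key are placeholders), this is how a certificate/key pair is loaded into an aiohttp server:

```python
import ssl
from aiohttp import web

app = web.Application()

# The two settings above supply these paths; the certificate is sent
# to clients, while the private key never leaves the server.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.load_cert_chain(certfile="server.crt", keyfile="server.key")

# With an SSL context, the server is reachable via https:// instead of http://
web.run_app(app, host="127.0.0.1", port=8188, ssl_context=ctx)
```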

Enable CORS header: Use "*" for all origins or specify a domain

Function: Cross-Origin Resource Sharing settings, allowing web browsers to access the server from different domains
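
In practice this setting controls the Access-Control-Allow-Origin response header. A hypothetical aiohttp middleware showing the effect (not ComfyUI's actual implementation):

```python
from aiohttp import web

@web.middleware
async def cors_middleware(request, handler):
    response = await handler(request)
    # "*" allows any origin; use a specific value such as
    # "https://example.com" to restrict cross-origin access.
    response.headers["Access-Control-Allow-Origin"] = "*"
    return response

app = web.Application(middlewares=[cors_middleware])
```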

Maximum upload size (MB)

Function: Limits the maximum size of a single file upload, in MB (default 100). This affects uploads of images, models, and other files
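
A client can avoid a rejected request by checking a file's size against this limit before uploading. A small sketch (the filename is a placeholder):

```python
import os

MAX_UPLOAD_MB = 100  # mirror the server's configured limit

def within_upload_limit(path: str) -> bool:
    size_mb = os.path.getsize(path) / (1024 * 1024)
    return size_mb <= MAX_UPLOAD_MB

print(within_upload_limit("input.png"))
```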

CUDA

CUDA device index to use

Function: Specifies which NVIDIA graphics card to use. 0 represents the first graphics card, 1 represents the second, and so on. Important for multi-GPU systems
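
To see which index corresponds to which card, you can enumerate the GPUs PyTorch sees (requires a CUDA build of PyTorch):

```python
import torch

# The index printed here is the same index this setting expects
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))
```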

Use CUDA malloc for memory allocation

Function: Controls whether CUDA's own memory allocator is used instead of PyTorch's default caching allocator. Can improve memory management efficiency in certain situations
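
Assuming a recent PyTorch, this kind of switch maps to PyTorch's allocator backend selection, exposed through the PYTORCH_CUDA_ALLOC_CONF environment variable:

```python
import os

# Must be set before torch is imported; selects CUDA's asynchronous
# allocator (cudaMallocAsync) instead of PyTorch's caching allocator.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "backend:cudaMallocAsync"

import torch  # noqa: E402

print(torch.cuda.is_available())
```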

Inference

Global floating point precision

Function: Sets the numerical precision for model calculations. FP16 saves VRAM but may affect quality; FP32 is more precise but uses more VRAM
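
The VRAM saving follows directly from the per-element storage cost: halving the precision halves a tensor's memory footprint, as this small PyTorch check shows:

```python
import torch

x32 = torch.zeros(1024, 1024, dtype=torch.float32)
x16 = x32.to(torch.float16)

# element_size() is bytes per value: 4 for fp32, 2 for fp16
print(x32.element_size() * x32.nelement())  # 4194304 bytes (4 MB)
print(x16.element_size() * x16.nelement())  # 2097152 bytes (2 MB)
```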

UNET precision

Options:

  • auto: Automatically selects the most suitable precision
  • fp64: 64-bit floating point precision, highest precision but largest VRAM usage
  • fp32: 32-bit floating point precision, standard precision
  • fp16: 16-bit floating point precision, can save VRAM
  • bf16: 16-bit brain floating point precision, between fp16 and fp32
  • fp8_e4m3fn: 8-bit floating point precision (e4m3), minimal VRAM usage
  • fp8_e5m2: 8-bit floating point precision (e5m2), minimal VRAM usage

Function: Specifically controls the computational precision of the UNET core component of diffusion models. Higher precision can provide better image generation quality but uses more VRAM. Lower precision can significantly save VRAM but may affect the quality of generated results.
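
The trade-off between these options comes down to numeric range and step size. With a recent PyTorch (2.1+ for the fp8 types) you can compare the dtypes directly:

```python
import torch

# Larger eps means coarser steps between representable values,
# which is where low-precision quality loss comes from.
for dt in (torch.float64, torch.float32, torch.bfloat16, torch.float16,
           torch.float8_e4m3fn, torch.float8_e5m2):
    info = torch.finfo(dt)
    print(f"{str(dt):24} max={info.max:<12g} eps={info.eps:g}")
```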

VAE precision

Options and Recommendations:

  • auto: Automatically selects the most suitable precision, recommended for users with 8-12GB VRAM
  • fp16: 16-bit floating point precision, recommended for users with 6GB or less VRAM, can save VRAM but may affect quality
  • fp32: 32-bit floating point precision, recommended for users with 16GB or more VRAM who pursue the best quality
  • bf16: 16-bit brain floating point precision, recommended for newer graphics cards that support this format, can achieve better performance balance

Function: Controls the computational precision of the Variational Autoencoder (VAE), affecting the quality and speed of image encoding/decoding. Higher precision can provide better image reconstruction quality but uses more VRAM. Lower precision can save VRAM but may affect image detail restoration.

Run VAE on CPU

Function: Forces the VAE to run on the CPU, which saves VRAM but reduces processing speed

Text Encoder precision

Options:

  • auto: Automatically selects the most suitable precision
  • fp8_e4m3fn: 8-bit floating point precision (e4m3), minimal VRAM usage
  • fp8_e5m2: 8-bit floating point precision (e5m2), minimal VRAM usage
  • fp16: 16-bit floating point precision, can save VRAM
  • fp32: 32-bit floating point precision, standard precision

Function: Controls the computational precision of the text prompt encoder, affecting the accuracy of text understanding and VRAM usage. Higher precision can provide more accurate text understanding but uses more VRAM. Lower precision can save VRAM but may affect prompt parsing effectiveness.

Memory

Force channels-last memory format

Function: Changes how tensor data is laid out in memory (NHWC instead of NCHW), which may improve performance on certain hardware
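
In PyTorch terms, this is the torch.channels_last memory format: the tensor shape stays NCHW, but values are stored in NHWC order in memory:

```python
import torch

x = torch.randn(1, 3, 256, 256)              # default (NCHW/contiguous) layout
y = x.to(memory_format=torch.channels_last)  # same shape, NHWC storage order

print(y.shape)                                             # torch.Size([1, 3, 256, 256])
print(y.is_contiguous(memory_format=torch.channels_last))  # True
```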

DirectML device index

Function: Specifies the device when using DirectML acceleration on Windows, mainly for AMD graphics cards

Disable IPEX optimization

Function: Disables IPEX (Intel Extension for PyTorch) optimizations; mainly affects performance on Intel hardware

VRAM management mode

Options:

  • auto: Automatically manages VRAM, allocating it based on model size and requirements
  • lowvram: Low VRAM mode; splits models into parts to minimize VRAM usage, at the cost of speed
  • normalvram: Standard VRAM mode, balancing VRAM usage and performance
  • highvram: High VRAM mode; keeps models in VRAM for better performance
  • novram: Uses as little VRAM as possible, keeping data in system memory instead
  • cpu: CPU-only mode; doesn’t use the graphics card

Function: Controls VRAM usage strategy, such as automatic management, low VRAM mode, etc.

Reserved VRAM (GB)

Function: Amount of VRAM reserved for the operating system and other programs; helps prevent system freezes
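
Conceptually, the reserved amount is subtracted from the free VRAM that model loading is allowed to use. A sketch of that arithmetic using PyTorch (requires a CUDA device):

```python
import torch

free, total = torch.cuda.mem_get_info()  # bytes on the current device
reserve_gb = 1.0                         # the value of this setting

usable = free - reserve_gb * 1024**3     # VRAM left for models
print(f"usable for models: {usable / 1024**3:.2f} GB of {total / 1024**3:.2f} GB")
```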

Disable smart memory management

Function: Disables automatic memory optimization; models are aggressively offloaded to system memory to free VRAM instead of being kept in VRAM when possible

Preview

Method used for latent previews

Options:

  • none: No preview images displayed, only shows progress bar during generation
  • auto: Automatically selects the most suitable preview method, dynamically adjusts based on system performance and VRAM
  • latent2rgb: Directly converts latent space data to RGB images for preview, faster but average quality
  • taesd: Uses lightweight TAESD model for preview, balances speed and quality

Function: Controls how to preview intermediate results during generation. Different preview methods affect preview quality and performance consumption. Choosing the right preview method can find a balance between preview effects and system resource usage.
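
As an illustration of why latent2rgb is cheap: it is essentially one linear projection from latent channels to RGB, with no decoder model involved. The weights below are made-up placeholders; the real per-model tables live in ComfyUI's preview code:

```python
import torch

# Placeholder 4->3 channel projection (illustrative values only)
LATENT_RGB = torch.tensor([
    [ 0.35,  0.23,  0.32],
    [ 0.33,  0.50,  0.24],
    [-0.28,  0.18,  0.27],
    [-0.21, -0.26, -0.72],
])

def latent2rgb(latent: torch.Tensor) -> torch.Tensor:
    # latent: (4, H, W) -> preview: (H, W, 3); a single matrix multiply,
    # far cheaper than running the full VAE decoder
    rgb = torch.einsum("chw,cr->hwr", latent, LATENT_RGB)
    return ((rgb + 1) / 2).clamp(0, 1)

print(latent2rgb(torch.randn(4, 64, 64)).shape)  # torch.Size([64, 64, 3])
```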

Size of preview images

Function: Sets the resolution of preview images, which affects preview clarity and performance. Larger sizes provide higher preview quality but consume more VRAM

Cache

Use classic cache system

Function: Uses the traditional caching strategy, which is more conservative but stable

Use LRU caching with a maximum of N node results cached

Function: Uses a Least Recently Used (LRU) cache that stores up to the specified number of node computation results (see the sketch after the usage recommendations below)

Description:

  • Set a specific number to control maximum cache count, such as 10, 50, 100, etc.
  • Caching can avoid repeated computation of the same node operations, improving workflow execution speed
  • When cache reaches the limit, automatically clears the least recently used results
  • Cached results occupy system memory (RAM/VRAM), larger values use more memory

Usage Recommendations:

  • Default value is null, meaning LRU caching is not enabled
  • Set appropriate cache count based on system memory capacity and usage requirements
  • Recommended for workflows that frequently reuse the same node configurations
  • If system memory is sufficient, larger values can be set for better performance improvement
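
The eviction policy here is the same one behind Python's functools.lru_cache, which makes for a compact analogy (the run_node function is a hypothetical stand-in for node execution):

```python
from functools import lru_cache

@lru_cache(maxsize=50)  # keep at most 50 results; evict the least recently used
def run_node(config: str) -> str:
    print("computing:", config)
    return config.upper()

run_node("a")  # computed
run_node("a")  # served from cache, no recomputation
```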

Attention

Cross attention method

Options:

  • auto: Automatically selects the most suitable attention computation method
  • split: Computes attention in blocks, saving VRAM at the cost of speed
  • quad: Uses the sub-quadratic attention algorithm, balancing speed and VRAM usage
  • pytorch: Uses PyTorch’s native attention, faster but with higher VRAM usage

Function: Controls the specific algorithm used when the model computes attention. Different algorithms make different trade-offs between generation quality, speed, and VRAM usage. Usually recommended to use auto for automatic selection.
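
The pytorch option corresponds to PyTorch's fused scaled_dot_product_attention kernel (PyTorch 2.0+); split and quad trade speed for lower peak memory by chunking the same computation:

```python
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 128, 64)  # (batch, heads, tokens, head_dim)
k = torch.randn(1, 8, 128, 64)
v = torch.randn(1, 8, 128, 64)

# One fused call computes softmax(q @ k^T / sqrt(d)) @ v
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 128, 64])
```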

Force attention upcast

Function: Forces high-precision attention computation; improves quality but increases VRAM usage

Prevent attention upcast

Function: Disables high-precision attention computation to save VRAM

General

Disable xFormers optimization

Function: Disables the optimization features of the xFormers library. xFormers is a library specifically designed to optimize the attention mechanisms of Transformer models, typically improving computational efficiency, reducing memory usage, and accelerating inference speed. Disabling this optimization will:

  • Fall back to standard attention computation methods
  • May increase memory usage and computation time
  • Provide a more stable runtime environment in certain situations

Use Cases:

  • When encountering compatibility issues related to xFormers
  • When more precise computation results are needed (some optimizations may affect numerical precision)
  • When debugging or troubleshooting requires using standard implementations

Default hashing function for model files

Options:

  • sha256: Uses SHA-256 algorithm for hash verification, high security but slower computation
  • sha1: Uses SHA-1 algorithm, faster but slightly lower security
  • sha512: Uses SHA-512 algorithm, provides highest security but slowest computation
  • md5: Uses MD5 algorithm, fastest but lowest security

Function: Sets the hash algorithm for model file verification, used to verify file integrity. Different hash algorithms have different trade-offs between computation speed and security. Usually recommended to use sha256 as the default option, which achieves a good balance between security and performance.
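
All four options are available through Python's hashlib, which is a convenient way to reproduce the verification yourself (the file path is a placeholder):

```python
import hashlib

def file_hash(path: str, algo: str = "sha256") -> str:
    h = hashlib.new(algo)  # accepts "md5", "sha1", "sha256", "sha512"
    with open(path, "rb") as f:
        # Read in 1 MB chunks so multi-GB model files aren't loaded into RAM at once
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

print(file_hash("model.safetensors"))
```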

Make pytorch use slower deterministic algorithms when it can

Function: Forces PyTorch to use deterministic algorithms when possible to improve result reproducibility (see the sketch after the use cases below).

Description:

  • When enabled, PyTorch will prioritize deterministic algorithms over faster non-deterministic algorithms
  • Same inputs will produce same outputs, helpful for debugging and result verification
  • Deterministic algorithms typically run slower than non-deterministic algorithms
  • Even with this setting enabled, completely identical image results cannot be guaranteed in all situations

Use Cases:

  • Scientific research requiring strict result reproducibility
  • Debugging processes requiring stable output results
  • Production environments requiring result consistency
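
A minimal sketch of the PyTorch switches behind this kind of setting (note that some CUDA ops additionally require the CUBLAS_WORKSPACE_CONFIG environment variable to be set):

```python
import torch

torch.manual_seed(0)                      # fix random number generation
torch.use_deterministic_algorithms(True)  # prefer reproducible kernels

# Ops that have no deterministic implementation will now raise an error
# instead of silently varying from run to run.
```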

Enable some untested and potentially quality deteriorating optimizations

Function: Enables experimental optimizations that may improve speed but could potentially affect generation quality

Don’t print server output to console

Function: Prevents displaying server runtime information in the console, keeping the interface clean.

Description:

  • When enabled, ComfyUI server logs and runtime information will not be displayed
  • Can reduce console information interference, making the interface cleaner
  • May slightly improve system performance when there’s heavy log output
  • Default is disabled (false), meaning server output is displayed by default

Use Cases:

  • Production environments where debugging information is not needed
  • When wanting to keep the console interface clean
  • When the system runs stably and log monitoring is not required

Note: It’s recommended to keep this option disabled during development and debugging to promptly view server runtime status and error information.

Disable saving prompt metadata in files

Function: Stops embedding workflow metadata in generated images. This reduces file size, but the workflow information is lost, so you can no longer load an output file to reproduce the generation that created it

Disable loading all custom nodes

Function: Prevents loading of all third-party extension nodes; typically used during troubleshooting to determine whether an error is caused by a third-party extension node

Logging verbosity level

Function: Controls the verbosity level of log output, used for debugging and monitoring system runtime status.

Options:

  • CRITICAL: Only outputs critical error information that may cause the program to stop running
  • ERROR: Outputs error information indicating some functions cannot work properly
  • WARNING: Outputs warning information indicating possible issues that don’t affect main functionality
  • INFO: Outputs general information including system runtime status and important operation records
  • DEBUG: Outputs the most detailed debugging information including system internal runtime details

Description:

  • Log levels increase in verbosity from top to bottom
  • Each level also includes all messages from the more severe levels above it (e.g., INFO includes WARNING and ERROR output)
  • Recommended to set to INFO level for normal use
  • Can be set to DEBUG level when troubleshooting for more information
  • Can be set to WARNING or ERROR level in production environments to reduce log volume

Directories

Input directory

Function: Sets the default storage path for input files (such as images, models)

Output directory

Function: Sets the save path for generation results