Server Config
Detailed description of ComfyUI server configuration options
Currently the Server Config
settings menu only exists in the Desktop version, and this settings menu item does not exist in other versions
Network
Host: The IP address to listen on
- Function: Sets the IP address the server binds to. Default
127.0.0.1
means only local access is allowed. If you need LAN access, you can set it to0.0.0.0
Although we provide LAN listening settings for the Desktop version, as a desktop application, it is not suitable for use as a server. We recommend that if you need to use ComfyUI as a public service within the LAN, please refer to the manual deployment tutorial to deploy the corresponding ComfyUI service.
Port: The port to listen on
Function: The port number the server listens on. Desktop version defaults to port 8000, Web version typically uses port 8188
TLS Key File: Path to TLS key file for HTTPS
Function: The private key file path required for HTTPS encryption, used to establish secure connections
TLS Certificate File: Path to TLS certificate file for HTTPS
Function: The certificate file path required for HTTPS encryption, used in conjunction with the private key
Enable CORS header: Use ”*” for all origins or specify domain
Function: Cross-Origin Resource Sharing settings, allowing web browsers to access the server from different domains
Maximum upload size (MB)
Function: Limits the maximum size of single file uploads, in MB, default 100MB. Affects upload limits for images, models and other files
CUDA
CUDA device index to use
Function: Specifies which NVIDIA graphics card to use. 0 represents the first graphics card, 1 represents the second, and so on. Important for multi-GPU systems
Use CUDA malloc for memory allocation
Function: Controls whether to use CUDA’s memory allocator. Can improve memory management efficiency in certain situations
Inference
Global floating point precision
Function: Sets the numerical precision for model calculations. FP16 saves VRAM but may affect quality, FP32 is more precise but uses more VRAM
UNET precision
Options:
auto
: Automatically selects the most suitable precisionfp64
: 64-bit floating point precision, highest precision but largest VRAM usagefp32
: 32-bit floating point precision, standard precisionfp16
: 16-bit floating point precision, can save VRAMbf16
: 16-bit brain floating point precision, between fp16 and fp32fp8_e4m3fn
: 8-bit floating point precision (e4m3), minimal VRAM usagefp8_e5m2
: 8-bit floating point precision (e5m2), minimal VRAM usage
Function: Specifically controls the computational precision of the UNET core component of diffusion models. Higher precision can provide better image generation quality but uses more VRAM. Lower precision can significantly save VRAM but may affect the quality of generated results.
VAE precision
Options and Recommendations:
auto
: Automatically selects the most suitable precision, recommended for users with 8-12GB VRAMfp16
: 16-bit floating point precision, recommended for users with 6GB or less VRAM, can save VRAM but may affect qualityfp32
: 32-bit floating point precision, recommended for users with 16GB or more VRAM who pursue the best qualitybf16
: 16-bit brain floating point precision, recommended for newer graphics cards that support this format, can achieve better performance balance
Function: Controls the computational precision of the Variational Autoencoder (VAE), affecting the quality and speed of image encoding/decoding. Higher precision can provide better image reconstruction quality but uses more VRAM. Lower precision can save VRAM but may affect image detail restoration.
Run VAE on CPU
Function: Forces VAE to run on CPU, can save VRAM but will reduce processing speed
Text Encoder precision
Options:
auto
: Automatically selects the most suitable precisionfp8_e4m3fn
: 8-bit floating point precision (e4m3), minimal VRAM usagefp8_e5m2
: 8-bit floating point precision (e5m2), minimal VRAM usagefp16
: 16-bit floating point precision, can save VRAMfp32
: 32-bit floating point precision, standard precision
Function: Controls the computational precision of the text prompt encoder, affecting the accuracy of text understanding and VRAM usage. Higher precision can provide more accurate text understanding but uses more VRAM. Lower precision can save VRAM but may affect prompt parsing effectiveness.
Memory
Force channels-last memory format
Function: Changes the data arrangement in memory, may improve performance on certain hardware
DirectML device index
Function: Specifies the device when using DirectML acceleration on Windows, mainly for AMD graphics cards
Disable IPEX optimization
Function: Disables Intel CPU optimization, mainly affects Intel processor performance
VRAM management mode
Options:
auto
: Automatically manages VRAM, allocating VRAM based on model size and requirementslowvram
: Low VRAM mode, uses minimal VRAM, may affect generation qualitynormalvram
: Standard VRAM mode, balances VRAM usage and performancehighvram
: High VRAM mode, uses more VRAM for better performancenovram
: No VRAM usage, runs entirely on system memorycpu
: CPU-only mode, doesn’t use graphics card
Function: Controls VRAM usage strategy, such as automatic management, low VRAM mode, etc.
Reserved VRAM (GB)
Function: Amount of VRAM reserved for the operating system and other programs, prevents system freezing
Disable smart memory management
Function: Disables automatic memory optimization, forces models to move to system memory to free VRAM
Preview
Method used for latent previews
Options:
none
: No preview images displayed, only shows progress bar during generationauto
: Automatically selects the most suitable preview method, dynamically adjusts based on system performance and VRAMlatent2rgb
: Directly converts latent space data to RGB images for preview, faster but average qualitytaesd
: Uses lightweight TAESD model for preview, balances speed and quality
Function: Controls how to preview intermediate results during generation. Different preview methods affect preview quality and performance consumption. Choosing the right preview method can find a balance between preview effects and system resource usage.
Size of preview images
Function: Sets the resolution of preview images, affects preview clarity and performance. Larger sizes provide higher preview quality but also consume more VRAM
Cache
Use classic cache system
Function: Uses traditional caching strategy, more conservative but stable
Use LRU caching with a maximum of N node results cached
Function: Uses Least Recently Used (LRU) algorithm caching system, can cache a specified number of node computation results
Description:
- Set a specific number to control maximum cache count, such as 10, 50, 100, etc.
- Caching can avoid repeated computation of the same node operations, improving workflow execution speed
- When cache reaches the limit, automatically clears the least recently used results
- Cached results occupy system memory (RAM/VRAM), larger values use more memory
Usage Recommendations:
- Default value is null, meaning LRU caching is not enabled
- Set appropriate cache count based on system memory capacity and usage requirements
- Recommended for workflows that frequently reuse the same node configurations
- If system memory is sufficient, larger values can be set for better performance improvement
Attention
Cross attention method
Options:
auto
: Automatically selects the most suitable attention computation methodsplit
: Block-wise attention computation, can save VRAM but slower speedquad
: Uses quad attention algorithm, balances speed and VRAM usagepytorch
: Uses PyTorch native attention computation, faster but higher VRAM usage
Function: Controls the specific algorithm used when the model computes attention. Different algorithms make different trade-offs between generation quality, speed, and VRAM usage. Usually recommended to use auto for automatic selection.
Force attention upcast
Function: Forces high-precision attention computation, improves quality but increases VRAM usage
Prevent attention upcast
Function: Disables high-precision attention computation, saves VRAM
General
Disable xFormers optimization
Function: Disables the optimization features of the xFormers library. xFormers is a library specifically designed to optimize the attention mechanisms of Transformer models, typically improving computational efficiency, reducing memory usage, and accelerating inference speed. Disabling this optimization will:
- Fall back to standard attention computation methods
- May increase memory usage and computation time
- Provide a more stable runtime environment in certain situations
Use Cases:
- When encountering compatibility issues related to xFormers
- When more precise computation results are needed (some optimizations may affect numerical precision)
- When debugging or troubleshooting requires using standard implementations
Default hashing function for model files
Options:
sha256
: Uses SHA-256 algorithm for hash verification, high security but slower computationsha1
: Uses SHA-1 algorithm, faster but slightly lower securitysha512
: Uses SHA-512 algorithm, provides highest security but slowest computationmd5
: Uses MD5 algorithm, fastest but lowest security
Function: Sets the hash algorithm for model file verification, used to verify file integrity. Different hash algorithms have different trade-offs between computation speed and security. Usually recommended to use sha256 as the default option, which achieves a good balance between security and performance.
Make pytorch use slower deterministic algorithms when it can
Function: Forces PyTorch to use deterministic algorithms when possible to improve result reproducibility.
Description:
- When enabled, PyTorch will prioritize deterministic algorithms over faster non-deterministic algorithms
- Same inputs will produce same outputs, helpful for debugging and result verification
- Deterministic algorithms typically run slower than non-deterministic algorithms
- Even with this setting enabled, completely identical image results cannot be guaranteed in all situations
Use Cases:
- Scientific research requiring strict result reproducibility
- Debugging processes requiring stable output results
- Production environments requiring result consistency
Enable some untested and potentially quality deteriorating optimizations
Function: Enables experimental optimizations that may improve speed but could potentially affect generation quality
Don’t print server output to console
Function: Prevents displaying server runtime information in the console, keeping the interface clean.
Description:
- When enabled, ComfyUI server logs and runtime information will not be displayed
- Can reduce console information interference, making the interface cleaner
- May slightly improve system performance when there’s heavy log output
- Default is disabled (false), meaning server output is displayed by default
Use Cases:
- Production environments where debugging information is not needed
- When wanting to keep the console interface clean
- When the system runs stably and log monitoring is not required
Note: It’s recommended to keep this option disabled during development and debugging to promptly view server runtime status and error information.
Disable saving prompt metadata in files
Function: Does not save workflow information in generated images, reducing file size, but also means the loss of corresponding workflow information, preventing you from using workflow output files to reproduce the corresponding generation results
Disable loading all custom nodes
Function: Prevents loading all third-party extension nodes, typically used when troubleshooting issues to locate whether errors are caused by third-party extension nodes
Logging verbosity level
Function: Controls the verbosity level of log output, used for debugging and monitoring system runtime status.
Options:
CRITICAL
: Only outputs critical error information that may cause the program to stop runningERROR
: Outputs error information indicating some functions cannot work properlyWARNING
: Outputs warning information indicating possible issues that don’t affect main functionalityINFO
: Outputs general information including system runtime status and important operation recordsDEBUG
: Outputs the most detailed debugging information including system internal runtime details
Description:
- Log levels increase in verbosity from top to bottom
- Each level includes all log information from higher levels
- Recommended to set to INFO level for normal use
- Can be set to DEBUG level when troubleshooting for more information
- Can be set to WARNING or ERROR level in production environments to reduce log volume
Directories
Input directory
Function: Sets the default storage path for input files (such as images, models)
Output directory
Function: Sets the save path for generation results