Creates a context object that manages the computational state for text generation. The context maintains the conversation history and manages memory efficiently for processing input tokens and generating responses. Each model can have multiple contexts with different settings.
Arguments
- model
A model object returned by model_load
- n_ctx
Maximum context length in tokens (default: 2048). This determines how many tokens of conversation history can be maintained. Larger values require more memory but allow for longer conversations. Must not exceed the model's maximum context length
- n_threads
Number of CPU threads for inference (default: 4). Set to the number of available CPU cores for optimal performance. Only affects CPU computation
- n_seq_max
Maximum number of parallel sequences (default: 1). Used for batch processing multiple conversations simultaneously. Higher values require more memory
- verbosity
Controls backend logging during context creation (default: 1L). Larger values print more information:
0 emits only errors, 1 includes warnings, 2 adds informational logs, and 3 enables the most verbose debug output.
Value
A context object (external pointer) used for text generation with generate
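As a sketch of the typical end-to-end flow, the returned context is passed to generate along with a prompt. Note this is illustrative only: the exact signature of generate is assumed here rather than taken from this page, and the model path is a placeholder.

```r
# Illustrative sketch; generate()'s signature is assumed, not specified on this page
model <- model_load("path/to/model.gguf")   # placeholder path
ctx <- context_create(model, n_ctx = 2048)

# The context carries conversation state across calls, so repeated
# generate() calls on the same ctx continue the same conversation
reply <- generate(ctx, "Hello!")
```

Because the context owns the conversation history, create a fresh context (or one sequence per conversation via n_seq_max) when you need independent conversations against the same model.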
Examples
if (FALSE) { # \dontrun{
# Load model and create basic context
model <- model_load("path/to/model.gguf")
ctx <- context_create(model)
# Create context with larger buffer for long conversations
long_ctx <- context_create(model, n_ctx = 4096)
# High-performance context with more threads
fast_ctx <- context_create(model, n_ctx = 2048, n_threads = 8)
# Context for batch processing multiple conversations
batch_ctx <- context_create(model, n_ctx = 2048, n_seq_max = 4)
# Create context with minimal verbosity (quiet mode, errors only)
quiet_ctx <- context_create(model, verbosity = 0L)
} # }