
Generates text from a loaded language model context with automatic tokenization: simply provide a text prompt and the model handles tokenization internally. This function shares a unified API with generate_parallel.

Usage

generate(
  context,
  prompt,
  max_tokens = 100L,
  top_k = 40L,
  top_p = 1,
  temperature = 0,
  repeat_last_n = 0L,
  penalty_repeat = 1,
  seed = 1234L,
  clean = FALSE,
  hash = TRUE
)

Arguments

context

A context object created with context_create

prompt

Character string containing the input text prompt.

max_tokens

Maximum number of tokens to generate (default: 100). Higher values allow longer responses.

top_k

Top-k sampling parameter (default: 40). Limits sampling to the k most likely tokens. Use 0 to disable.

top_p

Top-p (nucleus) sampling parameter (default: 1.0). Cumulative probability threshold for token selection.

temperature

Sampling temperature (default: 0.0). Set to 0 for greedy (deterministic) decoding; higher values produce more varied, less predictable output.

repeat_last_n

Number of recent tokens to consider for the repetition penalty (default: 0). Set to 0 to disable.

penalty_repeat

Repetition penalty strength (default: 1.0). Values > 1 discourage repetition; set to 1.0 to disable.

seed

Random seed for reproducible generation (default: 1234). Use the same positive integer to obtain deterministic output.

clean

If TRUE, strip common chat-template control tokens from the generated text (default: FALSE).

hash

When `TRUE` (default), computes SHA-256 hashes of the prompt and the generated output, and attaches them to the result via the `"hashes"` attribute for later inspection.

Value

Character string containing the generated text.

Examples

if (FALSE) { # \dontrun{
# Load model and create context
model <- model_load("path/to/model.gguf")
ctx <- context_create(model, n_ctx = 2048)

# Basic generation with default settings (greedy decoding)
response <- generate(ctx, "Hello, how are you?", max_tokens = 50)

# Creative writing with higher temperature
story <- generate(ctx, "Once upon a time", max_tokens = 200, temperature = 0.8)

# Prevent repetition
no_repeat <- generate(ctx, "Tell me about AI",
                      repeat_last_n = 64,
                      penalty_repeat = 1.1)

# Clean output (remove special tokens)
clean_output <- generate(ctx, "Explain quantum physics", clean = TRUE)
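
# Reproducibility and hash inspection (illustrative sketch: with a fixed
# seed, repeated calls are documented to produce deterministic output)
a <- generate(ctx, "List three colors", seed = 42L)
b <- generate(ctx, "List three colors", seed = 42L)
identical(a, b)  # expected TRUE for the same seed and settings

# With hash = TRUE (the default), SHA-256 hashes of the prompt and
# output are attached as the "hashes" attribute
out <- generate(ctx, "Hello")
attr(out, "hashes")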
} # }