Manage LLM Configurations

With Crafting, org admins manage LLM-related configurations in a centralized location, enabling AI features seamlessly for the entire org. Developers can use AI out of the box with no additional setup required.

Activate AI Capability

If AI has not been configured yet, using AI-related features in Crafting will prompt you to set up AI. As an org admin, visit Connect/LLM to configure LLM providers and models.

Add LLM Providers

The first step is to add LLM providers. Crafting supports commonly used LLM providers as well as cloud provider LLM infrastructure such as Bedrock and Vertex AI.

For most LLM providers, an API key or authentication token is required. These can be stored as org-level secrets. For additional security, a secret can be set to Admin Only and/or Not Mountable.

For cloud provider infrastructure, if the Crafting control plane is deployed in the same cloud, it can be used directly — Crafting will authenticate with the service using the cloud provider's metadata service.

Once the corresponding providers are added, the following commands become available inside workspaces:

cs claude: requires at least one Anthropic provider or Bedrock
cs gemini: requires at least one Gemini provider
cs codex: requires at least one OpenAI provider

These commands automatically install and configure the coding CLIs; they do not need to be pre-installed in the base snapshot.

Add LLM Models

Crafting AI features are not fully activated until at least one LLM model is added and assigned to the GENERIC purpose. Crafting agents use Purpose to select models, giving org admins flexible control over model selection.

For example, the workspace agent performing coding tasks selects the first model assigned to the CODING purpose. Text summarization selects the first model assigned to the FAST purpose. If no model is assigned to the desired purpose, the agent falls back to the model assigned to the GENERIC purpose.

A model can be assigned to zero or any number of purposes. When selecting by purpose, the first model assigned to that purpose is used. A model not assigned to any purpose may not be used automatically by Crafting agents, but can be explicitly selected by the user in specific scenarios, such as in the Copilot UI of the Web IDE.

Optionally, a model can be assigned one or more aliases. This gives org admins the flexibility to switch models at any time while developers continue to reference a fixed alias.

Context Window Limit per Model

Because it is not possible to retrieve the accurate context window limit of a specific model from the provider, an explicit value can be specified when adding a model. If no value is provided, Crafting agents use a general default slightly below 200K tokens.

This value helps Crafting determine when to trigger context compression in large conversations more accurately. If set to a negative value, automatic context compression is disabled, and a conversation may stall if the context window becomes too large.

Export and Import LLM Config

The LLM configuration at the org level can be exported or imported, enabling management via Configuration-as-Code:

cs llm config export — prints the current LLM configuration to stdout.
cs llm config import FILE — imports LLM configuration from a file (or from stdin if FILE is -).

Authorize Global MCP Servers

Custom MCP servers can be deployed as Pinned sandboxes running continuously. For self-hosted deployments, it is recommended to create a dedicated, small node pool for such workloads. Once authorized, an MCP server is made available in every workspace and can be used directly.

See MCP Servers for full details on how MCP servers work in Crafting.

Deploy an MCP server as a Pinned sandbox running 24x7. Expose the HTTP port as an INTERNAL endpoint with auth_proxy disabled. For example:

   endpoints:
   - name: mcp
     type: INTERNAL
     http:
       routes:
       - path_prefix: /
         backend:
           target: slack-mcp
           port: http-stream
       auth_proxy:
         disabled: true
   containers:
   - name: slack-mcp
     ports:
     - name: http-stream
       port: 3000
       protocol: HTTP/TCP
     images: slack-mcp-image:latest
     volume_mounts:
     - name: token
       path: /etc/slack/bot-token
       from:
         volume:
           secret:
             name: slack-bot token
   customizations:
   - mcp_server:
       endpoint: mcp

Authorize this MCP server to be used in all workspaces. Visit Connect/LLM, switch to the Discovery tab, click ADD, and specify the sandbox name.

After completing these steps, the MCP server is added to /run/sandbox/fs/metadata/mcp.json in all workspaces. This configuration can be loaded by any agent. The Crafting workspace agent loads it automatically, and commands such as cs claude also load it automatically.

Session Retention

The Session Config tab on the Connect/LLM page specifies the retention policy for LLM sessions stored by Crafting. Sessions are automatically deleted when they have been inactive beyond the configured retention window.

Limit Use of Templates

The Crafting agent can pick the best matching template and create a sandbox for certain tasks. If a particular template should not be used by the agent, it can be excluded from all tool calls involving templates by adding the following customization to the template definition:

customizations:
- property_set:
    type: crafting.dev/sandbox/llm
    properties:
      authorizedTemplate: excluded