Description

Multi-protocol LLM router and client library.

A protocol-converter library that lets your application connect to any LLM API (OpenAI, Gemini, Anthropic) with automatic protocol translation, SSE streaming, and function-call buffering. Use it as a library or run it as a proxy server.

Louter

Multi-protocol LLM proxy and Haskell client library. Connect to any LLM API (OpenAI, Anthropic, Gemini) using any SDK with automatic protocol translation.

Features

  • Protocol Translation: OpenAI ↔ Anthropic ↔ Gemini automatic conversion
  • Dual Usage: Haskell library or standalone proxy server
  • Streaming: Full SSE support with smart buffering
  • Function Calling: Works across all protocols (JSON and XML formats)
  • Vision: Multimodal image support
  • Flexible Auth: Optional authentication for local vs cloud backends

Quick Start

As a Proxy Server

# Install
git clone https://github.com/junjihashimoto/louter.git
cd louter
cabal build all

# Configure
cat > config.yaml <<EOF
backends:
  llama-server:
    type: openai
    url: http://localhost:11211
    requires_auth: false
    model_mapping:
      gpt-4: qwen/qwen2.5-vl-7b
EOF

# Run
cabal run louter-server -- --config config.yaml --port 9000

Now send OpenAI/Anthropic/Gemini requests to localhost:9000.

Test it:

curl http://localhost:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]}'

As a Haskell Library

Add to your project:

# package.yaml
dependencies:
  - louter
  - text
  - aeson

Basic usage:

import Louter.Client
import Louter.Client.OpenAI (llamaServerClient)

main :: IO ()
main = do
  client <- llamaServerClient "http://localhost:11211"
  response <- chatCompletion client $
    defaultChatRequest "gpt-4" [Message RoleUser "Hello!"]
  print response

Streaming:

import Louter.Client
import Louter.Client.OpenAI (llamaServerClient)
import Louter.Types.Streaming
import System.IO (hFlush, stdout)

main :: IO ()
main = do
  client <- llamaServerClient "http://localhost:11211"
  let request = (defaultChatRequest "gpt-4"
        [Message RoleUser "Write a haiku"]) { reqStream = True }

  streamChatWithCallback client request $ \event -> case event of
    StreamContent txt -> putStr txt >> hFlush stdout
    StreamFinish reason -> putStrLn $ "\n[Done: " <> reason <> "]"
    StreamError err -> putStrLn $ "[Error: " <> err <> "]"
    _ -> pure ()

Function calling:

{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (object, (.=))
import Data.Text (Text)

weatherTool = Tool
  { toolName = "get_weather"
  , toolDescription = Just "Get current weather"
  , toolParameters = object
      [ "type" .= ("object" :: Text)
      , "properties" .= object
          [ "location" .= object
              [ "type" .= ("string" :: Text) ]
          ]
      , "required" .= (["location"] :: [Text])
      ]
  }

request = (defaultChatRequest "gpt-4"
    [Message RoleUser "Weather in Tokyo?"])
    { reqTools = [weatherTool]
    , reqToolChoice = ToolChoiceAuto
    }

Use Cases

Frontend        Backend              Use Case
--------        -------              --------
OpenAI SDK      Gemini API           Use OpenAI SDK with Gemini models
Anthropic SDK   Local llama-server   Use Claude Code with local models
Gemini SDK      OpenAI API           Use Gemini SDK with GPT models
Any SDK         Any Backend          Protocol-agnostic development

Configuration

Local model (no auth):

backends:
  local:
    type: openai
    url: http://localhost:11211
    requires_auth: false
    model_mapping:
      gpt-4: qwen/qwen2.5-vl-7b

Cloud API (with auth):

backends:
  openai:
    type: openai
    url: https://api.openai.com
    requires_auth: true
    api_key: "${OPENAI_API_KEY}"
    model_mapping:
      gpt-4: gpt-4-turbo-preview
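
The `"${OPENAI_API_KEY}"` value is expanded from the environment so that secrets stay out of the config file. Louter's exact substitution rules aren't documented here; as an illustration, a minimal `${VAR}` expansion can be sketched like this (the function name and regex are assumptions, not louter's implementation):

```python
import os
import re

def expand_env(value: str) -> str:
    """Replace ${VAR} references with environment values (illustrative sketch)."""
    return re.sub(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}",
                  lambda m: os.environ.get(m.group(1), ""), value)

os.environ["OPENAI_API_KEY"] = "sk-test"
print(expand_env("${OPENAI_API_KEY}"))  # -> sk-test
```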

Multi-backend:

backends:
  local:
    type: openai
    url: http://localhost:11211
    requires_auth: false
    model_mapping:
      gpt-3.5-turbo: qwen/qwen2.5-7b

  openai:
    type: openai
    url: https://api.openai.com
    requires_auth: true
    api_key: "${OPENAI_API_KEY}"
    model_mapping:
      gpt-4: gpt-4-turbo-preview

See examples/ for more configurations.


API Types

Client Creation

-- Local llama-server (no auth)
import Louter.Client.OpenAI (llamaServerClient)
client <- llamaServerClient "http://localhost:11211"

-- Cloud APIs (with auth)
import Louter.Client.OpenAI (openAIClient)
import Louter.Client.Anthropic (anthropicClient)
import Louter.Client.Gemini (geminiClient)

client <- openAIClient "sk-..."
client <- anthropicClient "sk-ant-..."
client <- geminiClient "your-api-key"

Request Types

-- ChatRequest
data ChatRequest = ChatRequest
  { reqModel :: Text
  , reqMessages :: [Message]
  , reqTools :: [Tool]
  , reqTemperature :: Maybe Float
  , reqMaxTokens :: Maybe Int
  , reqStream :: Bool
  }

-- Message
data Message = Message
  { msgRole :: MessageRole  -- RoleSystem | RoleUser | RoleAssistant
  , msgContent :: Text
  }

-- Tool
data Tool = Tool
  { toolName :: Text
  , toolDescription :: Maybe Text
  , toolParameters :: Value  -- JSON schema
  }

Response Types

-- Non-streaming
chatCompletion :: Client -> ChatRequest -> IO (Either Text ChatResponse)

data ChatResponse = ChatResponse
  { respId :: Text
  , respChoices :: [Choice]
  , respUsage :: Maybe Usage
  }

-- Streaming
streamChatWithCallback :: Client -> ChatRequest -> (StreamEvent -> IO ()) -> IO ()

data StreamEvent
  = StreamContent Text           -- Response text
  | StreamReasoning Text         -- Thinking tokens
  | StreamToolCall ToolCall      -- Complete tool call (buffered)
  | StreamFinish FinishReason    -- Generation finished
  | StreamError Text

Docker

# Build
docker build -t louter .

# Run with config
docker run -p 9000:9000 -v $(pwd)/config.yaml:/app/config.yaml louter

# Or use docker-compose
docker-compose up

Testing

# Python SDK integration tests (43+ tests)
python tests/run_all_tests.py

# Haskell unit tests
cabal test all

Architecture

Client Request (Any Format)
    ↓
Protocol Converter
    ↓
Core IR (OpenAI-based)
    ↓
Backend Adapter
    ↓
LLM Backend (Any Format)
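
The conversion step can be sketched as a plain dictionary mapping. This is not louter's internal code (louter's IR is typed Haskell); the field names below come from the public OpenAI and Anthropic request shapes:

```python
def anthropic_to_openai(req: dict) -> dict:
    """Map an Anthropic Messages request onto an OpenAI-style chat request (sketch)."""
    messages = []
    if "system" in req:  # Anthropic carries the system prompt as a top-level field
        messages.append({"role": "system", "content": req["system"]})
    messages.extend(req.get("messages", []))
    return {
        "model": req["model"],
        "messages": messages,
        "max_tokens": req.get("max_tokens"),
    }

out = anthropic_to_openai({
    "model": "claude-3-5-sonnet-20241022",
    "system": "Be concise.",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}],
})
print(out["messages"][0]["role"])  # -> system
```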

Key Components:

  • SSE Parser: Incremental streaming with attoparsec
  • Smart Buffering: Tool calls buffered until complete JSON
  • Type Safety: Strict Haskell types throughout
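
The key property of incremental SSE parsing is that network chunk boundaries need not align with event boundaries, so the parser must buffer across chunks. A minimal Python sketch of that idea (louter's actual parser is attoparsec-based Haskell):

```python
def parse_sse(chunks):
    """Incrementally split an SSE stream into `data:` payloads (illustrative sketch)."""
    buf = ""
    events = []
    for chunk in chunks:
        buf += chunk
        while "\n\n" in buf:          # an SSE event is terminated by a blank line
            raw, buf = buf.split("\n\n", 1)
            for line in raw.split("\n"):
                if line.startswith("data: "):
                    events.append(line[len("data: "):])
    return events

# A JSON payload split across two chunks is still recovered intact:
print(parse_sse(["data: {\"a\"", ": 1}\n\ndata: [DONE]\n\n"]))  # -> ['{"a": 1}', '[DONE]']
```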

Streaming Strategy:

  • Content/Reasoning: Stream immediately (real-time output)
  • Tool Calls: Buffer until complete (valid JSON required)
  • State Machine: Track tool call assembly by index
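
The buffering rule can be illustrated with a small sketch: argument fragments are accumulated per tool-call index and a call is only emitted once its arguments parse as valid JSON. This is a simplified model of the strategy, not louter's state machine:

```python
import json

def buffer_tool_calls(deltas):
    """Assemble streamed tool-call fragments keyed by index; emit a call
    only once its accumulated JSON arguments parse (illustrative sketch)."""
    pending = {}   # index -> {"name": ..., "args": accumulated argument string}
    complete = []
    for d in deltas:
        slot = pending.setdefault(d["index"], {"name": "", "args": ""})
        if "name" in d:
            slot["name"] = d["name"]
        slot["args"] += d.get("arguments", "")
        try:
            args = json.loads(slot["args"])   # only succeeds once fully buffered
        except json.JSONDecodeError:
            continue                           # incomplete: keep buffering
        complete.append({"name": slot["name"], "arguments": args})
        del pending[d["index"]]
    return complete

calls = buffer_tool_calls([
    {"index": 0, "name": "get_weather", "arguments": '{"loca'},
    {"index": 0, "arguments": 'tion": "Tokyo"}'},
])
print(calls)  # -> [{'name': 'get_weather', 'arguments': {'location': 'Tokyo'}}]
```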

Proxy Examples

Use OpenAI SDK with Local Models

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9000/v1",
    api_key="not-needed"
)

response = client.chat.completions.create(
    model="gpt-4",  # Routed to qwen/qwen2.5-vl-7b
    messages=[{"role": "user", "content": "Hello!"}]
)

Use Claude Code with Gemini

# config.yaml
backends:
  gemini:
    type: gemini
    url: https://generativelanguage.googleapis.com
    requires_auth: true
    api_key: "${GEMINI_API_KEY}"
    model_mapping:
      claude-3-5-sonnet-20241022: gemini-2.0-flash

# Start proxy on Anthropic-compatible port
cabal run louter-server -- --config config.yaml --port 8000

# Configure Claude Code:
# API Endpoint: http://localhost:8000
# Model: claude-3-5-sonnet-20241022

Monitoring

Health check:

curl http://localhost:9000/health

JSON-line logging:

cabal run louter-server -- --config config.yaml --port 9000 2>&1 | jq .

Troubleshooting

Connection refused:

# Check backend is running
curl http://localhost:11211/v1/models

Invalid API key:

# Verify environment variable
echo $OPENAI_API_KEY

Model not found:

  • Check model_mapping in config
  • Frontend model (client requests) → Backend model (sent to API)
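
That lookup direction can be sketched as follows; the function and config shape here are illustrative assumptions, since louter's routing may consider more than the model name:

```python
def resolve_model(config: dict, requested: str):
    """Find the backend and concrete model for a client-requested name (sketch)."""
    for backend, spec in config["backends"].items():
        mapping = spec.get("model_mapping", {})
        if requested in mapping:
            return backend, mapping[requested]
    return None  # "model not found": no backend maps this name

config = {"backends": {
    "local":  {"model_mapping": {"gpt-3.5-turbo": "qwen/qwen2.5-7b"}},
    "openai": {"model_mapping": {"gpt-4": "gpt-4-turbo-preview"}},
}}
print(resolve_model(config, "gpt-4"))   # -> ('openai', 'gpt-4-turbo-preview')
print(resolve_model(config, "gpt-5"))   # -> None
```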

Examples

See examples/ for configuration examples and use cases.

License

MIT License - see LICENSE file.

Metadata

Version

0.1.1.1

Platforms (78)

  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows