Description

Multi-protocol LLM router and client library.

A protocol-converter library that lets your application connect to any LLM API (OpenAI, Gemini, Anthropic) with automatic protocol translation, SSE streaming, and function-call buffering. Use it as a library or run it as a proxy server.

Louter

Multi-protocol LLM proxy and Haskell client library. Connect to any LLM API (OpenAI, Anthropic, Gemini) using any SDK with automatic protocol translation.

Features

  • Protocol Translation: OpenAI ↔ Anthropic ↔ Gemini automatic conversion
  • Dual Usage: Haskell library or standalone proxy server
  • Streaming: Full SSE support with smart buffering
  • Function Calling: Works across all protocols (JSON and XML formats)
  • Vision: Multimodal image support
  • Flexible Auth: Optional authentication for local vs cloud backends

Quick Start

As a Proxy Server

# Install
git clone https://github.com/junjihashimoto/louter.git
cd louter
cabal build all

# Configure
cat > config.yaml <<EOF
backends:
  llama-server:
    type: openai
    url: http://localhost:11211
    requires_auth: false
    model_mapping:
      gpt-4: qwen/qwen2.5-vl-7b
EOF

# Run
cabal run louter-server -- --config config.yaml --port 9000

Now send OpenAI/Anthropic/Gemini requests to localhost:9000.

Test it:

curl http://localhost:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]}'

As a Haskell Library

Add to your project:

# package.yaml
dependencies:
  - louter
  - text
  - aeson

Basic usage:

import Louter.Client
import Louter.Client.OpenAI (llamaServerClient)

main :: IO ()
main = do
  client <- llamaServerClient "http://localhost:11211"
  response <- chatCompletion client $
    defaultChatRequest "gpt-4" [Message RoleUser "Hello!"]
  print response

Streaming:

import Louter.Client
import Louter.Client.OpenAI (llamaServerClient)
import Louter.Types.Streaming
import System.IO (hFlush, stdout)

main :: IO ()
main = do
  client <- llamaServerClient "http://localhost:11211"
  let request = (defaultChatRequest "gpt-4"
        [Message RoleUser "Write a haiku"]) { reqStream = True }

  streamChatWithCallback client request $ \event -> case event of
    StreamContent txt -> putStr txt >> hFlush stdout
    StreamFinish reason -> putStrLn $ "\n[Done: " <> reason <> "]"
    StreamError err -> putStrLn $ "[Error: " <> err <> "]"
    _ -> pure ()

Function calling:

{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (object, (.=))
import Data.Text (Text)

weatherTool = Tool
  { toolName = "get_weather"
  , toolDescription = Just "Get current weather"
  , toolParameters = object
      [ "type" .= ("object" :: Text)
      , "properties" .= object
          [ "location" .= object
              [ "type" .= ("string" :: Text) ]
          ]
      , "required" .= (["location"] :: [Text])
      ]
  }

request = (defaultChatRequest "gpt-4"
    [Message RoleUser "Weather in Tokyo?"])
    { reqTools = [weatherTool]
    , reqToolChoice = ToolChoiceAuto
    }

Use Cases

Frontend        Backend              Use Case
--------        -------              --------
OpenAI SDK      Gemini API           Use OpenAI SDK with Gemini models
Anthropic SDK   Local llama-server   Use Claude Code with local models
Gemini SDK      OpenAI API           Use Gemini SDK with GPT models
Any SDK         Any Backend          Protocol-agnostic development

Configuration

Local model (no auth):

backends:
  local:
    type: openai
    url: http://localhost:11211
    requires_auth: false
    model_mapping:
      gpt-4: qwen/qwen2.5-vl-7b

Cloud API (with auth):

backends:
  openai:
    type: openai
    url: https://api.openai.com
    requires_auth: true
    api_key: "${OPENAI_API_KEY}"
    model_mapping:
      gpt-4: gpt-4-turbo-preview
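
The `"${OPENAI_API_KEY}"` value is expanded from the environment so that secrets stay out of the config file. Louter's exact substitution rules aren't documented here; as an illustration, a minimal `${VAR}` expansion can be sketched like this (the function name and regex are assumptions, not louter's implementation):

```python
import os
import re

def expand_env(value: str) -> str:
    """Replace ${VAR} references with environment values (illustrative sketch)."""
    return re.sub(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}",
                  lambda m: os.environ.get(m.group(1), ""), value)

os.environ["OPENAI_API_KEY"] = "sk-test"
print(expand_env("${OPENAI_API_KEY}"))  # -> sk-test
```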

Multi-backend:

backends:
  local:
    type: openai
    url: http://localhost:11211
    requires_auth: false
    model_mapping:
      gpt-3.5-turbo: qwen/qwen2.5-7b

  openai:
    type: openai
    url: https://api.openai.com
    requires_auth: true
    api_key: "${OPENAI_API_KEY}"
    model_mapping:
      gpt-4: gpt-4-turbo-preview

See examples/ for more configurations.


API Types

Client Creation

-- Local llama-server (no auth)
import Louter.Client.OpenAI (llamaServerClient)
client <- llamaServerClient "http://localhost:11211"

-- Cloud APIs (with auth)
import Louter.Client.OpenAI (openAIClient)
import Louter.Client.Anthropic (anthropicClient)
import Louter.Client.Gemini (geminiClient)

client <- openAIClient "sk-..."
client <- anthropicClient "sk-ant-..."
client <- geminiClient "your-api-key"

Request Types

-- ChatRequest
data ChatRequest = ChatRequest
  { reqModel :: Text
  , reqMessages :: [Message]
  , reqTools :: [Tool]
  , reqTemperature :: Maybe Float
  , reqMaxTokens :: Maybe Int
  , reqStream :: Bool
  }

-- Message
data Message = Message
  { msgRole :: MessageRole  -- RoleSystem | RoleUser | RoleAssistant
  , msgContent :: Text
  }

-- Tool
data Tool = Tool
  { toolName :: Text
  , toolDescription :: Maybe Text
  , toolParameters :: Value  -- JSON schema
  }

Response Types

-- Non-streaming
chatCompletion :: Client -> ChatRequest -> IO (Either Text ChatResponse)

data ChatResponse = ChatResponse
  { respId :: Text
  , respChoices :: [Choice]
  , respUsage :: Maybe Usage
  }

-- Streaming
streamChatWithCallback :: Client -> ChatRequest -> (StreamEvent -> IO ()) -> IO ()

data StreamEvent
  = StreamContent Text           -- Response text
  | StreamReasoning Text         -- Thinking tokens
  | StreamToolCall ToolCall      -- Complete tool call (buffered)
  | StreamFinish FinishReason    -- Generation finished
  | StreamError Text

Docker

# Build
docker build -t louter .

# Run with config
docker run -p 9000:9000 -v $(pwd)/config.yaml:/app/config.yaml louter

# Or use docker-compose
docker-compose up

Testing

# Python SDK integration tests (43+ tests)
python tests/run_all_tests.py

# Haskell unit tests
cabal test all

Architecture

Client Request (Any Format)
    ↓
Protocol Converter
    ↓
Core IR (OpenAI-based)
    ↓
Backend Adapter
    ↓
LLM Backend (Any Format)
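
The conversion step can be sketched as a plain dictionary mapping. This is not louter's internal code (louter's IR is typed Haskell); the field names below come from the public OpenAI and Anthropic request shapes:

```python
def anthropic_to_openai(req: dict) -> dict:
    """Map an Anthropic Messages request onto an OpenAI-style chat request (sketch)."""
    messages = []
    if "system" in req:  # Anthropic carries the system prompt as a top-level field
        messages.append({"role": "system", "content": req["system"]})
    messages.extend(req.get("messages", []))
    return {
        "model": req["model"],
        "messages": messages,
        "max_tokens": req.get("max_tokens"),
    }

out = anthropic_to_openai({
    "model": "claude-3-5-sonnet-20241022",
    "system": "Be concise.",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}],
})
print(out["messages"][0]["role"])  # -> system
```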

Key Components:

  • SSE Parser: Incremental streaming with attoparsec
  • Smart Buffering: Tool calls buffered until complete JSON
  • Type Safety: Strict Haskell types throughout
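
The key property of incremental SSE parsing is that network chunk boundaries need not align with event boundaries, so the parser must buffer across chunks. A minimal Python sketch of that idea (louter's actual parser is attoparsec-based Haskell):

```python
def parse_sse(chunks):
    """Incrementally split an SSE stream into `data:` payloads (illustrative sketch)."""
    buf = ""
    events = []
    for chunk in chunks:
        buf += chunk
        while "\n\n" in buf:          # an SSE event is terminated by a blank line
            raw, buf = buf.split("\n\n", 1)
            for line in raw.split("\n"):
                if line.startswith("data: "):
                    events.append(line[len("data: "):])
    return events

# A JSON payload split across two chunks is still recovered intact:
print(parse_sse(["data: {\"a\"", ": 1}\n\ndata: [DONE]\n\n"]))  # -> ['{"a": 1}', '[DONE]']
```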

Streaming Strategy:

  • Content/Reasoning: Stream immediately (real-time output)
  • Tool Calls: Buffer until complete (valid JSON required)
  • State Machine: Track tool call assembly by index
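
The buffering rule can be illustrated with a small sketch: argument fragments are accumulated per tool-call index and a call is only emitted once its arguments parse as valid JSON. This is a simplified model of the strategy, not louter's state machine:

```python
import json

def buffer_tool_calls(deltas):
    """Assemble streamed tool-call fragments keyed by index; emit a call
    only once its accumulated JSON arguments parse (illustrative sketch)."""
    pending = {}   # index -> {"name": ..., "args": accumulated argument string}
    complete = []
    for d in deltas:
        slot = pending.setdefault(d["index"], {"name": "", "args": ""})
        if "name" in d:
            slot["name"] = d["name"]
        slot["args"] += d.get("arguments", "")
        try:
            args = json.loads(slot["args"])   # only succeeds once fully buffered
        except json.JSONDecodeError:
            continue                           # incomplete: keep buffering
        complete.append({"name": slot["name"], "arguments": args})
        del pending[d["index"]]
    return complete

calls = buffer_tool_calls([
    {"index": 0, "name": "get_weather", "arguments": '{"loca'},
    {"index": 0, "arguments": 'tion": "Tokyo"}'},
])
print(calls)  # -> [{'name': 'get_weather', 'arguments': {'location': 'Tokyo'}}]
```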

Proxy Examples

Use OpenAI SDK with Local Models

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9000/v1",
    api_key="not-needed"
)

response = client.chat.completions.create(
    model="gpt-4",  # Routed to qwen/qwen2.5-vl-7b
    messages=[{"role": "user", "content": "Hello!"}]
)

Use Claude Code with Gemini

# config.yaml
backends:
  gemini:
    type: gemini
    url: https://generativelanguage.googleapis.com
    requires_auth: true
    api_key: "${GEMINI_API_KEY}"
    model_mapping:
      claude-3-5-sonnet-20241022: gemini-2.0-flash

# Start proxy on Anthropic-compatible port
cabal run louter-server -- --config config.yaml --port 8000

# Configure Claude Code:
# API Endpoint: http://localhost:8000
# Model: claude-3-5-sonnet-20241022

Monitoring

Health check:

curl http://localhost:9000/health

JSON-line logging:

cabal run louter-server -- --config config.yaml --port 9000 2>&1 | jq .

Troubleshooting

Connection refused:

# Check backend is running
curl http://localhost:11211/v1/models

Invalid API key:

# Verify environment variable
echo $OPENAI_API_KEY

Model not found:

  • Check model_mapping in config
  • Frontend model (client requests) → Backend model (sent to API)
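
That lookup direction can be sketched as follows; the function and config shape here are illustrative assumptions, since louter's routing may consider more than the model name:

```python
def resolve_model(config: dict, requested: str):
    """Find the backend and concrete model for a client-requested name (sketch)."""
    for backend, spec in config["backends"].items():
        mapping = spec.get("model_mapping", {})
        if requested in mapping:
            return backend, mapping[requested]
    return None  # "model not found": no backend maps this name

config = {"backends": {
    "local":  {"model_mapping": {"gpt-3.5-turbo": "qwen/qwen2.5-7b"}},
    "openai": {"model_mapping": {"gpt-4": "gpt-4-turbo-preview"}},
}}
print(resolve_model(config, "gpt-4"))   # -> ('openai', 'gpt-4-turbo-preview')
print(resolve_model(config, "gpt-5"))   # -> None
```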

Examples

See examples/ for configuration examples and use cases.

License

MIT License - see LICENSE file.

Metadata

Version

0.1.1.1

Platforms (78)

  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows