Developer API
Before using the Notate API, you'll need to generate an API key from the Settings page under Developer Integration. Keys are stored in your local SQLite database.
API Key Management
Key Features
- Generate new API keys
- Add custom names for organization
- Set expiration periods
- View and manage active keys
Key Management
- Copy keys to clipboard
- Delete unused keys
- View key expiration status
API Endpoints
The Notate API provides three main endpoints for different types of interactions with your collections and AI models.
Vector Query
Perform semantic searches across your collections using vector embeddings. Returns the most semantically similar documents based on the input query.
Endpoint:
POST http://127.0.0.1:47372/api/vector
Request Body:
{ "input": "string", // Required - The search query to find similar documents "collection_name": "string", // Required - Collection to search within "top_k": number, // Optional - Number of results to return (default: 5) }
Example Response:
{ "status": "success", "results": [ { "content": "The document content...", "metadata": { "source": "document.txt", "title": "Document Title", "author": "", "description": "Document description...", "keywords": "", "ogImage": "" } } ] }
RAG Query
Retrieves AI-generated responses with source citations using Retrieval-Augmented Generation. This endpoint combines semantic search with LLM capabilities to provide contextually relevant answers.
Important Note:
For external AI providers (OpenAI, Anthropic, etc.), you must specify both the provider and model parameters, and have valid API keys configured in your settings. If you only specify a model without a provider, the system will default to using Ollama with local models.
Endpoint:
POST http://127.0.0.1:47372/api/rag
Request Body:
{ "input": "string", // Required - The query to answer "collection_name": "string", // Required - Collection for context retrieval "model": "string", // Required - Model to use (e.g., "gpt-4", "llama2") "provider": "string", // Required for external APIs (e.g., "openai", "anthropic") - Omit for local models "prompt": "string", // Optional - Custom system prompt "top_k": number, // Optional - Number of context chunks (default: 5) "temperature": number, // Optional - Response creativity (default: 0.5) "max_completion_tokens": number, // Optional - Max response length (default: 2048) "top_p": number, // Optional - Nucleus sampling (default: 1) "frequency_penalty": number, // Optional - Repetition control (default: 0) "presence_penalty": number, // Optional - Topic diversity (default: 0) "is_ooba": boolean // Optional - Use Oobabooga LLM processing (default: false) "is_ollama": boolean // Optional - Use Ollama LLM processing (default: false) "character": string // Optional - Oobabooga character }
Example Response:
{ "id": "local-llama3.2-1735945911", "choices": [{ "finish_reason": "stop", "index": 0, "message": { "content": "Example response from RAG query", "role": "assistant" } }], "created": 1735945911, "model": "llama3.2", "object": "chat.completion", "usage": { "completion_tokens": -1, "prompt_tokens": -1, "total_tokens": -1 } }
LLM Query
Interact directly with the configured language model, with no collection context. Useful for general AI interactions where retrieval from your collections isn't needed.
Important Note:
For external AI providers (OpenAI, Anthropic, etc.), you must specify both the provider and model parameters, and have valid API keys configured in your settings. If you only specify a model without a provider, the system will default to using Ollama with local models.
Endpoint:
POST http://127.0.0.1:47372/api/llm
Request Body:
{ "input": "string", // Required - The prompt or question for the LLM "model": "string", // Required - Model to use (e.g., "gpt-4", "claude-2", "llama2") "provider": "string", // Required for external APIs (e.g., "openai", "anthropic") - Omit for local models "prompt": "string", // Optional - Custom system prompt "temperature": number, // Optional - Response creativity (default: 0.5) "max_completion_tokens": number, // Optional - Max response length (default: 2048) "top_p": number, // Optional - Nucleus sampling (default: 1) "frequency_penalty": number, // Optional - Repetition control (default: 0) "presence_penalty": number, // Optional - Topic diversity (default: 0) "is_ollama": boolean // Optional - Use Ollama LLM processing (default: false) "is_ooba": boolean // Optional - Use Oobabooga LLM processing (default: false) "character": string // Optional - Oobabooga character }
Example Response:
{ "id": "local-llama3.2-1735945911", "choices": [{ "finish_reason": "stop", "index": 0, "message": { "content": "Example response from RAG query", "role": "assistant" } }], "created": 1735945911, "model": "llama3.2", "object": "chat.completion", "usage": { "completion_tokens": -1, "prompt_tokens": -1, "total_tokens": -1 } }
Error Handling
The API uses standard HTTP status codes and returns detailed error messages to help you debug issues.
Common Error Codes
{ "400": "Bad Request - Invalid parameters", "401": "Unauthorized - Invalid API key", "403": "Forbidden - Insufficient permissions", "404": "Not Found - Resource doesn't exist", "409": "Conflict - SQLite constraint violation", "429": "Too Many Requests - Rate limit exceeded", "500": "Internal Server Error - Database error", "503": "Service Unavailable - SQLite database locked" }
Best Practices
Coming soon! We're working on comprehensive best practices documentation for the API.
What to Expect
Our upcoming best practices guide will cover performance optimization, security recommendations, integration patterns, and more. Check back soon!