Plugin Development Guide
The @DevilsDev/rag-pipeline-utils toolkit is built on a fully modular plugin architecture that enables seamless extension and customization. This comprehensive guide covers plugin development, architecture patterns, contracts, and best practices for building production-ready plugins.
🏗️ Plugin Architecture Overview
The plugin system follows SOLID principles with a registry-based architecture that supports:
- Type Safety: Strict TypeScript interfaces for all plugin contracts
- Runtime Validation: Contract enforcement at registration and execution
- Hot Swapping: Dynamic plugin loading and replacement
- Dependency Injection: Clean separation of concerns
- Extensibility: Support for custom plugin types and middleware
Core Plugin Types
import { PluginRegistry } from '@DevilsDev/rag-pipeline-utils';
const registry = new PluginRegistry();
// Register plugins by type
registry.register('loader', 'my-custom-loader', new MyLoader());
registry.register('embedder', 'my-embedder', new MyEmbedder());
registry.register('retriever', 'my-retriever', new MyRetriever());
registry.register('llm', 'my-llm', new MyLLM());
registry.register('reranker', 'my-reranker', new MyReranker());
// Retrieve and use plugins
const loader = registry.get('loader', 'my-custom-loader');
const result = await loader.load('./document.pdf');
Supported Plugin Types:
loader
: Document ingestion and parsingembedder
: Text-to-vector embedding generationretriever
: Vector similarity search and retrievalllm
: Language model inference and generationreranker
: Context reordering and relevance scoring
📋 Plugin Contracts & Interfaces
1. Loader Plugin Contract
interface LoaderPlugin {
name: string;
version: string;
supportedFormats: string[];
load(filePath: string, options?: LoaderOptions): Promise<Document[]>;
loadBatch(filePaths: string[], options?: LoaderOptions): Promise<Document[]>;
validate(filePath: string): Promise<boolean>;
}
interface Document {
id: string;
content: string;
metadata: Record<string, any>;
chunks?: TextChunk[];
}
interface TextChunk {
id: string;
content: string;
startIndex: number;
endIndex: number;
metadata: Record<string, any>;
}
Example Implementation:
import { BaseLoader } from '@DevilsDev/rag-pipeline-utils';
import fs from 'fs/promises';
import path from 'path';
class CustomMarkdownLoader extends BaseLoader {
constructor() {
super();
this.name = 'custom-markdown';
this.version = '1.0.0';
this.supportedFormats = ['.md', '.markdown', '.txt'];
}
async load(filePath, options = {}) {
const content = await fs.readFile(filePath, 'utf-8');
const metadata = {
filename: path.basename(filePath),
size: content.length,
lastModified: (await fs.stat(filePath)).mtime
};
const chunks = this.chunkText(content, {
chunkSize: options.chunkSize || 1000,
overlap: options.overlap || 200
});
return [{
id: this.generateId(filePath),
content,
metadata,
chunks
}];
}
async validate(filePath) {
const ext = path.extname(filePath).toLowerCase();
return this.supportedFormats.includes(ext);
}
chunkText(text, options) {
// Custom chunking logic
const chunks = [];
const sentences = text.split(/[.!?]+/);
let currentChunk = '';
let chunkIndex = 0;
for (const sentence of sentences) {
if (currentChunk.length + sentence.length > options.chunkSize) {
if (currentChunk) {
chunks.push({
id: `chunk-${chunkIndex++}`,
content: currentChunk.trim(),
startIndex: text.indexOf(currentChunk),
endIndex: text.indexOf(currentChunk) + currentChunk.length,
metadata: { chunkIndex, wordCount: currentChunk.split(' ').length }
});
}
currentChunk = sentence;
} else {
currentChunk += sentence;
}
}
if (currentChunk) {
chunks.push({
id: `chunk-${chunkIndex}`,
content: currentChunk.trim(),
startIndex: text.lastIndexOf(currentChunk),
endIndex: text.lastIndexOf(currentChunk) + currentChunk.length,
metadata: { chunkIndex, wordCount: currentChunk.split(' ').length }
});
}
return chunks;
}
}
2. Embedder Plugin Contract
interface EmbedderPlugin {
name: string;
version: string;
dimensions: number;
maxTokens: number;
embed(text: string, options?: EmbedderOptions): Promise<number[]>;
embedBatch(texts: string[], options?: EmbedderOptions): Promise<number[][]>;
getTokenCount(text: string): number;
}
Example Implementation:
import { BaseEmbedder } from '@DevilsDev/rag-pipeline-utils';
import OpenAI from 'openai';
class CustomOpenAIEmbedder extends BaseEmbedder {
constructor(apiKey, options = {}) {
super();
this.name = 'custom-openai';
this.version = '1.0.0';
this.dimensions = 1536;
this.maxTokens = 8191;
this.client = new OpenAI({ apiKey });
this.model = options.model || 'text-embedding-ada-002';
}
async embed(text, options = {}) {
try {
const response = await this.client.embeddings.create({
model: this.model,
input: text,
encoding_format: 'float'
});
return response.data[0].embedding;
} catch (error) {
throw new Error(`Embedding failed: ${error.message}`);
}
}
async embedBatch(texts, options = {}) {
const batchSize = options.batchSize || 100;
const results = [];
for (let i = 0; i < texts.length; i += batchSize) {
const batch = texts.slice(i, i + batchSize);
const response = await this.client.embeddings.create({
model: this.model,
input: batch,
encoding_format: 'float'
});
results.push(...response.data.map(item => item.embedding));
}
return results;
}
getTokenCount(text) {
// Approximate token count (actual implementation would use tiktoken)
return Math.ceil(text.length / 4);
}
}
3. Retriever Plugin Contract
interface RetrieverPlugin {
name: string;
version: string;
retrieve(query: string, options?: RetrievalOptions): Promise<RetrievalResult[]>;
index(documents: Document[], options?: IndexOptions): Promise<void>;
delete(documentIds: string[]): Promise<void>;
getStats(): Promise<RetrievalStats>;
}
interface RetrievalResult {
id: string;
content: string;
score: number;
metadata: Record<string, any>;
}
4. LLM Plugin Contract
interface LLMPlugin {
name: string;
version: string;
maxTokens: number;
generate(prompt: string, options?: GenerationOptions): Promise<string>;
generateStream(prompt: string, options?: GenerationOptions): AsyncIterableIterator<string>;
getTokenCount(text: string): number;
}
Example Implementation:
import { BaseLLM } from '@DevilsDev/rag-pipeline-utils';
import OpenAI from 'openai';
class CustomOpenAILLM extends BaseLLM {
constructor(apiKey, options = {}) {
super();
this.name = 'custom-openai-llm';
this.version = '1.0.0';
this.maxTokens = 4096;
this.client = new OpenAI({ apiKey });
this.model = options.model || 'gpt-3.5-turbo';
}
async generate(prompt, options = {}) {
const response = await this.client.chat.completions.create({
model: this.model,
messages: [{ role: 'user', content: prompt }],
max_tokens: options.maxTokens || 1000,
temperature: options.temperature || 0.7,
top_p: options.topP || 1.0
});
return response.choices[0].message.content;
}
async* generateStream(prompt, options = {}) {
const stream = await this.client.chat.completions.create({
model: this.model,
messages: [{ role: 'user', content: prompt }],
max_tokens: options.maxTokens || 1000,
temperature: options.temperature || 0.7,
stream: true
});
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content;
if (content) {
yield content;
}
}
}
getTokenCount(text) {
// Approximate token count
return Math.ceil(text.length / 4);
}
}
🔧 Plugin Registration & Management
Dynamic Plugin Loading
import { PluginManager } from '@DevilsDev/rag-pipeline-utils';
const pluginManager = new PluginManager();
// Load plugin from file
await pluginManager.loadPlugin('./plugins/my-custom-loader.js');
// Load plugin from npm package
await pluginManager.loadPlugin('@company/rag-plugin-suite');
// Load plugin from URL
await pluginManager.loadPlugin('https://github.com/user/plugin.git');
// Register plugin instance directly
pluginManager.register('loader', 'my-loader', new MyCustomLoader());
// List available plugins
const plugins = pluginManager.listPlugins();
console.log('Available plugins:', plugins);
Plugin Metadata & Validation
// Plugin metadata export
export const metadata = {
name: 'custom-pdf-loader',
version: '2.1.0',
description: 'Advanced PDF loader with OCR support',
author: 'Your Name <email@example.com>',
license: 'MIT',
keywords: ['pdf', 'ocr', 'loader'],
dependencies: {
'pdf-parse': '^1.1.1',
'tesseract.js': '^4.0.0'
},
config: {
ocrEnabled: { type: 'boolean', default: false },
language: { type: 'string', default: 'eng' },
quality: { type: 'number', min: 1, max: 5, default: 3 }
}
};
// Plugin validation
import { validatePlugin } from '@DevilsDev/rag-pipeline-utils';
const validation = await validatePlugin(MyCustomLoader, {
strictMode: true,
checkDependencies: true,
validateContract: true
});
if (!validation.isValid) {
console.error('Plugin validation failed:', validation.errors);
}
🚀 CLI Plugin Development Tools
Plugin Scaffolding
# Create new plugin
rag-pipeline plugins create my-loader --type loader
rag-pipeline plugins create my-embedder --type embedder --template typescript
# Plugin templates available:
# - javascript (default)
# - typescript
# - minimal
# - advanced
Plugin Testing & Validation
# Validate plugin
rag-pipeline plugins validate ./my-plugin
rag-pipeline plugins validate ./my-plugin --strict
# Test plugin
rag-pipeline plugins test ./my-plugin --test-data ./test-cases.json
# Benchmark plugin
rag-pipeline plugins benchmark ./my-plugin --iterations 1000
Plugin Publishing
# Package plugin
rag-pipeline plugins package ./my-plugin --output my-plugin-v1.0.0.tgz
# Publish to npm
rag-pipeline plugins publish ./my-plugin --registry npm
# Publish to private registry
rag-pipeline plugins publish ./my-plugin --registry https://my-registry.com
⚙️ Configuration & Runtime
Plugin Configuration in .ragrc.json
{
"plugins": {
"loader": {
"name": "custom-pdf-loader",
"version": "^2.1.0",
"config": {
"ocrEnabled": true,
"language": "eng",
"quality": 4
}
},
"embedder": {
"name": "custom-openai",
"config": {
"model": "text-embedding-ada-002",
"batchSize": 100
}
},
"retriever": {
"name": "pinecone",
"config": {
"environment": "us-west1-gcp",
"indexName": "my-index"
}
},
"llm": {
"name": "openai-gpt-4",
"config": {
"temperature": 0.7,
"maxTokens": 2000
}
}
}
}
Environment-Specific Overrides
// Development environment
process.env.NODE_ENV = 'development';
const devConfig = {
plugins: {
llm: {
name: 'mock-llm', // Use mock for testing
config: { responses: ['Test response'] }
}
}
};
// Production environment
process.env.NODE_ENV = 'production';
const prodConfig = {
plugins: {
llm: {
name: 'openai-gpt-4',
config: {
temperature: 0.3, // Lower temperature for consistency
maxTokens: 1500
}
}
}
};
🎯 Best Practices
1. Error Handling
class RobustPlugin extends BaseLoader {
async load(filePath, options = {}) {
try {
// Validate inputs
if (!filePath || typeof filePath !== 'string') {
throw new Error('Invalid file path provided');
}
// Check file existence
if (!await this.fileExists(filePath)) {
throw new Error(`File not found: ${filePath}`);
}
// Implement retry logic
return await this.withRetry(() => this.processFile(filePath), {
maxRetries: 3,
backoff: 'exponential'
});
} catch (error) {
// Log error with context
this.logger.error('Load operation failed', {
filePath,
error: error.message,
stack: error.stack
});
// Re-throw with enhanced context
throw new Error(`Failed to load ${filePath}: ${error.message}`);
}
}
}
2. Performance Optimization
class OptimizedEmbedder extends BaseEmbedder {
constructor(options = {}) {
super();
this.cache = new LRUCache({ max: 1000 });
this.batchSize = options.batchSize || 50;
this.concurrency = options.concurrency || 3;
}
async embedBatch(texts, options = {}) {
// Check cache first
const uncachedTexts = texts.filter(text => !this.cache.has(text));
if (uncachedTexts.length === 0) {
return texts.map(text => this.cache.get(text));
}
// Process in parallel batches
const batches = this.createBatches(uncachedTexts, this.batchSize);
const results = await Promise.allSettled(
batches.map(batch => this.processBatch(batch))
);
// Cache results
results.forEach((result, index) => {
if (result.status === 'fulfilled') {
result.value.forEach((embedding, i) => {
this.cache.set(uncachedTexts[index * this.batchSize + i], embedding);
});
}
});
return texts.map(text => this.cache.get(text));
}
}
3. Testing & Quality Assurance
// Plugin test suite
import { PluginTester } from '@DevilsDev/rag-pipeline-utils';
const tester = new PluginTester(MyCustomLoader);
// Contract compliance tests
await tester.testContract();
// Performance benchmarks
const benchmarks = await tester.benchmark({
testData: './test-documents/',
iterations: 100,
metrics: ['latency', 'throughput', 'memory']
});
// Integration tests
await tester.testIntegration({
pipeline: myRagPipeline,
testCases: './integration-tests.json'
});
// Generate test report
const report = tester.generateReport();
console.log('Plugin test results:', report);
This comprehensive plugin development guide enables you to build production-ready extensions for @DevilsDev/rag-pipeline-utils. For usage examples, see the Usage Guide, or explore CLI Reference for plugin management commands.