Build:

```bash
cmake --preset linux-ninja-release && cmake --build --preset linux-ninja-release
```
This directory contains example implementations of feedback validation plugins for ThemisDB's LoRA continuous learning system.
The feedback system supports optional validation through plugins. Plugins can:
- Accept/reject/flag feedback
- Detect spam, PII, or low-quality content
- Add custom analytics data
- Modify feedback before storage
`feedback_validator.py` is a standalone Python script that demonstrates:
- Spam keyword detection
- PII detection (email, phone, SSN, credit card)
- Quality scoring
- Plugin statistics
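The checks above can be sketched roughly as follows (the keyword list, regex patterns, and thresholds here are illustrative placeholders, not the script's actual values):

```python
import re

# Illustrative values; the real script loads these from its config file
SPAM_KEYWORDS = {"buy now", "click here", "casino"}
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def check_feedback(question: str, answer: str) -> dict:
    text = f"{question} {answer}".lower()
    # Reject outright on spam keywords
    if any(kw in text for kw in SPAM_KEYWORDS):
        return {"result": "reject", "reason": "spam keyword detected"}
    # Flag (rather than reject) on possible PII for human review
    if any(p.search(text) for p in PII_PATTERNS.values()):
        return {"result": "flag", "reason": "possible PII"}
    # Crude quality score: longer, more complete feedback scores higher
    quality = min(1.0, (len(question) + len(answer)) / 200)
    return {"result": "accept", "plugin_data": {"quality_score": quality}}
```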
Usage:

```bash
# Validate feedback (reads JSON from stdin)
echo '{"question": "test", "answer": "test"}' | python3 feedback_validator.py validate

# Get statistics
python3 feedback_validator.py stats
```

Configuration (optional file: `feedback_validator_config.json`):
```json
{
  "spam_keywords": [
    "buy now",
    "click here",
    "casino"
  ]
}
```

Built-in C++ plugins:

- `NoOpFeedbackPlugin` - No validation
- `BasicSpamDetectionPlugin` - Simple keyword matching
Usage in C++:

```cpp
#include "llm/i_feedback_plugin.h"
#include "llm/feedback_store.h"

// Create plugin
auto plugin = std::make_shared<themis::llm::BasicSpamDetectionPlugin>();

// Configure
nlohmann::json config = {
    {"spam_keywords", {"spam1", "spam2", "spam3"}}
};
plugin->initialize(config);

// Set plugin on feedback store
feedback_store->setValidationPlugin(plugin);

// All feedback will now be validated
auto feedback = /* create feedback */;
auto stored = feedback_store->createFeedback(feedback);
```

- Copy `feedback_validator.py` as a template
- Modify the `validate()` method with your logic
- Deploy the script alongside ThemisDB
- Configure ThemisDB to call your script
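A script plugin of this shape is essentially a filter over stdin/stdout. A minimal skeleton (the validation logic here is a placeholder; field handling should match the script interface documented below):

```python
import json
import sys

def validate(feedback: dict) -> dict:
    # Placeholder logic: replace with your own checks; accept by default
    answer = feedback.get("answer", "")
    if not answer.strip():
        return {"result": "reject", "reason": "empty answer", "confidence": 0.9}
    return {"result": "accept", "confidence": 0.95}

def main() -> None:
    # "validate" subcommand: read one feedback JSON object, write one decision
    if len(sys.argv) > 1 and sys.argv[1] == "validate":
        feedback = json.load(sys.stdin)
        json.dump(validate(feedback), sys.stdout)

if __name__ == "__main__":
    main()
```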
Script Interface:

Input (stdin):

```json
{
  "question": "User question",
  "answer": "Model answer",
  "correction": "User correction (optional)",
  "comment": "User comment (optional)",
  "user_id": "user123",
  "adapter_id": "themis_help_lora_v2",
  "is_positive": true,
  "metadata": {}
}
```

Output (stdout):
```json
{
  "result": "accept|reject|flag|modify",
  "reason": "Optional reason",
  "confidence": 0.95,
  "plugin_data": {
    "quality_score": 0.8,
    "custom_field": "value"
  }
}
```

- Create a class that implements `themis::llm::IFeedbackPlugin`
- Implement required methods: `getName()`, `getVersion()`, `getDescription()`, `initialize(config)`, `validate(feedback)`, `shutdown()`
- Compile into a shared library or link statically
- Register plugin with FeedbackStore
Example:

```cpp
class MyCustomPlugin : public themis::llm::IFeedbackPlugin {
public:
    std::string getName() const override {
        return "my_custom_plugin";
    }

    std::string getVersion() const override {
        return "1.0.0";
    }

    std::string getDescription() const override {
        return "My custom validation logic";
    }

    bool initialize(const nlohmann::json& config) override {
        // Load configuration, models, etc.
        return true;
    }

    ValidationResponse validate(const FeedbackData& feedback) override {
        ValidationResponse response;
        // Your validation logic here
        if (/* some condition */) {
            response.result = FeedbackValidationResult::REJECT;
            response.reason = "Failed validation";
        } else {
            response.result = FeedbackValidationResult::ACCEPT;
        }
        response.confidence_score = 0.95f;
        return response;
    }

    void shutdown() override {
        // Cleanup
    }
};
```

```bash
# Test with sample feedback
cat > test_feedback.json <<EOF
{
  "question": "How do I enable sharding?",
  "answer": "Use SHARD BY clause",
  "is_positive": true,
  "user_id": "test_user",
  "adapter_id": "test_adapter"
}
EOF
python3 feedback_validator.py validate < test_feedback.json
```

Unit test (GoogleTest):

```cpp
TEST(MyPluginTest, BasicValidation) {
    auto plugin = std::make_shared<MyCustomPlugin>();
    nlohmann::json config = {/* config */};
    ASSERT_TRUE(plugin->initialize(config));

    themis::llm::FeedbackData feedback;
    feedback.question = "test";
    feedback.answer = "test";

    auto result = plugin->validate(feedback);
    EXPECT_EQ(result.result,
              themis::llm::FeedbackValidationResult::ACCEPT);

    plugin->shutdown();
}
```

Detect and reject spam feedback using ML models or keyword matching.
Detect and redact personally identifiable information (PII) such as:
- Email addresses
- Phone numbers
- Social Security Numbers
- Credit card numbers
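A redaction pass over these categories might look like this (the patterns are deliberately simplified illustrations; production PII detection needs much more robust rules):

```python
import re

# Simplified patterns for illustration only; order matters (phone before SSN)
PII_REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "[CARD]"),
]

def redact_pii(text: str) -> str:
    # Replace each detected PII span with a category placeholder
    for pattern, placeholder in PII_REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text
```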
Assign quality scores to feedback based on:
- Length and completeness
- Grammar and spelling
- Relevance to the question
- User history
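A simple heuristic combining some of these signals (the weights, divisors, and the `user_accept_rate` input are made up for illustration):

```python
def quality_score(question: str, answer: str, user_accept_rate: float = 0.5) -> float:
    """Combine simple signals into a 0..1 score (illustrative weights)."""
    # Length/completeness: saturates at 200 characters
    length_signal = min(1.0, len(answer) / 200)
    # Crude relevance proxy: word overlap between question and answer
    overlap = len(set(question.lower().split()) & set(answer.lower().split()))
    relevance_signal = min(1.0, overlap / 5)
    # user_accept_rate stands in for the "user history" signal
    score = 0.4 * length_signal + 0.3 * relevance_signal + 0.3 * user_accept_rate
    return round(score, 3)
```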
Filter inappropriate content:
- Profanity
- Hate speech
- Violent content
- NSFW material
Implement custom business logic:
- Only accept feedback from verified users
- Require minimum quality threshold
- Flag feedback for specific adapters
- Apply different rules by user role
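Rules like these compose naturally as an ordered chain of checks, first match wins. A sketch (the fields `user_verified` and `role`, the 0.3 threshold, and the adapter name are all hypothetical):

```python
def apply_rules(feedback: dict, quality: float) -> dict:
    # Hypothetical metadata fields: user_verified, role
    meta = feedback.get("metadata", {})
    if not meta.get("user_verified", False):
        return {"result": "reject", "reason": "unverified user"}
    if quality < 0.3:  # hypothetical minimum quality threshold
        return {"result": "reject", "reason": "below quality threshold"}
    if feedback.get("adapter_id") == "experimental_adapter":
        return {"result": "flag", "reason": "adapter under review"}
    if meta.get("role") == "admin":
        return {"result": "accept", "reason": "trusted role"}
    return {"result": "accept"}
```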
- Keep validation fast: Validation runs synchronously during feedback submission
- Cache ML models: Load models once in `initialize()`, not per validation
- Use async for expensive checks: Consider post-storage hooks for heavy processing
- Profile your plugin: Measure validation time and optimize bottlenecks
- Accept by default: Only reject when confident
- Provide clear reasons: Help users understand rejections
- Use confidence scores: Express uncertainty in your decisions
- Log important events: Use logging for debugging and monitoring
- Handle errors gracefully: Don't crash on unexpected input
- Test thoroughly: Test with edge cases and malformed input
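"Accept by default" and "handle errors gracefully" combine well in a wrapper around the actual check. A sketch (`inner_validate` stands in for your own logic):

```python
import logging

logger = logging.getLogger("feedback_validator")

def safe_validate(feedback: dict, inner_validate) -> dict:
    """Run inner_validate, but never let an exception reject good feedback."""
    try:
        return inner_validate(feedback)
    except Exception:
        # Fail open: accept, but report zero confidence and log for monitoring
        logger.exception("validator crashed; accepting by default")
        return {"result": "accept", "reason": "validator error", "confidence": 0.0}
```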