LLMExperiment

Conduct controlled experiments on LLM behavior

Overview

Conducts rigorous experiments to characterize LLM behaviors and biases.

  • Experimentally controlled prompts with variable substitution
  • Multiple temperature settings for comparison
  • Configurable repetitions for statistical validity
  • Custom metrics tracking (length, sentiment, patterns)
  • Statistical analysis including t-tests and variance
  • Response diversity and consistency measurement
  • Automated insight generation from results
  • Comprehensive experiment reports with visualizations
  • Concurrent execution for faster experiment completion

Use cases: Bias studies, cognitive studies, logical performance analysis, consistency testing
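
The metric, diversity, and t-test features listed above can be illustrated with a small sketch. This is not the LLMExperimentTask implementation: the function names, the word-count definition of response_length, the distinct-response ratio used for diversity, and the sample responses are all assumptions made for illustration. Welch's unequal-variance t-test is used here as one reasonable choice; the source does not say which t-test variant the task applies.

```python
from math import sqrt
from statistics import mean, variance


def response_length(response: str) -> int:
    """Hypothetical metric: number of whitespace-separated words."""
    return len(response.split())


def diversity(responses: list[str]) -> float:
    """Share of distinct responses after normalizing case and whitespace.

    1.0 means every response differs; 1/n means all n are identical.
    """
    normalized = [" ".join(r.lower().split()) for r in responses]
    return len(set(normalized)) / len(normalized)


def welch_t(a: list[float], b: list[float]) -> tuple[float, float]:
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Each group needs at least two observations, and the groups must not
    both have zero variance.
    """
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up responses for one prompt at two temperature settings.
low_temp = [
    "Paris is the capital.",
    "Paris is the capital.",
    "The capital is Paris, France.",
]
high_temp = [
    "Paris.",
    "The capital of France is Paris.",
    "It is Paris, of course, the city of light.",
]

low_lengths = [response_length(r) for r in low_temp]
high_lengths = [response_length(r) for r in high_temp]
t_stat, dof = welch_t(low_lengths, high_lengths)

print("diversity (low, high):", diversity(low_temp), diversity(high_temp))
print("Welch t:", t_stat, "df:", dof)
```

Turning the t statistic into a p-value for comparison against significance_level needs the t distribution, which the Python standard library does not provide. With SciPy installed, `scipy.stats.ttest_ind(low_lengths, high_lengths, equal_var=False)` performs the same Welch test and returns the p-value directly.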

Configuration

Parameter | Type | Default | Description
prompt_templates | List<String> | - | The base prompt templates to test (use {variable} for case-insensitive substitution)
prompt_variables | Map<String, List<String>> | - | Variables to substitute in the prompt template with their possible values
metrics | List<String> | ["response_length", "response_time"] | Specific metrics to track (e.g., response_length, sentiment, contains_keywords)
temperature_values | List<Double> | [0.1, 0.7] | List of temperature values to test
repetitions | Int | 3 | Number of times to repeat each experimental condition
statistical_analysis | Boolean | true | Whether to analyze statistical significance of results
significance_level | Double | 0.05 | Significance level for statistical tests (e.g., 0.05 for 95% confidence)
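
To show how these parameters might combine, here is a minimal sketch that expands a configuration into individual runs. It assumes a full factorial design (every template, every combination of variable values, every temperature, repeated `repetitions` times) and reads "case-insensitive substitution" as matching placeholder names without regard to case. Both readings are assumptions, and `fill` and `expand_conditions` are hypothetical helpers, not part of the LLMExperimentTask API.

```python
import re
from itertools import product


def fill(template: str, values: dict[str, str]) -> str:
    """Replace {name} placeholders, matching names case-insensitively.

    Placeholders with no matching variable are left untouched.
    """
    lookup = {k.lower(): v for k, v in values.items()}
    return re.sub(
        r"\{(\w+)\}",
        lambda m: lookup.get(m.group(1).lower(), m.group(0)),
        template,
    )


def expand_conditions(prompt_templates, prompt_variables,
                      temperature_values=(0.1, 0.7), repetitions=3):
    """Yield one dict per run; defaults mirror the documented defaults."""
    names = list(prompt_variables)
    for template in prompt_templates:
        # Cartesian product over the value lists of all variables.
        for combo in product(*(prompt_variables[n] for n in names)):
            prompt = fill(template, dict(zip(names, combo)))
            for temperature in temperature_values:
                for rep in range(repetitions):
                    yield {
                        "prompt": prompt,
                        "temperature": temperature,
                        "repetition": rep,
                    }


# Hypothetical experiment definition.
templates = ["Describe a typical {Role} from {country}."]
variables = {"role": ["nurse", "engineer"], "country": ["Japan", "Brazil"]}

runs = list(expand_conditions(templates, variables))
print(len(runs), "runs; first prompt:", runs[0]["prompt"])
```

Under these assumptions the number of runs is templates × variable combinations × temperatures × repetitions, so adding a variable value or a temperature multiplies the number of LLM calls. That is worth estimating before raising `repetitions`.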

Task Info

Class: LLMExperimentTask
Category: Social
Package: com.simiacryptus.cognotik.plan.tools.social