SegmentedImageGenerationTask
Generates ultra-high-resolution images via recursive upscaling and semantic segmentation. Identifies key regions of interest and refines them individually to bypass standard model resolution limits.
{
  "output_file": "cityscape_ultra.png",
  "prompts": [
    "A futuristic cyberpunk city at night",
    "Detailed neon signs and flying vehicles",
    "Reflections in puddles and facial details"
  ],
  "upscale_factor": 2.0,
  "min_region_size": 128,
  "tile_overlap": 0.15,
  "retarget_subimages": true
}
Configuration Parameters
| Field | Type | Description |
|---|---|---|
| output_file* | String | The output file path for the final high-res image. |
| prompts* | List<String> | List of prompts, one for each level of detail. The first is for the base image. |
| input_file | String | Optional input file path to use as the base image instead of generating one. |
| upscale_factor | Double | Upscale factor per level (e.g., 2.0 for 2x size). Default: 2.0. |
| min_region_size | Int | Minimum width/height (in pixels) of a region to trigger refinement. Default: 128. |
| max_aspect_ratio | Double | Maximum aspect ratio for regions (e.g., 3.0 means max 3:1). Default: 3.0. |
| tile_overlap | Double | Overlap between tiles as a fraction of tile size (0.0-1.0). Default: 0.15. |
| retarget_subimages | Boolean | Whether to attempt to re-align sub-images to the base image to prevent drift. Default: true. |
| extension | String | Output image file extension (e.g., 'jpg', 'png'). Default: "png". |
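To make the documented defaults concrete, the following minimal sketch mirrors the table in a plain Kotlin data class and converts tile_overlap into pixels. `ConfigSketch` and `overlapPixels` are hypothetical names used only for illustration and are not part of the library. The 0.0-1.0 range check comes from the table; the non-empty prompts check and the rounding to the nearest pixel are assumptions.

```kotlin
import kotlin.math.roundToInt

// Hypothetical mirror of the documented fields; NOT the library's SegmentedImageGenerationConfig.
data class ConfigSketch(
    val outputFile: String,
    val prompts: List<String>,
    val inputFile: String? = null,
    val upscaleFactor: Double = 2.0,
    val minRegionSize: Int = 128,
    val maxAspectRatio: Double = 3.0,
    val tileOverlap: Double = 0.15,
    val retargetSubimages: Boolean = true,
    val extension: String = "png"
) {
    init {
        // Assumption: at least one prompt is needed, since the first one drives the base image.
        require(prompts.isNotEmpty()) { "prompts must contain at least one entry" }
        // The table documents tile_overlap as a fraction in the range 0.0-1.0.
        require(tileOverlap in 0.0..1.0) { "tile_overlap must be within 0.0-1.0" }
    }
}

// Overlap in pixels for a tile of the given size (rounding to nearest is an assumption).
fun overlapPixels(tileSize: Int, config: ConfigSketch): Int =
    (tileSize * config.tileOverlap).roundToInt()
```

With the default tile_overlap of 0.15, a 512-pixel tile would share roughly 77 pixels with its neighbour under this rounding rule.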
Task Lifecycle
1. Initialization & Base Generation
The task either loads an input_file or generates a root image using the first prompt in the prompts list. This serves as the compositional foundation.
2. Semantic Segmentation (Vision LLM)
A ParsedImageAgent analyzes the current image. It identifies rectangular regions (GenerationRegions) that contain significant detail or objects that would benefit from upscaling.
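As an illustration of how min_region_size and max_aspect_ratio could gate the regions proposed in this step, here is a minimal sketch. `RegionSketch` and `qualifiesForRefinement` are hypothetical names, not the library's GenerationRegion API. It assumes that both sides of a region must reach min_region_size and that out-of-range regions are simply skipped; the actual task may clamp or reshape them instead.

```kotlin
// Hypothetical rectangle type; the library's GenerationRegion may carry other fields.
data class RegionSketch(val x: Int, val y: Int, val width: Int, val height: Int)

fun qualifiesForRefinement(
    region: RegionSketch,
    minRegionSize: Int = 128,      // documented default
    maxAspectRatio: Double = 3.0   // documented default
): Boolean {
    // Degenerate rectangles never qualify.
    if (region.width <= 0 || region.height <= 0) return false
    val longSide = maxOf(region.width, region.height)
    val shortSide = minOf(region.width, region.height)
    // Assumption: both sides must reach min_region_size, so checking the short side suffices.
    if (shortSide < minRegionSize) return false
    // Aspect ratio is long side over short side, so 3.0 means at most 3:1.
    return longSide.toDouble() / shortSide <= maxAspectRatio
}
```

Under these assumptions a 128x384 region passes (exactly 3:1), while a 128x512 region (4:1) and a 100x400 region (short side below 128) are rejected.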
3. Recursive Refinement
For each identified region, the task crops the area, applies an Img2Img refinement using the next prompt in the sequence, and recursively repeats the process until the maximum depth (prompt count) is reached.
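The recursion described in this step can be sketched as follows. This is a simplified model under stated assumptions, not the task's implementation: images are reduced to their pixel dimensions, the `segmenter` lambda stands in for the vision-LLM segmentation, the Img2Img call is replaced by a trace entry, and `CropRect` and `refineRecursively` are hypothetical names. It also assumes that prompt 0 belongs to the base image, so depth `d` uses `prompts[d]`.

```kotlin
// Hypothetical rectangle type standing in for a GenerationRegion.
data class CropRect(val x: Int, val y: Int, val width: Int, val height: Int)

fun refineRecursively(
    width: Int,
    height: Int,
    prompts: List<String>,
    depth: Int,                                   // depth of the image passed in (0 = base image)
    upscaleFactor: Double,
    segmenter: (Int, Int) -> List<CropRect>,      // stand-in for the segmentation step
    trace: MutableList<String>                    // records each refinement instead of calling a model
) {
    val nextDepth = depth + 1
    // Stop once no prompt is left for the next level: maximum depth equals the prompt count.
    if (nextDepth >= prompts.size) return
    for (region in segmenter(width, height)) {
        // Crop the region, then upscale it by the per-level factor.
        val w = (region.width * upscaleFactor).toInt()
        val h = (region.height * upscaleFactor).toInt()
        // A real implementation would run Img2Img here with prompts[nextDepth].
        trace += "depth $nextDepth: ${w}x${h} using \"${prompts[nextDepth]}\""
        // Recurse into the refined crop.
        refineRecursively(w, h, prompts, nextDepth, upscaleFactor, segmenter, trace)
    }
}
```

With three prompts and a segmenter that returns one region per image, the sketch performs two refinements: one with the second prompt and one with the third.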
4. Intelligent Compositing
Refined regions are aligned using ImagePatchLocalization and blended back into the high-resolution canvas using a feathering algorithm to ensure seamless transitions.
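The source does not specify which feathering algorithm is used, so the following is only one common approach: a linear ramp near the patch border, combined with a per-pixel linear blend. All function names are hypothetical, and the alignment step (ImagePatchLocalization) is not modelled here.

```kotlin
// Weight of the refined patch at position `pos` along one axis of a patch of length `size`.
// Ramps linearly from the patch edge up to full strength over `feather` pixels.
fun featherWeight(pos: Int, size: Int, feather: Int): Double {
    if (feather <= 0) return 1.0                      // no feathering: hard edge
    val distanceToEdge = minOf(pos, size - 1 - pos)   // distance to the nearer border
    return ((distanceToEdge + 1).toDouble() / (feather + 1)).coerceIn(0.0, 1.0)
}

// 2D weight as the product of the per-axis weights, so corners fade fastest.
fun featherWeight2D(x: Int, y: Int, width: Int, height: Int, feather: Int): Double =
    featherWeight(x, width, feather) * featherWeight(y, height, feather)

// Linear blend of one channel value: weight 1.0 keeps the patch, 0.0 keeps the canvas.
fun blend(canvas: Double, patch: Double, weight: Double): Double =
    weight * patch + (1.0 - weight) * canvas
```

For example, with `feather = 3` on a sufficiently large patch, the weights at positions 0, 1, 2, and 3 from an edge are 0.25, 0.5, 0.75, and 1.0, so the refined patch fades in gradually rather than showing a visible seam.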
Direct Task Instantiation
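import com.simiacryptus.cognotik.plan.tools.file.SegmentedImageGenerationTask
import com.simiacryptus.cognotik.plan.tools.file.SegmentedImageGenerationTask.SegmentedImageGenerationConfig

// `config` is assumed to be an orchestration config that already exists in scope.
val task = SegmentedImageGenerationTask(
    orchestrationConfig = config,
    planTask = SegmentedImageGenerationConfig(
        output_file = "ultra_render.png",
        // One prompt per level of detail; the first describes the base image.
        prompts = listOf(
            "A lush rainforest with a hidden temple",
            "Ancient stone carvings and moss textures",
            "Individual leaves and water droplets"
        ),
        upscale_factor = 2.0
    )
)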
Embedded Execution (UnifiedHarness)
To invoke this task within an automated agentic workflow (e.g., CI/CD or CLI), use the runTask method:
import com.simiacryptus.cognotik.plan.tools.file.SegmentedImageGenerationTask.Companion.SegmentedImageGeneration
import com.simiacryptus.cognotik.plan.tools.file.SegmentedImageGenerationTask.SegmentedImageGenerationConfig
import java.io.File

// `harness` is assumed to be an existing UnifiedHarness instance.
harness.runTask(
    taskType = SegmentedImageGeneration,
    typeConfig = TaskTypeConfig(), // Default static settings
    executionConfig = SegmentedImageGenerationConfig(
        output_file = "high_res_render.png",
        // One prompt per level of detail; the first describes the base image.
        prompts = listOf(
            "A futuristic space station orbiting a gas giant",
            "Detailed docking bays and solar panels",
            "Individual rivets and glowing status lights"
        ),
        upscale_factor = 2.0,
        retarget_subimages = true
    ),
    workspace = File("./output"),
    autoFix = true
)
Prompt Segment
This text is injected into the LLM orchestrator to describe the task's capabilities:
SegmentedImageGeneration - Generates ultra-high-resolution images via recursive upscaling and semantic segmentation
* Use for: Creating complex scenes where specific objects need high detail.
* Mechanism: Generates a base image, uses AI to identify regions needing detail, upscales them, and refines recursively.