Cognotik | SegmentedImageGenerationTask

SegmentedImageGenerationConfig.json (JSON)

{
  "output_file": "cityscape_ultra.png",
  "prompts": [
    "A futuristic cyberpunk city at night",
    "Detailed neon signs and flying vehicles",
    "Reflections in puddles and facial details"
  ],
  "upscale_factor": 2.0,
  "min_region_size": 128,
  "tile_overlap": 0.15,
  "retarget_subimages": true
}

→

Session UI / Generated Assets (visual output)

Depth 0: Segmentation Map

Refined Region: Neon Sign

Final: 4096 x 4096

✔ Generated ultra-high-resolution image saved to cityscape_ultra.png

Configuration Parameters

Field (* = required)	Type	Description
`output_file`*	String	The output file path for the final high-res image.
`prompts`*	List<String>	List of prompts, one for each level of detail. The first is for the base image.
`input_file`	String	Optional input file path to use as the base image instead of generating one.
`upscale_factor`	Double	Upscale factor per level (e.g., 2.0 for 2x size). Default: `2.0`.
`min_region_size`	Int	Minimum width/height (in pixels) of a region to trigger refinement. Default: `128`.
`max_aspect_ratio`	Double	Maximum aspect ratio for regions (e.g., 3.0 means max 3:1). Default: `3.0`.
`tile_overlap`	Double	Overlap between tiles as a fraction of tile size (0.0-1.0). Default: `0.15`.
`retarget_subimages`	Boolean	Whether to attempt to re-align sub-images to the base image to prevent drift. Default: `true`.
`extension`	String	Output image file extension (e.g., 'jpg', 'png'). Default: `"png"`.
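
The defaults and ranges in the table can be mirrored in a small data class with a sanity check. The sketch below is illustrative only: `SegmentedConfigSketch` and `validate` are hypothetical names rather than the library's own classes, and the rule that `upscale_factor` must exceed 1.0 is an assumption, not something the table states.

```kotlin
// Hypothetical mirror of the documented fields; NOT the library's own class.
data class SegmentedConfigSketch(
    val outputFile: String,
    val prompts: List<String>,
    val inputFile: String? = null,
    val upscaleFactor: Double = 2.0,
    val minRegionSize: Int = 128,
    val maxAspectRatio: Double = 3.0,
    val tileOverlap: Double = 0.15,
    val retargetSubimages: Boolean = true,
    val extension: String = "png",
)

// Returns human-readable problems; an empty list means the values look sane.
fun validate(cfg: SegmentedConfigSketch): List<String> = buildList {
    if (cfg.outputFile.isBlank()) add("output_file is required")
    if (cfg.prompts.isEmpty()) add("prompts needs at least one entry (the base prompt)")
    // Assumption: a factor of 1.0 or less would never add resolution.
    if (cfg.upscaleFactor <= 1.0) add("upscale_factor should be greater than 1.0")
    if (cfg.minRegionSize <= 0) add("min_region_size must be positive")
    if (cfg.maxAspectRatio < 1.0) add("max_aspect_ratio must be at least 1.0")
    if (cfg.tileOverlap !in 0.0..1.0) add("tile_overlap must lie in 0.0-1.0")
}
```

Calling `validate(SegmentedConfigSketch("out.png", listOf("base prompt")))` returns an empty list, because every optional field falls back to its documented default.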

Task Lifecycle

1. Initialization & Base Generation

The task either loads the image at `input_file` or generates a root image from the first entry in `prompts`. This image is the compositional foundation for every later refinement level.

2. Semantic Segmentation (Vision LLM)

A ParsedImageAgent analyzes the current image. It identifies rectangular regions (GenerationRegions) that contain significant detail or objects that would benefit from upscaling.
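
One plausible reading of how `min_region_size` and `max_aspect_ratio` gate the detected regions is sketched below. `RegionSketch` and `isRefinable` are hypothetical names, and the sketch assumes that a region must meet the minimum in both width and height and that overly elongated regions are simply skipped; the actual task may handle such regions differently (for example by splitting them).

```kotlin
// Hypothetical stand-in for the rectangular regions reported by the vision model.
data class RegionSketch(val x: Int, val y: Int, val width: Int, val height: Int) {
    // Long side divided by short side, so the ratio is always >= 1.0.
    val aspectRatio: Double
        get() = maxOf(width, height).toDouble() / minOf(width, height)
}

// Assumption: both dimensions must reach minRegionSize, and the
// aspect ratio must not exceed maxAspectRatio.
fun isRefinable(
    r: RegionSketch,
    minRegionSize: Int = 128,
    maxAspectRatio: Double = 3.0,
): Boolean =
    r.width >= minRegionSize &&
        r.height >= minRegionSize &&
        r.aspectRatio <= maxAspectRatio

fun main() {
    val candidates = listOf(
        RegionSketch(0, 0, 512, 256),    // 2:1 and large enough
        RegionSketch(600, 40, 96, 300),  // too narrow (96 < 128)
        RegionSketch(0, 700, 1000, 200), // 5:1, too elongated
    )
    // Only the first candidate passes both checks.
    println(candidates.filter { isRefinable(it) })
}
```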

3. Recursive Refinement

For each identified region, the task crops the area, applies an Img2Img refinement using the next prompt in the sequence, and recursively repeats the process until the maximum depth (prompt count) is reached.
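
The recursion can be pictured with a generic skeleton in which the prompt list bounds the depth. This is a sketch under stated assumptions, not the task's implementation: `refineRecursively` and `finalEdge` are hypothetical, the `segment` and `refine` parameters stand in for the vision-model and Img2Img calls, and `finalEdge` assumes that each prompt after the first multiplies the edge length by `upscale_factor`.

```kotlin
import kotlin.math.pow
import kotlin.math.roundToInt

// Hypothetical helper: edge length after all refinement levels, assuming
// every prompt after the first scales the image by upscaleFactor.
fun finalEdge(baseEdge: Int, upscaleFactor: Double, promptCount: Int): Int {
    require(promptCount >= 1) { "need at least the base prompt" }
    return (baseEdge * upscaleFactor.pow(promptCount - 1)).roundToInt()
}

// Generic recursion skeleton. `segment` and `refine` are placeholders for the
// segmentation and Img2Img steps; `visit` observes every image produced.
fun <T> refineRecursively(
    image: T,
    prompts: List<String>,
    depth: Int,
    segment: (T) -> List<T>,
    refine: (T, String) -> T,
    visit: (depth: Int, image: T) -> Unit,
) {
    visit(depth, image)
    // No prompt left for the next level: maximum depth reached.
    val nextPrompt = prompts.getOrNull(depth + 1) ?: return
    for (crop in segment(image)) {
        val refined = refine(crop, nextPrompt)
        refineRecursively(refined, prompts, depth + 1, segment, refine, visit)
    }
}
```

With a base edge of 1024 (an assumed value; the base size is not stated here), three prompts and a factor of 2.0 give `finalEdge(1024, 2.0, 3) == 4096`, which would match the 4096 x 4096 figure in the sample output.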

4. Intelligent Compositing

Refined regions are aligned using ImagePatchLocalization and blended back into the high-resolution canvas using a feathering algorithm to ensure seamless transitions.
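
Feathered blending can be illustrated with a linear ramp near the patch border. The sketch below is a generic technique, not the task's actual algorithm: `featherWeight` and `blendPatch` are hypothetical names, images are simplified to grayscale `Array<DoubleArray>`, and the alignment step (ImagePatchLocalization) is not covered. Deriving `featherPx` from `tile_overlap` times the tile size would be one plausible choice, but that link is an assumption.

```kotlin
// Blend weight for a pixel `distToEdge` pixels inside the patch border:
// 0.0 on the border itself, rising linearly to 1.0 at `featherPx` pixels inside.
fun featherWeight(distToEdge: Int, featherPx: Int): Double =
    if (featherPx <= 0) 1.0
    else (distToEdge.toDouble() / featherPx).coerceIn(0.0, 1.0)

// Pastes `patch` into `canvas` at (offsetX, offsetY), mixing the two near the
// patch border. The caller must ensure the patch fits inside the canvas.
fun blendPatch(
    canvas: Array<DoubleArray>,
    patch: Array<DoubleArray>,
    offsetX: Int,
    offsetY: Int,
    featherPx: Int,
) {
    if (patch.isEmpty()) return
    val h = patch.size
    val w = patch[0].size
    for (y in 0 until h) {
        for (x in 0 until w) {
            // Distance to the nearest of the four patch edges.
            val dist = minOf(minOf(x, w - 1 - x), minOf(y, h - 1 - y))
            val a = featherWeight(dist, featherPx)
            val cy = offsetY + y
            val cx = offsetX + x
            canvas[cy][cx] = a * patch[y][x] + (1.0 - a) * canvas[cy][cx]
        }
    }
}
```

Because the weight is 0.0 on the outermost ring of the patch, the canvas is left unchanged there and the refined content takes over gradually toward the interior, which avoids a visible seam.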

Direct Task Instantiation

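// `config` is assumed to be an orchestration config created elsewhere (not shown).
val task = SegmentedImageGenerationTask(
    orchestrationConfig = config,
    planTask = SegmentedImageGenerationConfig(
        output_file = "ultra_render.png",
        // One prompt per level of detail; the first describes the base image.
        prompts = listOf(
            "A lush rainforest with a hidden temple",
            "Ancient stone carvings and moss textures",
            "Individual leaves and water droplets"
        ),
        // Upscale factor applied per level (2.0 = 2x size).
        upscale_factor = 2.0
    )
)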

Embedded Execution (UnifiedHarness)

To invoke this task within an automated agentic workflow (e.g., CI/CD or CLI), use the runTask method:

import com.simiacryptus.cognotik.plan.tools.file.SegmentedImageGenerationTask.Companion.SegmentedImageGeneration
import com.simiacryptus.cognotik.plan.tools.file.SegmentedImageGenerationTask.SegmentedImageGenerationConfig
import java.io.File

// `harness` is assumed to be an already constructed UnifiedHarness (setup not shown).
harness.runTask(
    taskType = SegmentedImageGeneration,
    typeConfig = TaskTypeConfig(), // Default static settings
    executionConfig = SegmentedImageGenerationConfig(
        output_file = "high_res_render.png",
        prompts = listOf(
            "A futuristic space station orbiting a gas giant",
            "Detailed docking bays and solar panels",
            "Individual rivets and glowing status lights"
        ),
        upscale_factor = 2.0,
        retarget_subimages = true
    ),
    workspace = File("./output"),
    autoFix = true
)

Prompt Segment

This text is injected into the LLM orchestrator to describe the task's capabilities:

SegmentedImageGeneration - Generates ultra-high-resolution images via recursive upscaling and semantic segmentation
* Use for: Creating complex scenes where specific objects need high detail.
* Mechanism: Generates a base image, uses AI to identify regions needing detail, upscales them, and refines recursively.