SeleniumSessionTask
Automate browser interactions using stateful Selenium WebDriver sessions. Execute JavaScript, manage session lifecycles, and extract token-optimized HTML content for LLM processing.
Stateful
Headless Chrome
Side-Effect: External
ExecutionConfig.json
JSON
{
"url": "https://news.ycombinator.com",
"commands": [
"return document.title;",
"return Array.from(document.querySelectorAll('.titleline > a')).map(a => a.innerText).slice(0, 3);"
],
"simplifyStructure": true,
"createTranscript": true,
"includeCssData": false,
"keepObjectIds": false,
"preserveWhitespace": false
}
→
Session Output
Markdown/UI
# Command 1 Result:
"Hacker News"
# Command 2 Result:
[
"Show HN: Cognotik Design System",
"The future of browser automation",
"Why density matters in UI"
]
✔ Transcript generated: workspaces/session_transcript.md
Live Results Showcase
Explore actual artifacts generated by this task, including session transcripts and scrubbed HTML snapshots.
Execution Configuration
| Field | Type | Description |
|---|---|---|
| `url` | `String` | The URL to navigate to. Required if `sessionId` is null. |
| `commands` | `List<String>` | Sequential JavaScript commands to execute via `executeScript`. Can be async. |
| `sessionId` | `String?` | ID for reusing an existing stateful session. Required if `url` is blank. |
| `timeout` | `Long` | Command timeout in milliseconds. Default: 30000 (30s). |
| `closeSession` | `Boolean` | If true, terminates the driver after execution even if a `sessionId` was provided. |
| `includeCssData` | `Boolean?` | Include CSS data (styles, classes) in page source. Default: false. |
| `simplifyStructure` | `Boolean` | Collapses nested HTML and removes noise to reduce token usage. Default: true. |
| `keepObjectIds` | `Boolean` | Whether to keep object IDs in the HTML output. Default: false. |
| `preserveWhitespace` | `Boolean` | Whether to preserve whitespace in text nodes. Default: false. |
| `createTranscript` | `Boolean` | Generates a detailed Markdown log of the session. |
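The `url`/`sessionId` rule and the documented defaults can be checked before a config is submitted. The following is a minimal sketch, not the task's own validation code: `normalizeConfig` and `DEFAULTS` are hypothetical names, and `"session-123"` is a placeholder ID. Fields whose defaults the table does not state (`closeSession`, `createTranscript`) are left untouched.

```javascript
// Hypothetical helper mirroring the rules in the table above.
// Only defaults that the table states explicitly are filled in.
const DEFAULTS = {
  timeout: 30000,            // milliseconds
  includeCssData: false,
  simplifyStructure: true,
  keepObjectIds: false,
  preserveWhitespace: false,
};

function isNonBlank(value) {
  return typeof value === "string" && value.trim() !== "";
}

function normalizeConfig(config) {
  // url is required when sessionId is absent, and vice versa.
  if (!isNonBlank(config.url) && !isNonBlank(config.sessionId)) {
    throw new Error("Either url or sessionId must be provided");
  }
  if (config.commands !== undefined && !Array.isArray(config.commands)) {
    throw new Error("commands must be a list of JavaScript strings");
  }
  // Caller-supplied values override the defaults.
  return { ...DEFAULTS, commands: [], ...config };
}

// A fresh navigation, followed by a step that reuses a session and closes it.
const firstStep = normalizeConfig({
  url: "https://news.ycombinator.com",
  commands: ["return document.title;"],
});
const followUp = normalizeConfig({
  sessionId: "session-123",  // placeholder ID, assumed to exist already
  commands: ["return document.title;"],
  closeSession: true,
});
```

Rejecting an invalid config locally keeps an orchestration loop from spending a task invocation on a request that has neither a URL nor a session to reuse.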
Lifecycle & Management
The SeleniumSessionTask manages a pool of headless Chrome instances with a focus on stability and resource efficiency:
- Session Pooling: Limits active sessions to 10 to prevent resource exhaustion.
- Auto-Cleanup: Automatically removes inactive or crashed sessions during the initialization of new tasks.
- Token Optimization: Uses HtmlSimplifier to scrub scripts, styles, and redundant attributes, significantly reducing the context window footprint.
- Observability: Integrates with Chrome DevTools to capture network requests and console logs, which are mirrored to the system logs.
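The pooling and auto-cleanup rules above can be modelled in a few lines. This is an illustrative in-memory sketch only: `SessionPool`, `acquire`, and `isAlive` are hypothetical names, and throwing when the limit is reached is an assumption, since the text does not say how the real task reacts at the cap.

```javascript
// Hypothetical model of the pooling rules; not the task's real implementation.
class SessionPool {
  constructor(maxSessions = 10) {
    this.maxSessions = maxSessions;
    this.sessions = new Map(); // sessionId -> { isAlive: () => boolean }
  }

  // Drop sessions whose driver is no longer usable.
  cleanup() {
    for (const [id, session] of this.sessions) {
      if (!session.isAlive()) this.sessions.delete(id);
    }
  }

  // Cleanup runs before every acquisition, so dead sessions
  // never count against the limit.
  acquire(id, createSession) {
    this.cleanup();
    if (this.sessions.has(id)) return this.sessions.get(id);
    if (this.sessions.size >= this.maxSessions) {
      // Assumption: fail fast rather than evict a live session.
      throw new Error(`Session limit of ${this.maxSessions} reached`);
    }
    const session = createSession();
    this.sessions.set(id, session);
    return session;
  }
}
```

Running cleanup at acquisition time, rather than on a timer, matches the description of sessions being removed "during the initialization of new tasks".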
Prompting & Examples
When using this task in an orchestration loop, the following JavaScript patterns are recommended:
// Get all links on a page
"return Array.from(document.querySelectorAll('a')).map(a => a.href);"
// Click an element (the script itself returns immediately and does not wait for navigation)
"document.querySelector('#submit-btn').click();"
// Async operations
"return new Promise(r => setTimeout(() => r(document.title), 2000));"
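Because commands may return a Promise, a click can be followed by a command that polls until an expected element appears, instead of sleeping for a fixed interval. The sketch below builds such a command string; `waitForSelectorCommand` is a hypothetical helper and `#result` is a placeholder selector. It assumes the chosen `timeoutMs` stays below the task's own `timeout`.

```javascript
// Hypothetical helper: returns a command string that resolves with the
// element's text once `selector` matches, or rejects after `timeoutMs`.
function waitForSelectorCommand(selector, timeoutMs = 5000, intervalMs = 100) {
  // JSON.stringify yields a safely quoted JavaScript string literal.
  const sel = JSON.stringify(selector);
  return `return new Promise((resolve, reject) => {
    const deadline = Date.now() + ${timeoutMs};
    (function poll() {
      const el = document.querySelector(${sel});
      if (el) return resolve(el.innerText);
      if (Date.now() > deadline) {
        return reject(new Error("Timed out waiting for " + ${sel}));
      }
      setTimeout(poll, ${intervalMs});
    })();
  });`;
}

// Click, then wait for the placeholder element to appear.
const clickThenWait = [
  "document.querySelector('#submit-btn').click();",
  waitForSelectorCommand("#result"),
];
```

Polling returns as soon as the element exists, so the command finishes faster than a fixed delay on quick pages and still fails with a clear error on slow ones.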
Kotlin Boilerplate
Invoke this task directly using the UnifiedHarness in an embedded environment:
import com.simiacryptus.cognotik.plan.tools.session.SeleniumSessionTask.Companion.SeleniumSession
import com.simiacryptus.cognotik.plan.tools.session.SeleniumSessionTask.SeleniumSessionTaskExecutionConfigData
import java.io.File
// 1. Define the Job
val executionConfig = SeleniumSessionTaskExecutionConfigData(
url = "https://news.ycombinator.com",
commands = listOf(
"return document.title;",
"return Array.from(document.querySelectorAll('.titleline > a')).map(a => a.innerText);"
),
createTranscript = true,
simplifyStructure = true
)
// 2. Run via Harness
harness.runTask(
taskType = SeleniumSession,
typeConfig = TaskTypeConfig(name = "WebScraper"),
executionConfig = executionConfig,
workspace = File("./scrapes"),
autoFix = true
)