feat: add Yutori integration (browsing agent, research, scouts) #20384
deviparikh wants to merge 22 commits into PipedreamHQ:master
Adds Yutori to the Pipedream registry: 11 actions covering browsing, research, and full scout lifecycle management, plus a polling trigger that fires when any scout produces new findings.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Thank you so much for submitting this! We've added it to our backlog to review, and our team has been notified. |
|
📝 Walkthrough
Adds a Yutori integration: an authenticated app client, 11 actions for browsing/research tasks and scout CRUD/management, a polling source emitting scout updates with stateful timestamping and pagination, a test event fixture, and a package.json dependency update.
Sequence Diagrams
sequenceDiagram
participant Workflow as Workflow/User
participant Action as Action (step)
participant App as Yutori App
participant API as Yutori API
Workflow->>Action: invoke action (e.g., create-scout / run-research-task)
Action->>App: call method (createScout/createResearchTask/createBrowsingTask)
App->>API: HTTP POST /scouts or /research_tasks or /browsing_tasks
API-->>App: returns resource
App-->>Action: result
Action->>Workflow: return result and export "$summary"
sequenceDiagram
participant Source as Polling Source
participant DB as State DB
participant App as Yutori App
participant API as Yutori API
participant Sink as Event Sink
Source->>DB: _getLastTimestamp()
DB-->>Source: lastTimestamp or null
Source->>App: getUpdates(since=sinceTimestamp / cursor)
App->>API: GET /updates?since=... or ?cursor=...
API-->>App: {updates[], cursor}
loop paginate (up to MAX_PAGES)
App->>API: GET /updates?cursor=cursor
API-->>App: {updates[], cursor}
end
Source->>Sink: emit events (oldest first)
Source->>DB: _setLastTimestamp(newTs) [only if not truncated]
DB-->>Source: ok
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/yutori/actions/create-scout/create-scout.mjs`:
- Around line 49-55: The "Flat" option currently shows label "Flat" while its
value is "zapier", which can confuse users; update the options array entry that
has value "zapier" (the object with label "Flat" and value "zapier" inside the
options list) to a clearer label such as "Flat (Zapier‑compatible)" so the UI
and payload value are unambiguous; similarly, if needed, adjust the description
string (the description property near default "scout") to mention "Flat
(Zapier‑compatible)" so all references match.
In `@components/yutori/actions/list-scouts/list-scouts.mjs`:
- Around line 26-28: The condition in the run method always evaluates true
because pageSize has a default (50); remove the conditional and always set
params.page_size to this.pageSize (in the async run({ $ }) function) so
params.page_size is consistently populated — update the params construction
around pageSize instead of using if (this.pageSize).
In `@components/yutori/actions/mark-scout-done/mark-scout-done.mjs`:
- Around line 9-13: The annotations object in mark-scout-done.mjs currently sets
destructiveHint: false but this action permanently halts a running scout; change
annotations.destructiveHint to true to reflect that behavior (locate the
annotations block in the mark-scout-done module), update any related
comments/tests/docs referencing the old value, and run the action's consumer
checks to ensure downstream UI/AI agents pick up the new hint.
In `@components/yutori/actions/run-browsing-task/run-browsing-task.mjs`:
- Around line 40-45: The condition in the run method that checks "if
(this.maxSteps)" is redundant because maxSteps has a default (50) and is always
truthy; remove the conditional and always set payload.max_steps from
this.maxSteps when building the payload in async run({ $ }) (i.e.,
unconditionally assign max_steps alongside task and start_url), updating the
payload creation in run to include this.maxSteps directly.
In `@components/yutori/sources/new-scout-update/new-scout-update.mjs`:
- Around line 35-37: The _getLastTimestamp() method currently returns
this.db.get("lastTimestamp") || null which treats 0 as missing; change it to use
the nullish coalescing operator (this.db.get("lastTimestamp") ?? null) so zero
is preserved, and anywhere you check the retrieved value (the later check around
the sinceTimestamp handling at lines referenced) replace the falsy check if
(!sinceTimestamp) with an explicit null/undefined check (e.g., if
(sinceTimestamp == null) or if (sinceTimestamp === null)) so numeric 0 is not
treated as absent.
- Around line 57-67: The polling loop that calls this.yutori.getUpdates trusts
response.next_cursor and can loop forever if the API repeats or never clears the
cursor; modify the loop in new-scout-update.mjs to (1) track seen cursors (e.g.,
a Set seenCursors) and break if a cursor repeats, (2) enforce a hard page limit
(e.g., const MAX_PAGES = 100) and break when reached, and (3) also break if the
returned page is empty; apply these checks around the existing cursor handling
and updates.push(...) so getUpdates, cursor, updates, sinceTimestamp and now are
used unchanged but the loop will terminate safely.
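The two points above can be sketched as follows. This is a minimal illustration, assuming `getUpdates` resolves to `{ updates, next_cursor }` as the review describes; `readLastTimestamp`, `fetchAllUpdates`, the `db` and `client` parameters, and the `MAX_PAGES` value are hypothetical stand-ins, not the PR's actual code.

```javascript
// Hypothetical sketch: helper names and the MAX_PAGES value are assumptions.
const MAX_PAGES = 100;

// `??` falls back only on null/undefined, so a stored 0 is preserved.
function readLastTimestamp(db) {
  return db.get("lastTimestamp") ?? null;
}

// Follows next_cursor until the API stops returning one, a cursor repeats,
// a page comes back empty, or MAX_PAGES requests have been made.
async function fetchAllUpdates(client, sinceTimestamp) {
  const updates = [];
  const seenCursors = new Set();
  let cursor = null;
  let pages = 0;

  while (pages < MAX_PAGES) {
    // First request filters by timestamp; later requests follow the cursor.
    const params = cursor ? { cursor } : { since: sinceTimestamp };
    const response = await client.getUpdates(params);
    pages++;

    const page = response?.updates ?? [];
    if (page.length === 0) break;
    updates.push(...page);

    cursor = response?.next_cursor ?? null;
    if (cursor === null || seenCursors.has(cursor)) break;
    seenCursors.add(cursor);
  }
  return updates;
}
```

Tracking seen cursors bounds the loop even when the API keeps echoing the same cursor, while the page cap covers the case of an endless chain of distinct cursors.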
In `@components/yutori/yutori.app.mjs`:
- Around line 57-61: Path parameters like taskId and scoutId are interpolated
raw into endpoint paths (e.g., in getBrowsingTask and other methods that call
this._request with paths like `/browsing/tasks/${taskId}` or
`/browsing/scouts/${scoutId}`), which can break routing if they contain reserved
characters; update all such callers to URL-encode dynamic segments using
encodeURIComponent before constructing the path so the request path is safe and
correct (locate every method that builds a path with `${taskId}` or `${scoutId}`
and replace the raw interpolation with the encoded value).
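A minimal sketch of the encoding fix described above. `browsingTaskPath` is a hypothetical helper introduced only for illustration; the path shape is taken from the review comment, and the real methods pass the resulting path on to `this._request`.

```javascript
// Hypothetical helper: percent-encodes the dynamic segment before it is
// placed in the path, so reserved characters cannot alter the route.
function browsingTaskPath(taskId) {
  return `/browsing/tasks/${encodeURIComponent(String(taskId))}`;
}

// A well-formed ID passes through unchanged.
const plainPath = browsingTaskPath("abc123");

// Reserved characters such as "/", "?" and "#" are escaped.
const trickyPath = browsingTaskPath("a/b?c#d");

console.log(plainPath, trickyPath);
```

Encoding at the point where the path is built keeps every caller safe without requiring each action to sanitize its own input.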
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6abf41cb-d89a-44fc-8d21-5ebd6263ab55
⛔ Files ignored due to path filters (1)
components/yutori/yutori.svg is excluded by !**/*.svg
📒 Files selected for processing (15)
components/yutori/actions/create-scout/create-scout.mjs
components/yutori/actions/delete-scout/delete-scout.mjs
components/yutori/actions/get-browsing-task-result/get-browsing-task-result.mjs
components/yutori/actions/get-research-task-result/get-research-task-result.mjs
components/yutori/actions/get-scout-updates/get-scout-updates.mjs
components/yutori/actions/get-scout/get-scout.mjs
components/yutori/actions/list-scouts/list-scouts.mjs
components/yutori/actions/mark-scout-done/mark-scout-done.mjs
components/yutori/actions/restart-scout/restart-scout.mjs
components/yutori/actions/run-browsing-task/run-browsing-task.mjs
components/yutori/actions/run-research-task/run-research-task.mjs
components/yutori/package.json
components/yutori/sources/new-scout-update/new-scout-update.mjs
components/yutori/sources/new-scout-update/test-event.mjs
components/yutori/yutori.app.mjs
♻️ Duplicate comments (1)
components/yutori/sources/new-scout-update/new-scout-update.mjs (1)
60-73: 🧹 Nitpick | 🔵 Trivial
Consider breaking on empty page for efficiency.
The pagination loop correctly guards against infinite loops with `MAX_PAGES` and `seenCursors`. However, if the API returns an empty page with a cursor, the loop will continue fetching until hitting `MAX_PAGES`. Breaking early on empty pages could reduce unnecessary API calls.
♻️ Optional optimization
  const page = response?.updates ?? [];
+ if (page.length === 0) break;
  updates.push(...page);
  cursor = response?.next_cursor ?? null;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/yutori/sources/new-scout-update/new-scout-update.mjs` around lines 60 - 73, The pagination loop in the do/while that calls this.yutori.getUpdates can continue fetching when the API returns an empty page with a non-null cursor; after retrieving page (const page = response?.updates ?? []), if page.length === 0 break the loop to avoid unnecessary API calls, while preserving existing guards (pages, MAX_PAGES, seenCursors and cursor handling) and still pushing non-empty pages into updates; ensure this check occurs before setting cursor = response?.next_cursor to keep seenCursors logic consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: d4017b7c-be19-425e-ab38-0d52904bdf3c
📒 Files selected for processing (5)
components/yutori/actions/create-scout/create-scout.mjs
components/yutori/actions/list-scouts/list-scouts.mjs
components/yutori/actions/run-browsing-task/run-browsing-task.mjs
components/yutori/sources/new-scout-update/new-scout-update.mjs
components/yutori/yutori.app.mjs
|
Hi, please submit an app integration request in our issues (https://github.com/PipedreamHQ/pipedream/issues). The app needs to be integrated into the platform with proper authentication mechanisms before any components can be reviewed and merged. |
|
Thanks for the heads up! We've submitted the app integration request here: #20387. Happy to provide any additional information needed to get the authentication set up. We'll keep this PR open and ready to go once the app is integrated. |
|
Resolves #20387 |
GTFalcao left a comment
Hi @deviparikh, a few comments:
- there are conflicts to be resolved since the app was integrated after this PR was submitted;
- if possible, please run `npx pnpm install` and `npx eslint --fix components/yutori`, as these are needed for the PR's automated checks to succeed;
- the logo image file can be safely removed from the PR, as it has already been added to the integrated app.
- Resolve add/add conflict in package.json (take version 0.0.1 from upstream, keep our dependencies block, merge keywords)
- Resolve add/add conflict in yutori.app.mjs (keep our full implementation, discard upstream placeholder stub)
- Remove yutori.svg (logo already added by Pipedream team)
- Fix source description to start with "Emit new" per Pipedream linting guidelines
|
Thanks for the thorough review @GTFalcao! Addressed all three items in 90ba9c6:
|
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/yuroti/yuroti.app.mjs (1)
1-11: ⚠️ Potential issue | 🔴 Critical
Critical: The "yuroti" directory and files should not exist - use "yutori" instead.
The codebase currently contains two directories: `components/yuroti/` (incorrect spelling) and `components/yutori/` (correct spelling with full implementation). The Yutori platform documentation confirms the correct spelling is "Yutori" (https://yutori.com, https://docs.yutori.com).
Remove the `components/yuroti/` directory and all its files. All integration code should use the `components/yutori/` directory and naming throughout (directory name, filenames, package.json, and app ID).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/yuroti/yuroti.app.mjs` around lines 1 - 11, This file and directory use the incorrect "yuroti" spelling; delete the entire components/yuroti/ directory and its files, rename or recreate them under components/yutori/, and update the exported app identifier and any references from "yuroti" to "yutori" (e.g., in components/yuroti/yuroti.app.mjs change export default { app: "yuroti", ... } to use "yutori" and move the file to components/yutori/yutori.app.mjs), then search the repo for any remaining "yuroti" occurrences (filenames, package.json, imports, app IDs, and tests) and replace them with "yutori" to ensure all integration code uses the correct spelling.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/yutori/sources/new-scout-update/new-scout-update.mjs`:
- Line 61: When pagination stops because MAX_PAGES was hit, do not advance
lastTimestamp (which would drop unprocessed older updates); instead only update
lastTimestamp = updates[0].timestamp + 1 when pagination completed normally (no
next_cursor) or when you did not hit the MAX_PAGES limit. Change the two places
that set lastTimestamp (the block using pages/next_cursor around pages++ and the
later block at lines 89-93) to check whether pagination was truncated (e.g.,
pagesReached = pages >= MAX_PAGES && next_cursor) and skip advancing
lastTimestamp when pagesReached is true; otherwise advance as before.
---
Outside diff comments:
In `@components/yuroti/yuroti.app.mjs`:
- Around line 1-11: This file and directory use the incorrect "yuroti" spelling;
delete the entire components/yuroti/ directory and its files, rename or recreate
them under components/yutori/, and update the exported app identifier and any
references from "yuroti" to "yutori" (e.g., in components/yuroti/yuroti.app.mjs
change export default { app: "yuroti", ... } to use "yutori" and move the file
to components/yutori/yutori.app.mjs), then search the repo for any remaining
"yuroti" occurrences (filenames, package.json, imports, app IDs, and tests) and
replace them with "yutori" to ensure all integration code uses the correct
spelling.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b398ec4d-2642-4b53-b527-7f092a6ddc6c
📒 Files selected for processing (7)
components/awardco/awardco.app.mjs
components/beehiv/beehiv.app.mjs
components/certs365/certs365.app.mjs
components/sunshine_conversations/sunshine_conversations.app.mjs
components/yuroti/yuroti.app.mjs
components/yutori/package.json
components/yutori/sources/new-scout-update/new-scout-update.mjs
If MAX_PAGES is reached while next_cursor is still set, leave lastTimestamp unchanged so the next poll retries from the same window instead of silently dropping unprocessed updates.
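The rule in this commit message can be sketched as two small functions. This is an illustration under stated assumptions: `isTruncated` and `nextLastTimestamp` are hypothetical names, the `MAX_PAGES` value is a guess, and timestamps are assumed to be numeric here.

```javascript
// Hypothetical sketch: names and the MAX_PAGES value are assumptions.
const MAX_PAGES = 100;

// Pagination was cut short if the page cap was hit while a cursor remained.
function isTruncated(pages, nextCursor) {
  return pages >= MAX_PAGES && nextCursor != null;
}

// Keep the previous timestamp when the poll was truncated or empty, so the
// next poll retries the same window; otherwise move just past the newest update.
function nextLastTimestamp(previous, updates, truncated) {
  if (truncated || updates.length === 0) return previous;
  return Math.max(...updates.map((u) => u.timestamp)) + 1;
}
```

Retrying the same window can re-fetch updates that were already emitted, which is why this approach depends on the source deduplicating events.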
|
Re: the |
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/yutori/sources/new-scout-update/new-scout-update.mjs`:
- Around line 69-71: The current cursor-advance logic assumes getUpdates returns
numeric, reverse-chronological timestamps and uses updates[0].timestamp + 1;
instead, normalize and validate timestamps from this.yutori.getUpdates by
converting each update.timestamp to a numeric epoch (e.g., Date.parse/new
Date(...).getTime()), handle ISO strings or numbers uniformly, and compute the
next cursor by taking the maximum numeric timestamp across all merged paginated
responses (after normalizing) and adding 1; also re-sort or validate ordering if
you rely on order elsewhere and replace any uses of updates[0].timestamp with
the computed maxTimestamp to advance state (references: this.yutori.getUpdates,
updates array, and the cursor-advance logic that uses updates[0].timestamp).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: f5bde445-b9e0-4d1e-8185-8a36b8cc332a
📒 Files selected for processing (1)
components/yutori/sources/new-scout-update/new-scout-update.mjs
… ordering Add toEpochMs() to handle both numeric and ISO string timestamps. Sort updates explicitly oldest-first before emitting instead of relying on API returning reverse-chronological order. Derive next lastTimestamp from max(all timestamps) rather than updates[0].
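A rough sketch of what this commit describes, assuming numeric timestamps are already in epoch milliseconds. `orderOldestFirst` and `maxTimestamp` are hypothetical helper names, and this `toEpochMs` is an illustrative version rather than the PR's implementation.

```javascript
// Illustrative version: accepts a finite number or a Date.parse-able string.
function toEpochMs(value) {
  if (typeof value === "number" && Number.isFinite(value)) return value;
  const parsed = Date.parse(value);
  if (Number.isNaN(parsed)) throw new Error(`Unparseable timestamp: ${value}`);
  return parsed;
}

// Sorts a copy oldest-first instead of trusting the API's ordering.
function orderOldestFirst(updates) {
  return [...updates].sort(
    (a, b) => toEpochMs(a.timestamp) - toEpochMs(b.timestamp),
  );
}

// The stored lastTimestamp is derived from the newest update, wherever it
// happens to sit in the merged pages.
function maxTimestamp(updates) {
  return Math.max(...updates.map((u) => toEpochMs(u.timestamp)));
}
```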
|
@coderabbitai review |
✅ Actions performed
Review triggered.
|
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/yutori/sources/new-scout-update/new-scout-update.mjs`:
- Around line 97-103: The emitted event metadata uses update.timestamp which may
be an ISO string; convert it to a numeric epoch (milliseconds since Unix epoch)
before emitting to ensure consistent ordering. In the loop over orderedUpdates
where this.$emit(...) is called, coerce update.timestamp to a Number (e.g.,
Date.parse or new Date(...).getTime()) and pass that numeric value as ts instead
of the raw update.timestamp so all emitted events have numeric ts values.
- Around line 69-81: The loop assumes response.updates is valid and may advance
state even when the API returns an unexpected shape; add a defensive check after
calling this.yutori.getUpdates to ensure response?.updates is an array (e.g.,
Array.isArray(response.updates)); if not, log a warning/error including the raw
response and break or return so you do not advance cursor/lastTimestamp; keep
using response?.next_cursor only when the updates array is valid. Reference
symbols: this.yutori.getUpdates, response, response.updates, cursor, updates,
lastTimestamp/sinceTimestamp.
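Both findings can be illustrated with a short sketch. `extractUpdates` and `buildEventMeta` are hypothetical helpers, and the `update.id` field and the summary wording are assumptions about the payload; the `id`, `summary`, and `ts` keys are the metadata fields a Pipedream source passes to `this.$emit`.

```javascript
// Returns the page of updates, or null when the response shape is unexpected,
// so the caller can stop without advancing the cursor or lastTimestamp.
function extractUpdates(response) {
  if (!Array.isArray(response?.updates)) {
    console.warn("Unexpected getUpdates response:", JSON.stringify(response));
    return null;
  }
  return response.updates;
}

// Builds the metadata object for this.$emit with a numeric `ts`, whether the
// API sent a number or an ISO string. `update.id` is an assumed field.
function buildEventMeta(update) {
  const ts = typeof update.timestamp === "number"
    ? update.timestamp
    : Date.parse(update.timestamp);
  return {
    id: update.id,
    summary: `New scout update ${update.id}`,
    ts,
  };
}
```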
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 41d6f1fd-fbf3-4367-a674-1d13141798f3
📒 Files selected for processing (1)
components/yutori/sources/new-scout-update/new-scout-update.mjs
|
@coderabbitai review |
✅ Actions performed
Review triggered.
|
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/yutori/sources/new-scout-update/new-scout-update.mjs`:
- Around line 83-87: The toEpochMs helper currently treats numeric-looking
strings like "1700000000000" as non-parsable (Date.parse returns NaN) and
throws; update toEpochMs to explicitly handle numeric timestamp strings by
checking if value is a string of digits (e.g., /^\d+$/) and returning
Number(value) for those cases, otherwise fall back to Date.parse and the
existing error path; reference the toEpochMs function to locate and update this
logic.
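One way to apply this suggestion is sketched below. It is not the PR's code, and it assumes digit-only strings carry epoch milliseconds rather than seconds.

```javascript
// Sketch of the suggested change: digit-only strings are treated as epoch
// values before falling back to Date.parse for ISO-style strings.
function toEpochMs(value) {
  if (typeof value === "number" && Number.isFinite(value)) return value;
  if (typeof value === "string" && /^\d+$/.test(value)) return Number(value);
  const parsed = Date.parse(value);
  if (Number.isNaN(parsed)) throw new Error(`Unparseable timestamp: ${value}`);
  return parsed;
}
```

Checking the digit pattern first avoids depending on how a given JavaScript engine's `Date.parse` treats long numeric strings.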
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: efe85777-1832-495a-aa8a-d7c932f828bb
📒 Files selected for processing (1)
components/yutori/sources/new-scout-update/new-scout-update.mjs
|
@coderabbitai review |
✅ Actions performed
Review triggered.
|
GTFalcao left a comment
Thanks for your updates @deviparikh, looks mostly good to me.
I'm just wondering why creating/deleting a scout is being done as actions rather than a webhook source which would seem more appropriate. Let me know if there's a specific reason for that or whether it can be simplified into a webhook source instead, and we can move forward.
Also, from some whitespace patterns I can tell eslint is not being applied properly - which is why the automated checks are failing. Normally it runs on a husky hook if you have dependencies installed on the repo, though you can manually run it with npx eslint --fix components/yutori as I mentioned before. If you're having issues with this, please give me write access to your fork and I can run it for you.
This seems like it would make more sense as a source, not as an action, if the purpose is to create a webhook - the scout creation would be handled on source deployment, using the source's own http URL.
Is there a specific reason why this is being created as an action instead?
With the source model, this would be done when disabling the source.
| "version": "0.1.0", |
…te on disable)
- add activate() hook: creates the scout from source props on deploy
- add deactivate() hook: deletes the scout when source is disabled
- switch polling from getUpdates() (all scouts) to getScoutUpdates(scoutId)
- add scout config props: query, outputInterval, userTimezone, userLocation, skipEmail
- default skipEmail=true since Pipedream steps handle notifications
- bump version to 0.1.0
- update create-scout description to clarify standalone vs. source use cases
Scout lifecycle (create on deploy, delete on disable) is now handled entirely by the new-scout-update source hooks.
…facing README These actions only make sense when scouts are managed independently on the Yutori platform. With the lifecycle source model, scout management happens entirely in Pipedream via deploy/disable.
|
Thanks, this makes sense. I updated the integration to follow the source lifecycle model you suggested:
I also fixed the source polling logic to match the per-scout updates API and made cleanup tolerant of the scout already being deleted. |
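The lifecycle model discussed in this thread can be sketched roughly as below. This is an illustration only: `deleteScout`, the `scoutId` state key, the shape of the created scout, and the axios-style `err.response.status` check are assumptions, and the object is a plain stand-in for the hooks section of a Pipedream source, not the PR's actual component.

```javascript
// Hypothetical sketch of source lifecycle hooks. In a Pipedream source the
// hooks run with `this` bound to the component (its props, db and app).
const source = {
  hooks: {
    // Runs on deploy or re-enable: create the scout from the source props.
    async activate() {
      const scout = await this.yutori.createScout({ query: this.query });
      this.db.set("scoutId", scout.id);
    },
    // Runs on disable or delete: remove the scout created by activate().
    async deactivate() {
      const scoutId = this.db.get("scoutId");
      if (!scoutId) return;
      try {
        await this.yutori.deleteScout(scoutId);
      } catch (err) {
        // A scout that is already gone counts as cleaned up (assumed 404).
        if (err?.response?.status !== 404) throw err;
      }
      this.db.set("scoutId", null);
    },
  },
};
```

Storing the scout ID in the source's state is what lets `deactivate()` find the scout that `activate()` created, and swallowing only the not-found case keeps other failures visible.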
|
@GTFalcao let me know if you have any additional feedback. |
|
Merged the latest Also re-ran |
GTFalcao left a comment
Thanks for your attention to making the required changes! The components look good. The remaining comments are simple to address (I left committable suggestions for almost all). Moving this forward to the QA stage.
Co-authored-by: Guilherme Falcão <48412907+GTFalcao@users.noreply.github.com>
Co-authored-by: Guilherme Falcão <48412907+GTFalcao@users.noreply.github.com>
…h-task-result.mjs Co-authored-by: Guilherme Falcão <48412907+GTFalcao@users.noreply.github.com>
Co-authored-by: Guilherme Falcão <48412907+GTFalcao@users.noreply.github.com>
Co-authored-by: Guilherme Falcão <48412907+GTFalcao@users.noreply.github.com>
…g-task-result.mjs Co-authored-by: Guilherme Falcão <48412907+GTFalcao@users.noreply.github.com>
Resolves #20387
Summary
This PR adds Yutori as a new native integration to the Pipedream registry.
Yutori is reimagining how people interact with the web. The Yutori API is an AI web agent platform. Give it a task and it navigates websites, fills forms, extracts data, and completes multi-step workflows using a real cloud browser — or runs deep research across 100+ sources. You can also set up Scouts: recurring monitors that watch any part of the web on a schedule and alert you when something relevant happens.
Components included
Actions (11):
Triggers / Sources (1):
`hoursBack` lookback on first run and uses `dedupe: "unique"` with cursor-based pagination
Checklist
- `version: "0.0.1"` on all components
- `annotations` block on all components (`openWorldHint`, `destructiveHint`, `readOnlyHint`)
- `dedupe: "unique"` on polling source
- (`yutori.svg`) included
- `.DS_Store` or other build artifacts