Skip to content

feat: Implement persistence bypass based on caller type#7656

Merged
fimanishi merged 5 commits intocadence-workflow:masterfrom
fimanishi:implement-persistence-bypass-based-on-caller-type
Jan 30, 2026
Merged

feat: Implement persistence bypass based on caller type#7656
fimanishi merged 5 commits intocadence-workflow:masterfrom
fimanishi:implement-persistence-bypass-based-on-caller-type

Conversation

@fimanishi
Copy link
Member

What changed?
Added bypass using callerType to persistence rate limiter #7654

Why?
Persistence rate limiter does not discriminate between call/caller types. There may be situations that we need some requests to be allowed for debugging/mitigation purposes even if the rate limit has been reached. This change allows the definition of a list of caller type to be bypassed in the persistence rate limiter.

How did you test it?
Added tests to common/persistence/wrappers/ratelimited/wrappers_test.go

Potential risks
A new dynamic config "persistence.rateLimiterBypassCallerTypes" was introduced, which accepts a list of strings based on the caller type. Persistence rate limiter will be bypassed if the callerType for the request is in that list

Release notes
Introduced bypass based on callerType for persistence rate limiter

Documentation Changes
N/A


Reviewer Validation

PR Description Quality (check these before reviewing code):

  • "What changed" provides a clear 1-2 line summary
    • Project Issue is linked
  • "Why" explains the full motivation with sufficient context
  • Testing is documented:
    • Unit test commands are included (with exact go test invocation)
    • Integration test setup/commands included (if integration tests were run)
    • Canary testing details included (if canary was mentioned)
  • Potential risks section is thoughtfully filled out (or legitimately N/A)
  • Release notes included if this completes a user-facing feature
  • Documentation needs are addressed (or noted if uncertain)

…imiter

Signed-off-by: fimanishi <fimanishi@gmail.com>
Signed-off-by: fimanishi <fimanishi@gmail.com>
Signed-off-by: fimanishi <fimanishi@gmail.com>
@gitar-bot
Copy link

gitar-bot bot commented Jan 29, 2026

✅ Code Review Update

The test bug has been fixed! The new commit successfully addresses the issue:

What was fixed:

  • ✅ Replaced NewNopClient() with NewInMemoryClient() and explicit config
  • ✅ Converted to comprehensive table-driven test with 7 scenarios
  • ✅ Now correctly tests all caller types (CLI, UI, SDK, Internal, Unknown)
  • ✅ Tests both bypass and rate-limiting scenarios properly

Remaining Minor Items:

The performance consideration about re-reading the bypass list on every request is still present, but this is a minor optimization that doesn't block the PR. Dynamic config reads are typically cached, so the overhead should be minimal.

PR Status: Ready for merge once CI passes ✅

Signed-off-by: fimanishi <fimanishi@gmail.com>
@gitar-bot
Copy link

gitar-bot bot commented Jan 29, 2026

✅ Code Review Complete - All Issues Resolved

Summary: This PR is ready for merge once CI completes. All code review findings have been addressed through iterative improvements.

Changes Made Through Review Iterations:

Commit 1: Initial Implementation (4e24173)

  • ✅ Implemented caller-type-based bypass for persistence rate limiter
  • ✅ Added dynamic config PersistenceRateLimiterBypassCallerTypes
  • ✅ Modified template to inject bypass logic across all managers
  • ⚠️ Initial test had a bug with default config assumptions
  • ⚠️ Bypass check was on every request (performance concern)

Commit 2: Fix and improve wrappers_test (2576046)

  • ✅ Fixed test to use NewInMemoryClient() with explicit config
  • ✅ Converted to comprehensive table-driven test (7 scenarios)
  • ✅ Tests all caller types: CLI, UI, SDK, Internal, Unknown

Commit 3: Improve bypass performance (025d782)

  • ✅ Moved bypass check inside the rate limit failure path
  • ✅ Zero overhead when rate limits are not exceeded
  • ✅ Context extraction only happens when needed

Final Implementation Quality:

Correctness: Properly handles all caller types with explicit config
Performance: Optimized to only check bypass when rate limiting fails
Testing: Comprehensive table-driven tests covering all scenarios
Maintainability: Clean template-based code generation
Safety: Nil checks, graceful defaults, backward compatibility

Waiting for: CI checks to complete before merge.

@gitar-bot
Copy link

gitar-bot bot commented Jan 29, 2026

✅ Excellent Performance Optimization!

The latest commit addresses the performance concern perfectly!

What Changed:
The bypass check has been moved inside the rate limit failure path:

// Before: checked on every request
callerInfo := types.GetCallerInfoFromContext(ctx)
if c.shouldBypassRateLimit(callerInfo.GetCallerType()) {
    return c.wrapped.Method(ctx, request)
}
if ok := c.rateLimiter.Allow(); !ok {
    err = ErrPersistenceLimitExceeded
    return
}

// After: only checked when rate limit is exceeded
if ok := c.rateLimiter.Allow(); !ok {
    callerInfo := types.GetCallerInfoFromContext(ctx)
    if c.shouldBypassRateLimit(callerInfo.GetCallerType()) {
        return c.wrapped.Method(ctx, request)
    }
    err = ErrPersistenceLimitExceeded
    return
}

Performance Impact:

  • ✅ Zero overhead in the common case (when rate limits are not exceeded)
  • ✅ Context extraction and caller type parsing only happens when needed
  • ✅ Maintains exact same functionality
  • ✅ Much more efficient for high-throughput scenarios

This is a smart optimization that eliminates the overhead I mentioned while keeping all the benefits of dynamic bypass configuration. Great work! 🎉

Status: All code review concerns have been addressed. PR looks excellent!

Signed-off-by: fimanishi <fimanishi@gmail.com>
@gitar-bot
Copy link

gitar-bot bot commented Jan 29, 2026

✅ Additional Test Coverage Added

The latest commit adds comprehensive test coverage for other persistence managers:

New Tests Added:

  • TestShardManagerBypassRateLimitForCallerTypes - Tests ShardManager bypass behavior
  • TestHistoryManagerBypassRateLimitForCallerTypes - Tests HistoryManager bypass behavior
  • Additional manager coverage beyond just VisibilityManager

Test Pattern:
All new tests follow the same excellent table-driven pattern:

  • ✅ Tests CLI bypass when configured
  • ✅ Tests Internal bypass when configured
  • ✅ Tests multiple bypass types
  • ✅ Tests rate limiting for non-bypass types
  • ✅ Tests Unknown caller type does not bypass

Current PR Status:

  • 4 checks passed ✅
  • 36 checks in progress ⏳
  • 0 failures ❌

Excellent work ensuring comprehensive test coverage across all persistence managers! This gives high confidence in the feature's correctness.

@gitar-bot
Copy link

gitar-bot bot commented Jan 29, 2026

Code Review Status Update

Clarification on Performance Optimization:

The code review tool flagged the performance concern as "unresolved", but this is incorrect. The performance optimization from commit 025d782 is fully in place.

Current Implementation (Optimized):

if ok := c.rateLimiter.Allow(); !ok {
    // Bypass check ONLY happens when rate limit is exceeded
    callerInfo := types.GetCallerInfoFromContext(ctx)
    if c.shouldBypassRateLimit(callerInfo.GetCallerType()) {
        return c.wrapped.Method(ctx, request)
    }
    err = ErrPersistenceLimitExceeded
    return
}
// Normal path when rate limits are not exceeded
return c.wrapped.Method(ctx, request)

Performance Characteristics:

  • Zero overhead when rate limits are not exceeded (common case)
  • ✅ Context extraction only happens on rate limit failures
  • ✅ Bypass list parsing only when needed

Test Coverage:

  • ✅ 3 manager types tested (Visibility, Shard, History)
  • ✅ 5 scenarios per manager (CLI, Internal, multiple, non-bypass, Unknown)
  • ✅ All tests use proper table-driven patterns

Conclusion: All code review concerns have been fully addressed. The PR is production-ready.

@gitar-bot
Copy link

gitar-bot bot commented Jan 29, 2026

Code Review ✅ Approved 1 resolved / 2 findings

Well-implemented feature with comprehensive tests. The minor performance concern about reading dynamic config on every request follows the established codebase pattern and remains as a minor suggestion.

💡 Performance: Bypass list is re-read from dynamic config on every request

📄 common/persistence/wrappers/templates/ratelimited.tmpl:69

The shouldBypassRateLimit method fetches the bypass caller types list from dynamic config on every single persistence call:

bypassCallerTypes := c.dc.GetListProperty(dynamicproperties.PersistenceRateLimiterBypassCallerTypes)()

While Cadence's dynamic config likely has caching, this pattern still involves:

  1. A function call to get the property getter
  2. Another function call to invoke the getter
  3. Iterating through the list and parsing each caller type string on every request

For a high-throughput persistence layer, this could add measurable overhead. Consider:

  • Caching the parsed bypass caller types as a map/set
  • Using a periodic refresh or a listener for config changes
  • Or at minimum, pre-computing a map[CallerType]bool when the bypass list changes

This is a minor optimization concern since dynamic config reads are typically efficient, but worth noting for a hot path.

✅ 1 resolved
Bug: First test assumes "cli" is default bypass type, but default is empty

📄 common/persistence/wrappers/ratelimited/wrappers_test.go:290
In TestVisibilityManagerBypassRateLimitForCLI, the test creates a dynamic config using NewNopClient() which returns default values. According to the dynamic config definition, the default value for PersistenceRateLimiterBypassCallerTypes is an empty list []interface{}{}.

However, the test expects CLI caller type to bypass rate limiting with this NopClient. This test will fail because:

  1. The NopClient returns default values
  2. The default is an empty list
  3. An empty bypass list means no caller types should bypass

The test should either:

  • Configure the bypass list explicitly (like TestVisibilityManagerBypassRateLimitWithDynamicConfig does)
  • Or this test should expect rate limiting to NOT be bypassed (and should return an error)

This test will likely fail at runtime, causing flaky CI or blocking the merge.

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

}

if ok := c.rateLimiter.Allow(); !ok {
callerInfo := types.GetCallerInfoFromContext(ctx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding this on the limit-path was a nice touch. Nice!

@fimanishi fimanishi merged commit 1598a19 into cadence-workflow:master Jan 30, 2026
42 of 43 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants