
[FEATURE][SECURITY]: Content size and type security limits for resources and prompts #538

@crivetimihai

Description


Goal

Implement configurable content validation for resources and prompts with size limits, content type restrictions, and security validation when content is submitted via the API. This prevents abuse, DoS attacks, and injection of malicious content.

Why Now?

  1. Security Compliance: OWASP recommends validating all input sizes and types
  2. DoS Prevention: Large content submissions can exhaust memory and storage
  3. Data Quality: Type restrictions ensure only valid content is stored
  4. Injection Protection: Pattern detection blocks XSS and other injection attacks
  5. Resource Management: Rate limiting prevents content creation abuse

📖 User Stories

US-1: Security - Enforce Content Size Limits

As a security engineer
I want content size limits enforced on all submissions
So that large uploads cannot exhaust system resources

Acceptance Criteria:

Scenario: Resource content size limit
  Given CONTENT_MAX_RESOURCE_SIZE=102400 (100KB)
  When a user creates a resource with 200KB content
  Then the request should be rejected with 413 Payload Too Large
  And the response should indicate the size limit

Scenario: Prompt template size limit
  Given CONTENT_MAX_PROMPT_SIZE=10240 (10KB)
  When a user creates a prompt with 20KB template
  Then the request should be rejected with 413 Payload Too Large

Scenario: Content within limits
  Given CONTENT_MAX_RESOURCE_SIZE=102400
  When a user creates a resource with 50KB content
  Then the resource should be created successfully

Technical Requirements:

  • Check content size before processing
  • Return 413 with clear error message
  • Log oversized submission attempts
  • Apply limits consistently across create and update operations
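
The size check above can be sketched as follows. This is a minimal illustration, not the gateway's actual code: the `check_size` function and `ContentTooLargeError` exception are assumed names, and a FastAPI handler would map the exception to a 413 response.

```python
CONTENT_MAX_RESOURCE_SIZE = 102_400  # 100KB, matching the proposed default


class ContentTooLargeError(Exception):
    """Raised when submitted content exceeds the configured limit (maps to HTTP 413)."""


def check_size(content: str, limit: int = CONTENT_MAX_RESOURCE_SIZE) -> None:
    # Measure the UTF-8 byte length, not the character count, so multi-byte
    # characters cannot slip past a byte-based limit.
    size = len(content.encode("utf-8"))
    if size > limit:
        raise ContentTooLargeError(
            f"Content is {size} bytes; maximum allowed is {limit} bytes"
        )
```

Applying the same function in both create and update paths keeps the limit consistent, per the last requirement above.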

US-2: Security - Restrict Content Types

As a security engineer
I want only allowed content types accepted for resources
So that dangerous file types cannot be uploaded

Acceptance Criteria:

Scenario: Allowed MIME type
  Given CONTENT_ALLOWED_RESOURCE_MIMETYPES=text/plain,text/markdown
  When a user creates a resource with mimeType "text/plain"
  Then the resource should be created successfully

Scenario: Disallowed MIME type
  Given CONTENT_ALLOWED_RESOURCE_MIMETYPES=text/plain,text/markdown
  When a user creates a resource with mimeType "text/html"
  Then the request should be rejected with 400 Bad Request
  And the response should list allowed types

Scenario: MIME type detection from URI
  Given a resource with URI "notes.md"
  When no explicit mimeType is provided
  Then the system should detect "text/markdown"
  And validate against allowed types

Technical Requirements:

  • Configure allowed MIME types per entity type
  • Auto-detect MIME type from URI if not provided
  • Validate declared type matches detected type
  • Default to safe types (text/plain, text/markdown)

US-3: Security - Block Malicious Patterns

As a security engineer
I want content scanned for malicious patterns
So that XSS and injection attacks are blocked

Acceptance Criteria:

Scenario: Block script injection
  Given content containing "<script>alert(1)</script>"
  When a user attempts to create a resource
  Then the request should be rejected with 400 Bad Request
  And a security warning should be logged

Scenario: Block JavaScript URLs
  Given content containing "javascript:void(0)"
  When a user attempts to create a resource
  Then the request should be rejected

Scenario: Block event handlers
  Given content containing "onclick=alert(1)"
  When a user attempts to create a resource
  Then the request should be rejected

Scenario: Clean content passes
  Given content with "This is normal markdown content"
  When a user creates a resource
  Then the resource should be created successfully

Technical Requirements:

  • Configure blocked patterns list
  • Scan content case-insensitively
  • Log security violations with user context
  • Allow admins to customize blocked patterns
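
A case-insensitive scan over a configurable pattern list might look like this; the defaults mirror the `CONTENT_BLOCKED_PATTERNS` setting proposed later in this issue, and `scan_content` is an illustrative name.

```python
import re

# Default blocked substrings, mirroring the proposed CONTENT_BLOCKED_PATTERNS setting.
BLOCKED_PATTERNS = [
    "<script", "javascript:", "vbscript:",
    "onload=", "onerror=", "onclick=", "<iframe",
]
# Pre-compile case-insensitive regexes once; re.escape treats each entry as a literal.
_COMPILED = [re.compile(re.escape(p), re.IGNORECASE) for p in BLOCKED_PATTERNS]


def scan_content(content: str) -> list[str]:
    """Return the blocked patterns found in content; an empty list means clean."""
    return [raw for raw, rx in zip(BLOCKED_PATTERNS, _COMPILED) if rx.search(content)]
```

The caller would reject the request with 400 and log the matched patterns with user context when the returned list is non-empty.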

US-4: Security - Validate Prompt Templates

As a security engineer
I want prompt templates validated for safe syntax
So that template injection attacks are prevented

Acceptance Criteria:

Scenario: Balanced template braces
  Given a prompt template with balanced braces "{{user}}"
  When a user creates the prompt
  Then the prompt should be created successfully

Scenario: Unbalanced braces rejected
  Given a prompt template "Hello {{user" (missing closing)
  When a user attempts to create the prompt
  Then the request should be rejected with validation error

Scenario: Dangerous template patterns blocked
  Given a prompt template "{{__import__('os')}}"
  When a user attempts to create the prompt
  Then the request should be rejected as security violation

Technical Requirements:

  • Validate template syntax (balanced braces)
  • Block dangerous patterns (eval, exec, __import__, dunder methods)
  • Allow safe Jinja2 variables
  • Log template validation failures
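
The brace-balance and dangerous-pattern checks can be combined into one validator; `validate_template` and the pattern list are assumptions for illustration, and a production version would also validate against the Jinja2 parser rather than raw regexes.

```python
import re

# Assumed deny-list: dunder attributes plus eval/exec/import as whole words.
DANGEROUS_TEMPLATE_PATTERNS = [r"__\w+__", r"\beval\b", r"\bexec\b", r"\bimport\b"]


def validate_template(template: str) -> None:
    # Balanced braces: every opening "{{" must have a matching "}}".
    if template.count("{{") != template.count("}}"):
        raise ValueError("Unbalanced template braces")
    for pat in DANGEROUS_TEMPLATE_PATTERNS:
        if re.search(pat, template, re.IGNORECASE):
            raise ValueError(f"Dangerous template pattern: {pat}")
```

Note that `{{__import__('os')}}` is caught by the dunder rule even though `import` has no word boundary inside `__import__`.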

US-5: Operator - Rate Limit Content Creation

As a platform operator
I want content creation rate limited
So that rapid creation cannot overwhelm the system

Acceptance Criteria:

Scenario: Rate limit content creation
  Given CONTENT_CREATE_RATE_LIMIT_PER_MINUTE=3
  And user "alice" has created 3 resources in the last minute
  When "alice" attempts to create another resource
  Then the request should be rejected with 429 Too Many Requests
  And Retry-After header should be included

Scenario: Concurrent operation limit
  Given CONTENT_MAX_CONCURRENT_OPERATIONS=2
  And user "alice" has 2 create operations in progress
  When "alice" starts a third create operation
  Then the operation should be queued or rejected

Technical Requirements:

  • Track creation rate per user
  • Implement sliding window rate limiting
  • Track concurrent operations
  • Return appropriate retry timing

🏗 Architecture

Content Validation Flow

```mermaid
flowchart TD
    A[Content Submission] --> B{Size Check}
    B -->|Exceeds Limit| C[413 Payload Too Large]
    B -->|Within Limit| D{MIME Type Check}
    D -->|Disallowed| E[400 Bad Request]
    D -->|Allowed| F{Pattern Scan}
    F -->|Malicious| G[400 Security Violation]
    F -->|Clean| H{Encoding Check}
    H -->|Invalid UTF-8| I[400 Invalid Encoding]
    H -->|Valid| J[Store Content]
```
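
The final encoding gate in the flow above could look like this; `validate_encoding` is an assumed name, and the null-byte stripping mirrors the `CONTENT_STRIP_NULL_BYTES` setting proposed below.

```python
def validate_encoding(raw: bytes, strip_null_bytes: bool = True) -> str:
    """Decode as strict UTF-8, rejecting invalid byte sequences (maps to HTTP 400)."""
    try:
        text = raw.decode("utf-8")  # strict mode: raises on invalid sequences
    except UnicodeDecodeError as exc:
        raise ValueError("Invalid UTF-8 encoding") from exc
    if strip_null_bytes:
        # Null bytes are legal UTF-8 but frequently abused; remove them.
        text = text.replace("\x00", "")
    return text
```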

Content Security Service

```mermaid
classDiagram
    class ContentSecurityService {
        -dangerous_patterns: List~Pattern~
        +validate_resource_content(content, uri, mime_type)
        +validate_prompt_content(template, name)
        -detect_mime_type(uri, content)
        -validate_content(content, mime_type, context)
        -validate_prompt_template_syntax(template, name)
    }

    class ContentRateLimiter {
        -operation_counts: Dict
        -concurrent_operations: Dict
        +check_rate_limit(user, operation)
        +record_operation(user, operation)
        +end_operation(user)
    }
```

📋 Implementation Tasks

Phase 1: Configuration

  • Add CONTENT_MAX_RESOURCE_SIZE setting
  • Add CONTENT_MAX_PROMPT_SIZE setting
  • Add CONTENT_ALLOWED_RESOURCE_MIMETYPES setting
  • Add CONTENT_BLOCKED_PATTERNS setting
  • Add rate limiting settings
  • Document in .env.example

Phase 2: Content Security Service

  • Create mcpgateway/services/content_security.py
  • Implement size validation
  • Implement MIME type detection and validation
  • Implement pattern scanning
  • Implement encoding validation

Phase 3: Rate Limiting

  • Create content rate limiter
  • Track per-user creation rate
  • Track concurrent operations
  • Return proper rate limit headers

Phase 4: Service Integration

  • Integrate validation in Resource service
  • Integrate validation in Prompt service
  • Add validation to update operations
  • Handle validation errors consistently

Phase 5: Monitoring

  • Log security violations
  • Add metrics for validation failures
  • Track rate limit hits
  • Create security dashboard

Phase 6: Testing

  • Unit tests for size validation
  • Unit tests for MIME type validation
  • Unit tests for pattern detection
  • Integration tests for full flow
  • Security tests for bypass attempts

⚙️ Configuration Example

```bash
# Maximum content sizes (in bytes)
CONTENT_MAX_RESOURCE_SIZE=102400    # 100KB for resources
CONTENT_MAX_PROMPT_SIZE=10240       # 10KB for prompt templates

# Allowed MIME types (comma-separated)
CONTENT_ALLOWED_RESOURCE_MIMETYPES=text/plain,text/markdown
CONTENT_ALLOWED_PROMPT_MIMETYPES=text/plain,text/markdown

# Content validation
CONTENT_VALIDATE_ENCODING=true      # Validate UTF-8 encoding
CONTENT_VALIDATE_PATTERNS=true      # Check for malicious patterns
CONTENT_STRIP_NULL_BYTES=true       # Remove null bytes

# Rate limiting
CONTENT_CREATE_RATE_LIMIT_PER_MINUTE=3   # Max creates per minute
CONTENT_MAX_CONCURRENT_OPERATIONS=2      # Max concurrent operations

# Security patterns to block (comma-separated)
CONTENT_BLOCKED_PATTERNS=<script,javascript:,vbscript:,onload=,onerror=,onclick=,<iframe
```

✅ Success Criteria

  • Content exceeding size limits is rejected
  • Only allowed MIME types are accepted
  • Malicious patterns are detected and blocked
  • Prompt templates are validated for safe syntax
  • Rate limiting prevents creation abuse
  • Security violations are logged
  • Clear error messages returned
  • No performance regression

🏁 Definition of Done

  • Content security service implemented
  • Size validation working
  • MIME type validation working
  • Pattern scanning working
  • Rate limiting integrated
  • Resource service updated
  • Prompt service updated
  • Unit tests with >90% coverage
  • Security tests pass
  • Code passes make verify
  • Documentation updated

🔗 Related Issues

  • Related: Content upload security best practices
  • Related: Input validation requirements

Metadata

Labels

  • MUST - P1: Non-negotiable, critical requirements without which the product is non-functional or unsafe
  • enhancement - New feature or request
  • python - Python / backend development (FastAPI)
  • ready - Validated, ready-to-work-on items
  • security - Improves security
