bbf0d35 to da3e991
- Rename docs/advanced_features/router.md to sgl_model_gateway.md
- Add comprehensive documentation for new features:
  - Tokenization endpoints (/v1/tokenize, /v1/detokenize)
  - Tokenizer management APIs (/v1/tokenizers)
  - Parser endpoints (/parse/reasoning, /parse/function_call)
  - gRPC embedding support
  - Expanded metrics documentation (40+ Prometheus metrics)
  - OpenTelemetry tracing integration
- Update sgl-model-gateway/README.md with:
  - New feature highlights for tokenization and parsing APIs
  - Detailed tokenization endpoint examples
  - Parser endpoint documentation with supported parsers
  - Expanded observability section with metric categories
  - OpenTelemetry configuration details
- Update docs/index.rst to reference new filename

- Add TLS (HTTPS) documentation for gateway server
  - --tls-cert-path and --tls-key-path configuration
  - rustls with ring crypto provider details
- Add mTLS documentation for worker communication
  - --client-cert-path, --client-key-path, --ca-cert-path flags
  - Multiple CA certificate support
  - TCP keepalive configuration
- Update both README.md and sgl_model_gateway.md with:
  - Full TLS configuration examples
  - Parameter reference tables
  - Security configuration guidance
- Add TLS Configuration section to Configuration Reference
- Update Table of Contents with TLS subsections

Add new Production Recommendations section covering:

- Security: TLS checklist and production security best practices
- High Availability:
  - Multi-replica architecture diagram
  - Trade-offs table (radix tree, circuit breaker, rate limiting)
  - Cache hit reduction (10-20%) with multiple replicas
  - Horizontal vs vertical scaling guidance
  - Session affinity recommendations
- Performance:
  - gRPC mode recommendation for high throughput
  - Performance tuning table with parameter recommendations
  - Benefits of native Rust tokenization
- Kubernetes Deployment:
  - Pod labeling examples for service discovery
  - Regular and PD mode worker deployments
  - RBAC configuration for pod watching
  - PD mode with bootstrap port annotations
- Monitoring with PromQL:
  - Request rate and latency queries
  - Worker health monitoring
  - Circuit breaker status tracking
  - Inference performance metrics (TTFT, TPOT)
  - Rate limiting and queuing metrics
  - MCP tool execution monitoring
  - Example Prometheus alerting rules

Add comprehensive documentation for writing custom WASM middleware modules including authentication, rate limiting, and request logging examples. Documents the WIT interface, deployment process, and runtime configuration.

Add comprehensive documentation for the Python and Go bindings:

Python Bindings:

- Installation (development/production/PyPI)
- Basic usage with Router and RouterArgs
- CLI commands (smg launch, smg server)
- Full RouterArgs configuration reference
- PD disaggregation and K8s service discovery examples

Go Bindings:

- Two-layer architecture explanation (Go API + Rust FFI)
- Installation and build requirements
- Non-streaming and streaming usage examples
- ClientConfig and ChatCompletionRequest options
- OpenAI-compatible server example
- Testing instructions
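
For readers who want a feel for the tokenization endpoint listed above, here is a minimal sketch of how a client might build a request for /v1/tokenize. Only the endpoint path comes from this PR. The gateway address, the `build_tokenize_request` helper, the model name, and the `model`/`prompt` body fields are assumptions for illustration, so check the gateway documentation for the actual request schema.

```python
import json
from urllib import request

# Assumption: address where the gateway is listening; adjust to your deployment.
GATEWAY_URL = "http://localhost:30000"


def build_tokenize_request(text: str, model: str, base_url: str = GATEWAY_URL) -> request.Request:
    """Build (but do not send) a POST request for the /v1/tokenize endpoint.

    Hypothetical helper: the "model" and "prompt" fields are assumed,
    not taken from the gateway's documented schema.
    """
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return request.Request(
        url=f"{base_url}/v1/tokenize",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Build the request without touching the network.
req = build_tokenize_request("Hello, world!", model="my-model")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it to a running gateway. Building the request separately keeps the sketch runnable without a server.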
jiaming1130 pushed a commit to zhuyijie88/sglang that referenced this pull request Dec 25, 2025
GuoYechang pushed a commit to GuoYechang/sglang that referenced this pull request Jan 13, 2026
Checklist