Multi-language speech-to-text transcription service with team collaboration, billing, and full localization
Transform audio into accurate transcripts in 5 languages with team-based access control, flexible billing plans, and comprehensive transcript management.
- Multi-language STT: Support for 5 languages (English, German, Spanish, French, Japanese)
- Whisper-powered: Uses OpenAI's Whisper model for accurate speech-to-text transcription
- Timestamp Support: Precise segment-level timestamps (start/end times in milliseconds)
- Speaker Diarization: Optional speaker identification and labeling (best-effort)
- Async Job Processing: Upload audio and receive real-time progress updates via Server-Sent Events (SSE)
- Multiple Input Sources: Upload files directly or provide remote URLs
- Progress Tracking: Live updates with detailed stages (downloading, decoding, transcribing, translating, formatting, saving)
- Automatic Retries: Configurable retry policy with exponential backoff for failed jobs
- Translation Engine: TranslateGemma (55 languages) for post-transcription translation
- Multi-tenant Architecture: Complete team isolation with role-based access control
- Three Role Types:
  - ADMIN: Full access, team management, billing control
  - MEMBER: Create jobs, view team transcripts, edit content
  - VIEWER: Read-only access to team resources
- Email Invitations: Invite team members with customizable roles
- Team Settings: Configure default language and team preferences
- Three Flexible Plans:

| Plan | Price | Uploads | Languages | Members | Translations |
|:-----|:------|:--------|:----------|:--------|:-------------|
| FREE | $0/mo | 5/mo | 2 | 1 (Admin) | 5/mo |
| STANDARD | $10/mo | 25/mo | 5 | 5 | 25/mo |
| PRO | $30/mo | Unlimited | 5 | Unlimited | Unlimited |

- Stripe Integration: Secure payment processing for subscriptions and credits
- Credit System: Purchase additional credits on FREE plan ($1 per upload)
- Usage Tracking: Monitor monthly usage and billing history
- Invoice Management: Download invoices and view payment history
- Multiple Export Formats: TXT, JSON, SRT (subtitles), VTT (WebVTT)
- Edit History: Track all changes with full audit trail
- Segment Editing: Edit individual transcript segments with timestamps
- Revert Capability: Restore original transcripts at any time
- Translation Support: Translate transcripts to target languages (plan-dependent)
- System Dashboard: Overview of users, teams, jobs, and storage
- User Management: Suspend, unsuspend, delete, or impersonate users
- Team Monitoring: View team activity and manage team resources
- Job Analytics: Monitor job volume, error rates, and usage trends
- Audit Logs: Complete system activity tracking with detailed logs
- Health Monitoring: Real-time system health metrics (API, queue, worker)
- Full Localization: UI and API responses in 5 languages
- Auto-detection: Automatic language detection for transcripts
- Localized Errors: Error messages in user's preferred language
- Multi-language Marketing: Static marketing site in all supported languages
- Email/Password Auth: Secure authentication with email verification
- Google OAuth: Single sign-on with Google (PKCE-enabled for enhanced security)
- JWT Tokens: Access tokens with automatic refresh token rotation
- Password Reset: Secure password recovery via email
- Admin Authentication: Separate authentication system for admin panel
- Multi-tenant Isolation: Strict data isolation between teams
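
The millisecond segment timestamps and the SRT export listed above fit together roughly as follows. This is a minimal sketch, not the project's exporter: `Segment`, `format_srt_time`, and `to_srt` are hypothetical names, and the segment shape (start, end, text) is an assumption based on the feature list.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcript segment; times are in milliseconds (hypothetical shape)."""
    start_ms: int
    end_ms: int
    text: str


def format_srt_time(ms: int) -> str:
    """Render milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Build an SRT document: numbered cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{format_srt_time(seg.start_ms)} --> {format_srt_time(seg.end_ms)}"
        cues.append(f"{index}\n{timing}\n{seg.text}")
    return "\n\n".join(cues) + "\n"
```

WebVTT output would differ mainly in the `WEBVTT` header line and in using a period instead of a comma before the milliseconds.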
graph TB
subgraph Frontend
Web[apps/web]
Admin[apps/admin]
Marketing[apps/marketing]
end
subgraph Backend
API[apps/api]
Worker[apps/worker]
end
subgraph Shared
UI[packages/ui]
end
subgraph Core_Layer
Core[packages/core]
end
subgraph Storage_Layer
DB[packages/storage/db]
Redis[packages/storage/redis]
end
subgraph STT_Layer
STT[packages/stt]
end
%% Frontend connections
Web -->|auth| Core
Web -->|HTTP, SSE| API
Web --> UI
Admin -->|admin auth| Core
Admin -->|HTTP, SSE| API
Admin --> UI
Marketing -->|public| API
%% Backend connections
API --> Core
API --> DB
API --> Redis
API --> STT
Worker --> Redis
Worker --> Core
Worker --> DB
Worker --> STT
%% Data flow
subgraph Data_Persistence
PG[(PostgreSQL)]
REDIS[(Redis)]
end
DB --> PG
Redis --> REDIS
Core --> DB
Core --> Redis
style Web fill:#e1f5ff
style Admin fill:#e1f5ff
style Marketing fill:#e1f5ff
style API fill:#ffe1e1
style Worker fill:#ffe1e1
style Core fill:#e8f5e9
style Storage_Layer fill:#fff3e0
style STT_Layer fill:#fff3e0
style UI fill:#f3e5f5
- Multi-tenancy: All data is team-scoped with strict isolation at the query level
- Separation of Concerns: Business logic in packages/core, thin API routes
- Async Processing: Background job processing via Redis queue with dedicated worker
- Event-Driven Updates: Real-time progress via Server-Sent Events (SSE)
- Soft Deletion: Configurable grace period before hard deletion of resources
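
To illustrate what "isolation at the query level" can mean in practice, here is a small in-memory sketch. It is an assumption about the pattern, not the project's repository code: `Transcript` and `TranscriptRepository` are hypothetical names, and a list stands in for the database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Transcript:
    id: int
    team_id: int
    text: str


class TranscriptRepository:
    """In-memory stand-in for a data-access layer (hypothetical)."""

    def __init__(self, rows: list[Transcript]) -> None:
        self._rows = list(rows)

    def list_for_team(self, team_id: int) -> list[Transcript]:
        # The team filter is part of every query; callers cannot omit it.
        return [row for row in self._rows if row.team_id == team_id]

    def get(self, team_id: int, transcript_id: int) -> Optional[Transcript]:
        # A transcript owned by another team looks the same as a missing one.
        for row in self._rows:
            if row.id == transcript_id and row.team_id == team_id:
                return row
        return None
```

Because every method takes `team_id` as a required argument, a route handler has no way to read data without naming the team it is acting for.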
- Upload → User uploads audio via web app
- Validation → API checks plan limits and creates job record
- Enqueue → Job added to Redis queue
- Process → Worker picks up job and processes with STT engine
- Progress → Real-time updates sent via SSE
- Complete → Results saved to database, user notified
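
The six steps above can be sketched in a few lines of Python. This is a simplified in-memory model, not the project's code: `Pipeline`, `Job`, `submit`, and `work_once` are hypothetical names, a `deque` stands in for the Redis queue, and a list of strings stands in for SSE events. The upload limits and stage names are taken from the plan table and feature list in this README; FREE-plan credits are ignored.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

# Stage names from the progress-tracking feature; the sketch runs all of them for every job.
STAGES = ["downloading", "decoding", "transcribing", "translating", "formatting", "saving"]

# Monthly upload limits per plan, from the plan table; None means unlimited.
UPLOAD_LIMITS = {"FREE": 5, "STANDARD": 25, "PRO": None}


@dataclass
class Job:
    id: int
    team_id: int
    status: str = "queued"
    events: list[str] = field(default_factory=list)


class Pipeline:
    def __init__(self) -> None:
        self.queue: deque[Job] = deque()  # stands in for the Redis queue
        self.uploads_this_month: dict[int, int] = {}
        self._next_id = 1

    def submit(self, team_id: int, plan: str) -> Job:
        """Validation and enqueue: check the plan limit, create the job, queue it."""
        limit = UPLOAD_LIMITS[plan]
        used = self.uploads_this_month.get(team_id, 0)
        if limit is not None and used >= limit:
            raise PermissionError(f"{plan} plan upload limit of {limit} reached")
        job = Job(id=self._next_id, team_id=team_id)
        self._next_id += 1
        self.uploads_this_month[team_id] = used + 1
        self.queue.append(job)
        return job

    def work_once(self) -> Optional[Job]:
        """Process: take one job off the queue and record one event per stage."""
        if not self.queue:
            return None
        job = self.queue.popleft()
        job.status = "processing"
        for stage in STAGES:
            job.events.append(stage)  # a real worker would publish this over SSE
        job.status = "completed"
        return job
```

Checking the limit before the job record is created means a rejected upload never reaches the queue.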
| Technology | Purpose | Version |
|---|---|---|
| React | UI framework | Latest |
| Vite | Build tool & dev server | Latest |
| Astro | Static site generation (marketing) | Latest |
| TypeScript | Type safety | Latest |
| Tailwind CSS | Styling | Latest |
| React Router | Client-side routing | Latest |
| Zustand | State management | Latest |
| i18next | Internationalization | Latest |
| shadcn/ui | Component library | Latest |
| Technology | Purpose | Version |
|---|---|---|
| Python | Runtime | 3.11+ |
| FastAPI | Web framework | Latest |
| uvicorn | ASGI server | Latest |
| SQLAlchemy | ORM | Latest |
| Alembic | Database migrations | Latest |
| Pydantic | Data validation | Latest |
| Redis | Queue & caching | Latest |
| Technology | Purpose |
|---|---|
| PostgreSQL | Primary database |
| Redis | Job queue & caching |
| Stripe | Payment processing |
| Railway | Deployment platform |
| Engine | Purpose | Details |
|---|---|---|
| Whisper | Speech-to-text | OpenAI's Whisper model for accurate multi-language transcription |
| TranslateGemma | Translation | Google's Gemma-based translation model supporting 55 languages |
| Tool | Purpose |
|---|---|
| Bun | TypeScript package manager |
| uv | Python package manager |
| just | Task runner |
| Ruff | Python linting & formatting |
| Biome | TypeScript linting & formatting |
| Playwright | E2E testing |
| pytest | Python testing |
| Vitest | TypeScript testing |
poly-script/
├── apps/
│   ├── web/                        # React web app (Vite)
│   │   ├── src/
│   │   │   ├── components/         # UI components
│   │   │   ├── pages/              # Route pages
│   │   │   ├── hooks/              # Custom React hooks
│   │   │   ├── stores/             # Zustand stores
│   │   │   └── lib/                # Utilities
│   │   └── package.json
│   │
│   ├── admin/                      # Admin panel (Vite)
│   │   └── src/                    # Similar structure to web
│   │
│   ├── marketing/                  # Marketing site (Astro)
│   │   ├── src/
│   │   │   ├── pages/              # Static pages (5 languages)
│   │   │   ├── components/         # Astro components
│   │   │   └── layouts/            # Page layouts
│   │   └── astro.config.mjs
│   │
│   ├── api/                        # FastAPI backend
│   │   ├── main.py                 # App entry point
│   │   ├── src/
│   │   │   ├── routes/             # API endpoints
│   │   │   ├── middleware/         # Auth, CORS, logging
│   │   │   └── locales/            # API translations
│   │   ├── scripts/                # Maintenance & admin scripts
│   │   └── pyproject.toml
│   │
│   └── worker/                     # Background worker
│       ├── main.py                 # Worker entry point
│       ├── src/                    # Worker logic
│       └── pyproject.toml
│
├── packages/
│   ├── ui/                         # Shared UI components (TypeScript)
│   │   ├── src/
│   │   │   ├── components/         # shadcn/ui components
│   │   │   └── lib/                # Utilities
│   │   └── package.json
│   │
│   ├── core/                       # Business logic (Python)
│   │   ├── poly_core/
│   │   │   ├── services/           # Auth, billing, teams, jobs
│   │   │   ├── schemas/            # Pydantic models
│   │   │   └── utils/              # Helpers
│   │   └── pyproject.toml
│   │
│   ├── storage/
│   │   ├── db/                     # Database layer (Python)
│   │   │   ├── poly_db/
│   │   │   │   ├── models/         # SQLAlchemy models
│   │   │   │   ├── repositories/   # Data access
│   │   │   │   └── migrations/     # Alembic migrations
│   │   │   └── pyproject.toml
│   │   │
│   │   └── redis/                  # Redis layer (Python)
│   │       ├── poly_redis/
│   │       │   ├── client.py       # Redis client
│   │       │   └── queue.py        # Queue primitives
│   │       └── pyproject.toml
│   │
│   └── stt/                        # STT engines (Python)
│       ├── src/
│       │   └── poly_stt/
│       │       ├── engines/        # STT implementations
│       │       └── normalizer.py   # Output normalization
│       └── pyproject.toml
│
├── justfile                        # Task runner recipes
├── pyproject.toml                  # Python workspace config
├── package.json                    # Bun workspace config
└── infra/                          # Deployment and environment templates
    ├── .env.example                # Environment template
    └── railway-prod.json           # Production service configuration
Install the following tools before setting up the project:
- Bun - Fast JavaScript runtime & package manager
- uv - Fast Python package manager
- just - Command runner (optional but recommended)
- PostgreSQL - Database (v14+)
- Redis - Queue & cache (v6+)
- Clone the repository

      git clone https://github.com/conceptcodes/poly-script.git
      cd poly-script

- Install dependencies

      # Install all dependencies (TypeScript + Python)
      just install

      # Or manually:
      bun install    # TypeScript dependencies
      uv sync        # Python dependencies (workspace)

- Configure environment

      cp infra/.env.example .env
      # Edit .env with your configuration

  Required environment variables:
  - DATABASE_URL - PostgreSQL connection string
  - REDIS_URL - Redis connection string
  - JWT_SECRET - Secret for JWT signing
  - ADMIN_JWT_SECRET - Separate secret for admin tokens
  - STRIPE_SECRET_KEY - Stripe API key
  - SMTP_HOST, SMTP_USER, SMTP_PASSWORD - Email configuration
  - GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET - OAuth credentials

- Set up database

      # Create PostgreSQL database
      createdb polyscript

      # Run migrations
      just db-migrate
      # Or: cd packages/storage/db && uv run alembic upgrade head

- Create admin user (for admin panel access)

      cd apps/api
      uv run python scripts/create_admin.py \
        --email <admin-email> \
        --password YourSecurePassword \
        --full-name "Admin User"

- Start development servers

      # Start all services (use separate terminals)
      just web          # Web app (http://localhost:5173)
      just admin        # Admin panel (http://localhost:5174)
      just marketing    # Marketing site (http://localhost:3000)
      just api          # API server (http://localhost:8000)
      just worker       # Background worker

- Verify installation
  - Visit http://localhost:8000/docs for API documentation
  - Visit http://localhost:5173 for the web app
  - Visit http://localhost:5174 for the admin panel

This project uses just as its task runner. Run just with no arguments to list all available commands.
Common commands:
# Development servers
just web # Start web app
just admin # Start admin panel
just marketing # Start marketing site
just api # Start API server
just worker # Start background worker
just frontend # Start all frontend apps (parallel)
# Building
just build-ts # Build all TypeScript apps
just build-web # Build web app only
just build-admin # Build admin panel only
# Testing
just test # Run all tests
just test-ts # TypeScript tests
just test-py # Python tests
just test-e2e # E2E tests (Playwright)
# Code quality
just lint # Lint all code
just lint-fix # Fix lint issues
just format # Format all code
just format-check # Check formatting
just typecheck # TypeScript type checking
# Database
just db-migrate # Run migrations
just db-migration "name" # Create new migration
# CI/CD
just ci # Run full CI checks
just pre-commit # Pre-commit hook (lint, format, test)
# Utilities
just clean # Clean build artifacts
# Maintenance
# Run these periodically to clean up expired data
just api-maintenance # Run API maintenance script

The system includes a maintenance script to handle cleanup of expired resources (tokens, invitations, etc.):
# Run maintenance tasks
cd apps/api
uv run python scripts/run_maintenance.py

- Unit tests: Test individual functions and services
- Integration tests: Test API endpoints and database interactions
- E2E tests: Test complete user flows with Playwright
# All tests
just test
# TypeScript tests (Vitest)
just test-ts
cd apps/web && bun run test:watch # Watch mode
# Python tests (pytest)
just test-py
just test-pkg packages/core # Test specific package
uv run pytest -k "test_auth" # Run specific tests
# E2E tests (Playwright)
just test-e2e
cd apps/web && bun run test:e2e tests/auth.spec.ts # Specific test

# Python coverage
uv run pytest --cov=poly_core --cov-report=html
# TypeScript coverage
cd apps/web && bun run test:coverage

Once the API is running, access interactive documentation:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
All endpoints are versioned under /v1/:
- /v1/auth/* - Authentication
- /v1/teams/* - Team management
- /v1/jobs/* - Transcription jobs
- /v1/transcripts/* - Transcript management
- /v1/billing/* - Billing & subscriptions
- /v1/admin/* - Admin endpoints

The API uses JWT tokens for authentication:
- Login → Receive access token + refresh token
- Include token in requests: Authorization: Bearer <token>
- Refresh when access token expires

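
A client-side view of the three steps above might look like the following. It is a sketch under stated assumptions: `build_request` and `TokenSession` are hypothetical helpers, the request is built but never sent, and treating HTTP 401 as the signal for an expired access token is an assumption this README does not confirm.

```python
import urllib.request
from typing import Callable, Tuple

API_BASE = "http://localhost:8000"  # local API address from the setup steps


def build_request(path: str, access_token: str) -> urllib.request.Request:
    """Create (but do not send) a request that carries the bearer token."""
    return urllib.request.Request(
        f"{API_BASE}{path}",
        headers={"Authorization": f"Bearer {access_token}"},
    )


class TokenSession:
    """Holds a token pair and retries once after refreshing (hypothetical helper)."""

    def __init__(self, access_token: str, refresh_token: str,
                 refresh: Callable[[str], Tuple[str, str]]) -> None:
        self.access_token = access_token
        self.refresh_token = refresh_token
        self._refresh = refresh  # exchanges a refresh token for a new (access, refresh) pair

    def call(self, send: Callable[[str], int]) -> int:
        """Run send(access_token); on 401, rotate the tokens and retry once."""
        status = send(self.access_token)
        if status == 401:  # assumption: an expired access token yields HTTP 401
            self.access_token, self.refresh_token = self._refresh(self.refresh_token)
            status = send(self.access_token)
        return status
```

Storing the new refresh token alongside the new access token matters with rotation, since the old refresh token is expected to stop working once it has been used.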
See infra/.env.example for a complete list. Key variables:
- DATABASE_URL - PostgreSQL connection string
- REDIS_URL - Redis connection string

- JWT_SECRET - Secret for user JWT tokens
- JWT_EXPIRY - Token expiry (default: 15m)
- ADMIN_JWT_SECRET - Separate secret for admin tokens
- GOOGLE_CLIENT_ID - Google OAuth client ID
- GOOGLE_CLIENT_SECRET - Google OAuth secret

- STRIPE_SECRET_KEY - Stripe secret key
- STRIPE_PUBLISHABLE_KEY - Stripe publishable key
- STRIPE_WEBHOOK_SECRET - Webhook signing secret
- STRIPE_PRICE_ID_STANDARD - Standard plan price ID
- STRIPE_PRICE_ID_PRO - Pro plan price ID

- SMTP_HOST - Email server host
- SMTP_PORT - Email server port
- SMTP_USER - Email username
- SMTP_PASSWORD - Email password (use app password for Gmail)
- SMTP_FROM - From email address

- APP_URL - Web app base URL (for email links)
- ADMIN_URL - Admin panel base URL
- CORS_ORIGINS - Allowed CORS origins (comma-separated)
- DEFAULT_LANGUAGE - Default UI language (default: en)
- MAX_UPLOAD_SIZE_MB - Max audio file size (default: 100)

- STORAGE_BACKEND - Storage backend: local or s3
- STORAGE_PATH - Local storage path (if local backend)
- AWS_S3_BUCKET - S3 bucket name (if s3 backend)
- AWS_REGION - AWS region
- AWS_ACCESS_KEY_ID - AWS access key
- AWS_SECRET_ACCESS_KEY - AWS secret key

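
As an illustration of how a few of the variables above and their documented defaults could be read at startup, here is a small sketch. `Settings` and `load_settings` are hypothetical names, only a subset of the variables is covered, and the project may well load its configuration differently (for example through Pydantic).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    jwt_secret: str
    jwt_expiry: str
    default_language: str
    max_upload_size_mb: int
    cors_origins: list[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from env, failing fast when a required variable is absent."""
    required = ("DATABASE_URL", "REDIS_URL", "JWT_SECRET")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    # CORS_ORIGINS is comma-separated; drop empty entries and surrounding spaces.
    origins = [o.strip() for o in env.get("CORS_ORIGINS", "").split(",") if o.strip()]
    return Settings(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        jwt_secret=env["JWT_SECRET"],
        jwt_expiry=env.get("JWT_EXPIRY", "15m"),                      # documented default
        default_language=env.get("DEFAULT_LANGUAGE", "en"),           # documented default
        max_upload_size_mb=int(env.get("MAX_UPLOAD_SIZE_MB", "100")),  # documented default
        cors_origins=origins,
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.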
This project is configured for deployment on Railway.
Prerequisites:
- Railway account
- PostgreSQL database (Railway addon)
- Redis database (Railway addon)
Deployment steps:
- Install Railway CLI

      npm install -g @railway/cli
      railway login

- Link project

      railway link

- Set environment variables

      # Set all required variables from infra/.env.example
      railway variables set DATABASE_URL=<value>
      railway variables set REDIS_URL=<value>
      # ... set all other variables

- Deploy

      railway up

- Run migrations

      railway run just db-migrate

- Create admin user

      railway run python apps/api/scripts/create_admin.py \
        --email <admin-email> \
        --password <secure-password>

The infra/railway-prod.json file contains the production service configuration.
- API: GET /v1/health
- Admin: GET /v1/admin/system/health

We welcome contributions! Please follow these guidelines:
- Python: Follow PEP 8, use Ruff for linting/formatting (line length: 100)
- TypeScript: Use Biome for linting/formatting
- Imports: Use absolute imports with workspace aliases
- Naming:
  - Python: snake_case for functions/variables, PascalCase for classes
  - TypeScript: camelCase for functions/variables, PascalCase for components

Use Conventional Commits:
feat(web): add new feature
fix(stt): resolve bug
style(marketing): formatting changes
refactor(worker): code restructuring
test(admin): add tests
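
A commit subject in this shape can be checked mechanically. The sketch below is illustrative only: `parse_commit` is a hypothetical helper, it accepts just the five types shown in the examples above (the Conventional Commits specification allows others), and it requires a scope because every example has one.

```python
import re
from typing import Optional, Tuple

# type(scope): description, limited to the types used in the examples above.
COMMIT_RE = re.compile(r"^(feat|fix|style|refactor|test)\(([a-z0-9-]+)\): (\S.*)$")


def parse_commit(subject: str) -> Optional[Tuple[str, str, str]]:
    """Return (type, scope, description), or None if the subject does not match."""
    match = COMMIT_RE.match(subject)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)
```

A check like this could run in a commit-msg hook; whether the project's pre-commit recipe does so is not stated here.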
- Create a feature branch
- Make changes following code style guidelines
- Add tests for new functionality
- Run just pre-commit to verify all checks pass
- Submit PR with clear description

MIT License - see LICENSE file for details