An opinionated Spring Boot starter for building data-processing microservices with job orchestration, data enrichment, quality gates, lineage tracking, and event-driven capabilities.
| Section | Description |
|---|---|
| Getting Started | Installation, prerequisites, and first steps |
| Core Capabilities | Data Jobs, Enrichers, Quality, Lineage, Transformation |
| Infrastructure | Resiliency, observability, persistence, events |
| Reference | API docs, configuration, architecture, examples |
| GenAI Integration | Python bridge for fireflyframework-genai |
- Java 25
- Maven 3.9+
- Spring Boot 3.x
- Familiarity with reactive programming (Project Reactor)
```xml
<parent>
    <groupId>org.fireflyframework</groupId>
    <artifactId>fireflyframework-parent</artifactId>
    <version>26.02.06</version>
    <relativePath/>
</parent>

<dependencies>
    <dependency>
        <groupId>org.fireflyframework</groupId>
        <artifactId>fireflyframework-starter-data</artifactId>
        <version>26.02.06</version>
    </dependency>
</dependencies>
```

| Goal | Guide |
|---|---|
| Understand the architecture | Architecture Overview |
| Build a data job microservice | Data Jobs Guide |
| Build a data enricher microservice | Data Enrichers Guide |
| See working code examples | Examples |
| Step-by-step walkthrough | Getting Started |
Orchestrated workflows for batch and async data processing with lifecycle management (start, check, collect, result, stop).
| Topic | Link |
|---|---|
| Complete guide (async and sync) | Data Jobs Guide |
| Overview and concepts | Data Jobs Overview |
Key features:
- Abstract base classes (`AbstractResilientDataJobService`, `AbstractResilientSyncDataJobService`)
- Standardized REST endpoints via `AbstractDataJobController`
- Configurable per-stage timeout enforcement
- Job execution result persistence and audit trails
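Per-stage timeout enforcement can be pictured with a small plain-Java sketch. This is an illustration only, not the starter's API: `JobStage`, `StageTimeouts`, and `StageTimeoutException` are hypothetical names, the stage list simply mirrors the lifecycle named above, and the real base classes are reactive rather than built on `CompletableFuture`.

```java
import java.time.Duration;
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical stage names mirroring the lifecycle described above.
enum JobStage { START, CHECK, COLLECT, RESULT, STOP }

// Hypothetical unchecked exception raised when a stage runs past its limit.
final class StageTimeoutException extends RuntimeException {
    StageTimeoutException(JobStage stage, Duration limit) {
        super(stage + " exceeded " + limit.toMillis() + " ms");
    }
}

// Hypothetical helper, not part of the starter: resolves and enforces per-stage timeouts.
final class StageTimeouts {
    private final Map<JobStage, Duration> overrides = new EnumMap<>(JobStage.class);
    private final Duration defaultTimeout;

    StageTimeouts(Duration defaultTimeout) {
        this.defaultTimeout = defaultTimeout;
    }

    // Registers a stage-specific limit and returns this instance for chaining.
    StageTimeouts override(JobStage stage, Duration timeout) {
        overrides.put(stage, timeout);
        return this;
    }

    // Stage-specific limit if one was registered, otherwise the default.
    Duration timeoutFor(JobStage stage) {
        return overrides.getOrDefault(stage, defaultTimeout);
    }

    // Runs the action on the common pool and stops waiting once the limit passes.
    // Note: the underlying work is not cancelled in this sketch.
    <T> T run(JobStage stage, Supplier<T> action) {
        Duration limit = timeoutFor(stage);
        try {
            return CompletableFuture.supplyAsync(action)
                    .orTimeout(limit.toMillis(), TimeUnit.MILLISECONDS)
                    .join();
        } catch (CompletionException e) {
            if (e.getCause() instanceof TimeoutException) {
                throw new StageTimeoutException(stage, limit);
            }
            throw e;
        }
    }
}

final class StageTimeoutDemo {
    public static void main(String[] args) {
        StageTimeouts timeouts = new StageTimeouts(Duration.ofSeconds(30))
                .override(JobStage.CHECK, Duration.ofSeconds(5));
        String jobId = timeouts.run(JobStage.START, () -> "job-123");
        System.out.println(jobId + " started; CHECK limit = " + timeouts.timeoutFor(JobStage.CHECK));
    }
}
```

The sketch keeps one default limit plus optional per-stage overrides, which is one plausible reading of "configurable per-stage"; see the Data Jobs Guide for the properties the starter actually exposes.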
Third-party provider integration for fetching and enriching data from external sources (credit bureaus, financial data, business intelligence).
| Topic | Link |
|---|---|
| Step-by-step guide | Data Enrichers Guide |
| Overview and concepts | Data Enrichers Overview |
Key features:
- Pluggable `DataEnricher` and `EnricherOperation` framework with tenant isolation
- Fallback chains with primary/secondary provider failover (`@EnricherFallback`)
- Smart enrichment controller with strategy-based routing
- Enrichment discovery controller for runtime operation catalog
- Per-provider resiliency configuration (circuit breaker, retry, rate limiter, bulkhead)
- Enrichment caching with configurable key generation and TTL
- Cost tracking, estimation, and preview/dry-run support
- SSE streaming for real-time batch enrichment results
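The primary/secondary failover idea behind fallback chains can be sketched in plain Java. Everything here is an assumption made for illustration: `Provider`, `FnProvider`, `EnrichmentResult`, and `FallbackChainSketch` are hypothetical types, the code is synchronous, and it does not show how `@EnricherFallback` or `DataEnricher` are actually declared.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical provider abstraction; the real DataEnricher contract may differ.
interface Provider {
    String name();
    Map<String, Object> enrich(String subjectId);
}

// Convenience implementation that wraps a lambda (hypothetical).
record FnProvider(String name, Function<String, Map<String, Object>> fn) implements Provider {
    @Override
    public Map<String, Object> enrich(String subjectId) {
        return fn.apply(subjectId);
    }
}

// Result tagged with the provider that actually answered.
record EnrichmentResult(String provider, Map<String, Object> data) {}

// Tries providers in order and returns the first successful answer.
final class FallbackChainSketch {
    private final List<Provider> providers;

    FallbackChainSketch(List<Provider> providers) {
        this.providers = List.copyOf(providers);
    }

    EnrichmentResult enrich(String subjectId) {
        RuntimeException last = null;
        for (Provider provider : providers) {
            try {
                return new EnrichmentResult(provider.name(), provider.enrich(subjectId));
            } catch (RuntimeException e) {
                last = e; // remember the failure and fall through to the next provider
            }
        }
        throw new IllegalStateException("All providers failed for " + subjectId, last);
    }
}

final class FallbackChainDemo {
    public static void main(String[] args) {
        Provider primary = new FnProvider("primary", id -> {
            throw new IllegalStateException("primary unavailable");
        });
        Provider secondary = new FnProvider("secondary", id -> Map.of("score", 720));
        EnrichmentResult result = new FallbackChainSketch(List.of(primary, secondary)).enrich("customer-1");
        System.out.println(result.provider() + " -> " + result.data());
    }
}
```

Tagging the result with the provider name keeps it visible which source answered, which matters once cost tracking and lineage are involved.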
Rule-based validation engine with configurable severity levels and evaluation strategies.
| Topic | Link |
|---|---|
| Framework guide | Data Quality |
Key features:
- Fail-fast and collect-all evaluation strategies
- Built-in rules: null checks, range validation, pattern matching, custom logic
- Quality gate integration for enrichment pipelines
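The difference between the fail-fast and collect-all strategies is easiest to see in code. The following is a minimal sketch under stated assumptions: `Severity`, `Strategy`, `Rule`, `Violation`, and `QualityEngineSketch` are hypothetical names, and the severity levels shown are invented for the example rather than taken from the starter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical severity levels and evaluation strategies.
enum Severity { INFO, WARNING, CRITICAL }
enum Strategy { FAIL_FAST, COLLECT_ALL }

// A rule passes when its predicate returns true for the row.
record Rule(String name, Severity severity, Predicate<Map<String, Object>> check) {}

record Violation(String rule, Severity severity) {}

final class QualityEngineSketch {
    // FAIL_FAST stops at the first violated rule; COLLECT_ALL evaluates every rule.
    static List<Violation> evaluate(Map<String, Object> row, List<Rule> rules, Strategy strategy) {
        List<Violation> violations = new ArrayList<>();
        for (Rule rule : rules) {
            if (!rule.check().test(row)) {
                violations.add(new Violation(rule.name(), rule.severity()));
                if (strategy == Strategy.FAIL_FAST) {
                    break;
                }
            }
        }
        return violations;
    }
}

final class QualityEngineDemo {
    public static void main(String[] args) {
        List<Rule> rules = List.of(
                new Rule("name-not-null", Severity.CRITICAL, row -> row.get("name") != null),
                new Rule("age-in-range", Severity.WARNING,
                        row -> row.get("age") instanceof Integer age && age >= 0 && age <= 120));
        Map<String, Object> row = Map.of("age", 150); // no name, age out of range
        System.out.println(QualityEngineSketch.evaluate(row, rules, Strategy.FAIL_FAST));
        System.out.println(QualityEngineSketch.evaluate(row, rules, Strategy.COLLECT_ALL));
    }
}
```

Fail-fast suits gates where the first critical problem should block the pipeline; collect-all suits reporting, where a full list of violations is more useful.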
Provenance tracking across enrichment operations and transformation pipelines.
| Topic | Link |
|---|---|
| Tracking guide | Data Lineage |
Key features:
- Automatic lineage recording across enrichment operations
- Pluggable `LineageTracker` with in-memory default implementation
- Records with source, destination, timestamp, and metadata
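A lineage record with the four fields listed above, plus an in-memory tracker, might look roughly like this. The names `LineageRecord`, `LineageTrackerSketch`, and `InMemoryLineageTrackerSketch` and their method signatures are assumptions for illustration; the starter's `LineageTracker` interface is reactive and may expose different operations.

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;

// One hop of provenance: where data came from, where it went, when, and extra context.
record LineageRecord(String source, String destination, Instant timestamp, Map<String, String> metadata) {}

// Hypothetical tracker contract; swap the implementation to persist records elsewhere.
interface LineageTrackerSketch {
    void track(LineageRecord entry);
    List<LineageRecord> findByDestination(String destination);
}

// Thread-safe in-memory implementation, suitable for tests and local runs only.
final class InMemoryLineageTrackerSketch implements LineageTrackerSketch {
    private final List<LineageRecord> entries = new CopyOnWriteArrayList<>();

    @Override
    public void track(LineageRecord entry) {
        entries.add(entry);
    }

    @Override
    public List<LineageRecord> findByDestination(String destination) {
        return entries.stream()
                .filter(entry -> entry.destination().equals(destination))
                .toList();
    }
}

final class LineageDemo {
    public static void main(String[] args) {
        LineageTrackerSketch tracker = new InMemoryLineageTrackerSketch();
        tracker.track(new LineageRecord("credit-bureau", "customer-profile",
                Instant.now(), Map.of("operation", "credit-score")));
        System.out.println(tracker.findByDestination("customer-profile"));
    }
}
```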
Composable, reactive transformation chains for post-enrichment data processing.
| Topic | Link |
|---|---|
| Transformation guide | Data Transformation |
Key features:
- Reactive `DataTransformer` interface with `TransformationChain` composition
- Built-in transformers: `FieldMappingTransformer`, `ComputedFieldTransformer`
- Custom transformer support via functional interface
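Chain composition can be illustrated with a synchronous, plain-Java analogue. Note the assumptions: the starter's `DataTransformer` is reactive, while `Transformer` and `ChainSketch` below are hypothetical and operate on plain maps; `renameField` and `computedField` only imitate what a field-mapping or computed-field transformer might do.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical functional interface: one transformation step over a row.
@FunctionalInterface
interface Transformer {
    Map<String, Object> apply(Map<String, Object> input);
}

final class ChainSketch {
    // Composes steps so that each one receives the previous step's output.
    static Transformer chain(List<Transformer> steps) {
        return input -> {
            Map<String, Object> current = input;
            for (Transformer step : steps) {
                current = step.apply(current);
            }
            return current;
        };
    }

    // Renames a field if present; the input map is never mutated.
    static Transformer renameField(String from, String to) {
        return input -> {
            Map<String, Object> out = new HashMap<>(input);
            if (out.containsKey(from)) {
                out.put(to, out.remove(from));
            }
            return out;
        };
    }

    // Adds a field whose value is derived from the current row.
    static Transformer computedField(String name, Function<Map<String, Object>, Object> fn) {
        return input -> {
            Map<String, Object> out = new HashMap<>(input);
            out.put(name, fn.apply(input));
            return out;
        };
    }
}

final class ChainDemo {
    public static void main(String[] args) {
        Transformer pipeline = ChainSketch.chain(List.of(
                ChainSketch.renameField("first_name", "firstName"),
                ChainSketch.computedField("fullName",
                        row -> row.get("firstName") + " " + row.get("last_name"))));
        System.out.println(pipeline.apply(Map.of("first_name", "Ada", "last_name", "Lovelace")));
    }
}
```

Because each step copies its input, steps stay independent and can be reordered or reused across chains.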
Fault tolerance patterns applied automatically and configurable per provider.
| Topic | Link |
|---|---|
| Patterns and configuration | Resiliency |
Includes circuit breaker, retry with exponential backoff, rate limiting, and bulkhead isolation. Supports global defaults and per-provider overrides.
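To make "global defaults and per-provider overrides" and "retry with exponential backoff" concrete, here is a small sketch. It is not the starter's configuration model: `RetryPolicy`, `ResiliencySettingsSketch`, the provider key `credit-bureau`, and all numeric values are hypothetical.

```java
import java.time.Duration;
import java.util.Map;

// Hypothetical retry policy with capped exponential backoff.
record RetryPolicy(int maxAttempts, Duration initialDelay, double multiplier, Duration maxDelay) {
    // Delay before the given retry (1-based): initialDelay * multiplier^(retry - 1), capped at maxDelay.
    Duration delayBefore(int retry) {
        double millis = initialDelay.toMillis() * Math.pow(multiplier, retry - 1);
        return Duration.ofMillis((long) Math.min(millis, maxDelay.toMillis()));
    }
}

// Resolves the policy for a provider: its own override if present, otherwise the global default.
final class ResiliencySettingsSketch {
    private final RetryPolicy globalDefault;
    private final Map<String, RetryPolicy> perProvider;

    ResiliencySettingsSketch(RetryPolicy globalDefault, Map<String, RetryPolicy> perProvider) {
        this.globalDefault = globalDefault;
        this.perProvider = Map.copyOf(perProvider);
    }

    RetryPolicy forProvider(String provider) {
        return perProvider.getOrDefault(provider, globalDefault);
    }
}

final class ResiliencyDemo {
    public static void main(String[] args) {
        RetryPolicy global = new RetryPolicy(3, Duration.ofMillis(100), 2.0, Duration.ofMillis(500));
        RetryPolicy bureau = new RetryPolicy(5, Duration.ofMillis(250), 2.0, Duration.ofSeconds(5));
        ResiliencySettingsSketch settings =
                new ResiliencySettingsSketch(global, Map.of("credit-bureau", bureau));

        RetryPolicy policy = settings.forProvider("credit-bureau");
        for (int retry = 1; retry < policy.maxAttempts(); retry++) {
            System.out.println("retry " + retry + " after " + policy.delayBefore(retry).toMillis() + " ms");
        }
    }
}
```

Capping the delay keeps a long retry sequence from waiting indefinitely; the Resiliency guide documents the actual property names and defaults.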
Monitoring, metrics, distributed tracing, and health checks.
| Topic | Link |
|---|---|
| Observability guide | Observability |
| Logging guide | Logging |
Includes Micrometer integration with OpenTelemetry, automatic metrics for all operations, health check endpoints, and structured JSON logging.
Job execution results and audit trail storage.
| Topic | Link |
|---|---|
| Persistence guide | Persistence |
Automatic event publishing for job and enrichment lifecycle events.
| Topic | Link |
|---|---|
| Architecture overview | Architecture |
| Configuration | Configuration |
Includes CQRS integration, EDA auto-configuration, and orchestration engine support (Saga, TCC, Workflow).
| Document | Description |
|---|---|
| Architecture | Hexagonal architecture, design patterns, component diagram |
| Configuration | All configuration properties with defaults and examples |
| API Reference | REST endpoint specifications for all controllers |
| MapStruct Mappers | Data mapping conventions and mapper configuration |
| Testing | Testing strategies, utilities, and examples |
| Examples | Real-world usage patterns and recipes |
Python bridge package (fireflyframework-genai-data) for native integration with fireflyframework-genai.
| Topic | Link |
|---|---|
| Integration guide | GenAI Bridge |
Provides:
- `DataStarterClient` for HTTP communication with Java data services
- Agent tools (`DataEnrichmentTool`, `DataJobTool`, `DataOperationsTool`) and `DataToolKit`
- Pipeline steps (`EnrichmentStep`, `QualityGateStep`) for GenAI pipelines
- `DataLineageMiddleware` for automatic lineage tracking in agent runs
- Pre-built agent template (`create_data_analyst_agent`)
```
+------------------------------------------------------------------+
|                         Your Application                         |
|                                                                  |
|   +----------------+   +----------------+   +----------------+   |
|   | Data Jobs      |   | Data Enrichers |   | Quality &      |   |
|   |                |   |                |   | Lineage        |   |
|   | - Async        |   | - Credit       |   | - Rules        |   |
|   | - Sync         |   | - Company      |   | - Tracking     |   |
|   +-------+--------+   +-------+--------+   +-------+--------+   |
|           |                    |                    |            |
|   +-------+--------------------+--------------------+--------+   |
|   |           fireflyframework-starter-data (Core)           |   |
|   |                                                          |   |
|   |  - Abstract base classes      - Fallback chains          |   |
|   |  - Observability (automatic)  - Cost tracking            |   |
|   |  - Resiliency (per-provider)  - Transformation chains    |   |
|   |  - Event publishing           - Preview & SSE            |   |
|   +-------+--------------------+--------------------+--------+   |
|           |                    |                    |            |
|   +-------+--------+   +-------+--------+   +-------+--------+   |
|   | Orchestrators  |   | Providers      |   | GenAI Bridge   |   |
|   |                |   |                |   |                |   |
|   | - Airflow      |   | - REST APIs    |   | - Tools        |   |
|   | - AWS SF       |   | - SOAP APIs    |   | - Steps        |   |
|   | - Mock         |   | - gRPC APIs    |   | - Agents       |   |
|   +----------------+   +----------------+   +----------------+   |
+------------------------------------------------------------------+
```

See Architecture for detailed design patterns and component documentation.
Copyright 2024-2026 Firefly Software Foundation. All rights reserved.