Use Case: Fyrefuse as a DataOps Solution for Telco Companies

Written by Fabrizio Rocco | Jun 26, 2024 9:08:59 AM

In this use case, we present Fyrefuse as a DataOps solution for embracing Big Data transformation in modern business environments. Fyrefuse is an end-to-end DataOps platform that provides data management, governance and collaboration tools to automate continuous data delivery.

The client, a mobile and fixed telco operator with customers on four continents, has developed a low-latency Service Delivery Platform (SDP) based on microservices in order to embrace digital transformation. In pursuit of a better customer experience, the data team ingested 24 Tb of data daily from legacy systems and struggled with a bottleneck caused by a lack of reusability and automation. Fyrefuse allowed users to quickly build, schedule and monitor robust, reusable pipelines with zero coding, regaining agility through streamlined processes and standardization.

The use case setup included populating the metadata catalog, configuring a codeless ingestion pipeline (from source to target) and executing the pipeline instance. Our cloud lab runs on a Kubernetes 1.14 cluster.

Population of Data Explorer

The first stage of the use case populates the Data Explorer with metadata from the legacy systems and the SDP. The client’s legacy systems, containing data such as Billing, CRM, Finance, ERP, and more, were connected to Fyrefuse in minutes through a set of ready-to-use connectors. In this case, we used the MongoDB connector for batch data ingestion, and the Kafka and PostgreSQL connectors for streaming data. The ingested data was written to target CSV files.

Once the source and target Data Cores were connected, a standard Fyrefuse template was used to populate the Data Explorer with understandable, enriched metadata.

Data Consumers can explore the metadata in the Data Explorer, but the actual data can be obtained only after a Data Manager approves their data request, and only under proper anonymisation policies. Fyrefuse also lets Data Managers and Data Consumers communicate and collaborate.
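The approval gate described above can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the `DataRequest` class, the `approve` and `fetch` functions, the field names and the hashing policy are hypothetical and do not reflect Fyrefuse's actual API or anonymisation rules.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical names throughout: DataRequest, approve, fetch and the
# hashing policy are illustrative, not Fyrefuse's API.


@dataclass
class DataRequest:
    consumer: str
    dataset: str
    approved: bool = False


def approve(request: DataRequest) -> None:
    """Stand-in for the Data Manager's approval action."""
    request.approved = True


def anonymise(row: dict, sensitive: set) -> dict:
    """Replace sensitive values with a truncated SHA-256 digest."""
    return {
        key: hashlib.sha256(str(value).encode()).hexdigest()[:12]
        if key in sensitive else value
        for key, value in row.items()
    }


def fetch(request: DataRequest, rows: list, sensitive: set) -> list:
    """Return anonymised rows, but only for an approved request."""
    if not request.approved:
        raise PermissionError("data request has not been approved")
    return [anonymise(row, sensitive) for row in rows]
```

Calling `fetch` before `approve` raises `PermissionError`, which mirrors the rule that metadata is browsable but data is released only on approval. Plain hashing is used here only to keep the sketch short; a real policy would likely need something stronger.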

Point-and-Click Pipeline configuration

The second stage is the configuration of a point-and-click ingestion pipeline. The Pipeline designer presents a sequence of configurable jobs, which allows some data transformation steps to be embedded before data is ingested from a source into a target. In this use case, we built a complex pipeline leveraging a Join job.

A pipeline can be created in the Pipeline designer either ad hoc or in association with an approved Data Request. In our case, we created a stream pipeline linked to an approved data request. Fyrefuse allows users to create, reuse, log and monitor pipelines in an automated fashion, leaving more time to focus on innovation rather than manual routines.

How it works:

Once the data is obtained from the source, a first job joins it with data from MongoDB. All successful results are then written to the first CSV output file (the output repository can be changed by configuring the output file).

If the primary Join job finds no matching results, a new Join is executed, this time using data from a legacy DB (the legacy DB is a SQL DB, not MongoDB). All successful results are then written to the second CSV output file (again, the output repository can be changed by configuring the output file).

Finally, if no match is found in either Join, a final output file is produced to report the original ingested data that was not matched.
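The cascade above (primary Join, fallback Join, unmatched report) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the pipeline Fyrefuse generates: the MongoDB collection and the legacy SQL table are replaced by in-memory dictionaries, and `route`, `to_csv` and the `customer_id` and `plan` fields are hypothetical names.

```python
import csv
import io

# Hypothetical, in-memory stand-ins for the MongoDB collection and the
# legacy SQL table; field names are illustrative.


def route(records, primary, legacy, key="customer_id"):
    """Cascade: primary join, then legacy join, then unmatched."""
    first, second, unmatched = [], [], []
    for record in records:
        k = record[key]
        if k in primary:
            first.append({**record, **primary[k]})
        elif k in legacy:
            second.append({**record, **legacy[k]})
        else:
            unmatched.append(record)
    return first, second, unmatched


def to_csv(rows):
    """Serialise rows to CSV text; empty input gives an empty string."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Three incoming records, one per branch of the cascade.
records = [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": 3}]
primary = {1: {"plan": "gold"}}    # stands in for MongoDB
legacy = {2: {"plan": "basic"}}    # stands in for the legacy SQL DB
first, second, unmatched = route(records, primary, legacy)
outputs = [to_csv(first), to_csv(second), to_csv(unmatched)]
```

Each record lands in exactly one of the three outputs, so the unmatched file doubles as a completeness check on the two joins.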

Pipeline Execution

The final step of the use case is the execution of the pipeline previously created.

Every run of the pipeline, which Fyrefuse calls an ‘Instance’, is documented in detail in order to keep track of Data Operations. In Fyrefuse, pipelines are saved as reusable templates and can be automatically scheduled to run on a specific day or on a regular basis.
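One way to picture the relationship between a reusable template, its documented Instances and a recurring schedule is the following Python sketch. It is a simplified model built on assumptions: `PipelineTemplate`, `Instance`, `next_runs` and the fixed-interval schedule are hypothetical and say nothing about how Fyrefuse stores or schedules pipelines internally.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Hypothetical model: PipelineTemplate, Instance and the fixed-interval
# schedule are illustrative, not Fyrefuse's internals.


@dataclass
class Instance:
    template: str
    started_at: datetime
    status: str
    log: list


@dataclass
class PipelineTemplate:
    name: str
    steps: list  # callables applied in order
    history: list = field(default_factory=list)

    def run(self, data, now=None):
        """Execute every step and record the run as an Instance."""
        now = now or datetime.now(timezone.utc)
        log, status = [], "succeeded"
        try:
            for step in self.steps:
                data = step(data)
                log.append(f"{step.__name__}: ok")
        except Exception as exc:
            status = "failed"
            log.append(f"{step.__name__}: {exc}")
        self.history.append(Instance(self.name, now, status, log))
        return data


def next_runs(start, every, count):
    """Planned start times for a recurring schedule."""
    return [start + every * i for i in range(count)]
```

The same template can be run many times, and `history` keeps one entry per run with a per-step log, which is the kind of trace the Instance concept is meant to provide.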

Benefits

  • Each Data Operation can be monitored in detail and comes with logs and information on each stage.
  • Integrate all the stakeholders in the environment, whether analysts, business users or even external data consumers, to improve team collaboration. Keep the team in sync with the data lifecycle.
  • We make sure every Client reaches their vision.

The Product

Fyrefuse has a wide range of features aimed at empowering fast data teams with Data Governance and Data lifecycle management.

User Roles

Fyrefuse orchestrates secure data management by defining different user roles with specific permissions.

Customized user profiles can also be created or edited through a permission management pane that lists over 60 atomised permissions.
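As a rough illustration of roles built from atomised permissions, the Python sketch below models each role as a set of permission strings and derives a custom profile from an existing role. The permission names, the `custom_profile` helper and the contents of each role are assumptions; the article does not list the actual permissions.

```python
# Hypothetical role and permission names; the real list of 60+
# permissions is not given in this article.
ROLES = {
    "data_manager": {
        "request.approve",
        "pipeline.create",
        "pipeline.run",
        "catalog.read",
    },
    "data_consumer": {"catalog.read", "request.create"},
}


def custom_profile(base, grant=(), revoke=()):
    """Derive a custom profile from an existing role."""
    return (set(ROLES[base]) | set(grant)) - set(revoke)


def can(permissions, action):
    """True when the profile holds the given atomic permission."""
    return action in permissions


# A custom profile: a consumer who may run pipelines but not file requests.
auditor = custom_profile(
    "data_consumer", grant={"pipeline.run"}, revoke={"request.create"}
)
```

Keeping permissions atomic means a custom profile is just a different set, so no new role logic is needed when a profile is added or edited.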

Check our article about Team Collaboration on Fyrefuse’s blog.

Data connectors

Fyrefuse connects to different systems and environments through a set of prebuilt or custom-developed connectors designed for applications, databases, file stores and data warehouses.

The image below lists the various data connection methods available in Fyrefuse.

Reusable Pipelines

Fyrefuse allows users to quickly build robust, reusable pipelines with zero coding. This leaves more time to focus on innovation rather than manual routines.

Data pipeline design is codeless and does not require any technical skills, yet it supports the most common data preparation and transformation tasks, which are called Jobs.

Jobs are the data preparation tasks that are executed before delivering data into a target Data Core.

Common jobs include Match, Join, Split, Concatenate and many others.
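To make the idea of Jobs as chained preparation steps more concrete, here is a small Python sketch in which each job is a function over a list of rows. The semantics chosen for Split and Concatenate are assumptions, as are the names `split`, `concatenate` and `run_jobs`; the Fyrefuse jobs of the same name may behave differently.

```python
from functools import reduce

# Hypothetical semantics: Fyrefuse's Split and Concatenate jobs may
# behave differently; these are plain-Python stand-ins.


def split(field, sep, into):
    """Job factory: split one field into several named fields."""
    def job(rows):
        return [
            {
                **{k: v for k, v in row.items() if k != field},
                **dict(zip(into, row[field].split(sep))),
            }
            for row in rows
        ]
    return job


def concatenate(fields, sep, into):
    """Job factory: join several fields into one new field."""
    def job(rows):
        return [
            {**row, into: sep.join(str(row[f]) for f in fields)}
            for row in rows
        ]
    return job


def run_jobs(rows, jobs):
    """Apply each job in order before delivery to the target."""
    return reduce(lambda data, job: job(data), jobs, rows)
```

A call such as `run_jobs(rows, [split("name", " ", ["first", "last"]), concatenate(["last", "country"], "-", "tag")])` chains two jobs, which is roughly what arranging jobs in sequence in a visual designer amounts to.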

Data Exploration

Fyrefuse keeps all metadata in a single place, so data stakeholders are aligned on what data is available, where it comes from, and what it means.

Users have two alternative modes for exploring the metadata contained in the data sources. One is the Data Catalog, which provides a positional exploration; the other is the Business Glossary, which makes exploration easy for non-technical users.

Architecture

Fyrefuse delivers DataOps across Amazon Web Services (AWS), Google Cloud Platform (GCP) and Microsoft Azure, providing the following benefits:

  • One-click Deployment
  • Monitor performance, resources and logs in a single environment.
  • Make high-impact scalability changes fast, safely and simply.

Depending on customer requirements, Fyrefuse can also be deployed using other architectural solutions:

  • On-prem deployment, using hardware available to the customer
  • Multi-cloud deployment, using services from different platforms
  • Hybrid deployment, combining on-prem and cloud-based components

Fyrefuse is divided into a manager and a portal.

The manager is the open source component upon which Fyrefuse is built.

It provides an independent orchestrator and engine for both batch and streaming workloads. It is built with modern technologies such as Scala and Akka, and is deployed on Kubernetes.

Conclusion

Fyrefuse contextualizes data at scale in real-time, enabling data engineers, data stewards and data analysts to make better decisions and improve team collaboration.