agile-lab-dev/witboost-databricks-dab-python-tech-adapter

witboost

Designed by Agile Lab, Witboost is a versatile platform that addresses a wide range of sophisticated data engineering challenges. It enables businesses to discover, enhance, and productize their data, fostering the creation of automated data platforms that adhere to the highest standards of data governance. Want to know more about Witboost? Check it out here or contact us!

This repository is part of our Starter Kit meant to showcase Witboost integration capabilities and provide a "batteries-included" product.

Databricks DAB Python Tech Adapter

Overview

This project implements a Tech Adapter that validates, provisions, and unprovisions Databricks resources using Databricks Asset Bundles (DAB) within Witboost deployment workflows.

What's a Tech Adapter?

A Tech Adapter is a microservice in charge of deploying components that use a specific technology. When the deployment of a Data Product is triggered, the platform generates its descriptor and orchestrates the deployment of every component contained in the Data Product. For each component, the platform knows which Tech Adapter is responsible for its deployment and sends it a provisioning request containing the descriptor. The Tech Adapter then performs whatever operations are required to fulfill the request and reports the outcome back to the platform.
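The request/outcome flow described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not this project's API: `ProvisioningRequest`, `Outcome`, `Status`, `ADAPTERS`, `dispatch`, and the `"databricks-dab"` key are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Status(str, Enum):
    """Hypothetical outcome states reported back to the platform."""
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


@dataclass(frozen=True)
class ProvisioningRequest:
    descriptor: str    # the Data Product descriptor generated by the platform
    component_id: str  # hypothetical: the component this request targets
    technology: str    # hypothetical: key used to pick the responsible adapter


@dataclass(frozen=True)
class Outcome:
    status: Status
    message: str


Adapter = Callable[[ProvisioningRequest], Outcome]


def databricks_dab_adapter(request: ProvisioningRequest) -> Outcome:
    """Stand-in for a real adapter: it only checks that a descriptor was sent."""
    if not request.descriptor.strip():
        return Outcome(Status.FAILED, "empty descriptor")
    return Outcome(Status.COMPLETED, f"deployed {request.component_id}")


# Hypothetical registry: the platform knows which adapter handles which technology.
ADAPTERS: Dict[str, Adapter] = {"databricks-dab": databricks_dab_adapter}


def dispatch(request: ProvisioningRequest) -> Outcome:
    """Route a request to the responsible adapter and return its outcome."""
    adapter = ADAPTERS.get(request.technology)
    if adapter is None:
        return Outcome(Status.FAILED, f"no Tech Adapter for {request.technology}")
    return adapter(request)
```

In the real system the platform and the Tech Adapter are separate services talking over HTTP; the in-process registry here only stands in for that routing.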

You can learn more about how the Tech Adapters fit in the broader picture here.

What's a DAB?

Databricks Asset Bundles (DABs) let you keep metadata alongside your project's source files and describe Databricks resources, such as jobs and pipelines, as source files. A bundle is ultimately an end-to-end definition of a project, including how the project should be structured, tested, and deployed. This makes it easier to collaborate on projects during active development.

To learn more, see the Databricks Asset Bundles documentation.
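To make "resources as source files" more concrete, the sketch below builds the rough shape of a bundle definition as a Python dictionary and prints it as JSON for inspection. Bundles are normally written as YAML in a `databricks.yml` file; the dictionary form is used here only to stay in this project's language. The bundle name, job key, notebook path, and workspace host are made-up placeholders, and the structure is illustrative rather than a complete or validated bundle.

```python
import json

# Illustrative only: placeholder values, not a complete or validated bundle.
bundle_definition = {
    "bundle": {"name": "example_bundle"},  # hypothetical bundle name
    "resources": {
        "jobs": {
            "example_job": {  # hypothetical job key
                "name": "example_job",
                "tasks": [
                    {
                        "task_key": "run_notebook",
                        "notebook_task": {
                            # hypothetical path inside the project sources
                            "notebook_path": "./src/example_notebook.py"
                        },
                    }
                ],
            }
        }
    },
    "targets": {
        "dev": {
            # placeholder host, replace with a real workspace URL
            "workspace": {"host": "https://example.cloud.databricks.com"}
        }
    },
}


def job_names(definition: dict) -> list:
    """Return the keys of the jobs declared in a bundle-like dictionary."""
    return sorted(definition.get("resources", {}).get("jobs", {}))


print(json.dumps(bundle_definition, indent=2))
```

The point of the example is that a job, its tasks, and its deployment targets are plain declarative data that can live in version control next to the code they run.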

Software stack

This microservice is written in Python 3.11, using FastAPI for the HTTP layer. The project is built with uv and can be packaged as a wheel and as a Docker image, which makes it well suited to Kubernetes deployments (the preferred option).

Building

Requirements:

Installing

uv sync

Type check: handled by mypy:

uv run mypy src/

Tests: handled by pytest:

uv run pytest --cov=src/ tests/. --cov-report=xml

Integration tests can also be run once the following environment variables are set:

export GITLAB_TOKEN=<personal gitlab token>
export DATABRICKS_HOST=<databricks-host>
export DATABRICKS_CLIENT_ID=<databricks client id>
export DATABRICKS_CLIENT_SECRET=<databricks client secret>
uv run pytest . --runit
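Since the integration run above depends on four environment variables, a small pre-flight check can save a confusing failure. The helper below is a minimal sketch and is not part of this repository; `REQUIRED_VARS` and `missing_vars` are names introduced here for illustration.

```python
import os
from typing import List, Mapping

# The variables exported before running the integration tests.
REQUIRED_VARS = (
    "GITLAB_TOKEN",
    "DATABRICKS_HOST",
    "DATABRICKS_CLIENT_ID",
    "DATABRICKS_CLIENT_SECRET",
)


def missing_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_vars()` before launching the tests returns an empty list when everything is configured, and otherwise names exactly what is still missing.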

Artifacts & Docker image: the project uses uv for packaging. Build the package with:

uv build 

The Docker image can be built with:

docker build .

More details can be found here.

Note: the project version is computed automatically from Git information, using the branch name and tags. Unless you are on a release branch (such as 1.2.x) or a tag (such as v1.2.3), the version will be 0.0.0. You can follow this branch/tag convention or update the version computation to match your preferred strategy.
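The convention in the note can be sketched as a small function. This is not the project's actual version computation: the regular expressions, the function name `compute_version`, and in particular the `.dev0` suffix returned for release branches are assumptions made for illustration, since the note only states the 0.0.0 fallback.

```python
import re
from typing import Optional

# Assumed patterns: a tag like "v1.2.3" and a branch ending in "1.2.x".
TAG_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")
BRANCH_RE = re.compile(r"(?:^|/)(\d+)\.(\d+)\.x$")


def compute_version(branch: Optional[str], tag: Optional[str]) -> str:
    """Derive a version string from Git metadata, falling back to 0.0.0."""
    if tag:
        tag_match = TAG_RE.match(tag)
        if tag_match:
            # A tag such as v1.2.3 maps directly to 1.2.3.
            return ".".join(tag_match.groups())
    if branch:
        branch_match = BRANCH_RE.search(branch)
        if branch_match:
            major, minor = branch_match.groups()
            # Assumption: release branches yield a development pre-release.
            return f"{major}.{minor}.0.dev0"
    # Any other branch or tag: the fallback described in the note.
    return "0.0.0"
```

For example, `compute_version("main", None)` falls through both checks and returns the `0.0.0` fallback, which is the behaviour to expect on feature branches.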

CI/CD: the pipeline is based on GitLab CI. It's configured by the .gitlab-ci.yaml file in the root of the repository.

Running

To run the server locally, use:

source env/bin/activate # only needed if venv is not already enabled
uvicorn src.main:app --host 127.0.0.1 --port 8091

By default, the server binds to port 8091 on localhost. Once it is up and running, you can send provisioning requests to this address. You can also check the API documentation served here.
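One quick way to see which endpoints the running server exposes is to read its OpenAPI schema. The sketch below assumes the application keeps FastAPI's default schema location (`/openapi.json`), which this README does not confirm; `openapi_url` and `list_api_paths` are helper names introduced here.

```python
import json
from typing import List
from urllib.request import urlopen


def openapi_url(host: str = "127.0.0.1", port: int = 8091) -> str:
    """Build the schema URL, assuming FastAPI's default /openapi.json route."""
    return f"http://{host}:{port}/openapi.json"


def list_api_paths(
    host: str = "127.0.0.1", port: int = 8091, timeout: float = 5.0
) -> List[str]:
    """Fetch the OpenAPI schema from a running server and return its paths."""
    with urlopen(openapi_url(host, port), timeout=timeout) as response:
        schema = json.load(response)
    return sorted(schema.get("paths", {}))
```

With the server started as shown above, calling `list_api_paths()` returns the sorted list of routes declared in the schema; it raises a `URLError` if nothing is listening on that address.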

Deploying

This microservice is meant to be deployed to a Kubernetes cluster with the included Helm chart and the scripts that can be found in the helm subdirectory. You can find more details here.

License

This project is available under the Apache License, Version 2.0; see LICENSE for full details.

About Witboost

Witboost is a cutting-edge Data Experience platform that streamlines complex data projects across various platforms, enabling seamless data production and consumption. This unified approach empowers you to fully utilize your data without platform-specific hurdles, fostering smoother collaboration across teams.

It seamlessly blends business-relevant information, data governance processes, and IT delivery, ensuring technically sound data projects aligned with strategic objectives. Witboost facilitates data-driven decision-making while maintaining data security, ethics, and regulatory compliance.

Moreover, Witboost maximizes data potential through automation, freeing resources for strategic initiatives. Apply your data for growth, innovation and competitive advantage.

Contact us or follow us on:

The Databricks DAB Python Tech Adapter. Part of the Witboost Starter Kit: https://github.com/agile-lab-dev/witboost-starter-kit
