Forest

Reflecting on My Journey with a Custom Databricks Framework

5 min read · Programming

For the past two years, I dedicated myself to developing and maintaining a custom data framework for Databricks. This framework was a significant part of my professional journey, designed to solve real-world challenges in standardizing ETL pipelines, reducing development effort, and simplifying processes through metadata-driven configurations.

However, I recently left the company where this framework was implemented, and I've stepped away from its architecture and design. Despite this, my thoughts keep returning to the framework — its successes, challenges, and the unanswered questions about its future. Here, I'd like to reflect on my experience and share some considerations for frameworks like this in a rapidly evolving tech landscape.


The Framework's Purpose and Accomplishments

When I first conceptualized the framework, my goals were ambitious but clear:

  • Standardization Across Teams:
    Developers often have unique coding styles, which can lead to inconsistencies. The framework aimed to create a unified approach to writing ETL pipelines, ensuring cleaner, more maintainable code.

  • Streamlining Development:
    By offering reusable functions and transformations, the framework reduced the amount of code engineers had to write, saving valuable time.

  • Simplifying Complexity for End-Users:
    A metadata-driven design allowed users to replace complex scripting with YAML configuration files, making the framework accessible to those with limited programming expertise.

These objectives resonated with the teams using it, and the framework was successfully adopted by over 20 clients. It delivered measurable value, making processes more efficient and reducing the workload for developers.
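To make the metadata-driven idea concrete, here is a minimal sketch of how a YAML-configured pipeline step might dispatch to reusable transformations. The YAML schema, operation names, and dispatcher are hypothetical illustrations, not the framework's actual design; in practice the file would be parsed with something like `yaml.safe_load` and the transformations would target Spark DataFrames, but plain Python dicts are used here to keep the example self-contained.

```python
# Hypothetical sketch of a metadata-driven pipeline step.
# A YAML file like the one below would normally be parsed
# (e.g. with yaml.safe_load); its dict equivalent is inlined here.
#
# pipeline:
#   source: raw_orders
#   steps:
#     - op: rename
#       args: {from: "ord_id", to: "order_id"}
#     - op: filter
#       args: {column: "status", equals: "shipped"}

config = {
    "source": "raw_orders",
    "steps": [
        {"op": "rename", "args": {"from": "ord_id", "to": "order_id"}},
        {"op": "filter", "args": {"column": "status", "equals": "shipped"}},
    ],
}

# Registry mapping op names to reusable transformations
# (a real implementation would operate on Spark DataFrames).
def rename(rows, args):
    return [{(args["to"] if k == args["from"] else k): v for k, v in r.items()}
            for r in rows]

def filter_rows(rows, args):
    return [r for r in rows if r.get(args["column"]) == args["equals"]]

OPS = {"rename": rename, "filter": filter_rows}

def run_pipeline(rows, config):
    """Apply each configured step in order — the user writes YAML, not code."""
    for step in config["steps"]:
        rows = OPS[step["op"]](rows, step["args"])
    return rows

data = [
    {"ord_id": 1, "status": "shipped"},
    {"ord_id": 2, "status": "pending"},
]
print(run_pipeline(data, config))  # [{'order_id': 1, 'status': 'shipped'}]
```

The appeal for non-programmers is that adding a pipeline means editing a declarative file, while the (admittedly growing) registry of operations stays inside the framework.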


Challenges That Persist

Despite its success, the framework had limitations that became more apparent as Databricks evolved:

  • Technological Stagnation:
    The framework was built on older versions of Databricks and lacked support for newer features like Delta Live Tables, Unity Catalog, Serverless Compute, and Liquid Clustering.

  • Compatibility Gaps:
    Its reliance on legacy architectural patterns, such as DBFS mount points and pre-Unity Catalog table management, made upgrades challenging and the framework less attractive to new clients.

  • Technical Debt:
    Over time, the codebase grew to over 50,000 lines. A lack of documentation and delayed test case implementation led to maintenance challenges and increased the risk of errors.

  • Client Hesitation:
    As Databricks introduced powerful native features, many potential users questioned whether a custom framework was necessary.

  • Scalability and Adaptability:
    The framework struggled to adapt to the pace of Databricks' innovations, limiting its relevance in a rapidly changing ecosystem.


Looking Forward: Key Questions

Even though I am no longer directly involved in the framework/accelerator, I still find myself pondering its future. Here are some key questions I've been reflecting on:

  1. Is Custom Development Still Relevant?
    In a world where platforms like Databricks are evolving rapidly, is there still a place for custom frameworks, or should teams rely on native capabilities?

  2. What About Balance?
    How do we balance the benefits of a custom solution with the cost of maintaining it, especially when it becomes outdated?

  3. Should We Think Modular?
    Could smaller, modular components tailored to specific use cases be a better alternative to large, monolithic frameworks?

I believe the accelerator framework remains relevant. Even if teams are using older technologies, they still face the challenge of upgrading. The framework acts as a middle tier, abstracting complexities so clients don't need to make significant changes — in our case, they can continue to rely on YAML configurations. It's the framework itself that evolves to keep up with technological advancements. The key lies in improving the framework's design to better serve as this adaptable middle layer.
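The "adaptable middle layer" idea can be illustrated with a toy resolver: the client's YAML keeps naming a logical table, and only the framework's internal resolution changes as the platform evolves. The path and namespace conventions below are assumptions for illustration, not the framework's actual behavior.

```python
# Hypothetical sketch: the client-facing config never changes;
# the framework resolves logical names against whichever backend
# generation is configured. Conventions shown are illustrative.

def resolve_table(logical_name: str, backend: str) -> str:
    if backend == "legacy":
        # Old pattern: DBFS mount point with Hive-style database/table layout
        return f"/mnt/datalake/{logical_name.replace('.', '/')}"
    if backend == "unity_catalog":
        # New pattern: three-level catalog.schema.table namespace
        return f"main.{logical_name}"
    raise ValueError(f"unknown backend: {backend}")

# The client's YAML still just says `table: sales.orders`;
# upgrading to Unity Catalog is a framework change, not a client change.
print(resolve_table("sales.orders", "legacy"))         # /mnt/datalake/sales/orders
print(resolve_table("sales.orders", "unity_catalog"))  # main.sales.orders
```

This is the essence of the argument: the abstraction pays for itself precisely when the platform underneath it moves.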


Future Considerations for Framework Development

If I were to revisit this project — or any similar endeavor — here's what I would prioritize:

  1. Modular Design Principles:
    Separate core functions, extensibility features, and utility tools into distinct components. This approach would make the framework easier to maintain and adapt.

  2. Modern Engineering Practices:

    • Start with a clear design process.
    • Integrate robust testing from the outset.
    • Ensure thorough documentation to reduce technical debt.

  3. Continuous Learning and Adaptation:
    Frameworks must evolve alongside the platforms they support. Staying updated on the latest technologies is essential to maintain relevance.

  4. Smaller, Targeted Frameworks:
    Instead of building a one-size-fits-all solution, focus on creating smaller tools tailored to specific needs.
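One way to realize the modular principle above is a small core that defines only the transformation contract, with everything else registering as an optional plug-in. The registry and decorator below are a hypothetical sketch of that layout, not code from the framework itself.

```python
# Hypothetical sketch of a modular layout: a minimal core defines the
# contract and a registry; extension packages plug in via a decorator
# without the core ever importing them.
from typing import Callable, Dict, List

TRANSFORMS: Dict[str, Callable[[List[dict]], List[dict]]] = {}

def register(name: str):
    """Decorator used by extension modules to plug into the core."""
    def wrap(fn):
        TRANSFORMS[name] = fn
        return fn
    return wrap

# --- an "extension" like this would live in its own package ---
@register("dedupe")
def dedupe(rows):
    seen, out = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

rows = [{"id": 1}, {"id": 1}, {"id": 2}]
print(TRANSFORMS["dedupe"](rows))  # [{'id': 1}, {'id': 2}]
```

With this shape, retiring or rewriting one component (say, when a native platform feature supersedes it) does not force a release of the whole monolith.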


Closing Thoughts

Leaving the company and stepping away from the framework has given me the opportunity to reflect on its journey and its place in the broader tech ecosystem. Frameworks like this have the potential to transform workflows and save time, but they also carry significant risks if not designed and maintained thoughtfully.

As I look to the future, I'm eager to explore new ideas, frameworks, and tools that align with evolving technologies. The lessons learned from this experience will undoubtedly shape my approach to any similar challenges ahead.

The next steps in this conversation will delve into critical areas such as code sharing, open source considerations, balancing the needs of open source with business objectives, and fostering a community around shared tools and knowledge. These topics hold the key to unlocking the full potential of frameworks like this and ensuring their sustainability.

For those facing similar crossroads, I'd love to hear your thoughts. How do you balance custom development with native capabilities? What strategies have you found effective for managing technical debt and ensuring adaptability?

Let's continue the conversation and build better solutions together.