Media

AI Summit 2024

Today, December 11, 2024, I had the honor of participating in a panel at the AI Summit New York, discussing "LLM Application Solution Lifecycle: Development, Validation, and Implementation."

The panel addressed the complexities involved in the development, validation, and deployment of LLM and RAG applications. One of the key topics was the importance of structuring data effectively before indexing, focusing on encoding, chunking, and embedding. We explored how aligning prompts with embedded document structures plays a critical role in enhancing model performance and ensuring more accurate and efficient retrieval.
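
To make the chunking and embedding discussion concrete, here is a minimal sketch, not taken from the panel, of structure-aware chunking followed by embedding before indexing. It assumes the sentence-transformers library; the heading-based splitter, chunk size, and sample document are illustrative placeholders.

```python
# Minimal sketch: structure-aware chunking and embedding before indexing.
# Assumes the sentence-transformers package; the document, heading delimiter,
# and chunk size are purely illustrative.
from sentence_transformers import SentenceTransformer


def chunk_by_heading(text: str, max_words: int = 200) -> list[str]:
    """Split a markdown-style document on level-2 headings, then cap chunk length."""
    chunks: list[str] = []
    for section in text.split("\n## "):
        words = section.split()
        for start in range(0, len(words), max_words):
            chunk = " ".join(words[start:start + max_words]).strip()
            if chunk:
                chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    document = "## Section A\nFirst topic ...\n\n## Section B\nSecond topic ..."
    chunks = chunk_by_heading(document)

    # Encode each chunk into a dense vector for the retrieval index; at query
    # time, prompts are matched against these chunk-level embeddings.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(chunks)  # shape: (num_chunks, embedding_dim)
    print(len(chunks), embeddings.shape)
```

Keeping chunk boundaries aligned with the document's own structure is what lets retrieval return coherent sections rather than arbitrary slices of text.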

...

DBT Meetup Nov 2024

On November 21st, 2024, I spoke at the DBT Meetup in New York. The focus was streamlining the integration between dbt and Airflow.

At BMG, enabling Analytics Engineers to schedule their dbt models efficiently has been a key focus. As the Data Platform lead, I’ve worked on reducing dependencies between Analytics and Data Engineering teams while maintaining a centralized approach.

Our first approach was to use Astronomer Cosmos, a tool that simplifies rendering dbt DAGs in Airflow. While it eased the development process, challenges such as long DAG-bag load times emerged. To overcome this, we transitioned to offline DAG rendering, boosting scalability and performance by decoupling the dbt and Airflow dependencies.
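
As a rough illustration of what offline rendering can look like, the sketch below builds Airflow tasks from a pre-compiled dbt manifest.json, so no dbt invocation is needed when the DAG bag loads. The paths, selectors, and one-BashOperator-per-model layout are assumptions for the example, not the exact setup we run at BMG.

```python
# Minimal sketch of offline DAG rendering: build Airflow tasks from a
# pre-compiled dbt manifest.json instead of invoking dbt at DAG-parse time.
# The manifest path and per-model BashOperator layout are illustrative.
import json
from pathlib import Path

import pendulum
from airflow import DAG
from airflow.operators.bash import BashOperator

# manifest.json generated ahead of time, e.g. by `dbt compile` in CI
MANIFEST_PATH = Path("/opt/airflow/dags/dbt/manifest.json")

manifest = json.loads(MANIFEST_PATH.read_text())
models = {
    node_id: node
    for node_id, node in manifest["nodes"].items()
    if node["resource_type"] == "model"
}

with DAG(
    dag_id="dbt_offline_rendered",
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    schedule="@daily",
    catchup=False,
) as dag:
    # One task per dbt model.
    tasks = {
        node_id: BashOperator(
            task_id=node["name"],
            bash_command=f"dbt run --select {node['name']}",
        )
        for node_id, node in models.items()
    }

    # Wire task dependencies from the manifest's dependency graph.
    for node_id, node in models.items():
        for upstream_id in node["depends_on"]["nodes"]:
            if upstream_id in tasks:
                tasks[upstream_id] >> tasks[node_id]
```

Because only a JSON file is parsed at load time, DAG-bag parsing stays fast regardless of how large the dbt project grows.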

...

AI Summit 2023

On December 6th, 2023, I participated in a panel discussion at the AI Summit alongside Joel Beckerman and Josephine Hua Pan under the title "Dawn of AI: The End of Human in the Entertainment Industry", focused on the impact of AI on the entertainment industry.

We discussed how AI is a valuable tool for enhancing creative processes rather than replacing human creators, serving to generate new forms of content and streamline workflows.

...

Running Airflow

I had the honor of contributing to the Astronomer Blog on Medium with an article on Running a Multi-Tenant Airflow Cluster.

The post explores how we leverage Apache Airflow in a multi-tenant setup to orchestrate diverse workloads like royalty processing, financial analytics, and marketing insights. By utilizing Google Cloud Composer, we balance cost, stability, and operational overhead across multiple teams, ensuring seamless data workflows. Key practices include workload isolation, use of short-lived credentials, atomic tasks, and cost allocation through Kubernetes namespaces. The post also covers my thoughts on continuous deployment, DAG testing, and challenges around maintaining and upgrading Airflow environments.
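
As a rough illustration of per-team isolation and namespace-based cost allocation, the sketch below runs a tenant's task in its own Kubernetes namespace via KubernetesPodOperator. The team name, image, service account, and namespace naming scheme are hypothetical and not taken from the post.

```python
# Minimal sketch of per-team workload isolation: each team's tasks run in
# their own Kubernetes namespace so resource usage and cost can be attributed.
# Namespace, image, and service account names are illustrative assumptions.
import pendulum
from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

TEAM = "royalties"  # hypothetical tenant; one namespace per team

with DAG(
    dag_id=f"{TEAM}_nightly_export",
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    schedule="@daily",
    catchup=False,
) as dag:
    export = KubernetesPodOperator(
        task_id="export_statements",
        namespace=f"airflow-{TEAM}",            # cost allocation via namespace
        service_account_name=f"{TEAM}-runner",  # scoped, short-lived credentials
        image="example-registry/jobs/exporter:latest",
        cmds=["python", "-m", "exporter.run"],
        get_logs=True,
        is_delete_operator_pod=True,            # keep tasks atomic and stateless
    )
```

Routing each team's work through its own namespace and service account keeps tenants isolated while letting Kubernetes-level metrics drive per-team cost reporting.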

...
Copyright (c) 2025 Nico Hein