Blog

Understanding Support Patterns

Understanding Support Patterns #

Finding patterns in support cases helps uncover structural issues and bottlenecks in the data platform.

Just after BigQuery released the ML.GENERATE_TEXT function (mid 2024) and made Gemini accessible through BigQuery, I was curious to test this new feature. And what better use case than understanding support patterns and issues with support on the fly.

Thus, I tried these features by analyzing a sample of our internal support tickets.
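As a rough illustration of the call shape involved, the snippet below builds an ML.GENERATE_TEXT query for summarizing tickets. The project, dataset, model, and table names are placeholders, not the ones used in the post:

```python
# Sketch of a BigQuery ML.GENERATE_TEXT call for summarizing support tickets.
# All project, dataset, model, and table names below are placeholders.
def build_summary_query(model: str, table: str) -> str:
    """Return a BigQuery SQL string asking the remote Gemini model to
    summarize each support ticket."""
    return f"""
    SELECT ml_generate_text_llm_result AS summary
    FROM ML.GENERATE_TEXT(
      MODEL `{model}`,
      (
        SELECT CONCAT('Summarize this support ticket: ', ticket_text) AS prompt
        FROM `{table}`
      ),
      STRUCT(0.2 AS temperature, TRUE AS flatten_json_output)
    )
    """

query = build_summary_query("my-project.llm.gemini_model",
                            "my-project.support.tickets")
```

The inner subquery must expose a `prompt` column; `flatten_json_output` unnests the model response into the `ml_generate_text_llm_result` column.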

...

BigQuery Storage Optimization

BigQuery Storage Optimization #

Over time, data easily accumulates. Purging data that is no longer needed (bad data) can save cost and also reduce the carbon footprint of any data warehouse.

In this post, I describe a simple method to identify unused, and therefore potentially obsolete, data at the table level in BigQuery. This method is easy to reproduce and may also help you reduce your BigQuery storage cost.
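The post walks through the full method; as a hedged sketch of the kind of metadata query involved, `INFORMATION_SCHEMA.TABLE_STORAGE` exposes per-table byte counts that can be ranked (project and region below are placeholders):

```python
# Illustrative only: a metadata query of the kind used to rank tables by size.
# The project id and region qualifier are placeholders, not from the post.
def build_table_size_query(project: str, region: str = "region-us") -> str:
    """Return a SQL string ranking tables by logical bytes via
    BigQuery's INFORMATION_SCHEMA.TABLE_STORAGE view."""
    return f"""
    SELECT table_schema, table_name,
           total_logical_bytes / POW(1024, 3) AS logical_gib
    FROM `{project}`.`{region}`.INFORMATION_SCHEMA.TABLE_STORAGE
    ORDER BY total_logical_bytes DESC
    LIMIT 100
    """
```

Joining such size data with last-access information is what turns "large" into "large and unused".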

...

Ingestion with dbt & DuckDB

Streamlined Data Ingestion with dbt and DuckDB #

Efficient file processing is crucial in data engineering. I recently ran a small experiment exploring the integration of dbt (data build tool) with DuckDB (via dbt-duckdb), enhanced by an Excel plugin. This combination appears to be a simple yet powerful framework for local and remote file processing and ingestion tasks.
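To give a sense of how little setup is involved, a minimal dbt-duckdb profile might look like this (profile name, path, and plugin choice are illustrative assumptions, not the exact setup from the experiment):

```yaml
# profiles.yml -- minimal sketch; names and paths are illustrative
ingestion:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: local.duckdb
      plugins:
        - module: excel
```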

Why Combine dbt and DuckDB? #

By leveraging dbt and DuckDB together, we can ensure:

...

GCS Storage Optimization

GCS Storage Optimization #

With data volumes continuously growing, optimizing Google Cloud Storage usage can lead to significant cost savings. To tackle this challenge, I developed a Python utility that helps summarize and analyze the stored data, making it easier to identify large files and folders on GCS.

While identifying the total storage cost of a bucket is relatively straightforward using the GCP billing report, identifying large files and folders within buckets can be a tedious task. This utility helps to quickly identify large blobs (files) and folders.
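The core of such a utility is the aggregation step. A minimal sketch, assuming the `(blob_name, size_bytes)` pairs have already been listed (in practice via `bucket.list_blobs()` from google-cloud-storage):

```python
from collections import defaultdict

# Sketch of the aggregation step: given (blob_name, size_bytes) pairs,
# sum sizes per top-level "folder" prefix to spot the heavy hitters.
def folder_sizes(blobs, depth=1):
    """Aggregate blob sizes by the first `depth` path components,
    largest first."""
    totals = defaultdict(int)
    for name, size in blobs:
        prefix = "/".join(name.split("/")[:depth])
        totals[prefix] += size
    return dict(sorted(totals.items(), key=lambda kv: -kv[1]))

blobs = [
    ("raw/events/2024/01.parquet", 500),
    ("raw/events/2024/02.parquet", 700),
    ("exports/report.csv", 100),
]
print(folder_sizes(blobs))  # {'raw': 1200, 'exports': 100}
```

Increasing `depth` drills further into the folder hierarchy once a heavy top-level prefix is found.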

...

Economic Cooperative Systems

Economic Cooperative Systems #

I am reflecting on the potential of AI and what has to be built.

I am taking a procurement process as the baseline for a cooperative system, seen from the perspective of the buying model:

  1. Identifying Needs: The whole process starts with a need. This need can be identified by a model itself recognizing that it might not be best suited to solve the problem on its own.
  2. Supplier Research and Selection: Reach out to a marketplace / exchange and find potential suppliers via something like a Request for Proposal/Quotation (RFP/RFQ). The offered prices, together with each vendor's self-assessed confidence in its response, are received as bids on the exchange.
  3. Approval / PO Issuance: The buying model can now pick its supplier based on its configured trade-off between price and quality. (There might be more than one quality metric.)
  4. Task Assignment: Assign the task to the selected model.
  5. Receipt and Inspection: The buying model can now assess the quality of the response, or might leave this assessment to the user.
  6. Payment: A payment can be disputed if the response falls below an objective quality threshold. (There might be quality assurance / audit models within the system to ensure orderly conduct of business.) If the objective measures are met, payment is made. Feedback can be given that informs future decisions, blacklisting of models, etc.
  7. Record Keeping and Audit: To accumulate experience, there must be a record of each transaction with all its parameters. This allows for further analysis and auditing.
  8. Performance Review and Relationship Management: The supplier's performance is reviewed, and feedback is provided. This step also involves maintaining and managing the relationship with the supplier for future transactions.
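Step 3 is the decision point of the loop. A toy sketch of supplier selection under a configured price/quality trade-off (all names, weights, and the scoring formula are illustrative assumptions, not a specification):

```python
from dataclasses import dataclass

# Toy sketch of step 3 (supplier selection): score each bid by a configured
# blend of cheapness and vendor confidence. Everything here is illustrative.
@dataclass
class Bid:
    vendor: str
    price: float        # offered price from the exchange
    confidence: float   # vendor's self-assessed confidence, 0..1

def select_supplier(bids, quality_weight=0.5):
    """Pick the bid with the best blend of low price and high confidence."""
    max_price = max(b.price for b in bids)

    def score(b):
        cheapness = 1 - b.price / max_price
        return (1 - quality_weight) * cheapness + quality_weight * b.confidence

    return max(bids, key=score)

bids = [
    Bid("model-a", price=10.0, confidence=0.60),
    Bid("model-b", price=4.0, confidence=0.50),
    Bid("model-c", price=9.0, confidence=0.95),
]
print(select_supplier(bids).vendor)  # model-b
```

Raising `quality_weight` shifts the pick toward high-confidence vendors; a real system would fold in the audit and feedback records from steps 6 and 7.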

It is easy to derive the vendor perspective from the above.

...
Copyright (c) 2025 Nico Hein