The Emerging Role of AI Data Engineers - The…

Jan 15, 2025

Your AI initiatives are only as good as the data powering them—AI Data Engineers make it all possible.

5 Comments

Jan 16, 2025

I think you’re just describing the natural evolution of the Data Engineering role. The field has dealt with unstructured data for a while now; what’s newer is using LLMs to parse and document it. To me, that’s just the use of a new tool, in the same way Airflow is more recent than cron.

In my experience, a lot of data engineering roles want or require experience with unstructured data or NoSQL. Even if you aren’t working with video at Netflix’s scale, dealing with PDFs is fundamentally similar (though clearly not the same).

Ananth Packkildurai

Jan 16, 2025

The skill set differs slightly from traditional data engineering (SQL-centric frameworks like dbt, etc.). Unstructured data processing requires unique skills, like an understanding of concurrent programming and chunking techniques, which are uncommon in SQL-centric data pipelines.
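
For readers unfamiliar with the two terms, here is a minimal sketch of what chunking plus concurrent processing can look like for a text document. It is an illustration only, not anything from the post: `chunk_text`, `process_chunk`, and `process_document` are hypothetical names, the sizes are arbitrary, and `process_chunk` stands in for whatever I/O-bound call (an embedding or LLM request, say) a real pipeline would make.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far each window advances
    return [text[i:i + size] for i in range(0, len(text), step)]


def process_chunk(chunk: str) -> int:
    # Hypothetical placeholder for an I/O-bound call; here it just counts words.
    return len(chunk.split())


def process_document(text: str, workers: int = 4) -> list[int]:
    """Chunk a document and process the chunks on a thread pool."""
    chunks = chunk_text(text)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        return list(pool.map(process_chunk, chunks))
```

Threads suit this sketch because the assumed per-chunk work is I/O-bound; CPU-heavy parsing would more likely call for a process pool.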

Chris Kornaros

Jan 16, 2025

I mean, traditional data engineering is incredibly broad; you seem to be describing just ETL for structured data. Data engineering has always included ways to handle unstructured data (even if it wasn’t as common or easy as it is now).

It feels a bit disingenuous to slap AI on the job title and then post that this is all brand new. For example, multithreading/concurrency and chunking are both common techniques in data engineering for structured data. In no way are they unique to unstructured data, which is my entire point: you’re just describing how data engineers can use a new tool or implement it in their workflow. You haven’t made the case for this being a completely different role, other than slapping AI on the job title.