Ananth Packkildurai

The skill set differs slightly from traditional data engineering (SQL-centric frameworks like dbt, etc.). Unstructured data processing requires skills such as concurrent programming and chunking techniques, which are uncommon in SQL-centric data pipelines.

Chris Kornaros

I mean, traditional data engineering is incredibly broad; you seem to be describing just ETL for structured data. True data engineering is a broad term that has always included ways to handle unstructured data (even if it wasn’t as common or easy as it is now).

It feels a bit disingenuous to slap AI on the job title and then post that this is all brand new. For example, multithreading/concurrency and chunking are both common techniques in data engineering for structured data. In no way are they unique to unstructured data, which is my entire point: you’re just describing how data engineers can use a new tool or implement it in their workflow. You haven’t made the case for this being a completely different role, other than slapping AI on the job title.
