Handling Unstructured Data
Using executable BPMN diagrams and a workflow engine on the way towards Intelligent Automation.
At this point, I believe it is fair to say that we have a good handle on the processing of structured data. As an industry, we have plenty of tools to process, store, and analyze rows and columns of data. We have even more tools to paint visuals, create dashboards, and produce reports. Much is written every day about exploratory data analysis (EDA), machine learning, statistics, and SQL, and new courses and books are released each month. But what about documents that do not come in neat rows and columns, such as PDFs or web pages? These are also structured, but they do not have rows and columns as a consistent structure throughout. The term "unstructured" is often used loosely in these contexts. I believe there is really no such thing as unstructured data. Everything is organized in patterns; otherwise, we as humans would not understand any of it.
A lot of information is produced in documents every day, then published, posted, and shared. I get a tonne of emails every day. So I got to thinking about creating a context where all those ‘unstructured’ documents might be read, parsed, processed, and organized into ‘rows and columns’ in a seamless way, turning them into usable information. The more I thought about it, the…

