
Q&A: Alberto Pan, CTO at Denodo, on GenAI Unlocking Data Access

By Myles Suer
How is AI changing the data market?

In an interview with Alberto Pan — EVP and CTO at Denodo, the Palo Alto, California-based data management company — we discussed how generative AI will impact the data space. While GenAI has primarily focused on unstructured data, which makes up 80% of all data, the real value in corporate settings lies in accessing databases and semi-structured data. Pan believes that GenAI can significantly enhance business intelligence (BI) and self-service data usage. Here, Pan shares his views on data and AI with VKTR:

How do you believe that data consumers will discover, view and access data with the emergence of GenAI?

“I think GenAI may be the technology that finally makes information self-service a reality,” Pan said. “After a lot of effort and investment, self-service data access is still relatively uncommon today in big organizations and relatively restricted to power users. GenAI has the potential to be the technology that finally changes this. With GenAI, all tools of the data management stack can potentially get more useful for data consumers: data can be automatically classified and tagged, last-mile data preparation and queries can be done using natural language. You can also automatically generate different meaningful visual analysis of the data, and probably many other things we are not even anticipating to date.”

In my discussion with Pan, he emphasized that to leverage GenAI effectively, organizations need to build a semantic layer to access databases and semi-structured data.

“While LLMs have focused on unstructured data, a semantic layer allows the deployment of RAGs on both structured and unstructured data, enabling the long-awaited business intelligence self-service,” Pan said. “This layer should be built using business language to align the business and data layers. Structured data can then be vectorized and utilized by RAGs, providing a unified view across multiple databases.”
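To make the idea concrete, here is a minimal Python sketch of a semantic layer used for retrieval. It is an illustration under stated assumptions, not Denodo's implementation: the view names, the descriptions and the `retrieve` helper are all hypothetical, and a simple bag-of-words cosine similarity stands in for the learned embeddings a real RAG pipeline would likely use.

```python
import math
import re
from collections import Counter

# Hypothetical semantic layer: business-language descriptions mapped to the
# technical artifacts that hold the data. All names here are invented.
SEMANTIC_LAYER = [
    {"view": "crm.cust_mstr", "description": "customer name, region and account manager"},
    {"view": "erp.ord_hdr", "description": "sales orders with order date and total revenue"},
    {"view": "hr.emp_tbl", "description": "employee headcount by department and hire date"},
]


def vectorize(text):
    """Turn text into a bag-of-words term-count vector (a stand-in for embeddings)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, k=1):
    """Return the k views whose business descriptions best match the question."""
    q = vectorize(question)
    ranked = sorted(
        SEMANTIC_LAYER,
        key=lambda entry: cosine(q, vectorize(entry["description"])),
        reverse=True,
    )
    return [entry["view"] for entry in ranked[:k]]


print(retrieve("What was total revenue from sales orders last quarter?"))
```

The point of the sketch is that the question is matched against business descriptions rather than cryptic table names such as `ord_hdr`. The matched view could then be passed to an LLM as context for generating a query.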

What needs to be matured, so GenAI can be applied at scale in the enterprise?

Pan thinks there are two strong prerequisites for successful self-service data access, and GenAI is not going to change this.

“You need a strong data layer that provides access to good quality data using the language of the business,” Pan said. “GenAI is great at understanding the semantics of user requests, but then those requests need to be matched with the actual artifacts in your databases that can be used to answer the user queries. GenAI applications will not be able to work accurately using only technical metadata about all your underlying databases: you need a business data layer for that.

“You need a data governance process that combines enforcement and flexibility. One of the risks of self-service data access is inconsistencies and multiple versions of the truth. By taking self-service to a new dimension, GenAI can also take these problems to a new dimension if we are not careful. We need to balance decentralized creation of data products with centralized controls in areas such as managing personal data. So I think GenAI will increase the already existing focus on federated governance.

“Another challenge is transparency and explainability: this is a limitation of the current generation of GenAI technology, and it is important to build processes around these technologies to minimize the problems.”

How will GenAI transform how data companies make data ready for use?

A lack of data readiness has slowed many organizations' data agendas.

“Regarding data quality, I think GenAI will help to automate some tasks, such as cleaning, standardization and identification of sensitive data,” Pan said. “We already have tools for partially automating these processes, but their accuracy now will increase very significantly. At the same time, I think most heavy data preparation and integration tasks will not be 100% automated in the short-term. You will still need data engineers to create a business data layer that can be used effectively for BI, but now these data engineers will be powered with assistants and copilots that will make their work easier and help them to identify errors.”

How will GenAI transform the experience of data scientists?

Clearly, data scientists and the models they create are a big focus for AI. So how will GenAI transform their lives?

“Data discovery will get easier, as information will be automatically classified semantically with a high level of accuracy,” Pan said. “Last-mile data preparation of the type that data scientists usually do will also become much easier, as their tools will be able to automate many of these and provide suggestions for others. GenAI will also be great at suggesting new types of analysis and visualizations, like a very smart online assistant. However, I cannot stress this enough, this is all dependent on a solid data foundation that exposes the data in the appropriate formats and languages.”

Are companies ready for GenAI? What distinguishes those that are ready from those that are not?

Gartner has written extensively about the importance of data being AI-ready. Pan argues that there are clear differences between organizations that are AI-ready and those that are not.

“I think smart companies are very conscious about the need for a business data layer and strong governance,” Pan said. “In addition, there are other factors, like the innovation mindset of the company. GenAI needs rethinking some processes and overcoming some fears, such as the fear of sharing metadata with LLMs, that are different in different customers. While numerous customer pilots are underway, only data-mature organizations — less than 20%, according to MIT-CISR — have deployed solutions, seeing significant impacts on self-service with proper governance. Data-immature organizations face longer preparation times. Shared ownership and governance are essential. So far, customer service sectors are advancing quickly, whereas financial services recognize the value but progress more slowly.”

Was Denodo ready for GenAI, or did it require re-planning to get ready?

Many practitioners and vendors were caught flat-footed by the emergence of GenAI. Where was Denodo?

“A core part of our vision has always been to create business-friendly views of the data of the organization for the data consumers, abstracting them from the technical details of the data systems of the organization,” Pan said. “In that respect, GenAI applications are a very similar type of consumer: one that is very fluent in natural language and business language but not necessarily in the technical details of your systems.

“From the point of view of how GenAI can help users to automate tasks, such as tagging sensitive data, creating queries using natural language, getting automatic suggestions and so on, well, this is an area where we were already working, using let’s say conventional AI. The emergence of GenAI made us redesign some of those features to leverage it and became a strong focus area of work in our latest releases.”

What happens to the data catalog in the era of GenAI? Does it move to the data engineer, or is it still relevant to data consumers?

The mission of many data catalogs was to make data accessible to data consumers. So how does GenAI change things?

“Catalogs can still be relevant,” Pan said. “While many of the classifications that used to be created manually will now be generated automatically or semi-automatically, a well-structured data catalog for data discovery will continue to be relevant for data consumers. It is important to note that with the generalization of self-service that GenAI can provoke, the number of consumable data products can potentially increase significantly.”

What else will change in the data management architecture with GenAI?

It seems clear that GenAI, like the internet, will have a broader impact than was apparent on day one.

“One thing we have not mentioned so far is the blurring between structured data, databases and APIs, and unstructured data — docs, media,” Pan said. “Now these types of data are dealt with two different technology stacks, almost completely independent, because previously there were not good ways to link the semantics of both worlds. Now those methods exist with very reasonable accuracy. That will have a strong impact we are only starting to see.”


Parting Thoughts

From this interview, my takeaways are many. GenAI has the potential to make self-service data access a reality by automating data classification, preparation and visualization using natural language. To leverage this effectively, organizations need a semantic layer built on business language, enabling RAGs to handle both structured and unstructured data. This ensures a unified view across databases. Successful deployment, however, requires a strong data layer and governance processes that balance flexibility with control. While data-mature organizations are already reaping benefits, others will stumble. GenAI, without question, will necessitate rethinking data management architecture and how to integrate structured and unstructured data handling.

About the Author
Myles Suer

Myles Suer is an industry analyst, tech journalist and top CIO influencer (Leadtail). He is the emeritus leader of #CIOChat and a research director at Dresner Advisory Services.

Main image: Via Denodo.