International Journal of Innovative Research in Engineering & Multidisciplinary Physical Sciences
E-ISSN: 2349-7300
Impact Factor: 9.907

A Widely Indexed Open Access Peer Reviewed Online Scholarly International Journal

Large Language Models for Data Catalog Enrichment: A Survey With Operational Evidence From Enterprise Deployments

Authors: Kuladeep Sandra

DOI: https://doi.org/10.37082/IJIRMPS.v12.i3.233063

Short DOI: https://doi.org/hbxvf9

Country: United States

Abstract: Enterprise data catalogs have failed to achieve adoption despite billion-dollar investments because high-touch human curation does not scale with data volume. In deployments across banking and insurance, the first two catalog implementations stalled: a 2018 Azure Purview rollout reached only 350 registered tables and 12 active users, while a 2020 Collibra deployment grew to 1,200 tables but left 28% without registered owners 18 months after launch. A third implementation succeeded by integrating the catalog with the data access workflow, reaching 3,000 tables in 3 months. This paper surveys how Large Language Models (LLMs) address the residual curation gap. We report on a production pilot enriching 10,000 tables with GPT-4: 88% of generated descriptions were rated good or excellent, owner suggestion accuracy reached 72% exact match, and sensitivity classification achieved 85% agreement with human stewards. Steward review time fell from 8 minutes to 2 minutes per table. Ownership coverage rose from 28% to 89%; description completeness rose from 19% to 84%; active users grew from 8 to 127. We present a reference enrichment architecture, discuss failure modes including hallucination and inappropriate owner inference, and identify open research challenges in quality measurement, generalization, and privacy.

Paper Id: 233063

Published On: 2024-06-14

Published In: Volume 12, Issue 3, May-June 2024
