Establishing Guardrails for AI Tool Use: Formal Safety Constraints Using MCP Schemas
Authors: Gaurav Rohatgi
DOI: https://doi.org/10.37082/IJIRMPS.v13.i6.232848
Short DOI: https://doi.org/hbg8tz
Country: United States
Abstract:
Agentic large language models (LLMs) are increasingly used to perform actions beyond text generation, including querying databases, orchestrating workflows, updating identity configurations, and interacting with enterprise systems. While this evolution enables significant automation benefits, it also introduces safety-critical risks such as unintended state changes, privilege escalation, data leakage, and infinite or destructive tool-execution loops. The emerging Model Context Protocol (MCP) provides a standardized, schema-driven interface for exposing tools to models, creating a uniform enforcement layer that is essential for secure agentic AI in production environments (Model Context Protocol, Documentation). However, current deployments lack a comprehensive, formalized safety framework that constrains tool use at the protocol boundary.
This paper presents a formal guardrail model grounded in MCP tool schemas and runtime safety assertions. The proposed framework integrates three complementary components: (1) static schemas defining strict input/output types, enumerations, ranges, and regex constraints; (2) formal pre-conditions, post-conditions, and invariants governing the semantics of each tool invocation; and (3) dynamic policies such as context-aware authorization, dependency checks, rate limits, and loop-prevention triggers. Together, these constraints prevent both accidental and adversarial misuse by ensuring that LLM-issued tool calls remain within safe operational boundaries.
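The three complementary layers could be sketched as follows. This is an illustrative sketch only, not the paper's implementation or the MCP SDK API: the `ToolGuard` class, the `SQL_TOOL_SCHEMA` rules, and the tool/database names are all hypothetical, chosen to mirror the layered structure described above (static schema checks, pre-condition contracts, and dynamic rate/loop limits).

```python
import re
import time

# Hypothetical static schema for a read-only SQL tool: types, an enum,
# a numeric range, and a regex constraint restricting statements to SELECT.
SQL_TOOL_SCHEMA = {
    "statement": {"type": str, "regex": r"(?is)^\s*SELECT\b[^;]*$"},
    "max_rows":  {"type": int, "range": (1, 1000)},
    "database":  {"type": str, "enum": {"analytics", "reporting"}},
}

class ToolGuard:
    """Validates one tool invocation against all three guardrail layers."""

    def __init__(self, schema, preconditions=(), max_calls_per_minute=10):
        self.schema = schema
        self.preconditions = preconditions      # layer 2: semantic contracts
        self.max_calls = max_calls_per_minute   # layer 3: dynamic policy
        self.call_times = []

    def check(self, args):
        # Layer 1: static schema validation (types, enums, ranges, regex).
        for name, rule in self.schema.items():
            if name not in args:
                return False, f"missing argument: {name}"
            value = args[name]
            if not isinstance(value, rule["type"]):
                return False, f"{name}: wrong type"
            if "enum" in rule and value not in rule["enum"]:
                return False, f"{name}: not in allowed set"
            if "range" in rule and not (rule["range"][0] <= value <= rule["range"][1]):
                return False, f"{name}: out of range"
            if "regex" in rule and not re.match(rule["regex"], value):
                return False, f"{name}: pattern violation"
        # Layer 2: formal pre-conditions on the invocation's semantics.
        for pre in self.preconditions:
            if not pre(args):
                return False, "pre-condition violated"
        # Layer 3: rate limiting as a loop-prevention trigger.
        now = time.monotonic()
        self.call_times = [t for t in self.call_times if now - t < 60]
        if len(self.call_times) >= self.max_calls:
            return False, "rate limit exceeded"
        self.call_times.append(now)
        return True, "ok"

# Example pre-condition: the reporting database only allows small result sets.
guard = ToolGuard(
    SQL_TOOL_SCHEMA,
    preconditions=[lambda a: a["database"] != "reporting" or a["max_rows"] <= 100],
)
ok, reason = guard.check(
    {"statement": "SELECT id FROM users", "max_rows": 50, "database": "analytics"})
bad, why = guard.check(
    {"statement": "DROP TABLE users", "max_rows": 50, "database": "analytics"})
```

A destructive `DROP TABLE` call fails the regex constraint at layer 1 and never reaches execution, while the compliant `SELECT` passes all three layers; this is the sense in which schema-level enforcement blocks unsafe calls at the protocol boundary.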
The design draws inspiration from prior research showing that LLMs perform better when tool interfaces are structured and deterministic. For example, ReAct demonstrates that interleaving reasoning with tool actions reduces hallucination-induced errors (Yao et al., 2022, p.1), while Toolformer shows that models can autonomously learn when and how to invoke APIs when given reliable contract-style interfaces (Schick et al., 2023). Our work extends these findings by introducing formal safety contracts that bind agent behavior at the protocol level. The framework also aligns with foundational AI safety concerns articulated by Amodei et al., who highlight unintended behavior, reward hacking, and unsafe exploration as core risks in autonomous systems.
We evaluate the guardrail model through simulated high-risk scenarios—safe SQL execution, constrained identity management operations, and controlled file-system access. Metrics include safety-interception rate, false-positive rejection rate, and schema-enforcement latency. Results show that schema-driven validation blocks the majority of unsafe requests with minimal execution overhead, demonstrating the viability of MCP as a safety-enforcing substrate for enterprise-grade agentic AI.
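The first two metrics can be made concrete with a small computation over labeled simulation outcomes. The metric definitions below are our assumptions (interception rate over unsafe calls, false-positive rate over safe calls), not formulas quoted from the paper:

```python
# Assumed definitions: safety-interception rate = blocked unsafe / all unsafe;
# false-positive rejection rate = blocked safe / all safe.

def guardrail_metrics(results):
    """results: list of (is_unsafe, was_blocked) pairs, one per simulated call."""
    unsafe = [blocked for is_unsafe, blocked in results if is_unsafe]
    safe = [blocked for is_unsafe, blocked in results if not is_unsafe]
    interception_rate = sum(unsafe) / len(unsafe)
    false_positive_rate = sum(safe) / len(safe)
    return interception_rate, false_positive_rate

# Hypothetical run: 3 unsafe calls (2 blocked), 4 safe calls (1 wrongly blocked).
sim = [(True, True), (True, True), (True, False),
       (False, False), (False, True), (False, False), (False, False)]
icr, fpr = guardrail_metrics(sim)
# icr = 2/3, fpr = 1/4
```

In practice, both rates would be reported alongside per-call schema-enforcement latency to show that validation overhead stays small relative to tool-execution time.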
The paper concludes by outlining deployment patterns for multi-tenant SaaS and sovereign-cloud environments and recommending future research directions, including automated schema synthesis, policy-learning agents, and formal verification frameworks for MCP tool contracts.
Keywords: Agentic AI, Large Language Models (LLMs), AI Safety, AI Guardrails, Model Context Protocol (MCP), AI Governance
Paper Id: 232848
Published On: 2025-12-28
Published In: Volume 13, Issue 6, November-December 2025