MCP Banking Workflows: AI-Powered Model Risk Management Automation
MCP Banking Workflows is a Model Context Protocol (MCP) server that automates model documentation validation, dependency analysis, and regulatory compliance checking for banking credit risk models. The system addresses a critical bottleneck in Model Risk Management: documentation drift - where model code evolves through versions while PowerPoint presentations, Excel data dictionaries, and Word white papers fall out of sync. Using 10 specialized tools accessible via Claude or any MCP-compatible LLM, the server enables AI assistants to validate cross-file consistency, analyze change impact across a 16-model dependency graph, check SR 11-7 compliance, and generate analyst onboarding briefs - transforming hours of manual cross-referencing into seconds of automated validation. Important Disclaimer: the case studies and all data in this project are simulated. No real institutions or personal information are represented, and no affiliation with any organization is implied.
The Problem
Model Risk Management teams at banks face a persistent challenge that creates regulatory risk and audit exposure:
Documentation Drift:
- Model code evolves through versions (v3.5 → v4.2)
- PowerPoint presentations shown to validation committees contain outdated thresholds
- Excel data dictionaries don’t match the variables actually used in SAS code
- White papers describe methodology that was changed two versions ago
- Committee meeting minutes reference superseded parameter values
Manual Validation Burden:
- Cross-referencing a single model across SAS code, Excel dictionary, Word documentation, and PowerPoint takes 4-8 hours
- Each model version change triggers documentation updates across 5+ files
- New analysts spend weeks understanding model dependencies and institutional knowledge
- Version comparisons require line-by-line code review
Regulatory Exposure:
- Federal Reserve SR 11-7 requires accurate, complete model documentation
- OCC 2011-12 mandates documented model governance processes
- Inadequate documentation can delay or block model approvals
- Audit findings on documentation gaps cost weeks of remediation effort
Dependency Blindness:
- Banks maintain portfolios of 10-50+ interconnected models
- Changing one model triggers cascading revalidation requirements
- Data source changes affect multiple models simultaneously
- Impact assessment requires manual tracing through model registries
The problem isn’t lack of effort - it’s the inherent difficulty of maintaining consistency across file formats, tracking dependencies across model portfolios, and ensuring regulatory compliance at scale.
The Solution
MCP Banking Workflows introduces an AI-accessible toolkit that enables LLMs to perform model risk management tasks that previously required senior analyst expertise. The server exposes 10 tools via the Model Context Protocol, allowing Claude (or any MCP-compatible assistant) to read, compare, and validate model artifacts autonomously.
Core Architecture
                      MCP Protocol
Claude/LLM <──────────────────────────────────> Banking MCP Server
                                     │
       ┌─────────────────────────────┼──────────────────────────────┐
       │                             │                              │
       v                             v                              v
┌─────────────┐               ┌──────────────┐               ┌──────────────┐
│ Model Code  │               │    Excel     │               │     Word     │
│ (.sas, .py) │               │ Dictionaries │               │  Documents   │
└─────────────┘               └──────────────┘               └──────────────┘
       │                             │                              │
       │                             v                              │
       │                      ┌──────────────┐                      │
       └─────────────────────>│    Model     │<─────────────────────┘
                              │   Registry   │
                              │    (JSON)    │
                              └──────────────┘
                                     │
       ┌─────────────────────────────┼──────────────────────────────┐
       v                             v                              v
┌─────────────┐               ┌──────────────┐               ┌──────────────┐
│ PowerPoint  │               │  Dependency  │               │   SR 11-7    │
│Presentations│               │    Graph     │               │  Compliance  │
└─────────────┘               │ (16 models)  │               │   Checker    │
                              └──────────────┘               └──────────────┘
The server employs a filesystem-first approach: all model artifacts (SAS code, Excel dictionaries, Word documentation, PowerPoint presentations) remain in their native formats. The MCP server provides read-only access via regex parsing, library extraction (openpyxl, python-docx, python-pptx), and structured JSON output - enabling LLMs to reason about model consistency without modifying source files.
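As an illustration of that approach, here is a minimal sketch of read-only SAS parsing, assuming thresholds appear as simple `%let` assignments (the server's actual regex patterns and helper names may differ):

```python
import json
import re
from pathlib import Path

# Hypothetical patterns for illustration; the server's real regexes may differ.
THRESHOLD_PATTERNS = {
    "dscr_threshold": re.compile(r"%let\s+dscr_threshold\s*=\s*([\d.]+)\s*;", re.I),
    "ltv_threshold": re.compile(r"%let\s+ltv_threshold\s*=\s*([\d.]+)\s*;", re.I),
}

def extract_sas_parameters(sas_path: Path) -> str:
    """Read a SAS file without modifying it and emit structured JSON."""
    code = sas_path.read_text(encoding="utf-8")
    params = {
        name: float(m.group(1)) if (m := pattern.search(code)) else None
        for name, pattern in THRESHOLD_PATTERNS.items()
    }
    return json.dumps({"file": sas_path.name, **params}, indent=2)
```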
Key Features
10 Specialized Tools
| Tool | Description |
|---|---|
| `list_model_files` | List all SAS and Python model files in the repository |
| `extract_sas_parameters` | Parse DSCR/LTV thresholds, property segments, and model variables from SAS code |
| `read_excel_dictionary` | Extract variable definitions, data types, and business rules from Excel data dictionaries |
| `check_cross_file_consistency` | Validate that model code matches the data dictionary (thresholds, variables, segments) |
| `compare_model_versions` | Diff two model versions to identify changes in variables, thresholds, and performance |
| `list_word_documents` | List available white papers, meeting minutes, and validation reports |
| `extract_word_content` | Query Word documents by section (methodology, limitations, performance, etc.) |
| `map_model_dependencies` | Analyze upstream/downstream dependencies and estimate revalidation effort |
| `check_sr11_compliance` | Validate model documentation against Federal Reserve SR 11-7 requirements |
| `check_presentation_consistency` | Detect outdated information in PowerPoint presentations vs. current model code |
Cross-File Consistency Validation
Problem Solved: “Does my SAS code match what’s documented in the data dictionary?”
Tool: check_cross_file_consistency
Input: sas_filename="CRE_PD_v4.2.sas", excel_filename="CRE_PD_data_dictionary.xlsx"
Output:
======================================================================
CROSS-FILE CONSISTENCY CHECK
======================================================================
SAS File: CRE_PD_v4.2.sas
Excel File: CRE_PD_data_dictionary.xlsx
======================================================================
1. DSCR THRESHOLD COMPARISON
----------------------------------------------------------------------
SAS Code: 1.25
Excel Dictionary: 1.25
Status: CONSISTENT
2. LTV THRESHOLD COMPARISON
----------------------------------------------------------------------
SAS Code: 0.80
Excel Dictionary: 0.80
Status: CONSISTENT
3. PROPERTY TYPE SEGMENTATION
----------------------------------------------------------------------
SAS segments found: 4
- MULTIFAMILY -> segment 1
- OFFICE -> segment 2
- RETAIL -> segment 3
- INDUSTRIAL -> segment 4
Status: CONSISTENT - 4 property segments defined
4. MODEL VARIABLES
----------------------------------------------------------------------
Variables in SAS model: DSCR, LTV, property_type, loan_age, unemployment_rate
Variables documented in Excel: 12
Status: CONSISTENT - All model variables documented
======================================================================
SUMMARY: 4 of 4 checks passed
Result: ALL CHECKS PASSED
======================================================================
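The logic behind this check is essentially "parse both sides, then compare." A minimal sketch using openpyxl, assuming a dictionary sheet with parameter names in column A and values in column B (the real workbook layout and helper names may differ):

```python
import openpyxl

def read_dictionary_thresholds(xlsx_path: str) -> dict[str, float]:
    """Pull parameter -> threshold pairs from the first worksheet.

    Assumes column A holds the parameter name and column B its numeric value;
    the real workbook layout may differ.
    """
    ws = openpyxl.load_workbook(xlsx_path, read_only=True).active
    return {
        str(name).strip().upper(): float(value)
        for name, value in ws.iter_rows(min_row=2, max_col=2, values_only=True)
        if name is not None and isinstance(value, (int, float))
    }

def check_threshold(name: str, sas_value: float, excel: dict[str, float]) -> str:
    """Compare one SAS threshold against its documented value."""
    documented = excel.get(name.upper())
    status = "CONSISTENT" if documented == sas_value else "MISMATCH"
    return f"{name}: SAS={sas_value} Excel={documented} -> {status}"
```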
Model Version Comparison
Problem Solved: “What changed between v3.5 and v4.2 of the CRE-PD model?”
Tool: compare_model_versions
Input: model_name="CRE_PD", version_old="v3.5", version_new="v4.2"
Output:
{
"comparison": {
"model_name": "CRE_PD",
"old_version": "v3.5",
"new_version": "v4.2"
},
"changes": {
"variables": {
"added": ["unemployment_rate"],
"removed": [],
"unchanged": ["DSCR", "LTV", "property_type", "loan_age"]
},
"thresholds": {
"dscr": {"old": 1.20, "new": 1.25, "changed": true},
"ltv": {"old": 0.80, "new": 0.80, "changed": false}
},
"segments": {
"old_count": 3,
"new_count": 4,
"changed": true,
"details": "Added INDUSTRIAL as separate segment (previously combined with RETAIL)"
},
"performance": {
"auc_old": 0.823,
"auc_new": 0.842,
"improvement": "+2.3%"
}
},
"summary": {
"total_changes": 3,
"breaking_changes": 0,
"enhancements": [
"Added unemployment_rate variable",
"DSCR threshold changed from 1.20 to 1.25",
"Expanded from 3 to 4 property segments",
"Improved AUC by +2.3%"
],
"recommendation": "v4.2 is a significant enhancement over v3.5 with improved discrimination and new variables"
}
}
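The variable portion of this diff reduces to set arithmetic once each version's variables have been extracted; a minimal sketch:

```python
def diff_variables(old_vars: set[str], new_vars: set[str]) -> dict[str, list[str]]:
    """Classify variables as added, removed, or unchanged between versions."""
    return {
        "added": sorted(new_vars - old_vars),
        "removed": sorted(old_vars - new_vars),
        "unchanged": sorted(old_vars & new_vars),
    }

# Example with the CRE_PD v3.5 -> v4.2 variables shown above:
old = {"DSCR", "LTV", "property_type", "loan_age"}
new = old | {"unemployment_rate"}
print(diff_variables(old, new))
# {'added': ['unemployment_rate'], 'removed': [], 'unchanged': [...]}
```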
Dependency and Impact Analysis
Problem Solved: “If I change the CRE-PD model, what downstream models are affected?”
The server maintains a 16-model dependency graph across 4 layers:
| Layer | Models | Description |
|---|---|---|
| Layer 1 | CRE-PD, CRE-LGD, CNI-PD, RES-MORT-PD, CC-PD, AUTO-PD, SB-PD | Source PD/LGD models |
| Layer 2 | CECL-CRE, CECL-CNI, CECL-RES, CECL-CARDS, CECL-AUTO, CECL-SB | CECL reserve calculations |
| Layer 3 | ECON-CAPITAL | Economic capital aggregation |
| Layer 4 | STRESS-TEST, MODEL-MONITOR | Stress testing and monitoring |
Tool: map_model_dependencies
Input: model_id="CRE-PD-001", analysis_type="impact"
Output:
{
"model_id": "CRE-PD-001",
"downstream_dependencies": {
"direct": {
"count": 3,
"models": ["CECL-CRE", "STRESS-TEST", "MODEL-MONITOR"],
"impact": "Require revalidation if this model changes"
},
"indirect": {
"count": 1,
"models": ["ECON-CAPITAL"],
"impact": "Require impact assessment if this model changes"
}
},
"change_impact_assessment": {
"models_requiring_revalidation": ["CECL-CRE", "STRESS-TEST", "MODEL-MONITOR"],
"total_models_affected": 4,
"percentage_of_portfolio": "25.0%",
"estimated_effort": {
"direct_revalidation": "240-360 hours",
"impact_assessments": "20-40 hours"
},
"risk_rating": "HIGH"
}
}
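Conceptually, the impact analysis is a breadth-first traversal of the registry's downstream edges, where depth one is "direct" and everything deeper is "indirect." A minimal sketch over a hand-coded slice of the graph (the real server reads these edges from model_registry.json):

```python
from collections import deque

# Hypothetical slice of the 16-model graph, for illustration only.
DOWNSTREAM = {
    "CRE-PD-001": ["CECL-CRE", "STRESS-TEST", "MODEL-MONITOR"],
    "CECL-CRE": ["ECON-CAPITAL"],
}

def impacted_models(model_id: str) -> dict[str, list[str]]:
    """BFS the dependency graph; depth 1 = direct, deeper = indirect."""
    direct = DOWNSTREAM.get(model_id, [])
    indirect, seen, queue = [], set(direct), deque(direct)
    while queue:
        for child in DOWNSTREAM.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                indirect.append(child)
                queue.append(child)
    return {"direct": direct, "indirect": indirect}

print(impacted_models("CRE-PD-001"))
# {'direct': ['CECL-CRE', 'STRESS-TEST', 'MODEL-MONITOR'], 'indirect': ['ECON-CAPITAL']}
```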
SR 11-7 Compliance Checking
Problem Solved: “Does this model have complete documentation for the next regulatory exam?”
SR 11-7 is the Federal Reserve’s guidance on Model Risk Management. The tool validates 9 required documentation elements:
- Purpose and Scope
- Data Sources
- Methodology/Theoretical Basis
- Variable Definitions (data dictionary)
- Performance Metrics
- Limitations and Assumptions
- Governance and Approval
- Ongoing Monitoring Plan
- Change Log/Version History
Tool: check_sr11_compliance
Input: model_id="CRE-PD-001"
Output:
{
"model_id": "CRE-PD-001",
"sr11_compliance": {
"overall_status": "SUBSTANTIALLY_COMPLIANT",
"message": "Model has all required documentation but 1 element has incomplete sources",
"score": "8/9 requirements fully met",
"summary": {
"compliant": 8,
"partial": 1,
"missing": 0
}
},
"documentation_found": {
"model_code": ["CRE_PD_v4.2.sas"],
"data_dictionary": ["CRE_PD_data_dictionary.xlsx"],
"white_paper": ["CRE_PD_Model_Documentation_v4.2.docx"],
"meeting_minutes": ["Model_Validation_Committee_Minutes_Nov2024.docx"]
},
"remediation_plan": [
{
"priority": "MEDIUM",
"requirement": "Performance Metrics",
"action": "Complete documentation - missing: validation_report",
"suggested_sources": ["validation_report"]
}
]
}
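The scoring behind this output can be thought of as a checklist join: each SR 11-7 element expects one or more evidence sources, and its status depends on how many were found. A minimal sketch, with an illustrative mapping for two of the nine elements:

```python
# Illustrative mapping of two of the nine SR 11-7 elements to evidence sources.
REQUIREMENTS = {
    "Variable Definitions": ["data_dictionary"],
    "Performance Metrics": ["white_paper", "validation_report"],
}

def score_requirement(name: str, found_sources: set[str]) -> str:
    """COMPLIANT if every expected source exists, PARTIAL if some, MISSING if none."""
    expected = set(REQUIREMENTS[name])
    hits = expected & found_sources
    if hits == expected:
        return "COMPLIANT"
    return "PARTIAL" if hits else "MISSING"

found = {"model_code", "data_dictionary", "white_paper", "meeting_minutes"}
for req in REQUIREMENTS:
    print(req, "->", score_requirement(req, found))
# Variable Definitions -> COMPLIANT
# Performance Metrics -> PARTIAL  (validation_report missing, as in the output above)
```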
Presentation Consistency Checking
Problem Solved: “Is this committee presentation accurate, or does it show outdated thresholds?”
Tool: check_presentation_consistency
Input: pptx_filename="CRE_PD_Validation_Presentation.pptx", sas_filename="CRE_PD_v4.2.sas"
Output:
{
"analysis": {
"status": "NEEDS_UPDATE",
"message": "Presentation has 2 inconsistencies that should be corrected",
"total_slides": 12,
"slides_with_issues": 2
},
"source_of_truth": {
"dscr_threshold": 1.25,
"ltv_threshold": 0.80,
"segment_count": 4,
"property_types": ["INDUSTRIAL", "MULTIFAMILY", "OFFICE", "RETAIL"]
},
"inconsistencies": [
{
"slide": 5,
"type": "THRESHOLD_MISMATCH",
"parameter": "DSCR",
"presentation_value": 1.20,
"code_value": 1.25,
"severity": "HIGH",
"recommendation": "Update DSCR threshold from 1.20 to 1.25"
},
{
"slide": 3,
"type": "SEGMENT_COUNT_MISMATCH",
"parameter": "Property Segments",
"presentation_value": 3,
"code_value": 4,
"severity": "MEDIUM",
"recommendation": "Update segment count from 3 to 4"
}
],
"recommendation": "Update the slides listed above before presenting to Model Validation Committee"
}
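A minimal sketch of the slide scan using python-pptx, assuming DSCR values appear in slide text in a form like "DSCR ... 1.20" (the server's actual matching heuristics are richer):

```python
import re
from pptx import Presentation

DSCR_RE = re.compile(r"DSCR[^\d]*([\d.]+)", re.IGNORECASE)

def find_dscr_mismatches(pptx_path: str, code_value: float) -> list[dict]:
    """Flag slides whose DSCR figure differs from the value in current code."""
    issues = []
    for idx, slide in enumerate(Presentation(pptx_path).slides, start=1):
        for shape in slide.shapes:
            if not shape.has_text_frame:
                continue
            match = DSCR_RE.search(shape.text_frame.text)
            if match and float(match.group(1)) != code_value:
                issues.append({
                    "slide": idx,
                    "presentation_value": float(match.group(1)),
                    "code_value": code_value,
                })
    return issues
```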
Prompt Templates
Beyond tools, the server provides rich prompt templates that gather context from multiple sources and structure it for complex analytical tasks:
New Analyst Onboarding
Prompt: new_analyst_onboarding
Input: model_id="CRE-PD-001"
Generates a comprehensive onboarding brief by:
- Calling `check_sr11_compliance` for compliance status
- Calling `map_model_dependencies` for upstream/downstream context
- Loading model registry metadata
- Extracting SAS parameters
- Listing available documentation
The LLM receives all this context and produces a structured onboarding document with:
- Executive Summary
- Key Things to Know on Day One
- Important Thresholds and Parameters
- Dependency Map
- Compliance Priorities
- Documentation Roadmap
- 30-60-90 Day Plan
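A minimal sketch of how such a prompt template can be registered with FastMCP; the gathering helpers here are hypothetical stand-ins for the real tool internals:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("banking-mrm")

def run_sr11_check(model_id: str) -> str:
    return f"(compliance summary for {model_id})"  # stub for check_sr11_compliance

def run_dependency_map(model_id: str) -> str:
    return f"(dependency summary for {model_id})"  # stub for map_model_dependencies

@mcp.prompt()
def new_analyst_onboarding(model_id: str) -> str:
    """Bundle multi-source context into one structured prompt for the LLM."""
    return (
        f"You are onboarding a new analyst onto model {model_id}.\n\n"
        f"Compliance status:\n{run_sr11_check(model_id)}\n\n"
        f"Dependencies:\n{run_dependency_map(model_id)}\n\n"
        "Produce an onboarding brief with an executive summary, key thresholds, "
        "a dependency map, compliance priorities, and a 30-60-90 day plan."
    )
```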
Model Change Impact Assessment
Prompt: model_change_impact_assessment
Input: model_name="CRE_PD", version_old="v3.5", version_new="v4.2"
Generates a comprehensive impact assessment by:
- Comparing code between versions
- Extracting performance sections from both versions' documentation
- Extracting variable documentation from both versions
- Mapping downstream dependencies
The LLM produces an assessment with:
- Variable Changes Analysis
- Threshold and Segment Changes
- Performance Impact
- Loss Forecast Implications
- Downstream Impact
- Documentation Updates Required
- Risk Assessment
Demo Scenarios
Scenario 1: Pre-Committee Validation
Context: A model developer is presenting CRE-PD v4.2 to the Model Validation Committee tomorrow.
Workflow:
- Run `check_presentation_consistency` to verify slides match current code
- Run `check_sr11_compliance` to confirm documentation completeness
- Run `compare_model_versions` to summarize changes from the last approved version
Result: Developer discovers slide 5 shows outdated DSCR threshold (1.20 instead of 1.25), corrects it before committee, avoids embarrassing Q&A about inconsistencies.
Scenario 2: Impact Assessment for Data Source Change
Context: The economics team is changing their unemployment rate forecast methodology. Which models are affected?
Workflow:
- Run `map_model_dependencies` with `data_source="unemployment_rate"`
- Review directly affected models (CRE-PD, STRESS-TEST)
- Trace indirect impacts through the dependency graph
Result: Team identifies 3 models requiring revalidation, 2 requiring impact assessment, estimates 280-400 hours of work, can plan resource allocation before implementing change.
Scenario 3: New Analyst Onboarding
Context: A new model developer is joining the team and taking over responsibility for the CRE-PD model.
Workflow:
- Run the `new_analyst_onboarding` prompt template
- The LLM generates a comprehensive briefing document
- The new analyst receives a structured 30-60-90 day plan
Result: Instead of 2-3 weeks of tribal knowledge transfer, analyst has a structured document covering everything they need to know, with links to source documentation.
Performance Results
Tool Response Times
| Tool | Typical Response |
|---|---|
| `list_model_files` | < 100ms |
| `extract_sas_parameters` | 200-500ms |
| `read_excel_dictionary` | 300-700ms |
| `check_cross_file_consistency` | 800ms-1.5s |
| `compare_model_versions` | 1-2s |
| `extract_word_content` | 500ms-1s |
| `map_model_dependencies` | 200-500ms |
| `check_sr11_compliance` | 1-2s |
| `check_presentation_consistency` | 1-3s |
Time Savings (Estimated)
| Task | Manual Time | With MCP Server |
|---|---|---|
| Cross-file consistency check | 2-4 hours | 30 seconds |
| Version comparison | 4-8 hours | 2 minutes |
| SR 11-7 compliance check | 2-3 hours | 1 minute |
| Presentation validation | 1-2 hours | 30 seconds |
| Dependency impact analysis | 4-6 hours | 1 minute |
| New analyst onboarding brief | 2-3 days | 5 minutes |
Tech Stack
| Component | Technology | Purpose |
|---|---|---|
| Protocol | MCP (Model Context Protocol) | Anthropic’s standard for AI tool integration |
| Language | Python 3.10+ | Core development language |
| Excel Parsing | openpyxl | Data dictionary extraction |
| Word Parsing | python-docx | White paper and meeting minutes extraction |
| PowerPoint Parsing | python-pptx | Presentation content extraction |
| SAS Parsing | regex | Threshold and variable extraction |
| Async Runtime | asyncio | Non-blocking tool execution |
| Testing | pytest | Unit and integration tests |
Project Structure
mcp_banking_workflows/
├── servers/
│ └── banking_filesystem_server.py # MCP server (950+ lines, 10 tools)
├── mock_data/
│ ├── model_registry.json # 16 models with full metadata
│ ├── models/ # SAS and Python model code
│ │ ├── CRE_PD_v4.2.sas
│ │ ├── CRE_PD_v3.5.sas
│ │ └── CNI_PD_v5.0.py
│ ├── excel/ # Data dictionaries
│ │ ├── CRE_PD_data_dictionary.xlsx
│ │ └── CNI_PD_data_dictionary.xlsx
│ ├── word/ # White papers, meeting minutes
│ │ ├── CRE_PD_Model_Documentation_v4.2.docx
│ │ └── Model_Validation_Committee_Minutes_Nov2024.docx
│ └── presentations/ # Committee presentations
│ └── CRE_PD_Validation_Presentation.pptx
├── tests/
│ ├── test_sr11_compliance.py
│ ├── test_dependency_mapping.py
│ ├── test_version_comparison_tool.py
│ ├── test_word_extraction.py
│ └── test_presentation_consistency.py
├── requirements.txt
└── README.md
Installation & Usage
Prerequisites
- Python 3.10+
- pip
Setup
# Clone the repository
git clone https://github.com/pmcavallo/mcp_banking_workflows.git
cd mcp_banking_workflows
# Install dependencies
pip install -r requirements.txt
Running the Server
# Direct execution
python servers/banking_filesystem_server.py
# With MCP Inspector (for testing)
# Windows
set CLIENT_PORT=6290 && set SERVER_PORT=6291 && npx @modelcontextprotocol/inspector py servers/banking_filesystem_server.py
Claude Desktop Integration
Add to claude_desktop_config.json:
{
"mcpServers": {
"banking-mrm": {
"command": "python",
"args": ["C:/Projects/mcp_banking_workflows/servers/banking_filesystem_server.py"]
}
}
}
What This Demonstrates
MCP Implementation Patterns
- Tool Design: 10 tools with clear input/output contracts
- Multi-Source Integration: Unified access to SAS, Excel, Word, PowerPoint
- Structured Output: JSON responses that LLMs can reason about
- Prompt Templates: Rich context gathering for complex analytical tasks
Model Risk Management Expertise
- SR 11-7 Compliance: Understanding of Federal Reserve documentation requirements
- Dependency Analysis: Graph-based impact assessment for interconnected model portfolios
- Version Control: Systematic comparison of model iterations
- Cross-Format Validation: Consistency checking across code and documentation
Production AI Patterns
- Async Architecture: Non-blocking tool execution
- Error Handling: Graceful degradation with informative messages
- File Format Agnostic: Native parsing of banking-standard formats
- Extensibility: Clear patterns for adding new tools and prompts (a sketch follows below)
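As a sketch of that extensibility pattern under FastMCP (see ADR-001 below), adding a tool is one decorated async function; the names and paths here are illustrative rather than the server's actual code:

```python
from pathlib import Path
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("banking-mrm")
MODEL_DIR = Path("mock_data/models")  # matches the repository layout above

@mcp.tool()
async def list_model_files() -> list[str]:
    """List all SAS and Python model files in the repository."""
    return sorted(p.name for p in MODEL_DIR.iterdir() if p.suffix in {".sas", ".py"})

if __name__ == "__main__":
    mcp.run()  # stdio transport, as used by Claude Desktop and MCP Inspector
```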
Future Enhancements
Phase 1: Enhanced Parsing
- Support for R and Python model code (beyond SAS)
- Multi-sheet Excel workbook handling
- PDF extraction for regulatory filings
Phase 2: Write Operations
- Generate data dictionary from SAS code
- Auto-update PowerPoint slides with current thresholds
- Create SR 11-7 gap remediation templates
Phase 3: Integration
- Connect to model risk management platforms (SAS Model Manager, Moody’s RiskAuthority)
- Integration with Jira/ServiceNow for issue tracking
- Automated monitoring alerts for documentation drift
Production Readiness Assessment
Status: Functional prototype - enterprise hardening required
MCP Banking Workflows validates core functionality (10 tools, SR 11-7 automation) but requires security, audit, and integration work for regulated production deployment.
Timeline to Enterprise Deployment: 12-16 weeks with 2-3 engineers
Critical Production Gaps
R-001: No Audit Trail (CRITICAL)
- Every tool invocation must be logged for regulatory evidence
- Fix: 2 weeks (structured logging + SQLite/PostgreSQL; a minimal sketch follows below)
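A minimal sketch of that fix, assuming a SQLite-backed audit table and a decorator wrapped around each tool (schema and names are illustrative):

```python
import functools
import json
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect("audit_trail.db")
conn.execute("""CREATE TABLE IF NOT EXISTS audit_log
                (ts TEXT, tool TEXT, args TEXT, status TEXT)""")

def audited(fn):
    """Record every tool invocation as regulatory evidence."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        status = "OK"
        try:
            return fn(*args, **kwargs)
        except Exception:
            status = "ERROR"
            raise
        finally:
            conn.execute(
                "INSERT INTO audit_log VALUES (?, ?, ?, ?)",
                (datetime.now(timezone.utc).isoformat(), fn.__name__,
                 json.dumps({"args": args, "kwargs": kwargs}, default=str), status),
            )
            conn.commit()
    return wrapper
```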
R-002: Local File System Only (CRITICAL)
- Cannot integrate with enterprise model registry systems
- Blocks deployment to Collibra, SharePoint, ServiceNow
- Fix: 3-4 weeks (model registry API integration)
R-003: No Authentication (CRITICAL)
- Anyone with file access can run compliance tools
- Enterprise deployment requires SSO and role-based access
- Fix: 2 weeks (SSO integration + RBAC)
R-004: Regex-Based SAS Parser (HIGH)
- Fails on complex macros and nested conditional logic
- Works for 8 tested models but brittle to coding variations
- Fix: 4-6 weeks (robust AST parser or SAS Language Server)
R-005: Single-Threaded Architecture (MEDIUM)
- One validator at a time, demo-only
- Cannot scale beyond single user
- Fix: 2-3 weeks (async request handling)
Key Architecture Decisions
ADR-001: Python 3.12 + FastMCP
- Why: Rapid 5-week development, native FastMCP integration
- Trade-off: Single-threaded execution limits concurrent users
- Alternative rejected: TypeScript/Node.js (weaker banking file libraries)
ADR-003: Local File System Storage
- Why: Zero external dependencies, no authentication complexity
- Trade-off: Cannot scale beyond single workstation
- Production path: Model registry API integration (Collibra, ServiceNow)
ADR-004: Custom SAS Parser
- Why: Fast development (weeks vs. months for full parser)
- Trade-off: Fails on complex macros, brittle to style variations
- Production path: Robust AST parser or SAS Language Server integration
ADR-006: SR 11-7 Compliance Focus
- Why: Aligns with U.S. Federal Reserve guidance and the developer's 5+ years' experience
- Trade-off: Not directly applicable to Basel II/III or OCC frameworks
- Coverage: Conceptual soundness, ongoing monitoring, validation, limitations
Test Coverage
Current State: ~15%
- Basic tool functionality tested
- No security tests
- No integration tests
- No load tests
Target for Production: 85%+ unit | 70%+ integration
- 100% OWASP Top 10 security scenarios
- SR 11-7 requirement validation
- 50+ concurrent user load tests
- Audit trail completeness validation
Critical Gaps:
- File system security (directory traversal prevention)
- SAS parser edge cases (macros, conditional logic)
- Model registry integration (API error handling)
- Multi-user concurrent access
Production Hardening Roadmap
Phase 1: Compliance & Security (7-8 weeks)
- Audit logging for every tool invocation (examiner evidence)
- SSO/RBAC authentication (role-based tool access)
- File system security hardening (prevent directory traversal; see the sketch after this list)
- Evidence package generation (regulatory examination support)
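For the directory-traversal item above, a minimal sketch of the path check, with an illustrative sandbox root:

```python
from pathlib import Path

ALLOWED_ROOT = Path("mock_data").resolve()  # illustrative sandbox root

def safe_resolve(user_supplied: str) -> Path:
    """Resolve a requested filename and refuse paths escaping the sandbox."""
    candidate = (ALLOWED_ROOT / user_supplied).resolve()
    if not candidate.is_relative_to(ALLOWED_ROOT):
        raise PermissionError(f"Path escapes sandbox: {user_supplied}")
    return candidate

# safe_resolve("models/CRE_PD_v4.2.sas")  -> allowed
# safe_resolve("../../etc/passwd")        -> PermissionError
```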
Phase 2: Enterprise Integration (7-8 weeks)
- Model registry API integration (Collibra, ServiceNow, SharePoint)
- Robust SAS parser (handle macros, includes, conditional logic)
- Version control integration (automated git diff for model changes)
- Document management system connectors
Phase 3: Scalability & Reliability (5 weeks)
- Multi-user concurrent access (50+ validators simultaneously)
- Performance optimization (handle 500+ model portfolio)
- Monitoring and alerting (compliance status dashboard)
- Disaster recovery and business continuity testing
What This Demonstrates
Production Awareness: Building tools is step one. Understanding audit trails, access controls, and regulatory evidence requirements separates prototypes from enterprise systems. This documentation shows I know the difference between demo code and regulated deployment.
Regulatory Knowledge: SR 11-7 compliance isn’t just documentation - it’s audit trails for examiners, approval workflows for material changes, and evidence packages that survive regulatory scrutiny. Five years of model risk management experience informs every design decision.
Decision Rationale: Every architectural choice documented with trade-offs. FastMCP enables rapid iteration; local files avoid complexity; regex parser balances speed with accuracy. Production deployment requires different trade-offs - this shows I understand when to optimize for speed vs. scale.
License
MIT License - see LICENSE file for details.
“The difference between good and great model risk management isn’t the accuracy of the models - it’s the accuracy of the documentation. In regulated banking, a single inconsistency between code and committee presentation can delay an approval by months. MCP Banking Workflows solves this by giving AI assistants the tools to validate what humans miss.”
