
AI in Audit Documentation: Security, Metadata Risks & the Safer Path Forward


27 April 2026

AUDIT TECHNOLOGY  |  DATA SECURITY  |  CA PRACTICE


How CA firms can use AI for working papers without breaking SA 230, the DPDP Act, 2023, or client trust — using RAG, on-premise LLMs, and a few sensible safeguards.

By CA Himanshu Majithiya  |  Himanshu Majithiya & Co., Ahmedabad

The Question Every CA Is Asking in 2026

A young article assistant in your office finishes a bank reconciliation, opens a public AI chatbot, pastes the client's complete trial balance with PAN and GSTIN, and types: "Summarise key risk areas for my audit memo." Thirty seconds later, a beautifully written paragraph appears on screen — neat, structured, and ready to drop into the working papers.

Behind that thirty-second saving lies a problem most firms have not yet sized up. The data has just travelled to a server outside India, governed by terms of service nobody in the office has read, possibly retained for model improvement, and is now potentially recoverable in some form by another user halfway across the world. Multiply this by every team member, every client file, every audit season — and you have a confidentiality breach happening quietly, in the background, every single day.

Yet ignoring AI is not an option either. The Institute of Chartered Accountants of India (ICAI) has openly encouraged members to adopt AI to strengthen audit quality, and the National Financial Reporting Authority (NFRA) has rolled out 40 revised auditing standards aligned with global norms, effective 1st April 2026. A widely cited internal audit survey indicates that AI usage in audit functions is projected to roughly double from around 39% in 2025 to nearly 80% in 2026. The profession is moving — fast.

So the real question is not "AI or no AI". The real question is: How do we use AI for audit documentation without violating SA 230, the Code of Ethics, and the Digital Personal Data Protection (DPDP) Act, 2023? This blog answers that — in plain language, with the technical references intact for fellow professionals.

1. Why Audit Documentation Is the Sweet Spot for AI

Audit documentation — what we still call "working papers" — has always been the backbone of the audit. SA 230, "Audit Documentation", issued by ICAI's Auditing and Assurance Standards Board (AASB), requires the auditor to prepare documentation sufficient to enable an experienced auditor, having no previous connection with the audit, to understand the nature, timing and extent of audit procedures performed, the audit evidence obtained, and the significant matters arising and the conclusions reached. As Robert H. Montgomery wrote over a century ago — and ICAI's Implementation Guide to SA 230 still quotes — "the skills of an accountant can always be ascertained by an inspection of his working papers."

The catch is well known to every practitioner: documentation is time-consuming, repetitive, and a frequent reason for adverse comments in peer reviews and NFRA inspections. This is precisely why AI fits so naturally.

Where AI genuinely earns its place in working papers:

| Audit Activity | Traditional Approach | AI-Augmented Approach |
| --- | --- | --- |
| Risk assessment memos (SA 315) | Manually drafted by partner | Drafted by AI from prior-year file + current TB; partner reviews and edits |
| Sampling rationale (SA 530) | Excel notes, often skeletal | AI proposes population stratification and sample basis; auditor approves |
| Walkthrough narratives | Hand-written notes typed up | Voice-to-text + AI structuring into a flowchart-style narrative |
| Variance analysis commentary | Senior writes, partner edits | AI generates first cut from comparative figures; team validates with management |
| MRL & engagement letter drafting | Recycled from old templates | AI tailors clauses to current scope, regulators, and DPDP requirements |
| Issue memos & journal entry testing | Repetitive narrative writing | AI summarises exceptions; auditor adds professional judgement |
| Tax audit Form 3CD clauses (Sec. 44AB) | Manual cross-referencing | AI maps GL entries to clause-wise reporting; auditor verifies |

Used carefully, AI does not replace the auditor's professional judgement — it removes the typing burden so the auditor has more time to actually exercise that judgement. ICAI's own technical literature on Digital Assurance and the work of the Digital Accounting and Assurance Board (DAAB) explicitly supports auditors leveraging technology to improve efficiency, while maintaining professional scepticism.

2. The Hidden Risks Nobody Talks About at the Study Circle Lunch

The temptation to paste a client trial balance into a public chatbot is exactly where the trouble starts. There are four distinct risks, and they hit different laws, standards and ethics rules at the same time.

2.1 Confidentiality breach under the Code of Ethics

Clause (1) of Part I of the Second Schedule to the Chartered Accountants Act, 1949 treats disclosure of client information by a CA — without consent or legal compulsion — as professional misconduct. There is no carve-out for "I only pasted it into ChatGPT to summarise." The moment client data leaves your firm's controlled environment and enters a third-party server outside your contractual control, you have potentially breached the duty of confidentiality.

2.2 Statutory liability under the DPDP Act, 2023

The Digital Personal Data Protection Act, 2023 received Presidential assent on 11th August 2023, and the DPDP Rules, 2025 were notified on 13th November 2025. The Act and Rules are being brought into force progressively between November 2025 and roughly mid-2027. Under this law, a CA firm that decides what client personal data is collected, how long it is kept, and with whom it is shared is a Data Fiduciary — the legally responsible party — even if a third-party AI vendor merely processes the data.

Personal data here is broader than most assume. PAN, Aadhaar number, bank statements, salary registers, director KYC, employee payroll, shareholding patterns of individual promoters — every one of these is personal data when linked to an identifiable individual. The maximum penalty under the DPDP Act for failure to take reasonable security safeguards is ₹250 crore per instance, and each separate violation attracts a separate penalty.

⚠ The Section 8(5) Blind Spot

Section 8(5) of the DPDP Act requires Data Fiduciaries to take "reasonable security safeguards" to prevent personal data breach.

Legal commentators have flagged a fast-emerging blind spot: employees pasting client data into public AI tools to draft emails, summarise statements or debug Tally exports — without any malicious intent — may itself amount to a breach by the firm. The law looks at the breach, not the motive.

For a CA firm, this risk is squarely real and squarely yours.

2.3 Metadata leakage — the risk most CAs underestimate

Even if the body of the document is sanitised, files carry metadata — author name, computer name, edit history, prior versions, comment threads, GPS data on photographs of vouchers, EXIF data on scanned PDFs, tracked changes accepted but never truly removed. When a Word file or PDF is uploaded to a generative AI service, all of that metadata travels with it.

In one widely reported pattern, simple metadata extraction from a published audit report has been used to identify the junior team member who actually drafted it, the timestamps of edits made, and even the existence of earlier draft conclusions that differed from the final opinion. For a CA firm, this is reputationally toxic and potentially evidentially damaging in regulatory or litigation contexts.
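To appreciate how little effort such extraction takes, consider the illustrative Python sketch below. A .docx file is simply a ZIP archive whose metadata lives in docProps/core.xml, so author and revision details can be read with nothing but the standard library. The author and reviser names in the sample are invented for the demonstration:

```python
import io
import re
import zipfile

# A stand-in for the docProps/core.xml every Word file carries.
CORE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:creator>article.assistant</dc:creator>
  <cp:lastModifiedBy>engagement.partner</cp:lastModifiedBy>
  <cp:revision>17</cp:revision>
</cp:coreProperties>"""

def extract_docx_metadata(docx_bytes: bytes) -> dict:
    """Pull author, last reviser and revision count out of a .docx.

    No Office software or third-party library is needed: the file is a
    ZIP archive and the fields sit in plain XML inside it.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        xml = zf.read("docProps/core.xml").decode("utf-8")
    fields = {}
    for tag in ("dc:creator", "cp:lastModifiedBy", "cp:revision"):
        m = re.search(rf"<{tag}[^>]*>(.*?)</{tag}>", xml, re.DOTALL)
        if m:
            fields[tag.split(":")[1]] = m.group(1).strip()
    return fields

# Build a minimal stand-in "docx" purely for demonstration.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("docProps/core.xml", CORE_XML)

meta = extract_docx_metadata(buf.getvalue())
```

A dozen lines of standard-library code are enough to name the article assistant who drafted a file and the partner who last touched it. That is exactly what travels with every unscrubbed upload.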

2.4 Hallucinations and audit evidence quality

AI tools can produce text that looks confident, structured and quotable — but is factually wrong. A non-existent case law, a misquoted Income Tax section, a wrong CARO clause number, a fabricated AS reference. In an audit working paper, a hallucinated citation is not a small error — it is a misstatement of audit evidence under SA 500, and may render the entire conclusion indefensible if challenged.

3. Mapping Risks to Standards: A Quick Reference

To make this practical, the table below maps each AI risk to the ICAI/legal standard it touches. CAs can use this in firm-level training and as part of their internal SQM 1 quality management documentation.

| AI Risk | Standard / Law Affected | Practical Consequence |
| --- | --- | --- |
| Pasting client data into public chatbots | CA Act 1949 - Sch II Pt I Cl (1); DPDP Act, 2023 | Professional misconduct + statutory penalty up to ₹250 cr |
| File metadata leakage in uploads | DPDP Act Sec 8(5); Code of Ethics | Identification of staff, exposure of edits, potential breach reporting |
| AI-generated text without review | SA 230 (documentation), SA 500 (evidence) | Inadequate working papers; audit conclusion not supportable |
| Hallucinated case law / sections | SA 500, SA 700, ICAI Code of Ethics | Misstatement of evidence; reportable in peer review/NFRA |
| No engagement letter consent for AI use | DPDP Act Sec 6 (consent); SA 210 | Engagement defect; data processing without lawful basis |
| No audit trail of AI edits | SA 230 paras 8–11, SQM 1 | File assembly defects in peer review |
| Cross-border transfer of client data | DPDP Act Sec 16; Code of Ethics | Non-compliance with permitted-country list (when notified) |

4. The Midway: How to Use AI Without Sending Client Data Anywhere

This is the part most CAs are looking for. Refusing AI is not realistic. Using public tools recklessly is illegal and unethical. The midway exists, and it is technically simpler than it sounds. There are three credible architectures, and a small firm can adopt the right combination depending on engagement risk.

4.1 Option A — Retrieval-Augmented Generation (RAG) on a controlled environment

RAG is the technique where the AI model is not retrained on your data; instead, your documents are stored in a private vector database, and only the small relevant chunks needed to answer a specific question are sent to the model along with the prompt. The European Data Protection Supervisor and major enterprise platforms now treat RAG as the default privacy-respecting pattern for AI in regulated industries, including finance.

In a closed enterprise setting — a CA firm's own server or a private cloud tenancy — the RAG retrieval can remain local, the underlying knowledge base never leaves the firm, and only minimal, anonymised context is ever transmitted to an external model API. Sensitive elements like PAN and Aadhaar can be tokenised or pseudonymised before they reach the model.
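As an illustration of that pseudonymisation step, the Python sketch below swaps PAN and Aadhaar-format values for stable hashed tokens before any text is transmitted, while the token-to-value map stays inside the firm. The salt, token format and patterns are assumptions for the sketch, not a vetted production filter:

```python
import hashlib
import re

PAN_RE = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")          # e.g. ABCDE1234F
AADHAAR_RE = re.compile(r"\b\d{4}\s?\d{4}\s?\d{4}\b")       # 12 digits, spaced or not

def pseudonymise(text: str, salt: str = "firm-local-secret") -> tuple:
    """Replace PAN/Aadhaar-format values with stable tokens.

    The same identifier always maps to the same token (a salted hash),
    so the model can still reason about "the same person" across chunks,
    while the real value never leaves the firm. The returned mapping lets
    the firm re-identify tokens locally after the model responds.
    """
    mapping = {}

    def token(match, kind):
        value = match.group(0)
        digest = hashlib.sha256((salt + value).encode()).hexdigest()[:8]
        tok = f"[{kind}-{digest}]"
        mapping[tok] = value
        return tok

    text = PAN_RE.sub(lambda m: token(m, "PAN"), text)
    text = AADHAAR_RE.sub(lambda m: token(m, "AADHAAR"), text)
    return text, mapping

safe, key_map = pseudonymise("Promoter PAN ABCDE1234F, Aadhaar 1234 5678 9012.")
```

Only `safe` would ever reach a model API; `key_map` stays on the firm's own storage alongside the working paper.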

How RAG Helps a CA Firm

• Your client documents stay in a vector database that you control — typically inside your firm's network or a closed enterprise cloud tenancy.

• The AI model only sees the small text chunks needed to answer one question, not the entire client file.

• You can add filters that strip PAN, Aadhaar, mobile numbers and bank details before any text leaves the firm.

• Output can be cited back to the source document — improving evidential quality under SA 500 because every AI claim is anchored to a real working paper.
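To make the mechanics concrete, here is a deliberately tiny, dependency-free Python sketch of the retrieval step. It substitutes naive bag-of-words similarity for a real embedding model and vector database (both assumptions for illustration), but the privacy property is the same: the index and the full documents stay local, and only the top-ranked chunks would accompany the prompt.

```python
import math
import re
from collections import Counter

def _vec(text: str) -> Counter:
    """Naive word-count vector; a real system would use embeddings."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class WorkingPaperIndex:
    """Toy local retriever: chunks never leave this object; only the
    most relevant ones are returned to be sent alongside a prompt."""

    def __init__(self):
        self.chunks = []

    def add(self, source: str, text: str):
        self.chunks.append((source, text, _vec(text)))

    def retrieve(self, question: str, k: int = 2):
        qv = _vec(question)
        ranked = sorted(self.chunks, key=lambda c: _cosine(qv, c[2]), reverse=True)
        return [(src, txt) for src, txt, _ in ranked[:k]]

index = WorkingPaperIndex()
index.add("WP-12 Debtors", "Trade receivables grew 40% with no change in credit policy.")
index.add("WP-07 Fixed assets", "Depreciation recomputed; no exceptions noted.")
index.add("WP-03 Revenue", "Cut-off testing found two invoices recorded in the wrong period.")

# Only `context` (plus the question itself) would be sent to the model,
# and each chunk carries its source working paper for citation.
context = index.retrieve("What are the revenue cut-off risk areas?", k=1)
```

Note how the retrieved chunk arrives with its working-paper reference attached; that is what makes the SA 500 citation trail possible.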

4.2 Option B — On-Premise / Local LLMs

A local Large Language Model is an AI model that runs entirely on your own hardware. There is no API call to OpenAI, Anthropic, or Google. The data never leaves your office network. A capable local model can be run on a workstation with a modern GPU, and the firm controls what is logged, what is retained, and who has access.

For high-confidentiality engagements — listed company audits, forensic assignments, sensitive M&A due diligence, FFMC compliance reviews, defence-sector clients — a local LLM is the most defensible choice. Quality is generally a notch below frontier cloud models, but for the structured, repetitive nature of audit documentation, the gap is often acceptable when set against the regulatory and ethical comfort it provides.
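For firms experimenting with this route, the sketch below shows what a call to a locally hosted model can look like, assuming an Ollama-style HTTP endpoint on the loopback interface. The endpoint URL and model name are illustrative and depend entirely on your own setup; the point is that the request targets 127.0.0.1, so nothing leaves the machine:

```python
import json
import urllib.request

def build_local_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Prepare a call to a locally hosted model.

    The host is the loopback interface and the weights run on the firm's
    own hardware, so no client data crosses the office boundary.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        "http://127.0.0.1:11434/api/generate",   # illustrative local endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_local_request("Draft a variance analysis note for a 12% rise in power cost.")
# urllib.request.urlopen(req) would execute the call once a local model is running.
```

The same pattern works for any local serving stack; only the URL and payload shape change.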

4.3 Option C — Enterprise AI with Contractual Data Protection

Most major AI providers now offer enterprise tiers with explicit no-training, regional-residency and zero-retention contractual commitments. These are markedly different from the public free tier. For a CA firm, this option is acceptable for lower-risk uses — drafting blogs, internal training materials, summarising public regulatory updates, GST circulars, and so on. For client-identifiable data, this option should be paired with rigorous redaction and engagement-letter consent.

The table below compares the three options, with the public free tier included as a baseline, on the criteria that matter most to a CA firm:

| Criterion | Public AI (Free Tier) | Enterprise AI Tier | RAG (Private) | Local LLM |
| --- | --- | --- | --- | --- |
| Data leaves firm? | Yes, fully | Yes, contractually limited | Minimal chunks only | Never |
| DPDP suitable for client PII? | No | Conditional, with consent | Yes, with redaction | Yes |
| Cost (small firm, indicative) | Free / nominal | Moderate per-user | Moderate setup + low run | Moderate-high hardware |
| Quality of output | Highest (frontier models) | Highest (frontier models) | High (frontier + your data) | Good, improving rapidly |
| Ease of adoption | Very easy | Easy | Moderate (needs setup) | Moderate (technical) |
| Best fit | Marketing, public learning | General drafting on non-PII | Day-to-day audit docs | High-risk / forensic audits |

Practical Reality for a Small/Mid-Size CA Firm

You do not have to choose only one. The recommended posture is a layered approach:

• Use Local LLM or strict RAG for any document containing client PII, financials, or working papers.

• Use Enterprise AI Tier (with explicit no-training contracts) for general drafting where any client reference has been thoroughly redacted.

• Restrict Public AI to truly public material — the firm's own blog drafts, social posts, generic learning queries.

• Document this policy in your Standard on Quality Management (SQM 1) framework and train the team annually.
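That layered posture can even be encoded as a simple policy gate, so the decision becomes mechanical rather than discretionary. The Red/Amber/Green classes and tool names below are illustrative placeholders to be mapped to your own data classification register:

```python
# Red/Amber/Green gate: which AI tier may touch which data class.
# Class and tool names are illustrative; align them with your firm's register.
ALLOWED_TOOLS = {
    "RED":   {"local_llm", "rag_private"},                      # client PII, working papers
    "AMBER": {"local_llm", "rag_private", "enterprise_ai"},     # thoroughly redacted material
    "GREEN": {"local_llm", "rag_private", "enterprise_ai", "public_ai"},  # truly public
}

def is_permitted(data_class: str, tool: str) -> bool:
    """Policy check to run before any text is sent to an AI tool.

    Unknown classifications default to an empty set, i.e. nothing is
    permitted until the document has been classified.
    """
    return tool in ALLOWED_TOOLS.get(data_class.upper(), set())
```

Wired into the document workflow, a check like this turns the Acceptable Use Policy from a PDF into an enforced control.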

5. Twelve Safeguards Every CA Firm Should Implement This Quarter

Whatever AI architecture you adopt, the following twelve safeguards form the practical compliance core. They translate the abstract obligations of SA 230, SQM 1 and the DPDP Act into firm-level standard operating procedures.

Governance & Policy

• Firm AI Acceptable Use Policy — written, signed by all team members, naming the tools approved, the data categories prohibited, and the disciplinary consequences.

• Engagement Letter Update — under SA 210 and DPDP transparency requirements, disclose to clients that AI tools may be used in the engagement and the safeguards in place, and obtain informed consent where personal data is involved.

• Data Classification Register — classify each information type (PAN/Aadhaar, financials, board minutes, employee data) into Red/Amber/Green for AI usage.

Technical Controls

• Pre-upload Redaction — automate stripping of PAN, Aadhaar, mobile numbers, bank IFSC and account numbers before any AI submission.

• Metadata Scrubbing — use document inspection and metadata-removal utilities on every file before it touches a third-party tool.

• Local-First Default — set the default working-paper assistant to a local or RAG-based tool; require partner approval to use anything cloud-based for client data.

• Audit Trail of AI Use — maintain a log of which working paper used AI, who reviewed and approved the output, and what edits were made. This satisfies SA 230 paras 8–11.
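A minimal shape for that audit trail is sketched below, assuming a JSON-lines log kept inside the engagement folder so that it is archived with the audit file. The field names are illustrative and can be adapted to the firm's own working-paper referencing:

```python
import json
from datetime import datetime, timezone

def log_ai_use(log: list, working_paper: str, tool: str, prepared_by: str,
               reviewed_by: str, edits_summary: str) -> dict:
    """Append one structured record of AI use to the engagement's AI log.

    In practice `log` would be a JSON-lines file in the engagement folder,
    so the trail is assembled into the audit file along with everything else.
    """
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "working_paper": working_paper,
        "tool": tool,
        "prepared_by": prepared_by,
        "reviewed_by": reviewed_by,
        "edits_summary": edits_summary,
    }
    log.append(json.dumps(entry))
    return entry

ai_log = []
log_ai_use(ai_log, "WP-15 Risk Assessment Memo", "local-llm",
           "article.assistant", "engagement.partner",
           "Reworded conclusion; verified all section references")
```

One line per AI-assisted working paper is enough to answer a peer reviewer's "who approved this?" years later.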

Process & Quality Management

• Mandatory Human Review — no AI-generated text enters a working paper without a senior or partner sign-off, dated and initialled.

• Citation Verification — every section reference, case law, AS/Ind AS reference and rate quoted must be independently verified against the official source. AI-generated citations are starting points, not evidence.

• Annual Refresher Training — formal training, attendance recorded, on AI risks and the firm's policy. This is part of your SQM 1 documentation.

• Vendor Due Diligence — for any AI tool used, retain a vendor file showing the data processing agreement, retention policy, region of data residency, and security certifications.

• Incident Reporting Protocol — a written internal procedure for what happens if a team member accidentally uploads client data to a public tool — including the obligation to report a breach to the Data Protection Board within prescribed timelines.

6. A Quick Case Study from Our Firm

At Himanshu Majithiya & Co., we have been running a hybrid RAG-plus-local-LLM workflow for selected categories of work over the past several months. Three observations worth sharing:

First, time savings on first-draft documentation are real — typically 40–60% on standard memos, narratives and management representation letters — but the savings only materialise when the team trusts the safeguards enough to actually use the tool. A policy without trust does not get used; a tool without policy creates risk. Both are needed together.

Second, a privacy note that is often missed elsewhere: tools that keep files and working context on the user's own machine rather than uploading whole documents to remote servers — Claude Cowork and Claude in Chrome, for example, follow this local-first pattern — reduce the data-residency dimension of the risk. They are not a complete answer, because the model invocation itself still travels to the provider's servers unless configured otherwise, but they are a meaningful piece of the puzzle for working-paper-grade content.

Third, the tool that pays back its cost the fastest is not the flashiest — it is the boring metadata scrubber that runs automatically on every outgoing file. Investment in metadata hygiene is the single most under-appreciated control in a small CA firm today.

7. The Way Forward — Embrace, but on Your Terms

AI in audit is not a future scenario. It is here, and it is reshaping documentation, sampling, risk assessment and reporting in real time. Refusing it puts the firm at a competitive disadvantage; using it carelessly puts the firm at a regulatory and ethical disadvantage. The midway is not a compromise — it is the only intellectually honest position.

For Indian CAs, the next 12–18 months will be defining. The DPDP Rules are coming into force in phases. NFRA's revised auditing standards take effect from 1st April 2026. ICAI is actively encouraging adoption of CA GPT and other AI tools while emphasising professional scepticism. Firms that build a defensible AI-in-audit framework now — Acceptable Use Policy, RAG/local-LLM choice, redaction, audit trail, training — will have a powerful and entirely lawful advantage by the time peer review and NFRA inspection cycles catch up.

The choice is not between AI and no AI. The choice is between AI used responsibly and AI used recklessly. As CAs, we already know which side of that line our profession is built on.

Talk to Us

If your firm is exploring how to deploy AI in audit documentation safely — whether through RAG, local LLMs, metadata controls, or a complete SQM 1-aligned policy — Himanshu Majithiya & Co. would be glad to share its working framework and templates.

📞 +91 98250 58149   ✉ info@himanshumajithiya.com   🌐 www.himanshumajithiya.com

507-508, Maple Trade Centre, SAL Hospital Road, Thaltej, Ahmedabad – 380059

Disclaimer

This article is published in compliance with the ICAI Code of Ethics and the Council Guidelines on advertisement by Chartered Accountants in practice. The contents are for general information and educational purposes only and do not constitute professional advice or solicitation of work. Readers are advised to obtain independent professional advice before acting on any information contained here. The references to laws, standards and statistics are based on sources publicly available as on the date of publication; readers should verify the current position before relying on them. Tools and products mentioned are illustrative; no endorsement is implied. Data privacy of any specific tool depends on the configuration used by the firm — readers should review vendor terms before deployment.

Tags

#AIinAudit #AuditDocumentation #SA230 #DPDPAct #ICAI #CharteredAccountant #RAG #LocalLLM #DataPrivacy #AuditRisk #CAFirm #AIcompliance #AuditTech
