DPDP for engineers: the code changes that actually matter
Consent schemas, purpose limitation, retention jobs, and audit structures for Indian SaaS teams moving from policy to implementation
Most DPDP compliance discussion has happened at the policy layer: appoint a Data Protection Officer, update the privacy notice, train staff on data handling. None of that is wrong. But at some point the compliance officer walks into the engineering standup and asks whether the system is ready, and the answer is usually a long silence. This article is for engineers who have been handed that question and want to know what actually changes in the codebase. DPDP for engineers is not about policy documents — it is about schema migrations, consent flows, retention jobs, and audit structures.
What the DPDP Act actually asks of your codebase
The Digital Personal Data Protection Act, notified in August 2023 with rules finalised in 2025, creates four engineering-relevant obligations. Purpose limitation: data collected for one stated purpose cannot be used for another without fresh consent. Data minimisation: collect only what the stated purpose requires. Storage limitation: delete personal data when the purpose for which it was collected is served, or when consent is withdrawn. Accuracy and security: maintain reasonable data accuracy and protect against unauthorised access, with a 72-hour breach notification obligation to the Data Protection Board.
Each obligation maps to a specific code implication. Purpose limitation requires a consent model tied to a declared use case. Data minimisation requires a field-level audit of your schema. Storage limitation requires an automated deletion mechanism with a defined trigger. Security largely builds on your existing posture, with the addition of a documented breach response process. The first three require the most engineering work and are what this article covers.
| Obligation | What it means for your code | Rough effort |
|---|---|---|
| Purpose limitation | Track consent per purpose; block cross-purpose data use without re-consent | 1–2 weeks |
| Data minimisation | Audit every personal data field; remove or reduce fields not required for the stated purpose | 2–4 weeks (ongoing) |
| Storage limitation | Build a retention policy and an automated deletion or anonymisation job | 2–4 weeks |
| Security & accuracy | 72-hour breach notification process; periodic accuracy checks for critical fields | 1 week to document, then ongoing |
Consent state management — the first schema migration
Under DPDP, consent must be free, specific, informed, and withdrawable at any time. The current state in most SaaS products: a boolean column in the users table, stored once at signup, never tested for revocability. The minimum viable change is a dedicated consent_records table.
```sql
CREATE TABLE consent_records (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    principal_id UUID NOT NULL,
    purpose VARCHAR(100) NOT NULL,        -- 'account_management', 'marketing', 'analytics'
    mechanism VARCHAR(50) NOT NULL,       -- 'signup_form', 'settings_page', 'email_link'
    given_at TIMESTAMPTZ NOT NULL,
    withdrawn_at TIMESTAMPTZ,
    privacy_version VARCHAR(20) NOT NULL, -- version of notice shown at consent time
    source_ip INET
);
```

The critical design decision: consent records are immutable. When a user withdraws consent, insert a new row with withdrawn_at set; do not update the original row. The current consent state for any (principal_id, purpose) pair is read from the latest record for that pair, ordered by COALESCE(withdrawn_at, given_at): consent is active only when that record's withdrawn_at is NULL. This preserves the full history, including the exact privacy notice version shown when consent was first obtained, which is what you need if a withdrawal is later disputed.
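Resolving the current state from that immutable history is a read-time computation. A minimal sketch in Python (record shape mirrors the schema above; the function name is illustrative):

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass(frozen=True)
class ConsentRecord:
    principal_id: str
    purpose: str
    given_at: datetime
    withdrawn_at: Optional[datetime] = None

def is_consent_active(records, principal_id: str, purpose: str) -> bool:
    """Resolve current consent state from an immutable event history.

    The latest event for the (principal_id, purpose) pair wins: a
    withdrawal row carries withdrawn_at, so consent is active only
    when the most recent event has withdrawn_at unset.
    """
    relevant = [r for r in records
                if r.principal_id == principal_id and r.purpose == purpose]
    if not relevant:
        return False
    latest = max(relevant, key=lambda r: r.withdrawn_at or r.given_at)
    return latest.withdrawn_at is None
```

A re-grant after withdrawal is simply another insert with a fresh given_at, so the same function answers the state question at any point without ever mutating history.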
One downstream implication: your user deletion flow changes. If you hard-delete users, you lose evidence that they consented in the first place. The standard approach is to retain consent records indefinitely, either in their original form or with principal_id replaced by a one-way hash. Consent records are the one data class where retention for dispute resolution takes precedence over minimisation.
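The one-way hash option can be sketched as follows (the function name and the keyed-hash choice are illustrative; the key must live outside the database so raw IDs cannot be brute-forced back from the digests):

```python
import hashlib
import hmac

def pseudonymise_principal_id(principal_id: str, key: bytes) -> str:
    """Replace principal_id in retained consent records with a keyed,
    one-way hash: stable enough to match a future dispute against the
    consent history, but not reversible to the original identifier."""
    return hmac.new(key, principal_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

Because the same (key, id) pair always yields the same digest, a disputed withdrawal can still be matched to its consent records after the user row itself is gone.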
Purpose limitation and the data inventory
DPDP requires that data collected for purpose A cannot be used for purpose B without fresh consent. This is conceptually simple and operationally difficult, because most production codebases have never mapped their data collection to stated purposes. The place to start is a personal data inventory.
Walk every table and column in your schema and tag each field by category:
- Direct identifiers: name, email address, phone number, PAN, Aadhaar number, passport number
- Indirect identifiers: IP address, device ID, browser fingerprint, precise geolocation
- Profile data: job title, organisation name, profile photo, stated preferences
- Behavioural data: page views, feature usage events, session duration, search queries
- Transactional data: purchase history, invoices, contract records
For each tagged field, record why it is collected and what it is actually used for. The two questions often have different answers. A phone number collected 'for two-factor authentication' but also passed to the marketing team for outreach is a purpose limitation violation under DPDP: the second use requires separate, specific consent.
The output does not have to be a formal data catalogue. A YAML or JSON file checked into source control, acting as a static manifest that maps each table and column to its declared purpose, is sufficient. This document becomes the reference for compliance queries and the input to any future minimisation work.
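A sketch of what such a manifest might look like, expressed here as a Python literal so a purpose-drift check can run in CI (field names, categories, and the helper are illustrative, not a prescribed format):

```python
# Hypothetical inventory entries: each key is "table.column"; each value
# records the declared purpose and every observed use of the field.
DATA_INVENTORY = {
    "users.email": {
        "category": "direct_identifier",
        "declared_purpose": "account_management",
        "actual_uses": ["account_management"],
    },
    "users.phone": {
        "category": "direct_identifier",
        "declared_purpose": "two_factor_auth",
        "actual_uses": ["two_factor_auth", "marketing"],  # drift: needs fresh consent
    },
    "events.session_duration": {
        "category": "behavioural",
        "declared_purpose": "product_analytics",
        "actual_uses": ["product_analytics"],
    },
}

def purpose_violations(inventory: dict) -> dict:
    """Fields whose observed uses have drifted beyond the declared purpose."""
    return {
        field: sorted(set(meta["actual_uses"]) - {meta["declared_purpose"]})
        for field, meta in inventory.items()
        if set(meta["actual_uses"]) - {meta["declared_purpose"]}
    }
```

On this sample inventory, purpose_violations flags users.phone, mirroring the two-factor/marketing example above.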
Data minimisation — the schema audit you have been avoiding
Once you have the inventory, data minimisation asks a direct question for each field: 'If we removed this entirely, would the core product function?' The honest answer is yes for more fields than most teams expect.
Common findings when teams run this audit: free-text fields like job title or company name stored verbatim when a normalised enum would serve the product purpose; full phone numbers retained when only the last four digits appear in the UI; date of birth stored when only age verification is required; full postal addresses kept when only a PIN code is used for routing; analytics events carrying user IDs when the aggregate count is all that is ever queried.
Minimisation does not always mean deletion. The options available to you: hashing a field before storage so lookups work but plaintext is never held; truncating precision (city instead of GPS coordinates); replacing a stored field with a derived value computed at query time; or partitioning personal data into a separate schema with its own, shorter retention schedule.
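The first three options can be sketched on a hypothetical signup record (the field names and the PIN-extraction heuristic are assumptions for illustration, not a prescription):

```python
import hashlib
from datetime import date

def minimise(record: dict) -> dict:
    """Apply minimisation transforms to a raw signup record:
    - phone: keep a lookup hash plus the last four digits shown in the UI
    - date_of_birth: reduce to the derived fact the product actually needs
    - address: truncate to the PIN code used for routing
    """
    today = date.today()
    dob = record["date_of_birth"]
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    return {
        "phone_hash": hashlib.sha256(record["phone"].encode()).hexdigest(),
        "phone_last4": record["phone"][-4:],
        "is_adult": age >= 18,
        # assumes the PIN code is the final whitespace-separated token
        "pin_code": record["address"].rsplit(" ", 1)[-1],
    }
```

The stored output contains no full phone number, no date of birth, and no street address, yet every product behaviour described above still works.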
Storage limitation — building the retention job
This is the most operationally complex DPDP obligation. The requirement is clear: delete data when the purpose is served or consent is withdrawn — but 'the purpose is served' is deliberately left for you to define. For a SaaS product, a defensible interpretation is: personal data is retained while the account is active, and for a defined period after closure or consent withdrawal. Document that period in your privacy notice before building the job.
```python
RETENTION_POLICIES = {
    "user_profile": {
        "trigger": "account_closed",
        "retention_days": 90,
        "delete_strategy": "hard_delete",
    },
    "audit_logs": {
        "trigger": "account_closed",
        "retention_days": 365,
        "delete_strategy": "anonymise",  # replace principal_id with 'deleted-<hash>'
    },
    "consent_records": {
        "trigger": "never",  # retained for dispute resolution
        "delete_strategy": "none",
    },
    "session_data": {
        "trigger": "session_ended",
        "retention_days": 30,
        "delete_strategy": "hard_delete",
    },
    "marketing_analytics": {
        "trigger": "consent_withdrawn",  # purpose-specific trigger
        "retention_days": 0,
        "delete_strategy": "hard_delete",
    },
}
```

The deletion job runs daily and processes records whose trigger condition is met and whose retention window has elapsed. Hard delete cascades are the main source of complexity: deleting a user row breaks foreign key constraints across invoices, support tickets, and audit logs. The three practical options: cascade deletion (use with care; test thoroughly on a staging copy of production data), tombstone replacement (replace the user_id foreign key with a sentinel value like 'deleted-{hash}'), or in-place anonymisation (null out identifying fields, retain the row for aggregate integrity).
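The daily sweep driven by that config can be sketched as follows; the `store` object and its `due_records`, `hard_delete`, and `anonymise` methods are assumptions about your data access layer, not a real API:

```python
from datetime import datetime, timedelta, timezone

def run_retention_job(policies: dict, store, now: datetime = None) -> None:
    """Daily sweep: delete or anonymise records whose trigger event has
    fired and whose retention window has elapsed."""
    now = now or datetime.now(timezone.utc)
    for data_class, policy in policies.items():
        if policy["trigger"] == "never":
            continue  # e.g. consent_records: retained for dispute resolution
        cutoff = now - timedelta(days=policy.get("retention_days", 0))
        # due_records is an assumed data-access method returning records of
        # this class whose trigger event occurred before `cutoff`.
        for record in store.due_records(data_class, policy["trigger"], cutoff):
            if policy["delete_strategy"] == "hard_delete":
                store.hard_delete(data_class, record)
            elif policy["delete_strategy"] == "anonymise":
                store.anonymise(data_class, record)
```

Keeping the policy table as data rather than code means the privacy notice, the auditor-facing document, and the job all read from the same source of truth.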
One further complication: data held by processors. If you share personal data with a payment gateway, email delivery service, or analytics tool, you remain responsible for it as the data fiduciary; DPDP's obligations do not stop at your own database. Your deletion workflow needs either a mechanism to request deletion from processors or contractual guarantees that their retention windows match yours.
The audit log structure DPDP actually needs
DPDP creates a right of access: any data principal can ask what data you hold about them and who has accessed it. Standard application logs — request method, endpoint, status code — do not answer these questions. An audit event that does has a specific shape.
```json
{
  "event_id": "evt_01J2XYZABC",
  "timestamp": "2026-05-15T09:12:33Z",
  "actor_type": "user",
  "actor_id": "usr_abc123",
  "action": "view",
  "resource_type": "invoice",
  "resource_id": "inv_xyz456",
  "principal_ids_accessed": ["usr_def789"],
  "purpose": "self_service_billing",
  "ip_hash": "sha256:8a3b..."
}
```

The principal_ids_accessed array is the critical field. When a data principal submits an access request, you query this field to return every event that touched data about them, regardless of which actor triggered it. An administrator viewing a customer's invoice generates an event where actor_id is the admin and principal_ids_accessed includes the customer. Log the IP as a hash rather than plaintext unless you have a specific operational reason to retain the raw value; the hash is sufficient for correlating suspicious activity without holding the IP as personal data.
The purpose field is what lets you demonstrate to an auditor that each data access was for a declared and consented use. Without it, an access log is evidence of activity but not evidence of legitimacy. This is also what enables you to answer a more precise version of the data principal's access request: not just 'here is every event that touched your data' but 'here is every event, grouped by purpose'.
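Answering that grouped access request is then a straightforward scan over the audit stream. A sketch, assuming the event shape shown above (the function name is illustrative):

```python
from collections import defaultdict

def access_report(events, principal_id: str) -> dict:
    """Answer a data principal's access request: every audit event that
    touched their data, grouped by the declared purpose of the access."""
    grouped = defaultdict(list)
    for event in events:
        if principal_id in event.get("principal_ids_accessed", []):
            grouped[event.get("purpose", "undeclared")].append(event["event_id"])
    return dict(grouped)
```

In production this would be an indexed query rather than a scan, but the shape of the answer is the same: event IDs keyed by purpose, ready to hand to the data principal or an auditor.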
Where to start
If DPDP compliance is genuinely new to your codebase, the priority order below is practical rather than arbitrary. Each stage builds on the previous one, and each can be shipped incrementally without touching core product functionality.
1. Consent schema. Nothing else is tractable without knowing who consented to what and when. This is a one-week sprint for most teams: schema migration, a backend endpoint to record consent events, a UI hook at the relevant action points.
2. Structured audit logging. You need this to respond to access requests, and it is the kind of infrastructure that compounds over time. Implementing audit events with principal_ids_accessed for the ten most sensitive operations takes one to two weeks and is purely additive.
3. Retention jobs. Write the policy document first. That document is what you show auditors, and it goes into your privacy notice. Then build the job for the highest-volume data first. Two to four weeks depending on cascade complexity.
4. Data minimisation. This is ongoing, not a one-time sprint. Start with the field inventory, then run deletion candidates past engineering and legal. Some fields will be straightforward; others will surface surprising dependencies.
The DPDP rules are now in force. The question for Indian engineering teams has moved from 'do we need to comply' to 'what do we build first'. Starting with consent and audit logging gives you something real to present and, more importantly, gives your team the habit of treating personal data as a first-class concern in the codebase — which is what the Act is ultimately asking for.