A quiet exit from a tightening contract.
NODWIN currently pays ₹6.4L/year for Slack Pro at a 55% legacy discount. That discount expires at renewal. Slack is simultaneously gating AI behind Business+, a tier that would cost ~₹32L/year at list. Mattermost — self-hosted on our existing KhetiAI fleet — eliminates the contract risk, unlocks AI we control, and costs near-zero in marginal infrastructure.
The thesis
Three forces converge over the next eight months. One: our discounted Pro renewal lands on 05 January 2027 — almost certainly with a price hike. Two: Slack has restructured AI tiering to push everything meaningful into Business+, a 107% jump from Pro. Three: NODWIN already owns the infrastructure — 107GB VRAM, LiteLLM, MS-01 gateway, Flashstor NAS — to run a self-hosted alternative as a first-class service, not a science experiment.
Mattermost is a mature, MIT-licensed Slack analogue used by Samsung, NASA, the U.S. Air Force, and Wealthsimple. Its Mattermost Agents plugin speaks OpenAI-compatible API — meaning our existing LiteLLM endpoint becomes the brain of every NODWIN channel, with full data sovereignty.
Why this brief exists
Not to commit to migration — to authorise a zero-cost parallel pilot between August and November 2026, with the formal decision made before the Slack renewal quote lands. The brief lays out, by stakeholder: what's gained, what's risked, how migration mechanically works, how we test it, how we back it up, and how it maps to ISO 27001 controls.
Each tab can be read standalone. CFO/CCO → Stakeholders tab. IT → Backup + Compliance tabs. Product → Migration How-To + Dual-Run. Founder → this tab + Stakeholders.
What Mattermost actually is
Mattermost is an open-source, self-hostable team messaging platform — effectively Slack with an MIT licence and no vendor lock-in. The source code is fully public and auditable. It is used in production by Samsung, NASA, the U.S. Air Force, Wealthsimple, and several hundred thousand self-hosted teams worldwide.
The complete server, web app, and mobile client in one repo. MIT-licensed. 30,000+ stars. Active release cadence — monthly minor versions, ad-hoc security patches.
End-to-end deployment demo — Docker Compose, Postgres, reverse proxy with TLS, first admin setup. Covers the exact stack we're deploying on EvoX2. ~25 minutes.
The canonical install reference. Covers prerequisites, compose configuration, plugin setup, OAuth, and mobile push. This is what the IT runbook is built on top of.
Eight people. One decision.
Each stakeholder reads this differently. The honest pros and cons for each role — not the sales pitch.
↗ Gains
- Reduced vendor concentration risk — Salesforce/Slack pricing power neutralised
- AI sovereignty — NODWIN's IP and conversations don't transit a US cloud LLM
- Story for the board: tech-forward founder operating ahead of cost curve
- Aligns with NODWIN's broader India-first / self-hosted infrastructure thesis
- Optionality: keep Slack archive while testing exit — no bridge burned
↘ Costs & Risks
- Visible change for 217 employees — change fatigue is real
- External partners (Branded Live, JetSynthesys, Juno Capital) may resist new chat tool
- If something breaks during a critical event (Comic Con, NH7), reputational cost
- Founder bandwidth — will be in the loop on early issues during pilot
↗ Gains
- ₹5L direct saving in Year 1 vs. current Slack Pro spend
- ₹13L renewal-risk hedge if Slack revokes the 55% discount in Jan 2027
- ₹20–25L AI cost avoided vs. forced upgrade to Business+ for AI features
- Capex already sunk — KhetiAI hardware is paid for; marginal cost approaches zero
- Negotiation leverage with Slack rep at renewal — a real BATNA, not a bluff
- No FX exposure — Mattermost OSS has no USD-denominated invoice
↘ Costs & Risks
- Hidden labour cost — est. 80 engineer-hours for setup, ~10 hrs/month ops
- Slack credits forfeit if account fully cancelled — keep Pro as archive to preserve them
- One-time migration friction may cause 2–3 days of productivity drag during cutover
- If pilot fails and we revert, sunk discovery cost ≈ ₹2–3L in time
- No vendor SLA — IT carries operational risk in-house
↗ Gains
- Channel governance moves fully in-house — no Slack admin tier limitations
- AI per-channel summaries replace manual status updates and catch-up overhead
- Custom bots per product vertical — brand intel, release notes, localisation
- Boards (Kanban) built in — removes Trello/Asana dependency for light PM work
- Full message history forever, not capped by Slack Pro's 90-day search limit
↘ Costs & Risks
- Highest change burden — as Slack's de-facto global admin, migration coordination lands here
- Slack Workflow Builder automations need recreation in Mattermost Playbooks
- No Slack Canvas equivalent — collaborative pages must move to Boards or external doc
- Global team coordination across timezones during cutover week is complex
- Niche Slack integrations (if any) may have no direct Mattermost equivalent
↗ Gains
- Familiar interface — channels, threads, DMs, reactions; minimal retraining needed
- AI summaries available in every channel — catch up on busy threads in one click
- One-click Google sign-in — no new password, no account sprawl
- All historical Slack conversations migrated; institutional memory preserved
- Slack remains read-accessible for 60+ days post-cutover — no hard cliff
↘ Costs & Risks
- New URL and app to install — muscle memory for slack.com takes 2–4 weeks to reset
- Cross-org Slack DMs with external partners don't migrate; some will keep Slack open
- During cutover week, team members may be split across platforms temporarily
- If a critical business event coincides with cutover, disruption risk is real
↗ Gains
- Full control over the communication stack — no vendor dependency for security or compliance
- Integrates with existing KhetiAI / LiteLLM infrastructure — no new vendor contracts
- MIT-licensed source code — auditable, forkable, no black-box components
- ISO 27001 control mapping is straightforward — Mattermost exposes all the right hooks
- Post-pilot: Rakesh owns the technical roadmap for the platform, not Slack's PM team
↘ Costs & Risks
- Ops ownership transfers to NODWIN — patching, monitoring, incident response
- Team needs Mattermost-specific knowledge; ramp-up cost before full handover
- Without CTO sign-off, Phase 1 deployment lacks the technical authority it needs
- Standby host, backup drills, and DR runbooks all require CTO-level sign-off for compliance
↗ Gains
- Clear ownership of the deployment — single person accountable for Phase 0–2 execution
- Existing access to the KhetiAI infrastructure reduces onboarding friction
- Founder's Office context means escalation path is short and trusted
- Mattermost System Console is well-documented — ramp-up is measurable
↘ Costs & Risks
- This role is the pilot's single point of failure — without a named deployer, nothing ships
- Est. 80 hours of focused work across Phase 0–2; must be ring-fenced from other duties
- On-call responsibility during the Aug 1 and Sep 5 migration windows
- Needs handover plan for ongoing ops after pilot — can't stay a one-person dependency
Migration by role.
Exact steps for each audience. Hand this tab directly to the person who needs it.
Prerequisites
EvoX2 host with Docker + Tailscale + Traefik already running. Google Workspace admin access to create an OAuth client. NAS reachable from EvoX2.
Step-by-step
- Provision PostgreSQL secrets & storage paths
Create
/opt/mattermost/{config,data,logs,plugins,bleve-indexes,postgres}with proper UIDs (2000:2000 for Mattermost). - Deploy via Docker Compose
Use the compose file in the next code block. Run
docker compose up -dfrom/opt/mattermost/. - Bind to Traefik with TLS
Label the container for
chat.nodwin.tailf49db2.ts.net. Tailscale serve handles HTTPS automatically. - Create the first system admin via CLI
Inside the container:
mmctl user create --email akshat.rathee@nodwin.com --username akshat --system-admin. - Configure Google OAuth in System Console
Authentication → OpenID Connect → paste Client ID + Secret. Restrict domain to
nodwin.com. - Disable email/password sign-in
Force SSO-only. Eliminates password-management burden entirely.
- Install plugins
Mattermost Agents (AI), GitHub, Jira, Calls, Boards. Each ~30 seconds via the System Console marketplace or sideload.
- Wire Mattermost Agents → LiteLLM
Plugin settings → add LLM provider → set base URL to
http://ms01.tailf49db2.ts.net:4000/v1, model name to whatever LiteLLM routes (e.g.claude-sonnet,qwen-72b-local). - Configure retention & compliance exports
Set message retention to indefinite. Enable compliance export to JSONL daily into
/opt/mattermost/exports. - Enable mobile push via Mattermost HPNS
Free hosted push notification service — sufficient for 217 users. System Console → Mobile Push.
- Set up backup cron jobs
See Backup Strategy tab for full scripts.
- Run smoke test before public announcement
Invite 3 users (Akshat + 2 IT). Send messages, upload files, test mobile push, test AI bot, test backup restore.
Docker Compose reference
- Audit your Slack channels first
List the channels you own. Archive any dead ones in Slack before migration to keep the import clean.
- Decide on the team structure
Mattermost separates "teams" (workspaces) from "channels". For NODWIN, recommend one team — "nodwin" — with channels organised by category. Or split:
nodwin,comic-con,esports. - Verify channels post-import
After the Aug 1 migration, log in and confirm every channel you owned in Slack appears, with members, pinned posts, and topic intact.
- Set channel purpose & header
Slack's channel topic migrates to Mattermost's header. Channel description goes to "Purpose". Verify both are sensible.
- Configure notification defaults
In each channel: Channel Settings → Notification Preferences. Recommend "All new messages" for #announcements, "Mentions only" for high-traffic ops channels.
- Pin the critical messages
Pinned messages migrate. Verify the right ones are present. Add a new pinned message: "Welcome to our Mattermost home" with a link to the cutover FAQ.
- Test the AI summariser
Click the "Ask Agents about this channel" button. Confirm it returns sensible output. This is your new superpower — show the team.
- Enable channel-specific bots if useful
For #legal: route to nodwin-legal-bot. For #finance: nodwin-finance-bot. Configure via System Console (admin assists).
- Conduct a "first day" walkthrough with your team
15-minute call — show the UI, the search, the threads, the AI bot. Answer questions live.
- Open the welcome email
You'll receive an email from chat@nodwin.com with a sign-in link.
- Sign in with your NODWIN Google account
One click — no new password. Use the same Google account as Gmail.
- Install the desktop app
Download from mattermost.com/download. Installs in 60 seconds on Windows, macOS, Linux.
- Install the mobile app
Search "Mattermost" on Play Store or App Store. The icon is a blue/teal hash. Disable Slack notifications on your phone after install to avoid double-pinging.
- Set your profile
Same photo as Slack helps continuity. Add timezone, role, and a fun status emoji.
- Find your channels
Public channels you joined in Slack are pre-joined here too. Browse "More Channels" to discover others.
- Configure notifications
Settings → Notifications. Recommend: Desktop = Mentions & DMs; Mobile = Mentions & DMs after 30s of inactivity; Email = Off.
- Try the AI assistant
Click the sparkle icon in any thread → "Summarise this". Or DM @nodwin-assistant directly with a question.
- Send your first message
Post a hello in #welcome. Confirm reactions, emoji, and threading work as expected.
- Keep Slack open in read-only mode for now
During the parallel-run period, you can still read historical Slack messages. New conversations happen here.
Post in #mattermost-help. The IT team and channel champions monitor it. Average response time during pilot: under 15 minutes during work hours.
Two migrations. One cutover.
The Aug 1 migration is the test fire. The Sep 1 migration is the real one — a delta refresh that captures everything said in Slack during August. Both backed by Slack-as-archive.
Why two migrations
A single big-bang cutover means trusting a six-hour window to go perfectly. A two-stage plan lets us migrate, live with the result for a month, then re-import a fresh export that includes everything said in the interim. Zero data loss. Zero rollback drama.
The Aug 1 import is the shakedown: schema correctness, permissions, plugin behaviour, attachment fidelity, search index health. The Sep 1 import is the cutover: same process, run on a clean instance, with all August activity included.
The timeline
- Owner: IT + Akshat
- Exit criteria: 3-user dev workspace fully functional, AI bot responding
- Owner: Akshat + Gautam (billing)
- Expected wait: 24–72 hours for ZIP generation
- Confirm download integrity (SHA256) on receipt
nodwin-test.
- 09:00 — Unzip Slack export, verify file count, run SHA256 check
- 10:00 — Execute
mmetl transform slack --team nodwin-test --file slack-export.zip --output import.jsonl - 11:00 —
mmctl import upload import.zip&mmctl import process - 13:00 — Manual verification: 20 random channels, 50 random messages, 10 attachments, 5 DMs
- 15:00 — Invite 10 early adopters; production smoke test
- 17:00 — Status report; document any defects in
migration-log.md
- Daily Netdata + audit-log review
- Weekly check-in with early adopters
- Issues tracked in dedicated Gitea project
- If Business+ trial expired, briefly re-upgrade via paid route — ~₹2.7L for 1 month, or negotiate goodwill window
nodwin, with the Aug-31 export.
- 08:00 — Drop the
nodwin-testteam (or archive) - 09:00 — Fresh import of Aug-31 export into
nodwinteam - 12:00 — Verification (20 channels, 100 messages, 25 attachments)
- 14:00 — Roll out invites to all 217 users in cohorts of ~50/day over 4 days
- Slack stays active but is internally messaged as "archive only — no new conversations"
- Weekly metrics report: active users, message volume, plugin errors
- Mid-month: AI usage analytics — which bots are being used, by whom
- Decision criteria documented (see § 04.3)
Cutover decision criteria
The go/no-go on 03 Nov is binary. Must hit all six to proceed:
- Uptime ≥ 99.5% over the Sep–Oct window (allows ~7h downtime total)
- Mobile push reliability ≥ 95% measured via test pings
- Zero P0 incidents (data loss, auth failure lasting >30 min, total outage)
- Active-user count ≥ 80% of headcount by week 6 of full rollout
- Backup restore drill completed successfully on a fresh host
- AI bot adoption ≥ 30 unique users — proxy for differentiated value
Rollback plan
If any criterion fails: do nothing. Slack remains active throughout. Communicate to the team that we're continuing on both, send the IT learnings into a retro, and re-evaluate in Q2 2027. Sunk cost: ~80 engineer-hours and the Business+ trial burn. No business disruption.
Three layers. Two destinations. One drill a month.
Mattermost has exactly three things to back up: the Docker stack definition, the file attachments, and the Postgres database. Lose any one and recovery is hard. Lose all three and there is no recovery.
Backup tiers
The compose file, environment variables, plugin binaries, and Mattermost config.json.
Small, slow-changing, version-controlled.
- Source
/opt/mattermost/{docker-compose.yml,.env,config,plugins}- Method
- Private Gitea repo + nightly
tar.gzto NAS - Frequency
- On change (Gitea) + nightly snapshot
- Retention
- Forever (Gitea), 90 days (NAS tarballs)
- Restore time
- ~10 min
Every file uploaded — images, PDFs, voice notes. Grows monotonically; needs lifecycle policy.
- Source
/opt/mattermost/data/(bind mount)- Method
- Hourly
rsyncto NAS + weeklyresticsnapshot to B2 - Frequency
- Hourly (NAS), Weekly (off-site)
- Retention
- 30 daily / 12 weekly / indefinite quarterly
- Restore time
- ~1 hour for 100GB
The system of record. All messages, channels, users, permissions, audit logs. The single most important asset.
- Source
mmuserdatabase on Postgres 15- Method
pg_dump --format=customnightly + streaming replication to GTR9 Pro- Frequency
- Nightly logical dump; continuous replication
- Retention
- 30 daily / 12 monthly / 5 yearly
- Restore time
- ~30 min for 5GB DB
Backup script (cron'd nightly)
Restore drills
Backups that haven't been restored aren't backups. Cadence:
- Monthly — Restore last night's
pg_dumpto a staging container. RunSELECT COUNT(*) FROM Posts; confirm sane count. - Quarterly — Full DR drill on the GTR9 Pro standby. Bring up a complete Mattermost from the latest restic snapshot. Verify a known user can sign in and read a known message from 3 months ago.
- Annually — Off-site DR. Pull from B2, restore on a fresh VPS in a different region. Document RTO actually achieved.
Recovery targets
Mapped to ISO 27001 Annex A.
Every control category that applies to a chat platform, with the concrete implementation choice for our Mattermost deployment. Also covers DPDP Act (India) and GDPR-relevant obligations.
Control mapping
| Annex A | Control | Implementation |
|---|---|---|
| A.5 | Information security policies | Chat communication policy drafted, signed by CEO, reviewed annually. Acceptable use, content classification, and external sharing rules included. Published in #policies channel. |
| A.6 | Organisation of security | Two system administrators designated (Akshat + IT lead). DPO named for DPDP Act compliance. RACI matrix for incidents documented in IT runbook. |
| A.7 | Human resources security | NDA + chat policy acknowledgement on onboarding. Offboarding triggers automated Google Workspace removal → OAuth lockout within 5 minutes. |
| A.8 | Asset management | Inventory: EvoX2 primary, GTR9 Pro standby, ratheesnas, B2 bucket. Data classification: Public / Internal / Confidential / Restricted; channels labelled per scheme. |
| A.9 | Access control | Authentication: SSO via Google Workspace OAuth only — no local passwords stored. MFA: enforced at Google Workspace level for all users. Authorisation: 5-tier role model (System Admin / Team Admin / Channel Admin / Member / Guest). Reviews: quarterly access audit by IT. |
| A.10 | Cryptography | In transit: TLS 1.3 only via Traefik; HSTS enabled. At rest: LUKS-encrypted volumes on EvoX2 + GTR9 Pro hosts; restic AES-256 for backups. Password hashing: not applicable — SSO-only deployment. Any fallback local accounts use Argon2id (Mattermost default). Key management: backup encryption keys stored in 1Password vault with break-glass copies in physical safe. |
| A.11 | Physical security | EvoX2 and GTR9 Pro located in Akshat's office (Pyramid 804L / wall-mount). Locked premises, alarm system. NAS in same facility. Off-site backup (B2) provides physical separation. |
| A.12 | Operations security | Patching: monthly minor versions, immediate for CVEs ≥ 8.0. Logging: Mattermost audit log + Postgres logs ingested by Loki, surfaced in Grafana. Anti-malware: file upload size limit (500MB), MIME-type allowlist, ClamAV scanning plugin. Vulnerability scans: monthly Trivy scan on container images. |
| A.13 | Communications security | All inter-service traffic over Tailscale (WireGuard, zero-trust). Public endpoint via Tailscale Serve with HTTPS. No service exposed on public IP. Egress restricted to LiteLLM gateway only. |
| A.14 | System acquisition & development | Open-source code review (or Mattermost-signed) for all plugins before install. New plugins deployed first to dev instance for one week before production rollout. |
| A.15 | Supplier relationships | Mattermost: open-source, no DPA needed for OSS edition. Google Workspace: existing DPA. Backblaze B2: DPA executed; SOC 2 attestation on file. Tailscale: free tier, no DPA executed; review if enterprise upgrade. |
| A.16 | Incident management | Incident response runbook in /opt/mattermost/runbooks. P0 / P1 / P2 classification. P0 = data loss or full outage > 30 min, requires CEO notification. Post-incident review within 5 working days. |
| A.17 | Business continuity | RPO 1 hour / RTO 4 hours documented and tested. Quarterly DR drill (see Backup tab § 05.3). Annual full off-site recovery test. Standby Mattermost on GTR9 Pro kept current via streaming replication. |
| A.18 | Compliance | DPDP Act: data localised in India (homelab + India-region B2 if elected). User consent captured on first login. Right-to-erasure handled via mmctl user delete.
GDPR: applies to any EU-based users (Jordan Orien NODWIN Japan? Bruce Stein US?) — Article 30 record kept; DPA available on request.
Audit: annual internal audit; external attestation deferred until business case justifies. |
Sensitive-data handling
Passwords & secrets
Mattermost itself stores no user passwords — SSO eliminates the attack surface entirely.
The few internal credentials that exist (Postgres password, LiteLLM API key, restic
passphrase, OAuth client secrets) are managed in 1Password with controlled access. The
.env file on EvoX2 is mode 0600
owned by root, never committed to Gitea.
Channels containing sensitive data
Channels handling investor data, M&A, employee PII, or financial controls are marked Restricted. They are: private, audit-log-enabled, retention-locked, and limited to named individuals. AI bot access disabled by default in Restricted channels — explicit per-channel opt-in required.
DM and private content
Direct messages and private channels are end-to-end-stored encrypted at rest (via host LUKS). They are visible to system administrators only through formal compliance export — not through routine browsing. Compliance export events are themselves audit-logged.
External sharing & guests
Mattermost Connect — the equivalent of Slack Connect for cross-org channels — is disabled at System Console level. Guest accounts are allowed only for named partners (Branded Live, Juno Capital, vendor delivery) and provisioned for the minimum required channels.
Open-source Mattermost does not offer SAML SSO — only OAuth/OIDC. If NODWIN later adopts an enterprise IDP (Okta, Entra) requiring SAML, a Mattermost Enterprise licence becomes necessary (~₹8L/year at current sizing). Google Workspace OAuth covers us today.
Retention policy
Mattermost retention is configured to be intentionally generous, with carve-outs for sensitive content:
- Public & internal channels — retained indefinitely. Storage is cheap; institutional memory is not.
- Private channels — retained 5 years; reviewed at channel archive.
- DMs — retained 2 years; users can self-delete individual messages within 24 hours of posting.
- Audit logs — retained 7 years for compliance.
- File attachments — same retention as the channel that contains them; orphan files purged after 90 days.
Incident classification
| Severity | Definition | Response |
|---|---|---|
| P0 | Complete outage > 30 min; confirmed data loss; security breach | Page IT lead + CEO immediately. Status update every 30 min. Public post-mortem within 5 days. |
| P1 | Partial outage; degraded performance affecting >50% users; auth issue | Page IT lead. Status update every 2 hours. Resolution target: 4 hours. |
| P2 | Minor bug; plugin malfunction; non-critical feature unavailable | Ticket logged. Resolution target: 5 working days. |
| P3 | Cosmetic; documentation; enhancement request | Backlogged. Reviewed in monthly IT planning. |