Since our clients are in the USA and the engineering team is in Vietnam (a 15-hour time difference), we have had our fair share of issues. Here are the things we have done to eliminate or minimize them:
- Extensive documentation on system architecture, db design, class design
- Video recordings/screenshots of each completed task
- Tooling. Things like PagerDuty and Coralogix to monitor performance and notify us (via international calls) in case something goes wrong
- CI/CD via Jenkins for all deployments (staging, prod, mobile), making it easy for the client's engineers to push a hotfix
- Extensive backups (servers/db/etc.) and easy rollbacks in case something goes wrong
- Extensive logging, archiving (on S3)
- Regular knowledge transfer (to client's internal software engineers)
- Automated QA that is tightly integrated with CI/CD on GitLab
- Mid-day deployment. Once the team wakes up, we read all the ready-for-prod tickets that the client left us the night before and deploy around noon. That leaves us plenty of time to QA prod before we check out.
- Regular prod deployment. We deploy to prod at least twice a week. All tickets go to a pre-prod server first, where we do rigorous QA one final time.
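
To make the mid-day deployment timing concrete, here is a rough sketch of what "noon in Vietnam" looks like on the client's clock. It assumes fixed offsets of UTC+7 for Vietnam and UTC-8 for the client (US Pacific Standard Time, which is my guess at where a 15-hour gap comes from) and ignores daylight saving. The function name and the date are only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC offsets, no daylight saving handling.
VIETNAM = timezone(timedelta(hours=7), "ICT")
# Assumption: the client is on US Pacific Standard Time (UTC-8).
US_PACIFIC_STANDARD = timezone(timedelta(hours=-8), "PST")


def client_local_time(deploy_time_vn: datetime) -> datetime:
    """Convert a Vietnam-local deployment time to the client's local time."""
    return deploy_time_vn.astimezone(US_PACIFIC_STANDARD)


# Noon in Vietnam; the date itself is arbitrary.
deploy = datetime(2024, 1, 15, 12, 0, tzinfo=VIETNAM)
client = client_local_time(deploy)

print(client.strftime("%Y-%m-%d %H:%M %Z"))  # 2024-01-14 21:00 PST
```

Under those assumptions, a noon deploy on our side lands at 9 pm the previous day for the client, so it goes out after their working hours and we still have the afternoon to QA prod.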
Hope that helps. Let me know if I missed anything.