JioCinema
- Streams live IPL to ~20 million concurrent devices
- Extensive audits on each hop of a request: frontend, load balancer, etc.
- Calculate the scaling limit of each service
- War rooms, dashboards, and metrics
- Frontend code is frozen well before the event
- Use feature flags, so that features causing a lot of issues can be turned off
- In case of fire, switch off less critical components
- Simulate feature flags ahead of time by testing permutations/combinations of flag states
- A config service returns the feature flags
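The flag simulation above can be sketched roughly as follows. This is a minimal illustration, not JioCinema's actual tooling: the flag names, `build_page`, and the "video player must always be present" check are all hypothetical stand-ins for whatever the real config service and frontend expose.

```python
from itertools import product

# Hypothetical flag names; the real config service's schema is not described in these notes.
FLAGS = ["live_chat", "recommendations", "watch_party"]


def all_flag_combinations(flag_names):
    """Yield every on/off assignment as a dict (2**n combinations in total)."""
    for values in product([False, True], repeat=len(flag_names)):
        yield dict(zip(flag_names, values))


def build_page(flags):
    """Stand-in for the code under test: the player is always present,
    optional widgets appear only when their flag is on."""
    components = ["video_player"]
    components += [name for name, enabled in flags.items() if enabled]
    return components


def simulate(flag_names):
    """Run the page builder under every combination and collect failures."""
    failures = []
    for flags in all_flag_combinations(flag_names):
        try:
            page = build_page(flags)
            if "video_player" not in page:
                failures.append((flags, "video player missing"))
        except Exception as exc:  # a crash under any combination is a finding
            failures.append((flags, repr(exc)))
    return failures


combos = list(all_flag_combinations(FLAGS))
failures = simulate(FLAGS)
```

The number of combinations doubles with every flag, so an exhaustive run like this is only practical for a small set of flags; beyond that, testing a chosen subset of combinations is the likely compromise.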
- Autoscaling takes time to kick in, anywhere from a couple of minutes to an hour
- Autoscaling is therefore not useful here, since the game may have ended by then
- Scale up before the game, based on a back-of-the-envelope calculation:
  - How many requests (for example, 20 million)
  - How many database calls per request
- Ask cloud providers in advance to reserve the capacity (nodes)
- Peak traffic can sometimes reach 75% of India’s bandwidth
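A back-of-the-envelope calculation of this kind might look like the sketch below. Only the 20 million concurrent devices figure comes from the notes; the request interval, database calls per request, per-node capacity, and headroom factor are made-up assumptions chosen to keep the arithmetic easy to follow.

```python
import math


def estimate_capacity(concurrent_devices, request_interval_s,
                      db_calls_per_request, node_capacity_rps, headroom):
    """Rough pre-match sizing: requests/s, database calls/s, and node count."""
    requests_per_second = concurrent_devices / request_interval_s
    db_calls_per_second = requests_per_second * db_calls_per_request
    # Provision for peak plus headroom, rounded up to whole nodes.
    nodes = math.ceil(requests_per_second * headroom / node_capacity_rps)
    return {
        "requests_per_second": requests_per_second,
        "db_calls_per_second": db_calls_per_second,
        "nodes": nodes,
    }


estimate = estimate_capacity(
    concurrent_devices=20_000_000,  # from the notes
    request_interval_s=20,          # assumption: one API request per device every 20 s
    db_calls_per_request=3,         # assumption
    node_capacity_rps=5_000,        # assumption: requests/s one node can serve
    headroom=1.5,                   # assumption: 50% safety margin
)
```

With these assumed inputs the estimate comes to 1,000,000 requests/s, 3,000,000 database calls/s, and 300 nodes, which is the kind of number that would be taken to the cloud provider ahead of the match.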
- If the database goes down:
  - Some APIs have fixed responses stored in static storage
  - Those APIs are routed to static storage instead of the origin server
  - This is known as panic mode; the user does not notice that the system is down
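The routing decision behind panic mode could be sketched as below. This is an assumption-laden illustration: the `Router` class, the paths, and the snapshot contents are hypothetical, and in practice the switch would more likely live at the CDN or load balancer than in application code.

```python
# Hypothetical snapshot store: path -> fixed response captured ahead of time.
STATIC_RESPONSES = {
    "/api/home": {"rails": ["Live now", "Highlights"]},
}


class Router:
    def __init__(self, origin):
        self.origin = origin      # callable: path -> response dict
        self.panic_mode = False   # flipped by operators when the database is down

    def handle(self, path):
        """Return (response, source) for a request path."""
        if self.panic_mode and path in STATIC_RESPONSES:
            # Serve the fixed response; the origin and database are never touched.
            return STATIC_RESPONSES[path], "static"
        return self.origin(path), "origin"


def healthy_origin(path):
    """Stand-in for the real origin server."""
    return {"path": path, "fresh": True}


router = Router(healthy_origin)
normal = router.handle("/api/home")    # served by the origin
router.panic_mode = True
degraded = router.handle("/api/home")  # served from static storage
```

Only paths with a prepared snapshot are rerouted; anything else still goes to the origin, so which APIs get a static fallback has to be decided before the match.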
- Scale down:
  - Never scale down mid-match, since viewership can spike again (Dhoni might come back to bowl)
  - After the match, scale down in steps (a ladder) based on the number of users
  - Do not scale down immediately, since many users stick around for some time after the match
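The scale-down rules above can be expressed as a small decision function. This is a sketch under stated assumptions: the ladder thresholds, node counts, and the 30-minute cool-down are invented for illustration and are not figures from the notes.

```python
# Hypothetical ladder: (minimum active users, nodes to keep), highest rung first.
LADDER = [
    (10_000_000, 300),
    (5_000_000, 150),
    (1_000_000, 50),
    (0, 10),
]
COOLDOWN_MIN = 30  # assumption: wait this long after the match before shrinking


def target_nodes(current_nodes, active_users, match_live, minutes_since_end):
    """Return the node count to run with; this function never scales up."""
    # Rule 1: never scale down while the match is live.
    # Rule 2: hold capacity during the post-match cool-down.
    if match_live or minutes_since_end < COOLDOWN_MIN:
        return current_nodes
    # Rule 3: step down to the rung that matches the current user count.
    for min_users, nodes in LADDER:
        if active_users >= min_users:
            return min(current_nodes, nodes)
    return current_nodes
```

For example, `target_nodes(300, 2_000_000, False, 45)` steps down to 50 nodes, while the same call with `match_live=True` or only 10 minutes after the match keeps all 300.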
- Kafka:
  - Estimate the producer rate and the consumer rate
  - Plan the number of partitions based on these rates
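One common rule of thumb for turning these rates into a partition count is to take the target throughput and divide it by what a single partition can sustain on the producer side and on the consumer side, then use the larger result. The sketch below assumes that heuristic; the throughput numbers are placeholders that would need to be measured on the actual cluster.

```python
import math


def partitions_needed(target_mb_s, producer_mb_s_per_partition,
                      consumer_mb_s_per_partition):
    """Partition count so that neither producers nor consumers are the bottleneck."""
    for_producers = math.ceil(target_mb_s / producer_mb_s_per_partition)
    for_consumers = math.ceil(target_mb_s / consumer_mb_s_per_partition)
    return max(for_producers, for_consumers)


# Placeholder numbers: 1000 MB/s target, 20 MB/s per partition when producing,
# 10 MB/s per partition when consuming (consumers are the slower side here).
partitions = partitions_needed(1000, 20, 10)
```

With these placeholder rates the consumer side dominates and the result is 100 partitions. Since the number of partitions also caps how many consumers in one group can read in parallel, it is usually sized with some room for growth.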
More Notes
- The database is the most brittle component
  - Harder to scale?