IDNOG03 - Bergas Bimo Branarto (Gojek) - Scaling Gojek


Bergas Bimo Branarto, Arlinda Juwitasari, Rama Notowidigdo

SCALING GOJEK

WHAT MAKES US WHO WE ARE

SPEED, INNOVATION, SOCIAL IMPACT

OUR VALUES

Product Releases

2011: go-ride, go-send, go-shop (all phone orders)

January 2015: app release with go-ride, go-send, go-shop

April 2015: go-food

September 2015: go-mart

October 2015: go-box, go-massage, go-clean, go-glam, go-busway

December 2015: go-tix

January 2016: go-kilat (e-commerce partnership, not in the app)

April 2016: go-car

11,000,000 downloads in 15 months

[Chart: cumulative total app downloads, January through August]

WHERE ARE WE?

JABODETABEK: 123,500 drivers
BANDUNG: 37,000 drivers
BALI: 11,200 drivers
SURABAYA: 23,000 drivers
MAKASSAR: 7,100 drivers
PALEMBANG: 510 drivers
MEDAN: 440 drivers
BALIKPAPAN: 11,200 drivers
YOGYAKARTA: 690 drivers
SEMARANG: 370 drivers

The Growth of Gojek Drivers

[Chart: driver growth plotted by month]

The Growth of Customers

[Chart: customer growth plotted by month]

The Growth of Orders

[Chart: order growth plotted by month, shown as order request multipliers]

CONWAY’S LAW

Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure.

Monolithic

[Diagram: a single monolithic backend serving the Driver, Customer, and Internal User applications]

Issues:
- Bugs
- Unexpected load growth
- Long process times, which resulted in server crashes during peak hours

Challenges:
- A small tech team (5 devs covering the customer and driver mobile apps, 3 different portals, and the backend) versus 4 (or more) business divisions
- Keeping the servers alive under unexpectedly high load
- Many features (and products) that need to be released to keep the business running

Transform 1

[Diagram: a proxy distributing traffic across identical backend instances (backend-1, backend-2, … backend-n), all sharing one disk/database]

Transform 1

Plus:
- At least we're still alive (see the proxy sketch below)

Minus:
- Long process times
- DB bottleneck
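To make the Transform 1 idea concrete, here is a minimal round-robin reverse proxy using only Go's standard library. The backend addresses and port are hypothetical; the slides do not say which proxy Gojek actually used, so this is a sketch of the pattern, not their implementation.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
)

func main() {
	// Hypothetical addresses of the identical backend instances (backend-1 … backend-n).
	backends := []string{
		"http://backend-1:8080",
		"http://backend-2:8080",
		"http://backend-3:8080",
	}

	// One reverse proxy per backend instance.
	proxies := make([]*httputil.ReverseProxy, len(backends))
	for i, b := range backends {
		u, err := url.Parse(b)
		if err != nil {
			log.Fatal(err)
		}
		proxies[i] = httputil.NewSingleHostReverseProxy(u)
	}

	// Round-robin: each incoming request goes to the next backend in turn.
	var next uint64
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		i := atomic.AddUint64(&next, 1) % uint64(len(proxies))
		proxies[i].ServeHTTP(w, r)
	})

	log.Fatal(http.ListenAndServe(":80", nil))
}
```

Every instance still reads and writes the same database, which is why the DB bottleneck stays in the Minus column.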

Transform 2

[Diagram: a proxy in front of the core backend, which still owns the single disk/database; non-transactional functionality split out into services A, B, and C, fed from the core backend through a Redis queue]

Transform 2

Plus:
- Splitting functionality out into services makes the code more efficient (at least for the new services)
- The queue lets the core backend push and forget for one-way communication (see the sketch below)
- Process time reduced
- Throttling can be enabled in the queue workers

Minus:
- Process time is still long relative to incoming traffic, since only non-transactional functionality was split out
- DB bottleneck
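A minimal sketch of the push-and-forget queue with throttled workers, assuming the go-redis client and a Redis list named jobs:queue; the library choice, queue name, and rate are assumptions, since the slides only say a Redis queue sits between the core backend and the services.

```go
package main

import (
	"context"
	"log"
	"time"

	"github.com/redis/go-redis/v9"
)

var rdb = redis.NewClient(&redis.Options{Addr: "localhost:6379"})

// Producer side (core backend): push the job onto a Redis list and move on.
// The core backend does not wait for the service to finish ("push and forget").
func enqueue(ctx context.Context, payload string) error {
	return rdb.LPush(ctx, "jobs:queue", payload).Err()
}

// Consumer side (service worker): pop jobs and process them at a bounded rate.
// The rate limit is the "throttling in queue workers" from the Plus column.
func worker(ctx context.Context) {
	limiter := time.Tick(100 * time.Millisecond) // at most ~10 jobs/second per worker
	for {
		<-limiter
		// BRPOP blocks until a job is available (0 means no timeout).
		res, err := rdb.BRPop(ctx, 0, "jobs:queue").Result()
		if err != nil {
			log.Println("pop failed:", err)
			continue
		}
		process(res[1]) // res[0] is the list name, res[1] is the payload
	}
}

func process(payload string) { log.Println("processing", payload) }

func main() {
	ctx := context.Background()
	go worker(ctx)
	_ = enqueue(ctx, `{"order_id":123,"service":"go-food"}`)
	time.Sleep(time.Second) // let the worker drain the queue before exiting
}
```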

Transform 3

[Diagram: proxy in front of the core backend; services A, B, and C still attached through the Redis queue; new services D and E reached over REST APIs, with a Redis cache in front of service E; services now keep their own disks/databases]

Transform 3

Plus:
- Some transactional processes split out into separate services: the load is split and process time is reduced
- Redis cache: reduces the DB bottleneck (see the cache sketch below)
- Each service owns its own DB: reduces the DB bottleneck

Minus:
- API calls in a flow spanning more than 2 services cause cascading failures
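A minimal cache-aside sketch for the Redis cache step, again assuming the go-redis client; the key scheme, TTL, and loadFromDB helper are hypothetical, as the slides only say the cache is there to reduce DB queries.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/redis/go-redis/v9"
)

var rdb = redis.NewClient(&redis.Options{Addr: "localhost:6379"})

// getOrder returns the order payload, hitting the database only on a cache miss.
// Reads served from Redis never touch the database, which is what relieves the
// DB bottleneck listed in the Plus column.
func getOrder(ctx context.Context, id string) (string, error) {
	key := "order:" + id

	// 1. Try the cache first.
	val, err := rdb.Get(ctx, key).Result()
	if err == nil {
		return val, nil // cache hit
	}
	if err != redis.Nil {
		return "", err // a real Redis error, not just a miss
	}

	// 2. Cache miss: load from this service's own database (hypothetical helper).
	val, err = loadFromDB(ctx, id)
	if err != nil {
		return "", err
	}

	// 3. Populate the cache with a TTL so stale entries eventually expire.
	rdb.Set(ctx, key, val, 5*time.Minute)
	return val, nil
}

// loadFromDB stands in for a real MySQL/PostgreSQL/MongoDB query.
func loadFromDB(ctx context.Context, id string) (string, error) {
	return fmt.Sprintf(`{"id":%q}`, id), nil
}

func main() {
	order, err := getOrder(context.Background(), "123")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(order)
}
```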

Transform 4

[Diagram: proxy in front of the core backend; services A through E as before, now communicating with the core backend through a Kafka queue and a shared inline Redis cache; a new service F talks to the core backend over gRPC (HTTP/2); each service keeps its own disk/database]

Transform 4

Plus:
- Asynchronous communication between services via Kafka: fewer API calls between services, fewer cascading failures (see the sketch below)
- Shared (inline) Redis cache: fewer DB queries, fewer API calls between services, fewer cascading failures
- gRPC (which uses HTTP/2) should reduce network time

Minus:
- ?
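A minimal sketch of the Kafka-based asynchronous communication, assuming the segmentio/kafka-go client and a hypothetical order-events topic; the slides name Kafka but not the client library, topics, or consumer groups.

```go
package main

import (
	"context"
	"log"
	"time"

	"github.com/segmentio/kafka-go"
)

// Producer side: the core backend publishes an event and does not wait for the
// consuming service, so a slow or failing consumer cannot cascade back to it.
func publishOrderEvent(ctx context.Context, payload []byte) error {
	w := &kafka.Writer{
		Addr:  kafka.TCP("localhost:9092"), // assumed broker address
		Topic: "order-events",              // hypothetical topic name
	}
	defer w.Close()
	return w.WriteMessages(ctx, kafka.Message{Value: payload})
}

// Consumer side: a downstream service reads the same topic at its own pace.
func consumeOrderEvents(ctx context.Context) {
	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"localhost:9092"},
		Topic:   "order-events",
		GroupID: "service-f", // hypothetical consumer group
	})
	defer r.Close()
	for {
		msg, err := r.ReadMessage(ctx)
		if err != nil {
			log.Println("read failed:", err)
			return
		}
		log.Printf("handling event: %s", msg.Value)
	}
}

func main() {
	ctx := context.Background()
	go consumeOrderEvents(ctx)
	if err := publishOrderEvent(ctx, []byte(`{"order_id":123,"status":"created"}`)); err != nil {
		log.Fatal(err)
	}
	time.Sleep(2 * time.Second) // give the consumer a moment before exiting
}
```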

Stack:
- Java: Spring MVC and Spark
- Go
- JRuby on Rails
- AngularJS
- MySQL
- PostgreSQL
- MongoDB
- Elasticsearch
- Redis
- Kafka
- RabbitMQ

Response Time vs Throughput

[Chart: response time (ms) and throughput (rpm) plotted by month]

Order Growth vs Response Time

[Chart: order growth vs response time (ms)]

Order Growth vs Throughput

[Chart: order growth vs throughput (rpm)]

TRUE HAPPINESS IS THE JOURNEY, NOT THE DESTINATION

THANK YOU
