Adobe Social Collaboration: A Deep Dive Into Performance and Scalability by SapientNitro

POINT OF view

Sruthisagar Kasturirangan, Infrastructure Architect, Infrastructure Practice, SapientNitro, Bangalore

INTRODUCTION

Adobe’s Social Collaboration unifies all social networking and collaboration applications within AEM (Adobe Experience Manager). It has gained a lot of attention, in part because today’s consumers are increasingly active on mobile devices and place a lot of value on feedback from fellow buyers. Smart content and commerce platforms are capitalizing on Social Collaboration to boost sales and give the end user the best experience possible.

To understand Adobe’s Social Collaboration better, we analyzed its performance and scalability. We ran the benchmark tests described below using the JMeter scripting framework provided by Adobe. The tests include scripts that perform pure write operations, so that the overall supportable throughput can be measured and used to arrive at a physical architecture sizing and capacity plan. From these tests, we provide general guidance on the methodology for sizing the infrastructure and identifying key bottlenecks when Social Collaboration is integrated into the overall design of a content and collaboration platform.

This paper is not written to dispute the results that Adobe Systems Incorporated provides in its documentation, but to extend them to virtualized environments, given the growth of cloud hosting. The results below were analyzed and discussed in detail before we arrived at the conclusions presented here.

Experimental Setup

First, let’s briefly go through the experimental setup we used to conduct the benchmark tests, including the AEM version, the system configuration, the benchmark architecture, and the test scenario.

AEM Version

AEM 5.6.0

System Configuration

Author & Publish Environments:
Logical CPUs (current): 8
CPUs configured: 8
Number of processors (allocated): 2
Processor: PowerPC_POWER7
Hardware: 64 bit
AIX kernel version: 7.1.2.1 TL02
Memory size: 8192MB
Total paging space: 2048MB
JVM settings: maximum heap size 4GB; PermGen 512MB; IBM J9VM 1.6, GENCON algorithm

Benchmark Architecture

[Figure: Single publish configuration. User requests arrive at the publish node, which is connected to the author node by reverse replication.]

Test Scenario

The tests below were all performed using Geometrixx, Adobe’s out-of-the-box sample application. Adobe’s benchmark scripts include procedures to create multiple users in the author and publish environments so that a realistic test scenario can be created. In this case, a test forum topic was created with a short description. Each user was pre-authenticated during the warm-up and, once authenticated, held the session and performed continuous write operations.

Iterations

The test iterations are presented below, and the load model and results for each are described in the following sections. The result sections focus on transactions per second and average response time (i.e., time to last byte), each plotted against the total number of transactions.

Load Model

#Generic properties: threads/users.
#All timings are in seconds.
#startThreadCount is the total number of concurrent threads/users. (For 5 requests per second, set it to 150.)
#startupDelay is the ramp-up time for starting threads. (For 150 threads, set it to 60 seconds.)
#holdLoadFor is the time the test is run. (For 10 minutes, set it to 600.)
#shutdownTime is the time it takes the threads to shut down. (Set it to the same value as startupDelay.)
#requestsPerSec is the number of requests per number of seconds.
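To make the timing properties concrete, the following is a minimal Python sketch of the arithmetic they imply. It is not part of Adobe’s JMeter framework: the `LoadModel` class, its method names, and the `as_clock` helper are hypothetical, and the ramp-up is assumed to be linear (threads started at evenly spaced intervals).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadModel:
    """Hypothetical holder for the load-model properties; all timings in seconds."""
    start_thread_count: int  # startThreadCount
    startup_delay: float     # startupDelay (ramp-up time)
    hold_load_for: float     # holdLoadFor
    shutdown_time: float     # shutdownTime

    def ramp_interval(self) -> float:
        """Seconds between thread starts, assuming a linear ramp-up."""
        return self.startup_delay / self.start_thread_count

    def total_duration(self) -> float:
        """Ramp-up plus hold plus shutdown, in seconds."""
        return self.startup_delay + self.hold_load_for + self.shutdown_time


def as_clock(seconds: float) -> str:
    """Format a duration in seconds as HH:MM:SS."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Example values taken from the property comments above:
# 150 threads, 60 s ramp-up, 600 s hold, shutdown equal to the ramp-up.
example = LoadModel(150, 60, 600, 60)
print(example.ramp_interval())             # 0.4
print(as_clock(example.total_duration()))  # 00:12:00
```

With the Iteration 2 values (150 threads, startupDelay=1200), the same calculation gives one new user every 8 seconds and a total run of 00:40:00, which agrees with the ramp rate quoted in that iteration’s discussion.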

Iteration 1

startThreadCount (the total number of concurrent users/threads)=150
startupDelay=60
holdLoadFor=1200
shutdownTime=0
requestsPerSec=2
RPSduration=30

[Figure: Load Ramp Up Model, expected parallel users count. Number of active threads (0 to 200) vs. elapsed time (00:00:00 to 00:21:00).]

[Figure: Throughput Throttling, expected RPS. Number of requests/sec (0 to 9) vs. elapsed time (00:00:00 to 00:00:30).]

Note: This test was run with Ultimate Thread Group by throttling requests per second to 2.

Results

[Figure: TPS vs. number of transactions (470 to 2234); TPS axis from 1.54 to 1.72.]

© Sapient Corporation, 2013

[Figure: AVG_RESPONSE_TIME vs. number of transactions (470 to 2234); axis from 0 to 3000.]

[Figure: Response Times vs. Elapsed Time. Response times in ms (0 to 30 000) for add Topic to Publish Node, get Topic Page, and setTotalTime; elapsed time 00:00:00 to 00:40:57, granularity 100 ms.]

From the graphs above, it is clear that response times stay within an acceptable range only when the load is throttled to limit the TPS (transactions per second) to around 2. Throttling was performed using a JMeter plugin (Ultimate Thread Group), but a throttled request rate does not indicate the number of concurrent user sessions. Additional testing is therefore required to understand how the system behaves under changing user patterns.

Iteration 2

startThreadCount (the total number of concurrent users/threads)=150
startupDelay=1200
holdLoadFor=1200
shutdownTime=0

[Figure: Load Ramp Up Model, expected parallel users count. Number of active threads (0 to 200) vs. elapsed time (00:00:00 to 00:40:00).]

Note: This test was run without Ultimate Thread Group, and no throttling was applied.

Results

[Figure: TPS vs. number of transactions (482 to 9882); TPS axis from 0 to 4.5.]

[Figure: AVG_RESPONSE_TIME vs. number of transactions (482 to 9882); axis from 0 to 30000.]

[Figure: Response Times vs. Elapsed Time. Response times in ms (0 to 200 000) for add Topic to Publish Node, get Topic Page, and setTotalTime; elapsed time 00:00:00 to 00:40:30, granularity 500 ms.]

From the graphs above, we can see the effect of an unthrottled load with users ramped up at a rate of 1 user every 8 seconds. Once all 150 users were ramped up, response times grew to a level that was not within acceptable limits for page performance.

Iteration 3

startThreadCount (the total number of concurrent users/threads)=10
startupDelay=100
holdLoadFor=600
shutdownTime=0

[Figure: Load Ramp Up Model, expected parallel users count. Number of active threads vs. elapsed time (00:00:00 to 00:11:40).]

Note: This test was run without Ultimate Thread Group and no throttling was applied.

Results

[Figure: TPS vs. number of transactions (459 to 1774); TPS axis from 2.05 to 2.55.]

[Figure: AVG_RESPONSE_TIME vs. number of transactions (459 to 1774); axis from 3000 to 3700.]

[Figure: Response Times vs. Elapsed Time. Response times in ms (0 to 10 000) for add Topic to Publish Node, get Topic Page, and setTotalTime; elapsed time 00:00:00 to 00:11:46, granularity 500 ms.]

From the graphs above, we can see that, with an unthrottled load and users ramped up at a rate of 1 user every 10 seconds, response times grew to a level that was not within acceptable limits for page performance once all 10 users were ramped up. In this scenario, it did not make sense to test with fewer than 10 concurrent users. Since the average response times were on the order of 3.5 seconds, we concluded that a single publish server can support fewer than 10 concurrent users.
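As a rough cross-check of these figures, the relationship between concurrent users, response time, and throughput in a closed-loop test can be sketched with Little’s Law, X = N / (R + Z). This is an illustrative calculation only and is not taken from the benchmark scripts: the function name is hypothetical, and a think time of zero is assumed because the test users performed continuous write operations.

```python
def closed_loop_throughput(users: int,
                           response_time_s: float,
                           think_time_s: float = 0.0) -> float:
    """Estimate throughput X = N / (R + Z) for a closed-loop load test.

    users           -- N, number of concurrent users/threads
    response_time_s -- R, average response time in seconds
    think_time_s    -- Z, pause between requests (assumed 0 here)
    """
    cycle = response_time_s + think_time_s
    if cycle <= 0:
        raise ValueError("response time plus think time must be positive")
    return users / cycle


# Iteration 3: 10 users, average response time on the order of 3.5 seconds.
estimate = closed_loop_throughput(10, 3.5)
print(f"{estimate:.2f} transactions per second")  # 2.86 transactions per second
```

The estimate of roughly 2.9 TPS sits slightly above the 2.05 to 2.55 range on the TPS chart for this iteration. That is plausible for an idealized bound, since the calculation ignores the ramp-up period, during which fewer than 10 users were active.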

Overall System Utilization

[Figure: Publish. CPU Total hdadhdcom03 19-7-2013: User%, Sys%, and Wait% (0 to 100) vs. time (00:00 to 05:30).]

[Figure: Author. CPU Total hdadhdcom01 19-7-2013: User%, Sys%, and Wait% (0 to 100) vs. time (00:00 to 05:40).]

CONCLUSION

After conducting this series of tests, and then discussing and analyzing the results, we arrived at a few key takeaways that we think are worth considering:

1. For total achievable throughput, a single publish instance and a single author instance can achieve 1.6 TPS within an acceptable response time (below 2 seconds).
2. For total achievable concurrent user/thread count, a single publish instance can handle fewer than 10 concurrent threads/users performing continuous read operations and updates while keeping response times within SLAs (service-level agreements).
3. Scaling publish servers horizontally to handle higher volumes of updates is of no value, since the bottleneck is reverse replication to the author instance. (The throughput indicated above is for the entire publish layer, not for a single publish instance.)

Adobe’s Social Collaboration can help to achieve social media goals and improve strategy, performance, and scalability. It is our hope that this paper has answered some of your questions and helped you better understand this particular social solution.
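The third takeaway can be read as a simple bottleneck model: if every write on the publish layer must be reverse-replicated to one author instance, the sustainable write rate is capped by the author regardless of how many publish instances are added. The sketch below illustrates that reading. The function name and the `min` model are assumptions made for illustration, and both capacities are set to the 1.6 TPS measured end to end in these tests, since the tests did not measure the publish and author limits separately.

```python
def publish_layer_write_tps(publish_instances: int,
                            per_publish_tps: float,
                            author_tps: float) -> float:
    """Sustainable write throughput when all writes funnel into one author.

    Hypothetical model: the publish layer scales linearly, but the single
    author instance caps the total.
    """
    if publish_instances < 1:
        raise ValueError("at least one publish instance is required")
    return min(publish_instances * per_publish_tps, author_tps)


MEASURED_TPS = 1.6  # takeaway 1; used for both capacities (assumption)

for n in (1, 2, 4):
    # The total stays at 1.6 TPS however many publish instances are added.
    print(n, publish_layer_write_tps(n, MEASURED_TPS, MEASURED_TPS))

# The same ceiling expressed as write operations per hour.
print(round(MEASURED_TPS * 3600))  # 5760
```

Under these assumptions, a sizing exercise would compare the expected peak rate of user-generated writes against this ceiling of about 5,760 writes per hour rather than against the number of publish instances.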

References

1. CQ Planning and Capacity Guide, http://dev.day.com/docs/en/cq/current/managing/capacity-guide.html
2. CQ Hardware Sizing Guidelines, http://wem.help.adobe.com/enterprise/en_US/10-0/wem/managing/hardware_sizing_guidelines.html
3. Introduction to Adobe’s Social Communities, http://dev.day.com/docs/en/cq/current/administering/social_communities.html

ABOUT THE AUTHOR

Sruthisagar Kasturirangan is an Infrastructure Architect in the Infrastructure Practice at SapientNitro Bangalore. A graduate of Iowa State University, he gained extensive experience within leading IT organizations and eventually moved back to his home country to join Sapient Corporation. He has over 11 years of experience in systems administration of Unix platforms and application servers such as WebSphere and WebLogic, and extensive exposure to capacity planning and performance tuning of Java applications.