In this page we have copied Chapter 6 of
SC41-0607-02 AS/400 Performance Capabilities Reference - Version 4, Release 4.
We believe that reading this chapter, especially the section
Web Server Performance Tips and Techniques,
will contribute to a better understanding of the factors
that may influence system capacity in a Web serving environment.
Chapter 6. Web Serving Performance
Performance information for Web serving on the AS/400 and various types
of web server transactions will be discussed in this section.
There are many factors that can impact overall performance
(e.g., end-user response time, throughput) in the complex Internet
environment, some of which are listed below:
- Web browser
- processing speed of the client system
- performance characteristics of the web browser
- client application performance characteristics
- Communication network
- speed of the communication links
- capacity of any proxy servers
- congestion of network resources
- AS/400 Web server
- AS/400 processor speed
- utilization of key AS/400 resources (CPU, IOPs, memory, disk)
- Web server performance characteristics
- application (e.g. servlet) performance characteristics
The primary focus of this section will be to discuss the performance
characteristics of the AS/400 as a server in a Web serving environment,
providing capacity planning information and recommendations for best
performance. Please refer to "Chapter 5. Communications performance"
for related information.
Data accesses across the Internet differ distinctly from accesses
across 'traditional' communications networks. The additional resources
to support Internet transactions by the CPU, IOPs, and line are
significant and must be considered in capacity planning.
Typically, in a traditional network:
- there is a request and response (between client and server)
- connections/sessions are maintained between transactions
- networks are tuned to use large frames
For web transactions there are a dozen or more line transmissions
(including acknowledgements) per transaction:
- a connection is established/closed for each transaction
- there is a request and response (between client and server)
- networks typically have small frame (MTU) sizes
- one user transaction may contain separate internet transactions
- secure transactions are frequent and consume more resource
The information that follows is based on performance measurements
and analysis done in the internal IBM performance lab.
The raw data is not provided here, but the highlights, general
conclusions, and recommendations are included. Results listed here
do not represent any particular customer environment.
Actual performance may vary significantly from what is provided here.
Note that these workloads, along with other published benchmark data
(from other sources) are always measured in a best-case environment
(e.g., local LAN, large MTU sizes). Real Internet networks typically
have higher contention, MTU size limitations, and intermediate
network servers (e.g., proxy, SOCKS).
6.1 Web serving with the HTTP Server
The Hypertext Transfer Protocol (HTTP) server allows AS/400 systems
attached to a TCP/IP network to provide objects to any Web browser.
At a high level, the connection is made, the request is received and
processed, the data is sent to the browser, and the connection is
ended. The HTTP server jobs and the communications router tasks
are the primary job/tasks involved (there is not a separate user
job for each attached user).
Workload Description and Data Interpretation:
The workload is a program that runs on a client workstation.
The program simulates multiple Web browser clients and repetitively
issues 'URL requests' to the AS/400 Web server.
The number of simulated clients can be adjusted to vary the offered
load. Each of the transaction types listed in the tables serves about
1000 bytes:
- Static Page: serves a static page via the HTTP server.
This information can be accessed from the web server's cache
of specified IFS files.
- CGI (HTML): invokes a CGI program that accesses data
from IFS and serves a simple HTML page via the HTTP server.
This runs in a named activation group.
- CGI (SQL): invokes a CGI program that performs a simple
SQL request and serves the result via HTTP server.
This runs in a named activation group.
- Persistent CGI: invokes a CGI program that receives a handle
supplied by the browser, accesses data
from IFS and serves a simple HTML page via the HTTP server.
- Net.Data (HTML):
invokes the Net.Data program that
serves a simple HTML page via the HTTP server.
- Net.Data (SQL):
invokes the Net.Data program that performs a simple
SQL request and serves the result via HTTP server.
- Servlet: invokes a Java servlet that accesses data
from IFS and serves a simple HTML page via the HTTP server.
Each of the above can be served in a secure or non-secure fashion.
"Relative CPU time" is the average AS/400 CPU time to process
the transaction for each specific scenario.
"AS/400 Capacity (hits/sec/CPW)" is the capacity metric used to
estimate the capacity of any AS/400 model.
Note that transaction/sec/CPW can be used interchangeably with
hits/sec/CPW. An example exists in the conclusions.
"Secure:Nonsecure time ratio" indicates the extra CPU
processing required to execute a given transaction in a secure mode.
The CGI programs were compiled using a "named" activation group.
For more information on program activation groups, refer to
AS/400 ILE concepts, SC41-5606.
Table 6.1 V4R4 AS/400 Web Serving Capacity Planning

                         |           Nonsecure            |               Secure
Transaction type         | Capacity metric | Relative     | Capacity metric | Secure:Nonsecure
                         | (hits/sec/CPW)  | CPU time     | (hits/sec/CPW)  | CPU time ratio
-------------------------+-----------------+--------------+-----------------+-----------------
Static page (cached)     | 1.86            | 0.6 x        | 0.58            | 3.2
Static page (not cached) | 1.18            | 1.0 x        | 0.48            | 2.5
CGI (HTML)               | 0.44            | 2.7 x        | 0.28            | 1.6
CGI (SQL)                | 0.43            | 2.7 x        | 0.28            | 1.5
Persistent CGI           | 0.44            | 2.7 x        | 0.25            | 1.8
Net.Data (HTML)          | 0.24            | 4.9 x        | 0.19            | 1.3
Net.Data (SQL)           | 0.15            | 7.9 x        | 0.13            | 1.2
Servlet                  | 0.40            | 2.9 x        | 0.28            | 1.4

Note:
- IBM HTTP server for AS/400: V4R4; 100 Mbps Ethernet;
with TCPONLY(*YES)
- Based on measurements from AS/400 Model 720-2062
- Static page caching done with IBM HTTP server (WRKHTTPCFG)
- All requests cached for Net.Commerce
- 1KB data served for each of the transaction types
- Data assumes no access logging
- CGI programs compiled with "named" activation group
- Secure measurements done with Secure Sockets Layer (SSL)
with 40-bit RC4 encryption
- transactions/second/CPW can be used interchangeably
with hits/sec/CPW
- CPW is the "Relative System Performance Metric",
found in Chapter 2,
"AS/400 System Capacities and CPW"
- Web server capacities may not scale exactly by CPW, therefore,
results may differ significantly from those listed here
- NA = Not available
Figure 6.1 AS/400 Web Serving V4R4 Relative Capacities
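The derived columns of Table 6.1 appear to follow from the two capacity columns. The Python sketch below assumes (this is not stated in the source) that "Relative CPU time" is the nonsecure metric of the non-cached static page divided by the nonsecure metric of the row, and that the "Secure:Nonsecure CPU time ratio" is the nonsecure metric divided by the secure metric. The dictionary and function names are illustrative only.

```python
# Capacity metrics (hits/sec/CPW) transcribed from Table 6.1: (nonsecure, secure).
TABLE_6_1 = {
    "Static page (cached)":     (1.86, 0.58),
    "Static page (not cached)": (1.18, 0.48),
    "CGI (HTML)":               (0.44, 0.28),
    "CGI (SQL)":                (0.43, 0.28),
    "Persistent CGI":           (0.44, 0.25),
    "Net.Data (HTML)":          (0.24, 0.19),
    "Net.Data (SQL)":           (0.15, 0.13),
    "Servlet":                  (0.40, 0.28),
}

# The row whose relative CPU time is listed as 1.0 x in the table.
BASELINE = "Static page (not cached)"


def relative_cpu_time(kind):
    """Nonsecure CPU time per hit relative to the baseline row (assumed relationship)."""
    return TABLE_6_1[BASELINE][0] / TABLE_6_1[kind][0]


def secure_to_nonsecure_ratio(kind):
    """Assumed: the CPU time ratio is the inverse ratio of the capacity metrics."""
    nonsecure, secure = TABLE_6_1[kind]
    return nonsecure / secure


if __name__ == "__main__":
    for kind in TABLE_6_1:
        print(f"{kind:26s} {relative_cpu_time(kind):5.2f} x  "
              f"{secure_to_nonsecure_ratio(kind):5.2f}")
```

Under these assumptions the computed values agree with the published columns to within about 0.05, which suggests the published figures were rounded from more precise measurements.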
Web Server Performance Tips and Techniques:
- V4R4 provides a performance improvement of up to 70%
over that of V4R3 (with similar hardware). This is mostly due to
improvements in the IBM HTTP Server and TCP/IP performance.
For static pages that are not cached, V4R4 provides up to
7% more capacity.
For static pages that are cached, V4R4 provides up to
20% more capacity.
For CGI and Net.Data transactions, V4R4 provides up to
70% more capacity.
V4R3 provided a performance improvement in capacity of
up to 65% over that of V4R2 (with similar hardware). This is
mostly due to the improved efficiency of the IBM HTTP Server
over that of the ICS/400 of V4R2.
For static pages that are not cached, V4R3 provides up to
20% more capacity.
For static pages that are cached, V4R3 provides up to
65% more capacity.
There were also significant improvements for Net.Data and
CGIs with named activations in V4R3.
- Web Server Capacity (Example Calculations):
Throughput for web serving is typically discussed
in terms of the number of hits/second or transactions/second.
Typically the CPU will be the resource that determines
overall system capacity. If the IOPs become the resource that limits
system throughput, then the number of IOPs supporting the load
could be increased. For system configurations where the CPU is the
limiting resource, Table 6.1 above can be used for capacity planning.
Use these high-level estimates with caution. They do not take the
place of a complete capacity planning session with actual measurements
of your particular environment. Remember that these example transactions
are fairly trivial. Actual customer transactions may be significantly
more complex and therefore consume additional CPU resources.
Scaling issues for the server, the application, and the database
also may come into consideration when using N-way processors with
higher projected capabilities.
Example 1: Estimating the capacity for a given model and transaction
type: Estimate the system capacity by multiplying the CPW rating
(relative system performance metric) of the AS/400 model by
the appropriate hits/second/CPW value (the capacity metric provided
in the table):
Capacity = CPW * hits/sec/CPW
For example, a 170-2386 rated at 460 CPW doing web serving with CGI
programs would have a capacity of 202 trans/sec (460 x 0.44 = 202).
This assumes that the entire capacity of the system would be allocated
for Web serving. If other work is also on the system, you must
pro-rate the CPU allocation. For example, if only 25% of the CPU
is allocated for Web serving, then you would have a web serving
throughput of 50 trans/sec (460 x 0.25 x 0.44 = 50).
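The arithmetic of Example 1 can be sketched in Python as follows. The function name and the cpu_share parameter are illustrative assumptions, not part of any AS/400 tool; the inputs are the CPW rating and the Table 6.1 metric quoted in the example.

```python
def estimate_capacity(cpw, hits_per_sec_per_cpw, cpu_share=1.0):
    """Estimated hits/sec = CPW * share of CPU for Web serving * hits/sec/CPW."""
    if not 0.0 <= cpu_share <= 1.0:
        raise ValueError("cpu_share must be between 0 and 1")
    return cpw * cpu_share * hits_per_sec_per_cpw


if __name__ == "__main__":
    # Model 170-2386 (460 CPW) serving CGI transactions (0.44 hits/sec/CPW).
    print(estimate_capacity(460, 0.44))        # whole system dedicated to Web serving
    print(estimate_capacity(460, 0.44, 0.25))  # only 25% of the CPU allocated
```

The two calls give roughly 202 and 50.6 trans/sec; the text rounds the second figure down to 50.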
Example 2: Estimating how many CPWs are required for a given
web transaction load: Characterize the transaction make-up
of the estimated workload and the required transaction rate
(in transactions/sec). Estimate the CPWs required to support a given
load by dividing the required transaction rate by the appropriate
hit/sec/CPW value (the capacity metric provided in the table).
Required CPWs = transaction rate / (hits/sec/CPW)
For example, in order to support 175 CGI trans/sec, 398 CPWs would
be required (175/0.44 = 398 CPWs). If a mixed load is being assessed,
then calculate the required CPWs for each of the components and add
them up. Select an AS/400 model that fits and allow enough room for
future growth.
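A minimal Python sketch of Example 2, including the "add up the components" step for a mixed load. The dictionary keys and the mixed load shown are hypothetical and chosen only for illustration; the metric values are the nonsecure figures from Table 6.1.

```python
import math

# Nonsecure hits/sec/CPW values transcribed from Table 6.1.
HITS_PER_SEC_PER_CPW = {
    "static_cached": 1.86,
    "static_not_cached": 1.18,
    "cgi": 0.44,
    "net_data_sql": 0.15,
}


def required_cpw(load):
    """Sum the CPW needed for each component of a load.

    `load` maps a transaction type to its required rate in transactions/sec.
    """
    return sum(rate / HITS_PER_SEC_PER_CPW[kind] for kind, rate in load.items())


if __name__ == "__main__":
    # The single-type case from the text: 175 CGI trans/sec.
    print(math.ceil(required_cpw({"cgi": 175})))
    # A hypothetical mixed load, for illustration only.
    mixed = {"static_cached": 100, "cgi": 50, "net_data_sql": 10}
    print(math.ceil(required_cpw(mixed)))
```

The result is a lower bound on the CPW rating to look for. Headroom for growth and for other work on the system still has to be added on top, as the text notes.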
- Net.Data:
- Net.Data is more disk I/O intensive than typical HTTP transactions.
Therefore more HTTP server jobs may be needed to provide the
optimal level of system throughput.
- A Net.Data SQL macro is slower than an SQL CGI.bin.
This is because the Net.Data SQL macro is interpreted while the
SQL CGI.bin is compiled code. There are functional advantages
in using an SQL macro:
- direct reuse of existing SQL statements (no programming
required)
- provides the built-in ability to format SQL results
- provides the ability to store SQL results in a table
and pass the results to a different language environment
(e.g., REXX).
- CGI and persistent CGI:
Significant (perhaps as much as 6x) performance benefits can be
realized by compiling into a "named" versus a "new" activation group.
It is essential for good performance that CGI-based applications
use named activation groups. Refer to the
AS/400 ILE concepts
for more details on activation groups.
Persistent CGI is specific to applications needing to keep state
information across web transactions. Don't confuse persistent CGI
with a way to improve the performance of your CGI program.
You'll notice in the earlier table that the performance of CGI
is nearly identical to that of the persistent CGI due to the
advantages gained by running in a "named" activation group.
- Web Server Cache for IFS files:
Serving static pages that are cached can increase Web server capacity
by about 50%. Ensure that highly used files are selected to be in the
cache (WRKHTTPCFG).
- Page size:
The data in the table assume about 1K bytes being served.
If the pages are larger, more bytes are processed, CPU processing
per transaction significantly increases, and therefore the
transaction capacity metrics would be reduced.
- Response Time (general):
User response time is made up of Web browser (client workstation)
time, network time, and server time. A problem in any one of these
areas may cause a significant performance problem for an end-user.
To an end-user, it may seem apparent that any performance problem would
be attributable to the server, even though the problem may lie
elsewhere.
It is common for pages that are being served to have imbedded images
(e.g., GIFs). Each of these separate Internet transactions adds to the
response time since they are treated as independent HTTP requests
and can be retrieved from various servers (some browsers can retrieve
multiple URLs concurrently).
- HTTP and TCP/IP Configuration Tips:
- The number of HTTP server jobs:
The CHGHTTPA command has parameters that specify the minimum and
maximum number of server jobs. This is a system-wide value.
The WRKHTTPCFG command can also specify similar values (MaxActiveThreads
and MinActiveThreads). These values override the values
that are set via CHGHTTPA and apply to a given configuration.
The reason for having multiple server jobs is that when one server
is waiting for a disk or communication I/O to complete, a
different server job can process another user's request. Also,
for N-way systems, each CPU may simultaneously process
server jobs. The system will adjust the number of the servers
that are needed automatically (within the bounds of the minimum
and the maximum required).
The values specified are the number of "child or worker" threads.
Typically, 5 server threads are adequate for smaller systems
(100 CPWs or less). For larger systems dedicated to HTTP serving,
increasing the number of servers to 10 or more may provide better
performance. A starting point for the maximum number of threads
can be the CPW value divided by 20. Try not to have more threads than
are needed, as this may cause unnecessary system activity.
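The sizing guidance above can be expressed as a small Python helper. Combining the CPW/20 starting point with a floor of 5 threads through max() is an assumption of this sketch; the text gives the two rules separately. The function name is hypothetical.

```python
def suggested_max_threads(cpw, floor=5):
    """Starting point for the maximum number of server threads.

    Uses the CPW / 20 rule of thumb, but never goes below `floor`, since
    5 threads are described as adequate for systems of 100 CPW or less.
    """
    return max(floor, round(cpw / 20))


if __name__ == "__main__":
    for cpw in (50, 100, 460):
        print(cpw, suggested_max_threads(cpw))
```

The value is only a first setting for the maximum thread count. It should be adjusted after observing the system under load.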
- The maximum frame size parameter (MAXFRAME on LIND)
can be increased from 1994 bytes for TRLAN (or other values for
other protocols) to its maximum of 16393 to allow for larger
transmissions. Typically documents are larger than 1994 bytes.
- The maximum transmission unit (MTU) size parameter
(CFGTCP command) for both the route and the interface affect
the actual size of the line flows. Increasing these values
from 576 bytes to a larger size (up to 16388) will most likely
reduce the overall number of transmissions, and therefore
increase the potential capacity of the CPU and the IOP.
Similar parameters also exist on the Web browser. The negotiated
value will be the minimum of the server and browser (and perhaps
any bridges/routers), so increase them all.
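To see why a larger MTU reduces the number of transmissions, the following Python sketch counts the data-carrying TCP segments needed for one document. It assumes 40 bytes of header per segment (a 20-byte IPv4 header plus a 20-byte TCP header, no options) and ignores acknowledgements and connection setup; the function names and the 10000-byte document are illustrative.

```python
import math

TCP_IP_HEADER_BYTES = 40  # 20-byte IPv4 header + 20-byte TCP header, no options


def negotiated_mtu(*mtus):
    """The effective MTU is the smallest value configured along the path."""
    return min(mtus)


def data_frames(payload_bytes, mtu):
    """Number of data-carrying TCP segments needed for payload_bytes."""
    segment_payload = mtu - TCP_IP_HEADER_BYTES
    if segment_payload <= 0:
        raise ValueError("MTU too small to carry any data")
    return math.ceil(payload_bytes / segment_payload)


if __name__ == "__main__":
    # Server allows 16388, but a hypothetical browser and router allow less.
    path_mtu = negotiated_mtu(16388, 1500, 576)
    for mtu in (path_mtu, 1500, 16388):
        print(mtu, data_frames(10000, mtu))
```

With these assumptions a 10000-byte document needs 19 data segments at an MTU of 576, 7 at 1500, and 1 at 16388, which is why every hop on the path needs a larger setting for the server-side change to take effect.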
- Increasing the TCP/IP buffer size (TCPRCVBUF and
TCPSNDBUF on the CHGTCPA or CFGTCP command)
from 8K bytes to 64K bytes may increase the performance when
sending larger amounts of data.
If data coming into the server is simply requests,
increasing TCPRCVBUF may not provide any benefit.
- Secure Web serving:
Secure Web serving involves additional overhead to the server.
Additional line flows occur (fixed overhead) and the data is
encrypted (variable overhead proportional to the number of bytes).
Note the capacity factors in the tables above comparing
non-secure and secure serving. For simple transactions (e.g.,
static page serving) the impact of secure serving is 2x or more
based on the number of bytes served. For complex transactions
(e.g., CGI or Net.Data) the overhead is in the range of 15-40%.
- E-Business applications typically yield a variety of complex
transactions. These transactions have sub-transactions made of
static pages, CGI, Net.Data, etc. Capacity planning for these
is more complex and warrants a careful analysis of the make-up
of the transactions. The data from the tables can assist with
this analysis.
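One possible way to use the table data for a composite transaction is sketched below in Python. It assumes that the CPU cost of a hit is the reciprocal of its hits/sec/CPW metric and that the costs of the sub-transactions simply add; the transaction make-up shown is hypothetical, and real applications may not scale this linearly.

```python
def cpw_seconds_per_transaction(parts, metrics):
    """CPU cost of one user transaction, expressed in CPW-seconds.

    `parts` maps a sub-transaction type to the number of hits of that type
    in one user transaction; `metrics` maps the type to hits/sec/CPW.
    """
    return sum(count / metrics[kind] for kind, count in parts.items())


def user_transactions_per_sec(cpw, parts, metrics):
    """Estimated user transactions/sec for a system with the given CPW."""
    return cpw / cpw_seconds_per_transaction(parts, metrics)


if __name__ == "__main__":
    # Metrics from Table 6.1: nonsecure cached static page, secure Net.Data (SQL).
    metrics = {"static_cached": 1.86, "net_data_sql_secure": 0.13}
    # Hypothetical user transaction: 4 cached static hits plus 1 secure SQL macro.
    parts = {"static_cached": 4, "net_data_sql_secure": 1}
    print(round(user_transactions_per_sec(460, parts, metrics), 1))
```

In this example the single secure Net.Data hit accounts for most of the CPU cost, which illustrates why the make-up of the transaction matters more than the raw hit count.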
- Error and Access Logging:
Having logging turned on causes a small amount of system overhead
(CPU time, extra I/O). Turn logging off for best capacity.
Use WRKHTTPCFG command to make these changes.
- Name Server Access:
For each Internet transaction, the server accesses the name
server for information (IP address and name translations).
These accesses cause significant overhead (CPU time, comm I/O)
and greatly reduce system capacity.
These accesses can be eliminated by using the WRKHTTPCFG command
and adding the line "DNSLookup Off".
- HTTP Server Memory Requirements:
Follow the faulting threshold guidelines suggested in the
work management guide by observing/adjusting the memory
in both the machine pool and the pool that the HTTP servers
run in (WRKSYSSTS).
- AS/400 model selection: Use the information provided in
this section along with the characterization of your HTTP load
environment in a capacity planning exercise (perhaps with
BEST/1) to choose the appropriate AS/400 model.
All the tasks, jobs and threads associated with HTTP serving
are 'non-interactive', so AS/400e servers or AS/400 Advanced
Servers would provide the best price/performance (unless
other interactive work is present on the system).
- File System Considerations:
Web serving performance varies significantly based on which
file system is used. Each file system has different overheads
and performance characteristics. Note that serving from the ROOT
or QOPENSYS directories provides the best system capacity. If Web
page development is done from another directory, consider copying
the data to a higher-performing file system for production use.
The web serving performance of the non-thread-safe
file systems is significantly lower than that of the root directory.
Using QDLS or QSYS may decrease capacity by a factor of 2 to 5.
For a more detailed discussion of IFS performance, please refer
to the V4R2 version of this document.
- File Size Considerations: The connect and disconnect costs
are similar regardless of size, but cost for the transmission of
data with TCP/IP and the IFS access vary with size.
As file size increases, the IOP is more efficient because it can sustain
a higher aggregate data rate. However, being larger, the files
require more data frames, thus causing the hit/second capacity
for the IOP to go down accordingly.
- Communications/LAN IOPs:
Since there are a dozen line flows or more per transaction,
the Web serving environment utilizes the IOP more than other
communications environments. Use the performance monitor
(STRPRFMON) and the component report (PRTCPTRPT)
to measure IOP utilization. Attempt to keep the average IOP
utilization at 60% or less for best performance.
IOP capacity depends on file size and MTU size (make sure
you increase the maximum MTU size parameter).
Additional information on communications/LAN IOP performance
can be found in section LAN of this manual.
The 2619 or the 2617 LAN IOPs have a capacity of roughly
70 hits/sec when serving small (e.g., 1K byte) nonsecure pages
(keep in mind that each hit contains a dozen or so line flows).
Ethernet or TRLAN IOPs from V4R1 or later have
capacities in the 100-130 hits/sec range.
If 100Mb Ethernet is used and the TCPONLY parameter in the LIND
has a value of *YES, then capacities up to 250 hits/sec may be
seen.
On larger AS/400 models, the comm/LAN IOP may become the
bottleneck before the CPU does. If additional capacity is needed,
multiple IOPs (with unique IP addresses) could be configured.
The overall workload would have to be 'manually' balanced by
Web browsers requesting documents from a set of interfaces.
The load can also be balanced across multiple IP addresses
by using DNS (domain name server).
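The IOP guidance above can be turned into a rough count of IOPs for a target hit rate. This Python sketch assumes that the quoted hits/sec figures are the maximum an IOP can sustain and that the 60% utilization guideline applies as a fraction of that maximum; neither assumption is confirmed by the source, and the function name is hypothetical.

```python
import math


def iops_needed(target_hits_per_sec, iop_capacity_hits_per_sec,
                max_utilization=0.60):
    """Number of LAN IOPs needed to stay at or below the utilization target."""
    usable = iop_capacity_hits_per_sec * max_utilization
    return math.ceil(target_hits_per_sec / usable)


if __name__ == "__main__":
    # Target of 400 small nonsecure hits/sec, for illustration only.
    print(iops_needed(400, 250))  # 100Mb Ethernet with TCPONLY(*YES)
    print(iops_needed(400, 100))  # low end of the 100-130 hits/sec range
```

For the 400 hits/sec target this gives 3 IOPs at 250 hits/sec each and 7 IOPs at 100 hits/sec each. Larger pages lower the per-IOP hit capacity, so measured IOP utilization (PRTCPTRPT) should take precedence over this estimate.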