
 


 

 

Chapter 14 Part 2:

Network Performance and Scalability

 

 

What do we have in this chapter 14 Part 2?

  1. Resource Management

  2. Memory

  3. OS Networking Limitations

  4. Bandwidth

  5. Optimizing Web Classes

  6. Managing Threads and Connections

  7. A Simple C# Web Class Performance Measurement Program Example

Resource Management

 

When performing network operations, it is important to post multiple asynchronous receive operations to ensure that the application receives data as fast as possible. However, this practice can cause problems when the number of concurrent connections handled by a server increases. Additionally, as a server accepts more and more connections and performs send operations on each connection, the bandwidth of the local network also must be taken into consideration. For this reason, memory and bandwidth issues are important considerations when designing a scalable server. The following two sections will discuss these concepts in more detail.

 

Memory

 

Using the asynchronous I/O pattern is paramount for high performance, but rules still need to be followed to achieve scalability. For example, posting dozens rather than a few asynchronous receive operations on a Socket or Stream will not drastically increase performance, but it will increase the amount of memory used. The application can run out of resources, which limits the number of connections it can handle.

Consider a TCP Socket–based server application that maintains multiple connections and posts varying numbers of asynchronous receive operations. Each BeginReceive posted requires a buffer to receive the data, plus a small structure to maintain context information for the operation (including a reference to the receive buffer). If each receive operation uses a 16-KB receive buffer, plus a 200-byte context structure, then each BeginReceive uses 16,584 bytes. Table 14-1 calculates the memory requirements for various connection counts and operations per connection where each operation requires 16,584 bytes.

 

Table 14-1: Sample Memory Requirements for Asynchronous I/O

 

Total Connections    Operations Per Connection    Total Memory (Bytes)
10                   10                           1,658,400
1,000                10                           165,840,000
50,000               10                           8,292,000,000
10                   2                            331,680
1,000                2                            33,168,000
50,000               2                            1,658,400,000

 

 

Notice that the operations per connection field can be any combination of asynchronous send and receive operations - the basic idea is that each connection has the given number of operations outstanding. Posting 10 asynchronous operations for each connection limits a typical server (defined here as a 32-bit operating system running with its maximum possible memory configuration of 4 GB) to handling roughly 24,000 connections. If the application keeps the number of outstanding operations on each connection down to two, the number of connections that can be handled increases significantly.
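
The arithmetic behind Table 14-1 is simple multiplication. The minimal sketch below reproduces it; the 16-KB buffer, 200-byte context structure, and 4-GB ceiling are the assumptions stated in the text, and the class name is illustrative.

C#

using System;

class MemoryEstimate
{
    static void Main()
    {
        // Assumption from the text: 16-KB receive buffer plus a 200-byte context structure.
        const long bytesPerOperation = (16 * 1024) + 200;   // 16,584 bytes

        long[] connectionCounts = { 10, 1000, 50000 };
        int[] operationsPerConnection = { 10, 2 };

        // Reproduce the rows of Table 14-1.
        foreach (int ops in operationsPerConnection)
        {
            foreach (long connections in connectionCounts)
            {
                long totalBytes = connections * ops * bytesPerOperation;
                Console.WriteLine("{0,6} connections x {1,2} ops = {2:N0} bytes",
                    connections, ops, totalBytes);
            }
        }

        // A 32-bit server can address at most 4 GB, so 10 outstanding operations
        // per connection caps it in the low tens of thousands of connections.
        long maxMemory = 4L * 1024 * 1024 * 1024;
        Console.WriteLine("Upper bound at 10 ops per connection: {0:N0} connections",
            maxMemory / (10 * bytesPerOperation));
    }
}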

 

OS Networking Limitations

 

The number of network connections the Microsoft Windows NT family of operating systems (Windows NT 4, Windows 2000, Windows XP, and Windows Server 2003) can establish is limited by the memory resources available. The operating system reserves a portion of total memory as what is known as non-paged memory. Non-paged memory contains information and data structures that are never paged out of physical memory. Usually, the system reserves one-quarter of total memory for the non-paged pool, with a limit of 256 MB on Windows 2000 and later and 128 MB on Windows NT 4. These limits apply to the 32-bit versions of the operating system.

Operating system constructs, such as file handles, process information, networking connections, and so on, are examples of information that must always be resident in physical memory. Each TCP connection consumes approximately 2 KB of the non-paged memory. Because of this, a system with 256 MB of the non-paged pool can establish roughly 100,000 connections. Remember that a portion of the non-paged pool is being used by other system components, so networking cannot consume the entire amount. Also, the data buffers used to send and receive data must be locked into the non-paged pool while the network stack processes data. Socket operations will start to fail if a server reaches a point where there is too little free memory. In this case, a SocketException is thrown where the ErrorCode property is the Winsock error code WSAENOBUFS (10055). The Socket should be closed when this occurs to free all associated resources and ensure other operations on different sockets don’t also fail due to insufficient memory.
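
As a rough illustration of handling this failure, the following sketch shows a receive callback that checks for WSAENOBUFS and closes the offending socket; the class and method names are illustrative only.

C#

using System;
using System.Net.Sockets;

class NoBufsHandling
{
    // Illustrative receive callback; the connected Socket is passed as the async state.
    static void ReceiveCallback(IAsyncResult ar)
    {
        Socket client = (Socket)ar.AsyncState;
        try
        {
            int bytesRead = client.EndReceive(ar);
            if (bytesRead > 0)
            {
                // ... process the data and post the next BeginReceive ...
            }
        }
        catch (SocketException ex)
        {
            if (ex.ErrorCode == 10055)   // WSAENOBUFS: too little free non-paged memory
            {
                // Close the socket so its buffers and other resources are released
                // and operations on other sockets are not starved.
                client.Close();
            }
        }
    }
}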

Lastly, a server should guard against idle connections, which can be used as a form of attack. Consider a request-response based server that accepts client connections, waits for a request, and issues a response. If a client connects but never sends a request, the server typically posts an asynchronous receive that never completes. If enough malicious clients do this, valid clients can be prevented from connecting because the server runs out of resources, much like a denial-of-service (DoS/DDoS) attack. A defensive server should keep track of how long each client has been idle and close the connection if it exceeds a predefined limit.
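
One common defense is to stamp each connection with its last-activity time and periodically sweep for stale clients. The following is a minimal sketch of that idea; the ClientConnection type, the 60-second idle limit, and the 15-second sweep interval are illustrative assumptions, not values from the text.

C#

using System;
using System.Collections.Generic;
using System.Net.Sockets;
using System.Threading;

class ClientConnection
{
    public Socket Socket;
    public DateTime LastActivity;   // updated whenever a receive completes
}

class IdleConnectionMonitor
{
    private readonly List<ClientConnection> clients = new List<ClientConnection>();
    private readonly TimeSpan idleLimit = TimeSpan.FromSeconds(60);
    private Timer sweepTimer;

    public void Start()
    {
        // Sweep the connection list every 15 seconds.
        sweepTimer = new Timer(new TimerCallback(Sweep), null, 15000, 15000);
    }

    private void Sweep(object state)
    {
        lock (clients)
        {
            for (int i = clients.Count - 1; i >= 0; i--)
            {
                if (DateTime.UtcNow - clients[i].LastActivity > idleLimit)
                {
                    // Client has been idle too long -- assume it is dead or malicious
                    // and reclaim its resources.
                    clients[i].Socket.Close();
                    clients.RemoveAt(i);
                }
            }
        }
    }
}

The accept and receive paths would add connections to the list and refresh LastActivity; the sweep then bounds how long a silent client can hold server resources.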

 

Bandwidth

 

When performing network operations, the bandwidth of the network directly affects how well the application scales. For example, an FTP-like application that sends or receives large amounts of data in bulk is not going to scale well past a few hundred connections on a 10-megabit network. When designing network-based applications, responsiveness is an important design goal. If the FTP server allowed 1,000 concurrent users on that network, each connection would be limited to about 1,250 bytes/second. This isn't terrible unless the file being retrieved is several megabytes in size. Table 14-2 lists transfer rates for various numbers of concurrent connections and local network bandwidths.

 

Table 14-2: Bandwidth Per Connection Statistics

 

Total Connections    Network Bandwidth (Mbps)    Bytes/Second per Connection
100                  10                          12,500
1,000                10                          1,250
50,000               10                          25
100                  100                         125,000
1,000                100                         12,500
50,000               100                         250

 

The local bandwidth plays an important role in establishing limits on the number of concurrent asynchronous operations to allow per connection. For example, if a server is designed to handle 1,000 concurrent connections on a 100-megabit network, each connection can send and receive at 12,500 bytes/second. Posting 10 asynchronous sends of 8 KB each is a waste of resources because there will always be eight operations waiting on the network to send the data. An efficient design limits each connection to two outstanding asynchronous sends.
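
One way to enforce such a cap is to count the outstanding sends on each connection and refuse to post more than two at a time. The sketch below assumes an already connected Socket; the class and member names are illustrative.

C#

using System;
using System.Net.Sockets;
using System.Threading;

class ThrottledSender
{
    private const int MaxOutstandingSends = 2;   // per-connection cap from the guideline above
    private int outstandingSends;                // BeginSend calls not yet completed
    private readonly Socket socket;

    public ThrottledSender(Socket socket)
    {
        this.socket = socket;
    }

    // Returns false when the cap is reached; the caller should queue the buffer
    // and try again when a pending send completes.
    public bool TrySend(byte[] buffer)
    {
        if (Interlocked.Increment(ref outstandingSends) > MaxOutstandingSends)
        {
            Interlocked.Decrement(ref outstandingSends);
            return false;
        }

        socket.BeginSend(buffer, 0, buffer.Length, SocketFlags.None,
            new AsyncCallback(SendCallback), null);
        return true;
    }

    private void SendCallback(IAsyncResult ar)
    {
        socket.EndSend(ar);
        Interlocked.Decrement(ref outstandingSends);
        // A send slot is free again; dequeue and send any buffered data here.
    }
}

Queuing rejected buffers and draining the queue from SendCallback keeps the connection within its share of the bandwidth without tying up extra memory in sends the network cannot yet accept.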

If a network becomes overly congested, packets will be lost or dropped and TCP will be forced to retransmit them. This causes more congestion, and TCP will likely time out when the recipient fails to acknowledge the data, which causes the network stack to abort the connection. When a connection is dropped, any pending socket operation, or a subsequently issued socket method, will fail with a SocketException whose ErrorCode value is the Winsock error WSAECONNABORTED (10053). If a server experiences an excessive number of aborted connections, it should disallow additional connections until the number of currently established clients drops below a threshold. Another option is to close some valid connections to lower network congestion. If no action is taken, it is probable that other accepted connections will be aborted as well, leading to many failed clients - many more than if the server preemptively closes a number of connections to bring congestion down so that the remaining clients can be serviced successfully.

If an application efficiently handles sending data, it also must efficiently handle receiving data. As we mentioned earlier, if an application does not receive data fast enough, the TCP window size shrinks, which throttles the sender from sending additional data until the receiver catches up. The rule of thumb for receiving is to ensure that at least one asynchronous receive is posted at all times. To do this, an application should have three or four receive operations posted at any given time. This allows the application to process one completed operation while several are still outstanding, so the network stack can fill those buffers as data arrives. Again, it is important that the receive callback does not take too long to process. Of course, this rule applies only to applications that receive data at a high rate; a single asynchronous receive is sufficient for applications that receive small chunks of data infrequently.
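
The following sketch keeps three receives posted on a connection and closes it when the stack reports an aborted connection; the 8-KB buffers and the count of three follow the rule of thumb above, and the names are illustrative.

C#

using System;
using System.Net.Sockets;

class MultiReceiveConnection
{
    private readonly Socket socket;

    public MultiReceiveConnection(Socket socket)
    {
        this.socket = socket;

        // Keep three receives posted so the network stack always has a buffer
        // to fill while the application processes the ones that complete.
        for (int i = 0; i < 3; i++)
        {
            byte[] buffer = new byte[8192];
            socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None,
                new AsyncCallback(ReceiveCallback), buffer);
        }
    }

    private void ReceiveCallback(IAsyncResult ar)
    {
        byte[] buffer = (byte[])ar.AsyncState;
        try
        {
            int read = socket.EndReceive(ar);
            if (read > 0)
            {
                // ... process the buffer quickly, then repost it ...
                socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None,
                    new AsyncCallback(ReceiveCallback), buffer);
            }
            else
            {
                socket.Close();          // remote side closed the connection gracefully
            }
        }
        catch (SocketException ex)
        {
            if (ex.ErrorCode == 10053)   // WSAECONNABORTED: the stack dropped the connection
            {
                socket.Close();
                // A server could also count aborts here and stop accepting new
                // clients when the rate becomes excessive.
            }
        }
    }
}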

 

Optimizing Web Classes

 

The previous topics in this chapter apply when using the Web classes, but because the Web classes implement a higher-level protocol and offer more functionality, such as authentication, there are additional considerations. In this section, we'll cover threading with the Web classes, connection management, and performance considerations involving the GET and POST verbs, asynchronous I/O, and authentication.

 

Managing Threads and Connections

 

Using the asynchronous I/O pattern offers the most bang for the buck in terms of increasing performance with the Web classes. However, it is important to understand how asynchronous calls are made because the process can influence application design. When an asynchronous call completes, a thread from the thread pool is consumed to run the delegate associated with the operation. The thread pool is a limited resource that can be exhausted if you're not careful.

Recall from Chapter 10 that the ServicePointManager enforces a limit of two concurrent connections per application domain per host. This can cause a few problems. First, in the case of ASP.NET, the server can be underutilized because front-end requests are bottlenecked on the two-connection limit. Second, the application may deadlock because the ASP.NET worker threads are waiting for requests to complete, while those requests in turn need additional threads to perform the operations that would let them complete.

In versions 1.0 and 1.1 of the .NET Framework, blocking Web calls are implemented using asynchronous calls. The blocking call issues an asynchronous request and then waits until it completes, which means even blocking calls use the thread pool resource. In the next major release of the .NET Framework, blocking HTTP calls will be true blocking calls.

A couple of solutions can prevent a deadlock situation. The first is to move the methods being called from ASP.NET into a local dynamic-link library (DLL) so that they can be invoked directly. The second is to use the ServicePointManager to increase the number of allowed concurrent connections. The global defaults can be modified by changing the DefaultNonPersistentConnectionLimit and DefaultPersistentConnectionLimit properties on the ServicePointManager class. Or, if an application needs to change the connection limits for a specific destination, ServicePoint.ConnectionLimit can be changed. Increasing the connection limit to 10 or 12 is usually sufficient for ASP.NET and other middle-tier scenarios. The following code sample creates a request and sets the connection limit for the given destination:

 

C#

HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.bodo.com");

 

request.ServicePoint.ConnectionLimit = 12;

request.BeginGetResponse(new AsyncCallback(ResponseCallback), request);

Visual Basic .NET

Dim request As HttpWebRequest = CType(WebRequest.Create("http://www.butuh.com"), HttpWebRequest)

 

request.ServicePoint.ConnectionLimit = 12

request.BeginGetResponse(AddressOf ResponseCallback, request)

The connection limit plays a significant role in HTTP performance, as shown in Table 14-3. The table lists performance results for different values of the ServicePoint.ConnectionLimit property when making Web GET requests. In addition to the connection limit, HTTP pipelining and the Nagle algorithm play roles in performance. The numbers in Table 14-3 were measured on a 2.4-GHz Pentium 4 client with 512 MB of memory running Windows XP Service Pack 1, retrieving a 1-KB file. The server was a quad-processor 1.4-GHz AMD 64 server with 2 GB of memory running Windows Server 2003 Service Pack 1. Both computers were on an isolated 100-megabit network.

 

Table 14-3: Web Class Performance

 

Connection Limit    Pipelining    Nagle    Requests per Second    MB/Second
2                   No            No       2277.697               2.332
10                  No            No       2760.601               2.826
20                  No            No       2822.945               2.890
10                  Yes           No       3906.25                4.000
20                  Yes           No       3881.988               3.975
10                  No            Yes      2712.674               2.777
20                  No            Yes      2741.288               2.807
10                  Yes           Yes      3633.721               3.720
20                  Yes           Yes      3633.721               3.720

 

The performance numbers clearly show that increasing the connection limit to 10 offers a significant performance increase of approximately 20 percent. The results also show that increasing the connection limit beyond 10 does not offer an additional increase. This is likely due to the fact that at 10 concurrent connections a single processor client’s CPU is maxed out and additional concurrent connections are bottlenecked by the CPU. Another significant performance increase occurs when pipelining is enabled on the client. As we’ve mentioned, pipelining a request results in one TCP connection being used to issue multiple requests. Note: Significantly poorer performance results when pipelining is enabled in Internet Information Services (IIS) 6 released with Windows Server 2003. This is a known problem that should be fixed in Service Pack 1.

Table 14-3 also shows numbers for when the Nagle algorithm is disabled. Notice that no significant performance increase is seen with Nagle disabled. This is because an HTTP GET request involves a single request being sent to the server followed by receiving the response and the data. The Nagle algorithm only affects data being sent. Disabling the Nagle algorithm will have a larger influence on HTTP POST requests. It also makes a difference whenever authentication is involved as authentication requires many more roundtrip request-responses.
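
To illustrate, here is a hedged sketch of an HTTP POST issued with the Nagle algorithm disabled through the request's ServicePoint; the URL, resource path, and payload are illustrative only.

C#

using System;
using System.IO;
using System.Net;
using System.Text;

class PostWithoutNagle
{
    static void Main()
    {
        byte[] payload = Encoding.ASCII.GetBytes("field=value");

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.bodo.com/post.aspx");
        request.Method = "POST";
        request.ContentLength = payload.Length;

        // Nagle only affects outbound data, so disabling it matters most for
        // POST requests and for the extra round trips that authentication requires.
        request.ServicePoint.UseNagleAlgorithm = false;

        using (Stream requestStream = request.GetRequestStream())
        {
            requestStream.Write(payload, 0, payload.Length);
        }

        using (WebResponse response = request.GetResponse())
        {
            Console.WriteLine("Status: " + ((HttpWebResponse)response).StatusCode);
        }
    }
}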

The sample applications used to measure these results, entitled fastgetxx, are presented in the following sections in C#, VB .NET, and C++.

 

A Simple C# Web Class Performance Measurement Program Example

 

Create a new C# console application project. You may want to use the solution and project names as shown in the following screenshot.

 

A Simple C# Web Class Performance Measurement Program Example: create a new console application project

 

Rename the source file to FastAsyncGetCS to reflect the application to be developed.

 

A Simple C# Web Class Performance Measurement Program Example: rename the C# source file

 

Add/edit the code as given below.

 

 

//

// This sample measures HTTP GET performance through the HttpWebRequest class. Various

// options can be configured such as the connectionlimit, the total number of requests,

// Nagling, unsafe connection sharing, etc. The sample posts the requested number of

// asynchronous requests. When one completes, the response is retrieved, and an

// asynchronous stream receive is posted. The stream receive is reposted until the entire

// data stream has been read.

//

// Usage:

//      fastgetcs -c [int] -n [int] -u [URI] -p

//      -a              Allow unsafe authenticated connections

//      -c  int         Number of concurrent requests allowed

//      -d              Disable Nagle algorithm

//      -n  int         Total number of connections to make

//      -u  URI         URI resource to download

//      -p              Enable pipelining

//      -un string      User name

//      -up string      Password

//      -ud domain      Domain

//

// Sample usage:

//      fastgetcs -c 10 -n 10000 -u http://foo.com/default.html

//

 

using System;

using System.IO;

using System.Net;

using System.Text;

using System.Threading;

using System.Globalization;

 

namespace FastGetCS

{

    /// <summary>

    /// Simple client designed to max out the network 

    /// using HttpWebRequest.

    /// </summary>

    class FastAsyncGetCS

    {

        public static Uri uriName = null;

        public static ManualResetEvent allDone = null;

        public static int numRequests = 10;

        public static int numConnections = 2;

        public static int numRequestsCompleted = 0;

        public static byte[] readBuffer = null;

        public static long totalBytesReceived = 0;

        public static bool pipeliningEnabled = false;

        public static bool useNagle = true;

        public static bool unsafeAuthentication = false;

        public static NetworkCredential userInfo = null;

 

        /// <summary>

        /// Displays usage information.

        /// </summary>

        static void usage()

        {

            Console.WriteLine("In FastAsyncGetCS.usage()");

            Console.WriteLine("usage: fastgetcs -c [int] -n [int] -u [URI] -p");

            Console.WriteLine(" -a              Allow unsafe authenticated connections");

            Console.WriteLine(" -c  int         Number of concurrent requests allowed");

            Console.WriteLine(" -d              Disable Nagle algorithm");

            Console.WriteLine(" -n  int         Total number of connections to make");

            Console.WriteLine(" -u  URI         URI resource to download");

            Console.WriteLine(" -p              Enable pipelining");

            Console.WriteLine(" -un string      User name");

            Console.WriteLine(" -up string      Password");

            Console.WriteLine(" -ud domain      Domain");

            Console.WriteLine();

        }

 

        /// <summary>

        /// This is the main method which parses the command line and initiates the GET requests.

        /// It then waits for the 'allDone' method to be set at which point it calculates the performance

        /// statistics and displays them to the console.

        /// </summary>

        /// <param name="args">Command line parameters</param>

        [STAThread]

        static void Main(string[] args)

        {

            Console.WriteLine("In FastAsyncGetCS.Main()");

            string userName = null, passWord = null, domain = null;

 

            for (int i = 0; i < args.Length; i++)

            {

                try

                {

                    if ((args[i][0] == '-') || (args[i][0] == '/'))

                    {

                        switch (Char.ToLower(args[i][1]))

                        {

                            case 'a':

                                // Allow unsafe authentication

                                unsafeAuthentication = true;

                                break;

                            case 'c':

                                // How many concurrent requests to allow

                                numConnections = System.Convert.ToInt32(args[++i]);

                                break;

                            case 'd':

                                // Disable Nagle algorithm

                                useNagle = false;

                                break;

                            case 'n':

                                // How many client connections to establish to server

                                numRequests = System.Convert.ToInt32(args[++i]);

                                break;

                            case 'p':

                                // Enable pipelining

                                pipeliningEnabled = true;

                                break;

                            case 'u':

                                if (args[i].Length == 2)

                                {

                                    // URI to retrieve

                                    uriName = new Uri(args[++i]);

                                }

                                else

                                {

                                    switch (Char.ToLower(args[i][2]))

                                    {

                                        case 'n':

                                            // User name

                                            userName = args[++i];

                                            break;

                                        case 'p':

                                            // Password

                                            passWord = args[++i];

                                            break;

                                        case 'd':

                                            // Domain

                                            domain = args[++i];

                                            break;

                                        default:

                                            usage();

                                            return;

                                    }

                                }

                                break;

                            default:

                                usage();

                                return;

                        }

                    }

                }

                catch

                {

                    usage();

                    return;

                }

            }

            if (uriName == null)

            {

                usage();

                return;

            }

 

            if ((userName != null) || (passWord != null) || (domain != null))

            {

                userInfo = new NetworkCredential(userName, passWord, domain);

            }

 

            try

            {

                float start = 0;

                float end = 0;

                float total = 0;

 

                readBuffer = new byte[1200];

                allDone = new ManualResetEvent(false);

                // Gets the number of ms elapsed since the system started - start count

                start = Environment.TickCount;

                Console.WriteLine("Counting start in ms: " + start);

 

                GetPages();

                allDone.WaitOne();

                // Gets the number of ms elapsed since the system started - end count

                end = Environment.TickCount;

                Console.WriteLine("Counting end in ms: " + end);

                // total ms count from start to end

                total = end - start;

                Console.WriteLine("Total count in ms: " + total);

                Console.WriteLine("Calculating the total KB/s...");

                long totalKBytes = (long)((totalBytesReceived / 1000) / (total / 1000));

                Console.WriteLine("Then total count in seconds " + total / 1000 + " seconds.");

                Console.WriteLine("Number of request: " + numRequests / (total / 1000) + " requests per second.");

                // Gets a NumberFormatInfo associated with the en-US culture.

                NumberFormatInfo nfi = new CultureInfo("en-US", false).NumberFormat;

                nfi.NumberDecimalDigits = 0;

                Console.WriteLine("It is " + totalKBytes.ToString("N", nfi) + " KB per second.");

            }

            catch (Exception ex)

            {

                Console.WriteLine(ex.ToString());

                allDone.Set();

            }

        }

 

        /// <summary>

        /// Retrieve the given URI the requested number of times. This method initiates an asynchronous

        /// HTTP GET request for the URI.

        /// </summary>

        static void GetPages()

        {

            int i;

 

            Console.WriteLine("In FastAsyncGetCS.GetPages()");

            for (i = 0; i < numRequests; i++)

            {

                HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uriName);

                request.ServicePoint.ConnectionLimit = numConnections;

                request.ServicePoint.UseNagleAlgorithm = useNagle;

                request.UnsafeAuthenticatedConnectionSharing = unsafeAuthentication;

                request.Pipelined = pipeliningEnabled;

                request.Credentials = userInfo;

                if (userInfo != null)

                    request.ConnectionGroupName = userInfo.UserName;

                request.BeginGetResponse(new AsyncCallback(ResponseCallback), request);

            }

            return;

        }

 

        /// <summary>

        /// This is the asynchronous callback invoked when the HTTP GET request completes. It retrieves

        /// the HTTP response object, obtains the data stream, and posts an asynchronous stream read

        /// to retrieve the data. Note that all receives for all requests use the same data buffer since

        /// we don't care about the data -- we just want to measure performance.

        /// </summary>

        /// <param name="result">Asynchronous context result for the operation</param>

        private static void ResponseCallback(IAsyncResult result)

        {

            HttpWebRequest req = (HttpWebRequest)result.AsyncState;

            Console.WriteLine("In FastAsyncGetCS.ResponseCallback()");

 

            try

            {

                // Retrieve the response and post a stream receive

                WebResponse response = req.EndGetResponse(result);

                Stream responseStream = response.GetResponseStream();

                responseStream.BeginRead(readBuffer, 0, readBuffer.Length, new AsyncCallback(ReadCallBack), responseStream);

            }

            catch (Exception ex)

            {

                Console.WriteLine("Exception thrown in ResponseCallback... " + ex.ToString());

                req.Abort();

                Interlocked.Increment(ref numRequestsCompleted);

                // If this failed request was the last one outstanding, release Main().
                if (numRequestsCompleted >= numRequests)
                {
                    allDone.Set();
                }

            }

        }

 

        /// <summary>

        /// This is the asynchronous callback for the asynchronous stream receive operation posted after the

        /// HTTP response is obtained from a request. This method checks the number of bytes returned. If it is

        /// non-zero, another receive is posted as there could be more data pending. If zero is returned, we have

        /// read the entire data stream so we can close it.

        /// </summary>

        /// <param name="asyncResult">Asynchronous context result for the operation</param>

        private static void ReadCallBack(IAsyncResult asyncResult)

        {

            Stream responseStream = (Stream)asyncResult.AsyncState;

            int read = responseStream.EndRead(asyncResult);

            Console.WriteLine("In FastAsyncGetCS.ReadCallBack()");

 

            if (read > 0)

            {

                // Possibly more data pending...post another receive

                totalBytesReceived += read;

                responseStream.BeginRead(readBuffer, 0, readBuffer.Length, new AsyncCallback(ReadCallBack), responseStream);

            }

            else

            {

                // Reached the end of the stream so close it up

                responseStream.Close();

                Interlocked.Increment(ref numRequestsCompleted);

 

                if (numRequestsCompleted >= numRequests)

                {

                    allDone.Set();

                }

            }

            return;

        }

    }

}

 

 

 

