
| \-txBufferSize | True | Publisher | The size of the transmit buffer, specified in packets. | 16 |
| \-txRate | True | Publisher | The rate at which to transmit data, specified in megabytes per second. | _unlimited_ |
| \-txIterations | True | Publisher | Specifies the number of packets to publish before exiting. | _unlimited_ |
| \-txDurationMs | True | Publisher | Specifies how long to publish before exiting, in milliseconds. | _unlimited_ |
| \-reportInterval | True | Both | The interval at which to output a report, specified in packets. | 100000 |
| \-tickInterval | True | Both | The interval at which to output tick marks, specified in packets. | 1000 |
| \-log | True | Listener | The name of a file to save a tabular report of measured performance. | _none_ |
| \-logInterval | True | Listener | The interval at which to output a measurement to the log. | 100000 |
| \-polite | True | Publisher | Switch indicating that the publisher should wait until it starts to receive data before publishing. | _off_ |
| _arguments_ | True | Publisher | Space-separated list of addresses to publish to, specified as addr:port. | _none_ |
h3. Usage Examples
h4. Listener
{noformat}
java -server com.tangosol.net.DatagramTest -local box1:9999 -packetSize 1468
{noformat}
h4. Publisher
{noformat}
java -server com.tangosol.net.DatagramTest -local box2:9999 -packetSize 1468 box1:9999
{noformat}
For ease of use, {{datagram-test.sh}} and {{datagram-test.cmd}} scripts are provided in the Coherence bin directory, and can be used to execute this test.
h2. Example
Suppose we want to test network performance between two servers: _Server A_ with IP address {{195.0.0.1}} and _Server B_ with IP address {{195.0.0.2}}. One server will act as a packet publisher and the other as a packet listener. The publisher will transmit packets as fast as possible, and the listener will measure and report performance statistics. First, start the listener on _Server A_.
{noformat}
datagram-test.sh
{noformat}
After pressing {{ENTER}}, you should see the Datagram Test utility showing you that it is ready to receive packets.
{noformat}
starting listener: at /195.0.0.1:9999
packet size: 1468 bytes
buffer size: 1428 packets
report on: 100000 packets, 139 MBs
process: 4 bytes/packet
log: null
log on: 139 MBs
{noformat}
As you can see, by default the test will try to allocate a network receive buffer large enough to hold 1428 packets, or about 2 MB. If it is unable to allocate this buffer, it will report an error and exit. You can either decrease the requested buffer size using the {{\-rxBufferSize}} parameter or increase your OS network buffer settings. For best performance, it is recommended that you increase the OS buffers. See the following [forum post|http://www.tangosol.net/forums/thread.jspa?threadID=616&tstart=0] for details on tuning your OS for Coherence.
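The "about 2 MB" figure follows from the listener banner above. The Python sketch below reproduces the arithmetic. The helper name {{buffer_bytes}} is illustrative only, and treating 1 MB as 1,048,576 bytes is an assumption about the units the tool reports.

```python
PACKET_SIZE_BYTES = 1468    # "packet size" line of the listener banner
BUFFER_SIZE_PACKETS = 1428  # "buffer size" line of the listener banner
BYTES_PER_MB = 2 ** 20      # assumption: the test reports binary megabytes


def buffer_bytes(packets: int, packet_size: int) -> int:
    """Return the buffer size in bytes needed to hold `packets` packets."""
    return packets * packet_size


requested = buffer_bytes(BUFFER_SIZE_PACKETS, PACKET_SIZE_BYTES)
print(f"requested receive buffer: {requested} bytes "
      f"({requested / BYTES_PER_MB:.2f} MB)")
```

The same helper can be used to estimate what a smaller {{\-rxBufferSize}} value would require before deciding whether to lower the request or raise the OS limit.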
Once the listener process is running you may start the publisher on _Server B_, directing it to publish to _Server A_.
{noformat}
datagram-test.sh servera
{noformat}
After pressing {{ENTER}}, you should see the new Datagram Test instance on _Server B_ start both a listener and a publisher. Note that in this configuration _Server B's_ listener will not be used. The following output should appear in the _Server B_ command window.
{noformat}
starting listener: at /195.0.0.2:9999
packet size: 1468 bytes
buffer size: 1428 packets
report on: 100000 packets, 139 MBs
process: 4 bytes/packet
log: null
log on: 139 MBs
starting publisher: at /195.0.0.2:9999 sending to servera/195.0.0.1:9999
packet size: 1468 bytes
buffer size: 16 packets
report on: 100000 packets, 139 MBs
process: 4 bytes/packet
peers: 1
rate: no limit
no packet burst limit
oooooooooOoooooooooOoooooooooOoooooooooOoooooooooOoooooooooOoooooooooOoooooooooO
{noformat}
The series of "o" and "O" tick marks appears as data is (O)utput on the network. Each "o" represents 1000 packets, with an "O" indicator at every 10,000 packets.
On _Server A_ you should see a corresponding set of "i" and "I" tick marks, representing network (I)nput. This indicates that the two test instances are communicating.
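As a rough illustration of how the tick marks relate to packet counts, the Python sketch below renders the tick line for a given number of packets. It is not the utility's own code: the function name {{render_ticks}} is hypothetical, and it assumes one tick per {{\-tickInterval}} packets with every tenth tick capitalized, as described above.

```python
def render_ticks(packets: int, tick_interval: int = 1000,
                 lower: str = "o", upper: str = "O") -> str:
    """Render one tick per `tick_interval` packets, capitalizing every tenth."""
    ticks = packets // tick_interval  # partial intervals produce no tick
    return "".join(upper if n % 10 == 0 else lower
                   for n in range(1, ticks + 1))


print(render_ticks(20000))                       # publisher side
print(render_ticks(20000, lower="i", upper="I"))  # listener side
```

Read in reverse, a line of 20 tick marks at the default interval corresponds to roughly 20,000 packets sent or received.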
h2. Reporting
Periodically each side of the test will report performance statistics.
h3. Publisher Statistics
The publisher simply reports the rate at which it is publishing data on the network. A typical report is as follows:
{noformat}
Tx summary 1 peers:
life: 97 MB/sec, 69642 packets/sec
now: 98 MB/sec, 69735 packets/sec
{noformat}
The report includes both the current transmit rate (since last report) and the lifetime transmit rate.
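The two figures on each line of the report are related through the packet size. The Python sketch below converts the packet rate into a data rate, assuming the 1468-byte packet size used in this example and assuming that the tool reports megabytes of 1,048,576 bytes; the function name {{mb_per_sec}} is illustrative.

```python
BYTES_PER_MB = 2 ** 20  # assumption: the report uses binary megabytes


def mb_per_sec(packets_per_sec: float, packet_size: int = 1468) -> float:
    """Convert a packet rate into a data rate in MB/sec."""
    return packets_per_sec * packet_size / BYTES_PER_MB


for label, rate in (("life", 69642), ("now", 69735)):
    print(f"{label}: {mb_per_sec(rate):.2f} MB/sec at {rate} packets/sec")
```

Under these assumptions the results round to the 97 and 98 MB/sec shown in the sample report.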
h3. Listener Statistics
The listener reports more detailed statistics including:
|| Element || Description ||
| Elapsed | The time interval that the report covers. |
| Packet size | The received packet size. |
| Throughput | The rate at which packets are being received. |
| Received | The number of packets received. |
| Missing | The number of packets which were detected as lost. |
| Success rate | The percentage of received packets out of the total packets sent. |
| Out of order | The number of packets which arrived out of order. |
| Average offset | An indicator of how out of order packets are. |
As with the publisher, both current and lifetime statistics are reported. A typical report is as follows:
{noformat}
Lifetime:
Rx from publisher: /195.0.0.2:9999
elapsed: 8770ms
packet size: 1468
throughput: 96 MB/sec
68415 packets/sec
received: 600000 of 611400
missing: 11400
success rate: 0.9813543
out of order: 2
avg offset: 1
Now:
Rx from publisher: /195.0.0.2:9999
elapsed: 1431ms
packet size: 1468
throughput: 98 MB/sec
69881 packets/sec
received: 100000 of 100000
missing: 0
success rate: 1.0
out of order: 0
avg offset: 0
{noformat}
The primary items of interest are the throughput and success rate. The goal is to find the highest throughput while maintaining a success rate as close to {{1.0}} as possible. On a {{100 Mb}} network setup you should be able to achieve rates of around {{10 MB/sec}}. On a {{1 Gb}} network you should be able to achieve rates of around {{100 MB/sec}}. Achieving these rates will likely require some tuning (see below).
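The success rate and packet rate in the sample report can be recomputed from the raw counters, which is useful when post-processing a log. The Python sketch below does this for the lifetime figures shown above; the function name {{summarize}} and the dictionary keys are illustrative, not part of the utility.

```python
def summarize(received: int, missing: int, elapsed_ms: int) -> dict:
    """Derive success rate and packet rate from listener counters."""
    sent = received + missing  # total packets the listener knows were sent
    return {
        "success_rate": received / sent,
        "packets_per_sec": received / (elapsed_ms / 1000),
    }


# Lifetime figures taken from the sample listener report.
lifetime = summarize(received=600000, missing=11400, elapsed_ms=8770)
print(f"success rate: {lifetime['success_rate']:.7f}")
print(f"packets/sec:  {lifetime['packets_per_sec']:.0f}")
```

When comparing runs, a higher packet rate is only an improvement if the success rate stays close to {{1.0}}.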
h4. Throttling
The publishing side of the test may be throttled to a specific data rate, expressed in megabytes per second, by including the {{\-txRate M}} parameter, where {{M}} represents the maximum MB/sec the test should put on the network.
h4. Bidirectional Testing
You may also run the test in a bidirectional mode where both servers act as publishers and listeners. To do this, restart the test instances, supplying the instance on _Server A_ with _Server B's_ address by running the following on _Server A_:
{noformat}
datagram-test.sh -polite serverb
{noformat}
Then run the same command as before on _Server B_. The {{\-polite}} parameter instructs this test instance not to start publishing until it starts to receive data.
h4. Distributed Testing
You may also use more than two machines in testing. For instance, you can set up two publishers to target a single listener. This style of testing is far more realistic than simple one-to-one testing, and may identify bottlenecks in your network that you were not otherwise aware of.
Assuming you intend to construct a cluster consisting of four machines, you can run the datagram test amongst all of them as follows:
On servera:
{noformat}datagram-test.sh -txRate 100 -polite serverb serverc serverd{noformat}
On serverb:
{noformat}datagram-test.sh -txRate 100 -polite servera serverc serverd{noformat}
On serverc:
{noformat}datagram-test.sh -txRate 100 -polite servera serverb serverd{noformat}
On serverd:
{noformat}datagram-test.sh -txRate 100 servera serverb serverc{noformat}
This test sequence will cause each node to send a total of 100 MB per second to all other nodes (i.e. 33 MB/node/sec). On a fully switched 1GbE network this should be achievable without packet loss.
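The per-node figure is simply the throttled rate divided across the peers, and the same arithmetic shows how much traffic each node should expect to receive. The Python sketch below works this through for the four-machine example; the function names are illustrative, and it assumes the {{\-txRate}} budget is shared evenly among the targets.

```python
def per_peer_rate(tx_rate_mb: float, peers: int) -> float:
    """MB/sec sent to each peer when the budget is split evenly."""
    return tx_rate_mb / peers


def inbound_rate(tx_rate_mb: float, nodes: int) -> float:
    """MB/sec arriving at one node when every other node publishes to it."""
    peers = nodes - 1
    return peers * per_peer_rate(tx_rate_mb, peers)


print(f"outbound per peer: {per_peer_rate(100, 3):.1f} MB/sec")
print(f"inbound per node:  {inbound_rate(100, 4):.1f} MB/sec")
```

Each node therefore sends and receives about 100 MB/sec in total, which is why this configuration sits near the practical limit of a 1GbE link.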
To simplify the execution of the test, all nodes can be started with an identical target list. They will then transmit to themselves as well, but this loopback data can easily be factored out. It is important to start all but the last node using the {{\-polite}} switch, as this causes all other nodes to delay testing until the final node is started.
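A small script can generate the per-node command lines for the identical-target-list approach. The Python sketch below is one way to do it, under the assumption that the script is invoked as {{datagram-test.sh}} and that nodes are started in list order; the function name {{build_commands}} is hypothetical.

```python
def build_commands(nodes, tx_rate_mb=100, script="datagram-test.sh"):
    """Return {node: command}, using -polite on every node except the last."""
    targets = " ".join(nodes)  # identical target list, including the node itself
    commands = {}
    for index, node in enumerate(nodes):
        is_last = index == len(nodes) - 1
        polite = "" if is_last else " -polite"
        commands[node] = f"{script} -txRate {tx_rate_mb}{polite} {targets}"
    return commands


for node, command in build_commands(
        ["servera", "serverb", "serverc", "serverd"]).items():
    print(f"On {node}: {command}")
```

Because the last node is started without {{\-polite}}, it begins publishing immediately, which in turn releases the waiting nodes.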