Kea Performance Tests 1.7.6 and Multithreading
This report presents a series of measurements on Kea version 1.7.6, taken using ISC's Jenkins continuous integration system and ISC's Perfdhcp software. The measurements were carried out in April 2020 as part of on-going work to convert Kea to a multithreaded application. The main purpose of the tests was to measure the impact of multithreading on various configurations.
The test described below is continuing to execute, triggered manually or by a commit to the master development branch, so the results shown in our live continuous integration system are certain to differ from those shown below.
At the time this test was completed, the core Kea v4 and v6 daemons had multithreading implemented, but the various Kea hooks had not yet been updated to support threading. Some of the Kea hooks, notably the RADIUS integration hook and the associated Host Cache hook, will not be upgradable to multithreading, because of the RADIUS client library that the hook relies on.
The test was conducted in ISC's internal network using 3 systems. Two were running Kea and the related database backends (details below) and the third was running perfdhcp. All three systems were connected in a single VLAN over a 1 gigabit Ethernet network. The servers were running on the configuration described below. The perfdhcp system was a much less capable machine, but its configuration does not affect the results, so it is omitted here for clarity.
Hardware specs for Kea servers - Dell R340 servers
- CPU Intel Xeon E-2146G 3.5GHz 6 cores/12 threads
- 64GB RAM
- 3 x SSDs 446GB each in HW RAID-0 configuration (virtual disk size 1338GB)
- Intel(R) 10GbE 2P X710 Adapter (2 ports)
- Intel(R) GbE 4P I350-t Adapter (4 ports)
- Broadcom Gigabit Ethernet BCM5720 (2 ports)
- OS - Ubuntu 18.04.4 LTS
- Mysql - 5.7.29-0ubuntu0.18.04.1
- Postgresql - 190ubuntu0.1
- boost - 126.96.36.199ubuntu1
- openssl - 1.1.1-1ubuntu2.1~18.04.5
- Kea - 1.7.6-isc0018120200407134619 (Tests were run against multiple builds of Kea, because this is running in our continuous integration system. The results highlighted here were from Kea 1.7.6 (git commit 0018120200407134619).)
- perfdhcp 1.7.6
- Kea was configured with a single large subnet.
- All configurations (both Kea4 and Kea6) had a single pool of addresses, sized so that the pool was not expected to fill during a test. The pool size for v4 was 16777214 addresses, and the pool size for v6 was 18446744073709551616 addresses. In nearly all tests, pools were <50% utilized. However, we underestimated Kea's performance, and in some earlier runs visible in our Jenkins system we managed to exhaust the pool. (Performance is adversely impacted when pool utilization is high.)
- Default values were used wherever possible.
- Lease lifetimes are longer than the duration of each test, so there are no renewals.
- Shared subnets were not used.
- No host reservations were defined, either in the configuration file or in the database.
- For tests with the database backends, both Kea and the database server were running on the same machine. (Performance may be very different if the database and Kea are not running on the same host. Also, the overall performance may be much lower if more than one Kea instance is connected to the same database.)
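The two pool sizes quoted above correspond to familiar prefix lengths. As a quick check, the sketch below reproduces them with Python's ipaddress module. The /8 and /64 prefix lengths and the example networks are assumptions inferred from the numbers; the report does not state which prefixes were configured.

```python
import ipaddress

# Assumption: a /8 IPv4 subnet, inferred from the reported pool size.
v4_net = ipaddress.ip_network("10.0.0.0/8")
# Exclude the network and broadcast addresses from the usable pool.
v4_pool = v4_net.num_addresses - 2

# Assumption: a /64 IPv6 subnet, using the documentation prefix as a stand-in.
v6_net = ipaddress.ip_network("2001:db8::/64")
v6_pool = v6_net.num_addresses


def utilization(leases_granted: int, pool_size: int) -> float:
    """Return the fraction of the pool consumed by granted leases."""
    return leases_granted / pool_size


print(v4_pool)  # 16777214
print(v6_pool)  # 18446744073709551616
```

A helper such as utilization() makes it easy to confirm, after a run, that the number of leases granted stayed below half of the pool.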
We experimented with different settings to achieve the maximum performance in multithreaded mode. These are our preliminary findings, which will change in the future. We encourage users to perform their own experiments.
Kea 1.7.6 has an experimental command-line switch -NX (where X denotes the number of threads to run). The upcoming 1.7.7 version will have dedicated configuration parameters for this and the command-line switch will be removed. The parameter names are likely to be "enable-multi-threading", "thread-pool-size" and "packet-queue-size". Please check the 1.7.7 release notes or the 1.7.7 ARM (Administrator Reference Manual) once they are published.
- The optimal setting for Kea appears to be half of the available hardware threads. For example, if your CPU has 4 physical cores and hyper-threading is enabled, your system will report 8 cores. In that case, we found that using 4 threads seems to result in the best performance.
- The memfile backend worked best with -N4 on our hardware, so this is the setting we used for Kea when testing with the memfile.
- Databases should work best with a higher number of threads, as long as this doesn't exceed the number of cores. In this test we used -N6 for the tests with the MySQL and PostgreSQL backends. (In subsequent test runs we changed to using -N12 with PostgreSQL, which improved performance in that environment.)
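The rule of thumb above can be written down as a small helper. This is only a sketch of a starting point for your own experiments: the function names are hypothetical, and the table simply records the -N values this report says were used on the 6-core/12-thread test servers.

```python
import os


def half_of_logical_cpus(logical_cpus: int) -> int:
    """Starting point suggested by the preliminary findings: half of the
    logical CPUs reported by the system, but never fewer than one thread."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return max(1, logical_cpus // 2)


def kea_thread_switch(threads: int) -> str:
    """Format the experimental Kea 1.7.6 command-line switch, e.g. -N4."""
    return f"-N{threads}"


# Thread counts actually used in this report, per lease backend.
SETTINGS_USED = {"memfile": 4, "mysql": 6, "postgresql": 6}

# os.cpu_count() may return None on some platforms; fall back to 1.
detected = os.cpu_count() or 1
print(kea_thread_switch(half_of_logical_cpus(detected)))
for backend, threads in SETTINGS_USED.items():
    print(backend, kea_thread_switch(threads))
```

Note that the memfile setting used here (-N4) is lower than half of the 12 hardware threads, which is one reason the findings are described as preliminary.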
Traffic Generation (clients)
Client traffic was generated using Perfdhcp, a utility developed by ISC and included with the Kea source and package distribution.
- Clients perform only the basic four-message exchange (SARR or DORA). There are no release, renew or rebind messages.
- Each client performs the exchange only once.
- Messages do not include any additional options except those necessary to get an address from the DHCP server.
Example perfdhcp commands are attached with config files at the end of this article.
- The server was started before each measurement run began and stopped after it ended.
- Kea starts with an empty database (no file) and writes the leases to disk as they are granted. The in-memory database is cleared (with the "lease6-wipe" command) between each test. The MySQL and PostgreSQL database backends were initialized from scratch at the start of each run, and cleared between tests.
- Traffic was generated continuously, and the number of leases requested per second was increased until perfdhcp began reporting that 2.5% of packets were late or dropped. The 2.5% loss rate is an arbitrary but reasonable limit for acceptable packet loss in a production network. For the purposes of this test, a packet is considered dropped when the response from the Kea server reaches perfdhcp 2 or more seconds after perfdhcp sent the request, or is never received at all. This approach overestimates the drop rate: we assume the worst case, that a response which has not appeared within 2 seconds never will, which is usually overly pessimistic. The actual behavior is therefore better than reported here.
- The basic test was run with 3 different configurations, using the default memfile lease backend, a MySQL lease backend, and a PostgreSQL lease backend.
- The tests were run once with multi-threading enabled and again with multi-threading disabled.
- Identical tests were run for Kea v4 and Kea v6.
- The MySQL and PostgreSQL databases were started before a run and stopped afterwards. After the database was started, the previous schema was deleted and the database schema was set up from scratch.
- In each test run, we executed the test 9 times, and removed the highest and lowest result. The bar charts show the average of the remaining 7 runs.
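The averaging step in the last bullet is a simple trimmed mean. A minimal sketch follows; the function name is hypothetical and the sample rates are made-up numbers for illustration, not measurements from this report.

```python
def trimmed_average(results):
    """Drop the single highest and single lowest result, then average the rest."""
    if len(results) < 3:
        raise ValueError("need at least 3 results to trim both extremes")
    ordered = sorted(results)
    kept = ordered[1:-1]  # discard one minimum and one maximum
    return sum(kept) / len(kept)


# Illustrative lease rates (leases/second) for 9 runs; not real data.
sample_runs = [100, 900, 500, 510, 490, 505, 495, 500, 500]
print(trimmed_average(sample_runs))  # 500.0
```

Discarding the extremes keeps a single unusually slow or fast run from skewing the value shown in the bar charts.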
These charts display the average maximum lease rate (the rate at which the server hands out leases, measured in leases/second) achievable with a packet drop rate of up to 2.5%. This is effectively the performance of the server. As noted above, a packet is considered dropped if the delay seen by Perfdhcp between sending a request and receiving a response is 2 seconds or more, or if no response arrives at all. These charts present the average of the 7 runs retained after the highest and lowest of 9 were discarded. To obtain each lease, a total of 4 packets (DORA or SARR) had to be exchanged.
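To make the drop criterion concrete, the sketch below classifies per-request response times and computes the resulting drop fraction. It illustrates the definition used in this report and is not perfdhcp's implementation; the names are hypothetical, None stands for a response that never arrived, and treating exactly 2.5% as still acceptable is an assumption.

```python
from typing import Optional, Sequence

DROP_TIMEOUT_S = 2.0       # a response this late (or later) counts as dropped
MAX_DROP_FRACTION = 0.025  # the 2.5% limit used in this report


def is_dropped(response_time_s: Optional[float]) -> bool:
    """A request is dropped if no response arrived (None) or it took 2 s or more."""
    return response_time_s is None or response_time_s >= DROP_TIMEOUT_S


def drop_fraction(response_times_s: Sequence[Optional[float]]) -> float:
    """Fraction of requests that count as dropped under the rule above."""
    if not response_times_s:
        raise ValueError("no samples")
    dropped = sum(1 for t in response_times_s if is_dropped(t))
    return dropped / len(response_times_s)


def within_limit(response_times_s: Sequence[Optional[float]]) -> bool:
    """True while the drop fraction does not exceed the 2.5% limit (assumed inclusive)."""
    return drop_fraction(response_times_s) <= MAX_DROP_FRACTION


# Illustrative sample: 980 fast replies, 10 lost, 10 late. Not real data.
samples = [0.01] * 980 + [None] * 10 + [2.5] * 10
print(drop_fraction(samples))  # 0.02
print(within_limit(samples))   # True
```

Because late responses are counted together with lost ones, this measure can only overstate the true loss.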
Kea/DHCPv4 Test Results
This section presents results for Kea/DHCPv4 running with the three different lease backend configurations.
The graphs below show the average of 7 runs (after the highest and lowest of 9 runs were discarded).
Every configuration had higher performance with multithreading enabled. As prior performance tests have established, the Memfile lease database is the highest performing, followed by the external database options.
Previously we had found that PostgreSQL was significantly faster than MySQL, but in this test, they were nearly identical in single thread mode, with PostgreSQL faster in multithread mode only.
Kea DHCPv6 Test Results
This section presents results for Kea/DHCPv6 running with the three different lease backend configurations. The graphs below show the average of 7 runs (after the highest and lowest of 9 runs were discarded).
In the DHCPv6 tests, every configuration had higher performance with multi-threading enabled, but the memfile configuration had dramatically higher performance with DHCPv6 than with DHCPv4. The cause of this difference has not yet been determined.
The DHCPv6 tests also showed a significant performance difference between the PostgreSQL and MySQL backends.
In the preceding sections, we measured the peak performance in both single-threaded and multithreaded modes. The peak lease rate is defined as the rate at which 2.5% of requests remain unanswered after 2 seconds (perfdhcp does not differentiate between a packet that is more than 2 seconds late and one that is truly lost); i.e., for a set of DISCOVER (SOLICIT) requests sent out by perfdhcp, 97.5% of them result in a lease being granted.
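The search for the peak rate can be pictured as a simple ramp: raise the request rate step by step and keep the last rate at which the drop fraction stayed within 2.5%. The sketch below is an illustration under stated assumptions, not the actual test harness: find_peak_rate and its measure_drop_fraction callback are hypothetical names, and toy_server is a made-up model standing in for a real measurement.

```python
from typing import Callable


def find_peak_rate(
    measure_drop_fraction: Callable[[int], float],
    start_rate: int,
    step: int,
    max_rate: int,
    max_drop: float = 0.025,
) -> int:
    """Increase the request rate until the drop fraction exceeds max_drop.

    Returns the highest tested rate that stayed within the limit,
    or 0 if even the starting rate exceeded it.
    """
    peak = 0
    rate = start_rate
    while rate <= max_rate:
        if measure_drop_fraction(rate) > max_drop:
            break
        peak = rate
        rate += step
    return peak


def toy_server(rate: int) -> float:
    """Made-up model: answers up to 5000 requests/s, drops everything beyond that."""
    capacity = 5000
    return 0.0 if rate <= capacity else (rate - capacity) / rate


print(find_peak_rate(toy_server, start_rate=1000, step=500, max_rate=20000))  # 5000
```

The step size bounds the resolution of the result, so a finer step gives a more precise peak at the cost of a longer test.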
Kea/DHCPv4 - Gain due to Multithreading
Kea/DHCPv6 - Gain due to Multithreading
We would like to stress several points:
- Performance gains due to multithreading are substantial.
- The Kea implementation will evolve and our performance testing will expand over time, so the performance results will surely change in the future.
- The major bottleneck slowing performance in single threaded mode was the database connection for MySQL or PostgreSQL. With the ability to use multiple threads, each with its own connection, that bottleneck is removed. We observed that the whole Kea+database system yields higher performance with multithreading. Therefore, database tuning will have a greater impact on overall system performance when Kea is multithreaded. We did not attempt any database tuning for these tests.
- The version tested is an experimental development version. Development versions should not be used in production networks. The upcoming Kea 1.8 stable version will be the first version with multithreading that will be supported for production use.
- Some features (most notably RADIUS support) are not multithreading capable. Other features that rely on Kea hooks (e.g. High Availability) will be upgraded to use threads as we continue developing this feature.