Five steps to tune Netezza query performance

Netezza is designed with simplicity in mind. You can get it up and running in hours rather than weeks. When you follow the basic rules, 99 percent of your applications and queries can perform well. There are six things you should keep in mind while designing your databases and setting up maintenance tasks:

  • Distribution
  • Data types
  • Statistics
  • Zone maps
  • Data organization
  • Groom
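
Most of these knobs are plain SQL. Here is a minimal sketch of what setting them looks like, assuming a made-up SALES table (the database, table and column names are purely illustrative):

    # Hypothetical example: a fact table that sets distribution and
    # organization at creation time (all names are placeholders).
    nzsql -d MYDB -c "
      CREATE TABLE sales (
        order_id     BIGINT NOT NULL,   -- a high-cardinality column
        customer_id  INTEGER,
        order_date   DATE
      )
      DISTRIBUTE ON (order_id)          -- spreads rows evenly across data slices
      ORGANIZE ON (order_date);         -- helps zone maps prune scans
    "

    # Keep statistics fresh and reclaim logically deleted rows.
    nzsql -d MYDB -c "GENERATE STATISTICS ON sales;"
    nzsql -d MYDB -c "GROOM TABLE sales RECORDS ALL;"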

When you have taken care of these things, you shouldn’t have any major issues with performance. However, if you do have issues, how do you find them? Follow these five steps, which can help you find and possibly fix performance issues on your appliance.

1. Use IBM Netezza Performance Portal and the query history database

The IBM Netezza Performance Portal is an excellent tool for making sure you have everything in place, and if you don’t, it will help you identify the issues. It provides a rich front end to the query history database, and it can connect the appliance’s performance history with your query history.

Netezza Performance Portal and its installation guide are available for download from IBM Fix Central. The installation is fairly simple. Please refer to the “IBM Netezza Performance Portal User’s Guide” (which is included in the download) for installing and configuring both Netezza Performance Portal and the query history database.
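
If query history collection is not yet enabled on your appliance, it is set up with SQL. The following is a rough sketch only; the configuration name, history database, user and password are made-up placeholders, and the full option list is in the user’s guide:

    # Create and activate a query history configuration (names are
    # placeholders; check the documentation for all COLLECT options).
    nzsql -d SYSTEM -c "
      CREATE HISTORY CONFIGURATION basic_hist
        HISTTYPE QUERY
        DATABASE histdb USER histuser PASSWORD 'secret'
        COLLECT QUERY, PLAN;
    "
    nzsql -d SYSTEM -c "SET HISTORY CONFIGURATION basic_hist;"
    # The new configuration takes effect when the instance is restarted.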

2. Check the server load and resources

One of the greatest things about Netezza Performance Portal is its ability to monitor one or more Netezza systems and their resource usage. From the nice graphical user interface (GUI) you can easily identify performance peaks. Then, if you see any changes in trend, you can drill down and take a closer look by using your mouse pointer to click where the change begins and ends. You can repeat this as many times as you want. After zooming in on the period of time you are interested in, you can click the “Jump to History” button to see which queries were running during that time slot. But first you will need to choose the host you are interested in, as this will populate the “Submit Time” and “Finish Time” fields in the query history view.
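
If you also like to double-check the load from the command line, the appliance ships with CLI tools that show similar information; for example:

    nzsystem showState    # overall state of the system (should be online)
    nzstats               # system-level statistics, including host resources
    nzsession             # sessions currently running on the appliance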

3. Identify the problematic queries

Identifying problematic queries is, of course, easier said than done. Basically, there are three different types of long-lasting queries that usually require a closer look:

  • Queries that take a long time to finish because they need to access a lot of data.
  • Queries that take a long time to finish because the query itself is not optimal.
  • Queries that take a long time to finish because the database design is not optimal.

When you first look at your queries, they probably just look like long-lasting queries. However, if you know your data and your queries, you might be able to place at least some of them in the first class I mentioned (queries that naturally take a long time to finish because of large amounts of data). What I usually do myself, after I have zoomed in on a performance peak or otherwise interesting period from the Netezza Performance Portal monitoring view, is sort the queries based on their “Query Duration.” I simply list the queries in descending order and then look at the “Query Text.”
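
The same “sort by Query Duration, descending” can also be done straight in SQL. This is a sketch only: the _v_qryhist view and its column names are assumptions that may differ between Netezza versions, so verify them on your own system before relying on them:

    # List the longest-running recent queries, longest first.
    nzsql -d SYSTEM -c "
      SELECT qh_database, qh_user,
             qh_tend - qh_tstart AS duration,
             qh_text
        FROM _v_qryhist
       ORDER BY duration DESC
       LIMIT 20;
    "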

When you have selected the interesting query, you can perform various actions with Netezza Performance Portal, including the following:

  • You can right-click the query and check the “Identical Query Trend Chart.” This will give you an idea of the variation in duration for identical queries over time. For instance, if the system is overwhelmed by concurrently running workloads, it is obvious that a query will not run as fast as it normally would. If you notice that a query took longer than normal to run, you should check what else was running on the system at that time.
  • If it really took longer to run than is typical, you can check the “Query/Plan Activity Chart.” This will give you a nice graphical view of all the queries running concurrently on the system, which could be affecting the duration of the query you were interested in.
  • You can check whether the statistics are up to date on the tables related to the query, and you can even update the statistics through Netezza Performance Portal if they are outdated (a command line alternative is sketched after this list).
  • You can also check encumbrance. There might have been, for instance, loads or aborted ad hoc queries running on the system that negatively affected system performance. I once identified the latter as the reason why high-priority extract, transform and load (ETL) tasks did not finish in time. Since the queries were aborted, they were not seen in the Query/Plan Activity Chart, but rather in the encumbrance view.
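
Updating statistics outside the portal is a one-liner per table; for example (the table name is made up):

    # Full statistics on one table; EXPRESS statistics are a cheaper variant.
    nzsql -d MYDB -c "GENERATE STATISTICS ON sales;"
    nzsql -d MYDB -c "GENERATE EXPRESS STATISTICS ON sales;"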

Picture 1: Identical Query Trend Chart

4. Check the query plan

You can check the query plan directly from Netezza Performance Portal, and that’s not a bad choice. However, what I usually do is note only the plan ID, which you can find in one of the columns of the query history view. I then take the plan ID, log in to the appliance and use nz_plan to take a closer look at the query. One advantage of nz_plan is that it lists the interesting snippets early in the file. Another attractive feature is that it rewrites the query in a nicely readable format.

That said, you can still use the other techniques available to produce the query plan, including the one available directly through Netezza Performance Portal.
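
The workflow on the appliance is short; as a sketch, with a made-up plan ID:

    # On the appliance host: 12345 is a made-up plan ID taken from the
    # "Plan ID" column of the query history view.
    nz_plan 12345 > plan_12345.txt
    less plan_12345.txt    # the interesting snippets are listed near the top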

Picture 2: Query Plan generated with nz_plan


5. Check the distribution keys and change them if needed

Now that you have the query plan, one of the first things to check is the distribution keys for the tables the query is accessing. Are they what you assumed they were? Are there redistributions, single or double? If you see something like “1[03]:spu DownloadTableNode distribute into link 2147484337” you know that the table is being redistributed. If you assumed it wouldn’t be, then you should check the distribution keys again.
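
Two quick checks are useful here: what the table is declared to be distributed on, and whether the rows are actually spread evenly. Since a distribution key cannot be altered in place, changing it means recreating the table. A sketch with made-up names (verify the catalog view name on your NPS version):

    # What is the table distributed on?
    nzsql -d MYDB -c "
      SELECT tablename, attname
        FROM _v_table_dist_map
       WHERE tablename = 'SALES';
    "

    # Is the data actually spread evenly? datasliceid is a hidden column.
    nzsql -d MYDB -c "
      SELECT datasliceid, COUNT(*)
        FROM sales
       GROUP BY 1
       ORDER BY 2 DESC;
    "

    # Changing the key: recreate the table with CTAS, then swap names.
    nzsql -d MYDB -c "
      CREATE TABLE sales_new AS SELECT * FROM sales
      DISTRIBUTE ON (customer_id);
    "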

Learn how it all works now, while there are no issues

Don’t wait until you have issues. Familiarize yourself with how to monitor performance before you have performance problems. If you don’t already have Netezza Performance Portal, install and deploy it. Try the nz_plan utility and test how it works. Read the query plans. By doing this, you will be ahead of the game and ready to tackle any upcoming issues.

If you have any questions or suggestions related to query performance tuning, please leave a comment. You can also follow me on Twitter @TVaattanen to discuss more about Netezza.

Everything you wanted to know about networking but were afraid to ask (Part Three)

This blog post is the third part of a series about questions you may have wanted to ask about Netezza networking. The first part concentrated on basic Netezza networking, while the second part continued with network bonding and floating IP addresses. This third part concentrates on advanced configuration options.

Network speed

By default, a Netezza appliance host has two available Peripheral Component Interconnect (PCI) slots for additional PCI cards. Normally you would use one for a 10 Gb dual-port network interface card (NIC) and the second available slot for a dual-port 8 Gb host bus adapter (HBA). The first could be used for 10 Gb networking, and the second for storage area network (SAN) connectivity or LAN-Free backups.

Internally, the appliance uses 10 Gb networking. Externally, the default is 1 Gb. If you want 10 Gb external networking, you need the additional 10 Gb dual-port NIC. Assuming you have a 10 Gb network infrastructure in place, you most probably want to go directly to 10 Gb.

Even if you plan to start with 1 Gb external networking, you should consider getting the additional 10 Gb NIC and 8 Gb HBA anyway, because you are likely to use them later.

More about network bonding

By default, the appliance has two hosts. Each host has one external bonded virtual network device, which consists of two physical 1 Gb network interfaces. By default, the network bond is created as active/passive, so the maximum bandwidth you can achieve is 1 Gbps. If you ask, and your network switch supports link aggregation, you can configure the network bond as active/active to get a 2 Gbps link.

As mentioned above, there are two available PCI slots. This means you could also add two 10 Gb dual-port NICs to those slots. That way, you can bond up to four 10 Gb physical network devices together to achieve a maximum of 40 Gbps of bandwidth.

Another option would be to use two of the 10 Gb ports for virtual IP addresses for application connectivity, and the two remaining ones for a backup network. There are plenty of options when you consider that you can bond together any of the 10 Gb ports in any order to create a bonded device, and then choose either active/active or active/passive mode.
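
Because this is standard Linux bonding underneath, you can verify the mode in use directly from the host; for example:

    # Standard Linux bonding: mode 1 (active-backup) is active/passive,
    # mode 4 (802.3ad) is link-aggregated active/active (needs switch support).
    grep "Bonding Mode" /proc/net/bonding/bond2
    cat /sys/class/net/bond2/bonding/mode    # e.g. "active-backup 1"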

What about LAN-Free?

This section doesn’t actually cover pure TCP/IP networking, but rather connectivity without TCP/IP. As mentioned earlier, you can have an 8 Gb HBA installed in one or both of the available PCI slots on the hosts. If you reserve at least one of the available PCI slots for an additional 8 Gb HBA, you can use it for LAN-Free backups.

TCP/IP networking is usually done in shared mode, so you have to share the bandwidth with other users, unless you have a dedicated link, which most often you don’t. With SAN it is easier and more common to create a dedicated link between the appliance and, for example, the backup server. Or you can connect to an external SAN disk through a dedicated link. That of course has clear benefits: you know exactly how much bandwidth there is, and you don’t need to share it with anyone.

Another benefit of the LAN-Free option is lower CPU usage: TCP/IP implementations tend to have more CPU overhead than SAN. I would still emphasize the benefit of the dedicated link, though, since CPU on the host is rarely the limiting factor when dealing with backups, for instance.

Management interfaces

I already mentioned the management IP addresses: usually two per host, one being the host IP itself, and the other being the IP address of the integrated management module (IMM).

The IMM IP addresses are extremely handy if the host itself is not reachable through the host IP because it has failed with a hardware error, or because there is something wrong with the configuration. Through the IMM, you get console access through the web interface, where you can either debug the problem or fix the configuration issue.

Some clients require a separate management IP that is not attached to any network device used by applications but still has direct TCP/IP connectivity to the host. In that case, neither the host IP nor the IMM IP can be used; you need some other available physical network port or interface. If this applies to you, clearly define the requirements so that the available options can be checked.

What else?

If anything else is on your mind that you did not dare to ask earlier, feel free to ask or comment below. You can also follow me on Twitter @TVaattanen to discuss more about Netezza.

Everything you wanted to know about Netezza networking but were afraid to ask (Part Two)

This blog post is the second in a three-part series with the goal of answering questions you might have about Netezza networking. The first part concentrates on basic Netezza networking, whereas this second part covers more advanced networking concepts. For advanced configuration options, you can check out the upcoming third part of this blog post.

Network bonding

You have two hosts: one active and one passive. Each has its own IP address. These IP addresses are not floating; they are called host IPs. Since you want maximum redundancy on all components, each host actually has two physical network devices bound together to create one virtual network device that carries the host IP. This is called network bonding.

Let’s say both hosts have network devices eth6 and eth7, and that these are coupled into a virtual device called bond2. We usually use bond0 and bond1 internally, so the first bonded device for external use is normally bond2.

You can assign an IP address to the virtual device bond2 and use it to connect to a host. Both the active and the passive host will have this device, and each host will have its own individual IP address bound to it.

Virtual IP

If you think of this from an application’s point of view, it wouldn’t make sense to connect to the host IP: if the active host failed, you would need to reconfigure the applications to use the new active host, which has a different IP.

That’s why applications use a virtual IP. A virtual IP is actually an IP alias, which is bound to the active host. The hosts run a standard Linux operating system, so if you are familiar with Linux, this is easy to explain. If not, it’s still not rocket science. On Linux, you can easily add IP aliases on top of any physical (or virtual, for that matter) network device. If you have a physical network device eth0 with the fictional IP address 192.168.1.100, you can add another IP address to that same physical device just by assigning an IP to device eth0:0. Next you would add device eth0:1, and so on.
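
On a plain Linux system, adding such aliases looks like this (using the fictional addresses from above):

    # Add a second IP address to the same physical device as an alias.
    ifconfig eth0:0 192.168.1.101 netmask 255.255.255.0 up
    # ...and a third one:
    ifconfig eth0:1 192.168.1.102 netmask 255.255.255.0 up
    ifconfig -a    # both aliases show up next to eth0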

In this case, you have virtual network device bond2, which is a bonded device having physical devices eth6 and eth7 behind it. If you lose eth6, you are still good as long as physical device eth7 is good. To connect to either of the hosts directly, you would use the IP address assigned to bond2 on the particular host, or rather the host name you have assigned in your domain name server (DNS) for that IP address.

Floating IP

As I said, applications connect to a virtual IP. The virtual IP is assigned to virtual network device bond2:0, which only exists on the active host. This is called a floating IP, and it is always on the active host: if Host 1 fails, it moves to Host 2. And if, as in my example, device eth6 fails, the bonded device bond2 (which consists of eth6 and eth7) keeps working, so the floating IP stays on the same host as before.

There are two virtualization layers here. One is provided through network bonding, the other through the cluster software. If one of the network devices physically breaks, the network bonding does the trick, and you are still good to go. If the whole host breaks, the clustering software deactivates bond2:0 on the failing host and creates bond2:0 on the new active host.
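
The cluster software does all of this automatically, but done by hand the failover would look roughly like this (the virtual IP here is fictional):

    # On the failing host: drop the alias that carries the virtual IP.
    ifconfig bond2:0 down

    # On the new active host: bring the same virtual IP up on its bond.
    ifconfig bond2:0 192.168.1.200 netmask 255.255.255.0 up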

So bond2:0 always carries the virtual IP your applications use. You should, of course, always assign a host name in your DNS for this virtual IP, and use that host name in your applications instead of using the IP address directly. That way, if you ever need to change the IP address behind the virtual IP, you don’t need to change the configuration of several applications; you just change the IP for the host name you have defined in your DNS configuration.

What about changes to the default configuration?

I will cover advanced configuration options in part three of this blog post. If you have any network-related questions or suggestions, please add them below in the comments. You can also follow me on Twitter @TVaattanen to discuss more about Netezza.