August 21, 2017

A Picture is Worth a Thousand Lines of Code...

Archives

Now that you've described the data, you need to tell RRDtool how to store it. This is where you want to break out a calculator. If you've chosen to do 1 day, 2 days, 1 week, 1 month, and 1 year graphs, you'll have to store the data in archives that make displaying data in those formats sensible. You don't want blocky graphs; you want enough data points to give you good detail, but you don't necessarily want to store too much data.

```perl
my @opts = (
    '--step', 300,
    'DS:in:GAUGE:600:0:U',
    'DS:out:GAUGE:600:0:U',
    'RRA:AVERAGE:0.5:1:288',     # 1 day
    'RRA:AVERAGE:0.5:2:288',     # 2 days
    'RRA:AVERAGE:0.5:7:288',     # 1 week
    'RRA:AVERAGE:0.5:30:288',    # 1 month
    'RRA:AVERAGE:0.5:365:288'    # 1 year
);
```

Archive definitions begin with an "RRA" and contain four critical pieces of information, all colon delimited: Type, "Xfiles Factor," Steps, and Rows.

Types

Archive types are different from Data Source types, and there are quite a few of them, both simple and complex. For the purposes of this article, you'll just take a look at the simple types: AVERAGE, MIN, MAX, and LAST. If you've worked with SQL before, you may notice a striking resemblance between these types and SQL aggregate functions. That's because you're aggregating data from your Data Sources into RRAs, or "Round Robin Archives."
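To make the comparison with aggregate functions concrete, here is a minimal sketch in plain Perl of what each simple type does to one batch of samples. It does not call RRDtool at all; the `%consolidate` table and the sample readings are made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw(min max sum);

# Hypothetical dispatch table: one function per simple RRA type.
# Each collapses a list of samples into a single value.
my %consolidate = (
    AVERAGE => sub { sum(@_) / scalar(@_) },
    MIN     => sub { min(@_) },
    MAX     => sub { max(@_) },
    LAST    => sub { $_[-1] },
);

my @samples = (10, 40, 25, 5);    # made-up readings, oldest first

for my $type (qw(AVERAGE MIN MAX LAST)) {
    printf "%-7s %s\n", $type, $consolidate{$type}->(@samples);
}
```

RRDtool applies the same idea to each group of steps before writing a row into the archive.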

"Xfiles Factor"

This is a number between 0 and 1 that tells RRDtool what fraction of the aggregated data can be made up of "Unknown" while the result still counts as valid. I've used a factor of 0.5 (50%), meaning at least half of the consolidated data must be valid. Raising the factor tolerates more unknown data and so leaves fewer gaps in the aggregated data sets; lowering it is stricter.
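As a rough illustration of that rule, the sketch below models a consolidated row in plain Perl, with `undef` standing in for "Unknown". The `row_is_known` helper is hypothetical and simplified; it is not RRDtool's own code, and it assumes at least one sample is passed in.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the share of unknown samples does not
# exceed the xfiles factor, so the consolidated value stays valid.
sub row_is_known {
    my ($xff, @samples) = @_;
    my $unknown = grep { !defined($_) } @samples;    # count of undef entries
    return ($unknown / scalar(@samples)) <= $xff ? 1 : 0;
}

# Two samples, one unknown: exactly half unknown, still valid at 0.5.
print row_is_known(0.5, 10, undef) ? "known\n" : "unknown\n";

# Three samples, two unknown: more than half unknown, so not valid.
print row_is_known(0.5, 10, undef, undef) ? "known\n" : "unknown\n";
```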

Steps

This number tells RRDtool how many steps (in this example, 300 second intervals) to consolidate for each Archive. As a simplified example, I'm using the number of days as the number of steps. You'll need to adjust this for making graphs of different time frames.

Rows

To get the number of rows, I divided the number of seconds in a day (86400) by the step (300), and got a result of 288. The number of rows tells RRDtool how many rows of aggregated data to store. Using the number of days as the step, I can keep 288 rows of data and cover the graphs I want to do.
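The calculator work can be checked with a few lines of Perl. This sketch splits an RRA definition on its colons and multiplies step, steps, and rows to get the time span one archive covers. The `rra_span` helper is a hypothetical name, and only the first four archives from the example are listed; the same arithmetic applies to the yearly one.

```perl
use strict;
use warnings;

my $step = 300;    # seconds, matching '--step', 300
my @rras = (
    'RRA:AVERAGE:0.5:1:288',
    'RRA:AVERAGE:0.5:2:288',
    'RRA:AVERAGE:0.5:7:288',
    'RRA:AVERAGE:0.5:30:288',
);

# Hypothetical helper: seconds of history held by one archive.
sub rra_span {
    my ($step_secs, $rra) = @_;
    # Fields are RRA : type : xfiles factor : steps : rows
    my (undef, $type, $xff, $steps, $rows) = split /:/, $rra;
    return $step_secs * $steps * $rows;
}

for my $rra (@rras) {
    printf "%-24s %5.1f days\n", $rra, rra_span($step, $rra) / 86400;
}
```

Each row holds `step * steps` seconds of consolidated data, so with 288 rows the steps value is also the number of days covered.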

There are other ways to adjust the steps and rows in the RRA to limit the number of RRAs that you have to create, but that can take up more space than you might need. If you take time to design the database, and meet the majority of your needs, you can limit the size of the RRD itself, thus making the access times a little bit quicker.

Code for Creation

The @opts array you've been working with is a list of arguments you'd pass to the RRDs::create() function to build an RRD on the local disk. Here's a sample snippet that I use to create my RRDs dynamically.

```perl
use strict;
use warnings;
use RRDs;

my $RRD_DIR = '/var/rrd/myapp';

sub create_rrd {
    my $hostname = shift;
    # Strip anything that is not a word character, dot, or hyphen,
    # in case the hostname comes from an untrusted source.
    $hostname =~ s/[^\w.-]+//g;
    my $fn = join '/', $RRD_DIR, $hostname;
    return $fn if -f $fn;    # already exists, nothing to do

    my @opts = (
        '--step', 300,
        'DS:in:GAUGE:600:0:U',
        'DS:out:GAUGE:600:0:U',
        'RRA:AVERAGE:0.5:1:288',
        'RRA:AVERAGE:0.5:2:288',
        'RRA:AVERAGE:0.5:7:288',
        'RRA:AVERAGE:0.5:30:288',
        'RRA:AVERAGE:0.5:365:288'
    );
    RRDs::create $fn, @opts;

    # Check for error
    my $err = RRDs::error;
    warn "Error creating $fn: $err\n" if $err;

    # Return the filename
    return $fn;
}
```
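The hostname cleanup is easy to try on its own. Here is a small sketch, assuming the intent is to keep only word characters, dots, and hyphens; `clean_hostname` is a hypothetical wrapper and the inputs are made up.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the hostname cleanup substitution.
sub clean_hostname {
    my $hostname = shift;
    $hostname =~ s/[^\w.-]+//g;    # drop slashes, spaces, semicolons, etc.
    return $hostname;
}

for my $raw ('web-01.example.com', '../../etc/passwd', 'db 02;rm -rf') {
    printf "%-22s => %s\n", "'$raw'", "'" . clean_hostname($raw) . "'";
}
```

Because dots are kept, a hostile input can still reduce to a name such as `..`, so you may want an extra check before joining the result onto the directory. Also note that under Perl's taint mode (`-T`) a substitution alone does not untaint a value; only data extracted through regex capture groups is considered clean.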

Adding Data to the RRDs

Now that the RRDs are set up, you need to fill them with your data. I usually use RRDs for anything that involves networking, so the data is keyed to IP addresses or something of that nature. To facilitate speedy coding, I usually write a data updating function that looks like this:

```perl
sub update_rrd {
    my $host = shift;
    my $RRD  = create_rrd($host);    # creates the file if it is missing
    my @vals = @_;

    # Update string has the form timestamp:value1:value2
    my $update = join ':', time, @vals;
    RRDs::update $RRD, $update;

    my $err = RRDs::error;
    warn "update_rrd: problem updating $RRD: $err"
        if $err;
}
```

To add data, you call the update_rrd() function, which automatically creates the RRD for you if one doesn't exist, thanks to the interaction between create_rrd() and update_rrd(). You pass in the following arguments: the hostname, the in traffic, and the out traffic. If you had your data stored in a hash, it might look like this:

```perl
foreach my $host (keys %traffic) {
    update_rrd($host, $traffic{$host}{in}, $traffic{$host}{out});
}
```
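If you want to see what that loop sends without touching any RRD files, the following sketch builds the same colon-joined update strings and prints them. The `%traffic` contents and the `update_string` helper are invented for illustration.

```perl
use strict;
use warnings;

# Made-up traffic readings keyed by hostname.
my %traffic = (
    'web-01' => { in => 1200, out => 3400 },
    'web-02' => { in => 800,  out => 150 },
);

# Hypothetical helper: join a timestamp and values into "time:in:out".
sub update_string {
    my ($when, @vals) = @_;
    return join ':', $when, @vals;
}

my $now = time;
for my $host (sort keys %traffic) {
    # Hash slice keeps the values in Data Source order: in, then out.
    my $str = update_string($now, @{ $traffic{$host} }{qw(in out)});
    print "$host -> $str\n";
}
```

The values have to appear in the same order as the DS definitions in the RRD, which is why the slice names `in` before `out`.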
