As defined previously, the focus of accounting is to track the usage of network resources and traffic characteristics. The following sections identify various accounting scenarios:
Network monitoring
User monitoring and profiling
Application monitoring and profiling
Capacity planning
Traffic profiling and engineering
Peering and transit agreements
Billing
Security analysis
This is certainly not an exhaustive list of the different accounting scenarios and categories. Nevertheless, it covers the needs of the majority of enterprise and service provider customers. Each section describes the problem space, examples of specific results, and some implementation examples.
Let's start by discussing some generic examples that are at the edge between accounting and performance monitoring. The fuzzy area of "network monitoring" fits in here. The term "network monitoring" is widely interpreted: one person might relate it to device utilization only, and someone else might think of end-to-end monitoring. In fact, network monitoring is a vague expression that includes multiple functions. Network monitoring applications enable a system administrator to monitor a network for the purposes of security, billing, and analysis (both live and offline). We propose to use the term "network monitoring" for any application that does not fit into the other categories.
Table 1-2 illustrates device utilization. Assume that we have a network with three service classes deployed. Class 0 delivers real-time traffic, such as voice over IP, and class 1 carries business-critical traffic, such as e-mail and financial transactions. Class 2 covers everything else; this is the "best-effort" traffic class. Table 1-2 illustrates the total amount of traffic collected per class, including the number of packets and number of bytes. This report provides relevant information to a network planner. The technology applied in this example is an SNMP data collection of the CISCO-CLASS-BASED-QOS-MIB (see Chapter 4, "SNMP and MIBs"), which describes all the CoS counters.
| Time (Hour) | Class 0 Packets | Class 0 Bytes | Class 1 Packets | Class 1 Bytes | Class 2 Packets | Class 2 Bytes |
|---|---|---|---|---|---|---|
| 0 | 38 | 2735 | 1300 | 59800 | 3 | 1002 |
| 1 | 55 | 3676 | 400 | 44700 | 61 | 9791 |
| 2 | 41 | 36661 | 400 | 16800 | 4 | 240 |
| 3 | 13 | 1660 | 200 | 8400 | 4 | 424 |
| 4 | 16 | 14456 | 400 | 44700 | 4 | 420 |
| 5 | 19 | 2721 | 400 | 44400 | 1 | 48 |
| 6 | 21 | 24725 | 600 | 35600 | 516 | 20648 |
| 7 | 19 | 3064 | 700 | 412200 | 15 | 677 |
| 8 | 5 | 925 | 1200 | 176000 | 1 | 48 |
| 9 | 4 | 457 | 1300 | 104100 | 1242 | 1489205 |
| 10 | 5 | 3004 | 1900 | 1091900 | 1 | 48 |
| 11 | 4 | 451 | 400 | 39800 | 545 | 22641 |
| 12 | 4 | 456 | 800 | 54200 | 1017 | 1089699 |
| 13 | 5 | 510 | 500 | 41600 | 36 | 3240 |
| 14 | 4 | 455 | 400 | 99300 | 15 | 3287 |
| 15 | 5 | 511 | 800 | 36800 | 685 | 27578 |
| 16 | 4 | 454 | 100 | 4000 | 3 | 144 |
| 17 | 4 | 457 | 500 | 309500 | 2 | 322 |
| 18 | 4 | 455 | 400 | 34100 | 4 | 192 |
| 19 | 5 | 3095 | 1300 | 104100 | 4 | 424 |
| 20 | 4 | 398 | 100 | 15200 | 4 | 424 |
| 21 | 5 | 1126 | 800 | 54200 | 12 | 936 |
| 22 | 7 | 782 | 1300 | 104100 | 4 | 835 |
| 23 | 9 | 7701 | 600 | 35600 | 1 | 235 |
Another scenario of network monitoring is the use of accounting usage resource records for performance monitoring. The accounting collection process at the device level gathers usage records of network resources. These records consist of information such as interface utilization, traffic details per application and user (for example, percentage of web traffic), real-time traffic, and network management traffic. They may include details such as the originator and recipient of a communication. Granularity differs according to the requirements. A service provider might collect individual user details for premium customers, whereas an enterprise might be interested in only a summary per department. This section's focus is on usage resource records, not on overall device details, such as CPU utilization and available memory.
A network monitoring solution can provide the following details for performance monitoring:
Device performance monitoring:
- Interface and subinterface utilization
- Per class of service utilization
- Traffic per application
Network performance monitoring:
- Communication patterns in the network
- Path utilization between devices in the network
Service performance monitoring:
- Traffic per server
- Traffic per service
- Traffic per application
Applied technologies for performance monitoring include SNMP MIBs, RMON, Cisco IP SLA, and Cisco NetFlow services.
The trend of running mission-critical applications on the network is evident. Voice over IP (VoIP), virtual private networking (VPN), and videoconferencing are increasingly being run over the network. At the same time, people use (abuse?) the network to download movies, listen to music online, perform excessive surfing, and so on.
Accounting information collected per user can be used to do the following:
Monitor and profile users.
Track network usage per user.
Document usage trends by user, group, and department.
Identify opportunities to sell additional value-added services to targeted customers.
Build a traffic matrix per subdivision, group, or even user. A traffic matrix illustrates the patterns between the origin and destination of traffic in the network.
Accounting records can help answer the following questions:
Which applications generate the most traffic of which type?
Which users use these applications?
What percentage of traffic do they represent?
Where do they come from?
Where do they go?
Do the users accept the policies on network usage?
When will upgrades affect the fewest users?
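As a minimal sketch of how accounting records could answer the first of these questions, the following Python snippet aggregates per-user records by application and reports each application's share of the total. The record layout `(user, application, bytes)`, the function name `traffic_share`, and all sample values are assumptions made for illustration; they do not represent the output format of any particular accounting technology.

```python
from collections import Counter

def traffic_share(records):
    """Return [(application, bytes, percent)] sorted by volume, largest first.

    `records` is an iterable of (user, application, byte_count) tuples.
    """
    per_app = Counter()
    for _user, app, byte_count in records:
        per_app[app] += byte_count
    total = sum(per_app.values())
    # most_common() is empty when there are no records, so no division by zero.
    return [(app, b, 100 * b / total) for app, b in per_app.most_common()]

# Hypothetical accounting records (illustrative values only).
sample_records = [
    ("alice", "http", 600),
    ("bob", "http", 200),
    ("bob", "voip", 150),
    ("carol", "smtp", 50),
]
shares = traffic_share(sample_records)
```

Grouping the same records by user instead of by application would give the per-user view needed for the second question.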
There are also legal requirements related to monitoring users and collecting accounting records. For example, you could draw conclusions about an individual's performance on the job. In some countries, it is illegal to collect specific performance data about employees. One solution could be to collect no details about individuals. Although this is ideal from a legal perspective, it becomes a nightmare during a security attack. Consider a scenario in which a PC of an individual user has been infected by a virus and starts attacking the network, and the user is unaware of this. It would be impossible to identify this PC without collecting accounting records per user, so you need to collect this level of detail. The same applies to the victims of the attack: They will certainly complain about the bad network service, but the operator cannot help them without useful data sets. From a data analysis perspective, we need to store performance baseline information and apply statistical operations such as "deviation from normal" to spot abnormalities.
A compromise could be to gather all details initially and separate the storage mechanisms afterwards. You could keep all details for security analysis for a day (minimum) or a week (maximum) and aggregate the records at the department level for performance or billing purposes. This approach should be okay from a legal perspective if you ensure that there is no public access to the security collection.
Note
Check your country's legal requirements before applying per-user accounting techniques.
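The "deviation from normal" check mentioned earlier can be sketched as follows. This is only one possible approach, assuming a simple z-score test against a stored baseline; the function name `find_anomalies`, the threshold of three standard deviations, and the sample byte counts are illustrative assumptions, not values from a real network.

```python
from statistics import mean, stdev

def find_anomalies(baseline, observed, threshold=3.0):
    """Return (index, value, z_score) for observed samples that deviate from
    the baseline mean by more than `threshold` standard deviations."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    anomalies = []
    for i, value in enumerate(observed):
        # A perfectly flat baseline (sigma == 0) gives no usable z-score;
        # this sketch simply does not flag anything in that case.
        z = (value - mu) / sigma if sigma > 0 else 0.0
        if abs(z) > threshold:
            anomalies.append((i, value, z))
    return anomalies

# Hypothetical hourly byte counts for one host (illustrative values only).
baseline_bytes = [450, 455, 457, 454, 456, 451, 455, 458]
observed_bytes = [455, 452, 98000, 457]
anomalies = find_anomalies(baseline_bytes, observed_bytes)
```

In this example only the third observed sample is flagged. A production system would likely use a longer baseline and account for time-of-day patterns.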
Applied technologies for user monitoring and profiling include RMON; Authentication, Authorization, and Accounting (AAA); and Cisco NetFlow services.
With the increase in emerging technologies such as VoIP/IP telephony, video, data warehousing, sales force automation, customer relationship management, call centers, procurement, and human resources management, network management systems are required that allow you to identify traffic per application. Several years ago, this was a relatively easy task, because there were several different transmission protocols: TCP for UNIX communication, IPX for Novell file server sharing, SNA for mainframe sessions, and so on. The consolidation toward IP eliminated several of these protocols but introduced a new challenge for the network operator: how to distinguish between various applications if they all use IP. Collecting different interface counters was not good enough anymore. From a monitoring point of view, it got worse. These days most server applications have a Web graphical user interface (GUI), and most traffic on the network is based on HTTP. In this case, traffic classification for deploying different service classes requires deep packet inspection, which some accounting techniques offer. Because of these changes, we need a new methodology to collect application-specific details, and accounting is the chosen technology. An example is Cisco Network-Based Application Recognition (NBAR), which is described in Chapter 10, "NBAR."
The collected accounting information can help you do the following:
Monitor and profile applications:
- In the entire network
- Over specific expensive links
Monitor application usage per group or individual user
Deploy QoS and assign applications to different classes of service
A collection of application-specific details is also very useful for network baselining. Running an audit for the first time sometimes leads to surprises, because more applications are active on the network than the administrator expected. Application monitoring is also a prerequisite for QoS deployment in the network. To classify applications in different classes, their specific requirements should be studied in advance, as well as the communication patterns and a traffic matrix per application. Real-time applications such as voice and video require tight SLA parameters, whereas e-mail and backup traffic would accept best-effort support without a serious impact.
The next question to address is how to identify a specific application on the network.
In most environments, applications fall into the following distinct categories:
Applications that can be identified by TCP or UDP port number. These are either "well-known" (0 through 1023) or registered port numbers (1024 through 49151). They are assigned by the Internet Assigned Numbers Authority (IANA).
Applications that use dynamic and/or private application port numbers (49152 through 65535), which are negotiated before connection establishment and sometimes are changed dynamically during the session.
Applications that are identified via the type of service (ToS) field. Examples such as voice and videoconferencing (IPVC) can be identified via the ToS value.
Subport classification of the following:
- HTTP: URLs, MIME (Multipurpose Internet Mail Extension) types or hostnames
- Citrix applications: traffic based on published application name
Classification based on the combination of packet inspection and multiple application-specific attributes. RTP Payload Classification is based on this algorithm, in which the packet is classified as RTP based on multiple attributes in the RTP header.
In some of these cases, deeper packet inspection is needed. This can be performed by Cisco NBAR, for example.
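The first two categories can be illustrated with a short Python sketch that maps a port number to its IANA range and, where possible, to an application name. The `KNOWN_APPS` table is a deliberately tiny, hand-picked subset and the function names are hypothetical; a real classifier would use the full IANA registry and, for the remaining categories, packet inspection.

```python
def port_category(port):
    """Map a TCP/UDP port number to its IANA range."""
    if not 0 <= port <= 65535:
        raise ValueError(f"invalid port number: {port}")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"

# Illustrative subset of IANA-assigned ports (assumption: not exhaustive).
KNOWN_APPS = {25: "smtp", 53: "dns", 80: "http", 161: "snmp", 443: "https"}

def classify(port):
    """Return (application, category); application is None when the port
    number alone does not identify the application in this small table."""
    return KNOWN_APPS.get(port), port_category(port)

examples = [classify(80), classify(8080), classify(50000)]
```

A result of `None` for the application name marks the traffic that needs one of the deeper classification methods listed above.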
Figure 1-7 displays traffic details per application, aggregated over time.
An alternative report would identify the various protocols on the network—for example, IPv4 traffic compared to IPv6 traffic or TCP versus UDP traffic. Figure 1-8 shows a protocol distribution.

Cisco IT performed a network audit to track the applications on the Cisco internal network, and it provided some interesting results. The following list of applications and protocols comprises about 80 percent of the total traffic that traverses the WAN:
HTTP
IP telephony
IP video
Server and PC backups
Video on demand (VoD)
Multicast
SNMP
Antivirus updates
Peer-to-peer traffic
Techniques to obtain the classification per application are RMON2, Cisco NetFlow, and Cisco NBAR. All three classify the observed traffic per application type. Chapter 5, "RMON," explains RMON; Chapter 7, "NetFlow," provides NetFlow details; and Chapter 10 covers NBAR.
A more advanced report could combine application-specific details and CoS information. A network planner can use such a report to isolate problems in a QoS-enabled environment (such as to detect when a certain class is almost fully utilized but the bandwidth cannot be increased). In this case, one or multiple applications could be moved to another class. For example, e-mail traffic could be reclassified from class 1 to class 2.
The next report, shown in Table 1-3, is based on Table 1-2, but it extends the level of detail by including some application-specific parts. For class 0, we are interested in the percentage of VoIP and non-VoIP traffic; in class 1 we distinguish between e-mail and SAP traffic (assuming that only these two applications get assigned to class 1). For the best-effort traffic in class 2 we distinguish between web traffic (HTTP), peer-to-peer-traffic, and the rest. This report cannot be compiled by retrieving SNMP data from the CISCO-CLASS-BASED-QOS-MIB, because it collects only counters per traffic class, not counters per application within a class. Hence, we leverage either NetFlow or RMON (Remote Monitoring MIB Extensions for Differentiated Services, RFC 3287) to gather the extra level of per-application details.
| Time (Hour) | Class 0 Packets | Class 0 Bytes | Class 0 Voice (Bytes) | Class 0 Other (Bytes) | Class 1 Packets | Class 1 Bytes | Class 1 E-mail (Bytes) | Class 1 SAP (Bytes) | Class 2 Packets | Class 2 Bytes | Class 2 HTTP (Bytes) | Class 2 Peer-to-Peer (Bytes) | Class 2 Other (Bytes) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 38 | 2735 | 264 | 2471 | 1300 | 59800 | 38870 | 20930 | 13 | 1002 | 752 | 100 | 150 |
| 1 | 55 | 3676 | 128 | 3548 | 400 | 44700 | 29055 | 15645 | 61 | 9791 | 8812 | 979 | 0 |
| 2 | 41 | 56661 | 780 | 55881 | 400 | 16800 | 10920 | 5880 | 4 | 240 | 216 | 24 | 0 |
| 3 | 13 | 1660 | 328 | 1332 | 200 | 8400 | 5460 | 2940 | 4 | 424 | 382 | 42 | 0 |
| 4 | 16 | 14456 | 128 | 14328 | 400 | 44700 | 29055 | 15645 | 4 | 420 | 378 | 42 | 0 |
| 5 | 19 | 2721 | 1164 | 1557 | 400 | 44400 | 28860 | 15540 | 10 | 480 | 48 | 48 | 384 |
| 6 | 21 | 24725 | 9856 | 14869 | 600 | 35600 | 23140 | 12460 | 516 | 20648 | 18583 | 2065 | 0 |
| 7 | 19 | 3064 | 2048 | 1016 | 700 | 412200 | 267930 | 144270 | 15 | 677 | 609 | 68 | 0 |
| 8 | 5 | 925 | 512 | 413 | 1200 | 176000 | 114400 | 61600 | 12 | 960 | 48 | 96 | 816 |
| 9 | 4 | 457 | 256 | 201 | 1300 | 104100 | 67665 | 36435 | 1242 | 1489205 | 1340285 | 148921 | 0 |
| 10 | 5 | 3004 | 1684 | 1320 | 1900 | 1091900 | 709735 | 382165 | 3 | 256 | 230 | 26 | 0 |
| 11 | 4 | 451 | 96 | 355 | 400 | 39800 | 25870 | 13930 | 545 | 22641 | 20377 | 2264 | 0 |
| 12 | 4 | 456 | 64 | 392 | 800 | 54200 | 35230 | 18970 | 1017 | 1089699 | 980729 | 108970 | 0 |
| 13 | 5 | 510 | 128 | 382 | 500 | 41600 | 27040 | 14560 | 36 | 3240 | 2916 | 324 | 0 |
| 14 | 4 | 455 | 416 | 39 | 400 | 99300 | 64545 | 34755 | 15 | 3287 | 2958 | 329 | 0 |
| 15 | 5 | 511 | 496 | 15 | 800 | 36800 | 23920 | 12880 | 685 | 27578 | 24820 | 2758 | 0 |
| 16 | 4 | 454 | 128 | 326 | 100 | 4000 | 2600 | 1400 | 3 | 144 | 130 | 14 | 0 |
| 17 | 4 | 457 | 256 | 201 | 500 | 309500 | 201175 | 108325 | 2 | 322 | 290 | 32 | 0 |
| 18 | 4 | 455 | 196 | 259 | 400 | 34100 | 22165 | 11935 | 4 | 192 | 173 | 19 | 0 |
| 19 | 5 | 3095 | 2048 | 1047 | 1300 | 104100 | 67665 | 36435 | 4 | 424 | 382 | 42 | 0 |
| 20 | 4 | 398 | 286 | 112 | 100 | 15200 | 9880 | 5320 | 4 | 424 | 382 | 42 | 0 |
| 21 | 5 | 1126 | 956 | 170 | 800 | 54200 | 35230 | 18970 | 12 | 936 | 842 | 94 | 0 |
| 22 | 7 | 782 | 612 | 170 | 1300 | 104100 | 67665 | 36435 | 4 | 835 | 752 | 84 | 0 |
| 23 | 9 | 7701 | 2096 | 5605 | 600 | 35600 | 23140 | 12460 | 2 | 235 | 212 | 24 | 0 |
Best practice suggests monitoring the network before implementing new applications. Taking a proactive approach means that you analyze the network in advance to identify how it deals with new applications and whether it can handle the additional traffic appropriately. A good example is the IP telephony (IPT) deployment. You can run jitter probe operations with Cisco IP SLA, identify where the network needs modifications or upgrades, and start the IPT deployment after all tests indicate that the network is running well. After the deployment, accounting records deliver ongoing details about the newly deployed service. These can be used for general monitoring of the service as well as troubleshooting and SLA examination.
Internet traffic increases on a daily basis. Different studies produce different estimates of how long it takes traffic to double. These estimates help us predict that today's network designs will not be able to carry the traffic five years from now. Broadband adoption is one major driver, as well as the Internet's almost ubiquitous availability. Recently, the Cisco internal IT department concluded that bandwidth consumption is doubling every 18 months.
This requires foresight and accurate planning of the network and future extensions. Enterprises and service providers should carefully plan how to extend the network in an economical way.
A service provider might consider the following:
Which point of presence (PoP) generates the most revenue?
Which access points are not profitable and should be consolidated?
In which segment is the traffic decreasing? Did we lose customers to the competition? What might be the reason?
An enterprise IT department might consider the following:
Which departments are growing the fastest? Which links will require an upgrade soon?
For which department is network connectivity business-critical and therefore should have a high-availability design?
These questions cannot be answered without an accurate traffic analysis, which requires a network baseline and continuously collected trend reports. Service providers and professional IT departments should go one step further and offer service monitoring to their customers. This approach can identify potential bottlenecks in advance. It also lets the provider proactively notify customers and offer more bandwidth, different QoS, higher availability, and so on.
Capacity planning can be considered from the link point of view or from the network-wide point of view. Each view requires a completely different set of collection parameters and mechanisms.
For link capacity planning, the interface counters stored in the MIB are polled via SNMP, and the link utilization can be deduced. This simple rule of thumb is sometimes applied to capacity planning. If the average link utilization during business hours is above 50 percent, it is time to upgrade the link! The link utilization is calculated with the MIB variables from the interfaces group MIB (RFC 2863).
Apply the following equation to calculate utilization:
input utilization = [Δ(ifInOctets) * 8 * 100] / [(number of seconds in Δ) * ifSpeed]
output utilization = [Δ(ifOutOctets) * 8 * 100] / [(number of seconds in Δ) * ifSpeed]
Note
On Cisco routers, the ifSpeed value is set by the bandwidth interface command. This bandwidth is a user-configurable value that can be set to any value for routing protocol metric purposes. You should set the bandwidth correctly and check the content of the BW (bandwidth) value in Kbps with the show interface command before doing any interface utilization calculations.
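The utilization equation can be expressed as a small Python function. This is a sketch under stated assumptions: the two octet values are taken from successive SNMP polls of `ifInOctets` or `ifOutOctets`, `ifSpeed` is in bits per second, and the counter wrapped at most once between polls. The function name and the sample values are hypothetical.

```python
def link_utilization(octets_start, octets_end, seconds, if_speed_bps,
                     counter_bits=32):
    """Percent utilization between two polls of ifInOctets or ifOutOctets."""
    if seconds <= 0 or if_speed_bps <= 0:
        raise ValueError("seconds and if_speed_bps must be positive")
    delta = octets_end - octets_start
    if delta < 0:
        # Assumption: the counter wrapped exactly once between the polls.
        delta += 2 ** counter_bits
    return delta * 8 * 100 / (seconds * if_speed_bps)

# Hypothetical polls 300 seconds apart on a 1-Mbps link.
utilization = link_utilization(0, 18_750_000, 300, 1_000_000)

# Same traffic volume, but the 32-bit counter wrapped between the polls.
wrapped = link_utilization(2**32 - 1000, 18_749_000, 300, 1_000_000)
```

Both calls yield 50 percent. On high-speed links a 32-bit counter can wrap more than once within a polling interval, so shorter intervals or the 64-bit counters (`ifHCInOctets`, `ifHCOutOctets`) are the safer choice.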
Some alarms, such as a trap or a syslog message, may be sent to the fault management application to detect a threshold violation. When we use accounting information for fault management, we enter the world of performance management, whose applications are described later in this chapter.
Link capacity planning might be enough in most cases when a network administrator knows about a bottleneck in the network. After a link's bandwidth is upgraded, the network administrator should identify the next bottleneck; this is a continuous process! In addition, most networks are designed with economical justifications, which means that very little overprovisioning is done. The term "network overprovisioning" describes an abundance of bandwidth in the network, so that under normal circumstances, performance limitations are not caused by a lack of link capacity. Put another way, the only restriction that one application sees when communicating with another application is in the network's inherent physical limitations. In contrast, the term "network over-subscription" describes a network design with more traffic than bandwidth. This means that, even under normal circumstances, not enough bandwidth is provided for all users to use the network to perform their tasks at the same time, using their maximum allocated bandwidth. The network over-subscription concept is obviously a more cost-effective approach than network overprovisioning, because it assumes that not all users will use their fully dedicated bandwidth at the same time. However, capacity planning is more complex in this case. The computation of what constitutes adequate provisioning, without gross overprovisioning, depends on accurate core capacity planning along with realistic assumptions about what group of users will use what applications and services at key time periods.
Another approach is to do networkwide capacity planning by collecting the "core traffic matrix." The core traffic matrix is a table that provides the traffic volumes between the origin and destination in a network. To collect this for all the network's entry points, we need usage information (in number of bytes and/or number of packets per unit of time) per exit point in the core network.
Figure 1-9 shows the required bandwidth from Rome to all the other PoPs; Table 1-4 shows the results.
| | Rome Exit Point | Paris Exit Point | London Exit Point | Munich Exit Point |
|---|---|---|---|---|
| Rome Entry Point | NA[*] | ... Mbps | ... Mbps | ... Mbps |
| Paris Entry Point | ... Mbps | NA[*] | ... Mbps | ... Mbps |
| London Entry Point | ... Mbps | ... Mbps | NA[*] | ... Mbps |
| Munich Entry Point | ... Mbps | ... Mbps | ... Mbps | NA[*] |
[*] Not applicable. Traffic entering a specific entry point and leaving at the same point is not null. Each PoP contains more than a single interface and more than a single router, so the traffic local to the PoP is a reality. Nevertheless, this traffic is not taken into account for the capacity planning of the links inside the network.
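To make the structure of the core traffic matrix concrete, the following Python sketch aggregates usage records into an average rate per entry/exit pair. The record layout `(entry_point, exit_point, bytes)`, the 60-second collection interval, and all byte counts are assumptions for illustration; the sketch does not reflect the export format of any specific accounting feature.

```python
from collections import defaultdict

def build_traffic_matrix(records, interval_seconds):
    """Aggregate (entry, exit, byte_count) records into Mbps per PoP pair."""
    totals = defaultdict(int)
    for entry, exit_point, byte_count in records:
        if entry == exit_point:
            # PoP-local traffic is left out, as in the table footnote.
            continue
        totals[(entry, exit_point)] += byte_count
    return {pair: b * 8 / interval_seconds / 1e6 for pair, b in totals.items()}

# Hypothetical records collected over one 60-second interval.
sample_records = [
    ("Rome", "Paris", 450_000_000),
    ("Rome", "Paris", 300_000_000),
    ("Rome", "Rome", 900_000_000),
    ("London", "Munich", 150_000_000),
]
matrix = build_traffic_matrix(sample_records, interval_seconds=60)
```

With these sample values, the Rome-to-Paris cell is 100 Mbps and the London-to-Munich cell is 20 Mbps; pairs without records are simply absent from the dictionary.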
The capacity planning can be done by mapping the core traffic matrix to the topology information. Because we know that the traffic from Rome to Paris usually takes the direct link(s) between Rome and Paris, the link dimensioning can be deduced easily in the case of simple design. Nevertheless, mapping the core traffic matrix to the routing information, typically the routing protocol's link state database, eases the process. A lookup in the routing table/link state database returns the path taken by the traffic from Rome to Paris. In addition, capacity planning can provide future projections, such as "What happens if the overall traffic grows 5 percent per month over the next year?" or "What happens if I expect the number of customers in Rome to double in the next six months?"
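Growth projections of this kind reduce to compound-growth arithmetic. The sketch below assumes a constant monthly growth rate and reuses the 50 percent utilization rule of thumb cited earlier as the upgrade trigger; the function names, the 100-Mbps starting load, and the 400-Mbps link capacity are hypothetical.

```python
def project_load(current_mbps, monthly_growth, months):
    """Load after `months` of compound growth at `monthly_growth` (0.05 = 5%)."""
    return current_mbps * (1 + monthly_growth) ** months

def months_until_threshold(current_mbps, capacity_mbps, monthly_growth,
                           threshold=0.5):
    """Whole months until the load exceeds threshold * capacity.

    Returns 0 if the load is already above the limit and None if the
    traffic is not growing.
    """
    limit = capacity_mbps * threshold
    if current_mbps > limit:
        return 0
    if monthly_growth <= 0:
        return None
    months = 0
    load = current_mbps
    while load <= limit:
        months += 1
        load = project_load(current_mbps, monthly_growth, months)
    return months

# Hypothetical link: 100 Mbps today, 5 percent growth per month.
load_in_one_year = project_load(100, 0.05, 12)
upgrade_in = months_until_threshold(100, 400, 0.05)
```

At 5 percent per month the load grows by roughly 80 percent in a year (about 179.6 Mbps here), and the hypothetical 400-Mbps link crosses the 50 percent mark after 15 months.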
Which accounting mechanisms will help us create the core traffic matrix? First, we can try to deduce the core traffic matrix with the SNMP interface counters, but we see immediately that we have to solve an N²-level problem with an N-level solution, where N is the number of entry points in the network. Consequently, this approach works only in a particularly well-structured and simple network design. Even then, it is not very accurate.
A different approach is to combine the SNMP interface counters with another mechanism, such as the gravity model, which assumes that the traffic between sites is proportional to traffic at each site.
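One common formulation, often called the simple gravity model, estimates the traffic from entry point i to exit point j as (traffic entering at i) × (traffic leaving at j) / (total traffic). The Python sketch below applies that formula; it is an illustration of the idea rather than the method used by any specific tool, and the per-PoP totals are invented values. For simplicity it also produces entries where entry and exit point are the same.

```python
def gravity_matrix(entering_mbps, leaving_mbps):
    """Estimate the traffic matrix from per-site totals only.

    entering_mbps[i]: traffic entering the core at site i (from SNMP counters)
    leaving_mbps[j]:  traffic leaving the core at site j
    """
    total = sum(leaving_mbps.values())
    if total <= 0:
        raise ValueError("total leaving traffic must be positive")
    return {
        (src, dst): t_in * t_out / total
        for src, t_in in entering_mbps.items()
        for dst, t_out in leaving_mbps.items()
    }

# Hypothetical per-PoP totals in Mbps (illustrative values only).
entering = {"Rome": 100, "Paris": 200, "London": 300, "Munich": 400}
leaving = {"Rome": 100, "Paris": 200, "London": 300, "Munich": 400}
estimate = gravity_matrix(entering, leaving)
rome_row_total = sum(v for (src, _dst), v in estimate.items() if src == "Rome")
```

Each row of the estimate sums to the traffic entering at that site, so the result stays consistent with the interface counters even though the individual cells are only approximations.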
Some researchers even propose changing the routing protocol metrics to infer the core traffic matrix by looking at the delta of the interface counters before and after the routing protocol metrics change. The tomography model (the estimation of the core traffic matrix) is getting more and more attention from the research community nowadays, at events like the Internet Measurement Conference (http://www.acm.org/sigcomm/imc/), Passive and Active Measurement (http://www.pam2006.org/), and the INTIMATE workshops (Internet Traffic Matrices Estimation, http://adonis.lip6.fr/intimate2006/). In addition, more and more tomography white papers are being published on the Internet. Many researchers are trying to find a good balance between producing an accurate collection of all data records (which implies drawbacks such as network element CPU and memory increases, amount of collected information, and so on) and a simpler accounting method that produces an approximate traffic matrix. This book concentrates on a set of accounting features in Cisco routers and switches, including Cisco NetFlow services and BGP Policy Accounting (see Chapter 8, "BGP Policy Accounting"). Note that the core traffic matrix is interesting not only for capacity planning, but also for traffic engineering, as discussed in the next section.
A good analogy for understanding traffic engineering is examining vehicle traffic patterns. There are several highways in the area where I live, so I have several options to get to the airport or office. Based on the day of the week and time of day, I choose a different road to get to my destination. On a typical rainy Monday morning, I should avoid the freeway, because it is not free at all but jammed, so I take inner-city roads. The freeway on Tuesday night is usually perfectly suited for high speed. However, I might not want to pay tolls at a certain bridge or tunnel, so I choose an alternate route. In addition, I always check the radio for traffic announcements to avoid accidents, serious road construction, and so on. I could even go one step further and check the Internet for road traffic statistics before starting my journey. Most of us consider at least some of these options when driving. This analogy applies to network architects, modeling data traffic on the network for traffic engineering. Continuing the analogy, accounting data would be the traffic information, and an exceeded link utilization threshold would be the equivalent of a traffic report on the radio about a traffic jam.
The IETF Internet Traffic Engineering Working Group (TEWG) provides a very technical definition of traffic engineering:
"Internet Traffic Engineering is defined as that aspect of Internet network engineering concerned with the performance optimization of traffic handling in operational networks, with the main focus of the optimization being minimizing over-utilization of capacity when other capacity is available in the network. Traffic Engineering entails that aspect of network engineering, which is concerned with the design, provisioning, and tuning of operational Internet networks. It applies business goals, technology and scientific principles to the measurement, modeling, characterization, and control of Internet traffic, and the application of such knowledge and techniques to achieve specific service and performance objectives, including the reliable and expeditious movement of traffic through the network, the efficient utilization of network resources, and the planning of network capacity."
As soon as we have the core traffic matrix, we might start some network simulation, such as capacity planning. We want to go one step further and associate the notion of simulations (traffic profiling, service, network failure) with the core traffic matrix. If we assume that the network has been designed so that SLAs are respected under normal traffic conditions, what is important for a service provider? The service provider would like to know the consequences on the SLA (which is the direct impact for the customers) in cases of extra load in the network, a link failure, or a router reload—in other words, under abnormal conditions. This is called simulating what-if scenarios.
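A what-if scenario can be sketched in a few lines of Python. The example below assumes a four-PoP ring topology, 100-Mbps links, two traffic demands, and hop-count shortest-path routing; all of these are simplifying assumptions chosen for illustration (a real IGP routes on configured metrics, and real engineering tools model far more detail).

```python
from collections import deque

def shortest_path(links, src, dst):
    """Hop-count shortest path via breadth-first search; None if unreachable."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        # Sorted order makes the tie-break between equal paths deterministic.
        for nxt in sorted(neighbors.get(path[-1], [])):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def link_utilization_percent(capacity_mbps, demands_mbps):
    """Route each demand on its shortest path; return percent load per link."""
    load = {link: 0.0 for link in capacity_mbps}
    for (src, dst), mbps in demands_mbps.items():
        path = shortest_path(list(capacity_mbps), src, dst)
        if path is None:
            raise ValueError(f"no path from {src} to {dst}")
        for a, b in zip(path, path[1:]):
            link = (a, b) if (a, b) in load else (b, a)
            load[link] += mbps
    return {link: 100 * load[link] / capacity_mbps[link] for link in load}

# Hypothetical ring topology and demands (illustrative values only).
capacity = {("Rome", "Paris"): 100, ("Paris", "London"): 100,
            ("London", "Munich"): 100, ("Munich", "Rome"): 100}
demands = {("Rome", "Paris"): 60, ("Rome", "London"): 50}

normal = link_utilization_percent(capacity, demands)

# What if the direct Rome-Paris link fails?
failed_capacity = {l: c for l, c in capacity.items() if l != ("Rome", "Paris")}
failed = link_utilization_percent(failed_capacity, demands)
overloaded = sorted(link for link, pct in failed.items() if pct > 100)
```

Under normal conditions no link exceeds 60 percent. After the simulated failure, the Munich-Rome and London-Munich links each carry 110 percent of their capacity, which is exactly the kind of SLA risk a what-if analysis is meant to reveal.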
Networks have well-defined boundaries, set by the network administrator. From a traffic matrix perspective, it is relevant to know how much traffic stays within the boundaries of the core network and how much traffic is crossing them ("on-net" versus "off-net" traffic). More details might be of interest, such as identifying traffic from various sites or traffic destined for certain hosts or ingress and egress traffic per department. Traffic profiling is a prerequisite for network planning, device and link dimensioning, trend analysis, building specific business models, and so on. Traffic engineering has the objective of optimizing network resource utilization and traffic performance.
The core traffic matrix per CoS can be analyzed by an engineering tool. In this example, three classes of service are defined, each representing a specific service:
CoS 1, VoIP traffic
CoS 2, Business-critical traffic
CoS 3, Best-effort traffic
By simulating the failure conditions just described, the service provider can answer questions such as the following: If this particular link fails, all traffic will be rerouted via a different route in the network, but will the VoIP traffic respect the SLA agreed to by the customers? Will it be the same for the business-critical traffic where SLAs are slightly lower? In addition, will the best-effort traffic still be able to go through the network, even though no SLA is assigned to it?
After answering those questions, we can accomplish capacity planning per class of service:
"What happens if video traffic grows constantly by 5 percent per month over the next year?"
"If voice traffic increases by 10 percent per year, where will the bottleneck be?"
"Should I offer new services? Where should I start offering them?"
If the traffic is modeled correctly, simulation applications can benefit the network planner by analyzing network performance under different conditions.
The last step, after traffic measurement, classification, and simulation, is to assign traffic to interfaces and network links. Based on the simulations applied in the previous step, we can assign specific traffic to network links. Most network designs consist of redundant connections between sites and meshed links between the different locations to avoid single points of failure. We can leverage this additional connectivity to distinguish between different types of applications:
Business-critical traffic is assigned to premium connections.
Less-critical application data can be sent via the backup or cheaper links.
Best-effort traffic gets transmitted on the links' remaining bandwidth.
In case of an outage, we might consider blocking the less-critical applications and reserving the bandwidth for the relevant applications. Figure 1-10 illustrates this design. Traffic from Rome to Paris is split into a business-critical part, which is transmitted via the direct link, with SLAs in place, and a best-effort part, which is sent via Munich and London, where more bandwidth is available.
Several traffic engineering approaches exist:
Changing the Interior Gateway Protocol (IGP) metrics
Changing the Border Gateway Protocol (BGP) metrics
Inserting traffic engineering tunnels for an MPLS core
Introducing high-availability software features or hardware components
Traffic engineering is essential for service provider backbones. Such backbones offer high transmission capacity and require resilience so that they can withstand link or node failures. WAN connections are an expensive item in an Internet service provider's (ISP) budget. Traffic engineering enables the ISP to route network traffic so that it can offer the best service quality to its users in terms of throughput and delay. Traffic engineering accounts for link bandwidth and traffic flow size when determining explicit routes across the backbone.
For specific details about MPLS traffic engineering, refer to the informational RFC 2702, Requirements for Traffic Engineering over MPLS.
We will limit this book to collecting the core traffic matrix; we will not delve into the details of traffic engineering solutions.
Even the largest ISPs own only a fraction of the Internet routes and therefore need to cooperate with other ISPs. To ensure that all destinations of the Internet can be reached, ISPs enter into peering agreements with other ISPs (with other Autonomous Systems [AS] that can be reached using the Border Gateway Protocol [BGP]). BGP is one of the core routing protocols in the Internet. It works by maintaining a table of IP networks or prefixes that designate network reachability between autonomous systems. An AS is a collection of IP networks under control of a single entity—typically an ISP or a very large organization with redundant connections to the rest of the Internet.
The act of peering can be done as follows:
Private peering— This is a relationship in which two ISPs equally provide access to each other's networks without charging. Usually the interconnection is constrained to the exchange of traffic between each other's customers or related companies. No traffic between other third-party networks is allowed to transit the interconnection.
Peering via an Internet Exchange Point (IXP)— An IXP is a physical network infrastructure independent of any single provider that allows different ISPs to exchange Internet traffic between their AS by means of mutual peering agreements. IXPs are typically used by ISPs to reduce dependency on their respective upstream providers. Furthermore, they increase efficiency and fault tolerance.
Transit (or customer-provider relationship)— This relates to traffic exchange between an ISP and another carrier. In contrast to private peering traffic, ISPs pay for the transit traffic to be successfully routed to its destination.
Tier 1 ISPs operate global backbone transit networks and together hold all the world's Internet routes. They tend to have private peering without charging each other to give each other access to all Internet routes. Tier 2 ISPs operate national wholesale transit networks and hence buy connectivity (upstream transit) to the worldwide Internet routes from one or more tier 1 ISPs. Thus, their IP network(s) becomes a subset of those tier 1 IP networks. This is described as a customer-provider relationship. Tier 2 ISPs also peer with each other to minimize the amount of traffic exchanged with tier 1 ISPs from whom they buy upstream transit. Tier 3 ISPs, such as local access ISPs, acquire upstream transit from tier 2 ISPs. The hierarchy model becomes increasingly vague at the tier 3 level, because a tier 3 ISP may buy upstream transit from both a tier 1 ISP and a tier 2 ISP, and may peer with tier 2 and tier 3 ISPs and occasionally with a tier 1 ISP.
Peering as a customer-provider relationship is most common at the bottom tiers of the Internet business. Note that this is not a true peering relationship; rather, the customer pays for transit via his or her upstream ISP. Service providers with smaller traffic volumes tend to converge at an IXP, which provides them with a commercially neutral venue for peering.
Figure 1-11 illustrates transit and peering agreements. To provide Internet access to all its customers, the tier 2 ISP-1 signs a transit agreement with the tier 1 ISP-B. Because ISP-B covers only a part of the whole Internet, it signs private peering agreements with other tier 1 ISPs (in this example, ISP-A and ISP-C). Similar to ISP-1, ISP-2 signs a transit agreement, but with the tier 1 ISP-A. According to the private peering agreement between ISP-A and ISP-B, ISP-B will transport all traffic from ISP-2 to ISP-1 free of charge, and ISP-A does the same for traffic from ISP-1 to ISP-2. Note that the private peering agreement between ISP-A and ISP-B does not include any traffic sent via ISP-C toward ISP-2 or ISP-1 by default, but the contract can be extended to incorporate this.
From a technical perspective, transit and peering agreements are controlled by exchanging routing table entries. A transit agreement covers the exchange of all routing table entries, and a private peering agreement includes only the routing table entries of related customers.
From an accounting perspective, an ISP is usually not interested in accounting for end systems outside of its administrative domain. The primary concern is accounting for the level of traffic received from and destined for other adjacent (directly connected) administrative domains. In Figure 1-11, ISP-B sends an invoice to ISP-1 for the usage of its network. If ISP-1 wants to allocate the charges applied by ISP-B to end users or subsystems in its domain, it is the responsibility of ISP-1 to collect accounting records with finer granularity. These can be used to charge individual users later. Because each provider cares about its direct neighbors, only the provider that charges the end customer needs to collect granular records per user. This accounting scheme is called recursive accounting. It can be applied within an administrative domain if this domain consists of several subdomains.
The private peering agreement should be fair and equitable. Private peering works best when two service providers pass a roughly equal amount of traffic between them, making the deal cost-efficient for both sides. So, even if no charging is involved, two ISPs bound by a private peering agreement would like to compare the volume of traffic sent versus the volume of traffic received. When optimizing peering exchange points with other autonomous systems, there is a need to determine where traffic is coming from, to make the appropriate routing policy changes, to plan, and in some cases to charge a service provider for excess traffic routed.
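The comparison of sent versus received volume can be sketched in a few lines of Python. This is only an illustration: the per-neighbor byte totals, the private-use AS numbers, and the 2:1 threshold are assumptions made for the example, not values from a real peering contract.

```python
def peering_ratio(bytes_sent: int, bytes_received: int) -> float:
    """Return the larger direction divided by the smaller one (1.0 means balanced)."""
    low, high = sorted((bytes_sent, bytes_received))
    if high == 0:
        return 1.0           # no traffic in either direction
    if low == 0:
        return float("inf")  # traffic flows in one direction only
    return high / low


def is_balanced(bytes_sent: int, bytes_received: int, max_ratio: float = 2.0) -> bool:
    """Hypothetical fairness rule: neither direction exceeds max_ratio times the other."""
    return peering_ratio(bytes_sent, bytes_received) <= max_ratio


# Hypothetical monthly totals (sent, received) in bytes per neighbor AS.
peers = {
    65001: (900_000_000, 1_000_000_000),
    65002: (4_000_000_000, 500_000_000),
}
report = {asn: is_balanced(sent, received) for asn, (sent, received) in peers.items()}
```

In this sketch, the second neighbor would be flagged for review because one direction carries eight times the volume of the other.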
Transit traffic is related to a fee, so one of the downstream provider's primary tasks is to check the bill sent by the upstream provider. On the other hand, ISPs usually place a higher priority on revenue-generating traffic. Hence, the upstream ISPs want to monitor the traffic patterns to ensure that the transit traffic is indeed getting preferential quality of service without severely affecting the peering traffic.
From an ISP's accounting perspective, there is always a need to classify the traffic per BGP AS, as shown in Figure 1-12. This figure shows an ISP's BGP neighbors and the percentage of traffic exchanged. Determining additional peering partners is an important task. Existing service provider peering relationships may not provide the required Internet coverage. By understanding the destination and source demands of the traffic and the corresponding volume, you can make decisions about possible peering relationships with other service providers.

By analyzing the traffic matrix, an ISP might conclude that it is not peering with the right neighbor.
The sections "Capacity Planning" and "Traffic Profiling and Engineering" covered the advantages of the core traffic matrix. Let us make a further distinction between the internal core traffic matrix and the external core traffic matrix. The internal matrix was defined previously. It's a table providing the traffic volumes between origin and destination in a network, where origin and destination are the network's entry and exit points (typically a router in the PoP). The external core traffic matrix also offers the traffic volumes between origin and destination in a network, but in this case the origin and destination are not only the network's entry points and exit points to analyze, but also the source BGP AS and destination AS. On top of the internal traffic matrix, the external traffic matrix returns information about where the traffic comes from before entering the ISP network and where it goes after exiting.
What are the advantages of analyzing the external core traffic matrix?
First, an ISP might decide whether it is peering with the right neighbor AS. In Figure 1-13, the ISP-1 external core traffic matrix might identify that most of the traffic sent to ISP-B is actually destined for ISP-A, so ISP-1 can potentially negotiate a transit peering agreement directly with ISP-A. In-depth analysis might prove that a big percentage of the traffic sent to ISP-A is targeted for ISP-2, so ISP-1 could consider a private peering agreement with ISP-2. More specifically, statements such as "ISP-1 is receiving an equivalent amount of traffic from ISP-2" can quickly be proven or disproven by analyzing the external core traffic matrix.
Second, the question "Is my network being used for transit or peering traffic?" will be resolved by the external core traffic matrix, because we can easily conclude whether the traffic is on-net or off-net. Normally, a priority is placed on keeping traffic within the network (on-net) to save money versus being sent to another service provider for a fee (off-net).
Third, when ISPs want to change either the peering agreements or the exit point for specific Internet routes, they first think about the implications of such changes on their network. Questions arise, such as "What about capacity planning? Will some links be overloaded as a consequence of the changes?", "What about customers' SLAs? Will they still be respected?", and "What about the traffic engineering setup? Should it be changed?" The combined inputs from the external core traffic matrix, the topology, and routing protocol information, along with an appropriate capacity planning and traffic-engineering application, offer valuable details in these situations.
Based on Figure 1-13, the external core traffic matrix for Rome is described in Table 1-5. Consider a practical example: The link between Rome and Paris is heavily loaded, so the ISP is evaluating the possibility of sending the traffic via London, because the link between Rome and London has available bandwidth. One way would be to forward all the traffic from Rome to ISP-2 via London, because the London PoP also has a direct link to ISP-2. Based on the external core traffic matrix, this traffic profile is identified. Next, we want to know the consequences for the core capacity planning. Again, the external core traffic matrix offers the right input for a traffic engineering tool, because it contains both the ISP exit point and the destination ISP. The tool will be able to quantify the load decrease for the Rome-Paris link and the increase for the Rome-London link.
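The rerouting question above can be illustrated with a minimal Python sketch. The flow records, their byte counts, and the source AS label are invented for the example and do not reproduce Table 1-5. The sketch also assumes that traffic entering at one PoP and leaving at another uses the direct link between them, which a real traffic engineering tool would derive from topology and routing information.

```python
from collections import defaultdict

# Hypothetical flow records: (entry PoP, exit PoP, source AS, destination AS, bytes).
flows = [
    ("Rome", "Paris", "AS65010", "ISP-2", 400),
    ("Rome", "Paris", "AS65010", "ISP-A", 250),
    ("Rome", "London", "AS65010", "ISP-2", 100),
    ("Rome", "Paris", "AS65010", "ISP-2", 200),
]


def build_external_matrix(records):
    """Sum bytes per (entry PoP, exit PoP, source AS, destination AS)."""
    matrix = defaultdict(int)
    for entry, exit_pop, src_as, dst_as, nbytes in records:
        matrix[(entry, exit_pop, src_as, dst_as)] += nbytes
    return dict(matrix)


def volume_to_move(matrix, entry, old_exit, dst_as):
    """Bytes that currently leave via old_exit toward dst_as and would be rerouted."""
    return sum(
        volume
        for (e, x, _src, d), volume in matrix.items()
        if e == entry and x == old_exit and d == dst_as
    )


matrix = build_external_matrix(flows)
moved = volume_to_move(matrix, "Rome", "Paris", "ISP-2")
```

Under these assumptions, `moved` is the load that would disappear from the Rome-Paris link and appear on the Rome-London link. The internal matrix alone could not answer this, because it lacks the destination AS.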
Finally, peering via an IXP entails a particular accounting requirement. An ISP connected to an IXP can exchange traffic with any other ISP that is connected to this IXP. An IXP infrastructure is usually based on switched Ethernet components (shared medium), where one switch port is dedicated per ISP. Because only a single physical connection to the IXP is required, this design solves the scalability problem of individual interconnections between all ISPs. However, an extra accounting requirement is to account for the traffic per MAC address, because the MAC address identifies the entry point of the neighbor BGP AS.
Figure 1-14 displays two scenarios in which five ISPs interconnect. On the left side, for a full-mesh setup, each ISP needs four connections (n – 1). On the right side, only a single link to the IXP is necessary.
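The scaling difference between the two designs follows from simple counting, sketched here in Python. The function names are illustrative only.

```python
def links_per_isp_full_mesh(n: int) -> int:
    """Each ISP connects to every other ISP: n - 1 links."""
    return n - 1


def total_links_full_mesh(n: int) -> int:
    """Every pair of ISPs shares one link: n * (n - 1) / 2 links in total."""
    return n * (n - 1) // 2


def total_links_ixp(n: int) -> int:
    """Each ISP needs a single link to the exchange point."""
    return n


n = 5
summary = (links_per_isp_full_mesh(n), total_links_full_mesh(n), total_links_ixp(n))
```

For the five ISPs of the example, a full mesh needs 4 links per ISP and 10 links overall, whereas the IXP design needs 5. The gap widens quadratically as more ISPs join.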

Which Cisco accounting features are useful for BGP peering agreement applications? A NetFlow-based approach classifies and accounts for traffic by source and destination BGP AS. On top of this, the BGP Policy Accounting feature supports regular expression manipulation on metrics involving the AS path, BGP community, and other attributes as additional classifications. For Layer 2 accounting per MAC address, the "IP accounting MAC-address" feature should be evaluated.
We've already mentioned the tight relationship between accounting and billing, so we will now consider the differences between the two areas. Accounting describes the process of measuring and collecting network usage parameters from network devices or application servers, and billing is an application that makes use of these well-formatted usage records. A raw data collection cannot be sent to the user as an invoice, because it contains technical details such as IP addresses instead of usernames or servers, usage data records occur multiple times without a relationship to each other, and so on. Instead, raw data needs aggregation, mediation, and de-duplication to be applied first to transform data into useful information for the customers. We will illustrate this through an example. If a user establishes a connection to an application server (for example, a database) at the start of the business day and works on the same server until the end of the business day, and if we collect these records at a device level, the result would be a collection of usage data records:
Multiple records representing traffic from the client to the server, because no router or switch keeps statistics over several hours, due to memory resource limitations and the risk of losing records during an outage. Users expect a consolidated report, usually aggregated per hour of the day.
Response traffic from the server to the client creates additional records, probably collected at a different network device or device interface. A user would expect a merged statement that still contains traffic per direction, but in a single statement instead of several.
Device records contain IP addresses of the client and server; note that the client address especially might change over time. Users expect to be identified by names, not IP addresses, especially in a changing environment where IP addresses are assigned by a DHCP server. Hence, these raw records are processed to meet user needs.
Records also consist of communication protocol and port details, but no user is interested in receiving a receipt that contains port 80 as resource usage element. Instead, users expect to see names of common applications, such as web traffic, VoIP traffic, and e-mail traffic.
Time stamps at the device level can be the device's sysUpTime (an SNMP MIB variable that defines the time since the device booted), or the time since 1970 (Coordinated Universal Time [UTC], including the time zone), or the current date and time, as represented by the Network Time Protocol (NTP). Users do not want the start and stop times on the invoice displayed in a format such as "1079955900," but instead in a human-readable time-and-date format.
Depending on the path the traffic traverses through the network, multiple records might be collected, containing the same user traffic information. De-duplication keeps the user from being charged twice for the same transmitted traffic.
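Some of the transformations listed above can be sketched in Python. The lookup tables here are placeholders: a real deployment would populate the address-to-user mapping from DHCP or AAA logs, and the port-to-application table is a small illustrative subset. The field names of the raw record are also assumptions.

```python
from datetime import datetime, timezone

# Illustrative subset of well-known ports mapped to application names.
PORT_TO_APP = {80: "web", 25: "e-mail", 5060: "VoIP signaling"}
# Hypothetical address-to-user table, for example built from DHCP or AAA logs.
IP_TO_USER = {"10.1.1.20": "alice"}


def readable_time(epoch_seconds: int) -> str:
    """Convert seconds since 1970 (UTC) into a human-readable date and time."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


def normalize(record: dict) -> dict:
    """Replace technical details of a raw usage record with user-facing values."""
    port = record["dst_port"]
    return {
        "user": IP_TO_USER.get(record["src_ip"], record["src_ip"]),
        "application": PORT_TO_APP.get(port, f"port {port}"),
        "start": readable_time(record["start"]),
        "bytes": record["bytes"],
    }


raw = {"src_ip": "10.1.1.20", "dst_port": 80, "start": 1079955900, "bytes": 4200}
statement_line = normalize(raw)
```

With this sketch, the time stamp "1079955900" from the text becomes "2004-03-22 11:45:00 UTC", and port 80 is reported as web traffic.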
An alternative to collecting usage records at the device is to collect usage records at the server level. If users need to authenticate at a server or using a service, before using applications, this authentication process could also collect billing information, such as
Logon time
Logoff time
Number of transactions performed (for example, from a booking system)
Number of database requests and responses
CPU usage
In the preceding example, we only need to collect these usage records at the application server. The advantage of this approach is that no correlation of IP address and user needs to be performed. But the additional billing functionalities need to be implemented at each application server, which increases the overhead at the application level. A network consists of multiple applications and servers, so a server accounting function is required at every server in the network. This would be best implemented as a single sign-on solution, because users are usually very unhappy if they need to authenticate at every application individually. Figure 1-15 shows a scenario with server-based accounting.
Another method is to use a AAA server, where user authentication occurs once and total time and usage-based details are collected per user. However, the details collected with this approach are not as granular as the per-service accounting scenario just described, where service-specific accounting per server (for example, CPU, database requests, and so on) can be collected. A clear advantage of the AAA server is that all accounting records are centrally created and need not be collected from various servers or network devices. Figure 1-16 illustrates a Remote Authentication Dial-In User Service (RADIUS) accounting scenario.
For a billing solution, the following steps are necessary:
Data collection— Measuring the usage data at the device level.
Data aggregation— Combining multiple records into a single one.
Data mediation— Converting proprietary records into a well-known or standard format.
De-duplication— Eliminating duplicate records.
Assigning usernames to IP addresses— Performing a DNS and DHCP lookup and getting additional accounting records from AAA servers.
Calculating call duration— Combining the data records from the devices with RADIUS session information and converting sysUpTime entries to time of day and date of month, related to the user's time zone.
Charging— Assigning nonmonetary cost metrics to the accounting data based on call duration, transmitted data volume, traffic per class of service, and so on. Charging policies define tariffs and parameters to be applied.
Invoicing— Translating charging information into monetary units and printing a final invoice for the customer. The form of invoicing is selected—whether it should be itemized or anonymized, electronically transmitted or sent by mail, or combine multiple users into a single overall bill (for example, for corporate customers). In addition, billing policies are applied, such as invoicing or charging a credit card.
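Three of these steps (de-duplication, aggregation, and the sysUpTime conversion) can be sketched in Python. The record fields, the de-duplication key, and the hourly aggregation interval are assumptions made for illustration. The only protocol fact used is that sysUpTime counts hundredths of a second since the device booted.

```python
from collections import defaultdict


def deduplicate(records):
    """Keep one copy of a flow reported by several devices along the path.

    Assumption: records with the same endpoints, port, and start time
    describe the same traffic.
    """
    unique = {}
    for record in records:
        key = (record["src"], record["dst"], record["dst_port"], record["start"])
        unique.setdefault(key, record)
    return list(unique.values())


def aggregate_per_hour(records):
    """Sum bytes per (source, start of hour in epoch seconds)."""
    totals = defaultdict(int)
    for record in records:
        hour_start = record["start"] // 3600 * 3600
        totals[(record["src"], hour_start)] += record["bytes"]
    return dict(totals)


def uptime_to_epoch(boot_epoch: int, sysuptime_ticks: int) -> float:
    """Convert sysUpTime (hundredths of a second since boot) to epoch seconds."""
    return boot_epoch + sysuptime_ticks / 100


records = [
    {"src": "10.1.1.20", "dst": "10.2.2.5", "dst_port": 80, "start": 1079955900, "bytes": 1000, "device": "R1"},
    {"src": "10.1.1.20", "dst": "10.2.2.5", "dst_port": 80, "start": 1079955900, "bytes": 1000, "device": "R2"},
    {"src": "10.1.1.20", "dst": "10.2.2.5", "dst_port": 80, "start": 1079956500, "bytes": 500, "device": "R1"},
]
hourly = aggregate_per_hour(deduplicate(records))
```

The record seen at both R1 and R2 is counted once, so the user is charged for 1500 bytes in that hour rather than 2500.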
Figure 1-17 shows the distinction between accounting and billing in the case of device-level accounting, such as Cisco NetFlow services.
In the current service provider market, usage-based billing in an IP network has a clear competitive advantage over a flat-rate billing model. Increasingly, service providers are creating revenue-generating services by offering flexible pricing models for differentiated value-added services and applications. Competitive pricing models can also be created with usage-based billing. Regardless of the enhanced services and their corresponding pricing model, billing records must be exact, and accuracy is mandatory. With the correct technologies, service providers can rapidly develop, price, and provision these new services. Because a flat-rate billing model is not always the provider's preferred choice, new services can be based on the following usage-based billing considerations:
Volume/bandwidth usage— The more traffic that is sent, the greater the bill will be.
Distance-based— If the customer traffic destination remains in the same city or region, sending the traffic can be cheaper than sending the traffic to a different country or continent. Depending on the service provider's network and design, the price can also differ based on whether the traffic remains in the service provider's network (on-net) or leaves it (off-net).
Application and/or per class of service— Most customers will pay a higher price for the VoIP traffic if SLAs are linked to this specific traffic type. They would pay a medium price for the VPN traffic and a very cheap price for the best-effort traffic.
Time of day— Traffic sent during the night is cheaper than traffic sent during working hours.
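The four considerations above can be combined into a single rating function, sketched here in Python. All prices, factors, and the night-time window are invented for illustration and expressed in abstract charge units, not in currency.

```python
from decimal import Decimal

# Hypothetical tariff table: charge units per gigabyte and class of service.
PRICE_PER_GB = {
    "voip": Decimal("3.00"),
    "vpn": Decimal("1.50"),
    "best-effort": Decimal("0.20"),
}
OFF_NET_FACTOR = Decimal("2")   # assumed surcharge for traffic leaving the network
NIGHT_FACTOR = Decimal("0.5")   # assumed discount for traffic sent at night


def is_night(hour: int) -> bool:
    """Assumed night window: 22:00 to 05:59."""
    return hour >= 22 or hour < 6


def charge(gigabytes: int, traffic_class: str, off_net: bool, hour: int) -> Decimal:
    """Rate a volume by class of service, distance (on-net or off-net), and time of day."""
    amount = Decimal(gigabytes) * PRICE_PER_GB[traffic_class]
    if off_net:
        amount *= OFF_NET_FACTOR
    if is_night(hour):
        amount *= NIGHT_FACTOR
    return amount.quantize(Decimal("0.01"))


night_vpn = charge(10, "vpn", off_net=True, hour=23)
day_voip = charge(2, "voip", off_net=False, hour=10)
```

`Decimal` is used instead of floating-point numbers so that rounding is explicit, which matters because billing records must be exact.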
Complexity varies from flat-rate billing, where no accounting infrastructure is required, to volume-based billing per class of service. The next sections cover the different billing schemes in more detail.
Flat-rate billing is a very efficient billing mechanism, because it does not entail any accounting and billing infrastructure. Beyond saving the cost of a billing system, flat rates bring significant gains for service providers from users who use the service infrequently or do not generate a lot of traffic. However, this is not the case for users who generate a lot of traffic. Hence, providers might want to lower the access rate to attract more users who do not need sophisticated services, while applying some sort of usage-based billing to users who do.
As an alternative, some providers collect SNMP interface counters for charge-back solutions. This is acceptable as long as a customer is connected by a dedicated interface or subinterface at the router or switch. But even then the result is very limited, because only the total amount of traffic is collected. Differentiation per application, class of service, or destination is impossible. The necessity of competitive differentiation has caused service providers to investigate different sorts of usage-based billing mechanisms.
Service providers need to decide which business model to implement. For a long time, the lack of an economical accounting technology prevented service providers and especially enterprises from deploying usage-based billing. Business cases were defined but resulted in enormous costs for infrastructure investments. Additionally, a complete end-to-end solution is often necessary. It is not enough to have only some of the network's elements being capable of tracking and exporting accounting information if you want to charge all customers for using their services.
Nowadays, Cisco features such as NetFlow services, BGP policy accounting, and destination-sensitive billing are adapted for the deployment of a usage-based billing system.
However, billing is not limited to service providers. In the past, enterprise IT expenditures were often pooled and distributed evenly between departments within an organization. Charges applied to those departments were usually based on head count or desktop equipment. As the environment is changing, these financial strategies become increasingly unacceptable. First, the amount of money spent on IT infrastructure by enterprises and service providers alike has steadily increased over the past decade, and those costs need to be recouped. Second, the degree of dependence on the network for any organization is increasing, but not equally across the various departments. Consider a simple example: an enterprise that would like to charge back the cost of its Internet link according to the usage per department.
A suitable billing architecture enables a service provider or enterprise customer to do any of the following:
Allocate usage costs fairly between different organizations, departments, groups, and users
Show departments or enterprise customers how they are using the network
Strategically allocate network resources
Justify adding services and bandwidth
Offer new services to selected groups
Note that the last three items in this list are not the primary goal of a billing architecture; nevertheless, they can supply a service provider's Business Support System (BSS) with important information. Adding new services and bandwidth may not be applied to links of customers with best-effort services, even if the connections are highly utilized. For example, in a flat-rate environment, no additional revenue is related to more traffic. In contrast, premium customers might be offered additional bandwidth if they exceed the SLA limits, which gives the service provider a chance to propose upgrades proactively. Although this does not generate immediate revenue, it represents a paradigm shift because service providers do not get anything extra for doing a good job but get penalized for violating the SLA. Even though a BSS takes this monetary information into account, accounting and performance management applications are unaware of it.
As soon as the decision for usage-based billing is made, several business models are possible, as discussed next.
In volume-based billing, you collect the transmitted raw bytes or packet volume per user, group, or department. This model is relatively easy to implement, especially when no application-specific data is required and only the total amount of traffic per group is relevant. Furthermore, a network design that connects one customer per interface or per subinterface can collect volume-based accounting information by polling the interface MIB counters. At first glance, this model sounds fair and intuitive, because bandwidth-intensive users or groups are charged more than others who generate less traffic. On the other hand, you could argue that bandwidth usage should be related to the time of day, because large transfers in the night use the network at a time when the majority of users are inactive. If QoS is deployed, you could also argue that, for instance, the few bytes of an IP phone call in the "gold" traffic class should be more expensive than a large file transfer, which is placed in the "best-effort" class.
For an enterprise customer, volume-based billing is an excellent alternative to a fixed assignment of IT costs, because the level of complexity is relatively low. In terms of destination, we only need to distinguish if the traffic stays within the perimeter (on-net) or if it is targeted at the Internet (off-net). In addition, departmental charge-back is applied. Figure 1-18 shows a typical enterprise scenario with various groups located on the central campus or distributed across wide-area links.

In summary, volume-based billing has the advantage of simplicity and, for service providers, the drawback of a lack of differentiation from the competition.
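A departmental charge-back based on volume can be sketched in Python as a proportional split. The department names, the byte counts, and the fallback to an even split when no traffic was measured are assumptions for this example.

```python
def charge_back(total_cost: float, bytes_per_department: dict) -> dict:
    """Split a shared cost in proportion to each department's traffic volume."""
    total_bytes = sum(bytes_per_department.values())
    if total_bytes == 0:
        # Assumed fallback: with no measured traffic, split the cost evenly.
        even_share = total_cost / len(bytes_per_department)
        return {dept: even_share for dept in bytes_per_department}
    return {
        dept: total_cost * nbytes / total_bytes
        for dept, nbytes in bytes_per_department.items()
    }


# Hypothetical monthly off-net volume per department, in gigabytes.
usage = {"engineering": 600, "sales": 300, "hr": 100}
shares = charge_back(1000, usage)
```

Here engineering generated 60 percent of the traffic and therefore carries 60 percent of the link cost, instead of the one-third it would pay under an even distribution.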
Destination-sensitive billing takes into account the simple principle that the distance to your traffic destination determines the price category of your traffic (such as cheap, medium-priced, or expensive). Figure 1-19 illustrates this example. A customer in Germany might place a very cheap VoIP call to a destination in Germany. Accessing a web server in another country in Europe will cost more, and sending a huge video file to a friend in Sydney might prove to be very expensive.

The drawbacks of this billing mechanism are twofold:
The customer pays for only the traffic sent to the service provider. What would happen if the customer requested a very big file from a server in Sydney? The customer would pay for the very small file request, while the response contains a big file, which would be paid for by the server operator in Sydney! This model introduces a charge-back demand for an Application Service Provider (ASP)—in this case, the server operator in Sydney—to charge either individual users or the service provider to which the customer connects, which then adds this fee to the monthly invoice. This further complicates the billing mechanisms used.
Destination-sensitive billing implies the collection of usage records on all ingress interfaces of the service provider's network. Like any ingress collection mechanism, the destination-sensitive billing feature needs to be enabled on all ingress interfaces in the network; otherwise, some traffic will not be accounted for! This is a challenge for ISPs, especially if multivendor equipment is used at the PoPs and not all access devices support the required features.
To circumvent the drawbacks of the destination-sensitive billing scheme, the destination and source-sensitive billing model takes into account the traffic's destination and source. Let's return to the example of the FTP request and FTP reply from a German customer to a server in Sydney. This customer would pay a high price for the traffic sent to and received from this server in Sydney, without introducing additional complexity, such as ASP charge-back. A major advantage of the destination and source-sensitive billing model is that it can be applied on a per-device and per-interface basis, because the usage records are captured in both directions. Remember that destination-sensitive billing, as described in the previous section, is applied at all interfaces in the network to avoid missing accounting records.
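The difference between the two models can be made concrete with the Sydney file transfer. The following Python sketch uses invented zone prices in abstract units and hypothetical flow records. It only illustrates which direction of the traffic each model charges to the customer.

```python
# Hypothetical price per megabyte and distance zone, in abstract charge units.
ZONE_PRICE_PER_MB = {"national": 1, "continental": 3, "intercontinental": 10}

# A small request from the German customer and the large reply from Sydney.
flows = [
    {"src": "customer", "dst": "sydney-server", "mb": 1, "zone": "intercontinental"},
    {"src": "sydney-server", "dst": "customer", "mb": 500, "zone": "intercontinental"},
]


def destination_sensitive(records, customer):
    """Charge the customer only for traffic the customer sends."""
    return sum(
        r["mb"] * ZONE_PRICE_PER_MB[r["zone"]]
        for r in records
        if r["src"] == customer
    )


def destination_and_source_sensitive(records, customer):
    """Charge the customer for traffic sent and received."""
    return sum(
        r["mb"] * ZONE_PRICE_PER_MB[r["zone"]]
        for r in records
        if customer in (r["src"], r["dst"])
    )


sent_only = destination_sensitive(flows, "customer")
both_directions = destination_and_source_sensitive(flows, "customer")
```

Under the first model the customer pays only for the one-megabyte request, while the second model also attributes the 500-megabyte reply to the customer who asked for it.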
The combination of time and destination and source-sensitive billing is the classic Public Switched Telephone Network (PSTN) accounting model, which has been in place for more than 100 years. The political background of this model was to allow money to flow from developed countries to developing countries to subsidize the networking infrastructures of the developing countries. For billing in the Internet, this model mainly applies to VPN customers who have several sites around the globe connected to a single provider. It does not really apply to end customers for the simple reason that end customers have no indication where a server in the Internet is located, because network connectivity is transparent. However, a modified version of destination-sensitive billing already exists: on-net and off-net traffic. On-net is all traffic that the service provider can keep in its own network or in the network of peering partners. Off-net traffic requires peering agreements between ISPs. Some ISPs already offer on-net access free of charge, for instance, accessing the provider's Internet servers. Another example is free VoIP calls for all customers of the same service provider for all on-net VoIP traffic. Figure 1-20 and Table 1-6 illustrate various pricing categories.

Cisco features for destination- and/or source-sensitive billing are NetFlow services, BGP policy accounting, and destination-sensitive billing.
Another example is a class-dependent tariff in a Differentiated Services (DiffServ) network, where various applications can be charged differently. None of the billing models described so far consider application-specific design requirements for the network. Even though IP telephony applications transfer only small amounts of data, they have very strict bandwidth and delay requirements compared to other applications, such as e-mail and web browsing. You could argue that the additional investments made for an appropriate VoIP infrastructure should be compensated by the voice application users only and not by every user. A QoS billing model can solve these requirements, because accounting records per class of service can be combined with destination-sensitive or destination and source-sensitive billing. We consider this the fairest model, because business-critical applications are charged more than less-relevant applications. Especially if the "best effort" class is free of charge, no user should complain about unfair treatment. Unfortunately, this model is not trivial to implement. Each network element not only needs to collect accounting data (which is already an issue for some components), but also needs to collect this data per class of service. For obvious reasons, this model is applicable only when a real business case justifies the investment. Extra care should be taken when the network elements collecting the accounting information are also modifying the traffic classification (such as by changing the value of the ToS or DSCP setting). For example, in a simple design, accounting information is collected on the ISP edge network element, specifically at the interface facing the customer. However, what happens if this network element is the one classifying the customer traffic into bronze, silver, and gold classes of service?
In this case, you collect the accounting information after the classification (after the ToS or DSCP rewrite) and don't report the accounting information as seen in the packets arriving on the interface. Otherwise, the QoS billing would be inaccurate, because it counts traffic as colored by the user and not as transported in the provider's network.
This also relates to concepts such as traffic flow through specific engineered paths, which is described in the section "Traffic Profiling and Engineering."
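The point about accounting after the DSCP rewrite can be illustrated with a short Python sketch. The classification policy and the packet fields are hypothetical. The DSCP values used (46 for Expedited Forwarding, 26 for AF31, 0 for best effort) are the standard code points.

```python
from collections import Counter

EF, AF31, BEST_EFFORT = 46, 26, 0  # standard DSCP code points


def classify(packet: dict) -> int:
    """Hypothetical provider policy: the class depends on the application,
    not on the DSCP value the customer put into the packet."""
    if packet["app"] == "voice":
        return EF
    if packet["app"] == "business":
        return AF31
    return BEST_EFFORT


def account(packets):
    """Count bytes per DSCP as received and as transported after the rewrite."""
    as_received, as_transported = Counter(), Counter()
    for packet in packets:
        as_received[packet["dscp"]] += packet["bytes"]
        as_transported[classify(packet)] += packet["bytes"]
    return as_received, as_transported


packets = [
    {"app": "web", "dscp": EF, "bytes": 1500},            # customer marked web traffic as EF
    {"app": "voice", "dscp": BEST_EFFORT, "bytes": 200},  # unmarked voice packet
]
as_received, as_transported = account(packets)
```

Billing from `as_received` would charge the web transfer at the premium rate although it was carried as best effort, which is why the counters taken after classification are the ones to report.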
Figure 1-21 shows the different building blocks of the DiffServ architecture. To reduce the level of complexity, only the "gold" traffic is fully modeled. Packets entering the network are classified according to the traffic class definitions in three different classes. Gold could represent real-time protocols and applications, such as voice traffic. Business-relevant traffic that is not assigned to the gold class is moved to the silver class. All remaining traffic enters the bronze class and gets "best-effort" treatment.
A meter for each class measures the total amount of traffic per class, checks the maximum defined transmission rate, and sends the amount of traffic that matches the class definition to the marker. Traffic exceeding the traffic profile definition is considered nonconforming and is treated differently—either shaped or dropped. Marked and shaped traffic is sent to the different output queues and is transmitted. From an accounting perspective, we are interested in the "meter" blocks. As you can see, we already need to implement three meters for the gold class, to account for
Conforming traffic
Partly conforming traffic
Nonconforming traffic, which (in this example) is dropped
In our example, we therefore need to implement nine meters in total per device (three levels of conforming traffic times three classes of service).
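The count of nine can be checked with a short sketch that enumerates one byte counter per combination of service class and conformance level. The names used here (counters, account) are illustrative assumptions, not part of any Cisco feature.

```python
from itertools import product

SERVICE_CLASSES = ("gold", "silver", "bronze")
CONFORMANCE_LEVELS = ("conforming", "partly-conforming", "nonconforming")

# One byte counter per (service class, conformance level) pair.
counters = {key: 0 for key in product(SERVICE_CLASSES, CONFORMANCE_LEVELS)}


def account(service_class, conformance, num_bytes):
    """Add num_bytes to the counter of the matching class and level."""
    counters[(service_class, conformance)] += num_bytes


account("gold", "conforming", 1500)
account("gold", "nonconforming", 400)
```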
The CISCO-CLASS-BASED-QOS-MIB is the Cisco feature of choice in the deployment of QoS billing.
Content-based billing is already common on the Internet. Even though not everyone was happy about it, erotic pages were one of the first applications on the Internet that generated profit. Today we are offered knowledge-based services, translation services, training, children's games, product tests, and so on. We need to identify traffic per application to charge customers appropriately according to the application used. Both Application Service Providers (ASPs) and web-hosting companies charge customers for application access (this is called content-based billing). In many cases, the ASP works with the ISP to offer a solution to the customer. Who owns what part of the network differs with each scenario, but the need to guarantee QoS, network bandwidth, and response times does not change. Similarly, web-hosting firms can categorize access to their servers and networks into service classes based on SLAs. To build these business models, both SLA monitoring and usage-based billing are required from the technology deployed.
In summary, content-based billing can easily be linked to application-specific SLAs, which is an excellent approach for service differentiation. However, it is resource-intensive due to packet inspection. The Cisco NBAR device instrumentation feature and the Service Control Engine (SCE) offer the required functionality to implement application and content-based billing.
Except for PSTN and GSM networks, time-based billing applies only to dial-in scenarios and public WLAN (pWLAN) hotspots. Users get charged based on call duration and time of day. It is relatively simple to implement, because only a network access server (NAS) and a RADIUS server (or some other means of providing user AAA) are required. Accounting records are generated by the RADIUS server and are transferred to the billing application. We distinguish between prepaid and postpaid mode. Prepaid mode requires real-time connectivity between the billing server and the NAS to identify how much credit balance the subscriber has left. The advantage of prepaid billing is that the user can purchase ad hoc access without opening an account with the provider, and no credit check is required.
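The prepaid check described above comes down to simple arithmetic. The following sketch assumes a flat per-minute rate in integer cents (a hypothetical tariff; a real deployment would also vary the rate by time of day) and shows how a billing server might derive the remaining session time and settle a finished session. The function names are illustrative.

```python
def remaining_seconds(credit_cents, rate_cents_per_minute):
    """Return how many whole seconds of access the remaining credit buys."""
    if rate_cents_per_minute <= 0:
        raise ValueError("rate_cents_per_minute must be positive")
    return credit_cents * 60 // rate_cents_per_minute


def settle_session(credit_cents, rate_cents_per_minute, session_seconds):
    """Deduct the session cost (rounded up to a full cent) from the credit."""
    # Ceiling division with integers avoids floating-point rounding issues.
    cost = -(-rate_cents_per_minute * session_seconds // 60)
    return max(credit_cents - cost, 0)
```

For example, a balance of 500 cents at 10 cents per minute allows 3000 seconds of access, and a 10-minute session leaves 400 cents.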
Because voice applications were the Postal, Telegraph, and Telephone (PTT) "cash cow" for decades, it is obvious that billing was not an afterthought when VoIP was invented. Nevertheless, VoIP-related billing presents many challenges, because different aspects of the overall solution should be considered:
End-user billing (flat, per call, per call feature, per class of service, based on distance, or on-net/off-net)
Billing for handing calls to or receiving calls from the PSTN (per call, total call volume)
Billing for handing calls to other VoIP service providers (bandwidth, QoS, time)
Voice-related traffic studies (type of calls, feature usage, and so on)
A typical VoIP call data record (CDR) consists of the following data fields:
Call ID
Call initiation time (the time when the user started dialing)
Call connection time (the time when the destination phone was off-hook)
Call disconnect time
IP source address (call originator)
Remote IP address (call receiver)
Remote UDP port
Calling number
Called number
Codec (G.711, G.722, G.723, G.726, G.728, G.729, others)
Protocol (H.323, SIP, MGCP, others)
Bytes transmitted
Bytes received
Packets transmitted
Packets received
Round-trip delay
Delay variation (jitter)
Erroneous packets (lost, late, early)
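As an illustration, a subset of these fields can be modeled as a record type. The following Python sketch is an assumption about how a collector might represent a CDR; the class and field names are hypothetical and do not follow any vendor format. It shows why both the connection time and the initiation time matter: billing usually starts at connection, whereas the setup time is of interest for traffic studies.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class VoipCdr:
    call_id: str
    initiation_time: datetime   # user started dialing
    connection_time: datetime   # destination phone went off-hook
    disconnect_time: datetime
    calling_number: str
    called_number: str
    codec: str
    protocol: str
    bytes_transmitted: int
    bytes_received: int

    def billable_seconds(self):
        """Duration between connection and disconnect."""
        return (self.disconnect_time - self.connection_time).total_seconds()

    def setup_seconds(self):
        """Time between the start of dialing and call connection."""
        return (self.connection_time - self.initiation_time).total_seconds()


# Illustrative values only.
cdr = VoipCdr(
    call_id="example-1",
    initiation_time=datetime(2007, 1, 15, 12, 0, 0),
    connection_time=datetime(2007, 1, 15, 12, 0, 8),
    disconnect_time=datetime(2007, 1, 15, 12, 3, 8),
    calling_number="5550100",
    called_number="5550199",
    codec="G.729",
    protocol="SIP",
    bytes_transmitted=180_000,
    bytes_received=176_000,
)
```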
Where and what billing data to collect depends on the service provider's business model (flat-rate versus per-call, prepaid versus postpaid) as well as the details being charged for (time, bandwidth, QoS, feature usage). CDRs are generated and collected for the following scenarios:
End-user billing— CDRs can provide the details needed to bill based on usage, bandwidth, or provided QoS. RADIUS provides the infrastructure for applying a prepaid billing model. Chapter 9, "AAA Accounting," offers more details on RADIUS.
Service provider or PSTN "peer" billing— This information can be obtained through CDRs from call agents or SS7 gateways. For service provider-to-service provider peering, it is mainly an overall bandwidth view (and some traffic classification) rather than a per-call view.
Traffic studies— CDRs can be used for a detailed analysis of telephony traffic (calling patterns, feature usage, and so on).
Due to the level of complexity and the small amount of data transferred, we assume that over time voice calls will be free of charge, at least for best-effort voice quality on-net. Alternatively, service providers can identify additional revenue opportunities by offering value-added services, such as business voice with guaranteed voice quality. Today, it is unclear if customers are willing to pay a premium fee for value-added services or if strong competition will result in increased quality for even best-effort services. The large success of Skype for Internet-based telephony is a clear indicator.
Operators of enterprise networks as well as service providers are increasingly confronted with network disruptions due to a wide variety of security threats and malicious service abuse. The originator of the attack could reside within the network or target it from the outside. Or—as a worst-case scenario—attacks could occur at the same time from the inside and outside of the network. Security attacks are becoming a scourge for companies as well as for individuals.
Fortunately, the same accounting technologies that are used to collect granular information on the packets traversing the network can also be used for security monitoring. When attacks are taking place, accounting technologies can be leveraged to detect unusual situations or suspicious flows and alert a network operations center (NOC) as soon as traffic patterns from security attacks, such as smurf, fraggle, or SYN floods, are detected. In a second step, the data records can be used for root-cause analysis to reduce the risk of future attacks. Note that the root-cause analysis requires baselining, which is discussed in detail in the section "Purposes of Performance."
Chapter 16, "Security Scenarios," investigates accounting for security purposes.
Due to business impact, operational costs, and lost revenue, there is an increasing emphasis on security detection and prevention. A design for security analysis should address the following requirements:
End-to-end surveillance, to monitor the local network as well as WAN links connecting subsidiaries.
24/7 availability, because an outage of the security application means becoming vulnerable to attacks.
Near-real-time data collection, because long transmission delays or aggregation intervals can hold up the identification of an attack. Most attacks last less than 15 minutes.
Encrypted communication between the server and the client to avoid giving possible attackers valuable information and prevent some forms of attacks.
Consolidated input from multiple sources and technologies, to avoid false alarms. The more sources there are to confirm an assumption, the more likely it is to be true. Too many false alarms will cause the users to reject the tool, and this could cause a business-critical attack to be missed.
A very efficient technology for traffic characterization and user profiling is the Cisco NetFlow accounting feature in the Cisco IOS software. At the NMS application layer, the previously described "deviation from normal" function is most useful in detecting anomalies related to security attacks.
Here's a list of possible checks to detect a security attack:
Suddenly highly increased overall traffic in the network.
Unexpectedly large amount of traffic generated by individual hosts.
Increased number of accounting records generated.
Multiple accounting records with abnormal content, such as one packet per flow record (for example, TCP SYN flood).
A changed mix of traffic applications, such as a sudden increase in "unknown" applications.
An increase in certain traffic types and messages, such as TCP resets or ICMP messages.
A significantly modified mix of unicast, multicast, and broadcast traffic.
An increasing number of ACL violations.
A combination of large and small packets could mean a composed attack. The big packets block the network links, and the small packets are targeted at network components and servers.
All of these symptoms alone could still be considered normal behavior, but multiple related events could point to a security issue. Figure 1-22 illustrates the process of identifying a large number of flows at a device.
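One of the checks above, many flow records with a single packet each, can be sketched in a few lines of Python. The record layout (dictionaries with "dst", "protocol", and "packets" keys) and the thresholds are assumptions for illustration; they do not reflect a specific NetFlow export format. In line with the caveat above, a hit is only a hint that should be correlated with other symptoms.

```python
from collections import Counter

TCP = 6  # IP protocol number for TCP


def suspected_syn_flood_targets(flows, min_flows=1000, min_ratio=0.9):
    """Return destinations that receive mostly single-packet TCP flows."""
    total = Counter()
    single_packet = Counter()
    for flow in flows:
        dst = flow["dst"]
        total[dst] += 1
        if flow["protocol"] == TCP and flow["packets"] == 1:
            single_packet[dst] += 1
    return sorted(
        dst
        for dst, count in total.items()
        if count >= min_flows and single_packet[dst] / count >= min_ratio
    )


# Synthetic records: 50 one-packet flows toward one host, 30 normal flows
# toward another.
sample_flows = (
    [{"dst": "10.0.0.5", "protocol": TCP, "packets": 1}] * 50
    + [{"dst": "10.0.0.9", "protocol": TCP, "packets": 40}] * 30
)
targets = suspected_syn_flood_targets(sample_flows, min_flows=20)
```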

An intrusion detection system (IDS) takes these and additional considerations into account. It can also use stateful packet flow analysis to identify suspicious activities or unusual network activities. To understand this topic better, we describe the proposed steps to identify and block a potential DoS attack. An IDS compares the traffic with predefined patterns and either dumps all packets that do not fit these filter criteria or stores usage records for accounting purposes.
In the detection of security threats, a network baseline is compulsory, as described in the section "Fault Management." How can you deduce that something is wrong in the network without knowledge of the network during normal operating conditions? For example, how can you know if the router's CPU and memory utilization are currently low or high if you cannot compare them with the hourly, daily, or weekly values from the past? How can you figure out if an increased number of accounting records generated on Monday morning is a security threat without values defining the average number of records on a Monday morning? Accounting can answer these questions.
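A baseline comparison of this kind can be sketched as follows. The three-sigma threshold is a common rule of thumb chosen here as an assumption, not a method prescribed by the text, and the sample values are invented. The history list stands for the number of accounting records seen on previous Monday mornings.

```python
from statistics import mean, stdev


def deviates_from_baseline(history, current, num_sigmas=3.0):
    """Return True if current lies more than num_sigmas standard
    deviations away from the mean of the historical values."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    baseline = mean(history)
    spread = stdev(history)
    return abs(current - baseline) > num_sigmas * spread


# Record counts from five earlier Monday mornings (illustrative values).
monday_history = [1000, 1100, 950, 1050, 1020]
```

With this history, a count of 1060 records stays within the normal range, whereas a count of 5000 is flagged as a deviation that deserves a closer look.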
Here's a phased approach for identifying and blocking a security attack:
Preparation— Instrument the infrastructure to baseline and compare relevant security parameters:
Classification— Identify the attack's criticality:
- A worm that "only" congests the network links might be considered the lesser evil, compared to a malicious virus that erases data on all PCs or extracts credit card details.
- A DoS attack is initiated by a single client. A distributed DoS (DDoS) attack is started on a large number of compromised hosts in parallel, which increases the complexity of identifying the sources and also increases the attack's severity level.
Reaction— Block the attack with an orchestrated approach of security, performance, fault, accounting, and configuration applications:
- Accounting applications provide the flow records to be monitored and analyzed.
- Performance and fault applications identify network outages or overload caused by the attack.
- Security applications consolidate the information and identify the attack.
- Configuration applications modify the device configurations to block the intruder by configuring mechanisms to block malicious traffic (such as by using access control lists).
- Accounting and performance monitoring applications identify when the attack is over.
Phases 5 and 6 are very important in a complete security management strategy. However, the goal of this book is to collect the right accounting information for the determination and classification of the security threats, not to correct the fault. Therefore, proposed technology examples will address phases 1 through 4 only.
Specific MIB variables allow the polling of CPU utilization, memory utilization, interface utilization, and so on. A very efficient technology for traffic characterization and user profiling is the Cisco NetFlow services accounting feature in the Cisco IOS software, which allows classifying the network traffic per IP address, per application layer, and much more.
The RMON protocol could also be useful, because an RMON probe can analyze all the traffic on the link it is attached to, classify it, and report the results using MIB variables. Note also that some specific security devices, such as an Intrusion Detection System (IDS) or the Cisco PIX Firewall, simplify the detection of security threats. Because this topic goes beyond the scope of this book, we suggest further literature for readers who want to learn more about security concepts, such as Network Security Architectures by Sean Convery (Cisco Press, 2004).
Another area for security monitoring is detecting rogue wireless access points. Wireless LAN (WLAN) is becoming very popular these days, and the prices for WLAN cards as well as WLAN Access Points (AP) have dropped significantly. It is no surprise that users purchase their own APs and connect them to the corporate network, to increase access flexibility and maybe avoid "inconvenient" policies deployed by the corporate IT department. A rogue AP is one that is not authorized for operation by the company's IT group. Operating rogue access points can generate serious security issues, such as opening an uncontrolled interface to the corporate network. An operator's nightmare is the combination of a rogue AP and a hacker identifying it. There are multiple approaches to identifying rogue APs; the simplest way is to scan all IP addresses in the network and check if a web server responds to a request on port 80, because most APs have an integrated web server. Unfortunately, this procedure would identify only rogue APs installed by novice users, because an experienced hacker would probably disable the AP's web server immediately.
How can accounting help in this case? Consider a scenario in which a user disconnects a notebook from the data outlet, plugs in an AP, and connects the notebook wirelessly instead of by cable. Usage patterns would not change, and IP address-based accounting would not identify the new wireless Layer 2 connection. It would require MAC address accounting at the switch where the notebook was originally connected. The AP has a different MAC address than the PC, so the traffic source at this switch port suddenly has a new MAC address. Even though it is possible to apply accounting in this case, it is a complex approach. Easier options exist, such as allowing only a preconfigured MAC address per switch port (see the "Port Security" feature, for example).
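The MAC address comparison itself is straightforward once per-port accounting data is available. The following sketch assumes that the collector delivers, for each switch port, the set of source MAC addresses seen; the data layout, port names, and addresses are hypothetical.

```python
def new_macs_per_port(baseline, observed):
    """Return, per switch port, the source MAC addresses that were
    observed but are not part of the baseline for that port."""
    alerts = {}
    for port, macs in observed.items():
        unknown = macs - baseline.get(port, set())
        if unknown:
            alerts[port] = unknown
    return alerts


# Illustrative data: the notebook on port Fa0/3 was replaced by an AP.
baseline = {
    "Fa0/3": {"00:11:22:33:44:55"},
    "Fa0/4": {"00:11:22:33:44:66"},
}
observed = {
    "Fa0/3": {"00:aa:bb:cc:dd:ee"},
    "Fa0/4": {"00:11:22:33:44:66"},
}
alerts = new_macs_per_port(baseline, observed)
```

A new address alone does not prove that a rogue AP is present, because a replaced network card produces the same result, so the finding is best treated as a trigger for further inspection.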
However, if other users also start using the AP (on purpose or because the 802.11 client at their PCs identifies a stronger signal and connects through this AP instead of the corporate AP) and if a performance baseline is in place, you can identify changed traffic patterns at this specific port at the switch. This example illustrates how accounting in conjunction with network baselining can help identify network security issues based on altered traffic characteristics.