# pcap-generator (unfinished)
## Overview
PcapPlusPlus-based tool that generates traffic by simulating a configurable network topology. 
Packet contents and delays are configurable as distributions through an input YAML file. The output is a pcap/pcapng file.

## Repository structure
|-config: contains configuration yaml files, including those defining a role and those used for testing \
|-Makefile: makefile to build program from source \
|-output: directory that may be utilized to store output pcaps \
|-README.md: this file \
|-pcap_generator: pcap-generator executable \
|-src: contains source code 

## Installation
Tested on Fedora 30 and Ubuntu 18.04. Should run on most Linux platforms. \
Install [yaml-cpp 0.6.3](https://github.com/jbeder/yaml-cpp) and [pcapplusplus v20.08](https://pcapplusplus.github.io/docs/install). 
Make sure the header files are installed into /usr/local/include (should be the default), or adjust the pcap-generator Makefile accordingly. \
Download this repository, change into its folder and run `make`.

## Usage
### Getting started
`./pcap_generator config/simple_arp.yaml output/simple_arp.pcap` will execute a simple example. \
`config/simple_arp.yaml` specifies the input config file. To run a different configuration, change the path. \
`output/simple_arp.pcap` specifies the file in which the resulting packet capture is stored. To store it elsewhere, modify the command accordingly. \
An optional third argument specifies the directory in which the roles are configured. It defaults to config/roles. For more information on roles, see further below. 

### Overview of configuration fields
To quickly understand what a configuration file should look like, we recommend looking at the examples in the config folder. 
Errors in the configuration file will lead to errors in the simulation. At this point (this is a first prototype), we don't guarantee proper error handling with meaningful error messages. 
The following tables contain a detailed description of the possible input for each field. \
Hint: tbd (script to convert us/ms/s/min to ns and vice versa)

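Until the conversion script mentioned in the hint exists, a unit conversion along these lines may help when filling in nanosecond fields. This is only a sketch in Python; `NS_PER_UNIT`, `to_ns` and `from_ns` are hypothetical names and not part of pcap-generator.

```python
# Hypothetical helper, not part of pcap-generator.
NS_PER_UNIT = {
    "ns": 1,
    "us": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "min": 60 * 1_000_000_000,
}


def to_ns(value, unit):
    """Convert a duration in the given unit to whole nanoseconds."""
    return round(value * NS_PER_UNIT[unit])


def from_ns(nanoseconds, unit):
    """Convert nanoseconds to the given unit (the result may be fractional)."""
    return nanoseconds / NS_PER_UNIT[unit]


print(to_ns(5, "us"))     # prints 5000
print(from_ns(5000, "us"))  # prints 5.0
```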
| Field/Subfield      | Input example                                | Input description              | Possible values                         |
| :------------------ | :------------------------------------------: | -----------------------------: | --------------------------------------: |
| nodes               | [host1, host2]                               | node names                     | list of strings                         |
| links               | [[link1, host1, host2],[link2, host1, host2]] | link names and nodes they link | list of 3-element lists of strings      |
| interface_link      | link2                                        | link to capture traffic at     | link name                               |
| interface_link_side | host2                                        | node to capture traffic at     | node name                               |
| duration_type       | number_packets                               | traffic capture duration type  | number_packets or number_nanoseconds    |
| duration            | 1230                                         | traffic capture duration       | value 0 to 2^64-1 (ca. 1.8 x 10^19)     |

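Since the tool does not yet guarantee meaningful error messages, it may be worth checking these fields before running a simulation. The following Python sketch works on an already parsed configuration (a plain dict). It assumes the fields are top-level keys and that `interface_link_side` must be one of the two nodes of `interface_link`; both are assumptions drawn from the table, and `check_topology` is a hypothetical name.

```python
MAX_U64 = 2**64 - 1
DURATION_TYPES = {"number_packets", "number_nanoseconds"}


def check_topology(config):
    """Return a list of problems found in the top-level fields (empty if none)."""
    problems = []
    nodes = set(config.get("nodes", []))
    link_ends = {}
    for link in config.get("links", []):
        if len(link) != 3:
            problems.append(f"link entry {link} must have 3 fields")
            continue
        name, node_a, node_b = link
        link_ends[name] = (node_a, node_b)
        for node in (node_a, node_b):
            if node not in nodes:
                problems.append(f"link {name} references unknown node {node}")
    capture_link = config.get("interface_link")
    if capture_link not in link_ends:
        problems.append(f"interface_link {capture_link} is not a defined link")
    elif config.get("interface_link_side") not in link_ends[capture_link]:
        # Assumption: the capture side has to be an endpoint of the capture link.
        problems.append("interface_link_side is not an endpoint of interface_link")
    if config.get("duration_type") not in DURATION_TYPES:
        problems.append("duration_type must be number_packets or number_nanoseconds")
    duration = config.get("duration")
    if not isinstance(duration, int) or not 0 <= duration <= MAX_U64:
        problems.append("duration must be an integer from 0 to 2^64-1")
    return problems


example = {
    "nodes": ["host1", "host2"],
    "links": [["link1", "host1", "host2"], ["link2", "host1", "host2"]],
    "interface_link": "link2",
    "interface_link_side": "host2",
    "duration_type": "number_packets",
    "duration": 1230,
}
print(check_topology(example))  # prints []
```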
For each defined link:
| Field/Subfield                      | Input example | Input description                                             | Possible values                          |
| :---------------------------------- | :-----------: | ------------------------------------------------------------: | ---------------------------------------: |
| delay_distribution_type             | loop          | distribution type of delays on the link                       | static, loop, self_specified, triangular |
| delay_values                        | [2000, 30000] | delay values in nanoseconds                                   | list of values 0 to 2^64-1               |
| packet_loss_distribution_type       | static        | distribution type of packet losses occurring on the link      | static                                   |
| packet_loss_values                  | [150]         | packet loss values (every n packets)                          | list of values 0 to 2^64-1               |
| packet_corruption_distribution_type | static        | distribution type of packet corruptions occurring on the link | static                                   |
| packet_corruption_values            | [0]           | packet corruption values (every n packets)                    | list of values 0 to 2^64-1               |

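To illustrate how the link fields could interact, here is a small Python sketch, not the tool's actual implementation. It assumes that a `loop` delay distribution cycles through `delay_values`, that a static packet loss value n drops every n-th packet, and that 0 disables loss. `simulate_link` is a hypothetical name.

```python
from itertools import cycle


def simulate_link(send_times_ns, delay_values, loss_every_n):
    """Yield arrival times (ns) for the packets that survive the link."""
    delays = cycle(delay_values)  # "loop": reuse the delay values in turn
    for index, sent_at in enumerate(send_times_ns, start=1):
        delay = next(delays)  # a delay is consumed even if the packet is lost
        # Assumed meaning of a static loss value n: drop every n-th packet;
        # 0 is assumed to disable packet loss.
        if loss_every_n and index % loss_every_n == 0:
            continue
        yield sent_at + delay


arrivals = list(simulate_link([0, 1000, 2000, 3000], [2000, 30000], 3))
print(arrivals)  # prints [2000, 31000, 33000]
```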
For each defined node:
| Field/Subfield                                              | Input example                  | Input description                               | Possible values                                      |
| :---------------------------------------------------------- | :----------------------------: | ----------------------------------------------: | ---------------------------------------------------: |
| application_layer_roles                                     | [bridge]                       | roles active on the node                        | list of role names                                   |
| user_specified_traffic/<link_name>/messages                 | [my_packet_1, my_packet_2]     | traffic generated at the interface (node, link) | list of traffic names                                |
| user_specified_traffic/<link_name>/delay_distribution_types | [loop, static]                 | distribution type of user specified traffic     | list of ∈ {loop, static, self_specified, triangular} |
| user_specified_traffic/<link_name>/delay_values             | [[5, 300], [1270000005000000]] | distribution values                             | list of lists of 0 to 2^64-1                         |
| serialization_delay                                         | [5000]                         | fixed serialization delay                       | value 0 to 2^64-1                                    |

For each defined user_specified_traffic in any node:
| Field/Subfield                                                    | Input example          | Input description                         | Possible values                     |
| :---------------------------------------------------------------- | :--------------------: | ----------------------------------------: | ----------------------------------: |
| <traffic_name>/layers                                             | [eth, arp]             | list of layer names                       | eth, arp (tbd: ip, tcp, udp, dhcp, dns) |
| <traffic_name>/<layer_name>/<header_field_name>/distribution_type | static                 | distribution type for header field values | loop, self_specified, static        |
| <traffic_name>/<layer_name>/<header_field_name>/values            | [00:00:00:00:00:01]    | list of header field specific values      | header field specific               |

Header fields within an Ethernet layer: src_mac, dst_mac, eth_type

Header fields within an ARP layer: hardware_type, protocol_type, hardware_size, protocol_size, opcode, sender_mac, sender_ip, target_mac, target_ip

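To show how the Ethernet and ARP header fields listed above map onto the bytes of a frame, here is a Python sketch using only the standard library. It follows the standard Ethernet II and ARP layouts; the dict-based input and the function names are assumptions for illustration and do not reflect how pcap-generator (which builds packets with PcapPlusPlus) represents these values internally.

```python
import socket
import struct


def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_arp_frame(eth, arp):
    """Pack Ethernet and ARP header fields into a raw frame."""
    # Ethernet II header: destination MAC, source MAC, EtherType
    eth_header = (
        mac_bytes(eth["dst_mac"])
        + mac_bytes(eth["src_mac"])
        + struct.pack("!H", eth["eth_type"])
    )
    # Fixed part of the ARP header, in network byte order
    arp_header = struct.pack(
        "!HHBBH",
        arp["hardware_type"],
        arp["protocol_type"],
        arp["hardware_size"],
        arp["protocol_size"],
        arp["opcode"],
    )
    arp_header += mac_bytes(arp["sender_mac"]) + socket.inet_aton(arp["sender_ip"])
    arp_header += mac_bytes(arp["target_mac"]) + socket.inet_aton(arp["target_ip"])
    return eth_header + arp_header


# Example values are made up for illustration (an ARP request for 10.0.0.2).
frame = build_arp_frame(
    {"src_mac": "00:00:00:00:00:01", "dst_mac": "ff:ff:ff:ff:ff:ff", "eth_type": 0x0806},
    {
        "hardware_type": 1,
        "protocol_type": 0x0800,
        "hardware_size": 6,
        "protocol_size": 4,
        "opcode": 1,
        "sender_mac": "00:00:00:00:00:01",
        "sender_ip": "10.0.0.1",
        "target_mac": "00:00:00:00:00:00",
        "target_ip": "10.0.0.2",
    },
)
print(len(frame))  # prints 42 (14 bytes Ethernet + 28 bytes ARP)
```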
tbd: ip, tcp, udp, dhcp, dns

### Distribution types
(tbd) \
As outlined above, one may specify a distribution type and values for various configuration fields. 
We use two types of distributions: those that relate to a sequence of explicitly stated values, and those that relate to the whole range of possible input values, optionally limited by 
a minimum and maximum value. While the former type (we will just call it index distribution here) poses no implementation problems, the latter only works on some configuration 
fields, as it is laborious to implement for all header fields of all layers, or (in the case of variables not exhibiting field characteristics, e.g. arbitrary strings in a header field) simply 
not meaningful to implement. Where deemed useful, it may nevertheless be implemented in some cases, as per the tables above. \
Here, we explain the behaviour of each distribution type. \
static: Only one value has to be specified and will be used once. example: \
`my_packet_1: \
  layers: [eth, arp] \
  eth: \
    src_mac: \
      distribution_type: static \
      values: [00:00:00:00:00:01] \
  ...` \
self_specified: The value is drawn at random according to a given list of cumulative probabilities. example: \
`my_packet_1: \
  layers: [eth, arp] \
  eth: \
    src_mac: \
      distribution_type: self_specified \
      cumulative_probabilities: [0.2, 0.4, 0.9, 1] \
      values: [00:00:00:00:00:01, 00:00:00:00:00:02, 00:00:00:00:00:03, 00:00:00:00:00:04] \
  ...` \
index_loop: The values in the specified values list will be used in turn from left to right and then start again at the left. example: \
`my_packet_1: \
  layers: [eth, arp] \
  eth: \
    src_mac: \
      distribution_type: loop \
      values: [00:00:00:00:00:01, 00:00:00:00:00:02, 00:00:00:00:00:03, 00:00:00:00:00:04] \
  ...` \
index_uniform: The value is drawn at random from a uniform distribution over the list of given values. example: \
`my_packet_1: \
  layers: [eth, arp] \
  eth: \
    src_mac: \
      distribution_type: uniform \
      values: [00:00:00:00:00:01, 00:00:00:00:00:02, 00:00:00:00:00:03, 00:00:00:00:00:04] \
  ...` \
loop: \
uniform: \
triangular: This is only implemented for the various delay configurations.

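The index distributions described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the tool's C++ implementation: `make_sampler` is a hypothetical name, `static` is interpreted as always returning the single given value, and `self_specified` picks the first value whose cumulative probability exceeds a uniform draw from [0, 1). The range-based variants and `triangular` are left out.

```python
import bisect
import random
from itertools import cycle


def make_sampler(distribution_type, values, cumulative_probabilities=None, rng=random):
    """Return a zero-argument function that produces the next value."""
    if distribution_type == "static":
        # Assumption: the single value is returned on every call.
        return lambda: values[0]
    if distribution_type == "loop":
        looped = cycle(values)  # left to right, then start again
        return lambda: next(looped)
    if distribution_type == "uniform":
        return lambda: rng.choice(values)
    if distribution_type == "self_specified":
        def draw():
            # First index whose cumulative probability is greater than the draw.
            index = bisect.bisect_right(cumulative_probabilities, rng.random())
            return values[min(index, len(values) - 1)]
        return draw
    raise ValueError(f"unknown distribution type: {distribution_type}")


macs = ["00:00:00:00:00:01", "00:00:00:00:00:02", "00:00:00:00:00:03"]
next_mac = make_sampler("loop", macs)
print([next_mac() for _ in range(4)])  # the fourth value is the first MAC again
```

With `cumulative_probabilities: [0.2, 0.4, 0.9, 1]` as in the example above, this reading gives the four values probabilities of 0.2, 0.2, 0.5 and 0.1.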
### Adding new roles
(tbd) \
Every node can be assigned one or more roles. Based on the ingress port, the layers present, or any header 
field value of a packet, a node may then forward, drop, modify or generate traffic. 
A role is specified in a stand-alone configuration file. All roles used in the simulation must be located in the specified role folder (default is config/roles, as mentioned above). \
Basic existing roles are: bridge, switch, static_router 

## Performance
tbd