Home
Zhenbo edited this page Aug 16, 2022
Welcome to the sense-rtmon wiki!
- The `*.yml` files in the `config` directory handle configuration for both the cloud and site stacks. `ifName` and `vlan`, listed together under Hosts and `switchData`, are intended to be the unique identifier of each flow (currently only `ifName` is used as the identifier).
- `scrapeInterval` and `scrapeDuration` do not change the scrape rate. Every scrape runs at the default interval of 15s, and currently no configuration option can change that. This applies only to the `Site` stack, since no pulling and pushing is allowed in the system.
- `communityString` under `switchData` is unique to each network element. Treat it as a secret and do not publish it on GitHub.
- Both the `Cloud` and `Site` stacks contain `fill*.py` files that read the configuration files and dynamically write the install and start scripts. This step is hidden from users behind `start.sh` and `install.sh`.
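The `fill*.py` step described above can be pictured with a minimal sketch. It assumes the YAML configuration has already been loaded into a Python dict. The template text, the key `pushgateway_server`, and the helper names `flow_id` and `render_start_script` are hypothetical and are not taken from the repository.

```python
from string import Template

# Hypothetical template: the scripts actually written by fill*.py may differ.
START_TEMPLATE = Template(
    "#!/bin/bash\n"
    "# generated file, do not edit by hand\n"
    "export PUSHGATEWAY_SERVER=$pushgateway_server\n"
    "export IF_NAME=$if_name\n"
    "export VLAN=$vlan\n"
)


def flow_id(if_name: str, vlan: str) -> str:
    """Combine ifName and vlan into one flow identifier (assumed format)."""
    return f"{if_name}.{vlan}"


def render_start_script(config: dict) -> str:
    """Fill the template from a config dict, e.g. the parsed *.yml content."""
    return START_TEMPLATE.substitute(
        pushgateway_server=config["pushgateway_server"],
        if_name=config["ifName"],
        vlan=config["vlan"],
    )
```

Calling `render_start_script` with a dict containing `pushgateway_server`, `ifName`, and `vlan` returns the script text, which a fill script could then write to disk.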
- `Site` installs SNMP exporter from GitHub. SNMP has many dependencies when generating the `snmp.yml` file. Make sure Go is up to date and gcc is installed properly.
- The install step also creates script files that curl the local port metrics to the pushgateway server on the `Cloud` stack. For example, Node exporter runs on port 9100, and the script generated by the install script looks like this: `curl -s ${MYIP}:9100/metrics | curl --data-binary @- $pushgateway_server/metrics/job/node-exporter/instance/$MYIP`. The instance label indicates where the data comes from. This URL can be customized.
- ARP and SNMP work similarly but are more complicated because they use intermediate storage files.
- The new scripts are added to the current crontab with a cycle of 15s, so the result of the curl is pushed to pushgateway every 15s.
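The curl pipeline in the list above can also be expressed in Python. This is only a sketch of the same two steps (read local metrics, then POST them to pushgateway) and is not how the generated scripts work. The function names `push_url` and `push_metrics` are hypothetical.

```python
import urllib.request


def push_url(pushgateway_server: str, job: str, instance: str) -> str:
    """Build the pushgateway URL in the same shape as the generated curl command."""
    return f"{pushgateway_server.rstrip('/')}/metrics/job/{job}/instance/{instance}"


def push_metrics(source_url: str, target_url: str, timeout: float = 10.0) -> int:
    """Fetch metrics from a local exporter and POST them to pushgateway."""
    # Step 1: equivalent of `curl -s ${MYIP}:9100/metrics`
    with urllib.request.urlopen(source_url, timeout=timeout) as resp:
        body = resp.read()
    # Step 2: equivalent of `curl --data-binary @- <pushgateway URL>`
    request = urllib.request.Request(target_url, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status
```

For example, `push_url("http://cloud:9091", "node-exporter", "10.0.0.5")` yields a URL whose `instance` segment identifies the host the data came from.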
- The `./start.sh` script uses `fill_start.py` to parse the configuration file, fills in `./dynamic_start.sh`, and runs it.
- It lets the user choose which containers to start and composes all the containers at the end.
- It dynamically generates a push script file for each exporter inside the `crontabs` subdirectory, based on user input.
- Installs Grafana and Nginx. The user can serve Grafana over HTTPS using an Nginx reverse proxy.
- Script Exporter is pulled from GitHub.
- Pushgateway, Script Exporter, Prometheus, Nginx, and Grafana are all under the cloud stack right now. To start them, run `./start.sh` in the cloud directory.
- Grafana needs the port mapping 3000:3000; see https://github.com/esnet/sense-rtmon/issues/18#issue-1332438911
- Nginx runs as a container that enables HTTPS. Certificates and DNS from the host are required.
- In the `stack.yml` file, the Nginx ports need to match the ports of the other applications and the ports they are accessed from, e.g. access on 443 forwarded to 3000. Serving Pushgateway and Prometheus over HTTPS appears to require opening two additional ports. There may be other workarounds: using `location /` was tried, but the CSS did not apply to Pushgateway.
- The path to the certificates currently needs to be changed manually.
- Many browsers automatically redirect to HTTPS. If some ports are still served over HTTP, the redirect makes them hard to access. Go to the website and find `Delete domain security policies` to remove the automatic redirect.
- Currently not functional and in development. It is similar to ARP and could send data to pushgateway with easy fixes (estimated at less than 30 minutes). However, the design might need to change.
- Script Exporter enables layer 2 debugging. In the `examples` directory, `config.yaml` tells the script exporter which script to run. `args.sh` and `multiDef.sh` are used for single-switch and double-switch setups. Setups with more than two switches are not implemented yet.
- These files are configured by `fill_config.py`, which takes its data from the configuration files.
- The `*.sh` files emit metrics with `echo`, and the Prometheus database stores the data. The dashboard looks for exactly what is sent, so every change made here must also be made in the Layer 2 dashboard templates.
- For example, with `echo "host1_arp_on{host=\"${host1}\"} 1"`, `host1_arp_on` is stored in Prometheus; 1 represents on and 0 represents off.
- The format is a string (the metric name and labels) followed by a number. If the value is a string instead of a number, Prometheus cannot ingest the data, the whole script fails, and no data goes through.
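The "string followed by a number" rule above can be checked before a line is emitted. The helper below is a hypothetical sketch, not part of the repository. It assumes label values contain no quotes or backslashes, so no escaping is done.

```python
def format_metric(name: str, labels: dict, value) -> str:
    """Return one metric line: name{label="value",...} number.

    Raises ValueError if value is not numeric, which mirrors the case where
    Prometheus would reject the line.
    """
    number = float(value)  # fails early on values such as "on" or "up"
    # Assumption: label values need no escaping (no quotes or backslashes).
    label_text = ",".join(f'{key}="{val}"' for key, val in labels.items())
    return f"{name}{{{label_text}}} {number:g}"
```

A shell script could apply the same idea by validating the value before calling `echo`, so that a stray string never reaches Prometheus.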
- SNMP Exporter accesses the switch and its MIB through this URL: `<host_ip_address>:9116/snmp?target=<switch_ip_address>&module=<module_names_e.g.: if_mib>`
- Curl stores the result of the query in an intermediate file, and a second curl then sends the content to pushgateway.
- For downloading MIBs, refer to: https://github.com/esnet/sense-rtmon/issues/17#issue-1330372320
- `export MIBDIRS=<mibs_directory>`: right now the default directory is `site/SNMPExporter/src/github.com/prometheus/snmp_exporter/generator/mibs`. The `mibs` directory is generated by `make mibs`. MIB files need to be moved into a single directory. The `install_snmp.py` file installs private MIBs based on the network element brand that the user enters.
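The SNMP query URL described above can be assembled as in the following sketch. The function name `snmp_query_url` is hypothetical, and the `http://` scheme is an assumption, since the original URL pattern does not show one.

```python
from urllib.parse import urlencode


def snmp_query_url(host_ip: str, switch_ip: str,
                   module: str = "if_mib", port: int = 9116) -> str:
    """Build the SNMP exporter URL for one switch and one module."""
    # urlencode escapes the parameter values and joins them with "&".
    query = urlencode({"target": switch_ip, "module": module})
    # Assumption: plain HTTP; the original pattern omits the scheme.
    return f"http://{host_ip}:{port}/snmp?{query}"
```

The result of requesting this URL is what the generated script would store in the intermediate file before sending it to pushgateway.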
- ARP is more complicated because it needs to detect changes in the ARP table (`arp -a`).
- ARP files are located under `Metrics/ARPMetrics`.
- Important files:
  - `arpOut.json` stores the output of `arp -a` on the host system in JSON format. The plain output is converted to JSON by `convertARP.py`.
  - `prev.json` stores the previous `arp -a` output.
  - `delete.json` stores all current URLs on pushgateway, in a format that can be processed to erase the pushgateway data directly.
- Put together: `arpOut.json` is updated every 15s. If it differs from `prev.json`, the ARP container deletes all current URLs listed in `delete.json` from pushgateway and pushes the new URLs from `arpOut.json`.
- `ping_status` and `prev_ping_status` work in a similar fashion. The host pings the other host, stores the result, and sends it to pushgateway. If the two files differ, everything on pushgateway is deleted and the URLs and ping status are sent again.
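The change-detection logic described above can be sketched as follows. This is not the code in `convertARP.py`. The regular expression assumes the common Linux `arp -a` line format, and the names `parse_arp` and `plan_update` are hypothetical.

```python
import re

# Assumed line format: "? (192.168.1.1) at aa:bb:cc:dd:ee:ff [ether] on eth0"
ARP_LINE = re.compile(
    r"^(?P<host>\S+) \((?P<ip>[^)]+)\) at (?P<mac>\S+) .*on (?P<iface>\S+)$"
)


def parse_arp(text: str) -> list:
    """Convert plain `arp -a` output into a list of JSON-serializable dicts."""
    entries = []
    for line in text.splitlines():
        match = ARP_LINE.match(line.strip())
        if match:  # lines that do not fit the assumed format are skipped
            entries.append(match.groupdict())
    return entries


def plan_update(previous: list, current: list, pushed_urls: list) -> dict:
    """Decide what to delete and push when the ARP table changes."""
    if previous == current:
        # No discrepancy: leave pushgateway untouched.
        return {"delete": [], "push": []}
    # Discrepancy: delete every URL currently on pushgateway, then push
    # entries built from the new table.
    return {"delete": list(pushed_urls), "push": current}
```

Here `previous` plays the role of `prev.json`, `current` the role of `arpOut.json`, and `pushed_urls` the role of `delete.json`. The same comparison pattern would apply to `ping_status` and `prev_ping_status`.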
mermaid.live is used to draw the diagrams. The website has a good live drawing board for instant feedback.
- Future: Local/Global Ports Unique Flow IDs
For further questions, please contact Zhenbo or Pratyush.
RealTime Flow Monitoring and Analysis