Generated [GENERATED-DATE].
These estimates are based on results gathered with probe requests.
The network size is estimated by gathering identifier probe results, and comparing the number of distinct identifiers with the total number of samples. Based on the assumption that the results are from nodes selected from the entire network at random, as the probes are designed to do, this allows guessing the size the network would have to be to give that proportion. The instantaneous size estimate does this with an hour of samples, and as such estimates how many nodes are online at that moment.
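The idea of working backwards from the proportion of distinct identifiers can be sketched as follows. This is a minimal illustration, not necessarily how pyProbe does it: it assumes samples are drawn uniformly at random with replacement, and it uses the standard expected-distinct-count formula with a bisection search. The function names `expected_distinct` and `estimate_network_size` are hypothetical.

```python
def expected_distinct(network_size, samples):
    """Expected number of distinct nodes seen after `samples` uniform
    draws with replacement from `network_size` nodes."""
    return network_size * (1.0 - (1.0 - 1.0 / network_size) ** samples)


def estimate_network_size(distinct, samples, iterations=100):
    """Find the network size whose expected distinct count matches the
    observed one. Assumption: probes sample nodes uniformly at random."""
    if distinct <= 0 or samples <= 0 or distinct > samples:
        raise ValueError("need 0 < distinct <= samples")
    if distinct == samples:
        # No repeats were seen, so the data gives no upper bound.
        raise ValueError("every sample was distinct; cannot bound the size")

    low = float(distinct)        # the network has at least this many nodes
    high = 2.0 * low
    # Grow the upper bound until it would yield at least `distinct`.
    while expected_distinct(high, samples) < distinct:
        high *= 2.0

    # expected_distinct increases with network size, so bisection works.
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if expected_distinct(mid, samples) < distinct:
            low = mid
        else:
            high = mid
    return high
```

For example, `estimate_network_size(distinct, samples)` would be called with the number of distinct identifiers and the total number of identifier results from the last hour to get an instantaneous estimate.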
Nodes still contribute to the network if they are online regularly - they need not be online all the time. The effective network size attempts to account for this by using the same estimation technique as the instantaneous size, but with only those identifiers which were seen both in the last period of time and the one before that. This is problematic: one cannot depend on a response from a node during a time period even if it is online. A more accurate estimate could involve the included uptime percentage.
Datastore size probe results return the approximate amount of disk space a node has reserved to store data for the network. After excluding outliers, taking the mean and multiplying by the (weekly) effective network size gives an estimate of the total amount of disk space used for datastores throughout the entire network. If outliers are not excluded, extreme (seemingly impossible) values lead to sudden jumps in the estimate. Note that due to block storage redundancy the usable storage capacity is less, but how much less is guesswork, so the total disk space is the easier figure to estimate. The usable capacity might be on the order of 1/12 of the total.
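The datastore calculation can be sketched like this. The page does not say how outliers are identified, so the rule below (discard reports larger than a multiple of the median) is purely an assumption for illustration, as are the function name `estimate_total_datastore` and the `max_ratio` parameter.

```python
import statistics


def estimate_total_datastore(reported_sizes, effective_network_size,
                             max_ratio=10.0):
    """Mean reported datastore size, outliers excluded, scaled up to the
    effective network size.

    Assumption: a report counts as an outlier when it exceeds `max_ratio`
    times the median. The real exclusion rule is not stated on this page.
    """
    if not reported_sizes:
        raise ValueError("no datastore size reports")
    cutoff = max_ratio * statistics.median(reported_sizes)
    kept = [size for size in reported_sizes if size <= cutoff]
    return statistics.mean(kept) * effective_network_size
```

With reports of 10, 20, 20, 30, 20 and one implausible 1000000 (in whatever unit the probe reports) and an effective size of 100 nodes, the implausible value is dropped and the estimate is the mean of the rest times 100.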
Refused responses mean that a node opted not to respond with the requested information.
Errors are:
- Disconnected: a node being waited on for a response - not necessarily the endpoint - disconnected.
- Overload: a node could not accept the probe request because its probe DoS protection had tripped.
- Timeout: timed out while waiting for a response.
- Unknown Error: an error occurred, but the error code was not recognized.
- Unrecognized Type: a remote node did not recognize the requested probe type.
- Cannot Forward: a remote node understood the request but failed to forward it to another node.
Link length, peer count, bulk reject percentages, and uptime are from the last day of results. All peer counts above {0.histogramMax} count towards {0.histogramMax}. Reported uptime can exceed 100% due to the added random noise. Bulk reject percentages are restricted to between 0% and 100%.
Uptime percentage is weighted toward low-uptime reports to estimate network percentage from percent of reports. This is because high-uptime nodes are more often online and therefore available to report high uptime. The weighting used is percent reports / (percent uptime + 10).
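As a rough sketch of that weighting: each uptime bin's share of reports is divided by (uptime + 10). Rescaling the results so they sum to 100% is an assumption made here so the output reads as a percentage of the network. The function name and the dictionary layout are hypothetical.

```python
def weight_uptime_reports(report_percent_by_uptime):
    """Map {uptime percent: percent of reports} to an estimated
    {uptime percent: percent of network}.

    Weighting from the text: percent reports / (percent uptime + 10).
    Assumption: results are rescaled to sum to 100, and every uptime
    value is greater than -10 so the divisor stays positive.
    """
    weighted = {
        uptime: reports / (uptime + 10.0)
        for uptime, reports in report_percent_by_uptime.items()
    }
    total = sum(weighted.values())
    if total == 0:
        return {uptime: 0.0 for uptime in weighted}
    return {uptime: 100.0 * w / total for uptime, w in weighted.items()}
```

If half of the reports show 10% uptime and half show 90%, the weights are 50/20 and 50/100, so the low-uptime group is estimated to make up about five sixths of the network.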
The reject percentages for the bulk queue (as opposed to the realtime queue) are an indicator of network health. My understanding is that a node will reject a request if it does not have sufficient bandwidth available to take on the commitment. This would mean that high reject percentages might indicate low bandwidth limits and problems with routing. I am currently collecting information on bandwidth limits, but I am not yet plotting that information.
Estimate network percentage for uptime reports by weighting lower uptime. Thanks ArneBab!
Add ideal and uniformly random link length distributions to the link length distribution plot for comparison.
An Internet connection outage for most of the day was presumably prompted by the impractically large amounts of snow.
An ice storm prompted a 3-day power outage.
Change error plot to stacked area for clearer total. Decrease HTL in an attempt to lower overload rate and get a better impression of the network.
Port backend to PostgreSQL and bring the site back online after integrating backlog from the SQLite version. Thanks to RhodiumToad for extensive help! Add sample size label to plots. The bulk reject sample size is likely to be around 2% low. (Only the number of results with data for each type, not the number of results in all, is currently visible from the plotting layer.) Reduce non-RRD plot time span to a day - the probe rate is high enough that it's enough information. Estimate disk space dedicated to datastore instead of store capacity. This excludes outliers, so a plot showing the distribution of those outliers would be useful, but that's not yet implemented.
The SQLite backend had serious problems around May 10th or 20th and stopped storing results or answering queries. I think this was my fault in mangling the database file, but there were enough other annoyances with SQLite being untyped and increasingly slow that moving still seemed worthwhile.
Fix typos.
Backend changes and refactoring.
There was initially a regression where no results were committed.
Change bulk reject percentages plot to log scale.
Add bulk reject percentages plot.
Fix plotting weekly size estimate as daily. Change longest time plots to cover the last year instead of all data. Plot uptime distribution in a histogram for readability. Fix histogram capping.
More database backend improvements. (Thanks Eleriseth!)
Plot each hourly daily size estimate instead of averaging them over 24 hours. (Thanks TheSeeker!)
Add 7-day uptime plot.
Many backend improvements. (Thanks Eleriseth!) Fix Infocalypse repo. (Thanks djk!)
Implement first of many backend improvements. (Thanks Eleriseth!)
Add plots of errors and refused responses. Add daily effective network size. (Thanks ArneBab!) Fix non-relative links in the footer being relative. (Thanks SeekingFor!)
Add plots of the past week and month of network size and store capacity. Click for a larger version.
Correct September 3rd changelog entry. Improve HTML output to meet XHTML 1.1.
Link length and peer count plots:
- Fix order of magnitude error on percentage axis.
- Percentage is of reports, not nodes.
Fix typos and improve wording.
Initial release.
- Plots of network and usable store size estimates for all available data.
- A download of the network and usable store size estimate RRD.
- Plots of the past 7 days of link length and peer count distribution.
Generated [GENERATED-DATE] by operhiem1 using pyProbe. Web mirror is here. The RRD is here. The source code is in this Infocalypse repository, or on GitHub.