Miscellaneous improvements (#3)
Added required dependencies in the setup script
Improved logging during dataset creation
Added README.md file

Change-Id: I67e98938f3660656d2ced916f258dfe5d475ecd3
atsag authored Jul 11, 2024
1 parent 2915aca commit 0c14d2f
Showing 53 changed files with 64 additions and 22 deletions.
Empty file modified .gitignore
100644 → 100755
Empty file.
2 changes: 0 additions & 2 deletions .hadolint.yaml

This file was deleted.

Empty file modified .yamllint
100644 → 100755
Empty file.
Empty file modified LICENSE
100644 → 100755
Empty file.
Empty file modified charts/nebulous-exponential-smoothing-predictor/.helmignore
100644 → 100755
Empty file.
Empty file modified charts/nebulous-exponential-smoothing-predictor/Chart.yaml
100644 → 100755
Empty file.
Empty file modified charts/nebulous-exponential-smoothing-predictor/values.yaml
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/Dockerfile
100644 → 100755
Empty file.
37 changes: 37 additions & 0 deletions exponential-smoothing-predictor/README.md
@@ -0,0 +1,37 @@
# Exponential Smoothing predictor

## Introduction

The exponential smoothing predictor is one of the predictors that will be available to the NebulOuS platform for forecasting. Along with the other prediction methods, it provides input to the Prediction Orchestrator, which aggregates the individual forecasts and forwards the finalized predictions.

It follows the predictions-generation process outlined in the relevant NebulOuS wiki page:

https://github.com/eu-nebulous/nebulous/wiki/5.3-Predictions-Generation-Process

The exponential smoothing forecaster uses the Holt-Winters R library to generate predictions, based on data stored by the monitoring data persistor module (https://github.com/eu-nebulous/monitoring-data-persistor). The data is resampled so as not to place too much load on the forecasting component. Based on the input events, the forecaster generates a number of forward predictions which can then be used by the Prediction Orchestrator.
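
The production forecaster delegates the actual model fitting to an R script, so the following is only an illustrative sketch of the underlying technique in Python; the `statsmodels` usage and the sample data are assumptions, not part of this component:

```python
# Illustrative sketch of Holt-Winters style exponential smoothing (not the component's R code).
# Assumes an evenly spaced, resampled series of metric observations.
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Hypothetical resampled cpu_usage values, one observation every 10 seconds
cpu_usage = pd.Series([42.0, 44.5, 41.2, 47.8, 50.1, 49.3, 52.6, 51.0, 55.4, 54.2])

# Fit an additive-trend exponential smoothing model and produce 5 forward predictions,
# mirroring number_of_forward_predictions in the start_forecasting payload shown below.
model = ExponentialSmoothing(cpu_usage, trend="add", seasonal=None).fit()
print(model.forecast(5))
```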

## Component outline and use

Before starting the component, please ensure that there is at least one recent data point (where "recent" means no older than the expected interval between the current time and the time of the first prediction), and at least one data point as old as three times that interval.
To illustrate, assume that the current time is 1705046000 and the first prediction will happen at 1705046500. Then, one recent datapoint is needed (not older than 1705045500) and at least one older datapoint is needed (not more recent than 1705044500).
Usually, a couple of minutes of recent data is enough for predictions a few seconds ahead (please scale accordingly, depending on your scenario).
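
As a quick sanity check of these bounds, the cutoffs from the example above can be computed as follows (a throwaway sketch; the variable names are illustrative only):

```python
# Recency bounds for the example timestamps given above.
current_time = 1705046000
first_prediction_time = 1705046500

horizon = first_prediction_time - current_time  # 500 seconds
recent_cutoff = current_time - horizon          # 1705045500: the "recent" datapoint must not be older than this
oldest_cutoff = current_time - 3 * horizon      # 1705044500: at least one datapoint must not be more recent than this
print(recent_cutoff, oldest_cutoff)
```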

The component initially waits for a `start_forecasting` event, sent to the `eu.nebulouscloud.forecasting.start_forecasting.exponentialsmoothing` topic.
An example payload for this event follows:

```json
{
  "name": "_Application1",
  "metrics": ["cpu_usage"],
  "timestamp": 1705046535,
  "epoch_start": 1705046500,
  "number_of_forward_predictions": 5,
  "prediction_horizon": 10
}
```

Once this event is received, the component subscribes to the appropriate metric topics and publishes predicted values as required (in the example above, 5 predictions every 10 seconds for the `cpu_usage` metric).
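
For orientation only, publishing this event from another component might look roughly like the sketch below, using `python-qpid-proton` (one of the dependencies added in this commit). The broker URL and the way the topic maps to an AMQP address are assumptions, not something this repository defines:

```python
# Hypothetical sketch of sending the start_forecasting event over AMQP with python-qpid-proton.
# Broker URL and address mapping are placeholders.
import json

from proton import Message
from proton.handlers import MessagingHandler
from proton.reactor import Container


class StartForecastingSender(MessagingHandler):
    def __init__(self, url, address, payload):
        super().__init__()
        self.url = url
        self.address = address
        self.payload = payload

    def on_start(self, event):
        connection = event.container.connect(self.url)
        event.container.create_sender(connection, self.address)

    def on_sendable(self, event):
        # Send the JSON payload once, then shut the connection down.
        event.sender.send(Message(body=json.dumps(self.payload)))
        event.sender.close()
        event.connection.close()


payload = {
    "name": "_Application1",
    "metrics": ["cpu_usage"],
    "timestamp": 1705046535,
    "epoch_start": 1705046500,
    "number_of_forward_predictions": 5,
    "prediction_horizon": 10,
}

Container(StartForecastingSender(
    "localhost:5672",  # placeholder broker URL
    "eu.nebulouscloud.forecasting.start_forecasting.exponentialsmoothing",
    payload,
)).run()
```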

Optionally, and preferably before the `start_forecasting` event, the component can receive a `metric_list` event (Event type 3 in the SLO Severity-based Violation Detector interface). Using the metric information contained there, it can constrain the predictions for each metric to an admissible range.

https://github.com/eu-nebulous/nebulous/wiki/5.2-The-SLO-Severity%E2%80%90based-Violation-Detector-Interface
Empty file modified exponential-smoothing-predictor/src/exn/__init__.py
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/exn/connector.py
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/exn/core/__init__.py
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/exn/core/consumer.py
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/exn/core/context.py
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/exn/core/handler.py
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/exn/core/link.py
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/exn/core/manager.py
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/exn/core/publisher.py
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/exn/core/state_publisher.py
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/exn/handler/__init__.py
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/exn/settings/__init__.py
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/exn/settings/base.py
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/r_predictors/__init__.py
100644 → 100755
Empty file.
2 changes: 1 addition & 1 deletion exponential-smoothing-predictor/src/r_predictors/prediction_configuration.properties
100644 → 100755
@@ -1,4 +1,4 @@
-#Thu May 16 12:31:21 UTC 2024
+#Thu Jul 11 09:34:44 UTC 2024
 APP_NAME=default_application
 METHOD=exponential_smoothing
 INFLUXDB_HOSTNAME=nebulous-influxdb
Empty file modified exponential-smoothing-predictor/src/r_predictors/r_commands.R
100644 → 100755
Empty file.
Empty file modified exponential-smoothing-predictor/src/requirements.txt
100644 → 100755
Empty file.
17 changes: 12 additions & 5 deletions exponential-smoothing-predictor/src/runtime/Predictor.py
100644 → 100755
@@ -109,7 +109,7 @@ def predict_attribute(application_state, attribute, configuration_file_location,

     process_output = run(command, shell=True, stdout=PIPE, stderr=PIPE, universal_newlines=True)
     if (process_output.stdout==""):
-        print_with_time("Empty output from R predictions - the error output is the following:")
+        logging.info("Empty output from R predictions - the error output is the following:")
         print(process_output.stderr) #There was an error during the calculation of the predicted value

     process_output_string_list = process_output.stdout.replace("[1] ", "").replace("\"", "").split()
@@ -139,8 +139,14 @@ def predict_attribute(application_state, attribute, configuration_file_location,
             prediction_valid = True
             print_with_time("The prediction for attribute " + attribute + " is " + str(prediction_value)+ " and the confidence interval is "+prediction_confidence_interval)
         else:
-            print_with_time("There was an error during the calculation of the predicted value for "+str(attribute)+", the error log follows")
-            print_with_time(process_output.stdout)
+            logging.info("There was an error during the calculation of the predicted value for "+str(attribute)+", the error log follows")
+            logging.info(process_output.stdout)
+            logging.info("\n")
+            logging.info("----------------------")
+            logging.info("Printing stderr")
+            logging.info("----------------------")
+            logging.info("\n")
+            logging.info(process_output.stderr)

     output_prediction = Prediction(prediction_value, prediction_confidence_interval,prediction_valid,prediction_mae,prediction_mse,prediction_mape,prediction_smape)
     return output_prediction
@@ -454,8 +460,9 @@ def main():

     #Change to the appropriate directory in order i) To invoke the forecasting script appropriately and ii) To store the monitoring data necessary for predictions
     from sys import platform
-    if platform == "win32":
-        os.chdir("exponential-smoothing-predictor/src/r_predictors")
+    if platform == "win32" or bool(os.environ["TEST_RUN"]):
+        print(os.listdir("."))
+        os.chdir("../r_predictors")
     # linux
     elif platform == "linux" or platform == "linux2":
         os.chdir("/home/r_predictions")
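
For context, the R-invocation and error-logging pattern this file uses boils down to the following simplified sketch (the command and its arguments are illustrative, not the actual file contents):

```python
# Simplified sketch of invoking the R forecasting script and logging failures.
import logging
from subprocess import PIPE, run

# Illustrative command; the real code assembles the script path and arguments from its configuration.
command = "Rscript r_commands.R prediction_configuration.properties cpu_usage 1705046500"
process_output = run(command, shell=True, stdout=PIPE, stderr=PIPE, universal_newlines=True)

if process_output.stdout == "":
    # The R script produced no forecast; surface its error output through the logger.
    logging.info("Empty output from R predictions - the error output is the following:")
    logging.info(process_output.stderr)
else:
    logging.info("R output was: " + process_output.stdout)
```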
Empty file modified exponential-smoothing-predictor/src/runtime/__init__.py
100644 → 100755
Empty file.
20 changes: 7 additions & 13 deletions exponential-smoothing-predictor/src/runtime/operational_status/ApplicationState.py
100644 → 100755
@@ -77,30 +77,24 @@ def update_monitoring_data(self):
         Utilities.print_with_time("Starting dataset creation process...")

         try:
-            """
-            Deprecated functionality to retrieve dataset creation details. Relevant functionality moved inside the load configuration method
-            influxdb_hostname = os.environ.get("INFLUXDB_HOSTNAME","localhost")
-            influxdb_port = int(os.environ.get("INFLUXDB_PORT","8086"))
-            influxdb_username = os.environ.get("INFLUXDB_USERNAME","morphemic")
-            influxdb_password = os.environ.get("INFLUXDB_PASSWORD","password")
-            influxdb_dbname = os.environ.get("INFLUXDB_DBNAME","morphemic")
-            influxdb_org = os.environ.get("INFLUXDB_ORG","morphemic")
-            application_name = "default_application"
-            """
             for metric_name in self.metrics_to_predict:
                 time_interval_to_get_data_for = str(EsPredictorState.number_of_days_to_use_data_from) + "d"
                 print_data_from_db = True
-                query_string = 'from(bucket: "'+self.influxdb_bucket+'") |> range(start:-'+time_interval_to_get_data_for+') |> filter(fn: (r) => r["_measurement"] == "'+metric_name+'")'
+                query_string = 'from (bucket: "'+self.influxdb_bucket+'") |> range(start:-'+time_interval_to_get_data_for+') |> filter(fn: (r) => r["_measurement"] == "'+metric_name+'")'
                 influx_connector = InfluxDBConnector()
                 logging.info("performing query for application with bucket "+str(self.influxdb_bucket))
                 logging.info("The body of the query is "+query_string)
                 logging.info("The configuration of the client is "+Utilities.get_fields_and_values(influx_connector))
                 current_time = time.time()
                 result = influx_connector.client.query_api().query(query_string, EsPredictorState.influxdb_organization)
                 elapsed_time = time.time()-current_time
-                logging.info("performed query, it took "+str(elapsed_time) + " seconds")
+                prediction_dataset_filename = self.get_prediction_data_filename(EsPredictorState.configuration_file_location, metric_name)
+                if len(result)>0:
+                    logging.info(f"Performed query to the database, it took "+str(elapsed_time) + f" seconds to receive {len(result[0].records)} entries (from the first and possibly only table returned). Now logging to {prediction_dataset_filename}")
+                else:
+                    logging.info("No records returned from database")
                 #print(result.to_values())
-                with open(self.get_prediction_data_filename(EsPredictorState.configuration_file_location, metric_name), 'w') as file:
+                with open(prediction_dataset_filename, 'w') as file:
                     for table in result:
                         #print header row
                         file.write("Timestamp,ems_time,"+metric_name+"\r\n")
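
The dataset-creation step that this hunk instruments amounts to running a Flux query against InfluxDB and dumping the returned records to a CSV file. A standalone sketch of that idea follows; the connection parameters and the CSV layout beyond the header row are placeholders, not the component's configuration:

```python
# Standalone sketch of the query-and-dump pattern used during dataset creation.
# URL, token, org and bucket are placeholders.
from influxdb_client import InfluxDBClient

bucket = "my_application_bucket"
metric_name = "cpu_usage"
query_string = (
    f'from(bucket: "{bucket}") '
    f'|> range(start: -14d) '
    f'|> filter(fn: (r) => r["_measurement"] == "{metric_name}")'
)

with InfluxDBClient(url="http://nebulous-influxdb:8086", token="my-token", org="my-org") as client:
    tables = client.query_api().query(query_string, org="my-org")

with open(f"{metric_name}.csv", "w") as csv_file:
    csv_file.write(f"Timestamp,ems_time,{metric_name}\r\n")  # header row, as in the diff above
    for table in tables:
        for record in table.records:
            timestamp = int(record.get_time().timestamp())
            csv_file.write(f"{timestamp},{timestamp},{record.get_value()}\r\n")
```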
8 changes: 7 additions & 1 deletion exponential-smoothing-predictor/src/setup.py
100644 → 100755
@@ -35,7 +35,13 @@
     # Dependent packages (distributions)
     install_requires=[
         "python-slugify",
-        "jproperties"
+        "jproperties",
+        "requests",
+        "numpy",
+        "python-qpid-proton",
+        "influxdb-client",
+        "python-dotenv",
+        "python-dateutil"
     ],
     #package_dir={'': '.'},
     entry_points={
Empty file modified noxfile.py
100644 → 100755
Empty file.
