This repository has been archived by the owner on Nov 30, 2022. It is now read-only.

Analyse runtime continuous queries #783

Open
Alexander-Dubrawski opened this issue Nov 16, 2020 · 0 comments

At the moment we deliberately analyze only results that are older than 3 to 4 seconds. We should investigate how long the continuous query takes to run and whether 3 to 4 seconds is the optimal choice.

Technical implementation

There are two reasons why we can't get the live data immediately, and both lie in the backend. First, we are using continuous queries in InfluxDB (https://docs.influxdata.com/influxdb/v1.8/query_language/continuous_queries/).

https://github.com/hyrise/Cockpit/blob/F/Fix_diconnected_pluginlogs/hyrisecockpit/database_manager/database.py#L126-#L213
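For illustration only, here is a minimal sketch of what an InfluxDB 1.x continuous query statement with a one-second resample interval over a five-second window could look like. It is not copied from the linked code: the `latency` field and target measurement are hypothetical placeholders, and `build_latency_cq` is an invented helper.

```python
def build_latency_cq(database: str, every: str = "1s", window: str = "5s") -> str:
    """Return an InfluxQL 1.x CREATE CONTINUOUS QUERY statement as a string.

    Assumption: the source measurement has a numeric field called "latency"
    and the result is written to a measurement called "latency". Both names
    are placeholders, not taken from the Cockpit code base.
    """
    return (
        f'CREATE CONTINUOUS QUERY "latency_continuous_query" ON "{database}" '
        # RESAMPLE EVERY controls how often the query runs,
        # FOR controls how far back each run looks.
        f"RESAMPLE EVERY {every} FOR {window} "
        "BEGIN "
        'SELECT mean("latency") AS "latency" INTO "latency" '
        f'FROM "successful_queries" GROUP BY time({every}) '
        "END"
    )


if __name__ == "__main__":
    print(build_latency_cq("example_database"))
```

Because every run recomputes the whole window, a value for a given second may only settle a few seconds after that second has passed.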

For example, the latency_continuous_query runs every second and calculates the latency for the last five timestamps. This calculation takes some time, so we can't get the live data. Second, we have the same problem with other metrics, for example the detailed query information (https://github.com/hyrise/Cockpit/blob/F/Fix_diconnected_pluginlogs/hyrisecockpit/api/app/metric/service.py#L88-#L121).
If a worker writes the detailed query information to the successful_queries table in InfluxDB, we may try to read it at time t before the worker has finished writing it. That's why in this case we are using an offset of three seconds. This is kind of a magic number and not an ideal solution.
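A minimal sketch of one way such an offset could be applied, assuming nanosecond timestamps: the end of the requested range is clamped so that it never comes closer to the current time than the offset. The function name `offset_window` and its parameters are hypothetical and not taken from the linked service code.

```python
import time
from typing import Optional, Tuple

NANOSECONDS_PER_SECOND = 1_000_000_000


def offset_window(
    start_ns: int,
    end_ns: int,
    offset_s: int = 3,  # the "magic number" discussed above
    now_ns: Optional[int] = None,
) -> Tuple[int, int]:
    """Clamp the end of a time range to at most (now - offset).

    Ranges that already end before the safe point are returned unchanged.
    """
    if now_ns is None:
        now_ns = time.time_ns()
    latest_safe_ns = now_ns - offset_s * NANOSECONDS_PER_SECOND
    safe_end_ns = min(end_ns, latest_safe_ns)
    # Never return an end that lies before the start.
    return start_ns, max(start_ns, safe_end_ns)
```

Keeping the offset as a parameter makes it easy to replace the fixed value with a measured one later.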

Inside the frontend, we are sending a request with a time range from 10 seconds in the past (I'm not 100% sure about the exact number of seconds; I need to find the corresponding code in the frontend) to the current time. The backend then returns a list with entries for all timestamps. Since it doesn't have data for every timestamp in the requested time range (often not for the last three seconds or so), it returns empty entries for those. The graph then plots these results, which leaves it not completely filled.
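The effect can be reproduced with a small sketch. It assumes one entry per second and uses `None` for an empty entry; `fill_time_range` and `trailing_gap` are hypothetical helpers, and the real response format may differ.

```python
from typing import Dict, List, Optional


def fill_time_range(
    points: Dict[int, float], start_s: int, end_s: int
) -> List[Dict[str, Optional[float]]]:
    """Return one entry per second in [start_s, end_s); missing data becomes None."""
    return [
        {"timestamp": ts, "value": points.get(ts)}
        for ts in range(start_s, end_s)
    ]


def trailing_gap(entries: List[Dict[str, Optional[float]]]) -> int:
    """Count the empty entries at the end of the list."""
    count = 0
    for entry in reversed(entries):
        if entry["value"] is not None:
            break
        count += 1
    return count


if __name__ == "__main__":
    # Data exists for seconds 0..6 only; seconds 7..9 are not available yet.
    available = {ts: float(ts) for ts in range(7)}
    entries = fill_time_range(available, 0, 10)
    print(trailing_gap(entries))
```

Logging the trailing gap for each request would show how much of the graph stays unfilled in practice.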

So, all in all, the problem is that we can't tell exactly when the data becomes available (and with the continuous queries it takes longer to get the live data than with the plugin log).
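One possible way to replace the magic number with a measured value, sketched under assumptions: collect, for a sample of data points, the delay between a point's timestamp and the moment it first became readable, then pick the smallest whole-second offset that covers most of those delays. How the delays are collected is left open here; `suggest_offset` is a hypothetical helper.

```python
import math
from typing import Sequence


def suggest_offset(delays_s: Sequence[float], quantile: float = 0.99) -> int:
    """Return the smallest whole-second offset covering `quantile` of the delays.

    Each delay is assumed to be (time the point became readable) minus
    (timestamp of the point), in seconds.
    """
    if not delays_s:
        raise ValueError("at least one observed delay is required")
    if not 0.0 < quantile <= 1.0:
        raise ValueError("quantile must be in (0, 1]")
    ordered = sorted(delays_s)
    # Nearest-rank quantile: index of the value that covers the requested share.
    index = max(0, min(len(ordered) - 1, math.ceil(quantile * len(ordered)) - 1))
    return math.ceil(ordered[index])


if __name__ == "__main__":
    observed = [0.8, 1.2, 1.9, 2.4, 2.6]  # made-up sample values
    print(suggest_offset(observed))
```

A lower quantile gives fresher data at the cost of more empty entries, so the choice is a trade-off rather than a single correct value.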
