Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libsql-server,admin: add top-k queries to stats #486

Merged
merged 2 commits into from
Oct 24, 2023

Conversation

psarna
Copy link
Collaborator

@psarna psarna commented Oct 20, 2023

Top 10 most expensive queries are now stored and available
in the admin api. The ranking is kept in-memory only.

$ curl -s http://localhost:8081/v1/namespaces/default/stats | jq '.top_queries' | jtbl
╒══════════════════════════════════════════════════════════╤════════════════╤═════════════╕
│ query                                                    │   rows_written │   rows_read │
╞══════════════════════════════════════════════════════════╪════════════════╪═════════════╡
│ INSERT OR IGNORE INTO coordinates VALUES (?, ?, ?);      │              0 │           1 │
├──────────────────────────────────────────────────────────┼────────────────┼─────────────┤
│ INSERT OR IGNORE INTO counter VALUES (?, ?, 0);          │              0 │           1 │
├──────────────────────────────────────────────────────────┼────────────────┼─────────────┤
│ SELECT 1;                                                │              0 │           1 │
├──────────────────────────────────────────────────────────┼────────────────┼─────────────┤
│ INSERT OR IGNORE INTO counter VALUES (?, ?, 0);          │              1 │           0 │
├──────────────────────────────────────────────────────────┼────────────────┼─────────────┤
│ UPDATE counter SET value = value + 1 WHERE country = ? A │              1 │           1 │
│ ND city = ?;                                             │                │             │
├──────────────────────────────────────────────────────────┼────────────────┼─────────────┤
│ INSERT OR IGNORE INTO coordinates VALUES (?, ?, ?);      │              2 │           0 │
├──────────────────────────────────────────────────────────┼────────────────┼─────────────┤
│ SELECT * FROM counter;                                   │              0 │           3 │
├──────────────────────────────────────────────────────────┼────────────────┼─────────────┤
│ CREATE TABLE IF NOT EXISTS counter(country TEXT, city TE │              2 │           1 │
│ XT, value, PRIMARY KEY(country, city)) WITHOUT ROWID     │                │             │
├──────────────────────────────────────────────────────────┼────────────────┼─────────────┤
│ SELECT * FROM counter;                                   │              0 │           4 │
├──────────────────────────────────────────────────────────┼────────────────┼─────────────┤
│ CREATE TABLE IF NOT EXISTS coordinates(lat INT, long INT │              3 │           1 │
│ , airport TEXT, PRIMARY KEY (lat, long))                 │                │             │
╘══════════════════════════════════════════════════════════╧════════════════╧═════════════╛

@psarna
Copy link
Collaborator Author

psarna commented Oct 20, 2023

(pushed the changes before lunch, but I should still test and clean up a little more)

@psarna psarna marked this pull request as draft October 20, 2023 10:57
@psarna psarna marked this pull request as ready for review October 20, 2023 11:26
tantaman pushed a commit to vlcn-io/libsql that referenced this pull request Oct 20, 2023
@@ -650,6 +650,12 @@ impl<W: WalHook> Connection<W> {
};
self.stats.inc_rows_read(rows_read as u64);
self.stats.inc_rows_written(rows_written as u64);
let weight = (rows_read + rows_written) as i64;
if self.stats.qualifies_as_top_query(weight) {
let query = stmt.expanded_sql().unwrap_or("<unknown>".into());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it makes sense to use the query with bound parameters, maybe it makes more sense to register with params unbound?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarinPostma are you aware of any API in rusqlite that gives you raw SQL string with parameters unbound?

I'm fine with both, although expanded_sql might give you some more insight into the problematic query, e.g. the exact key that contains too much data, and so on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦 before we make it a rusqlite statement, of course. Good idea, will do

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k, fixed

@@ -22,6 +43,11 @@ pub struct Stats {
write_requests_delegated: AtomicU64,
#[serde(default)]
current_frame_no: AtomicU64,
// Lowest value in currently stored top queries
#[serde(default)]
top_query_threshold: AtomicI64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we actually need that, since it's always top_queries.first()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need that, but I want to avoid taking a lock for top_queries to peek first(), if a query doesn't qualify in the first place. It's just fast path optimization, assuming that 99.999% of the queries are short and don't need to be recorded in the top_queries ranking.

Top 10 most expensive queries are now stored and available
in the admin api. The ranking is kept in-memory only.

```
$ curl -s http://localhost:8081/v1/namespaces/default/stats | jq '.top_queries[]'
{
  "rows_written": 0,
  "rows_read": 1,
  "query": "EXPLAIN SELECT * FROM t;"
}
{
  "rows_written": 0,
  "rows_read": 1,
  "query": "SELECT 1;"
}
{
  "rows_written": 2,
  "rows_read": 0,
  "query": "INSERT INTO t VALUES (1, 'a'), (2, 'bb');"
}
{
  "rows_written": 2,
  "rows_read": 1,
  "query": "create table t(id, v)"
}
{
  "rows_written": 0,
  "rows_read": 4,
  "query": "SELECT * FROM t;"
}
{
  "rows_written": 2,
  "rows_read": 4,
  "query": "INSERT INTO t SELECT * FROM t;"
}
```
... so that we don't distinguish between different bound values
when tracking queries, and instead only gather information
about the raw query string that may contain wildcards, like:
 INSERT INTO t VALUES (?, ?, ?);
@psarna psarna merged commit e15da4e into tursodatabase:main Oct 24, 2023
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants