Kibana Developer Experience #70733

Open
15 of 41 tasks
mshustov opened this issue Jul 3, 2020 · 9 comments
Labels
impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. Meta Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Team:Operations Team label for Operations Team

Comments

@mshustov
Contributor

mshustov commented Jul 3, 2020

According to the platform survey, the current state of dev tooling is the main source of frustration for Kibana developers.
This issue is created:

  • to collect ideas to improve the day-to-day work experience for Kibana and community developers.
  • to identify owners for each improvement area
  • to spot low hanging fruits

Debugging

Long feedback loop during development

After switching branches and installing all the deps, the developer has to wait ~5 min. for the Kibana server to be ready to respond after yarn start. Changes to plugin code require less compilation overhead, but the waiting time is still long enough to lose developer focus:

  • for server-side code: wait for @kbn/optimizer to rebuild the plugin, then wait for Kibana to stop and restart the optimizer and server workers. Waiting time ~1 min!
  • for client-side code: wait for @kbn/optimizer to rebuild the plugin (~10 sec. in the case of Sass) and manually reload the page to see the changes take effect.

Solutions:

High CPU and memory consumption

It's expected to be high during the build phase due to the @kbn/optimizer architecture. However, we have also started receiving complaints from customers and support about this problem at runtime since the migration to the Kibana Platform (KP) was completed.

Solutions:

Slow IDE / type-checking

VSCode is almost unusable. Go to the definition... within the x-pack folder might take several minutes(!). Switching between branches crashes the TS server.
WebStorm works a bit better but can suddenly start consuming 500-600% CPU. Combined with the CPU consumption of Kibana in dev mode, this freezes the laptop for minutes.
node scripts/type_check takes ~5 minutes to complete. Devs almost always have to run it in 2 passes: in a focused mode with --project, and on the whole project.

Solutions:

Hard to use nodejs inspector

Kibana spins off several instances in cluster mode during development, which makes it hard to debug with the built-in inspector or a built-in IDE debugger.

Solutions:

Chrome Dev Tools crashing

In dev mode DevTools > Performance > Record might stop working suddenly.

Slow js linting on the local laptop

Running eslint locally takes up to 1 hour. Most of the time is spent in the prettier/prettier plugin.

Solutions:

Testing #39085

Non-standard tooling usage

The use of non-standard wrappers with injected configuration makes the tooling hard to understand, even for those who are otherwise familiar with it. Additionally, this non-conventional approach prevents usage with editor extensions.

Additional details outlined in @smith's comment

Solutions:

High level fragmentation of testing tools

Kibana uses different test runners: jest, mocha, FTR, karma, cypress. All of them have different setup processes, different APIs, and different environment requirements. To make things more confusing, developers have to keep in mind how to run the same runner for different envs: OSS unit tests, OSS integration tests, x-pack unit tests, x-pack integration tests.

Solutions:

No clear library of testing utilities

We have different utilities to facilitate unit and integration tests in different parts of the code base. Many teams invent their own utilities due to the lack of a central testing utility library. This can lead to unnecessary effort, duplication, and even low-value and brittle tests.

Solutions:

  • Provide a single testing library that includes utilities for common patterns (eg. HTTP routes, SavedObject Migrations, Embeddables) in a well-known location
  • Eliminate and/or document poor testing practices or utilities that should be avoided

jest unit-tests are slow

Running unit tests locally can't be part of the everyday workflow when the x-pack tests take more than 30 min.

Solutions:

  • Debug perf problem to identify the bottleneck
  • Strict unit-test / integration-test separation. Fail a test if it takes too long; such tests should be converted into integration tests.
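
A minimal sketch of the "fail a test if it takes too long" idea, assuming per-file timings are already available as plain records. The TestTimingSketch shape, the findOverBudget helper, the file names, and the 5-second budget are all illustrative assumptions, not existing Kibana or Jest APIs:

```typescript
// Hypothetical record of how long one test file took to run.
interface TestTimingSketch {
  file: string;
  durationMs: number;
}

// Return the files whose run time exceeds the unit-test budget.
// These are the candidates to convert into integration tests.
function findOverBudget(timings: TestTimingSketch[], budgetMs: number): string[] {
  return timings
    .filter((timing) => timing.durationMs > budgetMs)
    .map((timing) => timing.file);
}

// Usage with made-up timings and an assumed 5-second budget.
const exampleTimings: TestTimingSketch[] = [
  { file: "plugin_a/server/routes.test.ts", durationMs: 120 },
  { file: "plugin_a/server/migrations.test.ts", durationMs: 9400 },
];
const overBudgetFiles = findOverBudget(exampleTimings, 5000);
// overBudgetFiles contains only "plugin_a/server/migrations.test.ts"
```

A CI step could fail the build whenever the returned list is non-empty, which would keep the unit-test suite within a predictable time budget.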

Lack of domain separation in testing

Developers cannot easily run all types of tests related only to their code ownership area.
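
One way to picture domain separation is a filter that keeps only the test files under the directories a team owns. This is a rough sketch: selectOwnedTests and the paths below are hypothetical, and a real setup would need to feed the result to each test runner.

```typescript
// Hypothetical helper: keep only the test files that live under one of the
// directories owned by a team. Paths are plain strings relative to the repo root.
function selectOwnedTests(testFiles: string[], ownedDirs: string[]): string[] {
  // Normalize every owned directory so that it ends with "/". This prevents
  // "plugins/apm" from also matching a sibling such as "plugins/apm_extra".
  const prefixes = ownedDirs.map((dir) =>
    dir.charAt(dir.length - 1) === "/" ? dir : dir + "/"
  );
  return testFiles.filter((file) =>
    prefixes.some((prefix) => file.indexOf(prefix) === 0)
  );
}

// Usage with made-up paths.
const ownedByExampleTeam = selectOwnedTests(
  [
    "x-pack/plugins/apm/server/a.test.ts",
    "x-pack/plugins/apm_extra/b.test.ts",
    "src/core/server/c.test.ts",
  ],
  ["x-pack/plugins/apm"]
);
// Only the first path is kept: "apm_extra" shares a prefix but is a different directory.
```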

Lack of docs for FTR

Lack of docs for FTR makes adding new tests hard. The only ways to understand how the whole thing works are to read the source code and to study examples. Both approaches require a lot of time, since FTR is written in JS and uses a highly dynamic API that is hard to type correctly.
FTR config is just an object. That's both the power and the problem of FTR: you have to study other config files to understand what options are supported.

Solutions:

  • provide types for FTR config (Improve FTR config type safety #69393)
  • provide an API to inspect FTR state (#98311)
  • migrate FTR and test suites to TS
  • provide recommendations on how to write tests, page-objects, assertions
  • provide testing recipes (how to perform a request to Kibana REST API, how to run Kibana with TLS, etc.)
  • document how to use a debugger on the server-side with FTR
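
To illustrate what typing the config could buy, here is a small sketch. Every field name below is hypothetical and not the real FTR config schema; the point is only that a declared shape lets both the compiler and a runtime check catch unsupported options, instead of requiring developers to study other config files.

```typescript
// Hypothetical config shape. These fields are assumptions for illustration,
// NOT the real FTR options.
interface FtrConfigSketch {
  testFiles: string[];
  serverArgs: string[];
  timeoutMs: number;
}

// The option names the sketch above supports.
const KNOWN_FTR_KEYS_SKETCH = ["testFiles", "serverArgs", "timeoutMs"];

// Runtime check for configs that are still written as untyped objects:
// report every key that is not a supported option.
function findUnknownKeys(rawConfig: { [key: string]: unknown }): string[] {
  return Object.keys(rawConfig).filter(
    (key) => KNOWN_FTR_KEYS_SKETCH.indexOf(key) === -1
  );
}

// A typed config: a misspelled field here would be rejected at compile time.
const typedConfigExample: FtrConfigSketch = {
  testFiles: ["./tests/index.ts"],
  serverArgs: [],
  timeoutMs: 60000,
};

// An untyped config with a typo ("testFile" instead of "testFiles").
const unknownKeysExample = findUnknownKeys({ testFile: "./tests/index.ts", timeoutMs: 60000 });
// unknownKeysExample contains only "testFile"
```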

FTR is slow

The same problem as for running Kibana in dev mode, plus the overhead of restarting ES and the slowness of Selenium. This is the main source of perf problems on CI.

Solutions:

FTR API is not standardized

FTR API is confusing and not focused enough.
https://github.com/elastic/kibana-team/issues/12#issuecomment-408163843
https://github.com/elastic/kibana-team/issues/12#issuecomment-408166942

Solutions:

  • reduce API surface

Slow CI turnaround time/flaky test/hard to debug test failures

Solutions:

Developer Documentation

Sparse documentation for both internal and external developers.

Solutions:

Kibana Platform performance

Solutions:

@mshustov mshustov added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Team:Operations Team label for Operations Team Meta labels Jul 3, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-platform (Team:Platform)

@elasticmachine
Contributor

Pinging @elastic/kibana-operations (Team:Operations)

@bhavyarm
Contributor

bhavyarm commented Jul 6, 2020

cc @elastic/kibana-qa

@smith
Contributor

smith commented Jul 6, 2020

One source of confusion when using the Kibana CLI tools is the custom scripts.

So we have:

  • scripts/type-check.js instead of tsc
  • scripts/jest.js instead of jest
  • yarn storybook instead of storybook-start

...and more. In some cases there's a script at the root and a script in x-pack.

These may or may not forward their arguments to the command they're wrapping, and usually have behavior that's hard to understand.

For example, when running scripts/type-check.js, the script always adds the --pretty flag. If you're running tsc, it will use --pretty by default but turn it off if you pipe the output to another program, unless you explicitly give it that flag. So, when you use our script, you get ANSI color characters in your output no matter what. The only way to figure this out is to read the code of the script and its dependencies.
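
As a rough sketch of the alternative, a wrapper could forward its arguments unchanged and only add --pretty when the output is an interactive terminal, mirroring what tsc does on its own. buildTscArgs is a hypothetical helper, not the code of the existing script; in a real Node script the second argument would presumably come from process.stdout.isTTY.

```typescript
// Hypothetical helper: compute the arguments a wrapper would pass on to tsc.
// User arguments are forwarded untouched. "--pretty" is appended only when
// stdout is an interactive terminal and the user has not passed it already.
function buildTscArgs(userArgs: string[], stdoutIsTTY: boolean): string[] {
  const userSetPretty = userArgs.some((arg) => arg === "--pretty");
  if (stdoutIsTTY && !userSetPretty) {
    return userArgs.concat(["--pretty"]);
  }
  // Piped output, or the user already decided: leave the arguments alone.
  return userArgs.slice();
}

// Interactive terminal: colors are enabled.
const argsForTerminal = buildTscArgs(["--noEmit"], true);
// Piped to another program: no ANSI color characters are forced.
const argsForPipe = buildTscArgs(["--noEmit"], false);
```

With this shape, piping the output to another program yields plain text unless the caller asks for color explicitly.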

The Jest script in x-pack will run for the entire x-pack "package" unless you pass --testPathPattern, but even then watching doesn't quite work right and it's a pain to set up with any tools like editor extensions.

In APM we've set up our own jest.config.js which imports settings from src/dev/jest but overrides them to run only for APM, so if you go to the APM directory and run npx jest, it runs Jest for that plugin. Configuring it to work with your editor's Jest extension is easy.
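
A sketch of that pattern, under stated assumptions: this is not the actual APM config, and createPluginJestConfig, JestConfigSketch, and the paths are made up for illustration. rootDir, roots, and testMatch are standard Jest options, and `<rootDir>` is Jest's own path token.

```typescript
// Minimal subset of Jest options used in this sketch.
interface JestConfigSketch {
  rootDir: string;
  roots: string[];
  testMatch: string[];
  [option: string]: unknown;
}

// Hypothetical helper: reuse every shared setting, but narrow `roots` so that
// Jest only looks for tests inside a single plugin directory.
function createPluginJestConfig(
  base: JestConfigSketch,
  pluginDir: string
): JestConfigSketch {
  return {
    ...base,
    roots: [`<rootDir>/${pluginDir}`],
  };
}

// Stand-in for the centrally maintained settings (illustrative values only).
const sharedConfigSketch: JestConfigSketch = {
  rootDir: "../../..",
  roots: ["<rootDir>/x-pack"],
  testMatch: ["**/*.test.{js,ts,tsx}"],
};

// In a plugin-level jest.config.js this object would be the exported config.
const apmConfigSketch = createPluginJestConfig(sharedConfigSketch, "x-pack/plugins/apm");
```

Because the result is an ordinary Jest config object, running jest from the plugin directory and pointing an editor extension at it need no custom wrapper.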

This "philosophy" seems to provide a better experience with better performance:

  • Have plugins use the familiar, well documented CLI tools to run commands
  • Use "native" configuration files for these tools, but provide centralized importable configuration that can handle the complexity of our environment.
  • Treat each plugin as a "package"

There is indeed a place for many of the scripts in the scripts directories, especially for CI tasks, but when there's a well known and well documented CLI tool available, using that instead has many benefits.

Many of our performance problems are related to the fact that x-pack is a "package" and none of the plugins in there are. There are good historical reasons for this, but at this point it seems that in x-pack we have all the bad parts of running a monorepo with none of the benefits of modern monorepo tooling like yarn workspaces, lerna, or rush. I have a feeling moving closer to this model would solve many problems.

@fbaligand
Contributor

Hi,

Concerning developer experience pain, I would like to say that since Kibana 7.8.0, I no longer have hot reload for core code.
I run yarn start, and if I change something in core code, I need to fully restart Kibana for my change to be applied. That's somewhat painful :(
Before Kibana 7.8.0, I didn't have this issue.

@tylersmalley
Contributor

@smith thanks for your feedback - I have opened a discuss issue (#72569) to hopefully move us towards standard Jest configuration while taking advantage of the multi-project support.

@rajat315315

Tried several times to start contributing but heavy CPU usage/ errors: "Elastic didn't load properly" turned me away! :(

@joshdover
Contributor

> Tried several times to start contributing but heavy CPU usage/ errors: "Elastic didn't load properly" turned me away! :(

Sorry to hear that, Rajat! We're focused on improving this area quite a bit over the next few months. As part of a major refactoring project, we will be able to remove many components that consume a significant amount of CPU and memory. Hope this helps solve this problem for you!

@tylersmalley tylersmalley added 1 and removed 1 labels Oct 11, 2021
@exalate-issue-sync exalate-issue-sync bot added impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:small Small Level of Effort labels Feb 16, 2022
@tylersmalley tylersmalley removed loe:small Small Level of Effort impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. EnableJiraSync labels Mar 16, 2022
@exalate-issue-sync exalate-issue-sync bot added the impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. label Mar 31, 2022
@tylersmalley
Contributor

This is in our backlog to figure out what is left here. We should be able to identify what is remaining and create issues for those and close this meta-issue.


8 participants