
Measuring AsyncAPI spec adoption #780

Open
derberg opened this issue Feb 20, 2022 · 52 comments
Labels
enhancement keep-open Prevents stale bot from closing it

Comments

@derberg (Member) commented Feb 20, 2022

Reason/Context

We do not know how many people use AsyncAPI. The most accurate number we could get is the number of AsyncAPI users who work with AsyncAPI documents. But how do we measure how many people out there have created or edited an AsyncAPI file?

The answer is a solution that includes:

  • SchemaStore
  • promoting the use of asyncapi in the filenames of documents created using the AsyncAPI spec
  • having a server where we expose all the JSON Schemas provided for AsyncAPI
  • storing info somewhere whenever a JSON Schema is fetched by users, so we can count it as "usage"

(diagram attachment: 20210528_114537)

Some more discussion -> https://asyncapi.slack.com/archives/C0230UAM6R3/p1622198311005900

Description

  1. Create a new endpoint in the server-api service that anyone can use to fetch AsyncAPI JSON Schema files of any version
  2. The JSON Schemas live in https://github.com/asyncapi/spec-json-schemas and can be used as a normal dependency
  3. Whenever a JSON Schema file is fetched by a user, that information should be stored somewhere. I propose Google Tag Manager, as we already have it for the website; we can send data there and then easily read it. I'm all ears if there is something better and still free
  4. Add an AsyncAPI config to SchemaStore, and have a configuration on the AsyncAPI side that always automatically opens a PR against SchemaStore to provide the location of the JSON Schema for each new AsyncAPI spec version
  5. Update docs and instructions for users on how to configure their IDE properly and how to name files. Update the official examples

If time is left, we need to expose the numbers somewhere: either embed a Google Analytics diagram somewhere on the AsyncAPI website, or at least have an API endpoint that exposes the latest numbers.
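A minimal sketch of how steps 1 and 3 could fit together, assuming a plain function rather than the real server-api routing; the schema table and the usage sink (`recordUsage`) are illustrative stand-ins, not actual project code:

```typescript
// Hypothetical sketch of steps 1 and 3: serve a schema by version and
// record each fetch as a "usage" event before returning the document.

type UsageSink = (event: { schemaVersion: string; fetchedAt: Date }) => void;

// Stand-in for the JSON Schemas published in asyncapi/spec-json-schemas.
const schemas: Record<string, object> = {
  "2.3.0": { $id: "http://asyncapi.com/definitions/2.3.0/asyncapi.json" },
};

function fetchSchema(version: string, recordUsage: UsageSink): object {
  const schema = schemas[version];
  if (!schema) throw new Error(`Unknown AsyncAPI version: ${version}`);
  // Step 3: count the fetch (in the proposal, this would go to an
  // analytics backend such as Google Tag Manager).
  recordUsage({ schemaVersion: version, fetchedAt: new Date() });
  return schema;
}

// Example: collect events in memory instead of a real analytics backend.
const events: string[] = [];
const schema = fetchSchema("2.3.0", (e) => events.push(e.schemaVersion));
console.log(events.length); // 1
```

The key property is that counting happens on the serving path, so every download is observed without the client having to opt in.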

For GSoC participants

  • you get to code TS in a publicly available service, and you can be sure your work will be consumed by thousands of people
  • you will learn automation with GitHub Actions
  • you will have a chance to learn how to integrate with different services, like the Google API, unless you find a better solution and a better API to use
  • you will learn in depth how autocompletion in IDEs is done with SchemaStore
@derberg derberg added enhancement gsoc This label should be used for issues or discussions related to ideas for Google Summer of Code labels Feb 20, 2022
@github-actions (bot)

Welcome to AsyncAPI. Thanks a lot for reporting your first issue. Please check out our contributors guide and the instructions about a basic recommended setup useful for opening a pull request.
Keep in mind there are also other channels you can use to interact with AsyncAPI community. For more details check out this issue.

@ritik307 (Contributor)

@derberg Sounds interesting ... I would like to take this issue as my GSOC'22 proposal.😊

@derberg (Member, Author) commented Mar 1, 2022

@ritik307 sounds awesome!

@smoya @BOLT04 @magicmatatjahu any objections to having this endpoint on server-api first? I personally think it's better to add it here, and then if we measure too much traffic, we can always split it into a separate microservice

@magicmatatjahu (Member)

No problem for me, but we have to remember that we also provide that project as a Docker image, so people will have that path too. We have to think about how to avoid unnecessary paths for people who use that project.

@BOLT04 (Member) commented Mar 1, 2022

any objections to having this endpoint on server-api first? I personally think it's better to add it here, and then if we measure too much traffic, we can always split it into a separate microservice

@derberg no problem for me 🙂, this is pretty cool!

No problem for me, but we have to remember that we also provide that project as a Docker image, so people will have that path too. We have to think about how to avoid unnecessary paths for people who use that project.

@magicmatatjahu I get what you're saying and if this new endpoint does in fact need to use external services (e.g. Google APIs), we would need new config/environment variables for API keys, etc. I propose we use feature flags to solve this. On our deployed version of the API the feature is on, but for local development it's not. If someone wants to try it out locally, they just have to configure the necessary values and turn the toggle on to start measuring spec adoption in their own environment 🙂
What do you think?
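BOLT04's toggle idea can be sketched as an environment-driven check; the variable names `MEASURE_ADOPTION` and `ANALYTICS_API_KEY` are hypothetical, not real server-api config:

```typescript
// Hypothetical feature-flag sketch: analytics is only active when the
// deployment explicitly enables it AND provides the required API key,
// so the Docker image and local development stay analytics-free.

function analyticsEnabled(env: Record<string, string | undefined>): boolean {
  return env.MEASURE_ADOPTION === "true" && Boolean(env.ANALYTICS_API_KEY);
}

// Local development / self-hosted Docker image: flag off by default.
console.log(analyticsEnabled({})); // false
// Deployed API: flag on, credentials configured.
console.log(analyticsEnabled({ MEASURE_ADOPTION: "true", ANALYTICS_API_KEY: "xyz" })); // true
```

Requiring both the flag and the key means a misconfigured deployment fails closed (no tracking) rather than erroring on a missing credential.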

@smoya (Member) commented Mar 1, 2022

I love the idea of adding our schemas to Schema Store. I didn't know about it until 5 minutes ago and I like it a lot 👍.

I want to add some feedback regarding the creation of a service for serving the schemas:

Serving static files such as JSON Schema files in a fast and reliable way is exactly why CDNs exist. Considering the possible amount of traffic this service will have, and the fact that it will keep growing over time (more users, more tooling, etc.), I would not advocate creating and maintaining this service ourselves.
And in fact, the good news is that we are already using a free CDN to serve our website: Netlify (which has multiple cloud providers).

I understand we want this service because we need those metrics (maybe there is another strong reason I missed, so please correct me).
Would it make sense to just investigate and ask/pay for their Analytics product?

There is also the following draft PR by @jonaslagoni, #502, that might make sense to check. It aims to serve AsyncAPI JSON Schema files from our website.

On the other hand, we could consider the same approach with any other CDN product that offers analytics, such as AWS S3, GCP Cloud Storage, etc. (asking for budget, etc.).

I would like to know your thoughts.

cc @derberg @ritik307 @magicmatatjahu @BOLT04

@derberg (Member, Author) commented Mar 2, 2022

Cool. I think the most important thing is that you support the idea. It is not set in stone that it has to be an endpoint here:

  • regarding @magicmatatjahu's concern about local usage and Docker: there are different options possible, like feature flags, or just an environment variable that enables analytics (so it won't work locally). Subject to discussion with the person who will work on it
  • regarding @smoya's point about maybe using Netlify Analytics in combination with a CDN: this is also one of the possible options. Some investigation definitely needs to be done first; this can be an outcome of this task. I personally prefer a CDN, I just overlooked the fact that Netlify might have some Analytics for it.

@magicmatatjahu @smoya @BOLT04 please keep in mind that we should leave as much as possible up to @ritik307 (if you still want to take this task for GSoC). You folks will turn into mentors; just guide @ritik307 on what needs to be checked and tried out to get the desired outcome.

@ritik307 (Contributor) commented Mar 2, 2022

@magicmatatjahu @smoya @BOLT04 please keep in mind that we should leave as much as possible up to @ritik307 (if you still want to take this task for GSoC). You folks will turn into mentors; just guide @ritik307 on what needs to be checked and tried out to get the desired outcome.

Sure @derberg, I would love to take this task for GSoC 😊 and it would be great if you guys mentored me. 😊

@magicmatatjahu (Member)

I think that hosting the files and adding metrics to ServerAPI itself will not be a problem. We have control over every part, so we won't have to use additional services. However, a CDN would be better in this case, and I am for this option!

@smoya (Member) commented Mar 18, 2022

I would like to retake this, especially after @derberg raised his concerns in #502 (comment).

There is something we should consider before moving forward with a custom solution based on our own service. Right now, we do not have services exposed openly for consumption at the frequency static JSON Schema files would be.
Exposing a service with such an important responsibility (it would serve our JSON Schema files!) should include a large battery of APM and infrastructure monitoring, and maybe in the future (emphasis on future) even some on-call rotation. It might seem far off today, but if our user adoption keeps growing as it does, it will become a thing.

With a CDN provided by a SaaS company, you remove all of those concerns.

Again, I know we want some metrics, but IMHO it is totally worth asking Netlify and, if it fulfills our goals, paying for the Analytics service if needed. I can tell you it is worth paying for a service rather than having to run your own highly available service.

@derberg (Member, Author) commented Mar 18, 2022

regarding @smoya's point about maybe using Netlify Analytics in combination with a CDN: this is also one of the possible options. Some investigation definitely needs to be done first; this can be an outcome of this task. I personally prefer a CDN, I just overlooked the fact that Netlify might have some Analytics for it.

This is definitely what I prefer since you mentioned Netlify Analytics.
Did you mean this https://docs.netlify.com/monitor-sites/analytics/ or something else?

@smoya (Member) commented Mar 18, 2022

regarding @smoya's point about maybe using Netlify Analytics in combination with a CDN: this is also one of the possible options. Some investigation definitely needs to be done first; this can be an outcome of this task. I personally prefer a CDN, I just overlooked the fact that Netlify might have some Analytics for it.

This is definitely what I prefer since you mentioned Netlify Analytics. Did you mean this https://docs.netlify.com/monitor-sites/analytics/ or something else?

Yup, this is the service I meant.

Netlify Analytics is available and ready, right in the dashboard, for any site you deploy to Netlify. It only costs $9/mo per site.
Source: https://www.netlify.com/products/analytics/

@derberg (Member, Author) commented Apr 5, 2022

some important info: #502 (comment)

@smoya (Member) commented Apr 13, 2022

some important info: asyncapi/website#502 (comment)

IMHO we should stay with the option to have all JSON files in server-api, which would work like a proxy to do analytics. It is up to the server-api maintainers to decide if it is ok to do it in server-api first and then, because of the load, split it later. Nevertheless, IMHO the JSON files should not be exposed directly on the website here, as we are looking for an opportunity to track adoption.

TL;DR: I still think we should avoid creating a new file server app. Instead, we should look for an alternative based on a SaaS provider. I'm suggesting some alternative ideas to the previous one. I'm happy to keep evolving this idea and also to put it into practice asap.

I understand the need to get such metrics and how simple it seems to build a file server with built-in metrics. However, I want to stay strong on this idea: we should avoid managing services on our own (at this time). Some of the reasons have been exposed already in (my) previous comments, but I'm going to list some of them here in a bit more detail.

AsyncAPI JSON Schema definitions are the most important pieces of software we provide to the community (IMHO). They are meant to be used by systems for parsing and validating AsyncAPI documents, and by services that use them at runtime for validating messages, among other use cases.
We do have a package for both NodeJS and Go projects that users can use to import those schemas into their projects; however, we don't for any other language, meaning tooling will need to fetch those files from the source at some point.

However, who are the users of those raw files, and how do they use them? I can imagine a few use cases:

  • Parsers written in other languages.
  • Tooling that needs to validate documents in real time, like IDEs.
  • Private services that need to validate messages.
  • Etc. <- And this is important. We do not have a clear idea of what the usage could be; therefore this initiative of collecting metrics :).

With this in mind, the following points are worth noting:

  • High Availability of those JSON Schema definitions is crucial.
    • We do not want to break any of those systems by having those files unavailable because of downtime in our services. Imagine a parser in PHP that needs those files to be available at build time, but for some reason they are not available. Regardless of whether using files from the source is good or bad practice, we should not expect people to always have a local copy of those files in their repositories/systems.
    • For that reason, monitoring becomes a vital need. To monitor our services, we should rely on a good monitoring system, preferably a SaaS like Datadog or New Relic.
    • This can eventually trigger the need to have people on an on-call rotation (long-term vision).
  • The maintainability of the services becomes a real thing.
    You create a service; you need to maintain it. That means bugs will need to be fixed.
    • More people are needed as the code base increases. The background here leans more toward highly available services, so we eventually need to find people who can maintain this as well.
  • Reliability and response times.
    • I expect most of the tools out there to fetch the AsyncAPI JSON Schema files at build time. However, in the case of developers working with AsyncAPI files, they could use IDE integrations with https://www.schemastore.org to get nice autocompletion, etc. See @derberg's issue Why would I use schema store.
    • Those files will be served from our own raw files (in our supposed service), so if it is down, or the response times are higher than expected, the developer experience will be degraded, partially or entirely. I guess schemastore.org has those files compiled, or at least a cache layer, both of which would mitigate this for a period of time.
    • Geographic distribution of the service also plays an important role. Downloading files stored in Washington (USA) from Mumbai (India) is not the same as downloading them from New York (USA). That's the reason CDNs exist.

Having said that, I'm proposing we stick with a SaaS-based solution from day one, one that lets us take care of only the very minimum: at most, collecting the metrics and processing them, but never serving the files.

We tried Netlify Analytics. Unfortunately, the metrics we want (hits on JSON Schema files) are not collected. Even though it is probably a matter of time before they support it, we don't have an ETA.

There are several other ways we can do this, and those are some of the ideas I have in mind:

Netlify Log Drains

Netlify Log Drains allow sending both traffic logs and function logs to an external service, such as New Relic, Datadog, S3... and also to our own service (which could be a Netlify Function as well).
Netlify sends those logs in batches in near real time, in JSON/NDJSON format. You can see the output of those logs here.
This is not available in all plans, but I'm sure the Netlify support team will be happy to enable it, especially now that we tried Analytics and it didn't fulfill our use case.

sequenceDiagram
    participant User
    participant asyncapi.org (Netlify)
    participant AsyncAPI Metrics collector
    Note right of AsyncAPI Metrics collector: Netlify Function <br/>or<br/> any monitoring SaaS

    User->>asyncapi.org (Netlify): https://asyncapi.org/definitions/2.3.0.json
    asyncapi.org (Netlify)->>User: 2.3.0.json
    asyncapi.org (Netlify)-->>AsyncAPI Metrics collector: Netlify Log Drains metrics
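A sketch of what the "AsyncAPI Metrics collector" leg could look like when fed by Log Drains; the NDJSON field names (`url`, `status`) are assumptions for illustration, not Netlify's exact log schema:

```typescript
// Sketch of a metrics collector consuming a Netlify Log Drains batch.
// Log Drains deliver traffic logs as NDJSON (one JSON object per line);
// the field names used here are illustrative assumptions.

interface TrafficLog {
  url: string;
  status: number;
}

function countSchemaHits(ndjsonBatch: string): Map<string, number> {
  const hits = new Map<string, number>();
  for (const line of ndjsonBatch.split("\n").filter(Boolean)) {
    const log: TrafficLog = JSON.parse(line);
    // Only successful fetches of JSON Schema definition files count.
    const match = log.url.match(/^\/definitions\/(.+\.json)$/);
    if (match && log.status === 200) {
      hits.set(match[1], (hits.get(match[1]) ?? 0) + 1);
    }
  }
  return hits;
}

const batch = [
  '{"url":"/definitions/2.3.0.json","status":200}',
  '{"url":"/definitions/2.3.0.json","status":200}',
  '{"url":"/docs","status":200}',
].join("\n");
console.log(countSchemaHits(batch).get("2.3.0.json")); // 2
```

Because the collector sits off the request path, it can fail or lag without ever degrading the user-facing download.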

With this approach, and in the more complex solution, we will only care about the metrics collector service, which could eventually be down but won't affect the user request.
In the case of using any SaaS, it will be straightforward. As a side note, there are free tiers in services like NewRelic that maybe could fit our case.

Netlify Edge Handlers

Netlify Edge Handlers work by letting you execute code directly on the edge, intercepting the request. We could run JavaScript code there to collect the metrics we want; in our case, the hits on the definition files. This is in BETA right now (you have to ask for it to be enabled). However, I would ask them for an ETA for going public; I guess they should have plans to release it as a public beta in the short-to-mid term.

EDIT: Netlify Edge Functions are now public beta, available for free. https://www.netlify.com/blog/announcing-serverless-compute-with-edge-functions
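A sketch of the Edge Function idea: match schema requests, record a hit, and let the request pass through untouched. The metrics sink (`pushMetric`) is hypothetical; only the pure path-matching helper is runnable here:

```typescript
// Sketch of the Edge Function approach: intercept schema requests to
// record a hit, then pass the request through so the file is served
// normally. The matching helper below is the runnable part.

function isSchemaRequest(pathname: string): boolean {
  return /^\/definitions\/.+\.json$/.test(pathname);
}

// In a real Netlify Edge Function (Deno runtime) the default export
// would look roughly like this; pushMetric is a hypothetical sink:
//
//   export default async (request: Request, context: Context) => {
//     const { pathname } = new URL(request.url);
//     if (isSchemaRequest(pathname)) await pushMetric(pathname);
//     return context.next(); // serve the static file untouched
//   };

console.log(isSchemaRequest("/definitions/2.4.0.json")); // true
console.log(isSchemaRequest("/docs/getting-started"));   // false
```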

Use AWS S3

AWS S3 is a well-known solution for storing files, and with the metrics it exposes (CloudWatch), we could know the number of GET operations per file.
We would need to add a Netlify rewrite rule (not a redirect) that proxies the requests to the S3 bucket. This is easy to configure through the netlify.toml file.

sequenceDiagram
    participant User
    participant asyncapi.org (Netlify)
    participant AWS S3

    User->>asyncapi.org (Netlify): https://asyncapi.org/definitions/2.3.0.json
    asyncapi.org (Netlify)->>AWS S3: Netlify rewrite rule to asyncapi.s3.amazonaws.com/definitions/2.3.0.json
    AWS S3->>asyncapi.org (Netlify): 2.3.0.json
    asyncapi.org (Netlify)->>User: 2.3.0.json

The price for this is not high. I did a quick estimation for 30 million requests per month (yeah, a lot) here. We should also include the price for the CloudWatch metrics, but IIRC it is almost nothing.

If price is a concern, we could investigate Cloudflare R2, which is super cheap. However, I don't know at this moment what metrics they provide. Also, we would need to ask for access to R2, as it is in beta at the moment.

@smoya (Member) commented Apr 19, 2022

As of today, Netlify Edge Functions (previously known as Edge Handlers) are in public beta, available for free. https://www.netlify.com/blog/announcing-serverless-compute-with-edge-functions

@smoya (Member) commented Apr 22, 2022

With the following PR, we could add the metrics push into the Netlify function: #680

@derberg derberg removed the gsoc This label should be used for issues or discussions related to ideas for Google Summer of Code label Apr 28, 2022
@derberg (Member, Author) commented Apr 28, 2022

Taking this one off GSoC, as it is an important topic to handle and can't be delayed

@derberg (Member, Author) commented Apr 28, 2022

How to start 😄
Lemme start with the positives ❤️

I love the idea from #680

On the "negative" side. I have completely different view on Maintainance/High Availability/Response-time topics:

  • we are not building a commercial product here, we are not looking for revenue, and we do not have to satisfy the maximum possible number of users
  • it is ok to say openly that "this" service works like "that", has its limits, and that is it, really. Just like GitHub can say they have rate limits 🤷🏼. We can say that the solution is based on GitHub and Netlify and "these" are the limits. We just state the recommended usage, like fetch on build, prebundle, cache 👍🏼
  • we offer the service to the extent possible in open source. If someone wants to use it for free, go for it 👏🏼. If you expect more, pay for it 🤷🏼 We can have dedicated funding for a DevOps team, collect money from companies, and do it. Or they should just build their own tool, really
  • it is ok to say "if you build a parser and you fetch schemas in real time -> you're doing it wrong" and explain best practices and alternatives
  • in regards to SchemaStore and the load from there: not an issue at all, because the cool thing about the measurement we get from this usage is that we get a "number of users" and not a "number of downloads", since IDEs support it in a way that they download the schema once and cache it

So, let's go forward with the idea from #680

Alternative/compromise: to not mix topics and try to solve everything with one solution, maybe #680 could have 2 alternative paths: one for the needs related to the AsyncAPI JSON Schema and $id, and the other that we use only in SchemaStore. One solution with separate paths, so the measured data is clean. Both would still depend on the same rate limits anyway, of course

@smoya (Member) commented May 3, 2022

I've been playing with Google Analytics 4 as a candidate for publishing our metrics. I have to say, I didn't get a good result.
We could send events through the Measurement Protocol, and it would kinda do the job, but the UX for reading those metrics is completely awful:

1. In the whole realtime metrics view, only a small rectangle including the events is present:

(screenshots: GA4 realtime dashboard)

2. The details are very hard to check (I added a param for the URL of the fetched file):

(screenshot: GA4 event details)

As we can see, everything is focused on web apps, so it's not a really good fit for us. I know @derberg has played a lot with GA, Google Tag Manager, etc. Do you think it is still a fit for this, or should we rather consider another alternative?

@smoya (Member) commented May 3, 2022

I've been checking NewRelic One's new free tier, and it allows sending up to 100GB of data, events included. I did a simple test with a POST request and created a simple dashboard to see how it would look.
(screenshot: New Relic dashboard)

Btw, New Relic has NRQL, a custom query language that lets you easily query anything you send to them, in a SQL-like fashion.

If anyone has another suggestion, I'm happy to keep investigating (there are plenty of others out there)
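The POST test smoya describes could look roughly like this; the endpoint and headers follow New Relic's public Event API as I understand it, so treat the exact details (URL shape, header name, event fields) as assumptions:

```typescript
// Sketch of pushing a custom event to the New Relic Event API.
// Endpoint and header names follow NR's public docs as I recall them;
// verify against the official Event API reference before relying on this.

interface SchemaDownloadEvent {
  // Queryable later via NRQL, e.g.: FROM SchemaDownload SELECT count(*)
  eventType: "SchemaDownload";
  file: string;
}

function buildEventRequest(accountId: string, apiKey: string, event: SchemaDownloadEvent) {
  return {
    url: `https://insights-collector.newrelic.com/v1/accounts/${accountId}/events`,
    method: "POST" as const,
    headers: { "Api-Key": apiKey, "Content-Type": "application/json" },
    // The Event API accepts a batch (array) of events per request.
    body: JSON.stringify([event]),
  };
}

const req = buildEventRequest("123", "secret", {
  eventType: "SchemaDownload",
  file: "2.3.0.json",
});
console.log(req.url); // https://insights-collector.newrelic.com/v1/accounts/123/events
```

The actual send would be a single `fetch(req.url, req)` from wherever the hit is observed.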

@smoya (Member) commented May 4, 2022

In the meantime, I'm moving forward with the New Relic solution for now; the development is all here: #680

In case you want to use another provider for metrics, I'm happy to adapt the code.

More on #680 (comment)

@derberg (Member, Author) commented May 12, 2022

GA also allows you to create new views and custom components, with scheduled reports, etc. But yeah, I'm not a GA evangelist.

Tbh I think the approach with New Relic is super nifty, as long as we can use it for free, of course 😆 I guess you @smoya and @fmvilas can get us more free storage anyway if we need it 😆

❤️ from me for New Relic

Does it mean we have an agreement on the implementation? 🙌🏼
Before we finalize, we need to give @magicmatatjahu @jonaslagoni @BOLT04 @fmvilas time to voice their opinions, as they own either the website or this repo, or just need the solution (like Jonas)

@BOLT04 (Member) commented May 12, 2022

yeah, let's go with the New Relic solution proposed by @smoya 👍
I think with that the Server API doesn't need any implementation, so we could close this issue when that PR is merged, right?

wdyt everyone?

@derberg (Member, Author) commented May 16, 2022

I think we can even transfer it to https://github.com/asyncapi/website now 🤔

@smoya (Member) commented Jun 22, 2022

The JSON Schema Store PR has been merged, meaning all JSON Schema files fetched from it are now being downloaded from asyncapi.com/schema-store, and metrics show that users are already fetching them:

(chart: metrics showing download counts of AsyncAPI JSON Schema files)

cc @derberg

@derberg (Member, Author) commented Jun 22, 2022

Omg this is so exciting 😍

@fmvilas (Member) commented Jul 5, 2022

❤️ Indeed! @smoya start thinking about how we send custom metrics from tooling 😝

@smoya (Member) commented Jul 6, 2022

❤️ Indeed! @smoya start thinking about how we send custom metrics from tooling 😝

We would need to expose a service that acts as a metrics ingest, forwarding them to NR, so we don't expose the NR API key in tooling; tooling just sends metrics to our service.

I would think about it eventually!
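A sketch of that ingest idea: the tooling-facing service validates the incoming event, attaches the NR key server-side, and forwards. All names are illustrative, and the forwarder is injected so nothing here touches the network:

```typescript
// Sketch of a metrics-ingest proxy: tooling POSTs anonymous events to
// an AsyncAPI-owned endpoint, which attaches the New Relic API key
// server-side and forwards the batch. Names are illustrative.

type Forward = (url: string, headers: Record<string, string>, body: string) => void;

function ingest(rawBody: string, nrApiKey: string, forward: Forward): void {
  const event = JSON.parse(rawBody); // reject non-JSON payloads early
  if (typeof event.eventType !== "string") throw new Error("eventType required");
  forward(
    // ACCOUNT_ID left as a placeholder for the real account identifier.
    "https://insights-collector.newrelic.com/v1/accounts/ACCOUNT_ID/events",
    { "Api-Key": nrApiKey }, // the key never leaves our service
    JSON.stringify([event]), // NR Event API accepts batches (arrays)
  );
}

const sent: string[] = [];
ingest('{"eventType":"ToolUsage","tool":"generator"}', "secret", (_u, _h, b) => sent.push(b));
console.log(sent[0]); // [{"eventType":"ToolUsage","tool":"generator"}]
```

The design point is the trust boundary: clients can only submit events, never read or hold credentials.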

@smoya (Member) commented Jul 13, 2022

After fixing asyncapi/spec-json-schemas#236, JSON Schemas for different AsyncAPI versions are being downloaded from JSON Schema Store:

I see there are downloads of all versions, and I really doubt those downloads are organic or on purpose. What I think is happening is that, since the schema served by Schema Store is now https://github.com/asyncapi/spec-json-schemas/blob/master/schemas/all.schema-store.json, JSON Schema parsers might be downloading ALL referenced ($ref) schemas at once instead of on demand.

However, the VSCode IDE is still showing the error:
(screenshot: VSCode schema validation error)

which @derberg mentioned already in redhat-developer/vscode-yaml#772 (reply in thread). So 🤷 ...

cc @derberg @fmvilas

@derberg (Member, Author) commented Jul 14, 2022

which @derberg mentioned already in redhat-developer/vscode-yaml#772 (reply in thread). So 🤷 ...

I think we have entered the world where we have to decide whether we want to do things in our JSON Schema the way the spec and spec maintainers recommend, or just adjust the schema to work with tooling provided by the community 🤷🏼

I see there are downloads from all versions, and I really doubt those downloads are organic or in purpose.

yeah, the numbers for 2.0.0-rc1 and 2.0.0-rc2 are suspiciously high and identical 😄
I think you are completely right about the reason, that it is due to $ref parsing. Can we measure the number of times https://www.asyncapi.com/schema-store/all.schema-store.json is fetched and automatically subtract that number from the other downloads directly in the chart, without manual calculation? (kinda hacky, but I don't believe there is another solution)

@smoya (Member) commented Jul 14, 2022

Can we measure the number of times https://www.asyncapi.com/schema-store/all.schema-store.json is fetched and automatically subtract that number from other downloads directly in the chart, without manual calculation? (kinda hack but I don't believe there is some other solution)

But if we do that, we will invalidate all the counts for legitimate downloads. Correct me if I'm wrong, but:

Considering that 1 fetch of all.schema-store.json ends up doing 10 fetches (one per AsyncAPI version schema), let's say we start from scratch and do just one fetch:

| Downloads | File |
| --- | --- |
| 1 | all.schema-store.json |
| 1 | 1.0.0.json |
| 1 | 1.1.0.json |
| 1 | 1.2.0.json |
| 1 | 2.0.0-rc1.json |
| 1 | 2.0.0-rc2.json |
| 1 | 2.0.0.json |
| 1 | 2.1.0.json |
| 1 | 2.2.0.json |
| 1 | 2.3.0.json |
| 1 | 2.4.0.json |

We can't just subtract 1 from each download, because this would end up happening:

| Downloads | File |
| --- | --- |
| 1 | all.schema-store.json |
| 0 | 1.0.0.json |
| 0 | 1.1.0.json |
| 0 | 1.2.0.json |
| 0 | 2.0.0-rc1.json |
| 0 | 2.0.0-rc2.json |
| 0 | 2.0.0.json |
| 0 | 2.1.0.json |
| 0 | 2.2.0.json |
| 0 | 2.3.0.json |
| 0 | 2.4.0.json |

@derberg (Member, Author) commented Jul 19, 2022

@smoya yeah, you are right 🤦🏼 it sucks

@derberg (Member, Author) commented Aug 23, 2022

@smoya so it looks like we can only measure adoption of the spec in general, not of its specific versions?

@smoya (Member) commented Aug 23, 2022

@smoya so it looks like we can only measure adoption of the spec in general, not of its specific versions?

Yes; since the IDE plugins download just one schema (containing all of the versions), we can't know which one they are using.
And as the Schema Store matching is based on file patterns and not on the content of the file, there is no way we could send extra data in the request made to our servers (for example, a header including the version).

So unfortunately, I'm running out of ideas here. I could open an issue in the Schema Store repo asking for ideas.

@derberg (Member, Author) commented Aug 24, 2022

It is not that bad. For me, the most important thing is to measure how many users we have, so adoption of the spec in general, and not of each version. I'm personally skeptical of such measurements, as people then complain that new versions are not adopted, forgetting that they themselves do not use new versions if they do not need them (anyway, not a topic for this issue).

If you can open a discussion with Schema Store on how to fix things in the future, that would be amazing. Even if I'm not interested in specific version adoption, I bet others are 😄

Can you adjust the dashboard in New Relic 🙏🏼

So what is left is:

  • dashboard adjustment
  • fixing the schema, as it is now failing in VSCode
  • persisting data for a lifetime
  • describing the requirements for people to use the schemas in an IDE
  • investigating how data collection actually works: caching of the schema by plugins, and etag refresh on the Netlify side, so we know whether we actually get data on "daily active users" only or on an "increasing number of new users"

missing something?
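The last bullet above (etag behaviour) can be sketched as the revalidation decision an IDE plugin's cache makes; this illustrates the mechanism worth checking against Netlify redeploys, not a description of how Netlify actually behaves:

```typescript
// Sketch of HTTP etag revalidation: a client (e.g. an IDE plugin) sends
// If-None-Match with its cached etag; a 304 means "cached copy still
// valid" and would NOT show up as a new download in our metrics.

function revalidate(clientEtag: string | undefined, serverEtag: string): 200 | 304 {
  // If the etag survives a redeploy, repeat visitors produce 304s and
  // the counts lean toward "new users"; if it rotates on every deploy,
  // everyone re-downloads and counts look more like "active users".
  return clientEtag === serverEtag ? 304 : 200;
}

console.log(revalidate(undefined, 'W/"abc"')); // 200 - first fetch, counted
console.log(revalidate('W/"abc"', 'W/"abc"')); // 304 - cache hit, not counted
console.log(revalidate('W/"abc"', 'W/"def"')); // 200 - etag rotated on redeploy
```

Which of the two etag behaviours Netlify exhibits after a redeploy is exactly the open question in the bullet list.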

@fmvilas (Member) commented Aug 25, 2022

I will definitely be interested to know whether people are really adopting version 3.0 once it's out. It would be cool to get some insights. Maybe it's time to measure it in our tools.

@smoya (Member) commented Aug 25, 2022

Even if I'm not interested in specific version adoption, I bet others are

I think it is a crucial metric, even though not the only method for collecting data. I would love to have a metric where, after a release, we could see how downloads of older versions go down in favor of the new one.

If you can open a discussion with Schema Store on how to fix things in the future, that would be amazing. Even if I'm not interested in specific version adoption, I bet others are 😄

Done. No hope at all, anyway. SchemaStore/schemastore#2440

dashboard adjustment

Do you mean removing the versions stuff from it?

persisting data for lifetime

Do we really need that? With New Relic, we have 1 year right now. If more is needed, we could write some scripts to do aggregations every few months.

investigating how data collection actually works, caching of schema by plugins, and etag refresh on Netlify side. So we know if we actually get data of only "daily active users" or "increasing number of new users"

Related: SchemaStore/schemastore#2438

@derberg (Member, Author) commented Aug 29, 2022

Do you mean removing the versions stuff from it?

yeah, until we get it solved this metric is not helpful; we just need the total number

Do we really need that? With New Relic, we have 1 year right now. If more is needed, we could write some scripts to do aggregations every few months.

yes, we need lifetime data to see how the numbers change over the years. But I do not mean we need that support in New Relic; an automated script, maybe running on GitHub Actions on a schedule, is also fine 👍🏼
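A sketch of such a scheduled aggregation job: fold the latest rolling count from New Relic into a lifetime snapshot kept under version control. The New Relic query itself is omitted, and all names here are illustrative:

```typescript
// Sketch of a scheduled "persist lifetime data" job (e.g. a GitHub
// Actions cron): read the downloads counted since the last run and fold
// them into a snapshot that outlives New Relic's retention window.
// Fetching the count from New Relic is omitted; names are illustrative.

interface Snapshot {
  totalDownloads: number;
  lastUpdated: string; // ISO date of the last aggregation run
}

function foldSnapshot(prev: Snapshot, downloadsSinceLastRun: number, now: string): Snapshot {
  return {
    totalDownloads: prev.totalDownloads + downloadsSinceLastRun,
    lastUpdated: now,
  };
}

// Each scheduled run folds the new count in; the snapshot itself would
// be committed back to the repo (or other durable storage).
let snap: Snapshot = { totalDownloads: 0, lastUpdated: "2022-08-01" };
snap = foldSnapshot(snap, 1500, "2022-09-01");
snap = foldSnapshot(snap, 2000, "2022-10-01");
console.log(snap.totalDownloads); // 3500
```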

Related: SchemaStore/schemastore#2438

yeah, not much help, other than knowing you can clear the cache on demand. The source code indicates it is based on etag. What we need to check is what Netlify does when the website gets redeployed: whether the etag for all resources, even redirects, is refreshed or not. We are doing some magic there 😄

@github-actions (bot)

This issue has been automatically marked as stale because it has not had recent activity 😴

It will be closed in 120 days if no further activity occurs. To unstale this issue, add a comment with a detailed explanation.

There can be many reasons why some specific issue has no activity. The most probable cause is lack of time, not lack of interest. AsyncAPI Initiative is a Linux Foundation project not owned by a single for-profit company. It is a community-driven initiative ruled under open governance model.

Let us figure out together how to push this issue forward. Connect with us through one of many communication channels we established here.

Thank you for your patience ❤️

@github-actions github-actions bot added the stale label Dec 28, 2022
@derberg derberg removed the stale label Jan 10, 2023
@github-actions (bot)

This issue has been automatically marked as stale because it has not had recent activity 😴

It will be closed in 120 days if no further activity occurs. To unstale this issue, add a comment with a detailed explanation.

There can be many reasons why some specific issue has no activity. The most probable cause is lack of time, not lack of interest. AsyncAPI Initiative is a Linux Foundation project not owned by a single for-profit company. It is a community-driven initiative ruled under open governance model.

Let us figure out together how to push this issue forward. Connect with us through one of many communication channels we established here.

Thank you for your patience ❤️

@smoya (Member) commented Dec 15, 2023

@smoya so looks like we can only measure adoption of the spec in general, not its specific versions?

FYI, I gave it one last try, but didn't succeed 😞. All the info can be found at SchemaStore/schemastore#2440 (comment).

cc @derberg @fmvilas

@derberg (Member, Author) commented Dec 19, 2023

overall adoption is still a great number to have 👍

@smoya (Member) commented Feb 13, 2024

FYI, I created SchemaStore/schemastore#3460 as a feature request in Schema Store that, if adopted, will help us achieve our mission.

@sambhavgupta0705 (Member)

@smoya may I know the status of this issue please 😅

@smoya (Member) commented Apr 20, 2024

@smoya may I know the status of this issue please 😅

What do you need to know in particular?

@sambhavgupta0705 (Member)

What do you need to know in particular?

Like, will we be going forward with this issue or not

@smoya (Member) commented Apr 22, 2024

Like, will we be going forward with this issue or not

We did move forward. We have a New Relic dashboard that counts downloads of JSON Schema files, among other things.
It is not perfect, as explained in this issue, due to technical limitations in how plugins work.

ATM there is no quick solution, but rather a long-term journey: working on SchemaStore/schemastore#3460 and then pushing (or doing the work for) plugins (such as VSCode YAML) to adapt to that new mechanism when pulling schemas from Schema Store.

It is a long journey, but I'm happy to welcome people who want to help!

@derberg (Member, Author) commented Apr 22, 2024

best would be if we document what we have, make it accessible to others, and close this issue, as there is a dependency on the outside world that will take, as @smoya wrote, a long time to get done. What we already have, overall adoption of AsyncAPI, is good enough for me anyway, as I personally do not care much about specific version adoption. We have one big challenge: 0 historical data, as the New Relic free account that we use does not preserve data

@sambhavgupta0705 sambhavgupta0705 added the keep-open Prevents stale bot from closing it label Apr 22, 2024
8 participants