diff --git a/sig-providers/meetings/021-2024-09-25.md b/sig-providers/meetings/021-2024-09-25.md new file mode 100644 index 00000000..1fd9205b --- /dev/null +++ b/sig-providers/meetings/021-2024-09-25.md @@ -0,0 +1,199 @@ +# Akash Network - Providers Special Interest Group (SIG) - Meeting #21 + +## Agenda +- Provider updates on Akash Console 2.0 work +- Migration from Praetor to Akash Console +- Zealy working group collaboration and feature testing +- GPU labeling and scaling issues for providers + +## Meeting Details +- Date: Wednesday, September 25, 2024 +- Time: 08:00 AM PT (Pacific Time) +- [Recording](https://bkfs3bdq5725mjyqedqhdutacourrgdyaealobopsbierafcz4ba.arweave.net/CosthHDv9dYnECDgcdJgE6kYmHgBALcFz5BQSICizwI) +- [Transcript](#transcript) + +## Participants +- Tyler Wright +- Jigar Patel +- Deval Patel +- Andrew Mello +- Robert Del Rey +- Maxime Beauchamp +- Andrey Arapov +- Vigneshwar Viswanathan +- B S +- Rodri R Scott +- Carruthers +- Others + +## Meeting Notes +- **Tyler Wright's Opening:** + - Discussed updates regarding Akash Console 2.0, focusing on provider developments. + - Invited Jigar and Deval to provide a status update on the ongoing work around Akash Console. + +- **Jigar Patel's Update:** + - Akash Console is undergoing internal testing with a multi-node structure. + - Praetor tool is being integrated into Akash Console 2.0. + - Migration for single-node users will be smoother, but multi-node users will require resource adjustments. + - They expect to move to beta testing soon and will update the provider channel once it's ready. + +- **Deval Patel's Remarks:** + - No significant migration challenges expected for providers, especially if using Kubernetes. + - Future upgrades will simplify processes, particularly for multi-node setups. + +- **Robert Del Rey's Query:** + - Asked about key features in Praetor that should be tested by the Zealy working group. + - Jigar recommended focusing on testing the new provider console instead of Praetor since the transition to Akash Console is underway. + - They agreed to collaborate on creating relevant missions for Zealy users to test new features in Akash Console. + +- **Andrew Mello's Contribution:** + - Raised two points: + 1. **Provider Issue:** Ongoing issue where non-schedulable nodes still bid on the network. Awaiting triage from the engineering team. + 2. **Discount Scheme:** Implemented a staking-based discount scheme for providers, allowing users to receive discounts based on their staked Akt. + - Suggested it could be highlighted in a blog for wider awareness. Tyler agreed to follow up on this. + +- **GPU Labeling & Scaling Concern (Andrew Mello):** + - Criticized the manual process of labeling GPUs, calling for automation using Nvidia APIs. + - Warned that the current approach could hinder scalability, especially as the provider network grows. + - Tyler acknowledged the concern and promised to follow up with the core engineering team for clarification and feedback on why the automated solution wasn’t pursued. + +- **Open Discussion:** + - Tyler reminded participants to check the provider announcements channel regularly for updates. + - Discussed a recent message about the “Kill zombie parent script” in the provider announcements channel. + +- **Closing Remarks by Tyler Wright:** + - Action items from the meeting will be followed up in subsequent SIG meetings or asynchronously in relevant channels. + - Thanked everyone for their participation and reminded them of the steering committee meeting happening the next day. + +## Action Items +- **Jigar & Deval:** Continue testing Akash Console 2.0 and provide updates to the provider channel as the beta phase approaches. +- **Tyler Wright:** Follow up with Andrew on the discount scheme blog post and ensure its wider publication. +- **Tyler Wright:** Provide clarification on the decision to avoid Nvidia API integration for GPU labeling. +- **Andrew Mello:** Reshare details of the staking-based discount scheme with Tyler for further review and publication. +- **All Providers:** Regularly check the provider announcements channel for important updates. + +# **Transcript** + +Tyler Wright: All right, welcome everybody. To providers monthly meeting for September. + +Tyler Wright: Over 25th 2024 during this special interest group for providers. The ecosh community, talked about anything related to their cost provider. Sometimes we talk about pooling upgrades the Previously again a team led by Jigarette Ball built a tool called Prater. that is now a part of Akash console. So they've given us updates on all the work that they've been doing great and again and now around to cost console and then be opened up to any conversations from anybody on the call in the community. I know that there's been some talk about provider configs for the cost, I work out. So again I'm not sure the person is not currently here, but that might be something that we talked about later on. But yeah, again all of the meetings are + +Tyler Wright: Hub under the providers subfolder in the community repo. But usually again I'll just hand it over to maybe Jigger and Deval to talk about how things are going with the provider portion of a cost console which they're working on. That's a part of their console 2.0 work. So Jigarette device kick it over to you to just give the community an update and all the things that you're got going on. + +Jigar Patel: Yeah, thanks, Yeah, so we are currently working on the process of becoming a provider in There is a working progress, PR that is across console Github. And currently where we are, testing internally of the whole new PCs process with the multi note and different personality around it. And we are confident that it's going in the right direction. And yeah, we'll meet us soon to test it out. + +Jigar Patel: And then, if that guy's successfully tested any boxes, books are fixed. Then we'll erase that to the public. But yeah, the progress is going good. And I'm thankful to Scott with his hair upon k3s scripts, the process become easier and we're like taking that itself bits and pieces of that script into our process on. That makes three years multinod. So that was the biggest change in our process where for multinodes we were using KAS, which was having process we had to do. + +Jigar Patel: A lot of installations and processes on the provider side, which now is okay. Here is much simpler and lightweight. And yeah, we're experimenting with that. Yeah, once that's in a beta, will put update into the ecosystem better and into the provider channel as well. Thanks. + +Tyler Wright: Excellent, thank you there. does anyone have any questions on update from Jigaron? Duval on the work? They're doing around the provider side of the cosh Council which again is currently pay to work. + +Tyler Wright: One question, I think you all have talked about this previously, jigar, but have you all started to think about a migration path. If there's one necessary from prey tour to their trash console, or people got to spin down resources and spin them back up. Yeah, the console, when it's in proud, + +Jigar Patel: so if they are already on a single node, that might be a way to Migrate them to the new console, but if there are no multinode or structure where Kate s is used then. Yes, there will They need to, take it down and I think everybody can add more on this topic. + + +### 00:05:00 + +Tyler Wright: Excellent. Go ahead. Deval + +Deval Patel: In the migration path, there is nothing much to do because then what happens is that the behind the scene, they're providers running, right? And what we support after becoming, the provider is just adding the attributes and, changing a few things here and that we give the upgrade path as well. So that should not be any issue going forward, But yeah, I don't think so, we need to migrate them from CAs process to Kts, right? But behind the scene posts, both uses the Kubernetes, keep CDL. Right? So we can run commands, or we can upgrade the hand charts and do things by that, right? So yeah. + +Tyler Wright: Perfect. Go ahead. Jigar. + +Jigar Patel: Yeah, so some of the functionality will be only Let's say adding a node or doing node, which is not available in predator right now, right? So for that, yes, they need to upgrade in a future, But to continue using the provider as it and current state, they don't need to because we were able to update the library's as long as they are on a helm chart. They were able to do Pretty much upgrading the provided self or restarting, the provider. So those functions will be intact and they will be able to use that into the new process as well. It just something that is Kubernetes related functionality. That will be added later on will not be available to them. + +Tyler Wright: All Thank you very much. Any other questions for jigarette Deval Before we move on. + +Tyler Wright: Go ahead, Robert. + +Robert Del Rey: Hey guys, hope everybody is feeling. Okay. yeah, I have a question for the page or team cigar and deval + +Robert Del Rey: I'm pretty sure you're aware about where we're doing the Cash City Working Group. We have missions. We motivate people to test out the cash and we reward them. Anyway, in the working group, we were thinking about having a great information. And I wanted to ask you guys Are there any key features you want people to test out? And the reason why I asked that is specifically, it's because we can do a mission about go to Prater and do this and then write your feedback and we can add a form so we could get I guess some comments and some people testing later. So that's why I went to ask which key features, I don't know. They could test out + +Jigar Patel: Hey, thanks Robert. Yeah, absolutely. We are open to that but I wanted to just put it here that let's do testing on the new provider console that what we are working on, right? Because we will be stop working on the older printer provider, So most guys available in beta, I'll reach out to you for silly and I'll join the silly working group as well to discuss more on that. + +Robert Del Rey: Here, that's great. If you don't mind, I can also send you a message and… + +Jigar Patel: Absolutely absolutely. + +Robert Del Rey: we can take the Thank you so much in the next question is. I believe I can keep this question for later since you have plans on migrating to console. But I wanted to ask you if I were to do a video about greater What other things you would like to highlight about it? + +Jigar Patel: Are human like the functionality device. + +Robert Del Rey: Yeah. I mean, there are videos about how to become a provider using Prater. So I wanted to know if you have other ideas about other type of videos we could do about Prater. + +Jigar Patel: Yeah, I think once we go live with the new console provider, I think that would be great because the process and the screens and will be changed. Then what we had in previously, right? So to presentation of the memories and to have a newer, information through videos, That would be great, So, yeah, I think once we have some kind of beta, you and me, what we can like to a meeting and then I'll show you how this works. And then you can make a video on it so the public how this whole process works. + + +### 00:10:00 + +Robert Del Rey: Right, that sounds really cool. Thanks, That's all for me from now. Thank you. + +Tyler Wright: Thank you Jigarette and Deval for the update, again, we'll look for more updates and Rider Praetor and other channel channels I'm sure as we get to the Beta phase. Cool. His ears must have been ringing because I know that Andrew, who's on the call today has been talking. The providers channel actually made some issues around some things. He talked about in the last six support meeting, but since again, made some issues around some of the behavior that he's seen. And again, I think the members of the core engineering team are going to do some continue testing. And then again, at the next six support meeting, if not prior it will get triage and there'll be some comments made. + +Tyler Wright: Andrew not to put you on the spot, but I know that your tend to be vocal. I just want to see if you want to talk about any stuff that you can talk about in the providers channel. or if there's anything else you want to talk about specifically? Because I think we had come to that part of the agenda for open discussion. + +Andrew Mello: Yeah, hi everyone. Can you hear me? Okay, I'm on a different connection today. + +Tyler Wright: Yes. + +Andrew Mello: Okay, great. My first point of order was gonna be the support issue that is remaining open for providers. That basically any node in the cluster, that's tainted. As non-schedule is still bidding on the network. And again, that's still sitting there. I believe a waiting triage since last week, which would be one of my major concerns and then additional to that today. I don't have the ability to share this right now with you guys, but I did kind of want to bring this up on the call with everyone this week. I was able to implement for the first time, ever discounts for users on providers. And this is particular to the crypto and coffee providers. And the ones that I run + +Andrew Mello: Basically the scheme is going to be set up like this that by staking a specific amount of Akt on the crypto and coffee, validator, That is taking those coins. Will automatically receive discounts when deploying against the nine providers that I run. So I was actually looking for to let you guys know that that's coming, but be also a little bit of direction in terms of where we could publish that. I have a full page devoted to this that I just have yet to launch. And when I launched that page the page is basically the instructions. The tiers that, you get your discount for staking x amount of coin. this is a really nice potential. blog piece or for the homepage kind of thing because it really solves discounts. I mean, we've never been able to give away, + +Andrew Mello: Accounts, and things of that nature. So it's unique way to approach the problem. It's live right now and once I would love again, Tyler, just a little bit of guidance of where you think I should bring that to light and bring that out and make people aware because without awareness, it doesn't do much good. So, + +Tyler Wright: Yeah, I can get some additional insights. I would say The blog makes the most sense because that'll be the most far-reaching. I think, on the website, there is a way to just make a blog or you can share with me or in a public channel, kind of like a DOC version of the blog and… + +Andrew Mello: Yes, it just Tyler. + +Tyler Wright: then we can work on getting it out to the site. + +Andrew Mello: Just for context. I did share that with you a few days ago, if you look back. So if you could just follow up with me on that, take a deeper look at it know that I am gonna publish that to Github and my website, they're gonna appear on both, but I want to get it into the hands of users, Because it's really pretty cool. Again, steak a few thousand Akt and you can get, double digit percentage discounts on the providers. So pretty, nice incentive for people to keep taking Akt. And also again, some of our larger holders are gonna have some really nice discounts on compute + + +### 00:15:00 + +Tyler Wright: That's Andrew. If you can just read a share that with me, I'm looking through our thread and not finding right now so if you can reshare that with me, I'll make sure that it gets looked at and shared more widely. + +Andrew Mello: I will. Yep. And again, just please don't forget. The providers are bidding when they should not and that is Pretty weird. Yeah. Okay. Thank you. + +Tyler Wright: Thank you. Any thoughts or questions about anything that Andrew was just talking about? + +Tyler Wright: All right I wanted to see if there's anything else on the provider side that anybody else wanted to discuss today. Just want to see if there's anything else that anybody wanted to bring up. + +Tyler Wright: If not, I would just remind everybody to always look at the provider announcements. Again, If you're an active provider on the cash marketplace, please look at the providers announcements channel. I think Andy and others share messages. So again, since our last meeting I think there was a message around the Kill zombie parents grip. So again, if you are an active provider again, please look at that channel or if you need access to that channel for whatever reason, please reach out to myself, Tyler Core team or any other members of the court team or cautious insiders to get a roll update. + +Tyler Wright: If there's nothing else, then you all can have some time back again. I'll follow up and look for that Doc from Andrew Mello. I'm sure the Core Engineering team will continue to follow up on some of those issues that are waiting. Triage again, we usually trust issues on a monthly basis but for some of these pressing issues, we can push along a little bit faster. again, please again, if you see anything on the network, continue to testing, bring insiders. And again, if there's an issue, we can escalate it and create an issue inside the support repo. + +Tyler Wright: Much appreciate bigger and deval for all the work that they've been doing in the updates that they provided around the work that they're doing on Console 2.0 Again. There will be some announcements made shortly across a number of different channels. Again, look out for announcements that Andrew has about some of the work that he's been doing and continues to work on. Go ahead, Andrew. + +Andrew Mello: Sorry for the delay here, but I actually did have one other point of order to bring up because it doesn't seem to be getting much movement attraction. Namely the major issue of manually labeling GPUs is something that again is + +Andrew Mello: Creating a demonstratively difficult situation as we continue to evolve here and the real, right way to do things is to use the Nvidia APIs that become available. Once the helm charts are deployed and just read from those API endpoints, the users GPU information and all of the details that already exist on cluster. And I'm just getting concerned that I am not seeing any effort direction motivation, or understanding really that this data that we're asking users to. + +Andrew Mello: manually collect and put together themselves and create the situation where we're basically slowing our expansion, unnecessarily and the real solve for this is to simply have the operator inventory, read the Nvidia, API endpoints that are available on the cluster. Now taking that to the next Level I have working solutions for this. I have working solutions. I have no where to go or implement this or see anyone kind of taking on how we're gonna do GPUs, right at scale. So I just wanted to kind of put that out there for the providers that Again if that there is a better way to add GPUs, the data is all available on cluster and + + +### 00:20:00 + +Andrew Mello: Where would I go to get this going in the right direction because again, the issues with manually adding GPUs are substantial and greatly outweigh the issues of simply reading the GPUs from, the API endpoints and using the existing NVIDIA validation and video validation software to verify that the GPUs is the thing we should be doing. So that is something that again has been a change since this spring that been resolved and not adequately addressed + +Tyler Wright: Thank you for sharing. What I can do is again, I know that there have been updates to the process on the manual process right now. I know that's not what you're looking for. I can share some feedback because again, the core engineering team had already looked into the solution that you're described and decided not to move forward with it. I can provide you in an appropriate providers or providers channel sometimes soon directly from record engineers. So why that is and then maybe you can provide a demo and we can talk async about some of the next steps. But again, there was a clear decision made not to move forward in that direction. I don't want to go into it right now but I'll make sure that you get the adequate information that you're looking for. + +Andrew Mello: Just to have it on the record here. It's very very Difficult now, to scale providers due to the way that GPUs are implemented. And I do not feel that this process is in any way made anything better for the provider or the user, and their ability to validate the GPU, is the GPU particularly because once again a special with new providers, who are inexperienced have Understanding of the complexity of the inventory operator, the inventory operator, that was is Quite simply puts a very disappointing this year of its function. + +Tyler Wright: Thank you Andrew, for that feedback. Again, I had an action item and we'll take some necessary steps between meetings and follow up in Next meeting with some at least updates from the community but much appreciate that. + +Andrew Mello: And just to put a bow on that to make sure everyone's very aware of why my concern is so high for this. scaling scaling Right. 50 providers is a drop in the bucket of kind of early startup days. But when we get to 500 providers, when we when we get to 50,000 providers, we cannot get there. Adding one GPU at a time. Thank you. + +Tyler Wright: Thank you. Is there any other discussion topics that anybody wants to bring up here on the stick providers call? + +Tyler Wright: If not. Then, again, there's a number of action items that will handle continue push forward between meetings, much appreciate everybody's participation again. I'll see you all next month. There's a steering committee meeting happening tomorrow, I believe, but I get much appreciated. Everyone's opinions, thoughts, be back hard, work etc. Yeah. I hope everyone has a great rest of the day and we'll continue to work together and talk online. Appreciate you. + +Andrey Arapov: Thank you. + + +### Meeting ended after 00:24:22 +