Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option to split generated pilota files #510

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

missingdays
Copy link

Fixes #454

@missingdays
Copy link
Author

missingdays commented Oct 18, 2024

@Millione @PureWhiteWu Hi. I couldn't find tests for the related builders, could you please point me to one if they exist?
Otherwise, I changed one example to use the new split_generated_files option. It compiles and runs. I also verified that the files are actually split and that the RustRover IDE still navigates to the symbols in the generated split files.

@Millione
Copy link
Member

Perhaps you can add one integration test in the tests folder for workspace mode?The builder is here https://github.com/cloudwego/volo/blob/main/volo-build/src/workspace.rs#L11

@CLAassistant
Copy link

CLAassistant commented Oct 28, 2024

CLA assistant check
All committers have signed the CLA.

@Millione Millione force-pushed the feature/split-generated-files branch from 6706960 to 029121d Compare October 28, 2024 12:58
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here can we split it in files?

@Millione
Copy link
Member

@missingdays Hi, are things going well here?

@missingdays
Copy link
Author

@Millione Hi, I'm waiting for the reply here - #510 (comment)

@Millione
Copy link
Member

Millione commented Oct 30, 2024

Sorry for that, I forgot to say that can you check out why the generated code fails to compile in split mode? It seems related to the split mode implementation in pilota-build

@missingdays
Copy link
Author

@Millione As I mentioned, Windows cannot handle files that only differ by case. For testService and TestService, the generated files are considered the same, so the compilation fails.

@Millione
Copy link
Member

Maybe we can change the filename that only differ by case a little bit? For testService and TestService, it can be TestService1 and TestService2 (just for example)

@missingdays
Copy link
Author

We can do that, I'll try to implement it

@missingdays
Copy link
Author

@Millione Please see cloudwego/pilota#280. I added a test as well

@Millione
Copy link
Member

Millione commented Oct 31, 2024

@missingdays cloudwego/pilota#280 gets merged, can you check out whether it works fine for here?

Here can we split it in files?

There is one more question left. There are some structs generated in one file because of volo-build, I think we should split it either.

@missingdays
Copy link
Author

missingdays commented Nov 4, 2024

@Millione

gets merged, can you check out whether it works fine for here?

The tests pass now

There are some structs generated in one file because of volo-build

In impl pilota_build::CodegenBackend for VoloThriftBackend, there's no API that would allow splitting the stream into multiple files, since there's no access to the directory where the additional files should be placed. We could change the API in a couple of ways:

  1. Provide a base directory in which additional files can be placed. This will solve this specific problem, but it's a bit specific I think
  2. Instead of passing stream: &mut String, pass stream: PilotaStreamBuilder, which would have methods such as push_str, start_new_file_generation(name), finish_file_generation. So it could be used like
stream.start_new_file_generation("ServiceResponseRecv.rs");
stream.push_str(...); // push the impl of ServiceResponseRecv
stream.finish_file_generation();

This way, PilotaStreamBuilder can always be extended with new methods which will allow more precise generation without changing CodegenBackend each time.

  1. Provide the base directory in Context which is accessible from CodegenBackend

@Millione
Copy link
Member

Millione commented Nov 4, 2024

@missingdays Great thoughts. Point 2 and 3 look good to me, I think you can go for it.

@missingdays
Copy link
Author

@Millione After exploring the code a bit more, I realized that we could use Context#mode + Context#item_path to get all the required information to locate the files. But mode is pub(crate) and so cannot be accessed from volo. I sent an mr to make it pub - cloudwego/pilota#286

@missingdays
Copy link
Author

missingdays commented Nov 7, 2024

@Millione Hi. Splitting service_ImageService requires more work than I originally anticipated, basically re-writing all the logic from pilota again in volo.
Considering all the other splitting, I'd think if it's not split, it will still not get too large to cause any problems, wdyt?

@Millione
Copy link
Member

Millione commented Nov 8, 2024

@missingdays I think it will still be too large and leaving it unsplit looks inconsistent to the user. Is the workload heavy? Can you elaborate on what you're going to do? We can work on it together to find out the best solution.

@missingdays
Copy link
Author

@Millione you are right, it would be inconsistent.
Is the service name deduplication needed here as well? Can there be 2 services named articleService and ArticleService in the rpc_article/src/article module for example?

@Millione
Copy link
Member

@missingdays Yes, I think it is needed here either.

@missingdays
Copy link
Author

@Millione I added the functionality to split the services generated by volo into several files. Let me know if this splitting is enough or if we should split it into more files.

As for the name deduplication, I didn't find a good way to do it. What would be the correct way to get all the needed service names so that we could check if the service name already exists?

@missingdays
Copy link
Author

There are also so more changes to pilota required, so I'm using a custom branch fix/context-mode-pub from my repo https://github.com/missingdays/pilota. We can make all the changes there and then merge them all together before merging this MR, to avoid a bunch of small MRs to pilota.

@Millione
Copy link
Member

@missingdays

I added the functionality to split the services generated by volo into several files. Let me know if this splitting is enough or if we should split it into more files.

I think we can split it into more files, commented in the code specifically.

What would be the correct way to get all the needed service names so that we could check if the service name already exists?

As for this, you can refer to cx.names(def_id), if its name duplicates, it will return true.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

split to ImageServiceClient.rs and ImageServiceServer.rs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

split the Recv and Send?

@missingdays
Copy link
Author

As for this, you can refer to cx.names(def_id), if its name duplicates, it will return true.

The deduplication has to check all the names, not just the current one. Because we need to transform all names to the lowercase and only then check for duplicates. For example, service ArticleService and service articleService will have different def_id, but their related directory names would clash.

@Millione
Copy link
Member

The deduplication has to check all the names, not just the current one. Because we need to transform all names to the lowercase and only then check for duplicates.

What's behind cx.names(def_id) is that we've already converted all the names to the rust style to check for duplicates, and stored all the corresponding def_ids in a HashSet. I think this will work as you describe?

but their related directory names would clash.

I haven't understood it well for how their related directory names affect this, can you explain it more? As far as I'm concerned, the cx.names(def_id) for service ArticleService and service articleService will both return true, and then we know we should make a change for the filename.

@missingdays
Copy link
Author

missingdays commented Nov 14, 2024

Thanks, I see your point. However, we need to know how to deduplicate. If for both ArticleService and articleService we get true, how to know for which service to generate a postfix _1 and for which _2? If the ordered list of all names was available, we could check if ArticleService or articleService is first in that list and assign it the _1 postfix, and the other _2.

@Millione
Copy link
Member

Oh, I get it.

If for both ArticleService and articleService we get true, how to know for which service to generate a postfix _1 and for which _2?

I have thought about using a global AtomicU64, but maybe not a best way to solve this (the postfix maybe different every time generated and influences the IDE cache)

If the ordered list of all names was available, we could check if ArticleService or articleService is first in that list and assign it the _1 postfix, and the other _2

Great, I think you can put any information you need into cx, and the ordered list of all names information I think you can get from https://github.com/cloudwego/pilota/blob/main/pilota-build/src/middle/context.rs#L382.

@Millione
Copy link
Member

@missingdays Any help needed here? I can help implementing the rest if you don't have enough time.

@missingdays
Copy link
Author

missingdays commented Nov 18, 2024

@Millione Yeah sorry, I've been busy most of the last week.
I think only deduplication of the folders is left to be implemented? If you could implement it, that would be great.
I added you as a collaborator to both pilota and volo forks, feel free to push any changes to the fix/context-mode-pub branch if necessary.

@Millione Millione force-pushed the feature/split-generated-files branch from f10c984 to 7c1309e Compare November 20, 2024 14:16
@Millione
Copy link
Member

@missingdays

I think only deduplication of the folders is left to be implemented?

Yes, I just implement it, please take a look if there's something I'm missing. Futhurmore, for the rest of job, I think there is only some detailed code style to change.

@missingdays
Copy link
Author

@Millione Nice!

I think there is only some detailed code style to change.

I'm ready to fix them if you have any comments.

volo-build/src/thrift_backend.rs Outdated Show resolved Hide resolved
volo-build/src/thrift_backend.rs Outdated Show resolved Hide resolved
volo-build/src/thrift_backend.rs Outdated Show resolved Hide resolved
@Millione
Copy link
Member

@missingdays Well done! I think it's ready to merge. May you submit a PR to pilota first?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Generate thrift code in multiple mods/files
3 participants