Decoder for Time Information not found #28

Open
marcocrxu opened this issue Mar 23, 2023 · 5 comments

Comments

@marcocrxu

marcocrxu commented Mar 23, 2023

Pilgrim uses several methods to compress time information (intervals and durations).
There seems to be no decoder for this time information in pilgrim_app_generator.c. I find the way pilgrim_logger.c writes it to intervals.dat and durations.dat confusing, and I cannot work out how to decode it.
Could you please provide some decoding examples? Many thanks.

@wangvsa
Collaborator

wangvsa commented Mar 23, 2023

The Pilgrim app generator (pilgrim_app_generator.c) is a code generator. Given Pilgrim traces, it tries to generate a C program that reproduces the communication pattern. It relies on the order of the captured calls, not on the detailed time information.

I just added an example of decoding timestamps in pilgrim2text.c.
#29
(You need to rebuild pilgrim and re-generate the traces)

So you may want to start from pilgrim2text.c. You can also try running pilgrim2text /path/to/your/trace-dir to see the outputs.
The command will generate a _text directory under your traces directory.

@wangvsa
Collaborator

wangvsa commented Mar 23, 2023

By the way, the two papers below have all the details about Pilgrim:
Near-Lossless MPI Tracing and Proxy Application Autogeneration
Pilgrim: Scalable and (near) Lossless MPI Tracing

@marcocrxu
Author

I have read your papers and am interested in Pilgrim. Your new PR helped me a lot. Many thanks.
However, I sometimes get a segmentation fault when generating a proxy with Pilgrim. The test programs are from FLASH, such as sedov-3d and stirturb. It seems that sym->val can be negative in some cases here:

void handle_one_symbol_pre(FILE* f, Symbol *sym, CallSignature *cst) {

    if(wt_loop) {

        if(sym->val >= 0)
            wt_loop_count -= get_wt_completed_reqs(&cst[sym->val]);
        /* sym->val may be negative here; the unguarded access below then causes a segmentation fault */
        if(cst[sym->val].func_id != wt_loop_call_id) {
            // .....
        }
    }
}

I tried returning immediately when sym->val < 0, but then the generated proxy is messed up. Do you know how to fix this? Thanks in advance.
Also, I found that the generated proxy uses the same buffer in MPI_Reduce for both send and receive, which results in a runtime error reported by Open MPI. I made a small modification, using a separate buf_recv, to avoid this problem.
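
For the out-of-bounds access in handle_one_symbol_pre described above, note that the first use of cst[sym->val] is guarded but the func_id comparison right after it is not. A minimal sketch of guarding both accesses through one lookup follows. The Symbol and CallSignature definitions, the cst_len parameter, and the helpers lookup_signature and differs_from_loop_call are stand-ins invented for this illustration, not Pilgrim's actual definitions, and it is an assumption that a negative val simply means the symbol has no call signature. The sketch only removes the invalid read; it does not by itself make the generated proxy correct.

```c
#include <stddef.h>

/* Simplified stand-ins for illustration; Pilgrim's real structs differ. */
typedef struct { int val; } Symbol;
typedef struct { int func_id; } CallSignature;

/*
 * Return the signature a symbol refers to, or NULL when sym->val is not a
 * valid index into cst (negative, or >= cst_len).
 */
const CallSignature *lookup_signature(const Symbol *sym,
                                      const CallSignature *cst,
                                      int cst_len) {
    if (sym == NULL || cst == NULL) return NULL;
    if (sym->val < 0 || sym->val >= cst_len) return NULL;
    return &cst[sym->val];
}

/*
 * Return 1 when the symbol maps to a signature whose func_id differs from
 * wt_loop_call_id, and 0 otherwise. A symbol without a signature yields 0,
 * so only the signature-dependent branch is skipped, not the whole handler.
 */
int differs_from_loop_call(const Symbol *sym, const CallSignature *cst,
                           int cst_len, int wt_loop_call_id) {
    const CallSignature *sig = lookup_signature(sym, cst, cst_len);
    if (sig == NULL) return 0;
    return sig->func_id != wt_loop_call_id;
}
```

Skipping only the branch that needs a signature, rather than returning from the handler, keeps any other per-symbol work in place. Whether that is the right behavior for negative values is something only the maintainers can confirm.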

@marcocrxu
Author

Here is a case where I return immediately in handle_one_symbol_pre. The function generated for the non-terminal is empty, which is abnormal.
I set buf_0 = malloc(10000000); myself, since buf_0 = malloc(0); is sometimes generated.

image
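
On the malloc(0) case: the C standard leaves the result of a zero-size allocation implementation-defined (either a null pointer or a pointer that must not be dereferenced), so any later use of such a buffer is unsafe. Below is a rough sketch of sizing the buffer from the largest message it has to hold, with a one-byte floor, instead of hard-coding a large constant. The helper names max_message_bytes and alloc_proxy_buffer, and the premise that the per-call byte counts are available as an array, are assumptions for illustration, not part of Pilgrim.

```c
#include <stddef.h>
#include <stdlib.h>

/* Largest entry in sizes[0..n-1]; returns 0 when sizes is NULL or n == 0. */
size_t max_message_bytes(const size_t *sizes, size_t n) {
    size_t max = 0;
    if (sizes == NULL) return 0;
    for (size_t i = 0; i < n; i++) {
        if (sizes[i] > max) max = sizes[i];
    }
    return max;
}

/*
 * Allocate a zero-initialized buffer large enough for the largest message.
 * A request of 0 bytes is rounded up to 1 so the call never hits the
 * implementation-defined zero-size case. Returns NULL only on failure.
 */
void *alloc_proxy_buffer(const size_t *sizes, size_t n) {
    size_t bytes = max_message_bytes(sizes, n);
    if (bytes == 0) bytes = 1;
    return calloc(1, bytes);
}
```

A computed size of 0 may also indicate that the message sizes were not recovered from the trace, so the floor should be read as a safety net rather than a fix for the underlying cause.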

@wangvsa
Collaborator

wangvsa commented Apr 24, 2023

Sorry for the late response; I was fully occupied by other projects. I will take a look at this, but it might take me a while.
