Variable block size in LSS #182

Open
barche opened this issue Feb 1, 2012 · 14 comments

barche commented Feb 1, 2012

No description provided.

ghost assigned tbanyai Feb 1, 2012
@tlmquintino commented:

What is the objective here? Is it to have multiple block sizes for the same linear system?
E.g. [4x4][3x3][1x1]: 4 coupled equations, 3 coupled equations and 1 equation?

tbanyai commented Feb 2, 2012

  • Nope, that one can be done fairly straightforwardly by wrapping multiple LSS matrices and LSS vectors.
  • The point is to have a different number of unknowns per node (see the indexing sketch below).
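A minimal sketch of what this means at the indexing level (hypothetical helper, not the actual LSS code): with a variable number of unknowns per node, the global row of a variable follows from a prefix sum of the per-node equation counts instead of `node_id * fixed_block_size`.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical indexing helper: each node may carry a different
// number of unknowns, so global rows come from a prefix sum of the
// per-node equation counts rather than node_id * fixed_block_size.
struct VariableBlockIndex
{
  std::vector<std::size_t> offset; // offset[i] = first global row of node i

  explicit VariableBlockIndex(const std::vector<std::size_t>& eqs_per_node)
    : offset(eqs_per_node.size() + 1, 0)
  {
    std::partial_sum(eqs_per_node.begin(), eqs_per_node.end(), offset.begin() + 1);
  }

  // Global row of variable 'var' on node 'node'
  std::size_t row(std::size_t node, std::size_t var) const
  {
    return offset[node] + var;
  }
};

// Example: 3 nodes carrying [u,v,p], [u,v] and [T] gives rows 0..5,
// so VariableBlockIndex idx({3, 2, 1}); idx.row(1, 0) == 3
```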

@tlmquintino commented:

"the point is to have different number of unknowns per nodes"

In which sense you want to have different number of unknowns per nodes?
Different points in a shape function have diff. number of variables?

@tlmquintino commented:

@barche was this what you implemented in the LSS?

@wdeconinck commented:

This is necessary for solving multi-physics in one system. Parts of the matrix will have a different number of equations, and so different block sizes.

Pedro suggested not using a block accumulator at all, and offering an API to insert entries directly into the matrix.
The two approaches could of course co-exist.


barche commented Apr 18, 2012

No, this is not implemented yet; the changes I made only alter the ordering from [u1, v1, p1, ... un, vn, pn] to [u1, v1, ... un, vn, p1, ... pn]. At the interface level, things still work as if the old ordering were in place. The change was purely a preparation for using block preconditioners from the Teko package, which only need the internal Trilinos matrix to have a blocked structure.
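For illustration only, a sketch of that index reordering (the function name and the fixed three-variable layout are assumptions, not the actual code):

```cpp
#include <cstddef>

// Hypothetical mapping for the reordering described above:
// interleaved  [u1, v1, p1, ..., un, vn, pn]  index = node * 3 + var
// blocked      [u1..un, v1..vn, p1..pn]       index = var * n + node
std::size_t interleaved_to_blocked(std::size_t interleaved, std::size_t n_nodes)
{
  const std::size_t node = interleaved / 3; // 3 variables: u, v, p
  const std::size_t var  = interleaved % 3;
  return var * n_nodes + node;
}
```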

To get a variable number of equations, we should first think a bit about an interface that can easily support this. Working directly on the matrix would work and is easy to implement, but it also places a lot of burden on the user. It may be a good first step, though.

@wdeconinck commented:

Direct insertion into the matrix of course needs an API that returns the location in the matrix given an element and variable id.
Pedro has experience with this. Maybe he can be of assistance (coding, or verbal)? (Question for Pedro to respond to.)
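As a sketch of what such an API could look like (all names hypothetical; this is not the existing LSS interface), the matrix could resolve a (node, variable) pair to a global location once and let the caller insert coefficients directly:

```cpp
#include <cstddef>

// Hypothetical direct-insertion interface: the LSS resolves a
// (node, variable) pair to a global matrix location, so assembly
// can bypass a block accumulator entirely.
class DirectMatrixAccess
{
public:
  struct Location { std::size_t row; std::size_t col; };

  // Resolve once; the result can be cached by the caller.
  virtual Location locate(std::size_t row_node, std::size_t row_var,
                          std::size_t col_node, std::size_t col_var) const = 0;

  // Accumulate a single coefficient at a resolved location.
  virtual void add_value(const Location& loc, double value) = 0;

  virtual ~DirectMatrixAccess() = default;
};
```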


pmaciel commented Apr 19, 2012

Hi there,

It has been forever since I spoke to any of you, and I haven't followed CF3 as much as I would have liked, though I hope that will change. This also serves as a disclaimer regarding my knowledge of the code and past design decisions...

From a low-level point of view, and for those familiar with the CSR/CSC matrix formats: there was in the past a BLAS/LAPACK proposal for multi-size block matrix structures, in particular a format called VBR (variable block row), which addresses the problem of variable "block" size. In addition to IA and JA, it added BIA and BJA equivalents. It was not accepted, but somehow made its way into Aztec. It is much more flexible than the original CSR/CSC formats (well, it does have block addressing) and it is an option for multi-physics simulations, in my view. But most (all except Aztec?) solvers don't seem to support it, so I think for generality you're better off writing an indexing wrapper for CSR instead, keeping it simple... and general.
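For readers unfamiliar with VBR, a simplified sketch of the storage (field names approximate; Aztec's DVBR format uses a similar set of rpntr/cpntr/bpntr/bindx/indx arrays):

```cpp
#include <cstddef>
#include <vector>

// Simplified view of a VBR-like (variable block row) storage.
// bpntr plays the role of the "BIA" and bindx the "BJA" mentioned above.
struct VbrMatrix
{
  std::vector<std::size_t> rpntr; // rpntr[i]: first point row of block row i
  std::vector<std::size_t> cpntr; // cpntr[j]: first point column of block column j
  std::vector<std::size_t> bpntr; // bpntr[i]: start of block row i in bindx
  std::vector<std::size_t> bindx; // block column index of each stored block
  std::vector<std::size_t> indx;  // indx[k]: offset of block k in val
  std::vector<double>      val;   // dense block entries, stored contiguously

  // Dimensions of stored blocks vary, unlike in the fixed-size BSR format.
  std::size_t block_rows(std::size_t i) const { return rpntr[i + 1] - rpntr[i]; }
  std::size_t block_cols(std::size_t k) const
  {
    const std::size_t j = bindx[k];
    return cpntr[j + 1] - cpntr[j];
  }
};
```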

But I wanted to clarify one thing, the concept of a "block accumulator", first by understanding what that name means. About its name: it is to serve as an intermediate cache (accumulator) that can somehow represent an element contribution, or a partial calculation regarding a finite element/volume/geometric equation term (block). That's how I see it, at least. From an engineer's point of view, it's nothing more than a local matrix, a highly convenient one because it avoids the global indexing of the particular element in a mesh (typically), so indexing is done once and is reusable. There is a place for a block accumulator, yes, but you see, there are no "elephants all the way down" (a Discworld term for recursion). That means that as soon as you have the hierarchy mesh -> LSS -> block accumulator, you're finished. I think that works for 75% of what everybody needs doing now, but multi-physics and multi-domain require something else (in my opinion).

So my proposition is something else: getting rid of it altogether. First, start with how a block accumulator (BA) is used: it is just a local, element-wise linear system (LS). It contains a small dense matrix, and solution and right-hand side (RHS) vectors (these last two are great additions, by the way). As a side note: in typical CF2 assembly, the BA was calculated locally, the RHS globally, and then the BA got mapped into the big matrix. This required global and local knowledge at the same time... that is, even the ordering of the equations and variables was forced to be sequential between local and global.
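Put concretely, the BA in this reading is nothing more than a small self-contained linear system; a minimal sketch (hypothetical types, not the CF3 classes):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical local, element-wise linear system: a dense matrix plus
// solution and RHS vectors, sized by the number of local unknowns.
struct LocalLinearSystem
{
  std::size_t n = 0;      // local unknowns (e.g. nodes * vars of one element)
  std::vector<double> A;  // n x n dense matrix, row-major
  std::vector<double> rhs;
  std::vector<double> sol;

  explicit LocalLinearSystem(std::size_t size)
    : n(size), A(size * size, 0.0), rhs(size, 0.0), sol(size, 0.0) {}

  double& operator()(std::size_t i, std::size_t j) { return A[i * n + j]; }
};
```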

So once you have a big LS and a small LS, you're missing the mapping. (Now, before I go into this: I haven't really tried this before.) The mapping typically has a relation to a mesh, and this is where things get really funny: you can even get rid of that "restriction". Let me explain: in multi-domain you want one big LS, but you have two different things happening in two domains (A & B), each with their own mesh, and each domain uses different PDEs (NS & heat). So you want:

LS|big <- inter-mesh/coupling-based mapping -> (LS|A) <- mesh A sequential mapping -> (LS|NS) local dense "block-accumulator" linear system
LS|big <- inter-mesh/coupling-based mapping -> (LS|B) <- mesh B sequential mapping -> (LS|Heat) local dense "block-accumulator" linear system

And obviously, the big advantage is that we don't have to stop at "LS|big" but can add an "LS|bigger" that does something else too. That is truly multi-domain: building the simulation out of smaller meshes, while the mappings account for the multi-physics (the inter-mesh/coupling-based mapping). The mappings obviously have to know about the variables and their ordering, and they come in pair combinations: where typically you'd have two maps, "local2global" and "global2local" (ah, but that is so despicable!), you can now get an object myfunnymap< MapType >::add(LSbig, LSA), myfunnymap< MapType >::set(LSbig, LSA), or, even better, declare myfunnymap< MapType > mapping and use mapping.add(), making use of cached indexing tables calculated at construction time.
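A sketch of what such a mapping object could look like (myfunnymap is the placeholder name from above; the add_entry/add_rhs calls on the big system are assumed, and the small system is the local dense LS sketched earlier):

```cpp
#include <cstddef>
#include <vector>

// Sketch of a mapping between a small (local) and a big (global)
// linear system, with the index table cached at construction time.
template <typename MapType>
class myfunnymap
{
public:
  // MapType computes, once, where each local unknown lives globally.
  myfunnymap(const MapType& map, std::size_t n_local)
    : m_index(n_local)
  {
    for (std::size_t i = 0; i < n_local; ++i)
      m_index[i] = map.global_index(i); // cached indexing table
  }

  // Scatter-add the local system into the big one.
  // add_entry/add_rhs are an assumed interface of the big system.
  template <typename BigLS, typename SmallLS>
  void add(BigLS& big, const SmallLS& small) const
  {
    const std::size_t n = m_index.size();
    for (std::size_t i = 0; i < n; ++i)
    {
      big.add_rhs(m_index[i], small.rhs[i]);
      for (std::size_t j = 0; j < n; ++j)
        big.add_entry(m_index[i], m_index[j], small.A[i * n + j]);
    }
  }

private:
  std::vector<std::size_t> m_index; // local unknown -> global row/column
};
```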

I can easily foresee a factory of mappings, and (maybe) a policy pattern, in the style of our "from-any-to-any" temperature conversions (which exist because the conversions are non-linear): type A index ordering gets converted to a COMMON index ordering, then to type B ordering, unless you actually specialize the mapping<typeA,typeB> combination in some optimal way. Sorry, this is a technicality and maybe not a good analogy, and a lot needs to happen before the problem gets to this.

But this way you get the same performance as block-accumulator-type assembly, but in a consistent approach hiding the local/global structures, the performance of cached indexing (or just run-time generation of an index "[i*size+j]"), and multi-domain and multi-physics (the latter at least from the angle of variable block size). Win-win-win-almost-win.

Isn't this the original way Galerkin first saw a PDE problem? Just do a linear system on an element! So I cite prior art: I can't patent it and won't become a millionaire.

My 3.5 cents :-)


pmaciel commented Apr 19, 2012

Oh, that was big. Sorry...

@tlmquintino commented:

Great post!
I especially like the Discworld reference.

I agree with this vision. In the spirit of Zen simplification: two LSS views and a mapping in between.
Creating this indirection increases the future options enormously.

I am only concerned with the efficiency, but premature optimization is the root of all evil. ;)

@wdeconinck commented:

Thanks for the 3.5 cents! Very much appreciated!

It seems like an interesting way, and hopefully, as your reasoning suggests, there would be no performance hit. I would like to see a proof of concept. Feel free to fork and start coding whenever you find some time to do so.

My opinion is functionality and simplicity over performance, provided the performance hit is acceptable (<2%).
I think "functionality and simplicity" should become the CF3 motto and always be on our minds.

@tlmquintino commented:

"Functionality and Simplicity" -- is a very Zen moto. I like it!

@wdeconinck commented:

@tlmquintino You feel very Zen today it seems :p


barche commented Apr 19, 2012

Hi Pedro,

Many thanks for your extensive comment! Some considerations:

  • The "knowledge of variables and their ordering" is currently encapsulated in the VariablesDescriptor (already used in a new create method I added)
  • We should strive towards an extension or evolution of the current LSS interface. Probably new create and access methods will need to be added.
  • Developing in a separate branch is essential, since both Sebastian's PhD work and mine rely heavily on the current LSS as it is used in UFEM. Ideally, not much should break, but I will of course help with the conversion of UFEM if needed.
  • I think currently everyone is stretched for time, so it would be perfect if you could work out some proposals, preferably starting from the LSS interface as-is, assuming you are not stretched for time, that is ;)
