Variable block size in LSS #182

Open
barche opened this issue Feb 1, 2012 · 14 comments

barche commented Feb 1, 2012

No description provided.

ghost assigned tbanyai Feb 1, 2012
@tlmquintino commented:

What is the objective here? Is it to have multiple block sizes for the same linear system?
E.g. [4x4][3x3][1x1]: 4 coupled equations, 3 coupled equations and 1 equation?

tbanyai commented Feb 2, 2012

  • Nope, that one can be done fairly straightforwardly by wrapping multiple LSS matrices and LSS vectors.
  • The point is to have a different number of unknowns per node (see the indexing sketch below).
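A minimal sketch of what this means at the indexing level (hypothetical helper, not the actual LSS code): with a variable number of unknowns per node, the global row of a variable follows from a prefix sum of the per-node equation counts instead of `node_id * fixed_block_size`.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical indexing helper: each node may carry a different
// number of unknowns, so global rows come from a prefix sum of the
// per-node equation counts rather than node_id * fixed_block_size.
struct VariableBlockIndex
{
  std::vector<std::size_t> offset; // offset[i] = first global row of node i

  explicit VariableBlockIndex(const std::vector<std::size_t>& eqs_per_node)
    : offset(eqs_per_node.size() + 1, 0)
  {
    std::partial_sum(eqs_per_node.begin(), eqs_per_node.end(), offset.begin() + 1);
  }

  // Global row of variable 'var' on node 'node'
  std::size_t row(std::size_t node, std::size_t var) const
  {
    return offset[node] + var;
  }
};

// Example: 3 nodes carrying [u,v,p], [u,v] and [T] gives rows 0..5,
// so VariableBlockIndex idx({3, 2, 1}); idx.row(1, 0) == 3
```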

@tlmquintino commented:

"the point is to have different number of unknowns per nodes"

In which sense you want to have different number of unknowns per nodes?
Different points in a shape function have diff. number of variables?

@tlmquintino commented:

@barche was this what you implemented in the LSS?

@wdeconinck commented:

This is necessary for solving multi-physics in one system. Parts of the matrix will have a different number of equations, and so different block sizes.

Pedro suggested not using a block accumulator at all, and offering an API to insert entries directly into the matrix.
The two approaches could of course co-exist.


barche commented Apr 18, 2012

No, this is not implemented yet; the changes I made only alter the ordering from [u1, v1, p1, ... un, vn, pn] to [u1, v1, ... un, vn, p1, ... pn]. At the interface level, things still work as if the old ordering were in place. The change was purely a preparation for using block preconditioners from the Teko package, which only need the internal Trilinos matrix to have a blocked structure.
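For illustration only, a sketch of that index reordering (the function name and the fixed three-variable layout are assumptions, not the actual code):

```cpp
#include <cstddef>

// Hypothetical mapping for the reordering described above:
// interleaved  [u1, v1, p1, ..., un, vn, pn]  index = node * 3 + var
// blocked      [u1..un, v1..vn, p1..pn]       index = var * n + node
std::size_t interleaved_to_blocked(std::size_t interleaved, std::size_t n_nodes)
{
  const std::size_t node = interleaved / 3; // 3 variables: u, v, p
  const std::size_t var  = interleaved % 3;
  return var * n_nodes + node;
}
```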

To get a variable number of equations, we should first think a bit about an interface that can easily support this. Working directly on the matrix would work and is easy to implement, but it also places a lot of burden on the user. It may be a good first step, though.

@wdeconinck commented:

Direct insertion into the matrix of course needs an API that returns the location in the matrix given an element and variable id.
Pedro has experience with this. Maybe he can be of assistance (coding, or verbal)? (Question for Pedro to respond to.)
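As a sketch of what such an API could look like (all names hypothetical; this is not the existing LSS interface), the matrix could resolve a (node, variable) pair to a global location once and let the caller insert coefficients directly:

```cpp
#include <cstddef>

// Hypothetical direct-insertion interface: the LSS resolves a
// (node, variable) pair to a global matrix location, so assembly
// can bypass a block accumulator entirely.
class DirectMatrixAccess
{
public:
  struct Location { std::size_t row; std::size_t col; };

  // Resolve once; the result can be cached by the caller.
  virtual Location locate(std::size_t row_node, std::size_t row_var,
                          std::size_t col_node, std::size_t col_var) const = 0;

  // Accumulate a single coefficient at a resolved location.
  virtual void add_value(const Location& loc, double value) = 0;

  virtual ~DirectMatrixAccess() = default;
};
```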


pmaciel commented Apr 19, 2012

Hi there,

It has been forever since I spoke to any of you, and I haven't followed CF3 as much as I would have liked, though I hope that will change. This also serves as a disclaimer regarding my knowledge of the code and past design decisions...

From a low-level point of view, and for those familiar with the CSR/CSC matrix formats: there was in the past a BLAS/LAPACK proposal for multi-size block matrix structures, in particular a format called VBR (variable block row), which addresses the problem of variable "block" size. In addition to IA and JA, it added BIA and BJA equivalents. It was not accepted, but somehow made its way into Aztec. It is much more flexible than the original CSR/CSC formats (well, it does have block addressing) and it is an option for multi-physics simulations, in my view. But most (all except Aztec?) solvers don't seem to support it, so I think for generality you're better off writing an indexing wrapper for CSR instead, keeping it simple... and general.
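For readers unfamiliar with VBR, a simplified sketch of the storage (field names approximate; Aztec's DVBR format uses a similar set of rpntr/cpntr/bpntr/bindx/indx arrays):

```cpp
#include <cstddef>
#include <vector>

// Simplified view of a VBR-like (variable block row) storage.
// bpntr plays the role of the "BIA" and bindx the "BJA" mentioned above.
struct VbrMatrix
{
  std::vector<std::size_t> rpntr; // rpntr[i]: first point row of block row i
  std::vector<std::size_t> cpntr; // cpntr[j]: first point column of block column j
  std::vector<std::size_t> bpntr; // bpntr[i]: start of block row i in bindx
  std::vector<std::size_t> bindx; // block column index of each stored block
  std::vector<std::size_t> indx;  // indx[k]: offset of block k in val
  std::vector<double>      val;   // dense block entries, stored contiguously

  // Dimensions of stored blocks vary, unlike in the fixed-size BSR format.
  std::size_t block_rows(std::size_t i) const { return rpntr[i + 1] - rpntr[i]; }
  std::size_t block_cols(std::size_t k) const
  {
    const std::size_t j = bindx[k];
    return cpntr[j + 1] - cpntr[j];
  }
};
```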

But I wanted to clarify one thing, the concept of a "block accumulator", first by understanding what that name means. About its name: it is to serve as an intermediate cache (accumulator) that can somehow represent an element contribution, or a partial calculation regarding a finite element/volume/geometric equation term (block). That's how I see it, at least. From an engineer's point of view, it's nothing more than a local matrix, a highly convenient one because it avoids the global indexing of the particular element in a mesh (typically), so indexing is done once and is reusable. There is a place for a block accumulator, yes, but you see, there are no "elephants all the way down" (a Discworld term for recursion). That means that as soon as you have the hierarchy mesh -> LSS -> block accumulator, you're finished. I think that works for 75% of what everybody needs doing now, but multi-physics and multi-domain require something else (in my opinion).

So my proposition is something else: getting rid of it altogether. First, start with how a block accumulator (BA) is used: it is just a local, element-wise linear system (LS). It contains a small dense matrix, and solution and right-hand side (RHS) vectors (these last two are great additions, by the way). As a side note: in typical CF2 assembly, the BA was calculated locally, the RHS globally, and then the BA got mapped into the big matrix. This required global and local knowledge at the same time... that is, even the ordering of the equations and variables was forced to be sequential between local and global.
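Put concretely, the BA in this reading is nothing more than a small self-contained linear system; a minimal sketch (hypothetical types, not the CF3 classes):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical local, element-wise linear system: a dense matrix plus
// solution and RHS vectors, sized by the number of local unknowns.
struct LocalLinearSystem
{
  std::size_t n = 0;      // local unknowns (e.g. nodes * vars of one element)
  std::vector<double> A;  // n x n dense matrix, row-major
  std::vector<double> rhs;
  std::vector<double> sol;

  explicit LocalLinearSystem(std::size_t size)
    : n(size), A(size * size, 0.0), rhs(size, 0.0), sol(size, 0.0) {}

  double& operator()(std::size_t i, std::size_t j) { return A[i * n + j]; }
};
```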

So once you have a big LS and a small LS, you're missing the mapping. (Now, before I go into this: I haven't really tried this before.) The mapping typically has a relation to a mesh, and this is where things get really funny: you can even get rid of that "restriction". Let me explain: in multi-domain you want one big LS, but you have two different things happening in two domains (A & B), each with their own mesh, and each domain uses different PDEs (NS & heat). So you want:

LS|big <- inter-mesh/coupling-based mapping -> (LS|A) <- mesh A sequential mapping -> (LS|NS) local dense "block-accumulator" linear system
LS|big <- inter-mesh/coupling-based mapping -> (LS|B) <- mesh B sequential mapping -> (LS|Heat) local dense "block-accumulator" linear system

And obviously, the big advantage is that we don't have to stop at "LS|big" but can add an "LS|bigger" that does something else too. That is truly multi-domain: building the simulation out of smaller meshes, while the mappings account for the multi-physics (the inter-mesh/coupling-based mapping). The mappings obviously have to know about the variables and their ordering, and they come in pair combinations: where typically you'd have two maps, "local2global" and "global2local" (ah, but that is so despicable!), you can now get an object myfunnymap< MapType >::add(LSbig, LSA), myfunnymap< MapType >::set(LSbig, LSA), or, even better, declare myfunnymap< MapType > mapping and use mapping.add(), making use of cached indexing tables calculated at construction time.
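A sketch of what such a mapping object could look like (myfunnymap is the placeholder name from above; the add_entry/add_rhs calls on the big system are assumed, and the small system is the local dense LS sketched earlier):

```cpp
#include <cstddef>
#include <vector>

// Sketch of a mapping between a small (local) and a big (global)
// linear system, with the index table cached at construction time.
template <typename MapType>
class myfunnymap
{
public:
  // MapType computes, once, where each local unknown lives globally.
  myfunnymap(const MapType& map, std::size_t n_local)
    : m_index(n_local)
  {
    for (std::size_t i = 0; i < n_local; ++i)
      m_index[i] = map.global_index(i); // cached indexing table
  }

  // Scatter-add the local system into the big one.
  // add_entry/add_rhs are an assumed interface of the big system.
  template <typename BigLS, typename SmallLS>
  void add(BigLS& big, const SmallLS& small) const
  {
    const std::size_t n = m_index.size();
    for (std::size_t i = 0; i < n; ++i)
    {
      big.add_rhs(m_index[i], small.rhs[i]);
      for (std::size_t j = 0; j < n; ++j)
        big.add_entry(m_index[i], m_index[j], small.A[i * n + j]);
    }
  }

private:
  std::vector<std::size_t> m_index; // local unknown -> global row/column
};
```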

I can easily foresee a factory of mappings, and (maybe) a policy pattern, in the style of our "from-any-to-any" temperature conversions (which exist because the conversions are non-linear): type A index ordering gets converted to a COMMON index ordering, then to type B ordering, unless you actually specialize the mapping<typeA,typeB> combination in some optimal way. Sorry, this is a technicality and maybe not a good analogy, and a lot needs to happen before the problem gets to this.

But this way you get the same performance as block-accumulator-type assembly, but in a consistent approach hiding the local/global structures, the performance of cached indexing (or just run-time generation of an index "[i*size+j]"), and multi-domain and multi-physics (the latter at least from the angle of variable block size). Win-win-win-almost-win.

Isn't this the original way Galerkin first saw a PDE problem? Just do a linear system on an element! So I cite prior art: I can't patent it and won't become a millionaire.

My 3.5 cents :-)


pmaciel commented Apr 19, 2012

Oh, that was big. Sorry...

@tlmquintino commented:

Great post!
I especially like the Discworld reference.

I agree with this vision. In the spirit of Zen simplification: two LSS views and a mapping in between.
Creating this indirection increases the future options enormously.

I am only concerned with the efficiency, but premature optimization is the root of all evil. ;)

@wdeconinck commented:

Thanks for the 3.5 cents! Very much appreciated!

It seems like an interesting way, and hopefully, as your reasoning suggests, there would be no performance hit. I would like to see a proof of concept. Feel free to fork and start coding whenever you find some time to do so.

My opinion is functionality and simplicity over performance, provided the performance hit is acceptable (<2%).
I think "functionality and simplicity" should become the CF3 motto and always be on our minds.

@tlmquintino commented:

"Functionality and Simplicity" -- is a very Zen moto. I like it!

@wdeconinck commented:

@tlmquintino You feel very Zen today it seems :p


barche commented Apr 19, 2012

Hi Pedro,

Many thanks for your extensive comment! Some considerations:

  • The "knowledge of variables and their ordering" is currently encapsulated in the VariablesDescriptor (already used in a new create method I added)
  • We should strive towards an extension or evolution of the current LSS interface. Probably new create and access methods will need to be added.
  • Developing in a separate branch is essential, since both Sebastian's PhD work and mine rely heavily on the current LSS as it is used in UFEM. Ideally, not much should break, but I will of course help with the conversion of UFEM if needed.
  • I think currently everyone is stretched for time, so it would be perfect if you could work out some proposals, preferably starting from the LSS interface as-is, assuming you are not stretched for time, that is ;)
