Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved library performance by caching the internal image data, #9

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

AquariusPower
Copy link
Contributor

@AquariusPower AquariusPower commented Mar 28, 2018

this way, instead of delete (free) the image data memory,
if it matches the requirements (WxH) that cached data will be re-used (just to overwrite not read).

By not trying to free memory, therefore not requiring to allocate new
memory, it is easier on the OS.

As a practical result, it avoids slowness and memory swapping to HD.

EDIT: mmm.. I should provide a stress loop to demonstrate it, I only measured the improvement at Ivan game.

this way, instead of delete (free) the image data memory,
if it matches the requirements (WxH) that cached data will be re-used.
By not trying to free memory, therefore not requiring to allocate new
memory, it is easier on the OS.
As a practical result, it avoids slowness and memory swapping to HD.
libxbrzscale.cpp Outdated
* dst_img if not null may be re-used, and is also returned
*/
SDL_Surface* libxbrzscale::scale(SDL_Surface* dst_imgCache, SDL_Surface* src_img, int scale){
if(scale<2 || scale>6){std::cerr<<"invalid stretch value min=2, max=6, requested="<<scale<<std::endl;exit(1);}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use console output.
Since it's a class now, better way to return an error would be to have an errorCode and maybe errorMessage property that you are supposed to check if this function returns NULL for example.
exit(1) from a library because some parameter was bad is not common practice and I would have a WTF moment if something like this happend to me.

Copy link
Contributor Author

@AquariusPower AquariusPower Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EDIT: NM... I was asleep in the comment below xD

I am not sure how to code:
errorCode
errorMessage
?
Must be on the code right? will I take a look as soon I can :)

and, what about if instead of exit(1) it was assert()? it gives a nice output if the programmer did some mistake, as xBRZ wont work out of these constraints, so should only happen in development time.

libxbrzscale.cpp Outdated
@@ -134,8 +138,30 @@ void displayImage(SDL_Surface* surface, const char* message) {
}
*/

struct imgUint32{
uint32_t *data;
int iW;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why iW / iH? size should be enough, 32x64 is equal to 64x32 in terms of pixel count, since it's a flat pixel array, it doesn't really matter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right! :)

libxbrzscale.cpp Outdated
};
std::vector<imgUint32> vImgUint32Cache;
imgUint32 ImgUint32Cache(int iW,int iH){ //as that memory deletion wont happen promptly, it may consume too much memory too fast unnecessarily becoming a bottle neck on performance
for(int i=0;i<vImgUint32Cache.size();i++){
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you drop W & H parameters for size, you could use bigger buffers for smaller images. This could save some memory and speed things up, as a bigger buffer could be used for smaller images too.

Test loop could be as simple as if (vImgUint32Cache[i].size >= imageSize)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct again! only the bigger cache is necessary!

@atheros
Copy link
Owner

atheros commented Apr 3, 2018

Thank you for your PR.
I'll need some time to test it.

@AquariusPower
Copy link
Contributor Author

May be before testing, these things you pointed out should be adjusted first.
I will see if I do it directly on Ivan and promptly test there.

Btw, ivanWin wont compile if I keep SDL_GetPixel and SDL_PutPixel as inline!! I have no idea why, travisci went crazy or it is some configuration there that they need to adjust to let travisci work as expected: Attnam/ivan#331 (comment)

Copy link
Owner

@atheros atheros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I though about it. What I propose is to move all cache stuff to a child class and add two virtual functions called from scale()
One could be allocatePixels(), the other could be releasePixels() (or whatever)
In the base class the first would do simple allocation, the second one would do the delete, while in the child class you could do some smart caching not affecting the base class and it's performance in other applications.

Since the interface is growing, I'll probably want to refactor some names and add a namespace to all of it when we merge this

libxbrzscale.cpp Outdated
bool libxbrzscale::bFreeInputSurfaceAfterScale=true;
bool libxbrzscale::bFreeOutputSurfaceAfterScale=true;

#ifndef XBRZLIB_NOINLINEGETSETPIX //this may be required to let some compillers linking actually work w/o error: "undefined reference to" these
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you check performance gains of using inline versus more aggressive compiler automatic optimizations? I'm not sure using inline these days is really useful - compilers are usually smarter that we are :)

if you really think it's needed, people usually do this:

#ifdef SOMETHING
#define XXX_INLINE inline
#else
#define XXX_INLINE
#endif

XXX_INLINE foo() {...}
XXX_INLINE bar() {...}

That way code is more readable. XXX should obviously be some library tag.

Copy link
Contributor Author

@AquariusPower AquariusPower Apr 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I actually just learned about inline.
I was just trying to keep things as they originally were (with inline).

I dont know how to actually test it, I guess some profiler would be required, and to disable dynamic CPU throttle, and also a stress test code, but I have only practice in profilling java using NetBeans (that has in integrated profiler that can hook on any java running application, so it is ready for profilling noobs like me :))

So, if you believe inline is unnecessary, I can just remove it xD.
but anyway, your alternative define is indeed much more readable!

what you think? we keep it? do you have practice on profilling cpp?

EDIT: I can't even debug ivan in eclipse yet shame :/ thats why I created this optional code that is a bit troubling to use but works very fine: https://github.com/AquariusPower/CppDebugMessages

libxbrzscale.cpp Outdated

#ifndef XBRZLIB_NOINLINEGETSETPIX //this may be required to let some compillers linking actually work w/o error: "undefined reference to" these
inline
#endif
Uint32 libxbrzscale::SDL_GetPixel(SDL_Surface *surface, int x, int y)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since those functions don't touch any instance properties or methods, they don't need to be inside the class.

I think inline class methods should have implementation in the header if the class declaration is there. It is possible this caused IVAN build issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try it. I upload a commit and travisci compile ivanWin (that was failing cuz of inline on cpp file).

@AquariusPower
Copy link
Contributor Author

AquariusPower commented Apr 5, 2018

I though about it. What I propose is to move all cache stuff to a child class and add two virtual functions called from scale()

I dont know how virtual functions work, therefore I cant see their advantages neither grant what to do so that they will work properly, have to google about it or you got some tips?

One could be allocatePixels(), the other could be releasePixels() (or whatever)
In the base class the first would do simple allocation, the second one would do the delete, while in the child class you could do some smart caching not affecting the base class and it's performance in other applications.

But, as who will call allocatePixels() will have to wait for it to return, even being on a child class, wouldnt that still impact the caller's performance?

Unless you are thinking on multithreading, but even then, only if we predict the new size the next call will want allocated (we could just guess double of current request for ex.) so it will be already ready whenever that request happens.

May be, the caller application (like ivan) could make a pre-request for minimum buffer size, based on user options, ex.: the dungeon area can be about 1000x500 pixels, so 500000 or 0.5MB (and libxbrz would still add +10% safety margin), and that buffer would stay unmodified until the application exits!
But the code is ready there waiting for a bigger buffer request, so it would still not fail if that happens.

Since the interface is growing, I'll probably want to refactor some names and add a namespace to all of it when we merge this

Excellent! I use the naming that I can fastly think about, and is the clearest as possible "to me" (this is why I keep changing it :)), if you think they can be improved that is great!
but anyway this is your project and I am only contributing :)
I just have to update the calls to your changes and we are all good! :).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants