-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved library performance by caching the internal image data, #9
base: master
Are you sure you want to change the base?
Conversation
this way, instead of delete (free) the image data memory, if it matches the requirements (WxH) that cached data will be re-used. By not trying to free memory, therefore not requiring to allocate new memory, it is easier on the OS. As a practical result, it avoids slowness and memory swapping to HD.
libxbrzscale.cpp
Outdated
* dst_img if not null may be re-used, and is also returned | ||
*/ | ||
SDL_Surface* libxbrzscale::scale(SDL_Surface* dst_imgCache, SDL_Surface* src_img, int scale){ | ||
if(scale<2 || scale>6){std::cerr<<"invalid stretch value min=2, max=6, requested="<<scale<<std::endl;exit(1);} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use console output.
Since it's a class now, better way to return an error would be to have an errorCode and maybe errorMessage property that you are supposed to check if this function returns NULL for example.
exit(1) from a library because some parameter was bad is not common practice and I would have a WTF moment if something like this happend to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EDIT: NM... I was asleep in the comment below xD
I am not sure how to code:
errorCode
errorMessage
?
Must be on the code right? will I take a look as soon I can :)
and, what about if instead of exit(1) it was assert()? it gives a nice output if the programmer did some mistake, as xBRZ wont work out of these constraints, so should only happen in development time.
libxbrzscale.cpp
Outdated
@@ -134,8 +138,30 @@ void displayImage(SDL_Surface* surface, const char* message) { | |||
} | |||
*/ | |||
|
|||
struct imgUint32{ | |||
uint32_t *data; | |||
int iW; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why iW / iH? size should be enough, 32x64 is equal to 64x32 in terms of pixel count, since it's a flat pixel array, it doesn't really matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right! :)
libxbrzscale.cpp
Outdated
}; | ||
std::vector<imgUint32> vImgUint32Cache; | ||
imgUint32 ImgUint32Cache(int iW,int iH){ //as that memory deletion wont happen promptly, it may consume too much memory too fast unnecessarily becoming a bottle neck on performance | ||
for(int i=0;i<vImgUint32Cache.size();i++){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you drop W & H parameters for size, you could use bigger buffers for smaller images. This could save some memory and speed things up, as a bigger buffer could be used for smaller images too.
Test loop could be as simple as if (vImgUint32Cache[i].size >= imageSize)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct again! only the bigger cache is necessary!
Thank you for your PR. |
May be before testing, these things you pointed out should be adjusted first. Btw, ivanWin wont compile if I keep SDL_GetPixel and SDL_PutPixel as inline!! I have no idea why, travisci went crazy or it is some configuration there that they need to adjust to let travisci work as expected: Attnam/ivan#331 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I though about it. What I propose is to move all cache stuff to a child class and add two virtual functions called from scale()
One could be allocatePixels(), the other could be releasePixels() (or whatever)
In the base class the first would do simple allocation, the second one would do the delete, while in the child class you could do some smart caching not affecting the base class and it's performance in other applications.
Since the interface is growing, I'll probably want to refactor some names and add a namespace to all of it when we merge this
libxbrzscale.cpp
Outdated
bool libxbrzscale::bFreeInputSurfaceAfterScale=true; | ||
bool libxbrzscale::bFreeOutputSurfaceAfterScale=true; | ||
|
||
#ifndef XBRZLIB_NOINLINEGETSETPIX //this may be required to let some compillers linking actually work w/o error: "undefined reference to" these |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you check performance gains of using inline versus more aggressive compiler automatic optimizations? I'm not sure using inline these days is really useful - compilers are usually smarter that we are :)
if you really think it's needed, people usually do this:
#ifdef SOMETHING
#define XXX_INLINE inline
#else
#define XXX_INLINE
#endif
XXX_INLINE foo() {...}
XXX_INLINE bar() {...}
That way code is more readable. XXX should obviously be some library tag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I actually just learned about inline.
I was just trying to keep things as they originally were (with inline).
I dont know how to actually test it, I guess some profiler would be required, and to disable dynamic CPU throttle, and also a stress test code, but I have only practice in profilling java using NetBeans (that has in integrated profiler that can hook on any java running application, so it is ready for profilling noobs like me :))
So, if you believe inline is unnecessary, I can just remove it xD.
but anyway, your alternative define is indeed much more readable!
what you think? we keep it? do you have practice on profilling cpp?
EDIT: I can't even debug ivan in eclipse yet shame :/ thats why I created this optional code that is a bit troubling to use but works very fine: https://github.com/AquariusPower/CppDebugMessages
libxbrzscale.cpp
Outdated
|
||
#ifndef XBRZLIB_NOINLINEGETSETPIX //this may be required to let some compillers linking actually work w/o error: "undefined reference to" these | ||
inline | ||
#endif | ||
Uint32 libxbrzscale::SDL_GetPixel(SDL_Surface *surface, int x, int y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since those functions don't touch any instance properties or methods, they don't need to be inside the class.
I think inline class methods should have implementation in the header if the class declaration is there. It is possible this caused IVAN build issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can try it. I upload a commit and travisci compile ivanWin (that was failing cuz of inline on cpp file).
I dont know how virtual functions work, therefore I cant see their advantages neither grant what to do so that they will work properly, have to google about it or you got some tips?
But, as who will call allocatePixels() will have to wait for it to return, even being on a child class, wouldnt that still impact the caller's performance? Unless you are thinking on multithreading, but even then, only if we predict the new size the next call will want allocated (we could just guess double of current request for ex.) so it will be already ready whenever that request happens. May be, the caller application (like ivan) could make a pre-request for minimum buffer size, based on user options, ex.: the dungeon area can be about 1000x500 pixels, so 500000 or 0.5MB (and libxbrz would still add +10% safety margin), and that buffer would stay unmodified until the application exits!
Excellent! I use the naming that I can fastly think about, and is the clearest as possible "to me" (this is why I keep changing it :)), if you think they can be improved that is great! |
this way, instead of delete (free) the image data memory,
if it matches the requirements (WxH) that cached data will be re-used (just to overwrite not read).
By not trying to free memory, therefore not requiring to allocate new
memory, it is easier on the OS.
As a practical result, it avoids slowness and memory swapping to HD.
EDIT: mmm.. I should provide a stress loop to demonstrate it, I only measured the improvement at Ivan game.