This post was last edited by ancientcc on 2016-8-6 21:07.
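A major task of display rendering is layered rendering (drawing_buffer_commit), and the unit of layered rendering is tblit. There are many such units, and every layered render rebuilds the whole drawing_buffer_ from scratch. This raises a concern: if constructing and copying tblit objects takes a lot of time, the overall efficiency of the app suffers.
So how efficient is tblit? Let's look at an actual run. The test machine is a 2015 MacBook Pro.
In this test, the loc member of tblit has type image::locator.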
    void do_test()
    {
        const int size = 1100;
        // Load the image once before timing starts.
        image::get_image("misc/0.png");
        image::tblit* blit_buf[size];
        for (int loop = 0; loop < 5; loop ++) {
            std::vector<image::tblit> blits, blits2;
            uint32_t start_ticks = SDL_GetTicks();
            // blits.reserve(size);
            // 1. Construct temporaries and push them into a vector.
            for (int i = 0; i < size; i ++) {
                blits.push_back(image::tblit("misc/0.png", image::UNSCALED));
            }
            uint32_t ticks1 = SDL_GetTicks();
            // 2. Copy the whole vector.
            blits2 = blits;
            uint32_t ticks2 = SDL_GetTicks();
            // 3. Allocate each tblit on the heap, then free it again.
            for (int i = 0; i < size; i ++) {
                blit_buf[i] = new image::tblit("misc/0.png", image::UNSCALED);
            }
            uint32_t ticks3 = SDL_GetTicks();
            for (int i = 0; i < size; i ++) {
                delete blit_buf[i];
            }
            posix_print("#%i, tblit, push_back: %u, copy: %u, new: %u, delete %u\n", loop,
                ticks1 - start_ticks, ticks2 - ticks1, ticks3 - ticks2, SDL_GetTicks() - ticks3);
            // 4. Same push_back and copy measurements for plain surfaces, as a baseline.
            std::vector<surface> surfs, surfs2;
            start_ticks = SDL_GetTicks();
            for (int i = 0; i < size; i ++) {
                surfs.push_back(image::get_image("misc/0.png"));
            }
            ticks1 = SDL_GetTicks();
            surfs2 = surfs;
            posix_print("#%i, surface, push_back: %u, copy: %u\n", loop, ticks1 - start_ticks, SDL_GetTicks() - ticks1);
        }
    }
Output (times in milliseconds):
    #0, tblit, push_back: 3, copy: 1, new: 7, delete 5
    #0, surface, push_back: 1, copy: 0
    #1, tblit, push_back: 3, copy: 1, new: 7, delete 6
    #1, surface, push_back: 1, copy: 0
    #2, tblit, push_back: 3, copy: 0, new: 8, delete 5
    #2, surface, push_back: 2, copy: 0
    #3, tblit, push_back: 4, copy: 0, new: 8, delete 6
    #3, surface, push_back: 1, copy: 0
    #4, tblit, push_back: 3, copy: 0, new: 8, delete 5
    #4, surface, push_back: 2, copy: 0
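This means that, without any optimization, rendering 1000 units costs about 3 ms per drawing_buffer_commit just for constructing tblit objects (and more if modifications_ in image::locator is not empty). For an app scene, rendering 1000 units in one pass is common.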
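The test above also measures the time spent in new and delete. Some people believe that std::vector::push_back unavoidably involves a copy construction and, to avoid it, switch to an array of pointers such as "image::tblit* buf[1000]". The results above show, however, that new and delete cost more time than the copy construction done by std::vector. The large gap between new/delete and push_back in this example is partly due to the Debug build. A Release build narrows the gap, but new/delete still has no advantage over push_back.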
How to improve image::tblit copy efficiency
- Change the type of loc in image::tblit to "image::locator*";
- When assigning image::tblit::loc, use the function image::get_locator.
    std::set<locator> locators;

    // Returns the shared instance equal to loc, adding it to locators on first use.
    const locator& get_locator(const locator& loc)
    {
        std::set<image::locator>::const_iterator it = locators.find(loc);
        if (it != locators.end()) {
            return *it;
        }
        std::pair<std::set<image::locator>::const_iterator, bool> res = locators.insert(loc);
        return *res.first;
    }
- The app only ever adds elements to locators and never removes them, which guarantees that the loc held by an image::tblit stays valid.