Mobile Safari Performance - NetLogo/Tortoise GitHub Wiki

Introduction

The goal of this page is to assess performance and/or functionality differences between Mobile Safari (especially iPad mobile Safari) and desktop browsers.

Summary

The sections below add a bit more color to the following key points:

Tortoise has not yet been optimized for either CPU or memory performance.
Mobile Safari is a slower browser running on more limited hardware, and it is disproportionately slow at executing non-optimized JavaScript.
Moving computation onto a separate thread would ameliorate some of these problems, but would take serious time and thought.
An improved event loop may help to provide a better user experience, and is in the pipeline to be released alongside authoring.
Setting the speed slider to a lower speed is a temporary workaround that may give a more responsive user experience.

Delayed JavaScript optimization

Tortoise's JavaScript has not been optimized for performance. Profiling identified a few areas of code that could be tuned to make the "Trout" simulation faster, but there is no guarantee that those optimizations would benefit all models. For example, the tested "Trout" model has an operation in which each member of a turtleset counts the members of that turtleset. While this ends up being expensive for this model, optimizing it would likely make other primitives (such as hatch and die) slower.
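To make that trade-off concrete, here is a minimal sketch contrasting a set that counts by traversal with one that keeps a running count. Both classes, their method names, and the `alive` flag are hypothetical and are not Tortoise's actual data structures; the point is only that making count cheap moves bookkeeping into hatch and die.

```javascript
// Hypothetical agent set; NOT Tortoise's actual implementation.
class LazyTurtleSet {
  constructor() {
    this.turtles = [];
  }
  hatch() {
    const turtle = { alive: true };
    this.turtles.push(turtle);
    return turtle;
  }
  die(turtle) {
    // Cheap: only flags the turtle; nothing else is updated.
    turtle.alive = false;
  }
  count() {
    // Walks every turtle on each call, so n turtles each calling
    // count() costs O(n^2) work per tick.
    let n = 0;
    for (const t of this.turtles) {
      if (t.alive) n += 1;
    }
    return n;
  }
}

// Same interface, but count() is O(1) because hatch and die pay for it.
class CountingTurtleSet extends LazyTurtleSet {
  constructor() {
    super();
    this.liveCount = 0;
  }
  hatch() {
    this.liveCount += 1; // extra bookkeeping on every hatch
    return super.hatch();
  }
  die(turtle) {
    if (turtle.alive) this.liveCount -= 1; // extra bookkeeping on every die
    super.die(turtle);
  }
  count() {
    return this.liveCount;
  }
}
```

In this toy version the extra bookkeeping is a single increment. In a real engine it may be considerably heavier, which is why a model that hatches and kills many turtles but rarely counts them could end up slower.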

JavaScript memory usage is another speculative area for optimization. It could have an especially large performance impact on devices like the iPad, which are quite memory-constrained. In Chrome, memory use increases over time as the application runs. This suggests, though it does not prove, that the engine leaks memory, which could degrade performance over time. Additionally, we see relatively frequent garbage collections (GCs) in Chrome. Each GC pauses the simulation for a millisecond or two while it reclaims memory that is no longer in use. While it may be possible to reduce garbage collection by optimizing memory usage, doing so may adversely affect computation speed and so needs to be done carefully. Done correctly, optimizing memory usage could both increase computation speed and decrease memory use, but it would require an investment of time to determine where memory usage can be reduced without slowing the simulation.
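As an illustration of the kind of change that can reduce GC pressure, the sketch below compares a per-tick update that allocates new objects with one that updates records in place. The turtle record shape (`x`, `y`, `dx`, `dy`) and both function names are assumptions for illustration; nothing here has been identified as an actual source of garbage in Tortoise.

```javascript
// Hypothetical turtle records of the form { x, y, dx, dy }.

// Allocates a new array plus one new object per turtle on every tick.
// The previous tick's objects all become garbage.
function stepAllocating(turtles) {
  return turtles.map((t) => ({
    x: t.x + t.dx,
    y: t.y + t.dy,
    dx: t.dx,
    dy: t.dy,
  }));
}

// Updates the existing records in place, so a tick allocates nothing.
function stepInPlace(turtles) {
  for (let i = 0; i < turtles.length; i++) {
    const t = turtles[i];
    t.x += t.dx;
    t.y += t.dy;
  }
  return turtles;
}
```

The in-place version produces less garbage but gives up immutability, so any code holding a reference to the old state would see it change. That is one reason such changes need care.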

Computing on the view thread

Currently, all of our computation takes place on the same thread as the view. This keeps the code simple, but it means that computation and view updates have to share the thread, which we negotiate via the event loop (see below). Long-term, it would be nice to move all of our engine computation to a background thread using JavaScript Web Workers. This would allow for a responsive UI and, hopefully, a more consistent simulation speed. It would be a significant architectural change to the application, however, and will require time.
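A rough sketch of what such a split might look like follows. The message shapes (`tick`, `reset`, `viewUpdate`), the `world` object, and the file name `engine-worker.js` are all assumptions made for illustration; the real engine state is far richer than a tick counter, and no such design exists in Tortoise yet.

```javascript
// Engine side: a plain message handler, so the same logic can run inside a
// worker or inline. Message shapes are hypothetical.
function handleEngineMessage(world, message) {
  switch (message.type) {
    case "tick":
      world.ticks += 1;
      return { type: "viewUpdate", ticks: world.ticks };
    case "reset":
      world.ticks = 0;
      return { type: "viewUpdate", ticks: 0 };
    default:
      return { type: "error", reason: "unknown message: " + message.type };
  }
}

// Worker wiring: only active when this script is loaded inside a Web Worker.
if (typeof WorkerGlobalScope !== "undefined" && self instanceof WorkerGlobalScope) {
  const world = { ticks: 0 };
  self.onmessage = (event) => {
    self.postMessage(handleEngineMessage(world, event.data));
  };
}

// Page side (browser only): start the worker and draw whatever it reports.
// "engine-worker.js" is a hypothetical file containing the code above.
function startEngine(render) {
  const worker = new Worker("engine-worker.js");
  worker.onmessage = (event) => render(event.data);
  worker.postMessage({ type: "tick" });
  return worker;
}
```

Because workers communicate only through messages, the view thread would receive copies of whatever state it needs to draw. Deciding what to send, and how often, is likely to be a large part of the architectural work.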

The Event Loop

Tortoise runs by installing a timer with setTimeout that periodically calls the update method goForever(). This method does the bulk of the work: calculating world changes and updating the view. In Chrome, a call to goForever() takes 100-150ms. In Safari, it takes at least 350ms. In the iOS Simulator, it takes about 3s.
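The general shape of such a loop, with timing added, can be sketched as follows. This is not Tortoise's actual loop: `runLoop`, `maxRuns`, and the injectable `schedule` and `now` parameters are assumptions introduced so the sketch terminates and can be exercised outside a browser. `update` stands in for goForever().

```javascript
// Minimal setTimeout-driven loop that records how long each update takes.
// `update` stands in for goForever(); `maxRuns` exists only so the sketch
// stops; `schedule` and `now` are injectable for testing.
function runLoop(update, delayMs, maxRuns, onDone, schedule = setTimeout, now = Date.now) {
  const durations = [];
  function step() {
    const start = now();
    update(); // the view cannot repaint or handle input until this returns
    durations.push(now() - start);
    if (durations.length < maxRuns) {
      schedule(step, delayMs);
    } else {
      onDone(durations);
    }
  }
  schedule(step, delayMs);
}
```

A call such as `runLoop(goForever, 0, 20, (d) => console.log(d))` would log twenty per-call durations. Note that while `update` runs, nothing else on the thread does, so a 3s update means a 3s unresponsive page.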

Additionally, differences in the precision of setTimeout affect the event loop. setTimeout takes the number of milliseconds to wait before calling the function, but the time actually waited varies greatly by browser. In particular, Mobile Safari seems to be highly variable in the amount of time it waits. This makes it more likely to attempt to compute two ticks at once, which is noticeably slower.
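The arithmetic behind a doubled tick can be illustrated with a small example. The catch-up rule below (one tick owed per elapsed interval) is an assumption about how a loop might keep simulation time in step with wall-clock time; it is not taken from Tortoise's source, and `ticksOwed` is a hypothetical helper.

```javascript
// Hypothetical catch-up rule: the loop owes one tick per full interval that
// has elapsed since it last ran, and always runs at least one.
function ticksOwed(lastRunMs, nowMs, intervalMs) {
  return Math.max(1, Math.floor((nowMs - lastRunMs) / intervalMs));
}

// Timer asked to fire 20 ms after the last run at t = 0:
const onTime = ticksOwed(0, 24, 20); // fired 4 ms late: 1 tick owed
const late = ticksOwed(0, 43, 20);   // fired 23 ms late: 2 ticks owed
```

Under a rule like this, a timer that overshoots by more than one interval makes the next callback do twice the work, which in turn delays the callback after it.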

To work around these difficulties and get somewhat more usable performance in iOS, the user can decrease the speed at which the model is run. This leads to about one view update every 2 seconds. While we would prefer to have many view updates per second, this at least increases the feeling of responsiveness. The event loop should be re-examined in greater depth after the authoring changes have been merged, as these will substantially alter the structure of the event loop.

Testing tools

Testing for this report used the following tools:

The iOS Simulator with iPad 2 / iOS 7 and the Safari iOS developer tools
Safari Version 7.1 on Apple OS X 10.9.5 with developer tools
Google Chrome 38.0.2125.111 for Mac with developer tools

Testing across three different browsers gives us different information about how the program is running and what sort of things might cause performance problems.

I ran a simple array benchmark in all three browsers, with the following results (higher is better):

| measured operation               | Chrome     | Safari     | Mobile Safari |
|----------------------------------|------------|------------|---------------|
| push                             | 28,223,021 | 18,914,461 |     6,382,836 |
| unshift                          |  1,577,820 |  2,362,080 |     1,311,721 |
| direct array assignment          | 21,249,285 | 18,308,792 |     6,066,533 |
| fixed array assignment           | 24,768,013 | 18,462,508 |     5,544,585 |
| direct array assignment w/length | 33,556,145 | 17,067,435 |     5,955,050 |

Array operations are relatively common in Tortoise code. Apart from unshift, where the gap is only about 1.2 times, this benchmark suggests that we might expect desktop Chrome to be roughly 3.5 to 5.6 times faster than Mobile Safari on these operations.
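The benchmark code itself is not reproduced on this page. A measurement of this kind could be taken with something like the sketch below. The function names, the batching, the array size cap, and the choice of operations are all assumptions made for illustration, so its numbers are not comparable to the table above.

```javascript
// Counts completed calls to `operation` per second over a short window.
// `windowMs` must be greater than zero.
function opsPerSecond(operation, windowMs) {
  const start = Date.now();
  let ops = 0;
  let elapsed = 0;
  while (elapsed < windowMs) {
    // Work in batches so the clock is not read on every iteration.
    for (let i = 0; i < 1000; i++) operation(ops + i);
    ops += 1000;
    elapsed = Date.now() - start;
  }
  return Math.round((ops * 1000) / elapsed);
}

// Stand-in operations; arrays are reset at `size` elements so that
// unshift does not slow down without bound as the array grows.
function benchmarkArrays(windowMs = 100) {
  const size = 1000;
  let pushed = [];
  let unshifted = [];
  const fixed = new Array(size);
  return {
    push: opsPerSecond((i) => {
      if (pushed.length >= size) pushed = [];
      pushed.push(i);
    }, windowMs),
    unshift: opsPerSecond((i) => {
      if (unshifted.length >= size) unshifted = [];
      unshifted.unshift(i);
    }, windowMs),
    fixedAssignment: opsPerSecond((i) => {
      fixed[i % size] = i;
    }, windowMs),
  };
}
```

Running `console.log(benchmarkArrays())` in each browser's console gives a rough per-browser comparison. Results vary from run to run, so several runs per browser would be needed before drawing conclusions.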

This was not tested in Chrome for iOS. Chrome for iOS is unlikely to perform measurably differently from Mobile Safari, given that both use the same backend components.