I started posting about this project on 17 August and made my first code commit on 26 August. In my previous post I announced the Proof of Concept (PoC) version of the application.
The PoC version will mark the end of the first iteration of the project. This seems like an opportune time to reflect on the work that has been done so far.
The original goals of the project were:
- Test Datastore access rates.
- Test client side responsiveness while fetching a lot of data from the server.
- Evaluate the GWT and GAE development environments for doing substantial work quickly and with decent quality.
- Evaluate the robustness of the GAE model of doing work in 30 second chunks.
- Evaluate datastore scalability.
- Create a few re-usable components.
- Learn how to create a simple UI. My design skills are limited and I like the sparse Google web page layouts, so I hope there are some widgets and examples for building sparse Googly web pages.
Here is how the project fared against each of these goals, in the same order:
- The application tested the Google datastore access rates. They seemed pretty reasonable, typically a few hundredths of a second to fetch a 100 KB data record. In most cases this was helped by the memcache sitting in front of the datastore (code is here). As shown in the example in the previous post, the datastore access time was much shorter than the few tenths of a second it takes to fetch the data back to the client.
- Please test the client side responsiveness and tell me how well you think the application responds while fetching approximately 1 MB of data from the server per screen. My experience was that it did a pretty good job on Chrome, was okay on Firefox and Opera, slower on Safari and too slow on IE8. All these browsers were tested on Vista from a home DSL connection in Australia.
- I will write a separate post on the GWT and GAE development environments. For now I can say that it was easy to get started with a GWT+GAE+Eclipse+github+cygwin environment from the laptop running Vista that was available to me when I started the project.
- GAE could do all the datastore accesses that were required of my application in much less than 30 seconds. I got client response problems on IE8 when fetching 10 records at a time, which took GAE about 0.1 second to process. GAE seems well suited to GUI+database applications.
- As far as I could tell the datastore has scaled perfectly so far. According to the GAE dashboard, the application is currently using 200 MB of storage, which at about 100 KB per record is a minimum of 2,000 records.
- The server cache pipeline is probably re-usable and the client side cache can probably be adapted fairly easily. Apart from that I did not create much re-usable code.
- I learned how to create a simple UI with GWT. My design skills are still limited but I liked the sparse Googly list and navigation buttons.
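As a rough illustration of what a cache sitting in front of a datastore can look like, here is a minimal read-through sketch. It is a guess at the general shape, not the project's actual pipeline: `RecordStore`, `MapStore` and `CachePipeline` are hypothetical names, and the in-memory `MapStore` merely stands in for memcache and the datastore so the example runs without any GAE libraries.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One stage of the pipeline. Hypothetical interface, not a GAE API. */
interface RecordStore {
    byte[] get(String key);              // returns null on a miss
    void put(String key, byte[] value);
}

/** In-memory stand-in for either memcache or the datastore (assumption). */
class MapStore implements RecordStore {
    private final Map<String, byte[]> data = new HashMap<>();
    int reads = 0;                       // counts lookups so hits can be observed

    public byte[] get(String key) { reads++; return data.get(key); }
    public void put(String key, byte[] value) { data.put(key, value); }
}

/** Tries each stage in order, fastest first, and back-fills earlier stages on a hit. */
class CachePipeline {
    private final List<RecordStore> stages;

    CachePipeline(List<RecordStore> stages) {
        this.stages = new ArrayList<>(stages);
    }

    byte[] get(String key) {
        for (int i = 0; i < stages.size(); i++) {
            byte[] value = stages.get(i).get(key);
            if (value != null) {
                // Copy the record into every faster stage that missed.
                for (int j = 0; j < i; j++) {
                    stages.get(j).put(key, value);
                }
                return value;
            }
        }
        return null;                     // not found in any stage
    }
}
```

With a pipeline built from a cache stage followed by a datastore stage, the first `get` for a key reads both stages and the second is served by the cache alone, which is the effect described in the first bullet above.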
At the end of the first iteration the code ended up as follows:
- The datastore access was straightforward. Most of the work was reading the documentation.
- The server cache pipeline was simple and straightforward.
- The web server was straightforward and simple, just an RPC server, except for the best-effort aspects.
- The client cache and UI were where I spent most of my time and learned the most.
It took me a while to get to simple code for the client cache and UI that kept a clean, coherent state. The final client cache design was:
- Block until any visible records that have been requested from the server have arrived. This takes care of the case of incomplete but correctly predicted pre-fetches.
- Clear all pending server fetches. This is seldom costly because of the previous step.
- Build a queue of server requests, ordered by distance from the current view: 0 clicks away (visible now), then 1 click away, then 2 clicks away.
- Send requests to the server in the above order. Limit the number of pending requests to a specified number (default=2).
- When the visible records have been fetched, call back to the UI. This usually results in an immediate callback (and no UI latency) because the records have been pre-fetched.
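The queueing part of this design can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: `PrefetchScheduler` and its methods are hypothetical names, records are plain strings keyed by integer id, and network calls are left to the caller. It also simplifies the design in one respect: requests already sent are left to complete rather than being blocked on or cancelled, so only unsent requests are cleared.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical client-side scheduler; names are illustrative only. */
class PrefetchScheduler {
    private final int maxPending;                                  // e.g. 2, the default mentioned above
    private final Map<Integer, String> cache = new HashMap<>();    // records already on the client
    private final Set<Integer> inFlight = new LinkedHashSet<>();   // sent, not yet answered
    private final Deque<Integer> queue = new ArrayDeque<>();       // waiting to be sent

    PrefetchScheduler(int maxPending) {
        this.maxPending = maxPending;
    }

    /**
     * Rebuilds the queue for a new screen. tiers.get(0) holds the visible
     * record ids, tiers.get(1) those one click away, and so on.
     * Returns the ids the caller should request from the server now.
     */
    List<Integer> show(List<List<Integer>> tiers) {
        queue.clear();                       // drop requests that were queued but never sent
        for (List<Integer> tier : tiers) {
            for (Integer id : tier) {
                boolean known = cache.containsKey(id) || inFlight.contains(id) || queue.contains(id);
                if (!known) {
                    queue.add(id);
                }
            }
        }
        return sendNext();
    }

    /** Records a server reply and returns the ids to request next. */
    List<Integer> onArrived(int id, String record) {
        inFlight.remove(id);
        cache.put(id, record);
        return sendNext();
    }

    /** True once every visible record is cached, i.e. the UI callback can fire. */
    boolean ready(List<Integer> visible) {
        return cache.keySet().containsAll(visible);
    }

    /** Moves queued ids to in-flight until the pending limit is reached. */
    private List<Integer> sendNext() {
        List<Integer> toSend = new ArrayList<>();
        while (inFlight.size() < maxPending && !queue.isEmpty()) {
            Integer id = queue.poll();
            inFlight.add(id);
            toSend.add(id);
        }
        return toSend;
    }
}
```

Because the queue is filled tier by tier, visible records are always requested before speculative ones, and on a later screen `ready` is often true straight away because the records were fetched as 1-click or 2-click neighbours of an earlier screen.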
Still to do:
- Optimisations. (Fetching a 100 KB record to display about 200 characters is wasteful.)
- Bugs to be fixed.
- Features to be added.
You can watch this project's progress here.