layout | title | sched-activation |
---|---|---|
course | Week 2, Day 2 | class="active" |
Source: [{{ site.data.bibliography.helland2013.title }}]({{ site.data.bibliography.helland2013.url }}).
- Lowest level: Platform software providing basic services (local storage, processes, synchronization, networking)
- Separate out common features that sit above the operating system but are more general than any one application
  - Cavage calls this infrastructure "plumbing" and "concierge" (two names for the same thing)
  - The term we prefer (from the Google Datacenter people) is cluster-level infrastructure
  - Neither term is precise; people may have different opinions about what lies at this level
- The cluster-level infrastructure can deal with new features of cloud technology:
  - Rapid scalability: provisioning new servers on demand.
  - Failure tolerance: datacenters use relatively unreliable hardware, and software also fails.
  - Rapid changes in datacenter networking: processes can move to different machines, new servers can be added to an application, …
  - Transmission delays due to geography and network outages (the CAP problem: Consistency, Availability, and Partition tolerance)
- Application-level software can provide commonly required tools
  - Storage
  - Credit card payment processing
  - Sending emails
  - User ID management and login
- A top-level user service, such as "Display the page for The 2009-2014 World Outlook for 60-Milligram Containers of Fromage Frais", will be composed of many service calls within the application:
  - Add this book to the "recently-visited" session data for this user
  - Get sample pages, if any
  - Check that the session identifier remains valid
  - Get the number of items in the shopping cart
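The composition above can be sketched as follows. This is only an illustration: the four functions are hypothetical in-process stand-ins for the internal services (their names, arguments, and return values are invented here), and a real application would make network calls instead.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the internal services; real ones would be remote calls.
def add_recently_visited(session_id, item_id):
    return True

def get_sample_pages(item_id):
    return []  # "if any": an empty list means there are no sample pages

def session_is_valid(session_id):
    return True

def cart_item_count(session_id):
    return 2

def display_item_page(session_id, item_id):
    """Top-level user service: fan out to the internal services, then assemble the page."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        valid = pool.submit(session_is_valid, session_id)
        visited = pool.submit(add_recently_visited, session_id, item_id)
        samples = pool.submit(get_sample_pages, item_id)
        cart = pool.submit(cart_item_count, session_id)
        return {
            "item": item_id,
            "session_valid": valid.result(),
            "recorded_visit": visited.result(),
            "sample_pages": samples.result(),
            "cart_items": cart.result(),
        }

page = display_item_page("session-123", "fromage-frais-outlook")
```

Issuing the calls concurrently is one possible design choice: the page then waits roughly as long as the slowest call rather than the sum of all four.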
- Each service, in turn, calls other services:
Source: [{{ site.data.bibliography.helland2013.title }}]({{ site.data.bibliography.helland2013.url }}), Fig. 4. Copyright 2013, ACM.
- Each lower level has a tighter SLA (response time), because its response time counts toward the response time of every service that calls it
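One way to see why the SLA tightens at each level is to hand a response-time budget down a call tree. The sketch below is a simplified model, not Helland's: the service names and millisecond figures are invented, and it assumes each service does its own work and then calls its callees in parallel.

```python
# Hypothetical call tree: service name -> (own processing time in ms, services it calls).
CALL_TREE = {
    "display_page": (20, ["session", "cart"]),
    "session": (10, ["storage"]),
    "cart": (15, ["storage"]),
    "storage": (5, []),
}

def assign_budgets(service, budget_ms, tree, budgets=None):
    """Give each service the tightest response-time budget any of its callers imposes."""
    if budgets is None:
        budgets = {}
    # A service reached through several callers keeps the smallest budget seen so far.
    budgets[service] = min(budget_ms, budgets.get(service, budget_ms))
    own_ms, callees = tree[service]
    remaining = budget_ms - own_ms  # what is left for the callees after this service's own work
    if callees and remaining <= 0:
        raise ValueError(f"{service} has no time left for its callees")
    for callee in callees:
        assign_budgets(callee, remaining, tree, budgets)
    return budgets

budgets = assign_budgets("display_page", 300, CALL_TREE)
# budgets == {"display_page": 300, "session": 280, "cart": 280, "storage": 265}
```

With a 300 ms budget at the top, the shared `storage` service at the bottom is left with 265 ms, the tightest budget in the tree.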
- How can we take advantage of elastic computing?
- If there's a sudden surge of requests, how do we scale up quickly?
  - Minimize communication between different user requests
  - Have multiple machines ready to serve a given user request
- Sharding (Helland calls it "partitioning") splits the database into sections, each holding its own disjoint set of keys
- Replication creates multiple copies of each shard
These methods are also listed in the table from Section 2.2 of [{{ site.data.bibliography.barroso2013.title }}]({{ site.data.bibliography.barroso2013.url }}) that you read for last Wednesday.
Source: [{{ site.data.bibliography.helland2013.title }}]({{ site.data.bibliography.helland2013.url }}), Fig. 4. Copyright 2013, ACM.
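Sharding and replication can be combined in a small routing sketch. Everything named here is an assumption made for illustration: the shard count, the replication factor, the machine names, and the choice of hash-based placement (the readings do not prescribe how keys are assigned to shards).

```python
import hashlib
import random

NUM_SHARDS = 4          # assumed shard count
REPLICAS_PER_SHARD = 3  # assumed replication factor

# Hypothetical machine names: one list of replicas per shard.
CLUSTER = [
    [f"shard{s}-replica{r}" for r in range(REPLICAS_PER_SHARD)]
    for s in range(NUM_SHARDS)
]

def shard_for(key):
    """Map a key to a shard with a stable hash, so every server agrees on the placement."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

def replica_for_read(key, healthy=lambda machine: True):
    """Pick any healthy replica of the key's shard; raise if none is available."""
    candidates = [m for m in CLUSTER[shard_for(key)] if healthy(m)]
    if not candidates:
        raise RuntimeError("no healthy replica for this shard")
    return random.choice(candidates)

machine = replica_for_read("user-42")
```

Because each key lives on exactly one shard, requests for different keys need not communicate with each other, and because each shard has several replicas, more than one machine is ready to serve any given request.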
{% comment %} [{{ site.data.bibliography.cavage2013.title }}]({{ site.data.bibliography.cavage2013.url }}) . {% endcomment %}