Vamos falar sobre computação na nuvem no Brasil - Let's talk about Cloud Computing in Brazil - August 2010
My previous trips to China, India, Japan, London have been super productive. I get a chance to meet tons of new people, make a lot of friends and talk about something that I am truly passionate about: Cloud Architectures and Amazon Web Services Cloud.
Next month, I will be in Brazil and traveling to 3 main cities to keynote and present at different conferences and user groups. My complete plan is as follows:
Aug 6 - Aug 11, 2010 in Sao Paulo:
- Keynote at iMasters Pro's Ecommerce Forum Brazil 2010
- Presentation at the Cloud Computing Summit 2010 - Organized by Tecla
Aug 12 - Aug 16, 2010 in Rio de Janeiro:
- Meeting Customers and Open for meetings
Aug 17 - Aug 21, 2010 in Brasilia:
- Keynote at CONSEGI 2010 - Third International Congress of Free and Electronic Government
- General session on Architecting for the cloud : Best Practices and Design Patterns
- Getting Started with Amazon Web Services - Hands-on Workshop
If you are in Brazil and passionate about cloud computing, I would like to meet you. If you are an aspiring cloud developer or architect, a system integrator trying to win a local SaaS contract, or an ISV trying to build a cloud strategy around your product, send me an email at evangelists [[at]] amazon [[dot]] com to schedule a meeting. I would love to exchange ideas, learn more about the local market, and discuss the future. If you are a leader of a local user group and would like us to present to your group, please contact me in advance.
-- Jinesh
What's New in AWS Security: Vulnerability Reporting and Penetration Testing
Security is a top priority for Amazon Web Services. Providing a trustworthy infrastructure for you to develop and deploy applications is a responsibility we take very seriously. One important aspect of gaining your trust is being open and transparent about our security processes and continually working toward achieving industry-recognized certifications. Other important aspects include providing you with mechanisms for contacting us about potential security issues and enabling you to conduct security tests of the applications you deploy on AWS. I'm pleased to announce today two new policies: one that outlines our vulnerability reporting process and one that describes how to receive permission to conduct penetration tests of the applications running on your EC2 instances.
A new page in the AWS Security Center describes our vulnerability reporting process. The process is high-priority for us, it's human-driven, and is governed by a service level commitment. Like other technology providers, we believe in the concept of responsible disclosure: let's work together to protect everyone.
Another page in the Security Center describes our penetration testing procedure. Normally, conducting such tests violates our Acceptable Use Policy because these tests are often indistinguishable from real attacks. However, to ensure higher degrees of application security, external testing is an important phase of development and deployment. We put the procedure in place so that we won't respond to your testing as if your instances were under attack.
The e-mail address aws-security@amazon.com is your single point of contact for all things security-related. If you need to contact us about a particularly sensitive issue, you can encrypt your message with our PGP public key. And, of course, if you suspect abuse of EC2 or other AWS services, our abuse reporting process remains in place.
Finally, a small navigational change. We've moved the bulletins off the main page and onto a separate security bulletin list and changed the format so that all bulletins are displayed rather than just the most recent five.
As always, we welcome your comments and feedback. We're here to help you succeed!
> Steve <
Use Your Own Kernel with Amazon EC2
You can now use the Linux kernel of your choice when you boot up an Amazon EC2 instance.
We have created a set of AKIs (Amazon Kernel Images) which contain the PV-Grub loader. This loader simply chain-boots the kernel provided in the associated AMI (Amazon Machine Image). Net-net, your instance ends up running the kernel in the AMI instead of the kernel specified in the boot process.
You need to install an "EC2 compatible" kernel and create an initrd (initial RAM disk) as part of your AMI. You also need to create a menu (/boot/grub/menu.lst) for the Grub boot loader. Once you've done this you can create the AMI and then launch instances by using one of the PV-Grub "kernels" as described above. You may find this document to be helpful if you want to learn more about the Linux boot process.
To be compatible with EC2, a Linux kernel must support Xen's pv_ops (paravirtual ops) infrastructure with XSAVE disabled or the Xen 3.0.2 interface. The following kernels have been tested and/or have vendor support:
- Fedora 8-12 Xen kernels
- SLES/openSUSE 10.x, 11.0, and 11.1 Xen kernels
- SLES/openSUSE 11.x EC2 Variant
- Ubuntu EC2 Variant
- RHEL 5.x
- CentOS 5.x
Other kernels may not start reliably within EC2. We're working with the providers of popular AMIs to make sure that they will start to use PV-Grub in the near future.
You can read more about this in our "Enabling User Provided Kernels in Amazon EC2" document.
-- Jeff;
PS - You could (if you are sufficiently adept) use this facility to launch an operating system that we don't support directly (e.g. FreeBSD). If you manage to do this, please feel free to let me know.
Enhanced CloudFront Logs, Now With Query Strings
One thing that I love (among many) about working at Amazon.com is the customer-driven innovation cycle. We introduce a new product or service with a useful yet somewhat minimal feature set. We do this to get it out into the real world as soon as possible so that our customers can start to use it and to provide us with feedback on it. Then we put an ear to the ground and do our best to listen and to learn. The information that we gather in this way feeds directly in to the product planning process. I hear the phrase "voice of the customer" several times per week as I wander the halls.

The Amazon CloudFront team has been improving their product in this way since they launched it at the end of 2008. In response to requests from customers they have added a number of great features including more edge locations, private content, streaming media content, HTTP request logging, a reduced TTL (Time To Live), private streamed content, streaming access logs, console support, additional pricing tiers, support for HTTPS, and out-and-out price reductions.
Our customers have been asking for additional information in the CloudFront access logs. Specifically, they have asked us to include the URL's query string (the part after the "?") in each log entry so that they can implement better and more detailed tracking of the source of each request.
We have implemented this feature and it is available now.
Here's how it works. The basic URL to the image above is:
http://d1nqddva888cns.cloudfront.net/amazon_product_cycle.png
Let's say that I want to use the same image in this blog post and in a white paper about corporate innovation. I could simply append two distinct query strings to the URL, like this:
http://d1nqddva888cns.cloudfront.net/amazon_product_cycle.png?blog
http://d1nqddva888cns.cloudfront.net/amazon_product_cycle.png?white_paper
My log analysis software can use the "?blog" and "?white_paper" strings to figure out which source is more popular.
Many customers have told us that they use (or plan to use) this technique to track marketing campaigns and microsites, as well as targeted use of their content. People used to say that "content is king." These days, based on what I am seeing and hearing, numbers and analytics are about to depose the king. The ability to track, analyze, and understand the behavior of site visitors (perhaps using some A/B testing and a healthy dose of Elastic MapReduce) has become a critical success factor.
You can generate these query strings yourself, but I'd assume that sophisticated blogging and content management tools will start to do so over time. CloudFront logs and then ignores the query string. It is not passed along to Amazon S3.
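To make the tracking idea concrete, here is a minimal sketch of the kind of tally a log analysis tool might keep. It assumes that the requested URLs have already been pulled out of the access logs by some earlier step (the log format itself is not modeled here), and the function name count_sources is purely illustrative.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_sources(requested_urls):
    """Tally requests per (path, query string) pair."""
    counts = Counter()
    for url in requested_urls:
        parts = urlsplit(url)
        # An empty query string means the object was requested without a tag.
        tag = parts.query or "(none)"
        counts[(parts.path, tag)] += 1
    return counts


# Sample data: the two tagged URLs from this post.
urls = [
    "http://d1nqddva888cns.cloudfront.net/amazon_product_cycle.png?blog",
    "http://d1nqddva888cns.cloudfront.net/amazon_product_cycle.png?blog",
    "http://d1nqddva888cns.cloudfront.net/amazon_product_cycle.png?white_paper",
]

for (path, tag), hits in count_sources(urls).most_common():
    print(path, tag, hits)
```

Grouping on the path and the query string together keeps the counts for different objects separate even when they share a tag such as "blog".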
As I said earlier, this new feature is available now and I look forward to hearing how you put it to use. If you develop content management or analytic tools and add support for it, let me know by posting a comment or by sending me some email.
A great way for you to influence our future investments is by sharing your use case with us by means of our CloudFront survey. We always appreciate it when our customers suggest ways to make CloudFront even better.
-- Jeff;
Amazon S3 and Amazon SNS - Best Friends Forever
We're starting to wire various AWS services to each other, with interesting and powerful results. Today I'd like to talk to you about a brand new connection between Amazon S3 and the Amazon Simple Notification Service.
When I introduced you to SNS earlier this year I noted that "SNS is also integrated with other AWS services" and said that you could arrange to deliver notifications to an SQS message queue.
We're now ready to take that integration to a new level. Various parts of AWS will now start to publish messages to an SNS topic to let your application know that a certain type of event has occurred. The first such integration is with Amazon S3, and more specifically, with S3's new Reduced Redundancy Storage option.
You can now configure any of your S3 buckets to publish a message to an SNS topic of your creation (permissions permitting) when S3 detects that it has lost an object that was stored in the bucket using the RRS option. Your application can subscribe to the topic and (when the event is triggered) respond by regenerating the object and storing it back in S3. The message will include the event, a timestamp, the name of the bucket, the object's key and version id, and some internal identifiers.
Let's say that you are using S3 to store an original image and some derived images. You would use the STANDARD storage class for the original image and the REDUCED_REDUNDANCY storage class for the derived images. You would also need to store the information needed to regenerate a derived image from the original image. You could store this in SimpleDB or you could create a naming convention for your S3 object keys and then extract the needed information from the URL.
Consider this image:
http://faces.s3.amazonaws.com/jbarr_2007_web.jpg
It is the original image and would be stored with the STANDARD storage class. Derived images (scaled to a new size in this case) would use a suffix containing the needed information, and would be stored with REDUCED_REDUNDANCY:
http://faces.s3.amazonaws.com/jbarr_2007_web_120x168.jpg
A notification would be stored on the faces bucket and routed to a topic such as faces_web_app_errors. Your application need only await events on the topic and respond as follows:
- Confirm the event is of the expected type (s3:ReducedRedundancyLostObject)
- Extract the bucket and key name from the event
- Parse the key name to identify the key of the original object and the transform to be applied
- Fetch the original object
- Apply the transform (image scaling in this case)
- Store the derived object in S3 using the REDUCED_REDUNDANCY storage class
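The steps above can be sketched roughly as follows. This is only an outline under some assumptions: the key naming convention is the "_120x168" suffix from the example, the event name, bucket, and key are assumed to have been extracted from the SNS message already, and the fetch, scale, and store callables are hypothetical stand-ins for whatever S3 client and imaging code your application uses.

```python
import re

LOST_OBJECT_EVENT = "s3:ReducedRedundancyLostObject"

# Naming convention assumed from the example: <name>_<width>x<height>.<ext>
DERIVED_KEY = re.compile(r"^(?P<stem>.+)_(?P<w>\d+)x(?P<h>\d+)\.(?P<ext>\w+)$")


def parse_derived_key(key):
    """Return (original_key, width, height) for a derived object key."""
    match = DERIVED_KEY.match(key)
    if match is None:
        raise ValueError(f"not a derived object key: {key!r}")
    original_key = f"{match['stem']}.{match['ext']}"
    return original_key, int(match["w"]), int(match["h"])


def handle_event(event_name, bucket, key, fetch, scale, store):
    """Regenerate a lost derived object; return True if the event was handled.

    fetch, scale, and store are hypothetical callables supplied by the caller.
    """
    if event_name != LOST_OBJECT_EVENT:
        return False  # Not the event type this handler expects.
    original_key, width, height = parse_derived_key(key)
    original = fetch(bucket, original_key)
    derived = scale(original, width, height)
    store(bucket, key, derived, storage_class="REDUCED_REDUNDANCY")
    return True


print(parse_derived_key("jbarr_2007_web_120x168.jpg"))
```

Passing the storage and imaging functions in as arguments keeps the handler easy to exercise with synthetic events.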
Over time, we'll wire up additional events (for S3 and for other services) to SNS. You can prepare for this now by creating general purpose event handlers in your application, and by keeping your code properly factored so that it is easy to create an object when needed. For the case listed above, I would think about structuring my application so that the only way to create a derived object is in response to an event. I would then generate synthetic "lost" events and use them to materialize the derived objects for the first time.
-- Jeff;
AWS Management Console Support for S3 RRS
The AWS Management Console now supports Amazon S3's Reduced Redundancy Storage. You can view and change the storage class of an S3 object in the object's Properties pane:

You can also select multiple objects and change the storage class for all of them at the same time.
Finally, you can set the option when you upload one or more objects:

Are you putting RRS to use in your application? I'd like to learn more. Send me an email or leave me a comment.
-- Jeff;
New VPC Features: IP Address Control and Config File Generation
We've added two new features to the Amazon Virtual Private Cloud (VPC) to make it more powerful and easier to use. Here's the scoop:
- IP Address Control - You can now assign the IP address of your choice to each of the EC2 instances that you launch in your Virtual Private Cloud. The address must be within the range of addresses that you designated for the VPC, it must be available for use within the instance's network subnet, and it must not conflict with any of the addresses that are reserved for internal use by AWS. You can specify the desired address as an optional parameter to the RunInstances function. This will allow you to have additional control of your network configuration, and has been eagerly anticipated by many of our customers. Two use cases that we've heard about already are running DNS servers and Active Directory® Domain Controllers.
- Config File Generation - VPC can now generate configuration files (example at right) for several different types of devices including the Cisco ISR and a number of Juniper products including the J-Series Service Router, the SSG (Secure Services Gateway), and the ISG (Integrated Security Gateway). The files can be generated from the command line or from within ElasticFox. Generating the config files in this way lets you avoid common configuration issues and allows you to be up and running in minutes.
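The three rules for IP Address Control (inside the VPC range, inside the instance's subnet, not reserved or already taken) are easy to check before you call RunInstances. Here is a small sketch using Python's standard ipaddress module. The CIDR blocks and the reserved and in-use lists are made-up sample values; the addresses that AWS actually reserves are not listed in this post, so they are passed in as a parameter.

```python
import ipaddress


def check_requested_address(address, vpc_cidr, subnet_cidr, reserved=(), in_use=()):
    """Return a list of reasons why the address cannot be used (empty if it can)."""
    ip = ipaddress.ip_address(address)
    problems = []
    if ip not in ipaddress.ip_network(vpc_cidr):
        problems.append("outside the VPC address range")
    if ip not in ipaddress.ip_network(subnet_cidr):
        problems.append("outside the subnet")
    if ip in {ipaddress.ip_address(a) for a in reserved}:
        problems.append("reserved for internal use")
    if ip in {ipaddress.ip_address(a) for a in in_use}:
        problems.append("already in use")
    return problems


# Hypothetical VPC and subnet ranges, for illustration only.
print(check_requested_address("10.0.1.10", "10.0.0.0/16", "10.0.1.0/24"))
print(check_requested_address("10.1.0.5", "10.0.0.0/16", "10.0.1.0/24"))
```

An empty list means the address passes all of the local checks; the service still has the final say when the instance is launched.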
If you want to connect a Linux-based VPN gateway to your Virtual Private Cloud, take a look at Amazon VPC With Linux. This article will show you how to set up IPSec and BGP routing and includes detailed configuration information.
If you are running OpenSolaris, take a look at the OpenSolaris VPC Gateway Tool.
-- Jeff;
New Amazon EC2 Instance Type - The Cluster Compute Instance
A number of AWS users have been using Amazon EC2 to solve a variety of computationally intensive problems. Here's a sampling:
- Atbrox and Lingit use Elastic MapReduce to build data sets that help individuals with dyslexia to improve their reading and writing skills.
- Systems integrator Cycle Computing helps Varian to run compute-intensive Monte Carlo simulations.
- Harvard Medical School's Laboratory for Personalized Medicine creates innovative genetic testing models.
- Pathwork Diagnostics runs tens of thousands of models to help oncologists to diagnose hard-to-identify cancer tumors.
- Razorfish processes huge datasets on a very compressed timescale.
- The Server Labs helps the European Space Agency to build the operations infrastructure for the Gaia project.
Some of these problems are examples of what are called “embarrassingly parallel” computing. Others leverage the Hadoop framework for data-intensive computing, spreading the workload across a large number of EC2 instances.
Our customers have also asked us about the ability to run even larger and more computationally complex workloads in the cloud.
It is clear that people are now figuring out that they can do HPC (High-Performance Computing) in the cloud. We want to make it even easier and more efficient for them to do so!
Our new Cluster Compute Instances will fit the bill. With Cluster Compute Instances, you can now run many types of large-scale network-intensive jobs without losing the core advantages of EC2: a pay-as-you-go pricing model and the ability to scale up and down to meet your needs.
Each Cluster Compute Instance consists of a pair of quad-core Intel "Nehalem" X5570 processors with a total of 33.5 ECU (EC2 Compute Units), 23 GB of RAM, and 1690 GB of local instance storage, all for $1.60 per hour.
Because many HPC applications and other network-bound applications make heavy use of network communication, Cluster Compute Instances are connected using a 10 Gbps network. Within this network you can create one or more placement groups of type "cluster" and then launch Cluster Compute Instances within each group. Instances within each placement group of this type benefit from non-blocking bandwidth and low latency node to node communication.
The EC2 APIs, the command-line tools, and the AWS Management Console have all been updated to support the creation and use of placement groups. For example, the following pair of commands creates a placement group called biocluster and then launches 8 Cluster Compute Instances inside of the group:
$ ec2-create-placement-group biocluster -s cluster
$ ec2-run-instances ami-2de43f55 --type cc1.4xlarge --placement-group biocluster -n 8
The new instance type is now available for Linux/UNIX use in a single Availability Zone in the US East (Northern Virginia) region. We'll support it in additional zones and regions in the future. You can purchase individual Reserved Instances for a one-year or a three-year term, but you can't buy them within specific cluster placement groups just yet. There is a default usage limit for this instance type of 8 instances (providing 64 cores). If you wish to run more than 8 instances, you can request a higher limit using the Amazon EC2 instance request form.
The Cluster Compute Instances use hardware-assisted (HVM) virtualization instead of the paravirtualization used by the other instance types and require booting from EBS, so you will need to create a new AMI in order to use them. We suggest that you use our CentOS-based AMI as a base for your own AMIs for optimal performance. See the EC2 User Guide or the EC2 Developer Guide for more information.
The only way to know if this is a genuine HPC setup is to benchmark it, and we've just finished doing so. We ran the gold-standard High Performance Linpack benchmark on 880 Cluster Compute instances (7040 cores) and measured the overall performance at 41.82 TeraFLOPS using Intel's MPI (Message Passing Interface) and MKL (Math Kernel Library) libraries, along with their compiler suite. This result places us at position 146 on the Top500 list of supercomputers. The input file for the benchmark is here and the output file is here.
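For a sense of scale, the benchmark numbers can be broken down per instance and per core with some simple arithmetic. The sketch below uses only figures quoted in this post (880 instances, two quad-core processors each, 41.82 TeraFLOPS, $1.60 per instance-hour). The hourly figure is list-price arithmetic for a cluster of that size, not a statement of what the benchmark run actually cost, and it assumes 1 TeraFLOPS = 1000 GigaFLOPS.

```python
INSTANCES = 880
CORES_PER_INSTANCE = 8            # two quad-core processors per instance
TOTAL_TFLOPS = 41.82              # measured Linpack result quoted above
PRICE_PER_INSTANCE_HOUR = 1.60    # US dollars, as quoted above

total_cores = INSTANCES * CORES_PER_INSTANCE
gflops_per_instance = TOTAL_TFLOPS * 1000 / INSTANCES
gflops_per_core = gflops_per_instance / CORES_PER_INSTANCE
cluster_cost_per_hour = INSTANCES * PRICE_PER_INSTANCE_HOUR

print(f"cores:                {total_cores}")
print(f"GFLOPS per instance:  {gflops_per_instance:.2f}")
print(f"GFLOPS per core:      {gflops_per_core:.2f}")
print(f"cluster cost per hour: ${cluster_cost_per_hour:.2f}")
```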
Putting this all together, I think that we have put together a true fire-breathing dragon of an offering. You can now get world-class compute and network performance on an economical, pay-as-you-go basis. The individual instances perform really well, and you can tie a bunch of them together using a fast network to attack large-scale problems. I'm fairly certain that you can't get this much compute power so fast or so economically anywhere else.
I'm looking forward to writing up and sharing some of the success stories from the customers who've been helping us to test the Cluster Compute instances during our private beta test. Feel free to share your own success stories with me once you've had a chance to give them a try.
Update - Here's some additional info:
- Werner Vogels: Expanding the Cloud - Cluster Compute Instances for Amazon EC2.
- TechCrunchIT: AWS Launches Cluster Compute EC2 Instances For High Performance Applications.
- ZDNet: Amazon Web Services tackles high performance computing instances.
- And our own HPC Applications page.
-- Jeff;
Attending Casual Connect in Seattle? Then Connect With Us!
The Casual Connect Conference will be held in Seattle later this month. The conference will start on July 20th and will end on the 22nd. We've got some big plans for this conference:
The AWS team will have a booth on the show floor. Be sure to stop by and to say hello. We'll have some AWS stickers and other goodies to hand out.
I will be speaking about Cloud-Powered Social Gaming at 5:00 PM on the 20th. Here's a description of my talk:
Social games can quickly attract user bases measured in the millions or even in the tens of millions. Gathering sufficient processing, networking, and storage resources to deal with this onslaught of traffic can be difficult, time-consuming, and expensive. The unpredictable shape of the adoption makes it impossible to acquire resources ahead of demand. In this session, you will see how cloud-based resources can be the ideal solution to this conundrum.
The Amazon Payments team will wrap up the event with a "Monetization Mingle" party on July 22nd.
We want to make sure that we have time to meet with current and potential AWS customers. Events like these are a great place for us to get to know our customers and to make sure that we are serving their needs. If you are already using AWS we'd love to know more about who you are and what you do. Perhaps we can write a case study or a blog post together. If you are thinking about using AWS for your next smash hit we'd be happy to answer your questions and to help out in other ways.
If you would like to arrange a meeting with us, please fill out our meeting request form and we'll get back to you ASAP.
-- Jeff;
This Is A Stick-Up!
The AWS Marketing Team recently moved into a shiny new building in Seattle's South Lake Union area.
We'd like to spiff up and personalize our space, and thought that our AWS users and fans could help us to do so! We've got some really nice AWS stickers that we can trade for any or all of the following:
- Some stickers from your company or group.
- A picture of your team, perhaps enhanced with your company or product logo.
- An interesting piece of SWAG.
- A blog post detailing the ways in which your company puts AWS to use, complete with an architecture diagram.
If you'd like some stickers, send us your offering and include a self-addressed envelope (we'll take care of the postage) to the following address:
Amazon Web Services
Attn: AWS Stickers
P.O. Box 81226
Seattle, WA
98108-1300
-- Jeff;
Amazon S3 Bucket Policies - Another Way to Protect Your Content
Users of Amazon S3 have been looking for additional ways to control access to their content. We've got something new (and very powerful), and I'll get to it in a moment. But first, I'd like to review the existing access control mechanisms to make sure that you have enough information to choose the best option for your application.
The two existing access control mechanisms are query string authentication and access control lists or ACLs.
The query string authentication mechanism gives you the ability to create a URL that is valid for a limited amount of time. You simply create a URL that references one of your S3 objects, specify an expiration time for the query, and then compute a signature using your private key.
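Here is a rough sketch of how such a URL can be put together. It assumes the HMAC-SHA1 query string scheme that S3 documented at the time of writing (newer signature versions work differently), and the bucket, key, and credentials are made-up sample values. In practice the AWS libraries will do this for you; the point is simply to show what goes into the signature.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_url(bucket, key, access_key_id, secret_key, expires):
    """Build a time-limited GET URL; expires is seconds since the Unix epoch."""
    encoded_key = quote(key)
    # String to sign: verb, empty Content-MD5, empty Content-Type,
    # expiration time, and the resource being requested.
    string_to_sign = f"GET\n\n\n{expires}\n/{bucket}/{encoded_key}"
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    signature = quote(base64.b64encode(digest).decode("ascii"), safe="")
    return (
        f"http://{bucket}.s3.amazonaws.com/{encoded_key}"
        f"?AWSAccessKeyId={access_key_id}&Expires={expires}&Signature={signature}"
    )


# Sample values only; substitute your own bucket, key, and credentials.
print(signed_url("faces", "jbarr_2007_web.jpg", "AKIDEXAMPLE", "not-a-real-secret", 1278000000))
```

Because the expiration time is part of the signed string, changing it in the URL invalidates the signature.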
The Access Control List (ACL) mechanism allows you to selectively grant certain permissions (read, write, read ACL, and write ACL) to a list of grantees. The list of grantees can include the object's owner, specific AWS account holders, anyone with an AWS account, or the public at large.
Each of these mechanisms controls access to individual S3 objects.
Today, we are adding support for Bucket Policies. Bucket policies provide access control management for Amazon S3 buckets and for the objects in them using a single unified mechanism. The policies are expressed in our Access Policy Language (introduced last year to regulate access to Amazon SQS queues) and enable centralized management of permissions.
Unlike ACLs, which can only be used to add (grant) permissions on individual objects, policies can either add or deny permissions across all (or a subset) of the objects within a single bucket. You can use regular expression operators on Amazon Resource Names (ARNs) and other values, so that you can control access to groups that begin with a common prefix or end with a given extension such as ".html".
Policies also introduce new ways to restrict access to resources based on the request. Policies can include references to IP addresses, IP address ranges in CIDR notation, dates, user agents, the HTTP referrer, and transports (http and https).
Finally, with bucket policies we have expanded your ability to control access based on specific S3 operations such as GetObject, GetObjectVersion, DeleteObject, or DeleteBucket.
When you put all of this together, you can create policies that give you an incredible amount of access control.
You could set up a bucket policy to do any or all of the following:
- Allow write access...
- To a particular S3 bucket...
- Only from your corporate network...
- During business hours...
- From your custom application (as identified by a user agent string).
You can grant one application limited read and write access, but allow another to create and delete buckets as well. You could allow several field offices to store their daily reports in a single bucket, allowing each office to write only to a certain set of names (e.g. "Nevada/*" or "Utah/*") and only from the office's IP address range.
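As an illustration of the field office scenario, the sketch below builds a policy document with one statement per office. The bucket name, office names, and IP ranges are invented for the example, and the wildcard principal is used only to keep it short; a real policy would probably name specific AWS accounts. Treat it as a starting point to check against the Developer Guide rather than a finished policy.

```python
import json


def office_policy(bucket, offices):
    """Build a bucket policy letting each office write under its own prefix.

    offices maps an office name (used as the key prefix) to its CIDR range.
    """
    statements = []
    for name, cidr in offices.items():
        statements.append({
            "Sid": f"AllowWrite{name}",
            "Effect": "Allow",
            "Principal": {"AWS": "*"},  # illustrative; narrow this in practice
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{bucket}/{name}/*",
            "Condition": {"IpAddress": {"aws:SourceIp": cidr}},
        })
    return {"Version": "2008-10-17", "Statement": statements}


# Hypothetical bucket and documentation-only IP ranges.
policy = office_policy(
    "daily-reports",
    {"Nevada": "192.0.2.0/24", "Utah": "198.51.100.0/24"},
)
print(json.dumps(policy, indent=2))
```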
Policies and ACLs interact in a well-defined way and you can choose to use either one (or both) to control access to your content. You can also convert your existing ACLs to bucket policies if you'd like.
Read more in the new Using Bucket Policies section of the Amazon S3 Developer Guide. We'll also be holding an Introduction to Bucket Policies webcast on July 13th.
What do you think? How will you use this exciting and powerful new feature?
-- Jeff;
Additional RDS Functionality in the AWS Management Console
We've added some very handy new functionality to the RDS tab of the AWS Management Console. Here's a quick tour.
First, you can create a new DB Instance from any of your existing snapshots. You can click this button:

Or you can right-click on one of your snapshots:

After you provide the parameters, Amazon RDS will create a new DB Instance for you:
As you can see, you can choose to create the new instance in a different Availability Zone, on a different DB instance class, and so forth. You could, for example, create a snapshot of a production database running on one of the larger instance classes and then create a DB Instance using a smaller instance class for some light-duty testing.
Second, you can create a new DB Instance as of any point in time that falls within the backup retention period of any of your existing DB Instances. Again, you can get to this from a button or a right-click on one of your existing DB Instances:
After you provide the parameters, Amazon RDS will create a new DB Instance for you:

If you accidentally break your production database when you check in and run some new code, you can use this feature to create a new instance representing the state of the database as it was immediately preceding the change.
Both of these features will make it even easier for you to use the Relational Database Service.
-- Jeff;
New: Free Tier and Increased Limits for Amazon Simple Queue Service
We want to make it easier and more economical for you to build fault-tolerant, highly-scalable applications using the Amazon Simple Queue Service (SQS).
Effective July 1st, 2010, your first 100,000 requests to SQS each month will incur no usage charges. We'll also provide you with 1 GB per month of outbound data transfer at no charge.
We've made SQS more flexible by giving you more control of the maximum message size and the message retention time:
Maximum Message Size - Up until now, SQS messages were limited to 8 kB. This is now a user configurable limit, with a maximum value of 64 kB.
Message Retention Time - Up until now, the message retention time for all SQS queues was four days. This is now a user configurable value with valid values ranging from one hour to two weeks.
-- Jeff;
Amazon RDS: Support For SSL Connections
By popular demand, the Relational Database Service (RDS) now supports SSL encrypted connections!
We now generate an SSL certificate for each DB Instance. If you need a certificate for an existing instance you'll need to reboot it using the AWS Management Console, the RDS command-line tools, or the RDS APIs.
Here are a few things to keep in mind:
- SSL encrypts the data transferred "over the wire" between your DB Instance and your application. It does not protect data "at rest." If you want to do this, you'll need to encrypt and decrypt the data on your own.
- SSL encryption and decryption is a compute-intensive task and as such it will increase the load on your DB Instance. You should monitor your database performance using the CloudWatch metrics in the AWS Management Console (pictured at right), and scale up to a more powerful instance type if necessary.
- The SSL support is provided for encryption purposes and should not be relied upon to authenticate the DB Instance itself.
- You can configure your database to accept only SSL connections by using the GRANT command with the REQUIRE SSL option. You can do this on a per-user basis so you could, for example, require SSL requests only from users connecting from a non-EC2 host.
-- Jeff;
Mobile Trading Platform on AWS
AWS is not only a rich platform to build products and solutions but also a platform to build specialized platforms. The inherent flexibility of the AWS cloud enables businesses to use it as a platform in a variety of different ways. Some of these platforms are highlighted in my blog post titled The Cloud as a Platform for Platforms.
One such platform that is gaining a lot of steam in the financial services industry is MarketSimplified. They provide a mobile trading platform on top of AWS and specialize in making online brokerages fully mobile.
Customers of MarketSimplified not only get powerful features of the MarketSimplified Platform such as cross-device compatibility, support for multiple mobile OS, manageability, and on-demand analytics of transactions but also the scalability, elasticity and reliability of the AWS cloud. All this with no upfront capital expenditure or mobile application development overhead. Their SaaS Middleware platform combines the power of mobile and cloud computing.
So far, they are touting 11M+ Messages, and over $1B in Trade Value processed. They have powered mobile applications provided by TD Ameritrade, ChoiceTrade, IIFL, FXCM, OptionsXpress, PFGBEST, and tradeMonster.
If you would like to know more about them and their technology and how they leverage the AWS cloud, you can read our case study or meet them personally at SIFMA's 30th Annual Financial Services Technology Expo.
-- Jinesh
Amazon Web Services for Backup and Disaster Recovery
These Solution Providers offer a vast range of solutions for many different use cases.
Backup and Disaster Recovery scenarios are great to highlight the advantages of the Cloud versus traditional solutions: if you need Backup and Disaster Recovery for your business, you can avoid the expensive burden of relying on physical tapes and tape management in favor of cloud-based storage.
Data written on tapes typically goes in a vault somewhere and remains useless until the tape is transported back in case of a disaster. Cost of over- or under-provisioning in this case is very high.
On the Cloud, however, you can get rid of tapes and use Amazon S3, a highly durable storage solution: it provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. All the complexities of scalability, reliability, durability, performance and cost-effectiveness are hidden behind a very simple programming interface.
Amazon S3 is intentionally built with a minimal feature set, and if you have very specific needs, that's where our Solution Providers can help.
For example, if you want to use Amazon S3 to back up Windows servers, desktops and live applications such as Microsoft Exchange and SQL Server to Amazon’s highly dependable online storage, you might consider using a solution from one of our Solution Providers, Zmanda.
Their Zmanda Cloud Backup automates the steps needed to back up your data to the cloud, through GUI-based backup configuration and management.
They just announced their third generation Cloud Backup, which fully supports the AWS Asia Pacific Region (check out the pricing list).
I've seen many customer interested in Backup solutions here in Asia Pacific, and I'm sure that they'll be interested in the solution offered by Zmanda.
But what happens when large amounts of data need to be transferred, and the internet simply isn't fast enough to do it in a reasonable amount of time?
AWS Import/Export accelerates moving large amounts of data into and out of AWS, using portable storage devices for transport. AWS transfers your data directly onto and off of storage devices using Amazon’s high-speed internal network and bypassing the Internet.
For significant data sets, AWS Import/Export is often faster than Internet transfer and more cost effective than upgrading your connectivity.
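To get a feel for when shipping a device beats the wire, here is a minimal back-of-the-envelope sketch. The function name, the 80% link utilization default and the sample data set sizes are assumptions chosen for illustration; they are not AWS figures.

```python
def transfer_days(data_terabytes: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Estimate how many days an Internet transfer would take.

    Assumes 1 TB = 10**12 bytes and 1 Mbps = 10**6 bits per second.
    `utilization` is the fraction of the link you can really dedicate
    to the transfer (an assumed value, not a measured one).
    """
    if data_terabytes < 0 or link_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("invalid input")
    total_bits = data_terabytes * 1e12 * 8
    effective_bits_per_second = link_mbps * 1e6 * utilization
    return total_bits / effective_bits_per_second / 86400  # 86400 seconds per day


if __name__ == "__main__":
    # Hypothetical data set sizes and link speeds.
    for terabytes, mbps in [(1, 10), (10, 100), (50, 100)]:
        days = transfer_days(terabytes, mbps)
        print(f"{terabytes} TB over {mbps} Mbps: {days:.1f} days")
```

Under these assumptions, 1 TB over a 10 Mbps link takes about 11.6 days, which is the kind of number to compare against the time needed to ship a storage device.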
If you are using Amazon S3 and/or AWS Import/Export for Backup or Disaster Recovery, let us know your story and tell us what you like most about these services.
You might also be interested in reading Best Practices for Using Amazon S3.
- Simone Brunozzi (@simon)
Big Data Workshop and EC2
Many fields in industry and academia are experiencing an exponential growth in data production and throughput, from social graph analysis to video transcoding to high energy physics. Constraints are everywhere when working with very large data sets, and provisioning sufficient storage and compute capacity for these fields is challenging.
This is particularly true for biological sciences after the recent quantum leap in DNA sequencing technology. These advances represented a step change for the field of genomics, which had to learn quickly how to house and process terabytes of data through complex, often experimental workflows.
Processing data of this scale for a single user is challenging, but moving to the cloud meant Michigan State University were able to provide real world training to whole groups of new scientists using Amazon's EC2 and S3 services.
Titus Brown writes about his experiences of running a next-generation sequencing workshop using Amazon's Web Services in a pair of blog posts:
"Students can choose whatever machine specs they need in order to do their analysis. More memory? Easy. Faster CPU needed? No problem.
All of the data analysis takes place off-site. As long as we can provide the data sets somewhere else (I've been using S3, of course) the students don't need to transfer multi-gigabyte files around.
The students can go home, rent EC2 machines, and do their own analyses -- without their labs buying any required infrastructure."
After the two-week event: "I have little doubt that this course would have been nearly impossible (and either completely ineffective or much more expensive) without it.
In the end, we spent more on beer than on computational power. That says something important to me."
A great example of using EC2 for ad-hoc, scientific computation and reaping the rewards of a cloud infrastructure for low cost, reproducibility and scale.
~ Matt
New: CloudWatch Metrics for Amazon EBS Volumes
If you already have some EBS (Elastic Block Store) volumes, stop reading this post now!
Instead, open up the AWS Management Console in a fresh browser tab, select the Amazon EC2 tab and click on Volumes (or use this handy shortcut to go directly there). Click on one of your EBS volumes and you'll see a brand new Monitoring tab. Click on the tab and you'll see ten graphs with information about the performance of the volume.
For those of you without any EBS volumes (what are you waiting for?), here's what you are missing:
Effective immediately, we now store eight metrics in Amazon CloudWatch for each of your EBS volumes. The metrics are stored with a granularity of five minutes and each data point represents the activity over the period. Here's what we store for you:
- VolumeReadBytes - The number of bytes read from the volume over the five minute period.
- VolumeWriteBytes - The number of bytes written to the volume over the five minute period.
- VolumeReadOps - The number of read operations performed on the volume in the period.
- VolumeWriteOps - The number of write operations performed on the volume in the period.
- VolumeTotalReadTime - The total amount of waiting time consumed by all of the read operations which completed during the period.
- VolumeTotalWriteTime - The total amount of waiting time consumed by all of the write operations which completed during the period.
- VolumeIdleTime - The amount of time when no read or write operations were waiting to be completed during the period.
- VolumeQueueLength - The average number of read and write operations waiting to be completed during the period.
You can, of course, access all of these metrics through the CloudWatch API and the CloudWatch command-line (API) tools.
You can use these metrics to diagnose performance issues, learn more about the level of usage of each of your volumes, or to track long-term performance trends for your application.
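As a minimal sketch of that kind of diagnosis, the code below derives a few familiar figures (IOPS, throughput, average latency, busy fraction) from one five-minute data point. The class and function names are hypothetical, the sample values are made up, and the time-based metrics are assumed to be reported in seconds; check the CloudWatch documentation for the actual units before relying on the numbers.

```python
from dataclasses import dataclass

PERIOD_SECONDS = 300  # five-minute granularity, as described above


@dataclass
class EbsDataPoint:
    """One five-minute data point; field names mirror the eight metrics."""
    volume_read_bytes: float
    volume_write_bytes: float
    volume_read_ops: float
    volume_write_ops: float
    volume_total_read_time: float   # assumed to be in seconds
    volume_total_write_time: float  # assumed to be in seconds
    volume_idle_time: float         # assumed to be in seconds
    volume_queue_length: float


def summarize(point: EbsDataPoint, period: float = PERIOD_SECONDS) -> dict:
    """Derive per-second rates and average latencies for one period."""
    def avg_latency_ms(total_time: float, ops: float) -> float:
        # Avoid dividing by zero when the volume saw no operations.
        return 1000.0 * total_time / ops if ops else 0.0

    total_ops = point.volume_read_ops + point.volume_write_ops
    total_bytes = point.volume_read_bytes + point.volume_write_bytes
    busy = 1.0 - point.volume_idle_time / period
    return {
        "iops": total_ops / period,
        "throughput_bytes_per_sec": total_bytes / period,
        "avg_read_latency_ms": avg_latency_ms(point.volume_total_read_time, point.volume_read_ops),
        "avg_write_latency_ms": avg_latency_ms(point.volume_total_write_time, point.volume_write_ops),
        "busy_fraction": max(0.0, min(1.0, busy)),
        "avg_queue_length": point.volume_queue_length,
    }


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    sample = EbsDataPoint(3_000_000, 6_000_000, 300, 600, 1.5, 6.0, 240, 0.2)
    for name, value in summarize(sample).items():
        print(f"{name}: {value:.2f}")
```

A high busy fraction together with a growing queue length and rising latency would suggest that the volume, rather than the instance, is the bottleneck.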
There is no additional charge for monitoring EBS volumes and the metrics are stored for two weeks.
-- Jeff;
London Calling
Hello!
I'm Matt Wood, and I've just joined Amazon Web Services as the new Technology Evangelist for Europe.
Before joining Amazon I built web-scale search engines at Cornell University in New York City (of which Amazon CTO Werner Vogels is a fellow alumnus), document management systems in Cambridge and genes in Hinxton. At the Wellcome Trust Sanger Institute, I helped build the Institute's next-generation, petabyte-scale DNA sequencing platform, which allows genomes to be sequenced in days rather than decades. I have a PhD in Bioinformatics, am a big fan of agile software techniques, once trained as a medical doctor and love cycling, film and Swedish indie music.
I'm interested in helping teams of all sizes take pragmatic steps to build highly available products and services. On demand infrastructure offers so many advantages in terms of agility, scale, security and cost to a wide range of industries: I can't wait to start sharing real world stories and best practices with you all.
I couldn't be more excited about joining the Amazon Web Services team, and I'm looking forward to meeting as many of you as possible in the coming months, answering your questions and catching up on the awesome projects you're building on AWS.
If you have any questions or speaking opportunities, or are running one of the many European Amazon User Groups, feel free to drop me a line: mawood at amazon dot com.
~ Matt
Building three-tier architectures with security groups
Update (17 June): I've changed the command-line examples to reflect current capabilities of our SOAP and Query APIs. They do, in fact, allow specifying a protocol and port range when you're using another security group as the traffic origin. Our Management Console will support this functionality at a later date.
During a recent webcast an attendee asked a question about building multi-tier architectures on AWS. Unlike with traditional on-premises physical deployments, AWS's virtualization of compute, storage, and network elements requires that you think differently about how to build network segregation into your projects. There are no distinct physical networks, no VLANs, and no DMZs. So how can you construct the equivalent of traditional three-tier architectures?
Our security whitepaper alludes to the possibility (pp. 5-6, November 2009 edition). In my security presentations I show this diagram to illustrate conceptually how a three-tier architecture can be built:
Security groups: a quick review
Before we explore how to define the architecture, let's take a moment to review some critical details about how security groups work.
A security group is a semi-stateful firewall (more on this in a moment) that contains one or more rules defining which traffic is permitted into an instance. Rules contain the following elements:
- The permitted protocol (TCP or UDP)
- The permitted destination port range (more on this in a moment, too)
- The permitted source IP address range or originating security group
Now there are three particular aspects I'd like to call your attention to. First: security groups are semi-stateful because changes made to their rules don't apply to any in-progress connections. Say that you currently have a rule permitting inbound traffic to port 3579/tcp, and that there are right now five inbound connections to this port. If you delete the rule from the group, the group blocks any new inbound requests to port 3579/tcp but doesn't terminate the existing five connections. This behavior is intentional; I want to ensure everyone understands this. In all other respects, security groups behave like traditional stateful firewalls.
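To make the semi-stateful behavior concrete, here is a toy model of it. This is only a sketch of the behavior described above, not how security groups are implemented; the class and method names are hypothetical.

```python
class SecurityGroupModel:
    """Toy model: rules gate NEW connections; established ones are untouched."""

    def __init__(self):
        self.allowed = set()     # (protocol, port) pairs currently permitted
        self.connections = []    # established (source, protocol, port) tuples

    def authorize(self, protocol, port):
        self.allowed.add((protocol, port))

    def revoke(self, protocol, port):
        # Deleting a rule deliberately leaves self.connections alone.
        self.allowed.discard((protocol, port))

    def connect(self, source, protocol, port):
        """Attempt a new inbound connection; return True if it is admitted."""
        if (protocol, port) not in self.allowed:
            return False
        self.connections.append((source, protocol, port))
        return True


if __name__ == "__main__":
    group = SecurityGroupModel()
    group.authorize("tcp", 3579)
    for i in range(5):
        group.connect(f"client-{i}", "tcp", 3579)

    group.revoke("tcp", 3579)
    print("existing connections:", len(group.connections))
    print("new connection admitted:", group.connect("client-5", "tcp", 3579))
```

After the rule is revoked, the five established connections remain in place while the sixth attempt is refused, which mirrors the 3579/tcp example above.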
The second aspect is our terminology for port ranges. This often confuses people new to AWS. The traditional usage of the words "from" and "to" in security-speak describes traffic direction: "from" indicates the source and "to" indicates the destination. This isn't the case when defining rules for security groups. Instead, security group rules concern themselves only with destination ports; that is, the ports on your instances listening for incoming connections. The "from port" and "to port" in a security group rule indicate the starting and ending port numbers for occasions when you need to define a range of listening ports. In most cases you need to allow only a single port, so the values for "from port" and "to port" will be the same.
This leads to the third aspect I'd like to discuss: how to define traffic sources. The most common method is to specify a protocol along with an individual source IP address, a range of IP addresses using CIDR notation, or the entire Internet (using 0.0.0.0/0). The other way to define a traffic source is to supply the name of some other security group you've already created. Here's the magic jewel for creating three-tier architectures; it's this capability that answered the person's question on the webcast.
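The rule elements described above can be sketched as a small data structure and a matching function. This is an illustrative model under my own assumptions (the `Rule` class, `rule_permits` function and sample addresses are hypothetical), not the EC2 API.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    protocol: str                        # "tcp" or "udp"
    from_port: int                       # first listening port in the range
    to_port: int                         # last listening port in the range
    source_cidr: Optional[str] = None    # e.g. "0.0.0.0/0" for the entire Internet
    source_group: Optional[str] = None   # name of another security group


def rule_permits(rule, protocol, dest_port, source_ip, source_groups=frozenset()):
    """Return True if the rule admits this inbound connection.

    Note that from_port/to_port bound the DESTINATION port only;
    they say nothing about the direction of traffic.
    """
    if protocol != rule.protocol:
        return False
    if not rule.from_port <= dest_port <= rule.to_port:
        return False
    if rule.source_group is not None:
        return rule.source_group in source_groups
    if rule.source_cidr is not None:
        return ipaddress.ip_address(source_ip) in ipaddress.ip_network(rule.source_cidr)
    return False


if __name__ == "__main__":
    http = Rule("tcp", 80, 80, source_cidr="0.0.0.0/0")
    app = Rule("tcp", 8000, 8010, source_group="WebSG")
    print(rule_permits(http, "tcp", 80, "203.0.113.7"))
    print(rule_permits(app, "tcp", 8005, "10.0.0.5", {"WebSG"}))
    print(rule_permits(app, "tcp", 8005, "203.0.113.7"))
```

For a single listening port, `from_port` and `to_port` carry the same value, as in the `http` rule.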
Defining the security groups for a three-tier architecture
If you're an API aficionado, you can use these eight simple calls to create the three required security groups to implement this architecture:
ec2-authorize WebSG -P tcp -p 80 -s 0.0.0.0/0
ec2-authorize WebSG -P tcp -p 443 -s 0.0.0.0/0
ec2-authorize WebSG -P tcp -p 22|3389 -s CorpNet
ec2-authorize AppSG -P tcp|udp -p AppPort|AppPort-Range -o WebSG
ec2-authorize AppSG -P tcp -p 22|3389 -s CorpNet
ec2-authorize DBSG -P tcp|udp -p DBPort|DBPort-Range -o AppSG
ec2-authorize DBSG -P tcp -p 22|3389 -s CorpNet
ec2-authorize DBSG -P tcp -p 22|3389 -s VendorNet
Note here the interesting distinction in the parameters used with the commands. If the rule permits a source IP address or range, the parameter is "-s" which indicates source. If the rule permits some other security group, the parameter is "-o" which indicates origin. Neat, huh?
The rules relate to each other as follows (in the commands above, "|" separates alternatives, so supply one value per call):
- The first three statements define WebSG, the security group for the web tier. The first two rules in the group permit inbound traffic to destination ports 80/tcp and 443/tcp from any node on the Internet. The third rule in the group permits inbound traffic to management ports (22/tcp for SSH, 3389/tcp for RDP) from the IP address range of your internal corporate network -- this is optional, but probably a good idea if you ever need to administer your instances :)
- The next two statements define AppSG, the security group for the application tier. The second rule in the group permits inbound traffic to management ports from your corpnet. The first rule in the group permits inbound traffic from WebSG -- the origin -- to the application's listening port(s).
- The final three statements define DBSG, the security group for the database tier. The second and third rules in the group permit inbound traffic to management ports from your corpnet and from your database vendor’s network (required for certain third-party database products). The first rule in the group permits inbound traffic from AppSG -- the origin -- to the database's listening port(s).
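If it helps to see the tiering in action, the sketch below models the eight rules and checks which connections they admit. It is a simulation only: the port numbers and address ranges are hypothetical stand-ins for AppPort, DBPort, CorpNet and VendorNet, only SSH is modeled for management access, and the `permitted` function is my own invention rather than anything in the EC2 tools.

```python
import ipaddress

APP_PORT = 8080                  # hypothetical stand-in for AppPort
DB_PORT = 3306                   # hypothetical stand-in for DBPort
CORP_NET = "192.0.2.0/24"        # hypothetical stand-in for CorpNet
VENDOR_NET = "198.51.100.0/24"   # hypothetical stand-in for VendorNet

# group name -> list of (protocol, port, kind, source); kind is "cidr" or "group"
GROUPS = {
    "WebSG": [
        ("tcp", 80, "cidr", "0.0.0.0/0"),
        ("tcp", 443, "cidr", "0.0.0.0/0"),
        ("tcp", 22, "cidr", CORP_NET),
    ],
    "AppSG": [
        ("tcp", APP_PORT, "group", "WebSG"),
        ("tcp", 22, "cidr", CORP_NET),
    ],
    "DBSG": [
        ("tcp", DB_PORT, "group", "AppSG"),
        ("tcp", 22, "cidr", CORP_NET),
        ("tcp", 22, "cidr", VENDOR_NET),
    ],
}


def permitted(dest_group, protocol, port, source_ip, source_group=None):
    """Return True if any rule in dest_group admits this inbound connection."""
    address = ipaddress.ip_address(source_ip)
    for rule_protocol, rule_port, kind, source in GROUPS[dest_group]:
        if (rule_protocol, rule_port) != (protocol, port):
            continue
        if kind == "group" and source == source_group:
            return True
        if kind == "cidr" and address in ipaddress.ip_network(source):
            return True
    return False


if __name__ == "__main__":
    print("Internet -> web :80 ", permitted("WebSG", "tcp", 80, "203.0.113.9"))
    print("Internet -> app     ", permitted("AppSG", "tcp", APP_PORT, "203.0.113.9"))
    print("web tier -> app     ", permitted("AppSG", "tcp", APP_PORT, "10.0.0.5", "WebSG"))
    print("web tier -> database", permitted("DBSG", "tcp", DB_PORT, "10.0.0.5", "WebSG"))
    print("app tier -> database", permitted("DBSG", "tcp", DB_PORT, "10.0.0.6", "AppSG"))
```

Each tier accepts application traffic only from the tier directly in front of it, so the Internet can reach the web tier but nothing behind it.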
Of course, not everyone's a programmer (your humble author included), so here are some screen shots showing how to define these security groups using the AWS Management Console. Please be aware that using the Console produces different results, which I'll describe in a moment.
WebSG permitting HTTP from the Internet, HTTPS from the Internet, and RDP from our sample corpnet address range:
AppSG permitting connections from instances in WebSG and RDP from our sample corpnet address range:
DBSG permitting connections from instances in AppSG and RDP from our sample corpnet and vendor address ranges:
Important. The AWS APIs and the Management Console behave differently when defining security groups as origins:
- Management console: When you define a rule using the name of a security group in the "Source (IP or group)" column, you can't define specific protocols or ports. The console automatically expands your single rule into the three you see: one for all ICMP, one for all TCP, and one for all UDP. If you remove one of them, the console will remove the other two. If you wish to further limit inbound traffic on those instances, feel free to use a software firewall such as iptables or the Windows Firewall.
- SOAP and Query APIs: With the APIs, rules containing security group origins can include protocol and port specifications. The result is only the rules you define, not the three broad automatic rules like the console creates. This provides you with greater control and reduces potential exposure, so I'd recommend using the APIs rather than the Console. As of now, while the Console correctly displays whatever rules you define with the APIs, please don't modify API-created rules because the Console's behavior will override your changes. We're working to make the Console support the same functionality as the APIs.
More information
The latest API documentation provides details and examples of how to configure rules in security groups. To learn more, please see:
- Network security concepts (from the Amazon EC2 User Guide)
- Using security groups (from the Amazon EC2 User Guide)
- SOAP and Query API examples (from the Amazon EC2 Developer Guide)
- SOAP API syntax (from the Amazon EC2 API Reference)
- Query API syntax (from the Amazon EC2 API Reference)
I hope this short tutorial has been useful for you and provides information you can use as you plan migrations to or new implementations in AWS. Over time, I'd like to write more short security and privacy related guides which I'll post here and in our Security Center. If you have comments or suggestions about content you'd like to see, please let us know. We're here to make sure you succeed!
> Steve <