Tuesday, October 27, 2009

MySQL gets cloudy with Amazon's new database service

Amazon is offering a new relational database service for EC2 that is powered by MySQL. (By Ryan Paul, October 27, 2009)

Amazon is expanding its Elastic Compute Cloud (EC2) infrastructure with a new offering based on the open source MySQL database system. The Amazon Relational Database Service (RDS) allows users to rent database capacity in the cloud and use it just like a regular MySQL database. Amazon has also introduced support for a new class of EC2 instances intended for high-memory workloads.

Amazon's EC2 service is an increasingly popular solution for deploying Web applications in the cloud, but its database options were previously somewhat limited. Amazon offers a custom database system called SimpleDB that is designed to store rows with simple attribute/value pairs. It lacks the sophisticated features of true relational databases and obviously isn't compatible out of the box with the multitude of existing Web applications that are designed to work with SQL. The new RDS option is a welcome enhancement for EC2 and it addresses one of the service's major deficiencies.

EC2 customers will be able to obtain RDS capacity by renting Amazon DB instances. The cheapest package, the Small DB Instance, provides 1.7GB of RAM and 1 ECU* for $0.11 per hour. A large instance provides 7.5GB of RAM and 4 ECUs for $0.44 per hour. There are five different packages altogether, the most expensive of which is the Quadruple Extra Large DB Instance, which provides 68GB of RAM and 26 ECUs for $3.10 per hour.

The Quadruple Extra Large instance is also available for regular EC2 computing for $2.40 per hour. Amazon introduced it today along with a Double Extra Large instance, which provides 32GB of RAM and 13 ECUs. Regular EC2 Double Extra Large instances cost $1.20 per hour and the RDS variant costs $1.55 per hour. RDS data storage capacity can be provisioned for $0.10 per GB. The database service includes an automated backup feature that can use the provisioned database storage.
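To put the hourly prices above in perspective, here is a rough monthly estimate. It is only a sketch: the 30-day month, the always-on instance, and the reading of the $0.10 per GB storage charge as a monthly rate are assumptions, not figures from the article.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, instance never stopped

# Hourly RDS prices quoted in the article (USD).
RDS_HOURLY = {
    "Small": 0.11,
    "Large": 0.44,
    "Double Extra Large": 1.55,
    "Quadruple Extra Large": 3.10,
}
STORAGE_PER_GB = 0.10  # article's figure; treated here as a per-month charge (assumption)


def monthly_cost(instance, storage_gb=0.0):
    """Estimate one month of RDS usage: instance hours plus provisioned storage."""
    hours_cost = RDS_HOURLY[instance] * HOURS_PER_MONTH
    return round(hours_cost + STORAGE_PER_GB * storage_gb, 2)


if __name__ == "__main__":
    for name in RDS_HOURLY:
        print(f"{name}: ${monthly_cost(name, storage_gb=20):.2f} per month with 20 GB")
```

Under those assumptions, the hourly rate dominates the bill; storage is a small fraction unless the database is very large.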

"RDS provides cost-efficient and resizable capacity, while managing time-consuming database administration tasks for customers. The service takes much of the hassle out of setting up and managing relational databases, such as backups and code patching, freeing up its users to focus on their applications and business," wrote Amazon CTO Werner Vogels in a blog entry. "Amazon RDS provides the full capabilities of a MySQL Database, which means that libraries, applications and tools that have been designed for use with MySQL can be used without modification."

Indeed, EC2 users have already figured out and documented the process of deploying Django on Amazon's cloud using RDS for database hosting. RDS is clearly a lot easier than trying to manually set up and manage MySQL on EC2 with Elastic Block Storage (EBS).
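Because RDS behaves like an ordinary MySQL server, pointing Django at it should mostly be a matter of changing connection settings. The fragment below is a sketch, not the documented procedure mentioned above: it uses the DATABASES dictionary from later Django releases (Django versions current in 2009 used individual DATABASE_* settings), and the hostname, database name and credentials are all placeholders.

```python
# settings.py fragment (sketch). Every value below is a hypothetical placeholder.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # Django's stock MySQL backend
        "NAME": "mydb",                        # placeholder database name
        "USER": "dbuser",                      # placeholder user
        "PASSWORD": "change-me",               # placeholder password
        "HOST": "mydbinstance.example.rds.amazonaws.com",  # hypothetical RDS endpoint
        "PORT": "3306",                        # MySQL's default port
    }
}
```

Nothing here is specific to RDS except the host name, which is the point of Vogels' "without modification" remark.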

Amazon's cloud services are evolving and becoming increasingly affordable, but aren't quite ready for everyone yet. Recent studies show that the service still can't match the uptime of in-house data centers and isn't cost-effective for large enterprises with heavy workloads. Amazon aims to change that eventually. In addition to the new features, the company has also announced plans today to reduce per-instance EC2 pricing next month. Amazon says that reduction, which is as much as 15% for some kinds of instances, was made possible by ongoing efforts to bring down its operating costs.

http://arst.ch/9b7

*Note: ECU = "EC2 Compute Unit". One ECU provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.

Monday, October 26, 2009

Ping Li comments on Big Data In The Cloud

I briefly spoke with Ping Li (VC firm Accel) at the Orrick event on "Loving The Cloud" which I wrote up earlier. Over the weekend he posted the following which expands on that subject:
http://gigaom.com/2009/10/25/the-future-is-big-data-in-the-cloud/

=Bill=

Hadoop presentations

I attended the Hadoop Users Group on 10/21/09 over at Yahoo! HQ where 3 different speakers made presentations. Their slide decks have now been posted at: http://tiny.cc/Hadoop_User_Group_Oct_21

=Bill=

Saturday, October 24, 2009

Yahoo! is hiring for Hadoop team

This might be of interest to the CSIX Cloud SIG. One of the best examples of new applications made practical by cloud computing is an Apache open source project named Hadoop. The biggest contributor (by far) to the Hadoop project is Yahoo! They are looking to expand their team as described below. Sorry, I don't have any internal contacts there. =Bill=

Yahoo! Inc. - Cloud Team SW Architects & Engineers - Hadoop, MapReduce, databases & database internals
Contact jwallace@yahoo-inc.com

Software Architects & Engineers Sunnyvale CA, USA
Yahoo! is looking for engineering team members to work on MapReduce and related projects that build database abstractions on top of Apache Hadoop. Understanding of distributed computing, databases, data warehouses, and especially database internals is a big plus. The Hadoop team moves as fast as any startup does, even though the scale of the problems it solves would make any startup founder deeply envious. We need engineers who can develop systems that handle multiple terabytes of new data daily and run on thousands of machines. The Hadoop team is growing fast! This is not surprising - all of Yahoo!'s batch data processing is moving to Hadoop, and we need many more great people to join this team.

For a complete list of available positions please contact jwallace@yahoo-inc.com. An example of what we seek is below:

We are looking for great Software Architects & Engineers who have a wealth of experience with complex distributed software systems, algorithms, data structures, and performance optimizations. Understanding of grid computing, databases, data warehouses, and especially database internals is a big plus. Solid Java skills are required. Experience with agile development and open source development is desired. 6+ years of relevant software development experience are desired. Please advise me as to your interest and send me a resume. Also feel free to send me a LinkedIn invite so we can stay in touch.

Thanks,
John Wallace
Talent Acquisition Team
Advertising Technology, Cloud Computing & Search jwallace@yahoo-inc.com netrecruiter05@yahoo.com
http://www.linkedin.com/in/wwwrecruiter

Monday, October 19, 2009

CSix Eco-Green Website

The following message thread occurred in the old CSix Cloud Computing Yahoo Group starting 10-19-9. The text was transferred in case anyone wants to continue the discussion here.



CSIX Eco-Green website

Posted by: "bill_slagle"   bill_slagle

Mon Oct 19, 2009 9:48 am (PDT)

The CSIX Eco-Green SIG (EGG) has their website up and running. If you
want to take a look then see http://www.ecogreengroup.org

I think it's interesting that they are using a Content Mgmt System (CMS)
called Joomla. I attended the Drupal BAD Camp this weekend at
UC-Berkeley. I wonder how Joomla and Drupal compare? Just curious.

=Bill=




Re: CSIX Eco-Green website

Posted by: "Glenn Reitsma"  greitsma

Mon Oct 19, 2009 9:56 am (PDT)

Joomla and Drupal are very similar. From what I've read Joomla is a bit
easier to learn and get something up quickly while Drupal has greater
flexibility but with a steeper learning curve to get started building the
site. Once the site is built the content management is easy from either one.

Glenn Reitsma
408-307-3058

http://glennreitsma.emurse.com/ (on-line resume)
http://www.linkedin.com/in/glennreitsma




Re: CSIX Eco-Green website

Posted by: "Raj V"  rajvish

Tue Oct 20, 2009 10:22 am (PDT)

It would depend on the job.

Joomla is very good if you want a quick, professional-looking site. But Joomla has its own way of looking at things.

Drupal is more complicated but has a better abstraction that makes it a better Content Management System and Framework.

I also notice that Drupal seems (no numbers yet) to consume fewer system resources than Joomla.

Having said that they are fast approaching the status of "What is better? Emacs or vi?"

Googling Drupal Vs. Joomla produces 4,650,000  sites. :-)

Raj

Tuesday, October 13, 2009

Cloud Computing Snafu Deletes Microsoft Sidekick T-Mobile Data

The following message thread occurred in the old CSix Cloud Computing Yahoo Group starting 10-13-9.  The text was transferred in case anyone wants to continue the discussion here.



Cloud computing snafu deletes Microsoft Sidekick T-Mobile data

Posted by: "jmcd0205"   jmcd0205

Tue Oct 13, 2009 8:42 am (PDT)

FYI, Blog posting from sfgate.com by Yobie Benjamin. Companies need to remember some basics, like backing up data, maybe 2-3x...?

Eight hundred thousand T-Mobile subscribers who use Microsoft's Danger Sidekick smart phones suffered the worst possible failure that can occur to anyone's personal data. All the customers' data - address books, calendars, to-do lists and photos - was wiped out... kaput... gone... destroyed... forever and ever.

This computing nightmare affects not only Sidekick owners but everyone who owns a smart phone, who now has to question the integrity of their own device.

T-Mobile, the operator of the Sidekick's data service, and Microsoft are fumbling to explain how this massive clusterfoot happened.

T-Mobile's web site explains:

Regrettably, based on Microsoft/Danger's latest recovery assessment of their systems, we must now inform you that personal information stored on your device -- such as contacts, calendar entries, to-do lists or photos -- that is no longer on your Sidekick almost certainly has been lost as a result of a server failure at Microsoft/Danger. That said, our teams continue to work around-the-clock in hopes of discovering some way to recover this information. However, the likelihood of a successful outcome is extremely low... Sidekick users "NOT reset their device by removing the battery or letting their battery drain completely, as any personal content that currently resides on your device will be lost."

The implication of the statement is Microsoft servers suffered a catastrophic failure and that there was no backup. How the f&^% is it possible there was no single backup on an enterprise so critical to the personal lives of tens of thousands of T-Mobile/Microsoft Sidekick users?

This isn't the first time a Web service has crashed and left its users without access to data stored "in the cloud." Google's Gmail has had multiple outages but it has very quickly recovered with no data loss.

This is the FIRST TIME a MAJOR cloud-computing vendor didn't have any backups. It is a total failure of systems from Microsoft's server operating systems, storage systems, processes, procedures and everything that shoulda, woulda, coulda happened.

Folks, this is Microsoft we're talking about here. It's the same company who wants us to upgrade to Windows 7. It's the company who wants to be your cloud computing company of choice. It's the same company who sells server operating systems.

The blame is also being shared with Hitachi Data Systems, who provided the failed backup systems. It is being reported that Microsoft contracted Hitachi Data Systems (HDS) to do remedial work on the server infrastructure and that, during the work, the server infrastructure failed. There were no backups or replicated data sets, and so the data was lost forever.

Further this involves a major telecommunications company - T-Mobile, not some small rural mobile phone provider. We need to understand so consumers know what to avoid in the future.

In every company I have ever worked in or consulted for, backing up is part of Information Technology 101. I always advocate not only one backup but sometimes double and triple backups and minimum 30-day archives. How can Microsoft, Hitachi and T-Mobile allow this massive failure?

Microsoft, Hitachi and T-Mobile must come clean and explain where and how the failures occurred lest they suffer the consequence of losing the public's and corporate enterprises' trust. For now, until they explain, their names are all in the mud.

This is a complete failure. It would be a tough day to sell Microsoft server operating systems software today and even harder to sell Hitachi Data's backup systems... unless Microsoft, Hitachi and T-Mobile come clean and explain how this massive failure could happen. If I were a company chief information officer or chief information security officer, I would have to do a complete double-take before I commit to a server operating system or backup solution that can suffer such catastrophic failure.

The worst part of this (having owned a Sidekick) is there is NO EASY WAY to back up your Sidekick. It's supposed to do it for you. This absolutely sucks for T-Mobile subscribers who use the Microsoft Sidekick dumb phone.

It's small comfort that T-Mobile suspended the further sales of Microsoft's Sidekick smart phone.




Re: Cloud computing snafu deletes Microsoft Sidekick T-Mobile data

Posted by: "Bob Sutterfield"  bsut2002

Sat Oct 17, 2009 11:16 am (PDT)

Here's how I understand the situation, from other sources:

Microsoft bought Danger some time ago. This acquisition included their
products, staff, physical assets (like servers and software), and their
revenue contracts (like that with T-Mobile). Microsoft is partway through
ingesting their acquisition, and came to the point of technical
integration. The Danger services were singly homed, without redundancy,
which is why they wanted to move onto Microsoft's cloud. They were
re-installing the OS, and preparing to later install the cloud management
software and Danger-specific service software, onto the Danger servers.
This task was being performed by a Hitachi contractor working for
Microsoft.

The project plan assumed (it turns out mistakenly) that the servers and
services and databases were already configured in a manner compliant with
the Microsoft cloud infrastructure spec, so the technician was simply
re-installing everything. If that assumption had been accurate, the
services and databases would have been in service redundantly someplace
else, and removing the Danger infrastructure from service would have
triggered a fail-over to those other instances. (This is why I always
design services with tripled infrastructure: one I can put under
maintenance, one in service, and another backup in case the primary fails
while I'm doing maintenance. Plus I never use cold backups, I always use
hot-hot load balancing between multiple primary active instances.)

I don't know why the integration project plan didn't include an assessment
of the existing infrastructure and dependencies. I don't know why the plan
didn't include the task of backing up the Danger user databases before
FDISKing the servers. I hope the technician didn't lose her job - the
outage was the project manager's fault.
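The tripled, hot-hot arrangement described in this post can be sketched as a small round-robin pool in which any instance can be pulled for maintenance or fail without interrupting service. This is an illustrative toy, not anything Microsoft or Danger ran; the class and instance names are made up.

```python
from itertools import cycle


class Pool:
    """Round-robin over active instances, skipping any that are out of service."""

    def __init__(self, names):
        self.names = list(names)
        self.unavailable = set()        # instances under maintenance or failed
        self._ring = cycle(self.names)  # endless a, b, c, a, b, c, ...

    def take_out(self, name):
        """Mark an instance as unavailable (maintenance or failure)."""
        self.unavailable.add(name)

    def put_back(self, name):
        """Return an instance to service."""
        self.unavailable.discard(name)

    def pick(self):
        """Return the next available instance, or raise if none is left."""
        for _ in range(len(self.names)):
            candidate = next(self._ring)
            if candidate not in self.unavailable:
                return candidate
        raise RuntimeError("no instance available")


if __name__ == "__main__":
    pool = Pool(["a", "b", "c"])
    pool.take_out("a")   # planned maintenance
    pool.take_out("b")   # unplanned failure during the maintenance window
    print(pool.pick())   # the third instance still serves requests
```

With only one instance, as in the singly homed Danger setup, the first `take_out` leaves nothing for `pick` to return.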


Re: Cloud computing snafu deletes Microsoft Sidekick T-Mobile data

Posted by: "Bob Sutterfield"  bsut2002

Sun Oct 18, 2009 4:45 pm (PDT)

I forgot to mention: This is getting publicity as a Cloud Computing
Failure. It's SAAS (in the sense that we can now label anything that's
hosted "someplace else" as SAAS to replace the old-fashioned ASP label) and
therefore a Cloud application, but it's not running on a Cloud
infrastructure (elastic pools of virtualized resources). They intended to
move it from a traditional hosting infrastructure onto a Cloud, and that
move plan is where the mistakes happened.

Monday, October 5, 2009

Using AWS S3 Service as an External Disk for my Laptop

The following message thread occurred in the old CSix Cloud Computing Yahoo Group starting 10-05-9.  The text was transferred in case anyone wants to continue the discussion here.




Date: Mon, 5 Oct 2009 19:41:48 -0700 (PDT)
To: CSix-Cloud@yahoogroups.com
From: Chaganti Radhakrishna <chaganti.radhakrishna@...>
Subject: Using AWS S3 service as an external disk for my laptop

Folks,

I want to experiment with a simple cloud use case: backing up my personal laptop hard drive to AWS S3. Has anyone done this before? What do you see as the pros and cons of doing this? I'd like to hear your thoughts on this use case.

RK Chaganti




Date: Mon, 5 Oct 2009 22:44:05 -0700
From: "Waiming Mok" <wmm@...>
To: "'Chaganti Radhakrishna'" <chaganti.radhakrishna@..., CSix-Cloud@yahoogroups.com>
Subject: RE: [CSix-Cloud] Using AWS S3 service as an external disk for my laptop

Might be cheaper with Carbonite (www.carbonite.com )

if you just want to backup your (1) computer.

We can consider carbonite to be in the cloud.



Amazon S3: 15 cents / GB/mo @ 100 GB = $15 / month

(not including upload bandwidth costs)

Vs

Carbonite: $3.61 / month for unlimited backup storage per computer.



Waiming
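The comparison above can be worked through in a few lines. The sketch uses only the two prices quoted in the post and, like the post, ignores upload bandwidth charges; the function names are made up for illustration.

```python
S3_PER_GB_MONTH = 0.15      # S3 storage price quoted in the post (USD)
CARBONITE_PER_MONTH = 3.61  # Carbonite flat rate quoted in the post (USD)


def s3_monthly(gb):
    """Monthly S3 storage cost for `gb` gigabytes, excluding bandwidth."""
    return S3_PER_GB_MONTH * gb


def break_even_gb():
    """Storage size at which S3 costs the same as the Carbonite flat rate."""
    return CARBONITE_PER_MONTH / S3_PER_GB_MONTH


if __name__ == "__main__":
    print(f"S3 at 100 GB: ${s3_monthly(100):.2f} per month")
    print(f"Break-even: {break_even_gb():.1f} GB")
```

On these numbers S3 is cheaper only below roughly 24 GB, so for a full laptop drive the flat rate wins on storage cost alone.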




Date: Wed, 7 Oct 2009 16:48:05 -0700
From: "Rich Zbriger" <rich.zbriger@...>
Reply-To: <rich.zbriger@...>
Subject: RE: [CSix-Cloud] Using AWS S3 service as an external disk for my laptop

I have been using Carbonite for about 6 months with my desktop computer at
home and it works very well. It runs in the background and backs up new
files and updates to existing files. Backups are not performed immediately,
but seem to be sufficiently frequent. The initial system backup can take quite
a long time, at about 12-15 GB per day. They recommend backing up only user
data rather than a complete system, but you can specify whatever you want
backed up.



Rich
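For anyone rolling their own S3 backup instead, the "new files and updates to existing files" behavior described above can be approximated by comparing file snapshots between runs. This is a minimal sketch of that idea, not a description of how Carbonite works internally; the function names are made up, and the actual upload step is left out.

```python
import os


def snapshot(root):
    """Map each file path under root to its (size, modification time)."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            state[path] = (info.st_size, info.st_mtime)
    return state


def files_to_upload(previous, current):
    """Return paths that are new or whose size or mtime changed since last run."""
    return sorted(path for path, meta in current.items() if previous.get(path) != meta)
```

A backup run would call `snapshot` on the user-data folder, pass the result and the previous run's snapshot to `files_to_upload`, send those files to storage, and then save the new snapshot for next time.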

Thursday, September 3, 2009

Grid vs Cloud

The following message thread occurred in the old CSix Cloud Computing Yahoo Group starting 9-3-9.  The text was transferred in case anyone wants to continue the discussion here.







Date: Thu, 3 Sep 2009 09:49:19 -0700
From: Glenn Reitsma <glenn.reitsma@...>
Subject: grid vs cloud


Is there a common distinction between grid computing and cloud computing or are the two terms synonymous?


Glenn Reitsma
408-307-3058


http://glennreitsma.emurse.com/    (on-line resume)







Date: Thu, 3 Sep 2009 10:35:07 -0700
From: Bob Sutterfield <bob@...>
Subject: Re: [CSix-Cloud] grid vs cloud


Glenn Reitsma wrote:
Is there a common distinction between grid computing and cloud computing or are the two terms synonymous?


At the layer that's now called IaaS "Infrastructure as a Service", I think of the three RADlab distinctives (infinite resources, no commitment, pay by the drink) as distinguishing Grid from Cloud.  I think of Grid as denoting a large number of identical elements in a loosely coupled parallel architecture.  Many users do Grid-style large scale computing on a Cloud, but not all Grids are managed in a Cloud-y way.


At the layer that's now called SaaS "Software as a Service", there's not much difference from the user's point of view.







Date: Thu, 3 Sep 2009 12:55:35 -0700
From: "Waiming Mok" <wmm@...>
Subject: RE: [CSix-Cloud] grid vs cloud
From my experience, grid (a la grid computing) tends to focus on
solving computationally intensive applications
(e.g. SLAC and CERN use grid for nuclear physics).
Grid has been around since the early days of
lower-cost rack-mount servers (SPARC or x86).
There tends to be a master job scheduler that dispatches jobs
to the various compute nodes (e.g. Rock, LSF, Sun Grid Engine).
There is a distributed file system shared between the nodes.


Some differences I know about between grid and cloud:


1)      Because grid tends to serve compute-intensive jobs, grid does not
use virtualization to partition the hardware node, whereas cloud
tends to use virtualization partitioning to get multiple OSes and workloads
running in a single node.
2)      Grid dispatches jobs (in the form of applications) to nodes running a specific OS.
Cloud tends to spawn virtual machines at the IaaS level.


As Bob says, it’s possible to deploy grid within cloud – should take above into account.


Waiming
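The master-scheduler pattern described in this post can be illustrated with a toy dispatcher that hands each job to whichever node currently has the least queued work. It is a sketch of the general idea only: the greedy least-loaded policy, the job costs and the node names are assumptions for illustration, and real schedulers such as LSF or Sun Grid Engine are far more elaborate.

```python
import heapq


def dispatch(jobs, nodes):
    """Assign each (job_name, cost) to the node with the least accumulated cost."""
    heap = [(0, node) for node in nodes]  # (current load, node name)
    heapq.heapify(heap)
    assignment = {node: [] for node in nodes}
    for job, cost in jobs:
        load, node = heapq.heappop(heap)  # least-loaded node so far
        assignment[node].append(job)
        heapq.heappush(heap, (load + cost, node))
    return assignment


if __name__ == "__main__":
    jobs = [("j1", 5), ("j2", 1), ("j3", 1), ("j4", 1)]
    print(dispatch(jobs, ["n1", "n2"]))
```

Note that the unit of work here is a job sent to an existing node, which matches point 2 above; a cloud would instead start a new virtual machine.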

Friday, August 28, 2009

Cloud Organizing Concepts (Disorganized)

The following message thread occurred in the old CSix Cloud Computing Yahoo Group starting 8-28-9. The text was transferred in case anyone wants to continue the discussion here.



Date: Fri, 28 Aug 2009 19:49:04 -0700
To: csix-cloud@yahoogroups.com
From: Bob Sutterfield <bob@...>
Subject: Cloud organizing concepts (disorganized )

Here are some disconnected organizing concepts that have been sitting in my drafts folder for a while...

UC Berkeley RAD Lab "Above The Clouds" definition of Cloud Computing includes three primary characteristics:
  • Huge Resources - The illusion of infinite computing resources available on demand, thereby eliminating the need for Cloud Computing users to plan far ahead for provisioning
  • No Commitment - The elimination of an up-front commitment by Cloud users, thereby allowing companies to start small and increase hardware resources only when there is an increase in their needs
  • Pay by the Drink - The ability to pay for use of computing resources on a short-term basis as needed (e.g., processors by the hour and storage by the day) and release them as needed
Some would say, if it doesn't meet these criteria, it's not Cloud.

Key benefits of each of these three characteristics:
  • Huge Resources - IT agility as systems can be sized to meet demand -- as load scales, system resources are easily obtained to ensure SLAs can be met
  • No Commitment - No longer face the tradeoff between overprovisioning (waste of capital) and underprovisioning (waste of users)
  • Pay by the Drink - Move IT payments from CAPEX to OPEX. Pay only for actual resources consumed. Tie IT cost to business benefit received
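The overprovisioning/underprovisioning tradeoff and the pay-by-the-drink alternative can be made concrete with a small worked example. The demand curve and the unit price below are invented for illustration and do not come from the RAD Lab paper.

```python
def fixed_capacity_cost(demand, capacity, price_per_unit_hour):
    """Pay for `capacity` units every hour; return (cost, unserved demand)."""
    cost = capacity * price_per_unit_hour * len(demand)
    unserved = sum(max(0, d - capacity) for d in demand)
    return cost, unserved


def pay_per_use_cost(demand, price_per_unit_hour):
    """Pay only for the units actually used in each hour."""
    return sum(demand) * price_per_unit_hour


if __name__ == "__main__":
    demand = [2, 2, 10, 2]  # hypothetical units needed in each of four hours
    print(fixed_capacity_cost(demand, 10, 1))  # sized for the peak
    print(fixed_capacity_cost(demand, 2, 1))   # sized for the typical hour
    print(pay_per_use_cost(demand, 1))         # elastic, billed per unit-hour
```

Sizing for the peak serves everyone but pays for 40 unit-hours when only 16 are used; sizing for the typical hour pays for 8 but leaves 8 units of demand unserved; paying per use costs 16 and serves all demand.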
See the taxonomy diagram on page 22 of Guidance for Critical Areas of Focus in Cloud Computing for discussion of IaaS, PaaS, SaaS layering - also the nearby discussion of features, openness, and security.

Deployment Options:

        |                  |
Public  | Virtual Private  | Public Cloud
        |    (AWS++)       |
        |------------------+---------------------
        |                  |
Private | Internal Private | External Private Cloud
        |                  |   (outsourcing++)
        |------------------+---------------------
             Dedicated     |  Shared








Here's a good presentation on Risk and Security in the Enterprise Cloud





To: <csix-cloud@yahoogroups.com>
From: "Junaid Qurashi" <junaidqurashi@...>
Subject: Re: [CSix-Cloud] Cloud organizing concepts (disorganized )


"Some would say, if it doesn't meet these criteria, it's not a Cloud"
How does a private cloud fit in this definition? Some private clouds are restricted to a defined set of resources. Or are they not a cloud?
Junaid



Date: Thu, 3 Sep 2009 22:20:42 -0700
To: csix-cloud@yahoogroups.com
From: Bob Sutterfield <bob@...>
Subject: Re: [CSix-Cloud] Cloud organizing concepts (disorganized )
Huge Resources - The illusion of infinite computing resources available on demand, thereby eliminating the need for Cloud Computing users to plan far ahead for provisioning

Junaid Qurashi wrote:
How does a private cloud fit in this definition? Some private clouds are restricted to a defined set of resources. Or are they not a cloud?

From the private cloud's users' perspective, any resources they want are available, on demand, without planning for provisioning. They probably have a better-informed view of the limits of the private cloud than do the users of a public cloud, but still they get what they need when they ask for it, within those constraints.