Migrating to SimpleDB

This post covers what to consider when moving from a traditional RDBMS to Amazon’s SimpleDB data store. LightSpeed supports SimpleDB alongside traditional databases, so we get questions about migration every now and then, and I wanted to gather my thoughts in one place. LightSpeed offers many benefits to .NET developers working with SimpleDB, so after each general point I’ll mention how LightSpeed can help with the migration.

SimpleDB does not have data types

Unlike a traditional database, where you define a schema, in SimpleDB everything is one type – string. Want to set decimal precision? No luck. Want to set a maximum field length? Go away. This is not to say that having no data types beyond string is necessarily a bad thing; it is simply something you need to keep in mind when planning a migration to SimpleDB.

How LightSpeed helps: Despite everything being a string, you define your domain model using CLR types (decimal, int, DateTime, etc.). The nice thing about this is that LightSpeed takes care of the data conversion for you. If your model has a decimal on it and we get a string back from SimpleDB, we’ll convert it on the fly to the CLR decimal that you actually want to work with. We also format values so that numeric and date comparisons still work even though SimpleDB supports only lexical (string) comparisons.
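To show why formatting matters for lexical comparisons, here is a minimal sketch in Python of one common approach: zero-padding with an offset for integers, and fixed-width ISO 8601 for dates. This is an illustration only and not LightSpeed's actual encoding. The names `OFFSET`, `WIDTH`, `encode_int`, `decode_int` and `encode_datetime` are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical parameters for illustration; not LightSpeed's actual encoding.
OFFSET = 10**9   # shifts negative values into the non-negative range
WIDTH = 10       # enough digits for any value in [0, 2 * OFFSET)


def encode_int(value: int) -> str:
    """Encode an int so that string order matches numeric order."""
    if not -OFFSET <= value < OFFSET:
        raise ValueError("value outside the supported range")
    return str(value + OFFSET).zfill(WIDTH)


def decode_int(text: str) -> int:
    """Reverse encode_int."""
    return int(text) - OFFSET


def encode_datetime(value: datetime) -> str:
    """Fixed-width ISO 8601 in UTC sorts chronologically as a string."""
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Plain strings sort "100" before "20"; the encoded forms sort numerically.
values = [5, -3, 100, 20]
encoded_sorted = sorted(encode_int(v) for v in values)
decoded_sorted = [decode_int(s) for s in encoded_sorted]
```

The trade-off in this sketch is that the range has to be chosen up front: every stored value needs the same width, so widening the range later means re-encoding existing data.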

No Stored Procedures

There’s not much more to say here other than they’re not supported and I doubt they ever will be. You’ll need to drop stored procedures and use dynamic querying. But hey – you’re in NoSQL land now, surely you’re one of the cool kids who doesn’t like stored procedures anyway, right? :-) The same goes for much of the database infrastructure that you may be used to – triggers, auditing, etc. do not exist in SimpleDB. It is a simple data store.

SimpleDB does not have foreign keys or relationships

SimpleDB does not have a notion of FKs or associations/relationships. You can still have CustomerID on a ShoppingCart, but referential integrity won’t be enforced at the database level.

How LightSpeed helps: LightSpeed is a convention-driven ORM and will pick up that properties named [Type]Id are probably meant to be associations. If you have really funky naming, you’ll need to wire up the associations manually, but that’s a one-off effort. LightSpeed can then manage the constraints (one to many, many to one, one to one, many to many).

However, SimpleDB doesn’t support joins or multi-statement queries or anything like that. So one thing that LightSpeed can’t do relationship-wise is eager loading. All associations are lazy-loaded, and given SimpleDB latency, the resulting n+1 problem (one extra query for each parent entity whose association you touch) can cause significant performance issues. LightSpeed caching can help here.
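To make the n+1 cost concrete, here is a small Python sketch that counts simulated round-trips with and without a cache. It does not use LightSpeed's caching API. `FakeStore` and `CachingLoader` are hypothetical stand-ins, and the "round-trip" is just a counter rather than a real HTTP call.

```python
class FakeStore:
    """Stand-in for a remote data store; counts simulated round-trips."""

    def __init__(self, customers):
        self._customers = customers
        self.round_trips = 0

    def get_customer(self, customer_id):
        self.round_trips += 1  # each call would be one HTTP request
        return self._customers[customer_id]


class CachingLoader:
    """Remembers customers already fetched so repeats cost nothing."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def get_customer(self, customer_id):
        if customer_id not in self._cache:
            self._cache[customer_id] = self._store.get_customer(customer_id)
        return self._cache[customer_id]


customers = {0: "Alice", 1: "Bob"}
carts = [{"id": i, "customer_id": i % 2} for i in range(6)]

# Lazy loading without a cache: one round-trip per cart.
plain_store = FakeStore(customers)
plain_names = [plain_store.get_customer(c["customer_id"]) for c in carts]

# With a cache: one round-trip per distinct customer.
cached_store = FakeStore(customers)
loader = CachingLoader(cached_store)
cached_names = [loader.get_customer(c["customer_id"]) for c in carts]
```

In this sketch the cache only pays off when many parents share the same associated entity. If every cart belonged to a different customer, the number of round-trips would be unchanged.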

SimpleDB does not have field constraints

Currently using a unique constraint on a database field to make sure you don’t get duplicates? That won’t fly in SimpleDB. You’ll need to make sure that you’re checking uniqueness (or any other type of constraint you want to enforce) at the application layer, not at the database layer.

How LightSpeed helps: LightSpeed includes many validations, uniqueness checking among them, which mean you don’t need to enforce constraints at the database level. This has advantages beyond plugging the holes in SimpleDB. For example, if a uniqueness validation fails, it’s much easier to present the user with a meaningful error message about the given entity than to handle an insert error message. Again, though, you’ll want to watch out for possible performance implications: a uniqueness validation incurs a round-trip to SimpleDB every time an entity is validated.
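As a rough illustration of checking uniqueness at the application layer, here is a Python sketch against an in-memory stand-in for a domain. It is not how LightSpeed implements its validation. `InMemoryDomain`, `ValidationError` and `save_unique` are hypothetical names, and the `queries` counter stands in for the round-trip that each check costs.

```python
class ValidationError(Exception):
    """Raised when an application-level constraint is violated."""


class InMemoryDomain:
    """Stand-in for a SimpleDB domain: item name -> dict of string attributes."""

    def __init__(self):
        self.items = {}
        self.queries = 0

    def find_by(self, attribute, value):
        self.queries += 1  # against the real service this is a round-trip
        return [name for name, attrs in self.items.items()
                if attrs.get(attribute) == value]

    def put(self, name, attrs):
        self.items[name] = dict(attrs)


def save_unique(domain, name, attrs, unique_attribute):
    """Save an item only if no other item has the same unique_attribute."""
    value = attrs[unique_attribute]
    matches = domain.find_by(unique_attribute, value)
    if any(match != name for match in matches):
        raise ValidationError(f"{unique_attribute} '{value}' is already in use")
    domain.put(name, attrs)


domain = InMemoryDomain()
save_unique(domain, "user1", {"email": "a@example.com"}, "email")
try:
    save_unique(domain, "user2", {"email": "a@example.com"}, "email")
    duplicate_rejected = False
except ValidationError:
    duplicate_rejected = True
```

Note that a check followed by a write is not atomic, so two writers running at the same moment could both pass the check. A sketch like this reduces duplicates and gives friendlier errors, but it is not a hard guarantee in the way a database constraint is.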

SimpleDB is likely slower than your relational database

We’re not saying SimpleDB isn’t web scale; the point is that your queries are now performed as HTTP requests. There is an inherent latency in this which is unlikely to exist in a traditional server solution (where the database may be on the same machine, or on the same network segment, making it extremely fast to query). This is a general consideration when looking at the move to SimpleDB. Amazon have been working to improve this situation in various ways: batching of inserts and updates, and faster querying if you’re issuing calls from within the Amazon hosting environment (calls to SimpleDB are faster if made from an EC2 instance in the same data center).

How LightSpeed helps: By abstracting away your querying and passing it to LightSpeed you can work with us to support new scenarios with SimpleDB. For example, one user noticed a performance issue that we could solve with a new feature added by Amazon – batched puts. By adding this support into LightSpeed any SimpleDB user working through LightSpeed could simply update to the latest version to get the performance benefits of batched puts. If you handwrite your queries for SimpleDB you would need to manually go through your code base and update to support the new features of SimpleDB as they are released.

LightSpeed caching can also help with data that changes relatively infrequently such as reference data.
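To show what batched puts buy you, here is a Python sketch that groups pending writes into batches so that many items cost only a few requests. It is not LightSpeed's implementation. `chunked`, `save_all` and the `send_batch` callback are hypothetical, and the default batch size of 25 is the per-call item limit as I recall it for SimpleDB's batch put operation, so treat it as an assumption and check the current documentation.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Item = Tuple[str, Dict[str, str]]  # (item name, attributes)


def chunked(items: Iterable[Item], size: int) -> Iterator[List[Item]]:
    """Yield consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[Item] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


def save_all(items: Iterable[Item],
             send_batch: Callable[[List[Item]], None],
             batch_size: int = 25) -> int:
    """Send items in batches; return the number of requests made.

    batch_size=25 is an assumed service limit, not a verified one.
    """
    calls = 0
    for batch in chunked(items, batch_size):
        send_batch(batch)  # one request per batch instead of one per item
        calls += 1
    return calls


# Usage: 60 items go out in 3 requests rather than 60.
sent: List[List[Item]] = []
pending = [(f"item{i}", {"value": str(i)}) for i in range(60)]
calls = save_all(pending, sent.append)
```

Keeping the batching behind a single function like `save_all` is the same idea as the point above: callers do not change when the underlying service gains a faster way to write.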

This sums up some of the key differences you are likely to encounter when dealing with SimpleDB. We’ve worked hard to make LightSpeed abstract away some of the inconsistencies associated with using SimpleDB in your projects, and we hope it eases the transition between any of the data stores that LightSpeed supports. Interested in testing out SimpleDB for yourself? Grab these tools:

Mindscape LightSpeed
Mindscape SimpleDB Management Tools
