A year ago I built an app to keep track of pickup volleyball game scores and payments. It works well, but after a year of regular use it's time to update it with some improvements.
Today I'll walk you through some of the details I considered when deciding to switch to a NoSQL storage solution.
Why Azure NoSQL Table Storage
Azure SQL has been good to me - it works, it's familiar, and it's relatively inexpensive. However, even though the \$5/month Azure SQL bill isn't cost prohibitive (especially with free credits :)), ideally I'd love the app to be able to fund itself via the fractions of a penny that get rounded up as part of the app's balancing logic.
Azure Table Storage in comparison is dirt cheap. As of the time I'm writing this, it will cost me about \$.05/month to store and access my < 1gb of application data.
Another option I considered is Azure SQL Serverless. This is a nice alternative because it would allow me to keep the relational structure of my data, however based on my historical app usage patterns it would definitely end up costing more than Azure Table Storage. Cool service, but not a good fit for my scenario.
Another reason I went with Azure Table Storage is its simplicity.
Azure offers an alternative table storage option called Cosmos DB. Cosmos DB is Azure's premium table storage offering, but it was overkill for what my app needed.
Yes, Cosmo DB's global distribution is cool. Yes, its additional indexes on my data is great. However, as I mentioned, cost is the biggest factor in how I'm deciding on a storage solution for this project. And Cosmos DB has lots of great features, but they come at a price.
My preference for this app is to pay more up front in development time to create a better design then rely on a service that does some of that for me at a higher monthly cost. In this case, simple features will accomplish what I need, so that's what I'm going with.
Design Considerations for Azure Table Storage
When rewriting my app's data layer, there were several new things that I had to account for in Azure Table Storage.
The primary key in Azure Table Storage is made up of two columns: PartitionKey and RowKey. This composite primary key is also the clustered (and only!) index for the table.
The PartitionKey in particular is important because it determines whether related rows of data will exist on the same underlying server or not. Different PartitionKey values may end up on different servers. This can be a good thing if you factor in parallel access in your design, or a significant bottleneck if all of your data ends up residing on a single server.
There are two aspects of latency to consider.
The first is how far you are from your Azure region. For me, this isn't a huge deal since the only users of the app are currently in northeast Ohio, so choosing to store all of the data in the same region is good enough.
The bigger consideration for this app is that with Azure Table Storage, you can't do any joins with your data. Well, you can join in your app, but you can't join in Azure itself.
This means that design is critical to reducing latency since joining multiple tables of data in your app will require multiple calls to the Azure Table Storage service. This is not something that is necessarily a deal breaker, just something that needs to be considered, especially if coming from a relational SQL background where you are using JOINs to filter and reduce your data before it is returned to your app.
Row Size Limitations
NoSQL gets a lot hate for its common pattern of storing giant blobs of semi-structured data in a single field (this design makes sense when considering the latency considerations).
However, this becomes a delicate balancing act since each row in Azure Table Storage can only be a maximum of 1mb in size. This causes you to want to fit as much data into a single call as possible (to reduce the number of calls) while also not exceeding the 1mb row size.
Azure Table Storage does allow up to 252 columns of data per table (plus the required PartitionKey, RowKey, and Timestamp columns) so at least your 1mb of data will be organized.
The above details were the primary considerations I had to take into account for my specific app. There are things data like data consistency, durability, and more that you may want to take into account based on your app's goals and usage patterns.
While NoSQL can often break many of the relational concepts we are used to using, it is often the means for achieving the cheapest pay-only-for-what-you-use pricing on cloud providers.