Stop coding and start scaling!
At the April 2012 MySQL conference in San Francisco I discovered iDB by ScaleArc and was excited by what it could do. After three months of testing and working with ScaleArc, I purchased it.
What I do
The company I work for (WDT Inc.) supplies weather data. From a database point of view, we take in data feeds from many sources with “Ingest” applications. All of it is processed by forecast applications and provided to many “Out flow” applications. Current conditions are processed into forecast data; RADAR images, lightning and alerts are mapped into images used by governments and individual customers (iPhones). So we have many applications, written in many languages, working with many databases.
Coding applications to split reads and writes for replication and sharding has been the answer for scaling database access for years. Many companies are working on ways to improve replication. The problem with these approaches is that they all require code changes. Having been a firewall developer (ipchains), I’ve always thought a proxy was the answer. Think F5-BigIP for databases.
Applications written to work with one MySQL server can’t scale without re-coding. If read/write splitting were all iDB did, I’d be happy, but iDB also monitors transactions, creates metrics and caches queries.
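To make the read/write splitting idea concrete, here is a minimal sketch of the kind of decision a proxy has to make for each statement. It is not ScaleArc's code; the function name `route` and the rule set are assumptions for illustration only, and a real proxy must handle many more cases (stored procedures, temporary tables, replication lag and so on).

```python
import re

# Statements that only read data; everything else is treated as a write.
# This list is an illustrative assumption, not iDB's actual rule set.
READ_ONLY = re.compile(r"^\s*(SELECT|SHOW|DESCRIBE|EXPLAIN)\b", re.IGNORECASE)
LOCKING_READ = re.compile(r"\bFOR\s+UPDATE\b", re.IGNORECASE)


def route(sql: str, in_transaction: bool = False) -> str:
    """Return "read" or "write" for one SQL statement."""
    # Inside an open transaction, stay on the read/write server so the
    # client sees its own uncommitted changes.
    if in_transaction:
        return "write"
    # SELECT ... FOR UPDATE takes locks, so it must go to the master.
    if READ_ONLY.match(sql) and not LOCKING_READ.search(sql):
        return "read"
    return "write"
```

The point is that the application never sees this logic. It keeps talking to what looks like a single MySQL server.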
Testing
ScaleArc provided me with a VM version and two 30-day licenses. I installed iDB on two VMware servers along with several types of web applications (some staging for production). The primary application was an in-house Java application that processes weather alerts. Other applications included WordPress, MediaWiki, Cacti and Thinkup.
I also ran sysbench and hammerora against iDB. I used two and three MySQL servers, with master / master / slave replication between them.
Most test “Servers” were VMware virtual machines running CentOS 6.0 x64. Some tests were run on production-class MySQL servers while iDB ran in a VM.
Installation
iDB is designed as an appliance. You can purchase it as hardware, software or a VM. You can have ScaleArc pre-configure iDB for your network. I skipped this, so I needed to use a live Linux CD to pop the root password and change the network configuration.
iDB uses a web interface. The setup process is very intuitive. My installation went well until I tried to install the license key. iDB would not take the key. I turned the problem over to ScaleArc support to see how they would respond. I also began to dig into the code and in about two hours solved the problem. I found the startup script was calling the wrong program name.
Within three days, ScaleArc sent a system engineer on site to help with my problem and complete the installation. (I already had it working, but I let ScaleArc do it their way to test their support.) The engineer upgraded to a newer version of the iDB server and installed the license key.
Security
As part of digging into ScaleArc’s code, I checked iDB’s security. iDB has two methods to authenticate a user: pass through and off load. With the default, off load, iDB authenticates the client/application/user itself. That means iDB connects to the database server with a password you give it for that database user. This allows iDB to answer requests from cached queries without connecting to MySQL.
Pass through, as the name implies, just passes the authentication protocol from the client to the server. In this mode iDB cannot cache the queries.
Load Balancing
iDB lets you set up “Clusters” of servers. You can designate each database server as either Read/Write or Read only.
You can define a cluster as either Round Robin or Dynamic load balancing.
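The difference between the two modes can be sketched roughly as follows. I don't know exactly how iDB implements “Dynamic”; this sketch assumes it means “send the query to the least busy server”, and the class names `RoundRobin` and `LeastBusy` are made up for illustration.

```python
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation, ignoring their load."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastBusy:
    """Assumed meaning of "Dynamic": pick the server with the fewest
    queries currently in flight."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def done(self, server):
        # Call when a query finishes so the server counts as free again.
        self.active[server] -= 1
```

Round robin is predictable but can pile work onto a slow server. A load-aware choice adapts, at the cost of the proxy tracking state per server.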
Caching
iDB analyzes the traffic passing through it and summarizes it to create analytic data. You can then create expressions to cache queries.
SELECT type FROM `db`\.`settings` WHERE name\=.*
iDB plots the summarized queries and makes it easy to set up caching rules.
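As a rough illustration of what a pattern-based caching rule does, here is a small sketch that uses the expression above as a regular expression. The `QueryCache` class, its time-to-live setting and the `run_query` callback are all assumptions of mine, not iDB's interface.

```python
import re
import time


class QueryCache:
    """Cache results only for queries matching one rule, for a limited time."""

    def __init__(self, pattern, ttl_seconds, clock=time.monotonic):
        self.rule = re.compile(pattern)
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # SQL text -> (expiry time, rows)

    def fetch(self, sql, run_query):
        # Queries that match no rule always go to the database.
        if not self.rule.fullmatch(sql):
            return run_query(sql)
        now = self.clock()
        hit = self.store.get(sql)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: answer without touching MySQL
        rows = run_query(sql)
        self.store[sql] = (now + self.ttl, rows)
        return rows


# Hypothetical rule: cache the settings lookup for 30 seconds.
settings_cache = QueryCache(
    r"SELECT type FROM `db`\.`settings` WHERE name\=.*", ttl_seconds=30
)
```

A short lifetime like this suits data that is read constantly but changes rarely, such as an application's settings table.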
Inconveniences
To load balance, you need to supply iDB with the ID and password of every database user. I’m sure you keep these safe somewhere? What about that one very old production program that the guy who left two years ago wrote in Fortran?
MySQL stores a SHA-1 hash derived from each user’s password. After some study I found a way for a proxy to recover the SHA-1 of the password used by the client. I wrote this up and presented it to ScaleArc. I call it “Transparent Authentication”. ScaleArc has agreed to build this into iDB. I hope to test this soon.
The console doesn’t always report the current settings. I turned on the Access Control List for the cluster and set access to the local LAN only. This creates iptables rules in the underlying Linux server. After I changed the access list, the console did not show the iptables rules that were left behind. I had to turn off the ACL altogether to remove them.
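For the curious, here is a sketch of why this is possible with MySQL's standard `mysql_native_password` exchange. It is my own illustration and not ScaleArc's implementation. It assumes the proxy issued the salt to the client and can read the stored hash, SHA1(SHA1(password)), from the server; the function names are made up.

```python
import hashlib


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def client_token(password: bytes, salt: bytes) -> bytes:
    """What a MySQL client sends: SHA1(pw) XOR SHA1(salt + SHA1(SHA1(pw)))."""
    stage1 = sha1(password)
    return xor(stage1, sha1(salt + sha1(stage1)))


def recover_stage1(token: bytes, salt: bytes, stored_hash: bytes) -> bytes:
    """Proxy side: undo the XOR to get SHA1(password), then verify it."""
    stage1 = xor(token, sha1(salt + stored_hash))
    if sha1(stage1) != stored_hash:
        raise ValueError("authentication failed")
    return stage1


def token_for_backend(stage1: bytes, backend_salt: bytes) -> bytes:
    """Answer the real server's challenge without knowing the password."""
    return xor(stage1, sha1(backend_salt + sha1(stage1)))
```

The proxy never learns the plain-text password. SHA1(password) is enough to log in to the back-end server on the client's behalf.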
Security Issue
I found a number of security problems with ScaleArc. Not all the web pages were protected by iDB security code. I found many test and support scripts in the GUI application that leaked information. I also found a rather scary program that leaked system files. I reported all of these to ScaleArc. These bugs have been fixed.
Some Application Issues
I found some applications, like WordPress, report “database issues” if they can’t write to the database right away. It seems the first thing these applications do is log the connection. They do not continue to show the page even though the database can be read.
Hammerora Test
All OLTP tests ran without a flaw. What more can I say?
SysBench Test
This test was to review the latency created by iDB. SysBench, iDB and MySQL each ran on a different VM. All VMs ran on the same hardware. The first test was run direct to the MySQL server and the second through iDB. The MySQL server was not load balanced.
[Figure: Transactions per second, direct vs. through iDB]
[Figure: Average transaction time in milliseconds, direct vs. through iDB]
Conclusion
ScaleArc’s iDB works! Unlike with MySQL Proxy, I could not make iDB fail. Applications had no idea they were going through a proxy. Queries were directed to read or write servers as needed. Replication was not affected.
Although the heart of iDB is solid, I did have some trouble with the user interface. I believe iDB may have started life as a command line program. The web UI is good but needs work.
ScaleArc support is 100%. They are very helpful and willing to listen to their customers. I’ve found their support has continued to be great after purchase.
When ScaleArc completes the “Transparent Authentication”, iDB will truly be a drop-in scaling solution.
Rob Smith wrote:
There’s also JMP ( https://github.com/JMPjct/JMPjct ) that comes with an ehcache module. It’s a clean rewrite of mysqlproxy in Java supporting true multithreading.
Link | August 28th, 2012 at 1:24 pm
admin wrote:
Very cool. I’ll have to give this some testing. I have never heard of this project but I love reading source. (really)
Personally, I’m not a Java fan. I think Java is the Cobol of the 21st century. I believe applications like this should be written in C. But I also believe in writing an application and throwing the first source away. Once you know how to write a program (in a quick scripting language), rewrite it in the proper language.
Link | August 28th, 2012 at 1:59 pm
Doug wrote:
ScaleArc is (understandably) tight-lipped about their pricing, but I need to know whether their product is even in the ballpark of affordable before I pitch this to my bosses. Can you drop some hint as to what you paid for your license? Priced per host or per MySQL instance connected to? Was the magnitude hundreds or thousands?
Link | January 14th, 2013 at 12:23 pm
admin wrote:
I’m not a good example for pricing. I negotiated a discount after helping ScaleArc with the product.
We evaluated iDB in VMware for 30 days for free. You should do the same. We purchased two iDB servers with High Availability (HA) for our own hardware. We based our purchase decision on the costs of developer time to do HA, R-W splitting and Caching in code. Think the price of a Big IP server. I can tell you, pricing is based on the number of Servers and Clusters you need.
Link | January 14th, 2013 at 1:51 pm