Getting a grip on storage growth: James Sample, Director of IT Infrastructure for Wiley Publishing
If you're like most people, you probably own at least one For Dummies® reference book. Produced by John Wiley & Sons Publishing, the For Dummies series is extremely popular, boasting more than 125 million volumes in print and comprising a wonderfully broad spectrum of subjects--from stock investing and learning guitar, to network security and SQL, and everything in between. Wiley Publishing's other consumer titles--including the Betty Crocker® cookbooks and Frommer's® travel guides--are also in demand, as are its print and e-publications for the scientific, technical, and medical communities.
But until recently, Wiley's growing success meant the company also faced a growing problem: data management. As a worldwide publisher of nonfiction books, Wiley processes more than 60 titles a month out of its Indianapolis office. It has nearly six million files on its primary file server and routinely experiences 100 percent data growth each year.
"We generate a ton of files every single day that need to be managed," says James Sample, Director of IT Infrastructure at Wiley. "[But] because book content is all unstructured data, we didn't have any of the efficiencies you get when backing up database files. We had to do backup the slow, hard way."
Wiley's policy is to retain the online files associated with each book for 12 months before archiving them; the combined data for a single book can range in size from 1 GB to more than 500 GB. With that much data to keep track of, Wiley's weekly backups of approximately 25 TB of unstructured data, stored on Microsoft Windows-based servers, had become costly and inefficient.
Wiley's backups regularly ran 36 hours and consumed 40 tapes. There was a 48-hour window in which the backups could take place, but with 100 percent growth projected for the next year, the backups were projected to take 72 hours, well beyond that window. IT knew they couldn't handle the increase, and the limited window wasn't the only hitch.
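The 72-hour projection follows from simple arithmetic, sketched below under the assumption that backup duration scales linearly with data volume (the article does not state the model IT used). The function name `projected_backup_hours` is hypothetical.

```python
def projected_backup_hours(current_hours: float, annual_growth: float, years: int = 1) -> float:
    """Project backup duration, assuming it grows in step with data volume.

    annual_growth is a fraction: 1.0 means 100 percent growth per year.
    """
    return current_hours * (1 + annual_growth) ** years


current_hours = 36.0   # weekly backup duration from the article
window_hours = 48.0    # available backup window from the article

next_year = projected_backup_hours(current_hours, annual_growth=1.0)
print(next_year)                  # 72.0
print(next_year > window_hours)   # True: the backup no longer fits the window
```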
"If there was ever a problem with the backup," says Sample, "we ran over. So if we lost a tape drive in the middle of a backup job on a Friday, we had to restart and then there was no time to finish the backup without impacting users on Monday. We were also seeing huge escalating costs. Every year we were adding more tapes, more libraries."
Sample's IT team decided to implement a file virtualization solution to help cut costs and manage the company's runaway data growth. Among other requirements, they knew they needed a solution that would be both seamless and scalable.
"File virtualization works; it offers big rewards in very short order. It's so simple, even we Dummies can figure it out."
James Sample, Director of IT Infrastructure, Wiley Publishing, Inc.,
File Virtualization for Dummies
File virtualization eliminates the static mapping between client and storage resources by decoupling the logical access to files from their physical locations. Once files are virtualized, data is free to move and storage resources can change without the disruption to users or applications normally associated with those tasks. Hundreds or even thousands of mount points are consolidated down to a much more manageable number--making it easy to perform provisioning, consolidation, and migrations without client reconfiguration.
All this dramatically increases management flexibility, optimizes resource utilization, and eliminates "vendor lock-in" by giving organizations the freedom to choose the storage technology that best meets their requirements.
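As a rough illustration of that decoupling, the sketch below keeps a table from logical paths to physical locations, so a file can be migrated without changing the path clients use. It is a toy model only, not how the ARX devices are implemented; the class `VirtualNamespace`, its methods, and the filer names are all hypothetical.

```python
class VirtualNamespace:
    """Toy model: clients use logical paths; the table tracks physical locations."""

    def __init__(self):
        # logical path -> (filer name, physical path on that filer)
        self._locations = {}

    def add(self, logical_path, filer, physical_path):
        self._locations[logical_path] = (filer, physical_path)

    def resolve(self, logical_path):
        """Return the current physical location for a logical path."""
        return self._locations[logical_path]

    def migrate(self, logical_path, new_filer, new_physical_path):
        """Move a file to other storage; the logical path stays the same."""
        if logical_path not in self._locations:
            raise KeyError(logical_path)
        self._locations[logical_path] = (new_filer, new_physical_path)


ns = VirtualNamespace()
ns.add("/books/title-a/ch01.doc", "filer-tier1", "/vol1/title-a/ch01.doc")

# Storage changes behind the scenes; clients keep using the same logical path.
ns.migrate("/books/title-a/ch01.doc", "filer-tier2", "/vol7/archive/title-a/ch01.doc")
print(ns.resolve("/books/title-a/ch01.doc"))
```

Because clients only ever see the logical path, no client reconfiguration is needed when the physical location changes.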
In addition, placement and movement of data can be automated between different tiers (or classes) of storage. By moving unchanging data off expensive, high-performance Tier 1 storage and onto more cost-efficient alternatives, organizations can dramatically reduce the amount of data that needs to be regularly backed up. This reduces backup and recovery windows, as well as backup infrastructure costs, all of which strongly appealed to the Wiley team.
"Right off the bat," says Sample, "we tested several [products] that didn't work and needed to be more seamless." Finally, after looking at "all the file virtualization systems out there," they chose to install a pair of highly available F5 ARX 500 devices on Wiley's primary network.
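The ARX devices enabled the Wiley IT team to reduce backup time and backup costs by tiering the back-end filer into two levels of storage based on each file's last-modified date: Tier 1 for files modified within the past month, and Tier 2 for files that have gone unchanged longer than that.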
Age-based policies automatically move files to the appropriate storage tier without impacting user access. Only files that have been modified within the past month are included in the weekly backup. Files on Tier 2 storage are backed up monthly.
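An age-based policy of this kind can be sketched in a few lines. This is an illustrative model only, not F5's policy syntax; it assumes "the past month" means 30 days, and the names `choose_tier` and `weekly_backup_set` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Assumption: "modified within the past month" is treated as 30 days.
TIER1_MAX_AGE = timedelta(days=30)


def choose_tier(last_modified: datetime, now: datetime) -> int:
    """Return 1 for recently modified files, 2 for files unchanged for longer."""
    return 1 if now - last_modified <= TIER1_MAX_AGE else 2


def weekly_backup_set(files: Dict[str, datetime], now: datetime) -> List[str]:
    """Only Tier 1 files go into the weekly backup; Tier 2 is backed up monthly."""
    return sorted(path for path, mtime in files.items() if choose_tier(mtime, now) == 1)


now = datetime(2008, 6, 30)
files = {
    "/books/title-a/ch01.doc": datetime(2008, 6, 25),  # modified 5 days ago
    "/books/title-b/ch07.doc": datetime(2008, 2, 1),   # unchanged for months
}
print(weekly_backup_set(files, now))  # ['/books/title-a/ch01.doc']
```

A file that is modified again simply gets a newer last-modified date, so the same rule places it back on Tier 1.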
Implementing automated storage tiering has dramatically simplified Wiley's file storage management efforts and lowered storage costs. Weekly backups went from 36 hours to less than one hour. Weekly tape usage went from 40 tapes to just one tape--instantly saving the IT budget more than $2,000 a week.
In addition, the F5 solution hides all storage changes and data movement from clients. "[When users] look in a folder," says Sample, "they see a consolidated view of both tiers. If they access a file that is on Tier 2 and modify it, it is moved back to Tier 1 the instant they are done. If the file goes unchanged for a period of time, it is moved to Tier 2 again. The users access all files the same as they always have. They don't see a difference."
The kind of data storage headache Wiley experienced is pervasive in today's business environment. Nigel Burmeister, Director of Product Marketing for F5 Data Solutions, says analysts predict average annual storage growth rates of more than 50 percent. Many companies exceed that already and are looking at 100 and even 200 percent storage growth annually.
"Plus, around 70 percent of the files in an average enterprise haven't been modified in three months. So if you remove that data from a primary backup data set, the impact on backup times and costs can be pretty dramatic," says Burmeister.
"There are some fundamental trends at work in the data center today," observes Burmeister. "We're creating more new content than ever before. And we're digitizing massive amounts of existing content. If you add to that the fact that we're protecting all of this information--we're also keeping it around for much longer periods of time--the result is that most organizations are facing a data explosion."
That was certainly true for Wiley Publishing. But with the F5 solution, Wiley reduced backup media costs by 90 percent, eliminated the need to spend money on new storage capacity, and shrank backup times by 97 percent. Automated storage tiering means Wiley can efficiently manage its constantly expanding volumes of unstructured data without incurring major infrastructure changes.
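The 97 percent figure is consistent with the weekly backup numbers reported earlier, as the quick check below shows. It treats "less than one hour" as one hour, so the result is a lower bound. The helper `percent_reduction` is hypothetical.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which a value dropped from `before` to `after`."""
    return (before - after) * 100 / before


# Weekly backup time: 36 hours down to (at most) 1 hour.
print(round(percent_reduction(36, 1), 1))  # 97.2

# Weekly tape usage: 40 tapes down to 1 tape.
print(round(percent_reduction(40, 1), 1))  # 97.5
```

The weekly tape count falls by more than the stated 90 percent media savings; the article does not quantify the monthly Tier 2 backups, which may account for the difference.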
A bonus is that the ARX devices offer unmatched scalability; they're uniquely able to scale to fit Wiley's growing storage demands without compromising performance.
As Sample writes in his foreword to File Virtualization for Dummies, it all "came down to scalability, replication, and robustness. We implemented the first [F5] systems in late 2006, and we haven't looked back since."
F5 joined forces with Wiley to produce a File Virtualization for Dummies program, which includes a For Dummies guide and webinar. This program builds on the Wiley and F5 story, with the goal of helping enterprise IT staff get a grip on storage growth and reduce costs.
To learn how file virtualization can help you reduce infrastructure complexity and boost operational efficiency, watch the on-demand webinar File Virtualization for Dummies: How to Get a Grip on Storage Growth and Reduce Costs, featuring James Sample, Director of IT Infrastructure at Wiley Publishing. All registrants will receive a copy of the new File Virtualization for Dummies handbook.