Data Management

Traditional Data Management Going the Way of the Mainframe

The big data technology and services market is expected to reach $48.6 billion in annual spending by 2019. That growing spend brings business intelligence (BI) and analytics requirements that are breaking traditional data warehousing platforms. In a recent study, 46 percent of the senior executives surveyed acknowledged that their traditional technologies were not architected to handle modern workloads.

The problem is compounded by unbounded data growth, and new data sources only make the quest for accurate insights more difficult. These traditional architectures forced more than 25 percent of respondents to discard data just to get analytic insights, because the systems couldn’t scale to process the volume of data being collected. In addition, 31 percent of respondents said traditional architectures are not designed for new workloads, and 29 percent said it is too expensive to scale them. Customers are suffering from the innovator’s dilemma these traditional vendors find themselves in. Is this the tipping point?

More data has been created in the last few years than in all of prior human history, and the pace is not slowing down. It has become imperative for organizations to access, process, and analyze troves of data in real time to make effective business decisions in a contextual and timely manner, yet it is no longer viable for business leaders to fall back on traditional data management platforms. Many have started to turn their attention from static legacy frameworks to more modern solutions like Hadoop and Spark, but have found limited success. While these may be cost-effective tools for storing and sifting through massive amounts of data, 30 percent of respondents in the aforementioned study said these new Hadoop analytics architectures are not yet ready to provide the enterprise-grade performance needed to effectively execute advanced real-time analytics.

Amid this platform stagnation, BI tools are also suffering, and the one-two punch is leaving businesses overwhelmed on two fronts. Just 40 percent of survey respondents said their BI tools were working well with historical data sets, and 32 percent acknowledged these tools were overmatched by increasing data volumes.

The suffering extends beyond functionality, ultimately forcing enterprises to pay increasing sums for solutions that grow less effective over time: 47 percent of respondents said the cost to maintain traditional systems continues to rise. Still, companies are wary of ripping out and replacing their faltering legacy solutions: only 32 percent of survey respondents indicated that they would supplant them with a modern tool, largely because organizations have already invested so much in their existing data management platforms and are reluctant to write them off as a loss. Instead, 62 percent of respondents have decided to augment these traditional systems with modern architectures to meet the needs of their modern analytic workloads. This often results in relegating traditional systems to transaction processing workloads – much like what happened in the shift from mainframes to x86-based servers.

Opportunity in the Downfall of Traditional Management Platforms

There is still opportunity as businesses navigate the challenging waters of the modern data era. Customer analytics and Internet-of-Things strategies offer a clear opening, but organizations first must overcome the barriers created by the breakdown of legacy solutions and find ways to better manage the growing strain of big data. Until then, they will continue to sit on a wealth of untapped data, limited by the commercial and technical constraints of their traditional management systems. So what will it take to break free?

There is no magic bullet, but the best answer lies in combining existing and new technologies. Businesses want to apply their domain expertise and continue leveraging their legacy investments while still enjoying the benefits of more modern environments like Hadoop – and there is a way to do this through solutions like SQL-in-Hadoop. Hadoop can already address the issues of cost and scale for data storage. All that’s left is turning it into a big data analytics platform fit for the enterprise.

Fortunately, existing applications and queries based on SQL – the lingua franca of relational databases – don’t have to be rewritten to work with Hadoop, and the data doesn’t have to be moved out of Hadoop either, making it the perfect complement to legacy deployments. Using SQL also lets users keep their existing BI and visualization tools, not to mention existing dashboards and reports. In all, it allows organizations that have invested extensively in legacy data management platforms to retain those systems and maximize their ROI while meeting the demands of analyzing today’s big data with more modern solutions.
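
To make the pattern concrete, here is a minimal sketch of the SQL-on-Hadoop approach, assuming Spark SQL as the engine; the HDFS path, table name, and columns are hypothetical, and other SQL-on-Hadoop engines follow the same idea: the query stays in SQL while the data stays in Hadoop.

    # A minimal sketch: running existing SQL directly over data stored in HDFS,
    # using Spark SQL. The path, table name, and column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sql-on-hadoop-sketch").getOrCreate()

    # Point at data already sitting in Hadoop; nothing is copied out of HDFS.
    orders = spark.read.parquet("hdfs:///warehouse/orders")
    orders.createOrReplaceTempView("orders")

    # The same SQL that ran against the legacy warehouse, unchanged.
    result = spark.sql("""
        SELECT region, SUM(revenue) AS total_revenue
        FROM orders
        GROUP BY region
        ORDER BY total_revenue DESC
    """)

    result.show()   # the output can feed existing BI dashboards and reports
    spark.stop()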

Just like the mainframes that could not keep up with compute and scale requirements, traditional data management systems are floundering under the current waves of data workloads. They are pinned by architectural limitations and expensive commercial models. But given the financial realities – and the lack of an enterprise-ready modern alternative – they will not be replaced soon. Perhaps it’s time for organizations to invest those dollars in a different approach, one that leverages new technology to complement the faltering data management platforms that have proven so hard to replace.

Ashish Gupta joined Actian in 2013, where he is responsible for marketing and business development. Ashish brings more than 21 years of experience at enterprise software companies, where he focused on creating go-to-market approaches that scale rapidly and building product portfolios that became category leaders in the industry. Ashish was formerly at Vidyo, where the company grew into the leading software-based videoconferencing platform, was named to the Wall Street Journal’s “Next Big Thing” list for three years, and was selected as a “Tech Innovator” by the World Economic Forum.

Previously, Ashish led the Business Development and Strategy, Marketing, and Corporate Sales teams for the Microsoft Office Division’s Unified Communications Group, where he was responsible for introducing the industry-leading Microsoft Lync product. Prior to Microsoft, Ashish was VP of Product and Solutions at Alcatel/Genesys Telecommunications, VP of Marketing and Business Development at Telera Inc. (acquired by Alcatel), and a management consultant at Braxton/Deloitte Consulting. He also held marketing leadership positions at HP and Covad. He holds an MBA from UCLA and a bachelor’s degree in Economics and Computer Science from Grinnell College.

Three Ways Enterprises Can Eliminate Useless Data

It’s an often-repeated adage in the business world that an organization’s information is its most valuable resource. But do we know what kinds of data corporations are actually storing? This may seem like a simple question to answer, but with the explosion of corporate data, most enterprises are unsure what data they have, where it is stored, and even what value it holds for the organization.

According to Veritas’ inaugural Data Genomics Index – a study that analyzed billions of files within actual companies’ storage environments – 41 percent of files within the average enterprise have not been modified in the last three years. And, even worse, 12 percent of files haven’t been opened in the last seven years. To put that into perspective, if 41 percent of data is stale, it means that 9.5 billion files in a 10PB environment have not been touched in more than three years.
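
A quick back-of-the-envelope check shows what those ratios imply; the total file count and average file size below are derived from the figures quoted above, not taken from the report.

    # Back-of-the-envelope check of the stale-data figures (derived, not from the report).
    STALE_FRACTION = 0.41        # share of files unmodified in the last three years
    STALE_FILES = 9.5e9          # stale files quoted for a 10PB environment
    ENVIRONMENT_BYTES = 10e15    # 10 PB

    total_files = STALE_FILES / STALE_FRACTION        # roughly 23 billion files overall
    avg_file_size = ENVIRONMENT_BYTES / total_files   # roughly 430 KB per file

    print(f"Implied total files:       {total_files:,.0f}")
    print(f"Implied average file size: {avg_file_size / 1e3:,.0f} KB")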

These findings are so shocking that one would think IT leaders were unaware of this potentially wasteful behavior. As it turns out, they had a hunch it was happening but were blind to the details. An additional study from Veritas, the Databerg Report, revealed that global IT leaders believe only 15 percent of their data has any business value. The remaining 85 percent is classified either as redundant, obsolete, or trivial (ROT) or as “dark data,” whose value is unknown – it could be critical to the business or completely worthless.

The lack of visibility into the composition of enterprise environments restricts IT leaders to a singular information management approach: assigning resources based purely on the volume of data stored rather than based on the actual value of the information to the business.

With this information management model, it’s easy to see how storage budgets can quickly get out of hand. For example, with more than 40 percent of the storage environment unmodified in three years, the average enterprise could spend as much as $20.5 million storing potentially unused data. Beyond the storage costs, the sheer clutter makes it harder for IT departments to identify valuable information in their environment that is potentially at risk.

To understand the immensity of the decision-making challenge this presents, consider the perspective of an industry that works file by file – legal document review. Contract-review attorneys churn through 50 documents an hour. At that pace, the average stale environment would take a little under 22,000 years for one reviewer to clean up. Alternatively, you could employ 22,000 contract attorneys for the next 365 days and pay them roughly $5.4 billion to clean up all the data. That is slightly more expensive than simply moving the whole 10PB to Google Nearline for about $100,000 a month.
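
The arithmetic behind those figures runs roughly as follows; it assumes round-the-clock review and Nearline’s then-current list price of $0.01 per GB per month, and the attorney rate and the implied per-GB cost of the $20.5 million figure are backed out from the numbers quoted above rather than taken from the reports.

    # Rough arithmetic behind the document-review comparison (assumptions noted inline).
    STALE_FILES = 9.5e9                  # stale files in the 10PB environment
    DOCS_PER_HOUR = 50                   # contract-review pace quoted above
    HOURS_PER_YEAR = 24 * 365            # assumes round-the-clock review

    review_hours = STALE_FILES / DOCS_PER_HOUR              # ~190 million attorney-hours
    years_for_one_reviewer = review_hours / HOURS_PER_YEAR  # a little under 22,000 years
    implied_hourly_rate = 5.4e9 / review_hours              # ~$28/hour to reach $5.4 billion

    # Parking the data instead: 10 PB is roughly 10 million GB, so Google Nearline
    # at $0.01 per GB per month comes to about $100,000 a month.
    nearline_per_month = 10e6 * 0.01

    # The $20.5 million storage figure implies roughly $5 per GB for the 4.1 PB of
    # stale data, presumably a fully loaded cost rather than a raw disk price.
    implied_cost_per_gb = 20.5e6 / (0.41 * 10e6)

    print(f"Review effort: {review_hours:,.0f} hours (~{years_for_one_reviewer:,.0f} years)")
    print(f"Implied attorney rate: ${implied_hourly_rate:,.2f}/hour")
    print(f"Nearline: ${nearline_per_month:,.0f}/month; implied ${implied_cost_per_gb:,.2f}/GB")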

This hypothetical scenario may be exaggerated, but it hits at the heart of the information-growth conundrum. When facing an overwhelming number of information management decisions and drastically discounted storage costs, how can IT leaders break away from inefficient information management practices and catalyze change within their organization?

The Databerg Report surfaced the beginnings of a path forward. It’s imperative for IT leaders to manage data based on its business value, not on the associated volume. This approach will free up budget through basic deletion of the ROT data and allow the enterprise to change its culture by taking the following steps:

    • Look to overrepresented file types. Traditional “office” formats like presentations, documents, text files, and spreadsheets account for 20 percent of the total stale population, so an archiving project focused just on these formats can cut costs by $2 million.

 

    • Understand the risk of orphaned data. Five percent of the average environment is orphaned data – data without an active associated owner, typically the result of departed employees. Compared to the normal distribution of file types, orphaned data is significantly more content rich: heavier in size and typically in the form of presentations, images, videos, and spreadsheets. It is also more likely to contain sensitive intellectual property, payment card industry (PCI) data, personally identifiable information (PII), and customer information.

 

    • Create, implement, and enforce classification policies on users’ data. This can be difficult, but it’s necessary to remain compliant and manage risk. Using classification to understand the basic characteristics of the environment makes it easier to see where critical information resides and who can access it. It’s also important to ensure, through regular training, that employees understand enterprise data policies. A minimal sketch of the kind of environment scan that supports such classification follows this list.
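
As a starting point for that kind of visibility, the sketch below walks a file tree and bins files by last-modified age and type. It is only an illustration of the idea, not any vendor’s product: the root path and the three-year staleness threshold are assumptions, and a real file-analysis deployment would add ownership, access, and content classification on top.

    # A minimal, illustrative scan of a file tree: counts files that have not been
    # modified in roughly three years and totals their size by extension.
    # The root path and the staleness threshold are assumptions for this sketch.
    import os
    import time
    from collections import Counter

    ROOT = "/mnt/fileshare"                  # hypothetical share to analyze
    STALE_SECONDS = 3 * 365 * 24 * 3600      # "stale" = unmodified for ~3 years

    now = time.time()
    total_files = stale_files = 0
    stale_bytes_by_ext = Counter()

    for dirpath, _, filenames in os.walk(ROOT):
        for name in filenames:
            try:
                st = os.stat(os.path.join(dirpath, name))
            except OSError:
                continue                     # skip files that vanish or deny access
            total_files += 1
            if now - st.st_mtime > STALE_SECONDS:
                stale_files += 1
                ext = os.path.splitext(name)[1].lower() or "<none>"
                stale_bytes_by_ext[ext] += st.st_size

    print(f"{stale_files} of {total_files} files unmodified in ~3 years")
    for ext, size in stale_bytes_by_ext.most_common(10):
        print(f"{ext:>8}: {size / 1e9:,.1f} GB stale")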

 

By understanding the basic composition of the storage environment, organizations can focus their energy on smart classification, archiving, deletion, and data migration efforts. Regardless of where they start, organizations need this basic level of visibility to prioritize information governance efforts and begin to rein in the crippling growth that enterprise data storage environments are currently experiencing.

Chris Talbott is Sr. Product Marketing Manager at Veritas. He works to bring Veritas File Analysis and Protection products to market and leads the Data Genomics Project. Before managing product marketing for the File Analysis portfolio, Talbott focused on the eDiscovery product line at Symantec, handling marketing and writing and speaking at industry events on predictive coding and eDiscovery. Talbott joined Symantec from Clearwell Systems, where he helped grow Clearwell into one of Sequoia Capital’s most profitable portfolio companies. He graduated from the University of California, Berkeley with a degree focused on Globalization and Consumer Behavior.