There’s an institutionalized, seemingly intractable and ever-widening gap in enterprise IT between data growth and the ability of high performance data analytics (HPDA) to access more than a small piece of that data and convert it into “actionable insight.” Sadly, there’s plenty of life left in the trite, tiresome old truism about “drowning in data and starving for insight.” As things stand today, organizations still leverage only about 10 percent of their data, and data volume growth (and the emergence of new data types) continues to outpace the development of HPDA technology.
Same as it ever was.
However, immense effort and investment are targeting the treasure trove in the totality of corporate data. We hear talk of future systems that will contain a decade – yes, a decade – of data in active memory, and we see the upward climb in machine learning/deep learning algorithms capable of taking on bigger and increasingly complex problems. As with complex systems (see related story), the challenge cries out for the power of AI.
This is the basis of a content analytics partnership announced this week by archival storage veteran Iron Mountain and Google Cloud. The goal: deliver “AI-powered SaaS solutions on (Google Cloud Platform) to help organizations analyze their vast physical and digital information and data repositories, to unlock insights, improve decision making and create new revenue streams.”
Boston-based Iron Mountain, founded in 1951, holds 650 million cubes of archived information from 95 percent of the Fortune 1000 in 1,400-plus facilities around the world. The company said it expects to deliver new subscription-based content analytics services built on GCP this September.
Jim O’Dorisio, a 30-year veteran of the storage industry, was hired by Iron Mountain last year as SVP of emerging commercial solutions to help lead the company’s digital strategy development. “We’re known for data management solutions relative to compliance, governance, policy and records management,” he told EnterpriseTech, “and we started looking at those elements and at information and customer needs, and we realized we wanted to find a way to help customers monetize all that information. So we started defining a platform that would combine what Iron Mountain is known for doing with the ability to unlock insights in the information and enable revenue streams for our customers.”
He said Google’s AI technology and cloud service made the tech giant a good fit, and that the platform will run on Google Cloud and interoperate with other public cloud services along with the company’s own Iron Cloud.
Key to the new offering is its ability to ingest content from multiple sources and leverage metadata for search. Metadata contained in massive stores of object-based data is a critical enabler for high performance, machine learning-based analytics.
“Object storage was uniquely designed to store metadata with an object,” said O’Dorisio. “It’s very difficult to store metadata in flat file systems. So object storage itself lends itself to storing metadata. And in our case we have an extensible database that allows us to store lots of metadata.”
On the ingestion side, O’Dorisio cited the platform’s ability to take in not just digital data but also data from scanned physical documents, along with content off tape. “A key thing for us is the extraction of metadata, such as the legend off of a map, and store the content of that legend as metadata. That information is not available electronically. So the trick is our ability to use the machine learning algorithms to extract the metadata that can then be used to search for those assets later.”
Search can be conducted across multiple metadata fields, such as asset type, date, author and location. “All that content is brought up in a thumbnail so you can easily search and open the content, it’s very responsive, easy to access and customizable.”
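For readers who want a concrete picture of multi-field metadata search, here is a minimal Python sketch. It is an illustration only and does not represent Iron Mountain’s or Google’s implementation, which the companies have not detailed. The names StoredObject and search, the metadata field names and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class StoredObject:
    """An archived asset kept together with its descriptive metadata."""
    key: str
    payload: bytes
    metadata: dict = field(default_factory=dict)


def search(objects, **criteria):
    """Return the objects whose metadata matches every given field exactly."""
    return [
        obj for obj in objects
        if all(obj.metadata.get(name) == value for name, value in criteria.items())
    ]


# Hypothetical sample records; the payloads are placeholders.
archive = [
    StoredObject("map-001", b"", {"asset_type": "map", "author": "J. Doe",
                                  "location": "Texas", "date": date(1987, 5, 1)}),
    StoredObject("map-002", b"", {"asset_type": "map", "author": "A. Roe",
                                  "location": "Alberta", "date": date(1992, 3, 9)}),
    StoredObject("log-001", b"", {"asset_type": "well log", "author": "J. Doe",
                                  "location": "Texas", "date": date(2001, 7, 4)}),
]

# Combine several metadata fields in one query.
hits = search(archive, asset_type="map", location="Texas")
print([obj.key for obj in hits])
```

Because each object carries its own metadata, adding a new searchable field in this sketch means adding a key to the dictionary rather than changing a schema. A production system would index those fields instead of scanning every object.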
The platform will be offered by market verticals. “It’s very important to classify the content, so if we’re in mortgage, for example, there might be 20 different forms, our ability to classify each of those forms and to know how to extract metadata from those forms is key,” said O’Dorisio.
He also cited the seismic industry and its need to access subsurface data, such as core samples, well logs and drilling reports. “The ability to ingest that content, classify and extract metadata from it and give the user the ability to search and sort all that content easily has huge value as they try to understand the value of their assets.”
The goal of the platform, O’Dorisio said, is to combine storage, data management and AI capabilities that would otherwise pose major obstacles for organizations trying to build home-grown solutions. “It’s pretty difficult for our customers to do this,” he said. “Many of them hired data scientists and they’re looking to use machine learning but it’s not easy and those skill sets are hard to come by.”
“We’ve heard from many companies about the challenges of trying to manage data and utilize information for insights,” said Holly Muscolino, research vice president, content technologies and document workflow, at analyst firm IDC. “They have too much of it to analyze, and when they try to, it’s highly complex and resource-intensive to do on their own. The combined expertise, technology and customer relationships that Iron Mountain and Google Cloud bring to the table is well positioned to unlock this potential.”