A New Model for File Cleanup

June 23, 2017 5:30 am   |   1 Comment

Kon Leong, President, CEO and Co-founder, ZL Technologies

In The Life-Changing Magic of Tidying Up, Marie Kondo argues that rather than repeatedly tidying your house, a better strategy is to clean it only once, properly. In fact, by learning how to clean it right the first time and shifting your state of mind, you can adopt new behaviors that will enable you to keep it clean long-term.

Though cleaning enterprise file shares and repositories is somewhat more complicated, similar principles apply. It doesn’t make sense to return to file cleanup every so often, indefinitely. It’s simply not practical or cost-effective. Yet this is how far too many organizations approach file management. What these companies need is not another annual spring cleaning; they need a new strategy.

With that in mind, here are some suggestions for maintaining file systems over the long run.


Though the idea of cleaning the entirety of enterprise file shares can be daunting, the currently available analytics capabilities make it much more attainable than many would expect. Experience has shown that 80% of insight from file analysis comes from metadata alone. When analyzed together, information such as creation date, author and file type can direct organizations to delete large volumes of redundant, obsolete and trivial data (ROT). This is a great first step for long-term management of enterprise files. Getting rid of unused files can also reduce unneeded liability presented by security breaches, insider threats, and ransomware attacks such as WannaCry. In this case, the old IT mentality may be appropriate: The less you have, the less you have to protect.
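As a rough illustration of how far metadata alone can go, the Python sketch below flags ROT candidates without opening a single file. Everything specific in it is an assumption made for illustration: the three-year cutoff, the list of "trivial" extensions, the `FileMeta`, `scan` and `classify_rot` names, and the use of last-modified time as a stand-in for creation date (which is not available portably). It is not a description of any particular product.

```python
import os
import time
from collections import defaultdict
from dataclasses import dataclass

# Illustrative thresholds and extension list; assumptions, not values from the article.
OBSOLETE_AFTER_DAYS = 3 * 365
TRIVIAL_EXTENSIONS = {".tmp", ".bak", ".log"}


@dataclass(frozen=True)
class FileMeta:
    path: str
    size: int
    mtime: float  # last-modified time, seconds since the epoch


def scan(root):
    """Collect metadata only; file contents are never opened."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            records.append(FileMeta(path, st.st_size, st.st_mtime))
    return records


def classify_rot(records, now=None):
    """Return ROT *candidates* for human review, keyed by category."""
    now = time.time() if now is None else now
    cutoff = now - OBSOLETE_AFTER_DAYS * 86400
    by_name_size = defaultdict(list)
    report = {"redundant": [], "obsolete": [], "trivial": []}
    for rec in records:
        ext = os.path.splitext(rec.path)[1].lower()
        if rec.size == 0 or ext in TRIVIAL_EXTENSIONS:
            report["trivial"].append(rec.path)
        elif rec.mtime < cutoff:
            report["obsolete"].append(rec.path)
        key = (os.path.basename(rec.path).lower(), rec.size)
        by_name_size[key].append(rec.path)
    for paths in by_name_size.values():
        if len(paths) > 1:
            # Same name and size suggests a copy; keep the first path, flag the rest.
            report["redundant"].extend(sorted(paths)[1:])
    return report
```

Calling `classify_rot(scan(some_share_path))` would yield three lists of paths to review. Matching on name and size is only a hint that two files are copies; a content hash would be needed to confirm it before anything is deleted.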

Consider Legal

In a recent survey by Osterman Research, only 39 percent of organizations were prepared to store, retain or produce word processing, spreadsheet or presentation files. With such documents so often becoming unofficial business “records,” these employee-created files can be a major blind spot for legal departments that don’t have access to them. Organizations that are able to classify, govern and ultimately discover these documents give counsel a major leg up in early case assessment and eDiscovery.

My advice, therefore, is to give special consideration to your in-house counsel when devising a management strategy for enterprise files. Organizations should have a process in place for identifying, classifying and governing business-related documents, which often lie scattered across enterprise systems, and should ensure those documents are searchable for legal. As a crucial component of the initiative, organizations may wish to implement automated remediation policies to keep important documents accessible and to prevent clutter from accumulating over time. This goes hand in hand with having an ongoing governance strategy, which should be clearly defined before organizations begin a cleanup initiative.
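An automated remediation policy can be pictured as a small set of rules that maps each classified document to an action. The Python sketch below is one possible shape, under stated assumptions: the categories, the retention periods, the legal-hold flag, and the `Policy`, `Document` and `decide` names are all hypothetical, and real retention schedules would come from counsel rather than from code.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class Policy:
    category: str
    retain_days: int


# Hypothetical categories and retention periods; real values come from counsel.
POLICIES = {
    "contract": Policy("contract", 7 * 365),
    "working_draft": Policy("working_draft", 365),
}


@dataclass(frozen=True)
class Document:
    path: str
    category: Optional[str]  # None means not yet classified
    last_modified: date
    legal_hold: bool = False


def decide(doc, today):
    """Return 'retain', 'review', or 'delete' for one document."""
    if doc.legal_hold:
        return "retain"  # a hold always overrides the schedule
    policy = POLICIES.get(doc.category)
    if policy is None:
        return "review"  # unclassified or unknown category: route to a person
    expires = doc.last_modified + timedelta(days=policy.retain_days)
    return "delete" if today > expires else "retain"
```

Sending unclassified documents to review rather than deleting them by default is a deliberate choice in this sketch: it keeps the automation from removing anything the governance strategy has not yet accounted for.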

Think About the Future

With the General Data Protection Regulation (GDPR) approaching, organizations that process EU resident data will be accountable for a new set of standards. Among these are obligations to obtain explicit consent from subjects for the processing of personal data, to respond to subject requests for deletion, and to delete personal data once it is no longer being used for its original purposes. With fines of up to 20 million EUR or 4% of annual worldwide turnover, whichever is higher, GDPR is a top priority for 92 percent of US organizations, according to PwC research. However, organizations focusing on the structured database systems that typically hold consumer information can forget to look in less obvious spots, such as file shares and SharePoint sites.

For these more elusive unstructured systems, in-place advanced sampling can illuminate areas that hold personal data and demand more attention. From there, indexing and content analysis allow organizations to perform enterprise-wide searches for personally identifiable information, health information, and credit card information. The ability to find such sensitive data on the fly will become increasingly valuable once GDPR takes effect in May 2018, and organizations that want to be ready by then should get an early start.
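To make the content-analysis step concrete, here is a minimal Python sketch of pattern-based detection over text that has already been extracted from a file. It is an illustration under assumptions, not how any particular product works: the two regular expressions are deliberately simple, the function names are hypothetical, and health information is not covered at all. Candidate card numbers are filtered with the Luhn checksum to cut down on false positives from arbitrary digit runs.

```python
import re

# Deliberately simple patterns for illustration; production rules are far broader.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive(text):
    """Return (kind, value) pairs for email addresses and plausible card numbers."""
    hits = []
    for m in EMAIL_RE.finditer(text):
        hits.append(("email", m.group()))
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        if luhn_valid(digits):
            hits.append(("card", digits))
    return hits
```

Run over a sample of files first, a scan like this can indicate which shares deserve full indexing. A hit is only a lead for review, since a Luhn-valid digit string is not necessarily a real card number.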

Look at the Bigger Picture

Even now, many organizations still forgo file cleanup, often simply out of fear of deleting important documents. While understandable, this outlook has risks of its own. For instance, finding relevant documents in a sea of ROT under eDiscovery deadlines can be like trying to find a needle in a haystack. Neglecting management of enterprise files puts undue pressure on your legal team and can significantly increase already-astronomical review costs.

Furthermore, when GDPR goes into effect, organizations will be obligated to delete personal data when it’s no longer needed, as well as to find and take action on personal data upon request. There will be an expectation of best efforts, and an organizational policy to not delete files is unlikely to cut it.

Meeting these requirements doesn’t happen overnight. Organizations may be best served by taking a holistic governance approach that accounts for both email and files, while implementing “privacy by design” at the architectural level. In this approach, organizations can begin to consolidate data silos and regain control of their data, while ensuring they have the framework to properly manage personal data.

In her philosophy of cleaning, Marie Kondo says that learning how to tidy up properly can spark lifestyle changes that lead to a long-lasting, positive impact. It’s these changes, not the initial cleaning, that keep the house in order over time. Her reasoning has resonated with a lot of people, and for good reason. It may be high time for organizations to adopt a similar philosophy.


Kon Leong, President, CEO and Co-founder, ZL Technologies, is responsible for managing all aspects of the business, including strategy, finance, sales and marketing. Earlier, Kon was co-founder and president of GigaLabs, a vendor of high speed networking switches. Prior to that, Kon was First Vice President of Mergers and Acquisitions at Deutsche Bank. He was at the General Motors Treasurer’s Office in New York City, where he managed GM’s venture capital investments in high tech. He also spent eight years in various IT engineering and management positions at Burroughs, Philips and Union Bank. Kon earned an MBA with Distinction from the Wharton School and received an undergraduate degree in Computer Science from Concordia (Loyola) University, after completing a year at the Indian Institute of Technology.


One Comment

  1. Ed Rawson
    Posted August 23, 2017 at 10:07 am | Permalink

    This is a very good article. However, I am not sure how new this Model for File Cleanup is. I have been using analytics for over 8 years now to clean up ROT and sensitive information. Today the new model is a more “Holistic” approach that cleanses, classifies, semantically tags metadata, and manages the lifecycle of all your information.
