Start-Up Empowers 'Citizen Data Scientists'

Taking a cue from the growing citizen developer movement, a Big Data start-up is courting "citizen data scientists" with a new product for self-service analytics.

Where that DIY developer approach typically leverages low-code/no-code, visual, point-and-click tools, Waterline Data is taking the data catalog route: this week it unveiled Smart Data Catalog 3.0, which it describes as the industry's most comprehensive data catalog.

The 2013 start-up based in San Jose, Calif., says it "empowers business analysts and data scientists to find, understand, and provision the trusted data sets they need to do self-service data preparation and analytics."

Specifically, the company said in a news release Monday, its Smart Data Catalog "allows organizations to replace manual tagging of metadata with an automated process that rapidly classifies the data assets in their data lake, including new data even as it's created."

Rather than relying on typical cataloging approaches -- such as scanning historical SQL logs -- the Waterline Data tool catalogs all of the files and fields in an enterprise data lake, capturing "tribal knowledge" in the process.

"With the company's latest version of Smart Data Catalog, data engineers, data scientists and business analysts get even easier self-service access to trusted, high-quality data for faster discovery, understanding, use and governance," the company said.

The Waterline Data Smart Data Catalog (source: Waterline Data)

The catalog comes in three editions: a free Community Edition, plus Professional and Enterprise subscription bundles. It works on platforms from Cloudera Inc., Hortonworks Inc., MapR Technologies Inc., Pivotal Software Inc. and Amazon Web Services Inc.

Highlights of the 3.0 edition listed by the company include:

  • Data browsing by business category. A new catalog view includes enhanced self-service that allows business analysts to browse data by business category.
  • Integration with the Cloudera Navigator and Hortonworks Atlas data governance frameworks. Customers can now accelerate data discovery, governance and time to value through Waterline Data's smart data discovery capabilities. Both frameworks are automatically updated with all the metadata Waterline Data uncovers, and all sensitive data and data lineage are cataloged to ensure data compliance and trust.
  • Universal integration with analytics, visualization and more. Support for integration with any analytics, visualization or other third-party tool offers flexibility and the ability to derive benefits from innovative applications.

"Waterline Data is giving many organizations the green light on Hadoop, whether it's expanding on projects or bringing it into the enterprise for the first time," said CEO Alex Gorelik. "This means organizations can sharpen their competitive edge by putting their data to work faster and in more cutting-edge ways."

About the Author

David Ramel is an editor and writer for Converge360.