J. Robert Van Pelt and John and Ruanne Opie Library

Data Management

Data and Intellectual Property Rights

Data that is factual has no copyright protection under U.S. law; facts cannot be copyrighted. However, not all data is in the public domain. For example, a project might be built around copyrighted photographs, and these photographs are part of the project’s “data.” But in many cases, the data in a data management system, as well as the metadata describing that data, will be factual and hence not protected by copyright.

The organization of the data in a database, on the other hand, can have a thin layer of copyright protection. Deciding what data needs to be included in a database, how to organize the data, and how to relate different data elements are all creative decisions that may receive copyright protection. Datasets can include other creative ways of documenting and explaining the data, such as annotations or visualizations. Charts and figures, if they are sufficiently original, are protected by copyright. Datasets can also include different subsets of data, some of which are covered by copyright and some of which are not. An example of this would be a collection of CSV files, which are factual and not protected by copyright, alongside a collection of software programs that creatively combine, operate on, and visualize the data.
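As a sketch of how a mixed dataset like this could be documented, the following Python snippet records a rights status for each component in a small manifest. The file names, field names, and status labels are hypothetical choices made for illustration; they are not part of any standard.

```python
import json

# Hypothetical manifest: one entry per component of the dataset.
# Paths, field names, and status labels are illustrative assumptions.
components = [
    {"path": "data/measurements.csv", "kind": "factual data",
     "rights": "not protected by copyright"},
    {"path": "figures/trend_chart.png", "kind": "visualization",
     "rights": "copyrighted"},
    {"path": "code/visualize.py", "kind": "software",
     "rights": "copyrighted"},
]


def needs_license(component):
    """Return True when a component is copyrighted and so needs explicit reuse terms."""
    return component["rights"] == "copyrighted"


# Serialize the manifest so it can be stored next to the dataset.
manifest = json.dumps(components, indent=2)

# List the components for which the owner should state license terms.
to_license = [c["path"] for c in components if needs_license(c)]

print(manifest)
print("Components needing a license:", to_license)
```

Keeping the rights status next to each component makes it clear to reusers which parts are free to use as facts and which parts carry terms of their own.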

Licensing Your Research Data

In order to facilitate the reuse of data, it is imperative that others know the terms on which you are making both the database and the data content available. Data producers may want to consider applying a Creative Commons (http://www.creativecommons.org) license. These are free, standardized licenses and some of them can be applied to data and databases. Digital Commons @ Michigan Tech can also easily apply a Creative Commons license.

The three CC tools that are of greatest relevance to data management are:

  1. CC0 (i.e., "CC Zero"): When an owner wishes to waive their copyright and/or database rights, the CC0 mark can be used. It effectively places the database and data into the public domain.
  2. Public Domain mark (PDM): It is used to mark works that are in the public domain, and for which there are no known copyright or database restrictions. Factual data in a database, for example, might be flagged as PDM in order to make it clear it is free to use.
  3. CC BY: It is used when an owner wishes to allow their copyrighted work to be reused and shared on the condition that appropriate credit is given.

However, CC BY licenses are based on copyright ownership of the underlying work, so they may offer little control over factual data that carries no copyright. Fortunately, the Open Data Commons group (http://opendatacommons.org/) has been developing legally binding tools to govern the use of data sets. Using a combination of copyright and contractual standards, they have created three standard licenses that can be used in conjunction with data projects.

The three ODC licenses are:

  1. Public Domain Dedication and License (PDDL): This dedicates the database and its content to the public domain, free for everyone to use as they see fit. It is the functional equivalent of CC0.
  2. Attribution License (ODC-By): Users are free to use the database and its content in new and different ways, provided they provide attribution to the source of the data and/or the database. The ODC-By license is the equivalent of a Creative Commons Attribution license (CC BY).
  3. Open Database License (ODC-ODbL): ODbL stipulates that any subsequent use of the database must provide attribution, an unrestricted version of the new product must always be accessible, and any new products made using ODbL material must be distributed using the same terms. It is the most restrictive of the three ODC licenses.
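
The choices described in the two lists above can be summarized as a small decision helper. This is a minimal sketch in Python; the function name `suggest_tools` and its three boolean parameters are hypothetical, and the mapping simplifies the guidance above rather than replacing a reading of the license texts.

```python
def suggest_tools(in_public_domain, waive_rights, require_share_alike):
    """Map reuse goals to the CC and ODC tools named in this guide.

    All three parameters are illustrative assumptions:
      in_public_domain    -- the material already has no known restrictions
      waive_rights        -- the owner wants to give up copyright/database rights
      require_share_alike -- derived products must be shared on the same terms
    """
    if in_public_domain:
        # Nothing to license; only mark the material as free to use.
        return ["PDM"]
    if waive_rights:
        # Dedicate the database and its content to the public domain.
        return ["CC0", "PDDL"]
    if require_share_alike:
        # Attribution plus same-terms redistribution.
        return ["ODC-ODbL"]
    # Default case: reuse allowed as long as credit is given.
    return ["CC BY", "ODC-By"]


print(suggest_tools(False, False, True))   # prints ['ODC-ODbL']
print(suggest_tools(False, True, False))   # prints ['CC0', 'PDDL']
```

The checks run from least to most restrictive intent, so a caller who both waives rights and asks for share-alike gets the public domain tools, since waived rights cannot carry conditions.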

Open Definition maintains a list of licenses, including CC and ODC licenses, that can be used for data.