OPEN DATA HANDBOOK California Health and Human Services Agency

Purpose of the CHHS Open Data Handbook

The California Health and Human Services (CHHS) Open Data Handbook provides guidelines to identify, review, prioritize and prepare publishable CHHS data for public access via the CHHS Open Data Portal, with a foundational emphasis on value, quality, data and metadata standards, and governance. This handbook is meant to serve as an internal resource. It is also freely offered to any party interested in improving the general public’s online access to data or in understanding the processes by which CHHS makes its publishable data tables available. The handbook focuses on general guidelines and thoughtful processes but also provides linked tools and resources that operationalize those processes. The CHHS Open Data Handbook is based on and builds upon the New York State Open Data Handbook, and we would like to acknowledge and thank the New York staff who created that document and made it available for public use.

The breadth of data and the participation by departments and offices within CHHS are continually being enhanced and expanded, making open data a dynamic, living initiative. This handbook provides guidelines for broad publication of publishable state data in electronic, machine-readable form. It is the first step in a major shift in the way CHHS departments and offices share information publicly to promote efficiency, accessibility and transparency, and a significant improvement in the way CHHS engages citizens and fosters innovation and discovery in the scientific and business communities. It begins the process of standardizing the state’s data, which will make the data easier to discover and use. This handbook will be supplemented, as needed and in collaboration with others, with technical and working documents addressing specific formatting, data preparation, data refresh and data submission requirements. CHHS and its departments and offices will use this handbook in their work as they consider the various perspectives involved in governing business processes, data, and technology assets.

Key Definitions

These four terms are highlighted because they are frequently used throughout this document. Additional terms and definitions are listed in the Glossary.

Data: A value or set of values representing a specific concept or concepts. Data includes but is not limited to lists, tables, graphs, charts, and images. Data may be structured or unstructured and can be digitally transmitted or processed.

Dataset: An organized collection of related data records maintained on a storage device, with the collection containing data organized or formatted in a specific or prescribed way, often in tabular form. In this handbook the dataset refers to the master, primary, or original authoritative collection of the data.

Data Table: A data table, in this handbook, refers to a subset of the dataset which may include a selection and/or aggregation of data from the original dataset.
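
The relationship between a dataset and a data table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (county, year, visits), the record values, and the helper build_data_table are hypothetical and do not come from any CHHS system. The sketch shows a master dataset being narrowed by selection (one year) and summarized by aggregation (totals per county) to produce a data table, while the original dataset is left unchanged.

```python
from collections import defaultdict

# Hypothetical master dataset: illustrative records only, not real CHHS data.
dataset = [
    {"county": "Alameda", "year": 2013, "visits": 120},
    {"county": "Alameda", "year": 2014, "visits": 135},
    {"county": "Alameda", "year": 2014, "visits": 40},
    {"county": "Fresno", "year": 2013, "visits": 80},
    {"county": "Fresno", "year": 2014, "visits": 95},
]


def build_data_table(records, year):
    """Derive a data table: select one year, then total visits per county."""
    totals = defaultdict(int)
    for record in records:
        if record["year"] == year:  # selection: keep only the requested year
            totals[record["county"]] += record["visits"]  # aggregation: sum per county
    # Return rows sorted by county name so the output order is stable.
    return [{"county": c, "total_visits": v} for c, v in sorted(totals.items())]


data_table = build_data_table(dataset, 2014)
for row in data_table:
    print(row["county"], row["total_visits"])
```

In this sketch the data table has fewer rows and columns than the dataset it was derived from, which matches the distinction drawn in the two definitions above.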

Publishable State Data: Data is Publishable State Data if it meets one of the following criteria: (1) the data are public by law, such as via the Public Records Act, or (2) the data are not prohibited from being released by any law, regulation, policy, rule, right, court order, or other restriction. Data shall not be released if it is highly restricted due to the Health Insurance Portability and Accountability Act (“HIPAA”) or other state or federal law (such data are defined as Level 3 later in this handbook).