Welcome to pyDataverse
pyDataverse is a Python library for working with Dataverse research data repositories. It wraps the Dataverse HTTP APIs in high-level Python classes so that you can think in terms of installations, collections, datasets, and files instead of raw requests.
Dataverse itself is an open source platform for publishing, sharing, and preserving research data. A single Dataverse installation can host many collections, each of which contains datasets and their files. pyDataverse connects to such an installation and lets you explore and manage its content from Python.
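To make "raw requests" concrete, here is a minimal sketch of the kind of HTTP call a wrapper library saves you from writing. It uses only the Python standard library and the Dataverse native API endpoint for looking up a dataset by persistent ID. The helper names `dataset_url` and `fetch_json` are illustrative and not part of pyDataverse, and the DOI is a placeholder.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def dataset_url(base_url: str, persistent_id: str) -> str:
    """Build the native API URL for a dataset addressed by its persistent ID."""
    query = urlencode({"persistentId": persistent_id})
    return f"{base_url.rstrip('/')}/api/datasets/:persistentId/?{query}"


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Perform a GET request and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Building the URL needs no network access.
url = dataset_url("https://demo.dataverse.org", "doi:10.5072/FK2/ABC123")
print(url)

# Uncomment to send the request; it needs network access and an existing dataset.
# payload = fetch_json(url)
# print(payload["status"])
```

Even this small lookup involves URL construction, query encoding, and JSON decoding by hand, which is the bookkeeping a higher-level interface is meant to hide.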
Install pyDataverse
Install pyDataverse with pip in your preferred virtual environment.
pip install pyDataverse

On some systems you may need to use pip3 or python -m pip instead. This installs pyDataverse together with its dependencies.
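If you want to confirm that the install landed in the environment you are actually running, a quick check along these lines may help. It is a sketch that relies only on the standard library's `importlib.metadata` and assumes the distribution name matches the one used in the pip command; `installed_version` is an illustrative helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str = "pyDataverse"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("pyDataverse is not installed in this environment")
else:
    print(f"pyDataverse {found} is installed")
```

Run it with the same interpreter you plan to use for your scripts, since pip and python can point at different environments.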
Example Code
The example below shows the full flow: connect to a Dataverse installation, select a collection and dataset, read the contents of a file in that dataset, update the dataset's metadata, and write a new file. Reading and writing both go through the filesystem-like interface that pyDataverse provides.
from pyDataverse import Dataverse

# 1. Connect to a Dataverse installation
dv = Dataverse(base_url="https://demo.dataverse.org")

# 2. Select a collection and a dataset
collection = dv.collections["my-collection"]
dataset = collection.datasets["doi:10.5072/FK2/ABC123"]

# 3. Open a file from the dataset and read its contents
with dataset.open("data/example.txt", mode="r") as f:
    text = f.read()

# 4. Update metadata of the dataset
dataset["citation"]["title"] = "My new title"
dataset.update_metadata()

# 5. Create a new file in the dataset from in-memory content
with dataset.open("data/hello.txt", mode="w") as f:
    f.write("Hello, Dataverse!")

In this short script you connect to a Dataverse installation, navigate down to a collection and dataset, read a file inside that dataset as plain text, change the dataset title, and write a new file. You can adapt the identifiers and file paths to match your own Dataverse installation.
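Because the read step uses a plain `open(...)` context manager, it is easy to wrap in a small helper. The sketch below assumes only the `dataset.open(path, mode="r")` call shown in the example. `read_text_files` is an illustrative helper, and `InMemoryDataset` is a hypothetical stand-in (not a pyDataverse class) so the snippet runs without a server.

```python
import io


def read_text_files(dataset, paths):
    """Read several files through dataset.open and return {path: text}."""
    contents = {}
    for path in paths:
        with dataset.open(path, mode="r") as f:
            contents[path] = f.read()
    return contents


class InMemoryDataset:
    """Hypothetical stand-in that mimics only dataset.open(path, mode="r")."""

    def __init__(self, files):
        self._files = dict(files)

    def open(self, path, mode="r"):
        if mode != "r":
            raise ValueError("this stand-in only supports reading")
        return io.StringIO(self._files[path])


demo = InMemoryDataset(
    {"data/example.txt": "first file", "data/notes.txt": "second file"}
)
texts = read_text_files(demo, ["data/example.txt", "data/notes.txt"])
print(texts["data/example.txt"])
```

With a real dataset object you would pass it in place of `demo`; the helper itself does not depend on how the dataset was obtained.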
For many use cases this pattern is all you need to get started. As you become more familiar with Dataverse, you can move on to creating datasets, uploading files, and working with advanced metadata.