TESS data available on AWS

tl;dr - Sectors 1 & 2 from TESS are available on Amazon Web Services (AWS). In this first post, we’ll introduce a basic method for accessing the data programmatically through the astroquery.mast client library.

With the release of TESS sectors 1 & 2, we’re making calibrated and uncalibrated full frame images, two-minute cadence target pixel and light curve files, co-trending basis vectors, and FFI cubes (for the Astrocut tool) available in the s3://stpubdata/tess S3 bucket on AWS.

These data are available under the same terms as the public dataset for Hubble: if you compute against the data from within the AWS US-East region, data access is free.

Accessing the data

In what follows, we assume that you already have an AWS account, have created AWS secret access keys, and can create an authenticated session with these keys using the boto3 Python package.
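As a quick check of the setup assumed above, here is a minimal sketch that verifies the keys are visible before a boto3 session is created. It assumes the keys are exported as the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables (boto3 can also read them from a credentials file, which this sketch does not check). The helper names and the us-east-1 default region are illustrative assumptions, not part of the original post.

```python
import os

# Environment variables that boto3 reads AWS keys from by default.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credential_vars(env=None):
    """Return the names of the AWS key variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def make_session(region_name="us-east-1"):
    """Create a boto3 session from keys exported in the environment.

    region_name is an assumption: the post only says "US-East".
    """
    import boto3  # imported here so the check above works without boto3

    missing = missing_credential_vars()
    if missing:
        raise RuntimeError("Missing AWS credentials: " + ", ".join(missing))
    # boto3 picks the keys up from the environment on its own.
    return boto3.Session(region_name=region_name)
```

With the keys exported, `make_session().resource("s3")` should give an authenticated S3 handle for the steps that follow.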

Astroquery & Boto3

The astroquery.mast Python package has built-in support for working with the cloud-hosted data. To retrieve data from the cloud, first call enable_cloud_dataset and then use the Boto3 library to download the files.
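The workflow described above might look like the following sketch. It is not the original example from this post: the function names (split_s3_uri, find_light_curve_uris, download) are hypothetical, the query filters (sector 1 light curves, first five observations) are arbitrary choices for illustration, and the RequestPayer argument assumes the bucket is configured as requester-pays. It also assumes a version of astroquery that provides Observations.get_cloud_uris.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/path/to/file' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def find_light_curve_uris(sector=1, limit=5):
    """Query MAST for TESS light curves and return their S3 URIs."""
    from astroquery.mast import Observations

    obs = Observations.query_criteria(
        obs_collection="TESS",
        dataproduct_type="timeseries",
        sequence_number=sector,  # for TESS this is the sector number
    )
    # Only look up products for the first few observations.
    products = Observations.get_product_list(obs[:limit])
    light_curves = Observations.filter_products(
        products, productSubGroupDescription="LC"
    )
    Observations.enable_cloud_dataset()
    # Entries can be None when a product has no cloud copy.
    return [u for u in Observations.get_cloud_uris(light_curves) if u]


def download(uris, dest_dir="."):
    """Download each S3 URI into dest_dir and return the local paths."""
    import os

    import boto3

    s3 = boto3.resource("s3")  # uses your default AWS credentials
    paths = []
    for uri in uris:
        bucket, key = split_s3_uri(uri)
        local_path = os.path.join(dest_dir, os.path.basename(key))
        # Assumption: the bucket is requester-pays.
        s3.Bucket(bucket).download_file(
            key, local_path, ExtraArgs={"RequestPayer": "requester"}
        )
        paths.append(local_path)
    return paths
```

Calling `download(find_light_curve_uris(sector=1))` from a machine in the US-East region should then fetch the files into the current directory.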

What data are available, and how often will they be updated?

This initial release of sectors 1 & 2 data includes calibrated and uncalibrated full frame images, two-minute cadence target pixel and light curve files, co-trending basis vectors, and the FFI cubes used by the MAST TESSCut service and the Astrocut library.

The data in this S3 bucket will be automatically updated as further sectors of data are released.

FAQ & Resources

Where are the data?: AWS US-East

How can I access the data?: You’ll need an AWS account. See this example of how to use your AWS account with boto3 and Python.

How much does it cost to access the data?: Within the AWS US-East region it’s free. If you download from outside US-East, standard S3 charges apply.

So now you’re charging for TESS data?: No, TESS data are, and always will be, free from MAST. This AWS copy of the TESS data held in MAST is being provided in a ‘highly available’ environment next to the significant computational resources of the AWS platform.

I like this idea but I’d rather use a different cloud vendor.: Please get in touch and let us know.