<div style="text-align: center">
<img src="https://github.com/LinkedEarth/Logos/blob/master/PyleoTUPS/pyleotups_logo.png?raw=true" alt="PyleoTUPS logo" width="400">
</div>

# Querying data from the NOAA database

## Authors

Deborah Khider
<a href="https://orcid.org/0000-0001-7501-8430" target="_blank" rel="noopener noreferrer">
  <img src="https://orcid.org/sites/default/files/images/orcid_16x16.png" alt="ORCID iD" style="vertical-align: text-bottom;"/>
</a>

## Preamble

This tutorial will get you familiar with searching the NOAA database for relevant datasets and search them for relevant information such as variables, data, geographical information.

### Goals

 - Searching for a specific study and obtaining information such as location, publication, variable metadata and the associated data.
 - Searching over a geographical area

### Pre-requisites

### Reading time

Let's import our packages!

In [1]:
from pyleotups import Dataset
import pandas as pd

## From a study ID

This is the most basic search you can perform, using a NOAA ID to access a dataset. First, you need to create a `Dataset` object that will store the information.

In [2]:
ds=Dataset()

Let's do a simple search, knowing the NOAA study ID. For this example, let's use the dataset from [Clemens et al. (2021)](https://www.science.org/doi/10.1126/sciadv.abg3848), which can be accessed through [NOAA Paleo portal](https://www.ncei.noaa.gov/access/paleo-search/study/33213).

In [3]:
ds.search_studies(noaa_id=33213)

Parsing NOAA studies: 100%|██████████| 1/1 [00:00<00:00, 3355.44it/s]


Unnamed: 0,StudyID,XMLID,StudyName,DataType,EarliestYearBP,MostRecentYearBP,EarliestYearCE,MostRecentYearCE,StudyNotes,ScienceKeywords,Investigators,Publications,Sites,Funding
0,33213,74834,"Bay of Bengal, Northeast Indian Margin Stable ...",PALEOCEANOGRAPHY,1462580,280,-1460630,1670,"Provided Keywords: Indian monsoon, South Asian...",,"Steven Clemens, Masanobu Yamamoto, Kaustubh Th...","[{'Author': 'Clemens, Steven; Yamamoto, Masano...","[[{'DataTableID': '45857', 'DataTableName': 'U...",[{'fundingAgency': 'US National Science Founda...


The `summary` method provides basic information about the dataset, such as the name of the study, the NOAA [DataType](https://www.ncei.noaa.gov/products/paleoclimatology), the time coverage and the associated publication. The function retuns a `pandas.DataFrame`.

In [4]:
df = ds.get_summary()
df

Unnamed: 0,StudyID,XMLID,StudyName,DataType,EarliestYearBP,MostRecentYearBP,EarliestYearCE,MostRecentYearCE,StudyNotes,ScienceKeywords,Investigators,Publications,Sites,Funding
0,33213,74834,"Bay of Bengal, Northeast Indian Margin Stable ...",PALEOCEANOGRAPHY,1462580,280,-1460630,1670,"Provided Keywords: Indian monsoon, South Asian...",,"Kaustubh Thirumalai, Liviu Giosan, Julie Riche...","[{'Author': 'Clemens, Steven; Yamamoto, Masano...","[[{'DataTableID': '45857', 'DataTableName': 'U...",[{'fundingAgency': 'US National Science Founda...


`PyleoTUPS` allows you to get metadata information that will be returned in separate dataframes. For instance, let's look at the information about the publication associated with the dataset. This function also returns BibTeX entries along with the DataFrame.

In [8]:
bib, df = ds.get_publications()
df

Unnamed: 0,Author,Title,Journal,Year,Volume,Number,Pages,Type,DOI,URL,CitationKey,StudyID,StudyName
0,"Clemens, Steven; Yamamoto, Masanobu; Thirumala...",Remote and Local Drivers of Pleistocene South ...,Science Advances,2021,7,23,,publication,10.1126/sciadv.abg3848,http://dx.doi.org/10.1126/sciadv.abg3848,M._Remote_2021_33213,33213,"Bay of Bengal, Northeast Indian Margin Stable ..."


<div style="border-left: 4px solid #1f77b4; background: #f0f8ff; padding: 0.5em 1em; border-radius: 4px;">
  <strong>Note:</strong> You can set the `save` parameter to True if you want to save a copy of the BibTeX entries.
</div>

Similarly, you can return information about the location of the record: 

In [5]:
df_geo = ds.get_geo()
df_geo

Unnamed: 0,StudyID,DataType,SiteID,SiteName,LocationName,Latitude,Longitude,MinElevation,MaxElevation
0,33213,PALEOCEANOGRAPHY,58697,IODP U1446,Ocean>Indian Ocean,19.083,85.733,-1440,-1440


And the funding that supported the study:

In [7]:
df_fund = ds.get_funding()
df_fund

Unnamed: 0,StudyID,StudyName,FundingAgency,FundingGrant
0,33213,"Bay of Bengal, Northeast Indian Margin Stable ...",US National Science Foundation,OCE1634774
1,33213,"Bay of Bengal, Northeast Indian Margin Stable ...",Japan Society for the Promotion of Science (JSPS),JPMXS05R2900001
2,33213,"Bay of Bengal, Northeast Indian Margin Stable ...",UK Natural Environment Research Council (NERC),"JPMXS05R2900001, 19H05595"
3,33213,"Bay of Bengal, Northeast Indian Margin Stable ...",United States Geological Survey (USGS),NE/L002493/1


<div style="border-left: 4px solid #1f77b4; background: #f0f8ff; padding: 0.5em 1em; border-radius: 4px;">
  <strong>Note:</strong> As you may have noticed, each table has the NOAA study ID, this is useful when you are looking at multiple studies after performing a search for multiple datasets.
</div>


Next, let's have a look at the tables of data present in the dataset:

In [9]:
df_tables = ds.get_tables()
df_tables

Unnamed: 0,DataTableID,DataTableName,TimeUnit,FileURL,Variables,FileDescription,TotalFilesAvailable,SiteID,SiteName,LocationName,Latitude,Longitude,MinElevation,MaxElevation,StudyID,StudyName
0,45857,U1446 Benthic Isotopes Clemens2021,cal yr BP,https://www.ncei.noaa.gov/pub/data/paleo/contr...,"[Site, Hole, Type, Section, Core, Section_Dept...",NOAA Template File,1,58697,IODP U1446,Ocean>Indian Ocean,19.083,85.733,-1440,-1440,33213,"Bay of Bengal, Northeast Indian Margin Stable ..."
1,45858,U1446 Planktic Isotopes Clemens2021,cal yr BP,https://www.ncei.noaa.gov/pub/data/paleo/contr...,"[Site, Hole, Type, Section, Comment, Core, Sec...",NOAA Template File,1,58697,IODP U1446,Ocean>Indian Ocean,19.083,85.733,-1440,-1440,33213,"Bay of Bengal, Northeast Indian Margin Stable ..."
2,45859,U1446 TEX86H_SST Clemens2021,cal yr BP,https://www.ncei.noaa.gov/pub/data/paleo/contr...,"[Site, Hole, Type, Section, Core, Section_Dept...",NOAA Template File,1,58697,IODP U1446,Ocean>Indian Ocean,19.083,85.733,-1440,-1440,33213,"Bay of Bengal, Northeast Indian Margin Stable ..."
3,45860,U1446 d18Osw Clemens2021,cal yr BP,https://www.ncei.noaa.gov/pub/data/paleo/contr...,"[Age, SL_Scaled, SL_Scaled_averaged, d18O_G_ru...",NOAA Template File,1,58697,IODP U1446,Ocean>Indian Ocean,19.083,85.733,-1440,-1440,33213,"Bay of Bengal, Northeast Indian Margin Stable ..."
4,45861,U1446 LeafWax CarbonIsotope Clemens2021,cal yr BP,https://www.ncei.noaa.gov/pub/data/paleo/contr...,"[Site, Hole, Type, Section, Core, Section_Dept...",NOAA Template File,1,58697,IODP U1446,Ocean>Indian Ocean,19.083,85.733,-1440,-1440,33213,"Bay of Bengal, Northeast Indian Margin Stable ..."
5,45862,U1446 Mg/Ca Clemens2021,cal yr BP,https://www.ncei.noaa.gov/pub/data/paleo/contr...,"[Site, Hole, Type, Analytical_Facility, Core, ...",NOAA Template File,1,58697,IODP U1446,Ocean>Indian Ocean,19.083,85.733,-1440,-1440,33213,"Bay of Bengal, Northeast Indian Margin Stable ..."
6,45863,U1446 Rb/Ca Clemens2021,cal yr BP,https://www.ncei.noaa.gov/pub/data/paleo/contr...,"[Site, Hole, Type, Section, Core, Section_Dept...",NOAA Template File,1,58697,IODP U1446,Ocean>Indian Ocean,19.083,85.733,-1440,-1440,33213,"Bay of Bengal, Northeast Indian Margin Stable ..."
7,45864,U1446 Age Model Clemens2021,cal yr BP,https://www.ncei.noaa.gov/pub/data/paleo/contr...,"[Site, Comments, Sample_Depth, Age]",NOAA Template File,1,58697,IODP U1446,Ocean>Indian Ocean,19.083,85.733,-1440,-1440,33213,"Bay of Bengal, Northeast Indian Margin Stable ..."


The dataset contains 8 tables. To get the data, you need to pass the `DataTableID` or `FileURL` to the `get_data()` method. Let's have a look at the TEX86 data:

In [10]:
dfs = ds.get_data(dataTableIDs="45859")
dfs[0].head()

Unnamed: 0,Site,Hole,Core,Type,Section,Section_Depth,Sample_Depth,Age,TEX86H,SST
0,U1446,C,1,H,1,4.5,0.045,0.31,-0.1177,30.55
1,U1446,C,1,H,1,31.5,0.315,0.66,-0.1216,30.28
2,U1446,C,1,H,1,61.5,0.615,1.04,-0.1183,30.51
3,U1446,C,1,H,1,90.5,0.905,1.41,-0.1089,31.15
4,U1446,C,1,H,1,121.5,1.215,1.8,-0.1155,30.7


<div style="border-left: 4px solid #1f77b4; background: #f0f8ff; padding: 0.5em 1em; border-radius: 4px;">
  <strong>Note:</strong> You can pass multiple tables IDs as a list. The function will always return a list of DataFrames (hence why we selected the first one in the code above.)
</div>

Some relevant metadata for each column is stored in the DataFrame attributes:

In [12]:
df.attrs

{'variables': ['Site',
  'Hole',
  'Core',
  'Type',
  'Section',
  'Section_Depth',
  'Sample_Depth',
  'Age',
  'TEX86H',
  'SST'],
 'NOAAStudyId': '33213',
 'StudyName': 'Bay of Bengal, Northeast Indian Margin Stable Isotope, Biomarker and SST Reconstructions since the Mid-Pleistocene'}

To get more information about the variables, you can use the `get_variables` function:

In [13]:
df_var = ds.get_variables(dataTableIDs="45859")
df_var

Unnamed: 0_level_0,StudyID,SiteID,FileURL,VariableName,cvDataType,cvWhat,cvMaterial,cvError,cvUnit,cvSeasonality,cvDetail,cvMethod,cvAdditionalInfo,cvFormat,cvShortName
DataTableID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
45859,33213,58697,https://www.ncei.noaa.gov/pub/data/paleo/contr...,Site,PALEOCEANOGRAPHY,sampling metadata>sample identification,,,,,,,Site identification,Character,Site
45859,33213,58697,https://www.ncei.noaa.gov/pub/data/paleo/contr...,Hole,PALEOCEANOGRAPHY,sampling metadata>sample identification,,,,,,,Hole drilled at Site U1446,Character,Hole
45859,33213,58697,https://www.ncei.noaa.gov/pub/data/paleo/contr...,Type,PALEOCEANOGRAPHY,sampling metadata>sample identification,,,,,,,H (9 m hydraulic piston core) F (4.5 m hydrau...,Character,Type
45859,33213,58697,https://www.ncei.noaa.gov/pub/data/paleo/contr...,Section,PALEOCEANOGRAPHY,sampling metadata>sample identification,,,,,,,Section number ( 1 through 7 and core catcher ...,Character,Section
45859,33213,58697,https://www.ncei.noaa.gov/pub/data/paleo/contr...,Core,PALEOCEANOGRAPHY,sampling metadata>sample identification,,,,,,,Core number,Numeric,Core
45859,33213,58697,https://www.ncei.noaa.gov/pub/data/paleo/contr...,Section_Depth,PALEOCEANOGRAPHY,sampling metadata>sample identification,,,length unit>centimeter,,,,Mid-depth of sample within section,Numeric,Section_Depth
45859,33213,58697,https://www.ncei.noaa.gov/pub/data/paleo/contr...,Sample_Depth,CLIMATE RECONSTRUCTIONS|PALEOCEANOGRAPHY,depth variable>depth,,,length unit>meter,,,,Spliced core composite depth below sea floor ...,Numeric,Sample_Depth
45859,33213,58697,https://www.ncei.noaa.gov/pub/data/paleo/contr...,Age,CLIMATE RECONSTRUCTIONS|PALEOCEANOGRAPHY,age variable>age,,,time unit>age unit>calendar kiloyear before pr...,,,,,Numeric,Age
45859,33213,58697,https://www.ncei.noaa.gov/pub/data/paleo/contr...,TEX86H,PALEOCEANOGRAPHY,chemical composition>compound>organic compound...,,,dimensionless,,,laboratory method>chromatography>liquid chroma...,index of Schouten et al. 2002; extracted lipids,Numeric,TEX86H
45859,33213,58697,https://www.ncei.noaa.gov/pub/data/paleo/contr...,SST,CLIMATE RECONSTRUCTIONS|PALEOCEANOGRAPHY,earth system variable>temperature variable>tem...,reconstruction material>organic compound index...,,temperature unit>degree Celsius,,,,index of Kim et al. 2010,Numeric,SST


Instead of the table ID, you can also pass the file URL to get the data:

In [14]:
df1 = ds.get_data(file_urls="https://www.ncei.noaa.gov/pub/data/paleo/contributions_by_author/clemens2021/clemens2021-u1446-mgca-noaa.txt")[0]
df1.head()

Unnamed: 0,Site,Hole,Core,Type,Section,Section_Depth,Sample_Depth,Age,Mg/Ca,Analytical_Facility,Mg/Ca_SST
0,U1446,C,1,H,1,5,0.05,0.32,4.63,Rosenthal (Rutgers),28.41
1,U1446,C,1,H,1,32,0.32,0.66,4.62,Rosenthal (Rutgers),28.46
2,U1446,C,1,H,1,62,0.62,1.05,4.72,Rosenthal (Rutgers),28.43
3,U1446,C,1,H,1,91,0.91,1.42,4.58,Rosenthal (Rutgers),28.63
4,U1446,C,1,H,1,122,1.22,1.81,4.7,Rosenthal (Rutgers),28.61


This shows you how you can retrieve information about a dataset from NOAA and get information about the dataset. However, in most cases, you may not know the NOAA ID. NOAA offers an API search across several query parameters such as geographical extent, datatype... 

Let's have a look!

## Geographical Query

Let's start with a simple geographical query. Let's look for all the datasets within 5°S-5°N and 109-125°E, roughly corresponding to the Indo-Pacific Warm Pool. 

Let's create another `Dataset` object to store this new search:

In [17]:
ds2 = Dataset()

Querying terms and their definitions are available through NOAA: https://www.ncei.noaa.gov/access/paleo-search/api

In [16]:
ds2.search_studies(max_lat=5, min_lat=-5, max_lon=109,
                       min_lon=125)

Parsing NOAA studies: 100%|██████████| 100/100 [00:00<00:00, 21706.28it/s]


Unnamed: 0,StudyID,XMLID,StudyName,DataType,EarliestYearBP,MostRecentYearBP,EarliestYearCE,MostRecentYearCE,StudyNotes,ScienceKeywords,Investigators,Publications,Sites,Funding
0,11194,9632,"1,100 Year El Niño/Southern Oscillation (ENSO)...",CLIMATE RECONSTRUCTIONS,1050.0,-52.0,900.0,2002.0,An index of canonical ENSO variability for the...,[Atmospheric and Oceanic Circulation Patterns ...,"Jinbao Li, Shang-Ping Xie, Edward Cook, Rosann...","[{'Author': 'Li, J., S.-P. Xie, E.R. Cook, G. ...","[[{'DataTableID': '19791', 'DataTableName': 'e...",[{'fundingAgency': 'US National Science Founda...
1,22031,20009,1200 Year Atlantic Multidecadal Variability an...,CLIMATE RECONSTRUCTIONS,1150.0,-60.0,800.0,2010.0,Summer (May-September) Atlantic Multidecadal V...,[Atmospheric and Oceanic Circulation Patterns ...,"Jianglin Wang, Bao Yang, Fredrik Ljungqvist, J...","[{'Author': 'Jianglin Wang, Bao Yang, Fredrik ...","[[{'DataTableID': '33108', 'DataTableName': 'W...",[{'fundingAgency': 'Deutsche Forschungsgemeins...
2,39047,80187,1500 Year Sedimentological and Geochemical Dat...,PALEOLIMNOLOGY,1200.0,-50.0,750.0,2000.0,Elevations of lakes: Tota = 3015m; Siscunsi = ...,[Precipitation Reconstruction],"Broxton Bird, Byron Steinman, Jaime Escobar, A...","[{'Author': 'Bird, B.W., B.A. Steinman, J. Esc...","[[{'DataTableID': '51864', 'DataTableName': 'M...","[{'fundingAgency': 'Indiana University', 'fund..."
3,2614,1685,350 KYr Sea Level Reconstruction and Foraminif...,PALEOCEANOGRAPHY,361500.0,1000.0,-359550.0,950.0,,[Sea Level Reconstruction],"David Lea, Pamela Martin, Dorothy Pak, Howard ...","[{'Author': 'Lea, D.W., P.A. Martin, D.K. Pak,...","[[{'DataTableID': '4301', 'DataTableName': 'TR...",[]
4,14632,12613,700 Year El Niño/Southern Oscillation (ENSO) N...,CLIMATE RECONSTRUCTIONS,649.0,-55.0,1301.0,2005.0,An index of canonical El Niño/Southern Oscilla...,"[ENSO, Atmospheric and Oceanic Circulation Pat...","Jinbao Li, Shang-Ping Xie, Edward Cook, Marian...","[{'Author': 'Li, J., S.-P. Xie, E.R. Cook, G. ...","[[{'DataTableID': '24679', 'DataTableName': 'L...",[{'fundingAgency': 'US National Science Founda...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,25611,23814,Eastern Equatorial Pacific d18O and d13C Data ...,PALEOCEANOGRAPHY,24734.0,0.0,-22784.0,1950.0,,[Last Glacial Maximum],"Heather Ford, Celia McChesney, Jennifer Hertzb...","[{'Author': 'Ford, H.L., C.L. McChesney, J.E. ...","[[{'DataTableID': '37453', 'DataTableName': 'O...",[{'fundingAgency': 'US National Science Founda...
96,30152,71852,Eastern Pacific Ocean TEX86 and Mg/Ca SST Reco...,CLIMATE RECONSTRUCTIONS,24734.0,1239.0,-22784.0,711.0,Metadata are from the Temperature-12k project ...,,"Jennifer Hertzberg, Matthew Schmidt, Thomas Bi...","[{'Author': 'Kaufman, D., McKay, N., Routson, ...","[[{'DataTableID': '42637', 'DataTableName': 'M...",[]
97,2624,1695,Eastern Pacific Pleistocene Alkenone Data and ...,PALEOCEANOGRAPHY,1833700.0,4472.0,-1831750.0,-2522.0,,[Sea Surface Temperature Reconstruction],"Zhonghui Liu, Timothy Herbert","[{'Author': 'Liu, Z. and T.D. Herbert', 'Title...","[[{'DataTableID': '4320', 'DataTableName': 'OD...",[]
98,19139,16805,"Eastern Tropical Indian Ocean 45,000 Year d18O...",PALEOCEANOGRAPHY,45330.0,60.0,-43380.0,1890.0,High-resolution (~30-80 years) foraminiferal o...,[Sea Surface Temperature Reconstruction],"Mahyar Mohtadi, Matthias Prange, Delia Oppo, R...","[{'Author': 'Mahyar Mohtadi, Matthias Prange, ...","[[{'DataTableID': '29599', 'DataTableName': 'M...",[{'fundingAgency': 'Bundesministerium für Bild...


<div style="border-left: 4px solid #d62728; background: #ffe6e6; padding: 0.5em 1em; border-radius: 4px;">
  <strong>Warning:</strong> One of the default parameters in `search_studies` is how many studies can be returned by query. The default is 100. 
</div>

We got 100 datasets, which means we may have hit the limit. Let's increase to 500 and see what happens:

In [18]:
ds2.search_studies(max_lat=5, min_lat=-5, max_lon=109,
                       min_lon=125, limit=500)

Parsing NOAA studies: 100%|██████████| 383/383 [00:00<00:00, 7918.97it/s]


Unnamed: 0,StudyID,XMLID,StudyName,DataType,EarliestYearBP,MostRecentYearBP,EarliestYearCE,MostRecentYearCE,StudyNotes,ScienceKeywords,Investigators,Publications,Sites,Funding
0,11194,9632,"1,100 Year El Niño/Southern Oscillation (ENSO)...",CLIMATE RECONSTRUCTIONS,1050.0,-52.0,900.0,2002.0,An index of canonical ENSO variability for the...,[Atmospheric and Oceanic Circulation Patterns ...,"Jinbao Li, Shang-Ping Xie, Edward Cook, Rosann...","[{'Author': 'Li, J., S.-P. Xie, E.R. Cook, G. ...","[[{'DataTableID': '19791', 'DataTableName': 'e...",[{'fundingAgency': 'US National Science Founda...
1,22031,20009,1200 Year Atlantic Multidecadal Variability an...,CLIMATE RECONSTRUCTIONS,1150.0,-60.0,800.0,2010.0,Summer (May-September) Atlantic Multidecadal V...,[Atmospheric and Oceanic Circulation Patterns ...,"Jianglin Wang, Bao Yang, Fredrik Ljungqvist, J...","[{'Author': 'Jianglin Wang, Bao Yang, Fredrik ...","[[{'DataTableID': '33108', 'DataTableName': 'W...",[{'fundingAgency': 'Deutsche Forschungsgemeins...
2,39047,80187,1500 Year Sedimentological and Geochemical Dat...,PALEOLIMNOLOGY,1200.0,-50.0,750.0,2000.0,Elevations of lakes: Tota = 3015m; Siscunsi = ...,[Precipitation Reconstruction],"Broxton Bird, Byron Steinman, Jaime Escobar, A...","[{'Author': 'Bird, B.W., B.A. Steinman, J. Esc...","[[{'DataTableID': '51864', 'DataTableName': 'M...","[{'fundingAgency': 'Indiana University', 'fund..."
3,2614,1685,350 KYr Sea Level Reconstruction and Foraminif...,PALEOCEANOGRAPHY,361500.0,1000.0,-359550.0,950.0,,[Sea Level Reconstruction],"David Lea, Pamela Martin, Dorothy Pak, Howard ...","[{'Author': 'Lea, D.W., P.A. Martin, D.K. Pak,...","[[{'DataTableID': '4301', 'DataTableName': 'TR...",[]
4,14632,12613,700 Year El Niño/Southern Oscillation (ENSO) N...,CLIMATE RECONSTRUCTIONS,649.0,-55.0,1301.0,2005.0,An index of canonical El Niño/Southern Oscilla...,"[ENSO, Atmospheric and Oceanic Circulation Pat...","Jinbao Li, Shang-Ping Xie, Edward Cook, Marian...","[{'Author': 'Li, J., S.-P. Xie, E.R. Cook, G. ...","[[{'DataTableID': '24679', 'DataTableName': 'L...",[{'fundingAgency': 'US National Science Founda...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
378,6400,2421,"Western Tropical Pacific, 21Kyr BP to present,...",PALEOCEANOGRAPHY,21921.0,2.0,-19971.0,1948.0,Foraminiferal stable isotope and Mg/Ca data fo...,,"Axel Timmermann, Lowell Stott, Robert Thunell","[{'Author': 'Stott, L.D., A. Timmermann, and R...","[[{'DataTableID': '9544', 'DataTableName': 'M2...",[{'fundingAgency': 'US National Science Founda...
379,16791,14498,Western Ugandan Crater Lakes 1000Yr Diatom-inf...,PALEOLIMNOLOGY,878.0,-57.0,1072.0,2007.0,Diatom census data and reconstructed Lake Leve...,[hydrology],"Keely Mills, David Ryves","[{'Author': 'Keely Mills and David B. Ryves ',...","[[{'DataTableID': '26997', 'DataTableName': 'K...",[{'fundingAgency': 'UK Natural Environment Res...
380,8722,2801,Yasuhara et al. 2009 Equatorial Atlantic ODP92...,PALEOCEANOGRAPHY,500000.0,0.0,-498050.0,1950.0,,,"Moriaki Yasuhara, Gene Hunt, Thomas Cronin, Hi...","[{'Author': 'Yasuhara, M., G. Hunt, T.M. Croni...","[[{'DataTableID': '32695', 'DataTableName': 'Y...",[]
381,6266,2549,de Garidel-Thoron et al. 2005 Western Pacific ...,CLIMATE RECONSTRUCTIONS,1750000.0,0.0,-1748050.0,1950.0,,[Sea Surface Temperature Reconstruction],"Franck Bassinot, Thibault de Garidel-Thoron, Y...","[{'Author': 'de Garidel-Thoron, T., Y. Rosenth...","[[{'DataTableID': '38369', 'DataTableName': 'd...",[]


We have 383 studies now, far from the maximum of 500 so it seams we collected all records. 

You can use the same functions as described above the retrieve information. For instance, let's have a look at the data tables:

In [19]:
df_tables_geo = ds2.get_tables()
df_tables_geo.head()

Unnamed: 0,DataTableID,DataTableName,TimeUnit,FileURL,Variables,FileDescription,TotalFilesAvailable,SiteID,SiteName,LocationName,Latitude,Longitude,MinElevation,MaxElevation,StudyID,StudyName
0,19791,enso_li2011,CE,https://www.ncei.noaa.gov/pub/data/paleo/treer...,"[age, ensoi, ensovar]",NOAA Template File,1,48144,Tropical Pacific,Ocean>Pacific Ocean,-20.0,20.0,,,11194,"1,100 Year El Niño/Southern Oscillation (ENSO)..."
1,33108,Wang2017AMV-AMO,CE,https://www.ncei.noaa.gov/pub/data/paleo/recon...,"[age, Atlantic Multidecadal Variability Index,...",Formatted Text Data File,1,56395,North Atlantic Ocean,Ocean>Atlantic Ocean>North Atlantic Ocean,0.0,70.0,,,22031,1200 Year Atlantic Multidecadal Variability an...
2,51864,MCEOF Bird2024,cal yr BP,https://www.ncei.noaa.gov/pub/data/paleo/paleo...,"[pumacocha_1sig_hi, pumacocha_2sig_hi, quelcca...",NOAA Template File,1,60174,Circum-Andes Synthesis,Continent>South America,-13.933,10.712,3015.0,3680.0,39047,1500 Year Sedimentological and Geochemical Dat...
3,55412,bird2024 RecordsUsed,calendar year before present,https://www.ncei.noaa.gov/pub/data/paleo/paleo...,[],NOAA Template File,1,60174,Circum-Andes Synthesis,Continent>South America,-13.933,10.712,3015.0,3680.0,39047,1500 Year Sedimentological and Geochemical Dat...
4,51863,"Tota C/N, Clay Bird2024",cal yr BP,https://www.ncei.noaa.gov/pub/data/paleo/paleo...,"[Tota_avg_depth, Tota_H13_%clay, Tota_H13_cn]",NOAA Template File,1,60952,Lake Tota,Continent>South America>Colombia,5.497382,-72.978709,3015.0,3015.0,39047,1500 Year Sedimentological and Geochemical Dat...


Other searches possible under the `search_studies` method:

In [20]:
?ds.search_studies

[31mSignature:[39m
ds.search_studies(
    xml_id=[38;5;28;01mNone[39;00m,
    noaa_id=[38;5;28;01mNone[39;00m,
    data_publisher=[33m'NOAA'[39m,
    data_type_id=[38;5;28;01mNone[39;00m,
    keywords=[38;5;28;01mNone[39;00m,
    investigators=[38;5;28;01mNone[39;00m,
    max_lat=[38;5;28;01mNone[39;00m,
    min_lat=[38;5;28;01mNone[39;00m,
    max_lon=[38;5;28;01mNone[39;00m,
    min_lon=[38;5;28;01mNone[39;00m,
    location=[38;5;28;01mNone[39;00m,
    publication=[38;5;28;01mNone[39;00m,
    search_text=[38;5;28;01mNone[39;00m,
    earliest_year=[38;5;28;01mNone[39;00m,
    latest_year=[38;5;28;01mNone[39;00m,
    cv_whats=[38;5;28;01mNone[39;00m,
    recent=[38;5;28;01mFalse[39;00m,
    limit=[32m100[39m,
)
[31mDocstring:[39m
Search for NOAA studies using the specified parameters.

At least one parameter must be provided to perform a search. This method interfaces with
the NOAA NCEI Paleo Study Search API. Use it to filter studies based on lo