
Querying data from the NOAA database#
Preamble#
This tutorial shows how to search the NOAA database for relevant datasets and how to query them for information such as variables, data, and geographical metadata.
Goals#
Searching for a specific study and obtaining information such as location, publication, variable metadata and the associated data.
Searching over a geographical area
Pre-requisites#
Reading time#
Let’s import our packages!
from pyleotups import Dataset
import pandas as pd
From a study ID#
This is the most basic search you can perform, using a NOAA ID to access a dataset. First, you need to create a Dataset object that will store the information.
ds=Dataset()
Let’s do a simple search, knowing the NOAA study ID. For this example, let’s use the dataset from Clemens et al. (2021), which can be accessed through the NOAA Paleo portal.
ds.search_studies(noaa_id=33213)
Parsing NOAA studies: 100%|██████████| 1/1 [00:00<00:00, 3355.44it/s]
StudyID | XMLID | StudyName | DataType | EarliestYearBP | MostRecentYearBP | EarliestYearCE | MostRecentYearCE | StudyNotes | ScienceKeywords | Investigators | Publications | Sites | Funding | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 33213 | 74834 | Bay of Bengal, Northeast Indian Margin Stable ... | PALEOCEANOGRAPHY | 1462580 | 280 | -1460630 | 1670 | Provided Keywords: Indian monsoon, South Asian... | None | Steven Clemens, Masanobu Yamamoto, Kaustubh Th... | [{'Author': 'Clemens, Steven; Yamamoto, Masano... | [[{'DataTableID': '45857', 'DataTableName': 'U... | [{'fundingAgency': 'US National Science Founda... |
The get_summary method provides basic information about the dataset, such as the name of the study, the NOAA DataType, the time coverage and the associated publication. The function returns a pandas.DataFrame.
df = ds.get_summary()
df
StudyID | XMLID | StudyName | DataType | EarliestYearBP | MostRecentYearBP | EarliestYearCE | MostRecentYearCE | StudyNotes | ScienceKeywords | Investigators | Publications | Sites | Funding | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 33213 | 74834 | Bay of Bengal, Northeast Indian Margin Stable ... | PALEOCEANOGRAPHY | 1462580 | 280 | -1460630 | 1670 | Provided Keywords: Indian monsoon, South Asian... | None | Kaustubh Thirumalai, Liviu Giosan, Julie Riche... | [{'Author': 'Clemens, Steven; Yamamoto, Masano... | [[{'DataTableID': '45857', 'DataTableName': 'U... | [{'fundingAgency': 'US National Science Founda... |
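The summary reports the time coverage twice, in years BP and in years CE. In the row above, the two sets of columns are consistent with the usual convention that 0 BP corresponds to 1950 CE (1950 - 1462580 = -1460630). Here is a minimal sketch of that conversion; `bp_to_ce` is a hypothetical helper written for this example and is not part of PyleoTUPS.

```python
# Hypothetical helper, not part of PyleoTUPS: convert years BP to years CE,
# assuming "present" is 1950 CE (the convention the summary row is consistent with).
PRESENT_CE = 1950


def bp_to_ce(year_bp):
    """Return the CE year for an age given in years before present."""
    return PRESENT_CE - year_bp


# Values taken from the summary row for study 33213
print(bp_to_ce(1462580))  # EarliestYearBP   -> -1460630
print(bp_to_ce(280))      # MostRecentYearBP -> 1670
```

Negative results are years before year 0 CE, which matches the EarliestYearCE column.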
PyleoTUPS allows you to get metadata information that will be returned in separate dataframes. For instance, let’s look at the information about the publication associated with the dataset. This function also returns BibTeX entries along with the DataFrame.
bib, df = ds.get_publications()
df
Author | Title | Journal | Year | Volume | Number | Pages | Type | DOI | URL | CitationKey | StudyID | StudyName | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Clemens, Steven; Yamamoto, Masanobu; Thirumala... | Remote and Local Drivers of Pleistocene South ... | Science Advances | 2021 | 7 | 23 | NaN | publication | 10.1126/sciadv.abg3848 | http://dx.doi.org/10.1126/sciadv.abg3848 | M._Remote_2021_33213 | 33213 | Bay of Bengal, Northeast Indian Margin Stable ... |
Similarly, you can return information about the location of the record:
df_geo = ds.get_geo()
df_geo
StudyID | DataType | SiteID | SiteName | LocationName | Latitude | Longitude | MinElevation | MaxElevation | |
---|---|---|---|---|---|---|---|---|---|
0 | 33213 | PALEOCEANOGRAPHY | 58697 | IODP U1446 | Ocean>Indian Ocean | 19.083 | 85.733 | -1440 | -1440 |
And the funding that supported the study:
df_fund = ds.get_funding()
df_fund
StudyID | StudyName | FundingAgency | FundingGrant | |
---|---|---|---|---|
0 | 33213 | Bay of Bengal, Northeast Indian Margin Stable ... | US National Science Foundation | OCE1634774 |
1 | 33213 | Bay of Bengal, Northeast Indian Margin Stable ... | Japan Society for the Promotion of Science (JSPS) | JPMXS05R2900001 |
2 | 33213 | Bay of Bengal, Northeast Indian Margin Stable ... | UK Natural Environment Research Council (NERC) | JPMXS05R2900001, 19H05595 |
3 | 33213 | Bay of Bengal, Northeast Indian Margin Stable ... | United States Geological Survey (USGS) | NE/L002493/1 |
Next, let’s have a look at the tables of data present in the dataset:
df_tables = ds.get_tables()
df_tables
DataTableID | DataTableName | TimeUnit | FileURL | Variables | FileDescription | TotalFilesAvailable | SiteID | SiteName | LocationName | Latitude | Longitude | MinElevation | MaxElevation | StudyID | StudyName | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 45857 | U1446 Benthic Isotopes Clemens2021 | cal yr BP | https://www.ncei.noaa.gov/pub/data/paleo/contr... | [Site, Hole, Type, Section, Core, Section_Dept... | NOAA Template File | 1 | 58697 | IODP U1446 | Ocean>Indian Ocean | 19.083 | 85.733 | -1440 | -1440 | 33213 | Bay of Bengal, Northeast Indian Margin Stable ... |
1 | 45858 | U1446 Planktic Isotopes Clemens2021 | cal yr BP | https://www.ncei.noaa.gov/pub/data/paleo/contr... | [Site, Hole, Type, Section, Comment, Core, Sec... | NOAA Template File | 1 | 58697 | IODP U1446 | Ocean>Indian Ocean | 19.083 | 85.733 | -1440 | -1440 | 33213 | Bay of Bengal, Northeast Indian Margin Stable ... |
2 | 45859 | U1446 TEX86H_SST Clemens2021 | cal yr BP | https://www.ncei.noaa.gov/pub/data/paleo/contr... | [Site, Hole, Type, Section, Core, Section_Dept... | NOAA Template File | 1 | 58697 | IODP U1446 | Ocean>Indian Ocean | 19.083 | 85.733 | -1440 | -1440 | 33213 | Bay of Bengal, Northeast Indian Margin Stable ... |
3 | 45860 | U1446 d18Osw Clemens2021 | cal yr BP | https://www.ncei.noaa.gov/pub/data/paleo/contr... | [Age, SL_Scaled, SL_Scaled_averaged, d18O_G_ru... | NOAA Template File | 1 | 58697 | IODP U1446 | Ocean>Indian Ocean | 19.083 | 85.733 | -1440 | -1440 | 33213 | Bay of Bengal, Northeast Indian Margin Stable ... |
4 | 45861 | U1446 LeafWax CarbonIsotope Clemens2021 | cal yr BP | https://www.ncei.noaa.gov/pub/data/paleo/contr... | [Site, Hole, Type, Section, Core, Section_Dept... | NOAA Template File | 1 | 58697 | IODP U1446 | Ocean>Indian Ocean | 19.083 | 85.733 | -1440 | -1440 | 33213 | Bay of Bengal, Northeast Indian Margin Stable ... |
5 | 45862 | U1446 Mg/Ca Clemens2021 | cal yr BP | https://www.ncei.noaa.gov/pub/data/paleo/contr... | [Site, Hole, Type, Analytical_Facility, Core, ... | NOAA Template File | 1 | 58697 | IODP U1446 | Ocean>Indian Ocean | 19.083 | 85.733 | -1440 | -1440 | 33213 | Bay of Bengal, Northeast Indian Margin Stable ... |
6 | 45863 | U1446 Rb/Ca Clemens2021 | cal yr BP | https://www.ncei.noaa.gov/pub/data/paleo/contr... | [Site, Hole, Type, Section, Core, Section_Dept... | NOAA Template File | 1 | 58697 | IODP U1446 | Ocean>Indian Ocean | 19.083 | 85.733 | -1440 | -1440 | 33213 | Bay of Bengal, Northeast Indian Margin Stable ... |
7 | 45864 | U1446 Age Model Clemens2021 | cal yr BP | https://www.ncei.noaa.gov/pub/data/paleo/contr... | [Site, Comments, Sample_Depth, Age] | NOAA Template File | 1 | 58697 | IODP U1446 | Ocean>Indian Ocean | 19.083 | 85.733 | -1440 | -1440 | 33213 | Bay of Bengal, Northeast Indian Margin Stable ... |
The dataset contains 8 tables. To get the data, you need to pass the DataTableID or FileURL to the get_data() method. Let’s have a look at the TEX86 data:
dfs = ds.get_data(dataTableIDs="45859")
dfs[0].head()
Site | Hole | Core | Type | Section | Section_Depth | Sample_Depth | Age | TEX86H | SST | |
---|---|---|---|---|---|---|---|---|---|---|
0 | U1446 | C | 1 | H | 1 | 4.5 | 0.045 | 0.31 | -0.1177 | 30.55 |
1 | U1446 | C | 1 | H | 1 | 31.5 | 0.315 | 0.66 | -0.1216 | 30.28 |
2 | U1446 | C | 1 | H | 1 | 61.5 | 0.615 | 1.04 | -0.1183 | 30.51 |
3 | U1446 | C | 1 | H | 1 | 90.5 | 0.905 | 1.41 | -0.1089 | 31.15 |
4 | U1446 | C | 1 | H | 1 | 121.5 | 1.215 | 1.80 | -0.1155 | 30.70 |
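Rather than copying the table ID by hand, you can look it up from the table listing with ordinary pandas filtering. The sketch below uses a small stand-in DataFrame with the same DataTableID and DataTableName columns as the listing above. Storing the IDs as strings is an assumption for this example.

```python
import pandas as pd

# Stand-in for the DataFrame returned by ds.get_tables(); only two of its
# columns and three of its rows are reproduced here.
df_tables_demo = pd.DataFrame(
    {
        "DataTableID": ["45857", "45859", "45862"],
        "DataTableName": [
            "U1446 Benthic Isotopes Clemens2021",
            "U1446 TEX86H_SST Clemens2021",
            "U1446 Mg/Ca Clemens2021",
        ],
    }
)

# Keep the rows whose name mentions TEX86 (literal match, not a regex)
mask = df_tables_demo["DataTableName"].str.contains("TEX86", regex=False)
tex86_ids = df_tables_demo.loc[mask, "DataTableID"].tolist()
print(tex86_ids)
```

The resulting ID can then be passed to get_data() in place of the hard-coded string.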
Some relevant metadata about the table is stored in the attributes of the returned DataFrame:
dfs[0].attrs
{'variables': ['Site',
'Hole',
'Core',
'Type',
'Section',
'Section_Depth',
'Sample_Depth',
'Age',
'TEX86H',
'SST'],
'NOAAStudyId': '33213',
'StudyName': 'Bay of Bengal, Northeast Indian Margin Stable Isotope, Biomarker and SST Reconstructions since the Mid-Pleistocene'}
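The attrs attribute is a standard pandas feature: a plain dictionary attached to a DataFrame. The sketch below builds a small stand-in table and fills the dictionary by hand to show how such metadata can be read back. The pandas documentation describes attrs as experimental, so it may not survive every operation that creates a new DataFrame.

```python
import pandas as pd

# Small stand-in table; the values are the first two Age/SST pairs shown above
df_demo = pd.DataFrame({"Age": [0.31, 0.66], "SST": [30.55, 30.28]})

# attrs is a regular dictionary attached to the DataFrame
df_demo.attrs["NOAAStudyId"] = "33213"
df_demo.attrs["variables"] = list(df_demo.columns)

print(df_demo.attrs["NOAAStudyId"])
print(df_demo.attrs["variables"])
```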
To get more information about the variables, you can use the get_variables function:
df_var = ds.get_variables(dataTableIDs="45859")
df_var
StudyID | SiteID | FileURL | VariableName | cvDataType | cvWhat | cvMaterial | cvError | cvUnit | cvSeasonality | cvDetail | cvMethod | cvAdditionalInfo | cvFormat | cvShortName | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
DataTableID | |||||||||||||||
45859 | 33213 | 58697 | https://www.ncei.noaa.gov/pub/data/paleo/contr... | Site | PALEOCEANOGRAPHY | sampling metadata>sample identification | None | None | None | None | None | None | Site identification | Character | Site |
45859 | 33213 | 58697 | https://www.ncei.noaa.gov/pub/data/paleo/contr... | Hole | PALEOCEANOGRAPHY | sampling metadata>sample identification | None | None | None | None | None | None | Hole drilled at Site U1446 | Character | Hole |
45859 | 33213 | 58697 | https://www.ncei.noaa.gov/pub/data/paleo/contr... | Type | PALEOCEANOGRAPHY | sampling metadata>sample identification | None | None | None | None | None | None | H (9 m hydraulic piston core) F (4.5 m hydrau... | Character | Type |
45859 | 33213 | 58697 | https://www.ncei.noaa.gov/pub/data/paleo/contr... | Section | PALEOCEANOGRAPHY | sampling metadata>sample identification | None | None | None | None | None | None | Section number ( 1 through 7 and core catcher ... | Character | Section |
45859 | 33213 | 58697 | https://www.ncei.noaa.gov/pub/data/paleo/contr... | Core | PALEOCEANOGRAPHY | sampling metadata>sample identification | None | None | None | None | None | None | Core number | Numeric | Core |
45859 | 33213 | 58697 | https://www.ncei.noaa.gov/pub/data/paleo/contr... | Section_Depth | PALEOCEANOGRAPHY | sampling metadata>sample identification | None | None | length unit>centimeter | None | None | None | Mid-depth of sample within section | Numeric | Section_Depth |
45859 | 33213 | 58697 | https://www.ncei.noaa.gov/pub/data/paleo/contr... | Sample_Depth | CLIMATE RECONSTRUCTIONS|PALEOCEANOGRAPHY | depth variable>depth | None | None | length unit>meter | None | None | None | Spliced core composite depth below sea floor ... | Numeric | Sample_Depth |
45859 | 33213 | 58697 | https://www.ncei.noaa.gov/pub/data/paleo/contr... | Age | CLIMATE RECONSTRUCTIONS|PALEOCEANOGRAPHY | age variable>age | None | None | time unit>age unit>calendar kiloyear before pr... | None | None | None | None | Numeric | Age |
45859 | 33213 | 58697 | https://www.ncei.noaa.gov/pub/data/paleo/contr... | TEX86H | PALEOCEANOGRAPHY | chemical composition>compound>organic compound... | None | None | dimensionless | None | None | laboratory method>chromatography>liquid chroma... | index of Schouten et al. 2002; extracted lipids | Numeric | TEX86H |
45859 | 33213 | 58697 | https://www.ncei.noaa.gov/pub/data/paleo/contr... | SST | CLIMATE RECONSTRUCTIONS|PALEOCEANOGRAPHY | earth system variable>temperature variable>tem... | reconstruction material>organic compound index... | None | temperature unit>degree Celsius | None | None | None | index of Kim et al. 2010 | Numeric | SST |
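Several of the cv* columns hold hierarchical terms whose levels are separated by `>`, for instance `temperature unit>degree Celsius`. When only the most specific level is needed, the string can be split. The two functions below are hypothetical helpers written for this example.

```python
def split_term(term):
    """Split a '>'-separated controlled-vocabulary string into its levels."""
    if term is None:
        return []
    return [level.strip() for level in term.split(">")]


def leaf(term):
    """Return the most specific (last) level of a term, or None if it is empty."""
    levels = split_term(term)
    return levels[-1] if levels else None


print(leaf("temperature unit>degree Celsius"))
print(split_term("sampling metadata>sample identification"))
print(leaf(None))
```

Missing entries, shown as None in the table, simply yield None.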
Instead of the table ID, you can also pass the file URL to get the data:
df1 = ds.get_data(file_urls="https://www.ncei.noaa.gov/pub/data/paleo/contributions_by_author/clemens2021/clemens2021-u1446-mgca-noaa.txt")[0]
df1.head()
Site | Hole | Core | Type | Section | Section_Depth | Sample_Depth | Age | Mg/Ca | Analytical_Facility | Mg/Ca_SST | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | U1446 | C | 1 | H | 1 | 5 | 0.05 | 0.32 | 4.63 | Rosenthal (Rutgers) | 28.41 |
1 | U1446 | C | 1 | H | 1 | 32 | 0.32 | 0.66 | 4.62 | Rosenthal (Rutgers) | 28.46 |
2 | U1446 | C | 1 | H | 1 | 62 | 0.62 | 1.05 | 4.72 | Rosenthal (Rutgers) | 28.43 |
3 | U1446 | C | 1 | H | 1 | 91 | 0.91 | 1.42 | 4.58 | Rosenthal (Rutgers) | 28.63 |
4 | U1446 | C | 1 | H | 1 | 122 | 1.22 | 1.81 | 4.70 | Rosenthal (Rutgers) | 28.61 |
This shows how you can retrieve a dataset from NOAA and get information about it. However, in most cases, you may not know the NOAA ID. NOAA offers an API that can search across several query parameters, such as geographical extent and data type.
Let’s have a look!
Geographical Query#
Let’s start with a simple geographical query. Let’s look for all the datasets within 5°S-5°N and 109-125°E, roughly corresponding to the Indo-Pacific Warm Pool.
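The four bounds are easy to mix up, so it can help to write the box down once and check it before using it. The sketch below assumes the keyword names min_lat, max_lat, min_lon and max_lon accepted by search_studies. `make_bbox` itself is a hypothetical helper, not part of PyleoTUPS, and it deliberately rejects boxes that cross the antimeridian.

```python
def make_bbox(min_lat, max_lat, min_lon, max_lon):
    """Validate a latitude/longitude box and return it as keyword arguments."""
    if not (-90 <= min_lat <= 90 and -90 <= max_lat <= 90):
        raise ValueError("latitudes must be between -90 and 90")
    if not (-180 <= min_lon <= 180 and -180 <= max_lon <= 180):
        raise ValueError("longitudes must be between -180 and 180")
    if min_lat > max_lat:
        raise ValueError("min_lat is greater than max_lat")
    if min_lon > max_lon:
        raise ValueError("min_lon is greater than max_lon")
    return {
        "min_lat": min_lat,
        "max_lat": max_lat,
        "min_lon": min_lon,
        "max_lon": max_lon,
    }


# 5°S-5°N, 109-125°E
bbox = make_bbox(min_lat=-5, max_lat=5, min_lon=109, max_lon=125)
print(bbox)
```

The dictionary can then be unpacked into the query, for example `search_studies(**bbox)` on a Dataset object.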
Let’s create another Dataset object to store this new search:
ds2 = Dataset()
Query terms and their definitions are available through NOAA: https://www.ncei.noaa.gov/access/paleo-search/api
ds2.search_studies(max_lat=5, min_lat=-5, max_lon=125,
min_lon=109)
Parsing NOAA studies: 100%|██████████| 100/100 [00:00<00:00, 21706.28it/s]
StudyID | XMLID | StudyName | DataType | EarliestYearBP | MostRecentYearBP | EarliestYearCE | MostRecentYearCE | StudyNotes | ScienceKeywords | Investigators | Publications | Sites | Funding | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 11194 | 9632 | 1,100 Year El Niño/Southern Oscillation (ENSO)... | CLIMATE RECONSTRUCTIONS | 1050.0 | -52.0 | 900.0 | 2002.0 | An index of canonical ENSO variability for the... | [Atmospheric and Oceanic Circulation Patterns ... | Jinbao Li, Shang-Ping Xie, Edward Cook, Rosann... | [{'Author': 'Li, J., S.-P. Xie, E.R. Cook, G. ... | [[{'DataTableID': '19791', 'DataTableName': 'e... | [{'fundingAgency': 'US National Science Founda... |
1 | 22031 | 20009 | 1200 Year Atlantic Multidecadal Variability an... | CLIMATE RECONSTRUCTIONS | 1150.0 | -60.0 | 800.0 | 2010.0 | Summer (May-September) Atlantic Multidecadal V... | [Atmospheric and Oceanic Circulation Patterns ... | Jianglin Wang, Bao Yang, Fredrik Ljungqvist, J... | [{'Author': 'Jianglin Wang, Bao Yang, Fredrik ... | [[{'DataTableID': '33108', 'DataTableName': 'W... | [{'fundingAgency': 'Deutsche Forschungsgemeins... |
2 | 39047 | 80187 | 1500 Year Sedimentological and Geochemical Dat... | PALEOLIMNOLOGY | 1200.0 | -50.0 | 750.0 | 2000.0 | Elevations of lakes: Tota = 3015m; Siscunsi = ... | [Precipitation Reconstruction] | Broxton Bird, Byron Steinman, Jaime Escobar, A... | [{'Author': 'Bird, B.W., B.A. Steinman, J. Esc... | [[{'DataTableID': '51864', 'DataTableName': 'M... | [{'fundingAgency': 'Indiana University', 'fund... |
3 | 2614 | 1685 | 350 KYr Sea Level Reconstruction and Foraminif... | PALEOCEANOGRAPHY | 361500.0 | 1000.0 | -359550.0 | 950.0 | [Sea Level Reconstruction] | David Lea, Pamela Martin, Dorothy Pak, Howard ... | [{'Author': 'Lea, D.W., P.A. Martin, D.K. Pak,... | [[{'DataTableID': '4301', 'DataTableName': 'TR... | [] | |
4 | 14632 | 12613 | 700 Year El Niño/Southern Oscillation (ENSO) N... | CLIMATE RECONSTRUCTIONS | 649.0 | -55.0 | 1301.0 | 2005.0 | An index of canonical El Niño/Southern Oscilla... | [ENSO, Atmospheric and Oceanic Circulation Pat... | Jinbao Li, Shang-Ping Xie, Edward Cook, Marian... | [{'Author': 'Li, J., S.-P. Xie, E.R. Cook, G. ... | [[{'DataTableID': '24679', 'DataTableName': 'L... | [{'fundingAgency': 'US National Science Founda... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
95 | 25611 | 23814 | Eastern Equatorial Pacific d18O and d13C Data ... | PALEOCEANOGRAPHY | 24734.0 | 0.0 | -22784.0 | 1950.0 | None | [Last Glacial Maximum] | Heather Ford, Celia McChesney, Jennifer Hertzb... | [{'Author': 'Ford, H.L., C.L. McChesney, J.E. ... | [[{'DataTableID': '37453', 'DataTableName': 'O... | [{'fundingAgency': 'US National Science Founda... |
96 | 30152 | 71852 | Eastern Pacific Ocean TEX86 and Mg/Ca SST Reco... | CLIMATE RECONSTRUCTIONS | 24734.0 | 1239.0 | -22784.0 | 711.0 | Metadata are from the Temperature-12k project ... | None | Jennifer Hertzberg, Matthew Schmidt, Thomas Bi... | [{'Author': 'Kaufman, D., McKay, N., Routson, ... | [[{'DataTableID': '42637', 'DataTableName': 'M... | [] |
97 | 2624 | 1695 | Eastern Pacific Pleistocene Alkenone Data and ... | PALEOCEANOGRAPHY | 1833700.0 | 4472.0 | -1831750.0 | -2522.0 | [Sea Surface Temperature Reconstruction] | Zhonghui Liu, Timothy Herbert | [{'Author': 'Liu, Z. and T.D. Herbert', 'Title... | [[{'DataTableID': '4320', 'DataTableName': 'OD... | [] | |
98 | 19139 | 16805 | Eastern Tropical Indian Ocean 45,000 Year d18O... | PALEOCEANOGRAPHY | 45330.0 | 60.0 | -43380.0 | 1890.0 | High-resolution (~30-80 years) foraminiferal o... | [Sea Surface Temperature Reconstruction] | Mahyar Mohtadi, Matthias Prange, Delia Oppo, R... | [{'Author': 'Mahyar Mohtadi, Matthias Prange, ... | [[{'DataTableID': '29599', 'DataTableName': 'M... | [{'fundingAgency': 'Bundesministerium für Bild... |
99 | 19874 | 17739 | Eastern Tropical Pacific 1,000 Year Foraminfer... | PALEOCEANOGRAPHY | 1088.0 | -59.0 | 862.0 | 2009.0 | Oxygen isotopes and Mg/Ca data from planktonic... | [Little Ice Age (LIA), Medieval Warm Period] | Athanasios Koutavas, Thomas Marchitto, Braddoc... | [{'Author': 'Gerald T. Rustic, Athanasios Kout... | [[{'DataTableID': '30534', 'DataTableName': 'R... | [{'fundingAgency': 'US National Science Founda... |
100 rows × 14 columns
We got 100 studies, which is exactly the default value of limit, so the results may have been truncated. Let’s increase the limit to 500 and see what happens:
ds2.search_studies(max_lat=5, min_lat=-5, max_lon=125,
min_lon=109, limit=500)
Parsing NOAA studies: 100%|██████████| 383/383 [00:00<00:00, 7918.97it/s]
StudyID | XMLID | StudyName | DataType | EarliestYearBP | MostRecentYearBP | EarliestYearCE | MostRecentYearCE | StudyNotes | ScienceKeywords | Investigators | Publications | Sites | Funding | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 11194 | 9632 | 1,100 Year El Niño/Southern Oscillation (ENSO)... | CLIMATE RECONSTRUCTIONS | 1050.0 | -52.0 | 900.0 | 2002.0 | An index of canonical ENSO variability for the... | [Atmospheric and Oceanic Circulation Patterns ... | Jinbao Li, Shang-Ping Xie, Edward Cook, Rosann... | [{'Author': 'Li, J., S.-P. Xie, E.R. Cook, G. ... | [[{'DataTableID': '19791', 'DataTableName': 'e... | [{'fundingAgency': 'US National Science Founda... |
1 | 22031 | 20009 | 1200 Year Atlantic Multidecadal Variability an... | CLIMATE RECONSTRUCTIONS | 1150.0 | -60.0 | 800.0 | 2010.0 | Summer (May-September) Atlantic Multidecadal V... | [Atmospheric and Oceanic Circulation Patterns ... | Jianglin Wang, Bao Yang, Fredrik Ljungqvist, J... | [{'Author': 'Jianglin Wang, Bao Yang, Fredrik ... | [[{'DataTableID': '33108', 'DataTableName': 'W... | [{'fundingAgency': 'Deutsche Forschungsgemeins... |
2 | 39047 | 80187 | 1500 Year Sedimentological and Geochemical Dat... | PALEOLIMNOLOGY | 1200.0 | -50.0 | 750.0 | 2000.0 | Elevations of lakes: Tota = 3015m; Siscunsi = ... | [Precipitation Reconstruction] | Broxton Bird, Byron Steinman, Jaime Escobar, A... | [{'Author': 'Bird, B.W., B.A. Steinman, J. Esc... | [[{'DataTableID': '51864', 'DataTableName': 'M... | [{'fundingAgency': 'Indiana University', 'fund... |
3 | 2614 | 1685 | 350 KYr Sea Level Reconstruction and Foraminif... | PALEOCEANOGRAPHY | 361500.0 | 1000.0 | -359550.0 | 950.0 | [Sea Level Reconstruction] | David Lea, Pamela Martin, Dorothy Pak, Howard ... | [{'Author': 'Lea, D.W., P.A. Martin, D.K. Pak,... | [[{'DataTableID': '4301', 'DataTableName': 'TR... | [] | |
4 | 14632 | 12613 | 700 Year El Niño/Southern Oscillation (ENSO) N... | CLIMATE RECONSTRUCTIONS | 649.0 | -55.0 | 1301.0 | 2005.0 | An index of canonical El Niño/Southern Oscilla... | [ENSO, Atmospheric and Oceanic Circulation Pat... | Jinbao Li, Shang-Ping Xie, Edward Cook, Marian... | [{'Author': 'Li, J., S.-P. Xie, E.R. Cook, G. ... | [[{'DataTableID': '24679', 'DataTableName': 'L... | [{'fundingAgency': 'US National Science Founda... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
378 | 6400 | 2421 | Western Tropical Pacific, 21Kyr BP to present,... | PALEOCEANOGRAPHY | 21921.0 | 2.0 | -19971.0 | 1948.0 | Foraminiferal stable isotope and Mg/Ca data fo... | None | Axel Timmermann, Lowell Stott, Robert Thunell | [{'Author': 'Stott, L.D., A. Timmermann, and R... | [[{'DataTableID': '9544', 'DataTableName': 'M2... | [{'fundingAgency': 'US National Science Founda... |
379 | 16791 | 14498 | Western Ugandan Crater Lakes 1000Yr Diatom-inf... | PALEOLIMNOLOGY | 878.0 | -57.0 | 1072.0 | 2007.0 | Diatom census data and reconstructed Lake Leve... | [hydrology] | Keely Mills, David Ryves | [{'Author': 'Keely Mills and David B. Ryves ',... | [[{'DataTableID': '26997', 'DataTableName': 'K... | [{'fundingAgency': 'UK Natural Environment Res... |
380 | 8722 | 2801 | Yasuhara et al. 2009 Equatorial Atlantic ODP92... | PALEOCEANOGRAPHY | 500000.0 | 0.0 | -498050.0 | 1950.0 | None | None | Moriaki Yasuhara, Gene Hunt, Thomas Cronin, Hi... | [{'Author': 'Yasuhara, M., G. Hunt, T.M. Croni... | [[{'DataTableID': '32695', 'DataTableName': 'Y... | [] |
381 | 6266 | 2549 | de Garidel-Thoron et al. 2005 Western Pacific ... | CLIMATE RECONSTRUCTIONS | 1750000.0 | 0.0 | -1748050.0 | 1950.0 | None | [Sea Surface Temperature Reconstruction] | Franck Bassinot, Thibault de Garidel-Thoron, Y... | [{'Author': 'de Garidel-Thoron, T., Y. Rosenth... | [[{'DataTableID': '38369', 'DataTableName': 'd... | [] |
382 | 8678 | 2764 | de Garidel-Thoron et al. 2007 Warm Pool 30ka M... | PALEOCEANOGRAPHY | 30000.0 | 0.0 | -28050.0 | 1950.0 | None | [Sea Surface Temperature Reconstruction] | Corinne Sonzogni, Edouard Bard, Luc Beaufort, ... | [{'Author': 'de Garidel-Thoron, T., Y. Rosenth... | [[{'DataTableID': '34940', 'DataTableName': 'M... | [] |
383 rows × 14 columns
We have 383 studies now, well below the limit of 500, so it seems we collected all the records.
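The same reasoning can be automated: keep raising the limit until the number of rows returned is smaller than the limit. The sketch below is a hypothetical helper that takes any function mapping a limit to a result with a length. A stand-in `fake_search` plays the role of the real query so the example runs offline.

```python
def search_until_complete(search, start_limit=100, max_limit=2000):
    """Double the limit until the search returns fewer rows than the limit.

    Stops at max_limit even if the results may still be truncated.
    """
    limit = start_limit
    while True:
        rows = search(limit)
        if len(rows) < limit or limit >= max_limit:
            return rows, limit
        limit = min(limit * 2, max_limit)


# Stand-in for the real query: pretends that 383 studies match the search
def fake_search(limit):
    return list(range(min(383, limit)))


rows, used_limit = search_until_complete(fake_search)
print(len(rows), used_limit)
```

With the real API, `search` could be a small function that calls search_studies with fixed bounds and the given limit, since that method returns a DataFrame whose length is its number of rows.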
You can use the same functions as described above to retrieve information. For instance, let’s have a look at the data tables:
df_tables_geo = ds2.get_tables()
df_tables_geo.head()
DataTableID | DataTableName | TimeUnit | FileURL | Variables | FileDescription | TotalFilesAvailable | SiteID | SiteName | LocationName | Latitude | Longitude | MinElevation | MaxElevation | StudyID | StudyName | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 19791 | enso_li2011 | CE | https://www.ncei.noaa.gov/pub/data/paleo/treer... | [age, ensoi, ensovar] | NOAA Template File | 1 | 48144 | Tropical Pacific | Ocean>Pacific Ocean | -20 | 20 | None | None | 11194 | 1,100 Year El Niño/Southern Oscillation (ENSO)... |
1 | 33108 | Wang2017AMV-AMO | CE | https://www.ncei.noaa.gov/pub/data/paleo/recon... | [age, Atlantic Multidecadal Variability Index,... | Formatted Text Data File | 1 | 56395 | North Atlantic Ocean | Ocean>Atlantic Ocean>North Atlantic Ocean | 0 | 70 | None | None | 22031 | 1200 Year Atlantic Multidecadal Variability an... |
2 | 51864 | MCEOF Bird2024 | cal yr BP | https://www.ncei.noaa.gov/pub/data/paleo/paleo... | [pumacocha_1sig_hi, pumacocha_2sig_hi, quelcca... | NOAA Template File | 1 | 60174 | Circum-Andes Synthesis | Continent>South America | -13.933 | 10.712 | 3015 | 3680 | 39047 | 1500 Year Sedimentological and Geochemical Dat... |
3 | 55412 | bird2024 RecordsUsed | calendar year before present | https://www.ncei.noaa.gov/pub/data/paleo/paleo... | [] | NOAA Template File | 1 | 60174 | Circum-Andes Synthesis | Continent>South America | -13.933 | 10.712 | 3015 | 3680 | 39047 | 1500 Year Sedimentological and Geochemical Dat... |
4 | 51863 | Tota C/N, Clay Bird2024 | cal yr BP | https://www.ncei.noaa.gov/pub/data/paleo/paleo... | [Tota_avg_depth, Tota_H13_%clay, Tota_H13_cn] | NOAA Template File | 1 | 60952 | Lake Tota | Continent>South America>Colombia | 5.497382 | -72.978709 | 3015 | 3015 | 39047 | 1500 Year Sedimentological and Geochemical Dat... |
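Some of the rows above list coordinates outside the requested box, such as Lake Tota at longitude -72.98, so it can be worth filtering the table by its Latitude and Longitude columns. The sketch below uses a stand-in DataFrame. The third row is invented for illustration, and pd.to_numeric is applied on the assumption that the coordinates may arrive as text.

```python
import pandas as pd

# Stand-in for ds2.get_tables(); the first two rows reuse coordinates shown
# above, the third is a made-up site placed inside the box.
sites = pd.DataFrame(
    {
        "SiteName": ["Lake Tota", "Tropical Pacific", "Hypothetical site"],
        "Latitude": ["5.497382", "-20", "1.5"],
        "Longitude": ["-72.978709", "20", "119.0"],
    }
)

# Convert to numbers; values that cannot be parsed become NaN and are excluded
lat = pd.to_numeric(sites["Latitude"], errors="coerce")
lon = pd.to_numeric(sites["Longitude"], errors="coerce")

# Keep only sites inside 5°S-5°N, 109-125°E (bounds inclusive)
inside = lat.between(-5, 5) & lon.between(109, 125)
inside_names = sites.loc[inside, "SiteName"].tolist()
print(inside_names)
```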
Other searches are possible with the search_studies method:
?ds.search_studies
Signature:
ds.search_studies(
xml_id=None,
noaa_id=None,
data_publisher='NOAA',
data_type_id=None,
keywords=None,
investigators=None,
max_lat=None,
min_lat=None,
max_lon=None,
min_lon=None,
location=None,
publication=None,
search_text=None,
earliest_year=None,
latest_year=None,
cv_whats=None,
recent=False,
limit=100,
)
Docstring:
Search for NOAA studies using the specified parameters.
At least one parameter must be provided to perform a search. This method interfaces with
the NOAA NCEI Paleo Study Search API. Use it to filter studies based on location,
investigators, time range, keywords, and more.
Parameters
----------
xml_id : str, optional
Specify the internal XML document ID. Must be an exact match (e.g., '1840').
noaa_id : str, optional
Provide the unique NOAA Study ID as a number (e.g., '13156').
search_text : str, optional
General text search across study content. Supports wildcards (%) and logical operators (AND, OR).
Examples: 'younger dryas', 'loess AND stratigraphy'
data_publisher : by default 'NOAA'
Choose from: 'NOAA', 'NEOTOMA', or 'PANGAEA'.
Example: 'NOAA'
data_type_id : str, optional
Filter by data type. Use one or more type IDs separated by '|'.
Available IDs:
1: BOREHOLE, 2: CLIMATE FORCING, 3: CLIMATE RECONSTRUCTIONS, 4: CORALS AND SCLEROSPONGES,
6: HISTORICAL, 7: ICE CORES, 8: INSECT, 9: LAKE LEVELS, 10: LOESS,
11: PALEOCLIMATIC MODELING, 12: FIRE HISTORY, 13: PALEOLIMNOLOGY, 14: PALEOCEANOGRAPHY,
15: PLANT MACROFOSSILS, 16: POLLEN, 17: SPELEOTHEMS, 18: TREE RING,
19: OTHER COLLECTIONS, 20: INSTRUMENTAL, 59: SOFTWARE, 60: REPOSITORY
Example: '4|18'
keywords : str, optional
Use hierarchical terms separated by '>'. Separate multiple values using '|'.
Example: 'earth science>paleoclimate>paleocean>biomarkers'
investigators : str, optional
Specify one or more investigator names. Use '|' to separate multiple names.
Example: 'Wahl, E.R.|Vose, R.S.'
max_lat : float, optional
Upper bound for latitude. Must be between -90 and 90.
Example: 90
min_lat : float, optional
Lower bound for latitude. Must be between -90 and 90.
Example: -90
max_lon : float, optional
Upper bound for longitude. Must be between -180 and 180.
Example: 180
min_lon : float, optional
Lower bound for longitude. Must be between -180 and 180.
Example: -180
location : str, optional
Use region hierarchy separated by '>'.
Example: 'Continent>Africa>Eastern Africa>Zambia'
publication : str, optional
Match against publication metadata such as title, author, or citation.
Example: 'Khider'
earliest_year : int, optional
Starting year (can be negative for BCE). Used with `timeFormat` and `timeMethod`.
Example: -500
latest_year : int, optional
Ending year. Used with `timeFormat` and `timeMethod`.
Example: 2020
cv_whats : str, optional
Search using controlled vocabulary terms for measured variables.
Format: Hierarchical string using '>'
Example: 'chemical composition>compound>inorganic compound>carbon dioxide'
recent : bool, optional
Set to True to only return studies from the last two years. Results are sorted by newest.
limit : int, optional
Set to 100 by default. Limits the number of studies retrieved.
Returns
-------
pandas.DataFrame
Response DataFrame
Fills the internal `studies` attribute with structured NOAA study data.
Raises
------
ValueError
If no inputs are passed.
requests.HTTPError
If the HTTP request returned an unsuccessful status code.
Notes
-----
- At least one parameter must be specified, otherwise the API call will fail.
File: ~/Documents/GitHub/PyTUPS/pyleotups/core/Dataset.py
Type: method
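Several parameters in the docstring above take multiple values joined with `|`, for example `data_type_id='4|18'`. A small hypothetical helper can build that string from readable names. The mapping below copies a subset of the IDs listed in the docstring.

```python
# Subset of the data type IDs listed in the docstring above
DATA_TYPE_IDS = {
    "CORALS AND SCLEROSPONGES": 4,
    "ICE CORES": 7,
    "PALEOLIMNOLOGY": 13,
    "PALEOCEANOGRAPHY": 14,
    "SPELEOTHEMS": 17,
    "TREE RING": 18,
}


def join_data_types(names):
    """Return the '|'-separated ID string for a list of data type names."""
    return "|".join(str(DATA_TYPE_IDS[name.upper()]) for name in names)


type_filter = join_data_types(["Corals and Sclerosponges", "Tree Ring"])
print(type_filter)  # 4|18
```

The resulting string can then be passed as the data_type_id argument of search_studies.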