Open Scientific Data
Data Sources
Every metric on OpenButterfly comes from established scientific organizations that publish peer-reviewed, openly accessible data. Here's exactly where each dataset comes from, how to access it, and how to integrate it.
The original Keeling Curve: monthly mean atmospheric CO₂ in ppm, measured at Mauna Loa since 1958. One of the longest-running and most influential climate records. Available as plain text, CSV, or JSON. No API key needed.
📅 Monthly + Daily
🔑 No auth needed
📊 CSV / JSON / TXT
https://gml.noaa.gov/webdata/ccgg/trends/co2/co2_mm_mlo.txt
Docs → NOAA GML
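A quick way to sanity-check the file format is to parse it directly. The sketch below assumes the column order described in the file's own comment header (year, month, decimal date, monthly average, ...); verify it there before relying on it. NOAA's server may not send CORS headers, so run this server-side or behind the Worker proxy described later.

async function fetchLatestMonthlyCO2() {
  const res = await fetch('https://gml.noaa.gov/webdata/ccgg/trends/co2/co2_mm_mlo.txt');
  const text = await res.text();

  // Drop comment lines ('#') and blanks, keeping only data rows.
  const rows = text
    .split('\n')
    .map(line => line.trim())
    .filter(line => line && !line.startsWith('#'));

  // Columns per the file's header: year, month, decimal date, monthly average, ...
  const [year, month, , monthlyAverage] = rows[rows.length - 1].split(/\s+/);
  return { year: +year, month: +month, ppm: parseFloat(monthlyAverage) };
}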
ERA5 reanalysis data, global temperature anomalies, extreme event indices, sea level pressure, and more. One of the most comprehensive climate datasets available.
📅 Hourly / Daily / Monthly
🔑 Free registration
📊 NetCDF / GRIB
import cdsapi; c = cdsapi.Client(); c.retrieve('reanalysis-era5-...', {...})
Docs → Copernicus CDS
Real-time AQI data (PM2.5, PM10, O₃, NO₂, SO₂, CO) for 10,000+ cities. Free tier: 10,000 calls/month. Perfect for the air quality dashboard section.
📅 Real-time
🔑 Free API key
📊 JSON REST
GET https://api.airvisual.com/v2/city?city=Delhi&state=Delhi&country=India&key=API_KEY
Docs → IQAir API
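A minimal browser sketch for the endpoint above. It assumes the response nests readings under data.current.pollution with the US AQI in aqius; check the IQAir docs for the exact shape returned by your plan.

async function fetchCityAQI(city, state, country, apiKey) {
  const url = 'https://api.airvisual.com/v2/city' +
    `?city=${encodeURIComponent(city)}&state=${encodeURIComponent(state)}` +
    `&country=${encodeURIComponent(country)}&key=${apiKey}`;
  const res = await fetch(url);
  const json = await res.json();
  if (json.status !== 'success') throw new Error(`IQAir request failed: ${json.status}`);

  // Assumed response shape: data.current.pollution.{aqius, mainus}.
  const { aqius, mainus } = json.data.current.pollution;
  return { usAqi: aqius, mainPollutant: mainus };
}

// Example: fetchCityAQI('Delhi', 'Delhi', 'India', 'API_KEY').then(console.log);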
Aggregates real-time and historical air quality data from 15,000+ monitoring stations globally. Fully open API, no auth required for basic use. Powers many major climate dashboards.
📅 Real-time + Historical
🔑 Optional API key
📊 JSON REST
GET https://api.openaq.org/v3/locations?coordinates=28.6,77.2&radius=25000
Docs → OpenAQ
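A sketch of querying nearby stations with the endpoint above. The field names (results, name, coordinates.latitude, coordinates.longitude) are assumptions based on the v3 response; confirm them in the OpenAQ docs. If you do use a key, it is sent as a request header.

async function fetchNearbyStations(lat, lon, radiusMeters = 25000, apiKey) {
  const url = `https://api.openaq.org/v3/locations?coordinates=${lat},${lon}&radius=${radiusMeters}`;
  const res = await fetch(url, {
    headers: apiKey ? { 'X-API-Key': apiKey } : {}   // key is optional for basic use
  });
  const { results } = await res.json();

  // Assumed fields on each location record; verify against the v3 schema.
  return results.map(loc => ({
    name: loc.name,
    lat: loc.coordinates.latitude,
    lon: loc.coordinates.longitude
  }));
}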
Satellite altimetry data (TOPEX/Poseidon → Jason-3 → Sentinel-6). Global mean sea level with 60-day resolution. Used by the IPCC. Download CSV directly.
📅 60-day cycles
🔑 No auth
📊 CSV / NetCDF
https://climate.nasa.gov/vital-signs/sea-level/ → CSV download
Docs → NASA Sea Level
Arctic and Antarctic sea ice extent, area, and concentration. Daily and monthly values from 1979. Published by the National Snow and Ice Data Center (NSIDC), University of Colorado.
📅 Daily + Monthly
🔑 No auth
📊 CSV / GeoTIFF
https://nsidc.org/data/g02135/versions/3 → Direct FTP + HTTP download
Docs → NSIDC
Sea surface temperature, salinity, chlorophyll, ocean currents, and more. Global coverage with analysis and forecast products. Python toolbox (motuclient / copernicusmarine).
📅 Daily + Reanalysis
🔑 Free registration
📊 NetCDF / Zarr
import copernicusmarine; ds = copernicusmarine.open_dataset(dataset_id="...")
Docs → CMEMS
Satellite-derived SST and Degree Heating Weeks (DHW) for coral bleaching risk. WMS/WCS endpoints allow direct map tile integration. Updated twice weekly.
📅 Twice weekly
🔑 No auth
📊 NetCDF / WMS / JSON
https://coralreefwatch.noaa.gov/product/5km/data.php → NetCDF global
Docs → NOAA CRW
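Since the site already uses Leaflet for maps, a WMS overlay is the simplest integration. In the sketch below both the WMS base URL and the layer name are placeholders; take the real values from the endpoint list on the CRW data page.

// Both the WMS base URL and the layer name are placeholders; take the real values
// from the endpoint list on the CRW data page.
const crwLayer = L.tileLayer.wms('https://example.com/crw/wms', {
  layers: 'CRW_DHW_LAYER_NAME',    // placeholder: the Degree Heating Weeks layer
  format: 'image/png',
  transparent: true,
  opacity: 0.7,
  attribution: 'NOAA Coral Reef Watch'
});
crwLayer.addTo(map);               // `map` is the site's existing Leaflet map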
Real-time active fire detections from MODIS and VIIRS satellites. Data available in CSV, JSON, and GeoJSON. You can query a bounding box and get fire points updated every 3 hours.
📅 3-hour updates
🔑 Free API key (MAP_KEY)
📊 CSV / JSON / GeoJSON
GET https://firms.modaps.eosdis.nasa.gov/api/country/csv/{MAP_KEY}/VIIRS_SNPP_NRT/BRA/1
Docs → NASA FIRMS API
Tree cover loss, GLAD deforestation alerts, fire alerts, and land use data. REST API with GeoJSON support. Powers forest monitoring for 150+ countries. Requires free registration.
📅 Weekly alerts
🔑 Free API key
📊 JSON / GeoJSON / CSV
POST https://data-api.globalforestwatch.org/dataset/gfw_integrated_alerts/latest/query
Docs → GFW API
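The query endpoint expects a JSON body, and the sketch below only illustrates the POST mechanics. The sql field and the x-api-key header name are assumptions to confirm in the GFW API docs, along with the dataset's queryable fields.

async function queryIntegratedAlerts(apiKey, sql) {
  const res = await fetch(
    'https://data-api.globalforestwatch.org/dataset/gfw_integrated_alerts/latest/query',
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'x-api-key': apiKey          // assumed header name; confirm in the GFW docs
      },
      body: JSON.stringify({ sql })  // placeholder query string; see the docs for real fields
    }
  );
  return res.json();
}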
Annual tree cover gain/loss at 30-meter resolution via Google Earth Engine. The definitive global forest dataset. Access via GEE Python/JS API or pre-processed downloads.
📅 Annual
🔑 Google Earth Engine account
📊 GeoTIFF / Earth Engine
ee.Image("UMD/hansen/global_forest_change_2023_v1_11")
Docs → GLAD / GEE
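If you work in the Earth Engine Code Editor, a few lines are enough to visualize the dataset. The band names below (treecover2000, lossyear) follow the Hansen dataset description; double-check them in the Earth Engine catalog entry.

// Earth Engine Code Editor (JavaScript API): visualize tree cover and year of loss.
var gfc = ee.Image('UMD/hansen/global_forest_change_2023_v1_11');
Map.addLayer(gfc.select('treecover2000'),
  { min: 0, max: 100, palette: ['black', 'green'] }, 'Tree cover 2000');
// 'lossyear' encodes 0 for no loss and 1..23 for loss in 2001..2023.
Map.addLayer(gfc.select('lossyear').selfMask(),
  { min: 1, max: 23, palette: ['yellow', 'red'] }, 'Loss year');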
Official Brazilian government Amazon deforestation monitoring. Annual and near-real-time (DETER) alerts. Shapefile, GeoPackage, and API access. The gold standard for Amazon data.
📅 Annual + Near-real-time
🔑 No auth
📊 GeoPackage / Shapefile
https://terrabrasilis.dpi.inpe.br/en/home-page/ → API & downloads
Docs → TerraBrasilis
NASA's global surface temperature anomaly dataset from 1880 to present, updated monthly. One of the most widely cited global temperature records. Direct CSV download, no auth needed.
📅 Monthly
🔑 No auth
📊 CSV / TXT
https://data.giss.nasa.gov/gistemp/tabledata_v4/GLB.Ts+dSST.csv
Docs → NASA GISTEMP
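A sketch of turning that CSV into annual anomaly pairs for charting. It assumes the file starts with a one-line title before the header row, that the J-D column holds the January-to-December mean, and that missing values appear as ***; check the downloaded file and adjust if the layout differs. Route the request through a Worker if the browser blocks it for CORS.

async function fetchGistempAnnual() {
  const res = await fetch('https://data.giss.nasa.gov/gistemp/tabledata_v4/GLB.Ts+dSST.csv');
  const lines = (await res.text()).trim().split('\n');

  const header = lines[1].split(',');        // line 0 is a title, line 1 is the header row
  const jd = header.indexOf('J-D');          // January-December annual mean anomaly

  return lines.slice(2)
    .map(line => line.split(','))
    .filter(cols => cols[jd] && cols[jd] !== '***')   // '***' marks incomplete years
    .map(cols => ({ year: +cols[0], anomalyC: parseFloat(cols[jd]) }));
}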
Free, open-source weather API with hourly data, historical reanalysis (ERA5), and climate projections (CMIP6). No API key required for personal/open-source projects. Perfect for this site.
📅 Hourly / Historical
🔑 No auth needed
📊 JSON REST
GET https://api.open-meteo.com/v1/forecast?latitude=52.52&longitude=13.41&hourly=temperature_2m
Docs → Open-Meteo
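Fetching and reshaping the hourly data takes only a few lines, since the response returns parallel arrays under hourly (time and temperature_2m) and the API is built for direct browser use.

async function fetchHourlyTemperature(lat, lon) {
  const url = `https://api.open-meteo.com/v1/forecast?latitude=${lat}&longitude=${lon}` +
    '&hourly=temperature_2m';
  const res = await fetch(url);
  const { hourly } = await res.json();

  // `hourly.time` and `hourly.temperature_2m` are parallel arrays.
  return hourly.time.map((t, i) => ({ time: t, temperatureC: hourly.temperature_2m[i] }));
}

// Example: fetchHourlyTemperature(52.52, 13.41).then(series => console.log(series[0]));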
2.4 billion occurrence records of species globally. Track species distribution, population trends, and range shifts driven by climate change. Full REST API, no auth for read access.
📅 Continuously updated
🔑 No auth for reads
📊 JSON REST
GET https://api.gbif.org/v1/occurrence/search?taxonKey=2440447&limit=10
Docs → GBIF API
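A sketch for pulling recent occurrence points for a taxon. Field names (results, decimalLatitude, decimalLongitude, scientificName, year) follow the GBIF occurrence search response; reads need no authentication.

async function fetchOccurrences(taxonKey, limit = 100) {
  const url = `https://api.gbif.org/v1/occurrence/search?taxonKey=${taxonKey}&limit=${limit}`;
  const res = await fetch(url);
  const { results } = await res.json();

  // Keep only georeferenced records.
  return results
    .filter(o => o.decimalLatitude != null && o.decimalLongitude != null)
    .map(o => ({
      species: o.scientificName,
      lat: o.decimalLatitude,
      lon: o.decimalLongitude,
      year: o.year
    }));
}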
Our World in Data's CO₂ and greenhouse gas dataset (based on Global Carbon Project). Per-country emissions, sector breakdowns, and cumulative historical data in CSV. Widely used by researchers.
📅 Annual
🔑 No auth
📊 CSV
https://github.com/owid/co2-data/raw/master/owid-co2-data.csv
GitHub → OWID CO₂ Data
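A sketch of loading the CSV in the browser and extracting one country's series. The column names (country, year, co2) follow the dataset's codebook, and the naive comma split assumes no quoted fields; switch to a real CSV parser if that ever changes.

async function fetchCountryCO2(countryName) {
  const res = await fetch('https://github.com/owid/co2-data/raw/master/owid-co2-data.csv');
  const [headerLine, ...rows] = (await res.text()).trim().split('\n');
  const header = headerLine.split(',');
  const [iCountry, iYear, iCO2] = ['country', 'year', 'co2'].map(c => header.indexOf(c));

  return rows
    .map(line => line.split(','))                     // naive split; assumes no quoted commas
    .filter(cols => cols[iCountry] === countryName && cols[iCO2])
    .map(cols => ({ year: +cols[iYear], co2Mt: parseFloat(cols[iCO2]) }));
}

// Example: fetchCountryCO2('India').then(series => console.log(series.at(-1)));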
Deployment
1. Push to GitHub
Create a GitHub repo and push this entire project folder. Cloudflare Pages connects directly to GitHub for CI/CD. Every push to main auto-deploys.
2. Connect to Cloudflare Pages
Go to Cloudflare dashboard → Pages → Create Application → Connect to Git. Select your repo. Build command: none (static site). Output directory: / (root).
3. Add Custom Domain
In Pages settings → Custom Domains → Add openbutterfly.com. Since your domain is already on Cloudflare, DNS is configured automatically. HTTPS is free and automatic.
4. Create Cloudflare Workers for API Proxying
Use Workers to proxy calls to NOAA, NASA, etc. — this keeps your API keys secret server-side, adds caching, and avoids CORS issues. Workers are free up to 100,000 req/day.
5. Schedule Data Refreshes with Cron Triggers
Use wrangler.toml with crons = ["*/15 * * * *"] to fetch and cache data every 15 minutes into Cloudflare KV storage. Your frontend then reads from KV instead of hitting the origin API each time.
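A minimal wrangler.toml sketch for that setup. The project name, entry point, compatibility date, and KV namespace id are placeholders; the ENV_DATA binding matches the sample Worker below.

name = "openbutterfly-data"          # placeholder project name
main = "src/worker.js"               # path to the Worker shown below
compatibility_date = "2024-01-01"    # placeholder; pin to a current date

# Run the scheduled() handler every 15 minutes.
[triggers]
crons = ["*/15 * * * *"]

# KV namespace the Worker reads and writes; create it with Wrangler and paste the id here.
kv_namespaces = [
  { binding = "ENV_DATA", id = "YOUR_KV_NAMESPACE_ID" }
]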
Sample: Cloudflare Worker — NOAA CO₂ Proxy with KV Cache
export default {
  // Cron-triggered: fetch the latest weekly CO₂ reading from NOAA and cache it in KV.
  async scheduled(event, env, ctx) {
    const res = await fetch(
      'https://gml.noaa.gov/webdata/ccgg/trends/co2/co2_weekly_mlo.txt'
    );
    const text = await res.text();

    // Drop comment and blank lines, then take the most recent data row.
    const lines = text
      .split('\n')
      .map(l => l.trim())
      .filter(l => l && !l.startsWith('#'));
    const latest = lines[lines.length - 1].split(/\s+/);

    // Columns: year, month, day, decimal date, ppm, ... (see the file's header comments).
    const ppm = parseFloat(latest[4]);

    await env.ENV_DATA.put('co2_latest', JSON.stringify({
      ppm,
      date: `${latest[0]}-${latest[1]}`,
      updatedAt: new Date().toISOString()
    }), { expirationTtl: 3600 });
  },

  // HTTP handler: serve the cached value to the frontend with CORS enabled.
  async fetch(request, env) {
    const data = await env.ENV_DATA.get('co2_latest', { type: 'json' });
    return new Response(JSON.stringify(data), {
      headers: {
        'Content-Type': 'application/json',
        'Access-Control-Allow-Origin': '*',
        'Cache-Control': 'public, max-age=900'
      }
    });
  }
};
Sample: Fetching NASA FIRMS Fire Data in the Browser
async function fetchActiveFires() {
  const MAP_KEY = 'YOUR_MAP_KEY';
  // Area query: bounding box (west,south,east,north) covering the whole globe, last 1 day.
  const url = `https://firms.modaps.eosdis.nasa.gov/api/area/csv/${MAP_KEY}/VIIRS_SNPP_NRT/-180,-90,180,90/1`;

  const res = await fetch(url);
  const csv = await res.text();

  // Skip the header row, then split each CSV row into its columns.
  const rows = csv.trim().split('\n').slice(1);
  return rows.filter(Boolean).map(row => {
    const [lat, lon, bright, scan, track, date, time, sat, instrument, confidence] = row.split(',');
    return { lat: +lat, lon: +lon, confidence, date, bright: +bright };
  });
}

// Plot each detection on an existing Leaflet map.
fetchActiveFires().then(fires => {
  fires.forEach(f => {
    L.circleMarker([f.lat, f.lon], {
      radius: 4, color: '#ef4444', fillOpacity: 0.8
    }).addTo(map).bindPopup(`Fire ${f.date} — Brightness: ${f.bright}K`);
  });
});