Proportion of river basins classified as clean, slightly polluted, or polluted based on 3 primary indicators of water pollution.
The Department of Environment (DOE) is mandated to monitor the quality of environmental resources in Malaysia. As part of this mandate, the DOE monitors the water quality of rivers in Malaysia via a combination of manual and automated monitoring at over 1300 stations, covering nearly 700 rivers and 150 river basins. This dataset presents the results of this monitoring on an annual basis.
This dataset presents 3 primary water pollution indicators monitored by the DOE: Biochemical Oxygen Demand (BOD5), Ammoniacal Nitrogen (NH3-N), and Suspended Solids (SS).
River basins are classified as clean, slightly polluted, or polluted based on Water Quality Index (WQI) values for each of these 3 indicators. For a deeper understanding of the methodology, please refer to the DOE's Environmental Quality Report.
Name in Dataset | Variable | Definition |
---|---|---|
date (Date) | Date | The date in YYYY-MM-DD format, with MM-DD set to 01-01 as the data is at yearly frequency. |
basins_monitored (Integer) | Basins Monitored | The number of river basins monitored during the year. |
measure (Categorical) | Pollution Indicator | The pollution indicator measured, either Biochemical Oxygen Demand ('bod5'), Ammoniacal Nitrogen ('nh3n'), or Suspended Solids ('ss'). |
status (Categorical) | Pollution Status | The classification of the river basins based on pollution levels, either Clean ('clean'), Slightly Polluted ('slightly_polluted'), or Polluted ('polluted'). |
n_basins (Integer) | Number of Basins | The number of basins that fall under each pollution classification. |
proportion (Float) | Proportion | The number of basins divided by the total number of basins monitored, expressed as a percentage. |
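As a rough illustration of how the `proportion` column relates to `n_basins` and `basins_monitored`, the sketch below recomputes it on a small hand-made table. The numbers are invented purely to mirror the schema and are not DOE figures. The check that the three statuses sum to 100% assumes every monitored basin receives exactly one status per indicator, which the definitions above suggest but do not state outright.

```python
import pandas as pd

# Illustrative rows only: these values are made up to mirror the schema,
# they are NOT real DOE figures.
sample = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01"] * 3),
    "basins_monitored": [150, 150, 150],
    "measure": ["bod5", "bod5", "bod5"],
    "status": ["clean", "slightly_polluted", "polluted"],
    "n_basins": [30, 45, 75],
})

# proportion = n_basins / basins_monitored, expressed as a percentage
sample["proportion"] = sample["n_basins"] * 100 / sample["basins_monitored"]

# Assumption: within one year and one indicator, the three statuses
# together account for all monitored basins, so the shares sum to 100.
totals = sample.groupby(["date", "measure"])["proportion"].sum()

print(sample[["status", "n_basins", "proportion"]])
print(totals)
```

The same recomputation could be run against the real dataset as a sanity check, comparing the result with the published `proportion` column within a small tolerance.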
01 Sept 2024, 12:00
31 Dec 2024, 12:00
This data is made open under the Creative Commons Attribution 4.0 International License (CC BY 4.0). A copy of the license is available here.
Full Dataset (CSV)
Recommended for individuals seeking an Excel-friendly format.
Full Dataset (Parquet)
Recommended for data scientists seeking to work with data via code.
Connect directly to the data with Python.
# If not already installed, do: pip install pandas fastparquet
import pandas as pd
# Public Parquet file for this dataset
URL_DATA = 'https://storage.data.gov.my/environment/water_pollution_basin.parquet'
df = pd.read_parquet(URL_DATA)
# Parse the date column as datetime when it is present
if 'date' in df.columns:
    df['date'] = pd.to_datetime(df['date'])
print(df)
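The dataset is in long format, with one row per year, indicator, and status, so a common next step is to reshape it into one row per year for a single indicator. The sketch below shows one way to do that. The helper `status_by_year` is a hypothetical name, not part of any official tooling, and the `demo` frame holds invented values standing in for the real `df`.

```python
import pandas as pd

def status_by_year(df: pd.DataFrame, measure: str) -> pd.DataFrame:
    """Return one row per date and one column per status for one indicator."""
    subset = df[df["measure"] == measure]
    return subset.pivot(index="date", columns="status", values="proportion")

# Invented values that follow the documented schema (not real DOE figures)
demo = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01"] * 3 + ["2021-01-01"] * 3),
    "measure": ["ss"] * 6,
    "status": ["clean", "slightly_polluted", "polluted"] * 2,
    "proportion": [50.0, 30.0, 20.0, 55.0, 25.0, 20.0],
})

wide = status_by_year(demo, "ss")
print(wide)
```

With the real data, `status_by_year(df, "bod5")` would give a table suitable for plotting the share of clean, slightly polluted, and polluted basins over time.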
The following Python code is an example of an API query that retrieves the dataset described above. The same request can be made from other programming languages. For a complete guide on possible query parameters and syntax, please refer to the official Open API Documentation.
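import pprint
import requests
# Request the catalogue entry for this dataset
url = "https://api.data.gov.my/data-catalogue?id=water_pollution_basin&limit=3"
response = requests.get(url=url, timeout=30)
response.raise_for_status()  # stop early on an HTTP error
response_json = response.json()
pprint.pprint(response_json)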
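When the dataset id or the limit needs to vary, it can be cleaner to build the query string from a dictionary than to edit the URL by hand. The sketch below uses only the two parameters that appear in the example above (`id` and `limit`). The helper `build_catalogue_url` is a hypothetical name introduced for illustration, and any other parameters should be checked against the Open API Documentation.

```python
from urllib.parse import urlencode

# Endpoint taken from the example query above
BASE_URL = "https://api.data.gov.my/data-catalogue"

def build_catalogue_url(dataset_id: str, limit: int) -> str:
    """Build a data-catalogue query URL from the dataset id and a row limit."""
    query = urlencode({"id": dataset_id, "limit": limit})
    return f"{BASE_URL}?{query}"

url = build_catalogue_url("water_pollution_basin", 3)
print(url)
```

The resulting string can be passed to `requests.get` in the same way as the hard-coded URL.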
Department of Statistics Malaysia
© 2024 Public Sector Open Data