Percentile-resolution household income data at national level.
This data is based on Household Income Surveys (HIS) and Household Income & Expenditure Surveys (HIES) carried out from 1970 to 2022. The survey is carried out at least twice in any rolling five-year period to produce representative data on income, poverty and access to basic amenities for Malaysian households. In particular, this dataset provides deeper insight into the distribution of household income at national level by tabulating key household income metrics by percentile (100 equally sized groups), rather than the typically published decile (10 groups) or quartile (4 groups). The dataset provides 4 metrics on monthly household income for each percentile: the mean, median, minimum and maximum.
This data should be used with caution as the relative standard errors are larger due to the higher resolution. For a full account of the relative standard errors for the survey as well as the detailed survey methodology, please refer to the HIES Technical Notes.
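Since the caution above ties the larger relative standard errors to the higher resolution, one option is to aggregate percentiles back into coarser groups before drawing conclusions. The following is a minimal sketch, not an official method: it uses a synthetic table with made-up income values in place of the real data, and it assumes that, because the percentiles are equally sized groups, a decile mean can be taken as the simple average of its ten percentile means.

```python
import pandas as pd

# Synthetic stand-in for the real dataset: one year, 100 percentiles, mean only.
# The income values are invented purely for illustration.
df = pd.DataFrame({
    "date": pd.Timestamp("2022-01-01"),
    "percentile": range(1, 101),
    "variable": "mean",
    "income": [1000 + 100 * p for p in range(1, 101)],
})

means = df[df["variable"] == "mean"].copy()

# Percentiles 1-10 map to decile 1, 11-20 to decile 2, ..., 91-100 to decile 10.
means["decile"] = (means["percentile"] - 1) // 10 + 1

# Equal-sized percentiles mean the decile mean is the plain average
# of the ten percentile means that fall inside it.
decile_means = (
    means.groupby(["date", "decile"])["income"].mean().reset_index()
)
print(decile_means)
```

The same grouping idea does not carry over directly to the median, which cannot be recovered by averaging percentile medians.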
Name in Dataset | Variable | Definition |
---|---|---|
date (Date) | Date | The date in YYYY-MM-DD format, with MM-DD set to 01-01 as this is annual data |
percentile (Integer) | Percentile | Number from 1 to 100 |
variable (Categorical) | Variable | Either mean, median, minimum or maximum, as explained in the methodology |
income (Integer, RM) | Income | Gross monthly household income, in RM. Values for the minimum for the 1st percentile and maximum for the 100th percentile have been nulled due to identifiability concerns. |
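The table above describes a long format, with one row per date, percentile and variable. As an illustration, the sketch below reshapes such a table into one row per percentile with a column for each metric. The sample rows and income values are made up; only the column names and the nulled minimum and maximum at the extremes follow the definitions above.

```python
import pandas as pd

# Made-up sample rows in the dataset's long format (values are not real data).
long_df = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-01"] * 8),
    "percentile": [1, 1, 1, 1, 100, 100, 100, 100],
    "variable": ["minimum", "mean", "median", "maximum"] * 2,
    # Minimum of the 1st percentile and maximum of the 100th are nulled.
    "income": [None, 700, 720, 900, 30000, 45000, 40000, None],
})

# One row per (date, percentile), one column per metric.
wide = long_df.pivot(
    index=["date", "percentile"], columns="variable", values="income"
)

# Order the metric columns from lowest to highest for readability.
wide = wide[["minimum", "median", "mean", "maximum"]]
print(wide)
```

The nulled values appear as NaN after the reshape, so any calculation that spans the 1st or 100th percentile needs to handle missing values explicitly.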
28 Jul 2023, 12:00
N/A
This data is made open under the Creative Commons Attribution 4.0 International License (CC BY 4.0). A copy of the license is available here.
Full Dataset (CSV)
Recommended for individuals seeking an Excel-friendly format.
Full Dataset (Parquet)
Recommended for data scientists seeking to work with data via code.
Connect directly to the data with Python.
# If not already installed, do: pip install pandas fastparquet
import pandas as pd
URL_DATA = 'https://storage.dosm.gov.my/hies/hies_malaysia_percentile.parquet'
df = pd.read_parquet(URL_DATA)
# Parse the date column into datetimes if it is present
if 'date' in df.columns:
    df['date'] = pd.to_datetime(df['date'])
print(df)
The following code is an example of how to make an API query to retrieve the data catalogue mentioned above. You can use different programming languages by switching the code accordingly. For a complete guide on possible query parameters and syntax, please refer to the official Open API Documentation.
import requests
import pprint
url = "https://api.data.gov.my/data-catalogue?id=hies_malaysia_percentile&limit=3"
response_json = requests.get(url=url).json()
pprint.pprint(response_json)
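The query string in the example above can also be assembled from a dictionary, which makes it easier to change the dataset or the row limit. This is a small sketch using only the `id` and `limit` parameters shown above; `build_catalogue_url` is a hypothetical helper name, and any other parameters should be checked against the Open API Documentation.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.data.gov.my/data-catalogue"


def build_catalogue_url(dataset_id: str, limit: Optional[int] = None) -> str:
    """Hypothetical helper: build a data-catalogue query URL."""
    params = {"id": dataset_id}
    if limit is not None:
        params["limit"] = limit
    # urlencode escapes the values and joins them with '&'.
    return f"{BASE_URL}?{urlencode(params)}"


url = build_catalogue_url("hies_malaysia_percentile", limit=3)
print(url)
```

The resulting URL can be passed to `requests.get` exactly as in the example above.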
Department of Statistics Malaysia
© 2024 Public Sector Open Data