Repository of University of Nova Gorica


Title:Measuring data quality across open government datasets
Authors:ID Gupta, Rajan (Author)
ID Yadav, Sushmita (Author)
ID Prasad, Avinash (Author)
ID Pal, Saibal K. (Author)
Files: This document has no files that are freely available to the public. A physical copy may be held in the library of the organization; check the status via COBISS.
Language:English
Work type:Unknown
Typology:1.08 - Published Scientific Conference Contribution
Organization:UNG - University of Nova Gorica
Abstract:Data quality has become the foundation of any analytical operation or modelling. Poor data quality can lead to poor analytical modelling, which in turn can lead to poor decisions and predictions, and ultimately affect the revenue and operation of an organization. This is true for both public and private sector organizations. With the rise of e-governance, many nations and their public sector units are making use of publicly available datasets. But are these datasets reliable and of good quality? This is the main research question studied in this paper. The study collected publicly available datasets from Open Government Data platforms across 8 nations around the world. More than 300 datasets with roughly 3.5 million rows were assessed on various data quality measures. The parameters studied included valid data types, correctness, completeness, statistical features, variability, comparability, duplication and the like. A script was written in R to compute the value of each measure. Different countries were found to have advantages on different parameters; no single country achieved high quality on all of them. The ranges observed for each parameter across the datasets were helpful in determining the overall quality of a dataset. The results can help public and private sector organizations assess the quality of the datasets they intend to work on. Substantial effort and resources can be saved in advanced analytics if the quality of a dataset can be determined in advance. The proposed data quality assessment model can be applied to any private or public dataset. Different industries and organizations can set different threshold values for the parameters to benchmark their analytical processes. Both practitioners and researchers can benefit from this research.
Keywords:data quality assessment, open government datasets, e-governance, data quality measures
Year of publishing:2019
Number of pages:pp. 442-451
PID:20.500.12556/RUNG-6420
COBISS.SI-ID:58258435
UDC:004
NUK URN:URN:SI:UNG:REP:W9RJVDDB
Publication date in RUNG:05.04.2021
Views:2094
Downloads:0
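
The abstract names completeness, valid data types and duplicate rows among the measures checked. The authors' R script is not available with this record, so the following is only a minimal Python sketch of how three such measures might be computed. The function name `quality_report`, the set of missing-value markers, the metric definitions and the sample rows are all illustrative assumptions, not the paper's method or data.

```python
from collections import Counter

# Assumed markers for a missing cell; the paper does not specify these.
MISSING = {"", "NA", "N/A", "null", "None"}


def is_number(value):
    """Return True if the string can be parsed as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def quality_report(rows, numeric_columns=()):
    """Compute three illustrative quality measures.

    rows: list of dicts mapping column name -> string cell value.
    numeric_columns: columns expected to hold numbers (an assumption
    supplied by the caller, not inferred from the data).
    """
    total_rows = len(rows)
    columns = list(rows[0].keys()) if rows else []
    total_cells = total_rows * len(columns)

    # Completeness: share of cells that are not missing.
    missing = sum(
        1 for row in rows for col in columns if row[col].strip() in MISSING
    )
    completeness = 1 - missing / total_cells if total_cells else 0.0

    # Duplicate ratio: share of rows that repeat an earlier identical row.
    counts = Counter(tuple(row[col] for col in columns) for row in rows)
    duplicate_rows = sum(n - 1 for n in counts.values())
    duplicate_ratio = duplicate_rows / total_rows if total_rows else 0.0

    # Type validity: share of non-missing cells that parse as numbers.
    type_validity = {}
    for col in numeric_columns:
        present = [row[col] for row in rows if row[col].strip() not in MISSING]
        valid = sum(1 for v in present if is_number(v))
        type_validity[col] = valid / len(present) if present else 0.0

    return {
        "completeness": completeness,
        "duplicate_ratio": duplicate_ratio,
        "type_validity": type_validity,
    }


# Made-up sample rows, for illustration only.
sample = [
    {"region": "A", "count": "120"},
    {"region": "B", "count": ""},
    {"region": "A", "count": "120"},
    {"region": "C", "count": "unknown"},
]
report = quality_report(sample, numeric_columns=["count"])
print(report)
```

An organization could compare each value against its own threshold, in the spirit of the benchmarking the abstract describes; the thresholds themselves would have to be chosen per use case.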

Record is a part of a monograph

Title:Facets of business excellence in IT : proceedings of the International Conference on Facets of Business Excellence
Editors:Renato Pereira
Place of publishing:New Delhi [etc.]
Publisher:Bloomsbury Prime
Year of publishing:2019
ISBN:978-93-88630-06-1
COBISS.SI-ID:58258179
