Repository of University of Nova Gorica

Title:Measuring data quality across open government datasets
Authors:Gupta, Rajan (Author)
Yadav, Sushmita (Author)
Prasad, Avinash (Author)
Pal, Saibal K. (Author)
Files:This document has no files. A physical copy may be available in the organization's library; check the status via COBISS.
Language:English
Work type:Unknown ()
Typology:1.08 - Published Scientific Conference Contribution
Organization:UNG - University of Nova Gorica
Abstract:Data quality has become the foundation of any analytical operation or modelling. Poor-quality data can lead to poor analytical models, which in turn lead to poor decisions and predictions and can ultimately affect an organization's revenue and operations. This holds for both public and private sector organizations. With the rise of e-governance, many nations and their public sector units are making use of publicly available datasets. But are these datasets reliable and of good quality? This is the major research question studied in this paper. The study collected publicly available datasets from Open Government Data platforms across 8 nations around the world. More than 300 datasets, comprising roughly 3.5 million rows, were assessed on various data quality measures. The parameters studied included valid data types, correctness, completeness, statistical features, variability, comparability, duplication, and the like. A script was written in R to compute the value of each measure. Different countries were found to have advantages on different parameters; no single country scored high on all of them. The range of values found for each parameter was helpful in determining the overall quality of a dataset. These results will help public and private sector organizations assess the quality of the datasets they intend to work with. Substantial effort and resources can be saved in advanced analytics if the quality of a dataset can be determined in advance. The proposed data quality assessment model can be applied to any private or public dataset. Different industries and organizations can set different threshold values for the parameters to benchmark their analytical processes. Both practitioners and researchers can benefit from this research work.
Keywords:data quality assessment, open government datasets, e-governance, data quality measures
Year of publishing:2019
Number of pages:pp. 442-451
COBISS_ID:58258435
UDC:004
URN:URN:SI:UNG:REP:W9RJVDDB
Views:506
Downloads:0

Record is a part of a monograph

Title:Facets of business excellence in IT
Subtitle:proceedings of the International Conference on Facets of Business Excellence
Publisher:Bloomsbury Prime
ISBN:978-93-88630-06-1
COBISS.SI-ID:58258179
Place of publishing:New Delhi [etc.]
Year of publishing:2019
Editors:Renato Pereira
