
Import data as per specifications

Data Scientist

About

This unit is about using a variety of techniques to import data into datasets or data frames.

Scope

This unit covers two areas: defining the data type and sources, and acquiring the data.

Define data type and sources
  • identify the objective of the analysis
  • define the type of data to be imported
  • define the volume of data to be imported
  • define the key variables to be imported
  • identify suitable sources for the data
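The decisions listed above can be written down before any data is imported. The following is a minimal sketch only: the `ImportSpec` class, its field names, and the sample values are hypothetical and are not part of this unit.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImportSpec:
    """Hypothetical record of the decisions made before importing data."""
    objective: str        # what the analysis is meant to answer
    data_type: str        # e.g. "quantitative" or "qualitative"
    expected_rows: int    # rough volume of data to be imported
    key_variables: List[str] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of any fields that are still empty."""
        return [name for name, value in vars(self).items() if not value]


# Illustrative values only.
spec = ImportSpec(
    objective="Estimate monthly customer churn",
    data_type="quantitative",
    expected_rows=50_000,
    key_variables=["customer_id", "signup_date", "churned"],
    sources=[],  # no source identified yet
)
print(spec.missing_fields())  # ['sources']
```

A check like `missing_fields()` makes it visible when an import is about to start without a defined source or set of key variables.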
Acquire the data
  • perform operations to acquire the data and store it in datasets or data frames
  • populate metadata for the imported data
  • validate imported data using appropriate tools & processes
  • validate the desired output with the relevant stakeholders within the organization, if required
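The acquisition steps above might look like the following with pandas. This is a sketch under stated assumptions: the sample data, the `metadata` dictionary layout, and the `validate` helper are illustrative and not prescribed by this unit. Only `pandas.read_csv` and standard data frame attributes are library calls.

```python
import io

import pandas as pd

# Illustrative in-memory CSV standing in for a real source file.
raw_csv = io.StringIO(
    "customer_id,signup_date,churned\n"
    "1,2023-01-05,0\n"
    "2,2023-02-11,1\n"
    "3,2023-03-20,0\n"
)

# Acquire: read the data into a data frame.
df = pd.read_csv(raw_csv, parse_dates=["signup_date"])

# Populate metadata: kept here in a plain dict alongside the data frame.
metadata = {
    "source": "in-memory example",
    "row_count": len(df),
    "columns": list(df.columns),
    "dtypes": {col: str(dtype) for col, dtype in df.dtypes.items()},
}


def validate(frame, required_columns):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    missing = [c for c in required_columns if c not in frame.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    if frame.empty:
        problems.append("no rows imported")
    present = [c for c in required_columns if c in frame.columns]
    if present and frame[present].isna().any().any():
        problems.append("null values in required columns")
    return problems


problems = validate(df, ["customer_id", "signup_date", "churned"])
print(metadata["row_count"], problems)
```

Returning a list of problems, rather than raising on the first one, makes it easier to share the full validation result with stakeholders.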

Required Knowledge & Understanding

Technical Skills
  • the difference between various types of data. For example: enterprise vs consumer data, qualitative vs quantitative data, processed vs unprocessed data
  • different statistical analysis software, packages, libraries and tools that can be used to import and validate data, such as R or pandas
  • different functions to read data from various file formats and import it to a dataset or dataframe
  • the metadata associated with imported data and how to populate it
  • how to store and retrieve information
  • how to work on various operating systems, such as Linux (for example, Ubuntu) or Windows
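As an illustration of reading different file formats into a dataset, the sketch below picks a reader based on the file extension, using only the Python standard library. The helper names (`read_csv_records`, `read_json_records`, `load_records`) and the sample files are hypothetical. pandas offers comparable functions such as `read_csv` and `read_json` that return data frames.

```python
import csv
import json
import tempfile
from pathlib import Path


def read_csv_records(path):
    """Read a CSV file into a list of dicts, one per row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def read_json_records(path):
    """Read a JSON file and return its parsed content."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Map each supported extension to the function that reads it.
READERS = {".csv": read_csv_records, ".json": read_json_records}


def load_records(path):
    """Choose a reader from the file extension and return the records."""
    path = Path(path)
    try:
        reader = READERS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"unsupported file format: {path.suffix}") from None
    return reader(path)


# Write two small sample files, then read them back.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "sample.csv"
    csv_path.write_text("id,name\n1,Ada\n2,Grace\n", encoding="utf-8")
    json_path = Path(tmp) / "sample.json"
    json_path.write_text('[{"id": 1, "name": "Ada"}]', encoding="utf-8")

    csv_rows = load_records(csv_path)
    json_rows = load_records(json_path)

print(csv_rows)
print(json_rows)
```

Note that the CSV reader returns every value as a string, while the JSON reader preserves numbers, so type conversion is a typical validation step after import.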
Soft Skills
  • Reading Skills: follow instructions, guidelines, procedures, rules and service level agreements
  • Analytical Thinking: analyse the impact of the various actions performed and disseminate relevant information to others
  • Attention to Detail: check that your work is complete and free from errors