
Interview Questions on Data Types & Measurements

  • July 14, 2025

Meet the Author: Mr. Bharani Kumar

Bharani Kumar Depuru is a well-known IT personality from Hyderabad. He is the Founder and Director of AiSPRY and 360DigiTMG. An alumnus of IIT and ISB with more than 18 years of experience, he has held prominent positions at leading firms such as HSBC, ITC Infotech, Infosys, and Deloitte. He is an IT consultant specializing in Industrial Revolution 4.0 implementation, Data Analytics practice setup, Artificial Intelligence, Big Data Analytics, Industrial IoT, Business Intelligence, and Business Management. Bharani Kumar is also the chief trainer at 360DigiTMG, where he has spent more than ten years making the transition into IT easier for his students. 360DigiTMG is at the forefront of delivering quality education, thereby bridging the gap between academia and industry.


Why Understanding Data Types is Essential in Data Science

In data science, understanding the types of data is crucial for developing reliable models and making informed decisions. From simple datasets to complex, high-volume information sets, the ability to classify and analyze data is an essential skill. The structure of data shapes how it is used: structured data is organized in tables with a specific type of information in each column, while unstructured data is raw information that usually needs more processing before analysis but can reveal insights that tabular data alone would miss.

This blog explores some of the foundational concepts related to types and scales of data, including the differences between nominal, ordinal, continuous, and discrete data. It also highlights the significance of dummy variables and provides an overview of how these concepts play a critical role in data science. Once you grasp these building blocks, you'll be better equipped to analyze data and make confident, data-driven decisions.

1. What is Nominal Data?

Nominal data consists of category names that have no natural order among them.

It takes values from a limited set of entries.

Nominal data is usually stored as strings or other text.

Most ML algorithms expect numeric input, so nominal data is typically converted into dummy variables (one 0/1 indicator column per category) before modeling.
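
The dummy variable idea can be sketched in a few lines of plain Python. This is only an illustrative sketch: the `one_hot` helper and the `colors` sample are hypothetical names, not part of any library. In practice, a library routine such as pandas `get_dummies` is commonly used for the same purpose.

```python
def one_hot(values):
    """Return (column_names, rows); each row holds one 0/1 indicator per category."""
    # Sorting only fixes a stable column order; it does not imply any ranking.
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


colors = ["red", "green", "blue", "green"]  # hypothetical nominal column
columns, encoded = one_hot(colors)

print(columns)  # ['blue', 'green', 'red']
print(encoded)  # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Each row contains exactly one 1, so no artificial order is introduced between the categories.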

2. What is Ordinal Data?

Ordinal data consists of categories that have an inherent order.

Ordinal data has to be converted to a numeric equivalent using an encoding technique that preserves the order.

Eg: Shirt size: S, M, L, XL, XXL; Customer rating: Poor, Fair, Good, Excellent. (Airport gate numbers such as 1, 2, 3, 4 look ordered, but they are only labels, so they are better treated as nominal.)

The levels are consistent in direction, but the gaps between them need not be equal in magnitude, so the difference between two levels has no quantitative meaning.

We can count the items in each category and rank the items in order.
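
As a rough sketch of the count and rank operations, the shirt-size example can be encoded with an explicit order mapping. The names `SIZE_ORDER`, `SIZE_RANK`, and the `shirts` sample are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Order taken from the shirt-size example; the integer ranks are an assumption
# that only preserves order, not the size of the gap between levels.
SIZE_ORDER = ["S", "M", "L", "XL", "XXL"]
SIZE_RANK = {size: rank for rank, size in enumerate(SIZE_ORDER, start=1)}

shirts = ["L", "S", "XL", "M", "S"]  # hypothetical sample

encoded = [SIZE_RANK[s] for s in shirts]    # numeric equivalent
counts = Counter(shirts)                    # 'Count' operation
ranked = sorted(shirts, key=SIZE_RANK.get)  # 'Rank' operation

print(encoded)  # [3, 1, 4, 2, 1]
print(ranked)   # ['S', 'S', 'M', 'L', 'XL']
```

Sorting by the mapping, rather than alphabetically, is what keeps XL after L.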


3. What is Interval Data?

Interval scales are numeric scales on which we know both the order of the values and the exact difference between them.

The difference between two levels is meaningful and consistent across the scale.

There is no natural zero (no absolute zero). For example, a temperature of 0 °C does not mean there is no temperature.

Eg: Clock time, Temperature (in Celsius or Fahrenheit), Date, and IQ level.

We can perform addition and subtraction, but ratios are not meaningful: 20 °C is not "twice as hot" as 10 °C.
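
A small sketch can show why subtraction is meaningful on an interval scale while division is not. The helper `c_to_f` and the two sample temperatures are illustrative assumptions.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


a_c, b_c = 10.0, 20.0                 # hypothetical readings in Celsius
a_f, b_f = c_to_f(a_c), c_to_f(b_c)   # the same readings in Fahrenheit

diff_c = b_c - a_c    # 10.0 degrees C
diff_f = b_f - a_f    # 18.0 degrees F, i.e. the same gap rescaled by 9/5
ratio_c = b_c / a_c   # 2.0
ratio_f = b_f / a_f   # 1.36, so "twice as hot" depends on the unit

print(diff_c, diff_f, ratio_c, ratio_f)
```

The difference survives the change of unit (up to a constant factor), but the ratio does not, because neither scale has a true zero.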

4. What is Ratio Data?

Ratio data is very much like interval data: the values must be numerical, and the difference between points is standardized and meaningful.

Unlike interval data, ratio data must have a true zero, meaning that zero represents the complete absence of the quantity. As a result, ratio variables such as height or weight typically do not take negative values.

Eg: Height, Weight, etc. If we have zero money, it means there is no money.

We can perform mathematical operations such as Addition, Subtraction, Multiplication, and Division.
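
For contrast with interval data, here is a minimal sketch showing that ratios on a ratio scale do not depend on the unit. The sample weights are hypothetical, and `KG_TO_LB` is an approximate conversion factor.

```python
import math

KG_TO_LB = 2.20462  # approximate kilograms-to-pounds factor

w1_kg, w2_kg = 40.0, 80.0  # hypothetical weights
w1_lb, w2_lb = w1_kg * KG_TO_LB, w2_kg * KG_TO_LB

ratio_kg = w2_kg / w1_kg
ratio_lb = w2_lb / w1_lb

# Because zero means "no weight" in both units, the ratio is preserved.
print(math.isclose(ratio_kg, ratio_lb))  # True
```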

5. What is a Factor?

A factor is a categorical variable that can take only a limited set of values, called levels.

For example: 'Gender' is a variable that can take two levels - 'Male' & 'Female'.

Another example is 'Month', which can take '12' levels - Jan, Feb, Mar,....., Dec.
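
One way to picture a factor is as a fixed list of levels plus an integer code for each observation. The sketch below assumes a hypothetical helper named `as_factor`; it is not an existing library function.

```python
def as_factor(values, levels):
    """Map each value to the index of its level; reject values outside the levels."""
    index = {level: code for code, level in enumerate(levels)}
    unknown = sorted(set(values) - set(levels))
    if unknown:
        raise ValueError(f"values outside the declared levels: {unknown}")
    return [index[v] for v in values]


MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]  # the 12 levels

codes = as_factor(["Mar", "Jan", "Dec"], MONTHS)
print(codes)  # [2, 0, 11]
```

Declaring the levels up front means a typo such as "Janu" is caught immediately instead of silently becoming a 13th category.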

6. What are the broad classifications of data types?

Broadly speaking data can be classified as Continuous data and Discrete data.

Discrete data can be further classified as Categorical data and Count data.

Categorical data is further classified as Binary categorical data and Multiple categorical data.

Multiple categorical data is further classified as Nominal data and Ordinal data.

Continuous data & Count data are considered Quantitative data, whereas Categorical data is considered as Qualitative data.

7. Differences Between Structured, Semi-Structured, and Unstructured Data

| Aspect | Structured Data | Semi-Structured Data | Unstructured Data |
|---|---|---|---|
| Definition | Data organized in a neat tabular format with rows and columns, adhering to a fixed schema. | Data that has elements of structure but does not conform strictly to tabular formats or a rigid schema. | Data that lacks any predefined structure, stored in a raw and unorganized format. |
| Schema | A well-defined and rigid schema ensures consistent organization and formatting. | A flexible schema, allowing variations in data organization. | No schema; data organization is free-form and undefined. |
| Storage | Stored in relational databases (e.g., MySQL, PostgreSQL). | Stored in formats like XML or JSON, or in NoSQL databases (e.g., MongoDB). | Stored in file systems or specialized platforms for managing multimedia, logs, or documents. |
| Processing | Easy to query and analyze using SQL or BI tools. | Requires specialized tools to parse and extract meaningful information (e.g., parsing XML or JSON). | Requires advanced tools like natural language processing (NLP), image recognition, or video analysis. |
| Examples | Customer details in a relational database (Name, Age, Address). | JSON data storing user profiles. XML files defining product configurations. | Images, videos, audio recordings. Social media posts or emails. |
| Advantages | Highly organized, easy to store and retrieve. Efficient for analysis using querying. | Combines flexibility with some structure, making it adaptable for evolving data formats. | Can capture rich, diverse information like multimedia, enabling a broader scope of analysis. |
| Disadvantages | Tied to traditional query languages. Lacks flexibility for non-tabular data. Limited to predefined fields and formats. | Processing can be more complex than structured data. Requires tools for extraction and parsing. | Difficult to process and analyze. Requires significant storage and computational resources. |
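
To make the parsing step for semi-structured data concrete, the sketch below flattens JSON user profiles into fixed-column rows of the kind a relational table would hold. The records and field names are invented for illustration.

```python
import json

# Hypothetical semi-structured input: the second profile has no address.
raw = '''[
  {"name": "Asha", "age": 31, "address": {"city": "Pune"}},
  {"name": "Ravi", "age": 28}
]'''

profiles = json.loads(raw)

# Flatten into a fixed schema (name, age, city); missing fields become None.
rows = [
    {
        "name": p.get("name"),
        "age": p.get("age"),
        "city": p.get("address", {}).get("city"),
    }
    for p in profiles
]

print(rows[0])  # {'name': 'Asha', 'age': 31, 'city': 'Pune'}
print(rows[1])  # {'name': 'Ravi', 'age': 28, 'city': None}
```

The flexible schema shows up as the missing `address` key; the structured version has to decide how to represent that gap.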

8. Difference Between Big Data and Non-Big Data

| Aspect | Big Data | Non-Big Data |
|---|---|---|
| Definition | Data that cannot be stored or processed using traditional storage and hardware/software. | Data that can be stored and processed using traditional tools. |
| Key Characteristics | Defined by the 5 Vs: Velocity, Veracity, Volume, Variety, and Value. | Does not rely on the 5 Vs; simpler in nature. |
| Examples | Data handled on large-scale analytics platforms like Hadoop and Spark. | Data held in small-scale SQL databases or spreadsheets. |

9. Differences Between Cross-Sectional, Time Series, and Longitudinal Data

| Aspect | Cross-Sectional Data | Time Series Data | Longitudinal/Panel Data |
|---|---|---|---|
| Time Sequence | Sequence based on time is not important. | Sequence based on time is important. | Sequence based on time is important. |
| Variables | Contains multiple variables observed at a single point in time. | Often a single variable tracked over time, although multivariate series also exist. | Contains multiple variables or entities tracked over time. |
| Examples | Predicting loan defaulters using age, income, and gender data. | Predicting monthly, weekly, or daily sales trends. | Predicting sales across various countries over time. |
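
The three layouts can be pictured as differently keyed data structures. The sketch below uses invented numbers, and the helper `series_for` is a hypothetical name used only to show how a panel contains one time series per entity.

```python
# Cross-sectional: one record per applicant, no time axis.
cross_section = [
    {"age": 34, "income": 52000, "gender": "F"},
    {"age": 45, "income": 38000, "gender": "M"},
]

# Time series: one variable keyed by time; order matters.
monthly_sales = {"2024-01": 120, "2024-02": 135, "2024-03": 128}

# Panel: values keyed by (entity, time).
panel = {
    ("India", "2024-01"): 70, ("India", "2024-02"): 75,
    ("Malaysia", "2024-01"): 50, ("Malaysia", "2024-02"): 60,
}


def series_for(panel_data, entity):
    """Extract the time series of a single entity from panel data."""
    return {t: v for (e, t), v in sorted(panel_data.items()) if e == entity}


print(series_for(panel, "India"))  # {'2024-01': 70, '2024-02': 75}
```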

10. Differences Between Balanced and Imbalanced/Rare Datasets

| Aspect | Balanced Dataset | Imbalanced Dataset |
|---|---|---|
| Categorical (Binary) | Classes are evenly represented (e.g., 50% Default, 50% Not Default). | As a rule of thumb, one class makes up less than 30% of the records (e.g., 29% Default, 71% Not Default). |
| Categorical (Multiple) | All classes have approximately equal representation. | One or more classes have significantly less or more representation than the others. |
| Continuous Data | Values are roughly symmetric, e.g., approximately normally distributed. | Values may be skewed, bimodal, or otherwise non-normal. |
| Examples | Balanced classes in customer satisfaction (50% satisfied, 50% unsatisfied). | Handwritten digit recognition where the digit '1' makes up only 2% of the samples while another class makes up 10%. |
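
A quick way to check balance is to compute each class's share of the labels. The sketch below applies the 30% rule of thumb from the table as a default threshold; the function names `class_shares` and `is_imbalanced` are hypothetical, and the threshold is a convention rather than a fixed standard.

```python
from collections import Counter


def class_shares(labels):
    """Return the fraction of records belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def is_imbalanced(labels, threshold=0.30):
    """Flag the dataset when the rarest class falls below the threshold."""
    return min(class_shares(labels).values()) < threshold


labels = ["default"] * 29 + ["not_default"] * 71  # the 29% / 71% example
print(class_shares(labels))   # {'default': 0.29, 'not_default': 0.71}
print(is_imbalanced(labels))  # True
```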

11. Differences Between Offline and Online Processing

| Aspect | Offline Processing | Online Processing |
|---|---|---|
| Definition | Data that has already been collected and stored is processed later, without needing a live connection to the source. | Data is processed in real time as it arrives, which requires a live connection to the source. |
| Processing Style | Processes data in batches (Batch Processing). | Processes data as a stream (Real-time or Streaming Processing). |
| Examples | Generating monthly sales reports from stored data. | Analyzing live stock market feeds. |
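
The difference between the two processing styles can be sketched with an average: the batch version needs the full dataset up front, while the streaming version updates its answer with each new value. The `prices` list is invented sample data standing in for a live feed.

```python
def batch_mean(values):
    """Offline style: compute once over the complete stored dataset."""
    return sum(values) / len(values)


def running_mean(stream):
    """Online style: yield an updated mean after every arriving value."""
    total, count = 0.0, 0
    for x in stream:
        total += x
        count += 1
        yield total / count


prices = [100.0, 102.0, 101.0, 105.0]  # hypothetical price feed

print(batch_mean(prices))                # 102.0
print(list(running_mean(iter(prices))))  # [100.0, 101.0, 101.0, 102.0]
```

The last streaming value matches the batch result, but the streaming version never has to hold the whole dataset in memory.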

12. What is a Random Variable?

Any variable whose value varies from one trial to the next, with a probability associated with each possible value, is called a random variable.

Eg: When a coin is flipped, the outcome (Head or Tail) is a random variable. Note: By convention, random variables are denoted by capital letters (e.g., X), while the specific values they take are denoted by lowercase letters (e.g., x).
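
A short simulation can illustrate the idea, assuming a fair coin. The `flip` helper is a hypothetical name, and the seed is fixed only to make the run repeatable.

```python
import random


def flip(rng):
    """One realization of the coin-flip random variable."""
    return rng.choice(["Head", "Tail"])


rng = random.Random(42)  # seeded generator for repeatability
flips = [flip(rng) for _ in range(10_000)]

# The observed share of heads should be close to the assumed probability 0.5.
head_share = flips.count("Head") / len(flips)
print(round(head_share, 2))
```

Each individual flip is unpredictable, but the chance attached to each outcome shows up in the long-run proportions.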

13. What are Measurement levels?

Measurement levels describe which calculations can meaningfully be applied to data in order to extract information from it. There are four levels of measurement: Nominal, Ordinal, Interval, and Ratio.
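
The four levels form a hierarchy in which each level permits everything the previous one does, plus something new. The lookup below is a sketch of that idea; the count, rank, and arithmetic entries follow this article, while mode, median, and mean are added here as the statistics conventionally associated with each level.

```python
LEVELS = ["nominal", "ordinal", "interval", "ratio"]  # weakest to strongest

# Operations that first become meaningful at each level.
NEW_OPERATIONS = {
    "nominal": {"count", "mode"},
    "ordinal": {"rank", "median"},
    "interval": {"add", "subtract", "mean"},
    "ratio": {"multiply", "divide"},
}


def allowed_operations(level):
    """Return every operation permitted at the given level and all levels below it."""
    idx = LEVELS.index(level)
    ops = set()
    for name in LEVELS[: idx + 1]:
        ops |= NEW_OPERATIONS[name]
    return ops


print(sorted(allowed_operations("ordinal")))  # ['count', 'median', 'mode', 'rank']
print("divide" in allowed_operations("interval"))  # False
```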

14. What does Nominal type in measurement levels mean?

Nominal data consists of category names with no natural or inherent order among them.

Eg: Color names, Gender

15. What is the ordinal measurement level?

Categories that have a particular (inherent) order.

Eg: Shirt size: S, M, L, XL, XXL.

16. What does Interval measurement level represent?

The interval level is a numeric measure of the data. Both the order of the values and the exact difference between them are meaningful, so a value can be compared with other points in the dataset by how far apart they are. Because the scale has no true zero, ratios between values cannot be interpreted.

Eg: Temperature, and Date.

17. What is a Ratio measure?

Ratio data is very much like interval data: the values must be numerical, and the difference between points is standardized and meaningful. In addition, for data to be considered ratio data it must have a true zero, meaning that zero represents the complete absence of the quantity. As a result, ratio variables typically do not take negative values.

Eg: Height, Weight.

18. What is the Factor variable?

A factor variable is a variable that takes only a limited set of values or labels.

Eg: Month (Jan, Feb, ..., Dec): the Month variable has only 12 possible values.
