Interview Questions on Data Types & Measurements

  • July 14, 2025

Meet the Author: Mr. Bharani Kumar

Bharani Kumar Depuru is a well-known IT personality from Hyderabad. He is the Founder and Director of AiSPRY and 360DigiTMG. An IIT and ISB alumnus with more than 18 years of experience, he has held prominent positions at organizations such as HSBC, ITC Infotech, Infosys, and Deloitte. He is an IT consultant specializing in Industrial Revolution 4.0 implementation, Data Analytics practice setup, Artificial Intelligence, Big Data Analytics, Industrial IoT, Business Intelligence, and Business Management. Bharani Kumar is also the chief trainer at 360DigiTMG, with more than ten years of experience easing the IT transition journey for his students. 360DigiTMG is at the forefront of delivering quality education, thereby bridging the gap between academia and industry.

Introduction to Data Types and Measurements in Data Science

In today’s ever-evolving world of data science, understanding the types of data is crucial for developing reliable models and making informed decisions. From simple datasets to complex, high-volume information sets, the ability to classify and analyze data is an essential skill. Structure also shapes how data is used: structured data is organized in tables with a specific type of information per column, while unstructured data consists of raw information that requires more processing before it yields insights.

This blog explores some of the foundational concepts related to types and scales of data, including the differences between nominal, ordinal, continuous, and discrete data. It also highlights the significance of dummy variables and provides an overview of how these concepts play a critical role in data science. Once you grasp these building blocks, you'll be better equipped to analyze data and make confident, data-driven decisions.

1. What is Nominal Data?

Nominal data consists of category names with no natural (inherent) order among the categories.

It takes values from a limited set of entries.

Nominal data is usually stored as text (strings).

Most ML algorithms cannot work with text labels directly, so nominal data is usually converted into dummy variables, with one 0/1 indicator per category.
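
The dummy-variable conversion described above can be sketched in a few lines of plain Python. This is only an illustration: the `one_hot` helper and the `colors` column are hypothetical names, and in practice a data-frame or ML library would normally do this step.

```python
def one_hot(values):
    """Return (categories, rows); each row is a list of 0/1 indicators."""
    # Sorting only gives a stable column order.
    # The order itself carries no meaning for nominal data.
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


colors = ["red", "blue", "green", "blue"]  # hypothetical nominal column
categories, encoded = one_hot(colors)

print(categories)  # ['blue', 'green', 'red']
for label, row in zip(colors, encoded):
    print(label, row)
```

Each row contains exactly one 1, marking the category that the original label belonged to.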

2. What is Ordinal Data?

Ordinal data consists of categories that have a particular (inherent) order.

Eg: Shirt size: S, M, L, XL, XXL; Education level: High School, Bachelor's, Master's, Doctorate.

The order of the levels is meaningful, but the gaps between them need not be equal: XL is larger than L, yet the step from L to XL is not necessarily the same as the step from S to M. Differences between levels therefore have no numeric meaning.

For ML algorithms, ordinal data is converted to a numeric equivalent using an encoding technique that preserves the order.

We can 'Count' the items in each category as well as 'Rank' them in order.
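
As a minimal sketch of the encoding, counting, and ranking mentioned above, assume the category order is supplied by hand. `SIZE_ORDER` and the `shirts` column are illustrative names, not part of any library.

```python
from collections import Counter

# Hypothetical ordinal column; the order is supplied by us, not inferred.
SIZE_ORDER = ["S", "M", "L", "XL", "XXL"]
SIZE_RANK = {size: rank for rank, size in enumerate(SIZE_ORDER, start=1)}

shirts = ["L", "S", "XL", "M", "L"]

encoded = [SIZE_RANK[size] for size in shirts]  # numeric equivalent
counts = Counter(shirts)                        # 'Count' operation
ranked = sorted(shirts, key=SIZE_RANK.get)      # 'Rank' operation

print(encoded)      # [3, 1, 4, 2, 3]
print(counts["L"])  # 2
print(ranked)       # ['S', 'M', 'L', 'L', 'XL']
```

The integer codes preserve the order only. Subtracting them (for example XL minus L) does not produce a meaningful quantity.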

Start learning the fundamentals of data science with a Data Science Course in Chennai to dive deeper into such concepts.

3. What is Interval Data?

Interval scales are numeric scales where we know both the order of the values and the exact differences between them.

The difference between any two levels is meaningful and consistent.

There is no natural zero (absence of an absolute zero). For example, a temperature of 0°C does not mean there is no temperature.

Eg: Temperature in Celsius or Fahrenheit, calendar date, time of day, and IQ score.

We can perform addition and subtraction, but multiplication and division are not meaningful.
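
A short worked example shows why subtraction is meaningful on an interval scale while ratios are not. The readings are made up, and `c_to_f` is a helper defined here for the sketch.

```python
def c_to_f(celsius):
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9 / 5 + 32


morning_c, noon_c = 10.0, 20.0  # hypothetical readings
morning_f, noon_f = c_to_f(morning_c), c_to_f(noon_c)

# Differences stay consistent: a 10 degree C gap is always an 18 degree F gap.
print(noon_c - morning_c)  # 10.0
print(noon_f - morning_f)  # 18.0

# Ratios do not: "twice as hot" depends on where the scale places zero.
print(noon_c / morning_c)  # 2.0
print(noon_f / morning_f)  # 1.36
```

The same two temperatures give a ratio of 2.0 in Celsius and 1.36 in Fahrenheit, so the ratio says nothing about the temperatures themselves.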

4. What is Ratio Data?

Ratio data is very much like interval data: the values are numeric and the difference between points is standardized and meaningful.

In addition, for data to be considered ratio data it must have a true zero, meaning that zero represents the complete absence of the quantity. As a result, ratio data typically does not take negative values.

Eg: Height, weight, and amount of money. Having zero money means there is no money.

We can perform mathematical operations such as addition, subtraction, multiplication, and division.
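
To see why division is meaningful for ratio data, the sketch below converts two hypothetical weights between units and compares the ratios. The conversion factor is approximate, and the variable names are illustrative.

```python
import math

KG_TO_LB = 2.20462  # approximate conversion factor

parent_kg, child_kg = 80.0, 40.0  # hypothetical weights
parent_lb, child_lb = parent_kg * KG_TO_LB, child_kg * KG_TO_LB

# Zero means "no weight" in both units, so the ratio survives conversion.
ratio_kg = parent_kg / child_kg
ratio_lb = parent_lb / child_lb

print(ratio_kg)                          # 2.0
print(math.isclose(ratio_kg, ratio_lb))  # True
```

"Twice as heavy" holds whichever unit is used, which is what a true zero provides.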

5. What is a Factor?

A factor is a variable that can take only a limited set of values, called levels.

For example: 'Gender' is a variable that can take two levels - 'Male' & 'Female'.

Another example is 'Month', which can take '12' levels - Jan, Feb, Mar,....., Dec.

6. What are the broad classifications of data types?

Broadly speaking, data can be classified as Continuous data and Discrete data.

Discrete data can be further classified as Categorical data and Count data.

Categorical data is further classified as Binary categorical data (two categories) and Multiple categorical data (more than two categories).

Multiple categorical data is further classified as Nominal data and Ordinal data.

Continuous data and Count data are considered Quantitative data, whereas Categorical data is considered Qualitative data.
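
The hierarchy above can be turned into a rough rule-of-thumb classifier. This is a heuristic sketch only: `broad_type` is a hypothetical helper, it looks only at Python value types, and it cannot tell nominal from ordinal or recognize integers that are really category codes.

```python
def broad_type(values):
    """Guess the broad data type of a column (a heuristic, not a rule)."""
    if all(isinstance(v, bool) for v in values):
        return "binary categorical"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "count"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "continuous"
    # Anything else is treated as a set of labels.
    distinct = len(set(values))
    return "binary categorical" if distinct == 2 else "multiple categorical"


print(broad_type([175.5, 160.2, 181.0]))       # continuous
print(broad_type([3, 0, 7]))                   # count
print(broad_type(["yes", "no", "yes"]))        # binary categorical
print(broad_type(["red", "blue", "green"]))    # multiple categorical
```

Whether a multiple categorical column is nominal or ordinal depends on what the labels mean, so that decision still needs domain knowledge.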

7. Differences Between Structured, Semi-Structured, and Unstructured Data

| Aspect | Structured Data | Semi-Structured Data | Unstructured Data |
| --- | --- | --- | --- |
| Definition | Data organized in a neat tabular format with rows and columns, adhering to a fixed schema. | Data that has elements of structure but does not conform strictly to tabular formats or a rigid schema. | Data that lacks any predefined structure, stored in a raw and unorganized format. |
| Schema | A well-defined and rigid schema ensures consistent organization and formatting. | A flexible schema, allowing variations in data organization. | No schema; data organization is free-form and undefined. |
| Storage | Stored in relational databases (e.g., MySQL, PostgreSQL). | Stored in formats like XML or JSON, or in NoSQL databases (e.g., MongoDB). | Stored in file systems or specialized platforms for managing multimedia, logs, or documents. |
| Processing | Easy to query and analyze using SQL or BI tools. | Requires specialized tools to parse and extract meaningful information (e.g., parsing XML or JSON). | Requires advanced tools like natural language processing (NLP), image recognition, or video analysis. |
| Examples | Customer details in a relational database (Name, Age, Address). | JSON data storing user profiles; XML files defining product configurations. | Images, videos, audio recordings; social media posts or emails. |
| Advantages | Highly organized, easy to store and retrieve; efficient for analysis using querying. | Combines flexibility with some structure, making it adaptable for evolving data formats. | Can capture rich, diverse information like multimedia, enabling a broader scope of analysis. |
| Disadvantages | Tied to traditional query languages; lacks flexibility for non-tabular data; limited to predefined fields and formats. | Processing can be more complex than for structured data; requires tools for extraction and parsing. | Difficult to process and analyze; requires significant storage and computational resources. |
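
To make the three categories concrete, the sketch below holds similar customer information in each form and reads it with Python's standard `csv` and `json` modules. The customer record is invented for illustration, and the keyword check is only a crude stand-in for real NLP.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields.
table_text = "name,age,city\nAsha,31,Chennai\n"
rows = list(csv.DictReader(io.StringIO(table_text)))
print(rows[0]["age"])  # '31' (the csv module yields strings)

# Semi-structured: self-describing keys; fields may be nested or missing.
profile_text = '{"name": "Asha", "contacts": {"email": "asha@example.com"}}'
profile = json.loads(profile_text)
print(profile["contacts"]["email"])
print(profile.get("age"))  # None, because no schema forces this field

# Unstructured: free text; any structure must be extracted by analysis.
note = "Asha called about her order and sounded unhappy with the delay."
print("unhappy" in note.lower())  # crude keyword check standing in for NLP
```

The amount of code needed to reach a usable value grows from the structured case to the unstructured one, which mirrors the Processing row of the table.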

8. Difference Between Big Data and Non-Big Data

| Aspect | Big Data | Non-Big Data |
| --- | --- | --- |
| Definition | Data that cannot be stored or processed using traditional storage, hardware, and software. | Data that can be stored and processed using traditional tools. |
| Key Characteristics | Defined by the 5 Vs: Volume, Velocity, Variety, Veracity, and Value. | Not characterized by the 5 Vs; simpler in nature. |
| Examples | Data processed on large-scale analytics platforms like Hadoop or Spark. | Data held in small-scale SQL databases or spreadsheets. |

9. Differences Between Cross-Sectional, Time Series, and Longitudinal Data

| Aspect | Cross-Sectional Data | Time Series Data | Longitudinal/Panel Data |
| --- | --- | --- | --- |
| Time Sequence | Sequence based on time is not important. | Sequence based on time is important. | Sequence based on time is important. |
| Variables | Contains multiple variables. | Typically contains a single variable tracked over time. | Contains multiple variables, tracked for multiple entities over time. |
| Examples | Predicting loan defaulters using age, income, and gender data. | Predicting monthly, weekly, or daily sales trends. | Predicting sales across various countries over time. |
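
One way to picture the relationship is to start from a small panel dataset and slice it two ways. The records below are hypothetical, and the field names are chosen only for this sketch.

```python
# Hypothetical panel (longitudinal) data: several countries observed over time.
panel = [
    {"country": "IN", "month": "2024-01", "sales": 120},
    {"country": "IN", "month": "2024-02", "sales": 135},
    {"country": "US", "month": "2024-01", "sales": 300},
    {"country": "US", "month": "2024-02", "sales": 290},
]

# Cross-section: many entities at one point in time; row order is irrelevant.
cross_section = {r["country"]: r["sales"] for r in panel if r["month"] == "2024-01"}

# Time series: one entity followed over time; row order matters.
time_series = [r["sales"] for r in sorted(panel, key=lambda r: r["month"])
               if r["country"] == "IN"]

print(cross_section)  # {'IN': 120, 'US': 300}
print(time_series)    # [120, 135]
```

Fixing the time gives a cross-section, fixing the entity gives a time series, and keeping both dimensions gives panel data.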

10. Differences Between Balanced and Imbalanced/Rare Datasets

| Aspect | Balanced Dataset | Imbalanced Dataset |
| --- | --- | --- |
| Categorical (Binary) | Classes are evenly represented (e.g., 50% Default, 50% Not Default). | As a rule of thumb, one class makes up less than 30% of the data (e.g., 29% Default, 71% Not Default). |
| Categorical (Multiple) | All classes have approximately equal representation. | One or more classes have significantly less or more representation than the others. |
| Continuous Data | Data approximately follows a normal distribution. | Data may be bimodal or otherwise non-normal. |
| Examples | Customer satisfaction with balanced classes (50% satisfied, 50% unsatisfied). | Handwritten digit recognition where class ‘1’ makes up only 2% of the samples while another digit makes up 10%. |
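
A quick balance check for a binary target might look like the sketch below. The helper names are hypothetical, and the 0.30 default simply mirrors the rule of thumb in the table above rather than any universal standard.

```python
from collections import Counter


def class_shares(labels):
    """Return each class's share of the dataset as a fraction."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def is_imbalanced(labels, threshold=0.30):
    """Flag the dataset if the rarest class falls below the threshold.

    The threshold is an assumption intended for binary targets.
    """
    return min(class_shares(labels).values()) < threshold


labels = ["default"] * 29 + ["not default"] * 71  # hypothetical labels
print(class_shares(labels))   # {'default': 0.29, 'not default': 0.71}
print(is_imbalanced(labels))  # True
```

For targets with many classes, a fixed 30% cut-off is not appropriate, since even a perfectly balanced ten-class problem has only 10% per class.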

11. Differences Between Offline and Online Processing

| Aspect | Offline Processing | Online Processing |
| --- | --- | --- |
| Definition | Data is collected and stored first, then processed later, without needing a live connection to the data source. | Data is processed in real time as it arrives, which requires a live connection to the data source. |
| Processing Style | Processes data in batches (Batch Processing). | Processes data as a stream (Real-time or Streaming Processing). |
| Examples | Generating monthly sales reports from stored data. | Analyzing live stock market feeds. |
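
The contrast between batch and streaming styles can be sketched with a simple average. Both functions are illustrative, and the price ticks are made up.

```python
def batch_mean(stored_values):
    """Offline style: all data is already stored and processed in one pass."""
    return sum(stored_values) / len(stored_values)


def running_mean(stream):
    """Online style: update the result as each new value arrives."""
    count, mean = 0, 0.0
    for value in stream:
        count += 1
        mean += (value - mean) / count  # incremental mean update
        yield mean


prices = [100.0, 102.0, 101.0, 105.0]  # hypothetical price ticks

print(batch_mean(prices))                # 102.0
print(list(running_mean(iter(prices))))  # [100.0, 101.0, 101.0, 102.0]
```

The streaming version has an answer available after every tick, while the batch version only answers once the full dataset is in hand. Both end at the same value.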

12. What is a Random Variable?

A random variable is any variable whose value is the outcome of a chance process, with a probability associated with each possible value.

Eg: The outcome of flipping a coin (Head or Tail) is a random variable. Note: By convention, random variables are written with capital letters (such as X), while the specific values they take are written with lowercase letters (such as x).
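
The coin example can be simulated with Python's standard `random` module. This is a sketch that assumes a fair coin; the seed and the number of flips are arbitrary choices.

```python
import random
from collections import Counter

rng = random.Random(42)  # fixed seed so the sketch is repeatable


def flip(rng):
    """One realized value x of the random variable X (a fair coin flip)."""
    return rng.choice(["Head", "Tail"])


flips = [flip(rng) for _ in range(1000)]
counts = Counter(flips)

# Relative frequencies approximate P(X = Head) and P(X = Tail), both 0.5.
for outcome in ("Head", "Tail"):
    print(outcome, counts[outcome] / len(flips))
```

Each call to `flip` yields one lowercase-x value, while X itself is described by the probabilities attached to Head and Tail.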

13. What are Measurement levels?

Measurement levels describe which calculations can meaningfully be applied to data in order to extract information from it. There are four levels of measurement: Nominal, Ordinal, Interval, and Ratio.

14. What does Nominal type in measurement levels mean?

Names of categories with no natural (inherent) order among them.

Eg: Color names, Gender

15. What is the ordinal measurement level?

Categories that have a particular (inherent) order.

Eg: Shirt size : S, M, L, XL, XXL.

16. What does Interval measurement level represent?

The interval level is a numeric measure of the data in which the differences between values are equal and meaningful, but the zero point is arbitrary. Values can therefore be compared by how far apart they are, but not by their ratio.

Eg: Temperature, and Date.

17. What is a Ratio measure?

Ratio data is very much like interval data: the values are numeric and the difference between points is standardized and meaningful. In addition, for data to be considered ratio data it must have a true zero, which means ratio data typically does not take negative values.

Eg: Height, Weight.

18. What is the Factor variable?

A factor variable is a variable that takes only a limited set of values (or labels).

Eg: Month (Jan, Feb, ..., Dec) has only 12 possible values.

19. What are Continuous and Discrete Data?

| Type | Description | Examples |
| --- | --- | --- |
| Continuous Data | Data that can take any value within a range. These values are usually measured and can be expressed with decimals or fractions. | Height (e.g., 175.5 cm); Weight (e.g., 68.7 kg); Temperature (e.g., 36.6°C) |
| Discrete Data | Data that consists of distinct, countable values, often whole numbers. | Number of students in a class (e.g., 25); Number of cars in a parking lot (e.g., 50); Number of books in a library (e.g., 1000) |

Advance your understanding of these data types with a Data Science Course in Bangalore tailored to industry demands.

20. Difference Between Qualitative and Quantitative Data

| Aspect | Qualitative Data | Quantitative Data |
| --- | --- | --- |
| Definition | Describes qualities, characteristics, or attributes, often represented in categories or labels. | Represents numeric values that can be measured or counted, often used for mathematical analysis. |
| Nature | Non-numeric and descriptive. | Numeric and measurable. |
| Subtypes | Nominal and ordinal data. | Continuous and discrete data. |
| Representation | Categories, labels, or descriptive text. | Numbers, decimals, or whole values. |
| Examples | Eye color (e.g., blue, brown); Cuisine type (e.g., Italian); Customer feedback (e.g., "satisfied", "unsatisfied") | Age (e.g., 30 years); Temperature (e.g., 25°C); Number of items sold (e.g., 200 units) |
| Analysis Techniques | Content analysis; thematic coding. | Statistical analysis; mathematical computations. |
| Advantages | Useful for understanding subjective experiences; ideal for exploratory research. | Allows precise measurements; enables robust statistical modeling. |
| Disadvantages | Does not support arithmetic such as averages (analysis is limited to counts and proportions); subject to interpretation bias. | May not capture complex behaviors or experiences; risks over-reliance on numeric data. |

21. What is Binary Data?

Binary data is a type of categorical data with only two possible outcomes. These outcomes are often represented as 0 and 1, true and false, or yes and no. For instance, in a dataset tracking customer purchases, the binary variable "Purchased" could have values such as 1 (if the customer made a purchase) and 0 (if not).

22. What are Dummy Variables?

Dummy variables, also called indicator variables, are numerical representations of categorical data. They are commonly used in statistical modeling to encode categories as binary values. For example, if a variable "Gender" has categories "Male" and "Female," dummy variables might represent Male as 1 and Female as 0, making it easier to analyze using mathematical models.

23. Difference Between Nominal and Dichotomous Data

| Aspect | Nominal Data | Dichotomous Data |
| --- | --- | --- |
| Definition | Categorical data whose categories have no inherent order; in this comparison, variables with more than two distinct categories. | A special case of nominal data with only two possible categories or groups. |
| Number of Categories | More than two categories. | Exactly two categories. |
| Order | Categories do not have an inherent order or ranking. | Categories also lack order but are limited to binary options. |
| Examples | Colors: Red, Blue, Green; Brands: Nike, Adidas, Puma; Cities: New York, London, Tokyo | Yes/No; Present/Absent; Male/Female |
| Nature | Provides broader classification for a dataset with multiple possible values. | Simplifies classification into binary options. |
| Use Cases | Useful for understanding and grouping diverse data points. | Ideal for binary decision-making, such as determining presence/absence or success/failure. |

Sharpen your skills on data handling and measurements with a comprehensive Data Science Course in Hyderabad designed for aspiring professionals.

24. What are Interval-Censored and Right-Censored Data?

Interval-Censored Data: This refers to data where the exact value is unknown, but it lies within a specific range. For example, if a medical test shows a patient’s blood sugar level is between 90 and 120 mg/dL, the data is interval-censored.

Right-Censored Data: This refers to cases where the true value is known only to exceed a certain point, because observation stopped before the value could be recorded. For instance, in survival analysis, if a study ends before some participants experience the event of interest (e.g., death), their data is right-censored: their survival time is at least the length of the study.
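
One simple way to represent both kinds of censoring is to store a lower and an upper bound for every observation. The `Observation` class below is a hypothetical structure written for this sketch, not the format of any particular survival-analysis library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """One measurement whose exact value may be only partly known."""
    lower: float            # the value is known to be at least this
    upper: Optional[float]  # None means no upper bound was observed

    @property
    def kind(self):
        if self.upper is None:
            return "right-censored"
        if self.lower == self.upper:
            return "exact"
        return "interval-censored"


observations = [
    Observation(lower=90, upper=120),   # blood sugar known only as a range
    Observation(lower=36, upper=None),  # event-free when a 36-month study ended
    Observation(lower=14, upper=14),    # event observed at month 14
]

for obs in observations:
    print(obs.kind)
```

Keeping the bounds, rather than dropping or guessing the missing values, lets later analysis account for what is and is not known about each record.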

25. What is the difference between Qualitative Variables and Quantitative Variables?

Qualitative Variables: Variables that categorize or label data, such as hair color, nationality, or types of music genres. These are descriptive and do not have numerical values.

Quantitative Variables: Variables that represent measurable quantities, such as income, weight, or distance traveled. These are numeric and allow for mathematical operations.

26. What is a Derived Variable?

A derived variable is a new variable created using existing data. It is often calculated or transformed based on other variables in a dataset. For example, if a dataset includes "Date of Birth" and "Current Date," a derived variable "Age" can be computed.
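
The "Age" example can be sketched with Python's standard `datetime` module. The record and the `age_in_years` helper are illustrative.

```python
from datetime import date


def age_in_years(date_of_birth, current_date):
    """Completed years between two dates."""
    years = current_date.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (current_date.month, current_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


record = {"date_of_birth": date(1995, 7, 20), "current_date": date(2025, 7, 14)}
record["age"] = age_in_years(record["date_of_birth"], record["current_date"])
print(record["age"])  # 29
```

The new "age" field adds no information beyond the two dates, but it puts that information in a form a model can use directly.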

27. What are Polytomous Variables?

Polytomous variables are categorical variables that have more than two categories. For instance, "Education Level" could have categories like "High School," "Bachelor's," "Master's," and "Doctorate." These variables can be ordinal if the categories have a logical order or nominal if they do not.

28. What is the difference between Data Levels and Data Scales?

Data Levels: Refer to the way data is categorized, ordered, or measured, such as the nominal, ordinal, interval, and ratio levels.

Data Scales: Represent the tools or methods used to measure the data, such as a thermometer for temperature or a weighing scale for weight. In this sense, scales are typically associated with interval and ratio data. Note that in much statistical writing the two terms are used interchangeably (for example, "nominal scale" and "nominal level").

29. What is a Latent Variable?

A latent variable is a variable that is not directly observed but is inferred from other observable variables. For instance, "intelligence" is a latent variable often inferred through test scores, academic performance, and problem-solving abilities.

30. What is a Continuous Scale?

A continuous scale represents data that can take any value within a range. Examples include temperature measured in Celsius or Fahrenheit, or distance measured in kilometers or miles. Continuous scales are associated with interval and ratio data.

31. Difference Between Aggregated and Disaggregated Data

| Aspect | Aggregated Data | Disaggregated Data |
| --- | --- | --- |
| Definition | Data that is summarized or combined for analysis purposes, providing an overview or high-level insights. | Data that is detailed and not summarized, offering a granular view of each individual record or transaction. |
| Purpose | Used to identify trends, patterns, or insights at a macro level. | Used for in-depth analysis, allowing exploration of finer details and variations. |
| Representation | Represents combined values, such as totals, averages, or percentages. | Represents individual data points, transactions, or records. |
| Analysis | Simplifies analysis and is ideal for high-level decision-making. | Enables detailed and exploratory analysis but may require more effort to process. |
| Storage Requirements | Requires less storage space as data is condensed. | Requires more storage space due to the large number of individual records. |
| Examples | Monthly sales averages; annual customer retention rates; total revenue by region. | Individual sales transactions; customer feedback per product; daily stock prices. |
| Advantages | Easier to interpret; useful for presentations and strategic decisions; reduces data complexity. | Provides maximum detail; enables customized and specific queries; ideal for root cause analysis. |
| Disadvantages | Loses granularity and specific details; may hide important patterns or outliers. | Can be overwhelming due to data volume; requires significant computational resources for processing. |
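
Moving from disaggregated to aggregated data is a group-and-summarize step. The sketch below uses hypothetical sales records and only the standard library.

```python
from collections import defaultdict

# Disaggregated: one record per individual sale (hypothetical values).
transactions = [
    {"region": "North", "amount": 250.0},
    {"region": "South", "amount": 100.0},
    {"region": "North", "amount": 150.0},
    {"region": "South", "amount": 300.0},
    {"region": "South", "amount": 50.0},
]

# Aggregated: total and average revenue per region.
totals = defaultdict(float)
counts = defaultdict(int)
for sale in transactions:
    totals[sale["region"]] += sale["amount"]
    counts[sale["region"]] += 1

summary = {region: {"total": totals[region],
                    "average": totals[region] / counts[region]}
           for region in totals}
print(summary)
```

The summary is smaller and easier to read, but the individual amounts, such as the single 300.0 sale in the South, can no longer be recovered from it.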

32. What are Dummy Variables in Regression Analysis?

In regression analysis, dummy variables are used to include categorical data in a numerical model. For example, to include "Marital Status" with categories "Single," "Married," and "Divorced" in a regression model, two dummy variables can be created to represent these categories numerically.
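
The reason two dummy variables are enough for three categories is that one category serves as the reference, represented by all indicators being 0. Including a third indicator alongside an intercept would make the columns perfectly collinear. A minimal sketch follows, where `dummy_encode` and the choice of "Single" as the reference are assumptions made for illustration.

```python
def dummy_encode(values, reference):
    """Encode a categorical column as k-1 indicator columns.

    The reference category gets no column of its own; it is the row
    where every indicator is 0.
    """
    levels = [level for level in sorted(set(values)) if level != reference]
    return [{f"is_{level.lower()}": int(value == level) for level in levels}
            for value in values]


status = ["Single", "Married", "Divorced", "Married"]  # hypothetical column
dummies = dummy_encode(status, reference="Single")

for value, row in zip(status, dummies):
    print(value, row)
```

In a fitted regression, each dummy's coefficient is then read as the difference from the reference category.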

Conclusion

Understanding data types and measurement levels is fundamental for anyone delving into data-driven fields such as data science, analytics, and research. These concepts form the backbone of data collection, organization, and analysis. From distinguishing between nominal and ordinal data to comprehending the nuances of continuous and discrete variables, each type of data plays a crucial role in deriving actionable insights.

At 360DigiTMG, we help learners develop a deep understanding of these foundational concepts through our comprehensive courses and hands-on training. Our expert-led programs are designed to simplify complex topics, such as nominal and ordinal data, dummy variables, and advanced measurement techniques, ensuring learners are well-equipped to handle real-world datasets. By offering practical exposure, interactive sessions, and industry-relevant projects, 360DigiTMG ensures that students not only grasp theoretical knowledge but also build the confidence to excel in interviews and professional scenarios.

Whether you're starting your journey in data science or looking to refine your skills, 360DigiTMG provides the guidance and resources necessary to master these essential topics, setting you up for success in the competitive data science landscape.

FAQs

1. What is the Importance of Understanding Data Types in Data Science?

Recognizing the different types of data is crucial because it helps you choose the right analysis methods and models. Knowing whether your data is nominal, ordinal, or continuous can influence the way you interpret patterns and relationships.

2. How Do Data Types Affect Data Processing?

Different data types require distinct processing techniques. For instance, numerical data can be analyzed directly with statistical methods, while categorical data usually needs to be encoded into a numerical format before it can be used in machine learning models. Understanding these differences is key to effective data processing.

3. Why is it Necessary to Convert Data Types for Machine Learning?

Most machine learning algorithms expect numeric input. Converting categorical data into numerical formats, such as dummy variables, enables models to process and analyze the information effectively.

4. Can Unstructured Data Be Analyzed?

Yes, while unstructured data like images, text, or videos can be difficult to analyze, advancements in machine learning techniques such as NLP (Natural Language Processing) and computer vision have made it possible to extract insights from such data.

5. How Do Data Types Influence the Quality of Insights?

The data type determines how you can interpret the data and draw conclusions. For instance, while ordinal data shows ranking or order, continuous data provides precise values that enable more detailed analysis, which can impact the decision-making process.
