Data Exploration and Insight Generation

Part 1

Understand Your Customer

Open "ExploringCustData1.csv". If you have MS Excel installed, the file will open in it automatically. Otherwise, open it in any tool you are familiar with.
Study the data and answer the following:
1. How big is this data set? How many rows and columns does it have, and what are they?
2. Do you find any problems with this data?
3. How would you confirm the correctness of Customer Identification numbers (ID)?
4. Are there any duplicated records?
5. Are there any missing values?
6. What should be the strategy for replacing missing values?
7. Are there any strange values?
8. What should be the strategy for replacing strange or out-of-bounds values?
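
The checks in questions 1 to 8 can also be run programmatically. The following is only a minimal sketch using the Python standard library: the column names (ID, Age, Income), the ID format ("C" followed by three digits), the plausible age range (0 to 120), and the sample rows are all assumptions made for illustration, since the actual layout of "ExploringCustData1.csv" is not described here.

```python
import csv
import io
import re
from collections import Counter
from statistics import median

# Hypothetical sample; the real columns of ExploringCustData1.csv may differ.
SAMPLE = """ID,Age,Income
C001,34,52000
C002,,61000
C003,29,48000
C003,29,48000
C004,240,39000
X-17,41,57000
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
columns = list(rows[0].keys())

# Q1: size of the data (rows x columns)
n_rows, n_cols = len(rows), len(columns)

# Q3: check IDs against an ASSUMED format: "C" followed by three digits
id_pattern = re.compile(r"C\d{3}")
bad_ids = [r["ID"] for r in rows if not id_pattern.fullmatch(r["ID"])]

# Q4: records that appear more than once (all fields identical)
record_counts = Counter(tuple(r[c] for c in columns) for r in rows)
duplicates = [rec for rec, n in record_counts.items() if n > 1]

# Q5: number of empty cells per column
missing = {c: sum(1 for r in rows if not r[c].strip()) for c in columns}

# Q7: ages outside an ASSUMED plausible range of 0..120
ages = [int(r["Age"]) for r in rows if r["Age"].strip()]
valid_ages = [a for a in ages if 0 <= a <= 120]
out_of_bounds = [a for a in ages if not 0 <= a <= 120]

# Q6/Q8: one possible strategy is to replace missing or out-of-bounds
# ages with the median of the valid ones (duplicates are not removed first)
age_fill = median(valid_ages)

print(n_rows, n_cols, bad_ids, duplicates, missing, out_of_bounds, age_fill)
```

The median is used here because it is less sensitive to extreme values than the mean; whether it is the right replacement strategy depends on the column and on why the values are missing.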

Part 2

Exploring Retail Data for Insights

Read the "superstore.xlsx" data into Power BI and answer the following:

1. How many products are there in total? How many products are sold in each region? Note the number of products in each sub-category.
2. Which product category has the maximum profit, and which has the maximum loss?
3. What are the total sales and profit for each segment?
4. What are the total sales and profit for each state?
5. What are the total sales and profit of products in each sub-category, by region?
6. In which year is the profit highest for each category?
7. In which month are the sales highest for each category?
8. In which region are the sales highest for each category?
9. What is the monthly profit for each year for all states?
10. What are the quarterly sales for each year for all regions?
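
The assignment asks for these answers in Power BI, but the same aggregations can be cross-checked outside it. The sketch below, in plain Python, sums profit per category (question 2) and sales per region, year, and quarter (question 10). The five order rows and their fields are invented for illustration and are not taken from "superstore.xlsx".

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (order_date, region, category, sales, profit).
# The real superstore.xlsx columns and values may differ.
ORDERS = [
    (date(2016, 1, 15), "East", "Furniture", 200.0, 20.0),
    (date(2016, 5, 3), "East", "Technology", 500.0, 120.0),
    (date(2016, 11, 20), "West", "Furniture", 300.0, -45.0),
    (date(2017, 2, 9), "West", "Technology", 400.0, 90.0),
    (date(2017, 2, 27), "East", "Furniture", 100.0, 10.0),
]

profit_by_category = defaultdict(float)
sales_by_region_quarter = defaultdict(float)

for order_date, region, category, sales, profit in ORDERS:
    # Total profit per category
    profit_by_category[category] += profit
    # Calendar quarter: months 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4
    quarter = (order_date.month - 1) // 3 + 1
    sales_by_region_quarter[(region, order_date.year, quarter)] += sales

# Categories with the highest and lowest total profit
best_category = max(profit_by_category, key=profit_by_category.get)
worst_category = min(profit_by_category, key=profit_by_category.get)

print(dict(profit_by_category), best_category, worst_category)
print(dict(sales_by_region_quarter))
```

A negative total, as for the lowest category in this sample, indicates an overall loss for that category.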

Part 3

Exploring Flight Delays Data

Open the Power BI file "AirlineDelays.pbix".

Alternatively, "AirlineDelays.zip" contains three CSV files if you prefer to start by reading the data into Power BI, Orange, or any other data exploration tool. Establish the relationships between the columns.

Explore the data to answer the following:

1. What is the total number of operational flights of each airline?
2. What is the overall number of delayed and on-time flights?
3. What is the total number of operational flights at each origin airport and each destination airport?
4. How many delayed and on-time flights does each airline have at departure and at arrival?
5. What is the total number of delayed and on-time flights on each day of the week at each origin airport and destination airport?
6. What is the number of delayed and on-time flights per route at each origin airport and destination airport?
7. What are the average, minimum, and maximum delay times by airline, origin airport, and destination airport?
8. Is there any relationship between distance and delayed flights?
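
As a rough illustration of questions 4 and 8, the sketch below counts delayed and on-time flights per airline and computes the Pearson correlation between distance and delay. The flight records are invented, and the 15-minute threshold for calling a flight "delayed" is an assumption; the definition used in "AirlineDelays.pbix" may be different.

```python
from collections import Counter
from math import sqrt

# Hypothetical records: (airline, distance in miles, arrival delay in minutes).
FLIGHTS = [
    ("AA", 300, 5),
    ("AA", 1200, 40),
    ("DL", 800, 0),
    ("DL", 2500, 65),
    ("UA", 1500, 20),
    ("UA", 600, -3),
]

# ASSUMPTION: a flight counts as delayed if it is more than 15 minutes late.
DELAY_THRESHOLD_MIN = 15

# Delayed / on-time counts per airline
status_counts = Counter()
for airline, _distance, delay in FLIGHTS:
    status = "delayed" if delay > DELAY_THRESHOLD_MIN else "on time"
    status_counts[(airline, status)] += 1


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


distances = [f[1] for f in FLIGHTS]
delays = [f[2] for f in FLIGHTS]
r = pearson(distances, delays)

print(dict(status_counts))
print(round(r, 3))
```

A coefficient near +1 or -1 suggests a strong linear relationship and a value near 0 suggests none, but a scatter plot of distance against delay is still worth inspecting, because the coefficient only captures linear association.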

Part 4

Exploring your own data

Pick any data you currently use on the job or deal with day in, day out. Feel free to mask values or replace them with dummy values to protect data privacy. If you prefer, you may use any other suitable or interesting data set.
Let us consider ONLY structured data for now. Take a sample of 10-100 rows, depending on the nature of your data, and do the following:
1. Note the nature of every column: is it numeric or categorical? Discrete or continuous?
2. Do you see any relationship between the columns? If yes, are there any dependent/resultant/outcome variables? If yes, can you define the nature of the relationship by looking at it?
3. Explore the data numerically and graphically.
4. What are the insights you could gather from the data?
5. Did you learn anything new today about your data? If yes, could you explain?
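
For question 1, a first guess at each column's nature can be automated. The function below is a simple heuristic, not a definitive rule: it treats a column as numeric and discrete if every non-empty value parses as an integer, numeric and continuous if every value parses as a float, and categorical otherwise. The function name and the sample columns are hypothetical.

```python
def column_kind(values):
    """Heuristically classify a column given its values as strings."""
    non_empty = [v.strip() for v in values if v.strip()]

    def all_parse(cast):
        # True if every non-empty value can be converted with `cast`
        try:
            for v in non_empty:
                cast(v)
        except ValueError:
            return False
        return True

    if non_empty and all_parse(int):
        return "numeric, discrete"
    if non_empty and all_parse(float):
        return "numeric, continuous"
    return "categorical"


# Hypothetical sample columns
sample = {
    "visits": ["3", "0", "12", ""],
    "spend": ["19.99", "5", "102.5"],
    "plan": ["basic", "premium", "basic"],
}
kinds = {name: column_kind(vals) for name, vals in sample.items()}
print(kinds)
```

The result still needs a manual review: numeric codes such as postal codes or customer IDs parse as integers but are really categorical.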

Part 5

Explore Data in Any Tool of Your Choice

Use the "telecom_customer_churn.csv" data and repeat the following steps in Orange or any tool of your choice (other than Excel and Power BI).

1. Load your data from the CSV file
2. View Data in a table
3. View Data Info
4. Analyze each variable one at a time to understand the details of its individual distribution
5. Look at each independent variable and understand the overall behavior of the variable
6. Analyze two variables at a time to understand the relationships among variables.
7. What are your findings?
8. Did you explore this tool for the first time? If yes, how was your learning experience? How did you find the tool you used in comparison with Excel and Power BI?
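
If the tool of choice is a scripting language, steps 4 and 6 can be sketched as a frequency count of one variable followed by a cross-tabulation of two. The example below uses plain Python; the two fields (contract type and churn flag) and their values are assumptions for illustration and may not match the columns of "telecom_customer_churn.csv".

```python
from collections import Counter

# Hypothetical records: (contract type, churned?)
CUSTOMERS = [
    ("Month-to-month", "Yes"),
    ("Month-to-month", "No"),
    ("Month-to-month", "Yes"),
    ("One year", "No"),
    ("Two year", "No"),
    ("One year", "Yes"),
]

# Step 4: distribution of a single variable
contract_counts = Counter(contract for contract, _ in CUSTOMERS)

# Step 6: cross-tabulation of two variables
crosstab = Counter(CUSTOMERS)

# Share of churned customers within each contract type
churn_rate = {
    contract: crosstab[(contract, "Yes")] / total
    for contract, total in contract_counts.items()
}

print(dict(contract_counts))
print(churn_rate)
```

Comparing the churn share across groups, rather than the raw counts, keeps groups of different sizes comparable.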

Attachment:- Data Exploration.rar
