diff --git a/PracticalSkillTask.ipynb b/PracticalSkillTask.ipynb new file mode 100644 index 0000000000000000000000000000000000000000..6823b570343e599fc05eb5b60f36996dbf352f1a --- /dev/null +++ b/PracticalSkillTask.ipynb @@ -0,0 +1,863 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "fad1b5f1-0d16-4624-b335-ec8671a64401", + "metadata": {}, + "source": [ + "# PROGRAMMING SKILLS ASSESSMENT TASK - 100 Marks\n" + ] + }, + { + "cell_type": "markdown", + "id": "fdb4ff43-d42b-4565-bad0-621210f8ab1b", + "metadata": {}, + "source": [ + "## 1. Programming Task - 75 Marks" + ] + }, + { + "cell_type": "markdown", + "id": "adf43d6e-0448-48f8-845a-be11236d7266", + "metadata": {}, + "source": [ + "## 2. VERSION CONTROL using GIT - 10 Marks\n", + "#### Create a Git repository, update the README with your student UWE ID number, and provide access" + ] + }, + { + "cell_type": "markdown", + "id": "5f6d504e-f5ba-4ea4-b0f7-88c30dac60c4", + "metadata": {}, + "source": [ + "## 3. Reflective Diary - 10 Marks" + ] + }, + { + "cell_type": "markdown", + "id": "4b075dff-357c-45c0-ae29-60b80ea3e79b", + "metadata": {}, + "source": [ + "## 4. Good Coding Standards and Practice - 5 Marks" + ] + }, + { + "cell_type": "markdown", + "id": "7495336e-b26d-45d7-973d-d09055304651", + "metadata": {}, + "source": [ + "## 1. Read data using Functions " + ] + }, + { + "cell_type": "markdown", + "id": "76e282f0-4709-43c3-822c-490e383b00bd", + "metadata": {}, + "source": [ + "#### 1.1 Write a function to read the dataset from the Life_Expectancy_Data.csv file. Ensure the function handles errors, such as the file not being found." 
+ ] + }, + { + "cell_type": "markdown", + "id": "81fb97c9-12d0-49e7-bdb6-c2352cade976", + "metadata": {}, + "source": [ + "**Marks Breakdown**: **5 marks**\n" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "a37d6353-f25d-482d-b496-37b408b5cbaf", + "metadata": {}, + "outputs": [], + "source": [ + "import pandas as pd" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "6053c969-3ec2-4e26-aedf-b7437786fd8b", + "metadata": {}, + "outputs": [], + "source": [ + "### add your function here\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "88dd85ef-adf8-46c0-9fa4-dc05541510bc", + "metadata": {}, + "outputs": [], + "source": [ + "### call your function here\n" + ] + }, + { + "cell_type": "markdown", + "id": "8e64032d-880b-4886-91b7-a3649e917d72", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "f74026e3-13e1-4614-843b-88538dffce77", + "metadata": {}, + "source": [ + " # 2. Data Cleaning and Manipulation" + ] + }, + { + "cell_type": "markdown", + "id": "a4e77a6d-9afc-46f5-9606-c565fd92e4e1", + "metadata": {}, + "source": [ + "#### 2.1 Find missing values in each column. How can we handle them?" + ] + }, + { + "cell_type": "markdown", + "id": "69490e51-b0b8-48ee-a7b4-ecdfa0787d90", + "metadata": {}, + "source": [ + "**Marks Breakdown**: \n", + "1. **Finding Missing** **3 marks**\n", + "2. 
**Filling Values** **2 marks**\n" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "23f7ba36-af40-43a3-a5bd-fb8f69e4a09f", + "metadata": {}, + "outputs": [], + "source": [ + "# Checking for missing values\n", + "\n", + "\n", + "# Fill missing values \n" + ] + }, + { + "cell_type": "markdown", + "id": "740919d0-df75-4f3e-bcb8-4570656afd49", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "5a6af338-e11a-4f5a-ae4e-e971208a2b38", + "metadata": {}, + "source": [ + "##### 2.2 Create new features based on existing data\n", + "##### Add the new column \"Health Expenditure per Capita\" by multiplying \"percentage expenditure\" (converted from a percentage to a decimal) by \"GDP\".\n" + ] + }, + { + "cell_type": "markdown", + "id": "485add2a-742b-40b5-811d-ca80f4b70605", + "metadata": {}, + "source": [ + "**Marks Breakdown**: **3 marks**\n" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "id": "a82ccc3d-eaf8-4649-9876-e2575386a33f", + "metadata": {}, + "outputs": [], + "source": [ + "# Creating a new column 'Health Expenditure per Capita' from 'percentage expenditure' and 'GDP'\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "9263fa67-f91a-47bc-bc34-ff37bd10632f", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "6a942903-fb68-432f-af26-15e775216303", + "metadata": {}, + "source": [ + "#### 2.3 How can we handle categorical variables such as 'Status' (Developed/Developing)?\n", + "#### Hint: convert the category to a numerical value" + ] + }, + { + "cell_type": "markdown", + "id": "1fb76f78-2b6e-4015-b455-306f3f2abe5f", + "metadata": {}, + "source": [ + "**Marks Breakdown**: **3 marks**" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "id": "7a2a2845-2ae8-400d-ab65-a9c2fe8c2df5", + "metadata": {}, + "outputs": [], + "source": [ + "# Using one-hot encoding to convert the 'Status' column to numeric\n", + "\n" + ] + }, + 
{ + "cell_type": "markdown", + "id": "314a0df2-24ea-4317-be53-dfc954c10727", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "2dad843d-bf06-4099-a4c1-57cb2004c864", + "metadata": {}, + "source": [ + "# 3. Exploratory Data Analysis (EDA)" + ] + }, + { + "cell_type": "markdown", + "id": "ad7f7900-7dd6-46f6-8be6-ab23bd6a3a13", + "metadata": {}, + "source": [ + "#### 3.1 What is the distribution of life expectancy across different countries?" + ] + }, + { + "cell_type": "markdown", + "id": "007be825-77e3-483c-ba1c-5f50e99e0ea9", + "metadata": {}, + "source": [ + "**Marks Breakdown**: **3 marks**" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "id": "80d50b32-741e-4fea-8902-a5b1d9ee0a99", + "metadata": {}, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "import seaborn as sns\n", + "\n", + "# Distribution of life expectancy\n" + ] + }, + { + "cell_type": "markdown", + "id": "afb55781-bd96-461e-a59c-0d353cf5d1c7", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "26233e00-d870-4a71-a24e-0c53b3ca4ba4", + "metadata": {}, + "source": [ + "#### 3.2 How does life expectancy vary between developing and developed countries?\n" + ] + }, + { + "cell_type": "markdown", + "id": "1bd519e1-e68e-416d-b6e1-4779fcaa6284", + "metadata": {}, + "source": [ + "**Marks Breakdown**: **3 marks**" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "id": "26948666-c49d-4282-94df-6add7a5712d5", + "metadata": {}, + "outputs": [], + "source": [ + "# code here\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "6ec5adba-bd66-453c-bed3-bd720e19739d", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "48772f3f-28c1-4cab-82ed-e8c906e86e4f", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", 
+ "id": "43e7ad4d-e15b-4b94-80fe-6e5173425d9e", + "metadata": {}, + "source": [ + "#### 3.3 Is there a significant correlation between life expectancy and variables like GDP, schooling, or healthcare expenditure?" + ] + }, + { + "cell_type": "markdown", + "id": "758ec804-fceb-429d-ade5-82f62b29640f", + "metadata": {}, + "source": [ + "**Marks Breakdown**: \n", + "1. **Finding correlation** **3 marks**\n", + "2. **visualization** **2 marks**" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "id": "e4696265-f038-471f-956c-7c1b6a96f429", + "metadata": {}, + "outputs": [], + "source": [ + "# find Correlation\n", + "\n", + "\n", + "\n", + "# visualize with appropriate map\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "5c1a2cd9-27f5-425c-817b-d09eec81c9f0", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "62544ad8-1013-49ff-a111-475a2bf3c632", + "metadata": {}, + "source": [ + "#### 3.4 Which countries have the highest and lowest life expectancies, and what are their common characteristics?" + ] + }, + { + "cell_type": "markdown", + "id": "aeed723f-afa7-4da5-8c2b-d2e21556e298", + "metadata": {}, + "source": [ + "**Marks Breakdown**: \n", + "1. **High_expectancy Countries** **2 marks**\n", + "2. 
**Low_Expectancy Countries** **2 marks**" + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "id": "f9b31ff8-e258-4fc1-858f-e08701b94481", + "metadata": {}, + "outputs": [], + "source": [ + "# find and display High_expectancy Countries\n", + "\n", + "\n", + "\n", + "# find and display Low_expectancy Countries\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "03fa0b6a-4c2d-4ef9-9e88-b5e8c9943f33", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "a8a14bff-e757-4e88-896d-5bc05952e1f1", + "metadata": {}, + "source": [ + "#### 3.5 How has life expectancy evolved over time in different countries?" + ] + }, + { + "cell_type": "markdown", + "id": "6be8bb40-2b40-4d77-aabd-9b4ff0912388", + "metadata": {}, + "source": [ + "**Marks Breakdown**: \n", + "1. **Looping Countries** **3 marks**\n", + "2. **Visualization** **2 marks**" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "id": "9cad005f-75c2-4512-87fe-a349066ffda7", + "metadata": {}, + "outputs": [], + "source": [ + "# find life expectancy for all countries using a loop\n", + "\n", + "\n", + "\n", + "# visualize results\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "dac3950e-b8c1-47c9-b418-c81acf2d5144", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "c50a74a7-9732-44b5-9e7e-d3b586595c4c", + "metadata": {}, + "source": [ + "# 4. Numerical Methods Calculations" + ] + }, + { + "cell_type": "markdown", + "id": "cfcd1858-e53e-4c72-8e98-227dee49cfd7", + "metadata": {}, + "source": [ + "#### 4.1 Find the mean, median, min, max and standard deviation of life expectancy across all countries" + ] + }, + { + "cell_type": "markdown", + "id": "49053544-3cf8-4357-a31f-42f5aefc0dbc", + "metadata": {}, + "source": [ + "**Marks Breakdown**: **5 marks**" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "id": "510993ff-5ed2-45f2-8ad4-e765b9348904", 
+ "metadata": {}, + "outputs": [], + "source": [ + "# add code here and display the results" + ] + }, + { + "cell_type": "markdown", + "id": "220f9107-5cfa-43c0-88ea-ab92142914cd", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "0338eca2-41fa-4059-9eb3-5ec4855d2397", + "metadata": {}, + "source": [ + "#### 4.2 Perform a linear regression to predict life expectancy using features like GDP, schooling, and healthcare expenditure. What are the most important predictors?" + ] + }, + { + "cell_type": "markdown", + "id": "abed9d91-6146-4ba7-add8-cf8ed92f5bb4", + "metadata": {}, + "source": [ + "**Marks Breakdown**: **5 marks**\n", + "1. **Cleaning Data** **1 mark**\n", + "2. **Split training and testing** **1 mark**\n", + "3. **Linear Regression Model** **1 mark**\n", + "4. **Prediction** **1 mark**\n", + "5. **Evaluation-MSE** **1 mark**\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "id": "c047757f-2e2e-4318-8f98-02f1eb1f2ee8", + "metadata": {}, + "outputs": [], + "source": [ + "# import libraries\n", + "\n", + "# Drop missing values\n", + "\n", + "\n", + "# Split data into training and testing sets\n", + "\n", + "\n", + "\n", + "# Linear Regression\n", + "\n", + "\n", + "\n", + "# Predict\n", + "\n", + "\n", + "# Evaluate the model\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "d08edd0f-967f-4d42-ab70-e2e03b0faa00", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "48e323d9-ebc6-412d-97fb-6b3c9a643829", + "metadata": {}, + "source": [ + "#### Design an algorithm for this Linear Regression" + ] + }, + { + "cell_type": "markdown", + "id": "3ff568a2-f90e-48e9-92b4-8d6824842dba", + "metadata": {}, + "source": [] + }, + { + "cell_type": "markdown", + "id": "f4e7a728-edce-42ec-905f-d7f5b14e37cd", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + 
"id": "407fa92b-a0f7-4485-b6b9-0a30a94a6b4e", + "metadata": {}, + "source": [ + "#### 4.3 Can we calculate the growth rate of life expectancy over time for different countries using numerical differentiation?" + ] + }, + { + "cell_type": "markdown", + "id": "097e54a7-4039-4b1f-abee-151afee42f89", + "metadata": {}, + "source": [ + "**Marks Breakdown**: **3 marks**" + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "id": "ba88b07e-bfe2-478a-84fe-9c11a67be436", + "metadata": {}, + "outputs": [], + "source": [ + "# add code to calculate growth rate here\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "649aa202-c9d5-43e0-8ce3-63f5f24e4ccf", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "84f09af8-d3d7-4384-9a9d-d0e2b8c07b2f", + "metadata": {}, + "source": [ + "#### 4.4 Use interpolation to estimate missing values for life expectancy or other variables in the dataset.\n" + ] + }, + { + "cell_type": "markdown", + "id": "f8be4655-fd18-465d-bf6e-fd7267ac374f", + "metadata": {}, + "source": [ + "**Marks Breakdown**: **3 marks**" + ] + }, + { + "cell_type": "code", + "execution_count": 69, + "id": "02b09e13-ab93-47a0-ac9a-0c3e8fb4c25c", + "metadata": {}, + "outputs": [], + "source": [ + "# add interpolation here\n" + ] + }, + { + "cell_type": "markdown", + "id": "e09b6235-fdc8-40c6-b8bc-e15acf87e957", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "2d4ece1e-d93f-439d-9b18-ff9016098c7b", + "metadata": {}, + "source": [ + "#### 4.5 Calculate the cumulative distribution of life expectancy across regions (e.g., developing vs. developed)." 
+ ] + }, + { + "cell_type": "markdown", + "id": "47ca72a1-157e-4570-8e1c-de0644a4b506", + "metadata": {}, + "source": [ + "**Marks Breakdown**: **3 marks**" + ] + }, + { + "cell_type": "code", + "execution_count": 73, + "id": "88cb46ef-4d75-4fe5-abe1-e8037114d172", + "metadata": {}, + "outputs": [], + "source": [ + "# add code here" + ] + }, + { + "cell_type": "markdown", + "id": "9ab70c5d-240c-4238-9559-19da05dd9247", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "fcfe7a40-c246-4318-bab3-d9f7c6f3a70a", + "metadata": {}, + "source": [ + "## 5. Data Visualization" + ] + }, + { + "cell_type": "markdown", + "id": "e7c1e68f-542a-407d-a7cd-679efd595d89", + "metadata": {}, + "source": [ + "#### 5.1 Create a line chart to visualize how life expectancy has changed over time for different countries or regions." + ] + }, + { + "cell_type": "markdown", + "id": "55e1eeb0-3fc5-404b-83b3-e69c6457d4c2", + "metadata": {}, + "source": [ + "**Marks Breakdown**: **5 marks**" + ] + }, + { + "cell_type": "code", + "execution_count": 78, + "id": "87a9e399-96bd-477a-bc97-6a3734c9e698", + "metadata": {}, + "outputs": [], + "source": [ + "# add code here to display the chart" + ] + }, + { + "cell_type": "markdown", + "id": "13b8e1a1-b90d-465d-af28-94c37507010c", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "a61fbc2e-a3a3-43a7-a3ec-2fd9b0f93192", + "metadata": {}, + "source": [ + "#### 5.2 Use a scatter plot to show the relationship between GDP and life expectancy for different countries." + ] + }, + { + "cell_type": "markdown", + "id": "af425e47-4ab2-43fd-bc8b-79f233197f3a", + "metadata": {}, + "source": [ + "**Marks Breakdown**: **5 marks**" + ] + }, + { + "cell_type": "code", + "execution_count": 91, + "id": "355ced33-1ba0-4d47-b977-3cf16c7846b0", + "metadata": {}, + "outputs": [], + "source": [ + "# add code here\n" + ] + }, + { + "cell_type": "markdown", + 
"id": "bf37a786-c02f-4032-8db9-ea45afda9953", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "d2233a3a-fbed-4d60-acda-4eb341d11afd", + "metadata": {}, + "source": [ + "#### 5.3 Create a chart to compare life expectancy across different continents or regions." + ] + }, + { + "cell_type": "markdown", + "id": "9e3a7bc6-ac27-4b93-9a30-b69cf79d0824", + "metadata": {}, + "source": [ + "**Marks Breakdown**: **5 marks**" + ] + }, + { + "cell_type": "code", + "execution_count": 89, + "id": "ae13c09b-81d5-4c99-9f61-00f38cff99d1", + "metadata": {}, + "outputs": [], + "source": [ + "# add code here" + ] + }, + { + "cell_type": "markdown", + "id": "f945a83d-9167-411c-a09d-d3ac78ff1ba7", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "markdown", + "id": "691f6b64-2b1a-4db7-9024-3fa3c80d943f", + "metadata": {}, + "source": [ + "# 6. Reflective diary: briefly reflect on the process of developing a solution to this task. The report should not exceed 500 words.\n", + "\n", + "1. **include an explanation of how you approached the task**\n", + "2. **include any pseudocode or other algorithmic aid used to help complete the task**\n", + "3. **identify the strengths/weaknesses of the approach used**\n", + "4. **consider how the approach used could be improved**\n", + "5. 
**suggest alternative approaches that could have been taken**\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "2d126031-1407-45bd-a3c8-f19019d833b7", + "metadata": {}, + "source": [ + " **Marks Breakdown**: **10 marks**\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1b9b4b41-7760-4fc4-8c72-66d47c51985e", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "a7fdeca0-253b-48b0-afba-befc5f4217ba", + "metadata": {}, + "source": [ + "#### Feedback\n", + "#### Marks" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1d1304e4-2dbf-4bad-8cda-8605f6d0634d", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.4" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}